Abhinav Madahar, Yuze Ma, Kunal Patel
Comments: 4 pages
Subjects: Neural and Evolutionary Computing (cs.NE)
Machine learning is increasingly prevalent in stock market trading. Though
neural networks have seen success in computer vision and natural language
processing, they have not been as useful in stock market trading. To
demonstrate the applicability of a neural network in stock trading, we made a
single-layer neural network that recommends buying or selling shares of a stock
by comparing the highest high of 10 consecutive days with that of the next 10
days, a process repeated for the stock’s year-long historical data. A
chi-squared analysis found that the neural network can accurately and
appropriately decide whether to buy or sell shares for a given stock, showing
that a neural network can make simple decisions about the stock market.
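A minimal sketch of the idea described above (not the authors' code), using synthetic daily highs: each 10-day window is labeled by whether the following window's highest high is higher, and a single-layer network (logistic regression trained by gradient descent) is fit to those labels.

import numpy as np

rng = np.random.default_rng(0)
highs = 100 + np.cumsum(rng.normal(0.05, 1.0, 250))   # hypothetical daily highs for one year

window = 10
X, y = [], []
for t in range(len(highs) - 2 * window):
    cur, nxt = highs[t:t + window], highs[t + window:t + 2 * window]
    X.append(cur)                               # features: the current 10-day window of highs
    y.append(float(nxt.max() > cur.max()))      # 1 = "buy" (next highest high is higher), else "sell"
X, y = np.array(X), np.array(y)
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)   # standardise for stable gradient descent

w, b = np.zeros(window), 0.0                    # single layer = logistic regression
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / len(y)
    b -= 0.1 * (p - y).mean()
print("training accuracy:", ((p > 0.5) == y).mean())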
Andrea Soltoggio, Kenneth O. Stanley, Sebastian Risi
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Biological neural networks are systems of extraordinary computational
capabilities shaped by evolution, development, and lifetime learning. The
interplay of these elements leads to the emergence of adaptive behavior and
intelligence, but the complexity of the whole system of interactions is an
obstacle to the understanding of the key factors at play. Inspired by such
intricate natural phenomena, Evolved Plastic Artificial Neural Networks
(EPANNs) use simulated evolution in-silico to breed plastic neural networks,
artificial systems composed of sensors, outputs, and plastic components that
change in response to sensory-output experiences in an environment. These
systems may reveal key algorithmic ingredients of adaptation, autonomously
discover novel adaptive algorithms, and lead to hypotheses on the emergence of
biological adaptation. EPANNs have seen considerable progress over the last two
decades. Current scientific and technological advances in artificial neural
networks are now setting the conditions for radically new approaches and
results. In particular, the limitations of hand-designed structures and
algorithms currently used in most deep neural networks could be overcome by
more flexible and innovative solutions. This paper brings together a variety of
inspiring ideas that define the field of EPANNs. The main computational methods
and results are reviewed. Finally, new opportunities and developments are
presented.
Lior Fritz, David Burshtein
Comments: Submitted to Interspeech 2017
Subjects: Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
A hybrid of a hidden Markov model (HMM) and a deep neural network (DNN) is
considered. End-to-end training using gradient descent is suggested, similarly
to the training of connectionist temporal classification (CTC). We use a
maximum a-posteriori (MAP) criterion with a simple language model in the
training stage, and a standard HMM decoder without approximations. Recognition
results are presented using speech databases. Our method compares favorably to
CTC in terms of performance, robustness and quality of alignments.
Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros
Comments: Submitted to ICCV 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Image-to-image translation is a class of vision and graphics problems where
the goal is to learn the mapping between an input image and an output image
using a training set of aligned image pairs. However, for many tasks, paired
training data will not be available. We present an approach for learning to
translate an image from a source domain (X) to a target domain (Y) in the
absence of paired examples. Our goal is to learn a mapping G: X → Y
such that the distribution of images from G(X) is indistinguishable from the
distribution Y using an adversarial loss. Because this mapping is highly
under-constrained, we couple it with an inverse mapping F: Y → X
and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice
versa). Qualitative results are presented on several tasks where paired
training data does not exist, including collection style transfer, object
transfiguration, season transfer, photo enhancement, etc. Quantitative
comparisons against several prior methods demonstrate the superiority of our
approach.
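As an illustration of the cycle-consistency idea only (the adversarial losses and CNN generators are omitted), a minimal PyTorch sketch with placeholder networks standing in for G and F might look as follows; this is not the authors' implementation.

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))  # placeholder for X -> Y
F = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))  # placeholder for Y -> X
l1 = nn.L1Loss()
opt = torch.optim.Adam(list(G.parameters()) + list(F.parameters()), lr=1e-4)

x = torch.randn(8, 64)  # toy batch from domain X (flattened "images")
y = torch.randn(8, 64)  # toy batch from domain Y

cycle_loss = l1(F(G(x)), x) + l1(G(F(y)), y)  # forward and backward cycles
opt.zero_grad()
cycle_loss.backward()
opt.step()
print("cycle-consistency loss:", cycle_loss.item())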
Eduardo Ruiz, Walterio Mayol-Cuevas
Comments: 10 pages, 12 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper develops and evaluates a new tensor field representation to
express the geometric affordance of one object over another. We expand the well
known bisector surface representation to one that is weight-driven and that
retains the provenance of surface points with directional vectors. We also
incorporate the notion of affordance keypoints which allow for faster decisions
at a point of query and with a compact and straightforward descriptor. Using a
single interaction example, we are able to generalize to previously unseen
scenarios, both synthetic and real scenes captured with RGBD sensors. We
show how our interaction tensor allows for significantly better performance
than alternative formulations. Evaluations also include crowdsourcing
comparisons that confirm the validity of our affordance proposals, which agree
with human judgments on average 84% of the time, which is 20-40% better
than the baseline methods.
Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, Christian Theobalt
Comments: 10 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this work we propose a novel model-based deep convolutional autoencoder
that addresses the highly challenging problem of reconstructing a 3D human face
from a single in-the-wild color image. To this end, we combine a convolutional
encoder network with an expert-designed generative model that serves as
decoder. The core innovation is our new differentiable parametric decoder that
encapsulates image formation analytically based on a generative model. Our
decoder takes as input a code vector with exactly defined semantic meaning that
encodes detailed face pose, shape, expression, skin reflectance and scene
illumination. Due to this new way of combining CNN-based with model-based face
reconstruction, the CNN-based encoder learns to extract semantically meaningful
parameters from a single monocular input image. For the first time, a CNN
encoder and an expert-designed generative model can be trained end-to-end in an
unsupervised manner, which renders training on very large (unlabeled) real
world data feasible. The obtained reconstructions compare favorably to current
state-of-the-art approaches in terms of quality and richness of representation.
Aram Ter-Sarkisov, Robert Ross, John Kelleher
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
This paper introduces a new approach to the long-term tracking of an object
in a challenging environment. The object is a cow and the environment is an
enclosure in a cowshed. Some of the key challenges in this domain are a
cluttered background, low contrast and high similarity between moving objects
which greatly reduces the efficiency of most existing approaches, including
those based on background subtraction. Our approach is split into object
localization, instance segmentation, learning and tracking stages. Our solution
is compared to a range of semi-supervised object tracking algorithms and we
show that the performance is strong and well suited to subsequent analysis. We
present our solution as a first step towards broader tracking and behavior
monitoring for cows in precision agriculture with the ultimate objective of
early detection of lameness.
Mu Li, Wangmeng Zuo, Shuhang Gu, Debin Zhao, David Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Lossy image compression is generally formulated as a joint rate-distortion
optimization to learn encoder, quantizer, and decoder. However, the quantizer
is non-differentiable, and discrete entropy estimation is usually required for
rate control. These make it very challenging to develop a convolutional network
(CNN)-based image compression system. In this paper, motivated by the fact that the
local information content of an image is spatially variant, we suggest that the
bit rate of different parts of the image should be adapted to the local
content, with the content-aware bit rate allocated under the guidance of a
content-weighted importance map. The sum of the importance map can thus serve
as a continuous alternative to discrete entropy estimation for controlling the
compression rate. A binarizer is adopted to quantize the output of the encoder,
and the binarization scheme is also directly defined by the importance map.
Furthermore, a proxy function is introduced for the binary operation in backward
propagation to make it differentiable. Therefore, the encoder, decoder,
binarizer and importance map can be jointly optimized in an end-to-end manner
using a subset of the ImageNet database. In low bit rate image compression,
experiments show that our system significantly outperforms JPEG and JPEG 2000
in terms of the structural similarity (SSIM) index, and produces much better visual
results with sharp edges, rich textures, and fewer artifacts.
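A hedged sketch of two of the ingredients described above, using PyTorch and toy tensors: a binarizer whose backward pass uses an identity proxy gradient, and the sum of an importance map used as a continuous stand-in for the rate term. The paper's actual proxy function and network architecture are not reproduced here.

import torch

class STEBinarizer(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return (x > 0.5).float()          # hard 0/1 codes
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output                # proxy: pass gradients straight through

features = torch.rand(1, 32, 16, 16, requires_grad=True)    # toy stand-in for encoder output
importance = torch.rand(1, 1, 16, 16, requires_grad=True)   # toy content-weighted importance map

codes = STEBinarizer.apply(features)
rate_proxy = importance.sum()                     # differentiable surrogate for the bit rate
distortion = ((codes - features) ** 2).mean()     # placeholder distortion term
loss = distortion + 0.01 * rate_proxy
loss.backward()
print(features.grad.shape, importance.grad.shape)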
Hossam Isack, Olga Veksler, Ipek Oguz, Milan Sonka, Yuri Boykov
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose an effective optimization algorithm for a general hierarchical
segmentation model with geometric interactions between segments. Any given tree
can specify a partial order over object labels defining a hierarchy. It is
well-established that segment interactions, such as inclusion/exclusion and
margin constraints, make the model significantly more discriminant. However,
existing optimization methods do not allow full use of such models. Generic
α-expansion results in weak local minima, while common binary multi-layered
formulations lead to non-submodularity, complex high-order potentials, or polar
domain unwrapping and shape biases. In practice, applying these methods to
arbitrary trees does not work except for simple cases. Our main contribution is
an optimization method for the Hierarchically-structured Interacting Segments
(HINTS) model with arbitrary trees. Our Path-Moves algorithm is based on
multi-label MRF formulation and can be seen as a combination of well-known
α-expansion and Ishikawa techniques. We show state-of-the-art biomedical
segmentation for many diverse examples of complex trees.
Grigorios Kalliatakis, Shoaib Ehsan, Klaus D. McDonald-Maier
Comments: Position paper, 8 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
The growing presence of devices carrying digital cameras, such as mobile
phones and tablets, combined with ever improving internet networks have enabled
ordinary citizens, victims of human rights abuse, and participants in armed
conflicts, protests, and disaster situations to capture and share via social
media networks images and videos of specific events. This paper discusses the
potential of images in human rights context including the opportunities and
challenges they present. This study demonstrates that real-world images have
the capacity to contribute complementary data to operational human rights
monitoring efforts when combined with novel computer vision approaches. The
analysis is concluded by arguing that if images are to be used effectively to
detect and identify human rights violations by rights advocates, greater
attention to gathering task-specific visual concepts from large-scale web
images is required.
Jose Dolz, Nicolas Reyns, Nacim Betrouni, Dris Kharroubi, Mathilde Quidet, Laurent Massoptier, Maximilien Vermandel
Comments: Submitted to the Journal of Physics in Biology and Medicine
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Radiation therapy has emerged as one of the preferred techniques to treat
brain cancer patients. During treatment, a very high dose of radiation is
delivered to a very narrow area. Prescribed radiation therapy for brain cancer
requires precisely defining the target treatment area, as well as delineating
vital brain structures which must be spared from radiotoxicity. Nevertheless,
the delineation task is usually still performed manually, which is inefficient and
operator-dependent. Several attempts to automate this process have been
reported; however, they show marginal results when analyzing organs in the optic region.
In this work we present a deep learning classification scheme based on
augmented-enhanced features to automatically segment organs at risk (OARs) in
the optic region (optic nerves, optic chiasm, pituitary gland and pituitary
stalk). Fifteen MR images with various types of brain tumors were
retrospectively collected to undergo manual and automatic segmentation. Mean
Dice Similarity coefficients around 0.80 were reported. Incorporation of the
proposed features yielded improvements in the segmentation. Compared with
support vector machines, our method achieved better performance with less
variation in the results, as well as a considerable reduction in the
classification time. Performance of the proposed approach was also evaluated
with respect to manual contours. In this case, results obtained from the
automatic contours mostly lie within the variability of the observers, showing no
significant differences with respect to them. These results therefore suggest
that the proposed system is more accurate than other approaches presented to
date for segmenting these structures. The speed, reproducibility, and
robustness of the process make the proposed deep learning-based classification
system a valuable tool for assisting in the delineation task of small OARs in
brain cancer.
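For reference, a minimal sketch of how the reported Dice Similarity Coefficient is typically computed between an automatic and a manual binary mask; random toy volumes stand in for real segmentations.

import numpy as np

def dice(pred, ref, eps=1e-8):
    """Dice Similarity Coefficient between two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    inter = np.logical_and(pred, ref).sum()
    return 2.0 * inter / (pred.sum() + ref.sum() + eps)

rng = np.random.default_rng(0)
manual = rng.random((64, 64, 64)) > 0.7      # hypothetical manual contour
automatic = rng.random((64, 64, 64)) > 0.7   # hypothetical automatic segmentation
print("DSC:", dice(automatic, manual))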
Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
While strong progress has been made in image captioning in recent years,
machine and human captions are still quite distinct. A closer look reveals that
this is due to the deficiencies in the generated word distribution, vocabulary
size, and strong bias in the generators towards frequent captions. Furthermore,
humans — rightfully so — generate multiple, diverse captions, due to the
inherent ambiguity in the captioning task which is not considered in today’s
systems.
To address these challenges, we change the training objective of the caption
generator from reproducing groundtruth captions to generating a set of captions
that is indistinguishable from human generated captions. Instead of
handcrafting such a learning target, we employ adversarial training in
combination with an approximate Gumbel sampler to implicitly match the
generated distribution to the human one. While our method achieves comparable
performance to the state-of-the-art in terms of the correctness of the
captions, we generate a set of diverse captions that are significantly less
biased and match the word statistics better in several aspects.
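A small sketch of an approximate Gumbel (Gumbel-softmax) sampling step, which is what lets near-discrete word samples remain differentiable for adversarial training; the generator logits here are random stand-ins, not the paper's model.

import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=0.5):
    # add Gumbel noise to the logits, then take a temperature-scaled softmax
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / tau, dim=-1)

vocab_size = 1000
logits = torch.randn(4, vocab_size, requires_grad=True)   # toy generator outputs for one word position
soft_words = gumbel_softmax_sample(logits)                 # near one-hot, but differentiable
loss = -torch.log(soft_words[:, 0] + 1e-20).mean()         # toy objective on the sampled words
loss.backward()                                             # gradients reach the generator logits
print(soft_words.shape, logits.grad.abs().sum().item() > 0)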
Zhichao Li, Yi Yang, Xiao Liu, Shilei Wen, Wei Xu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a dynamic computational time model to accelerate the average
processing time of the recurrent visual attention model (RAM). Rather than attending
with a fixed number of steps for each input image, the model learns to decide
when to stop on the fly. To achieve this, we add an additional continue/stop
action per time step to RAM and use reinforcement learning to learn both the
optimal attention policy and stopping policy. The modification is simple but
could dramatically save the average computational time while keeping the same
recognition performance as RAM. Experimental results on CUB-200-2011 and
Stanford Cars datasets demonstrate that the dynamic computational model can work
effectively for fine-grained image recognition. The source code of this paper
can be obtained from this https URL
Lei Fan, Ziyu Pan, Long Chen, Kai Huang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Reconstruction based on the stereo camera has received considerable attention
recently, but two particular challenges still remain. The first concerns the
need to aggregate similar pixels in an effective approach, and the second is to
maintain as much of the available information as possible while ensuring
sufficient accuracy. To overcome these issues, we propose a new 3D
representation method, namely, planecell, that extracts planarity from the
depth-assisted image segmentation and then projects these depth planes into the
3D world. An energy function formulated from a Conditional Random Field that
generalizes the planar relationships is maximized to merge coplanar segments.
We evaluate our method against a variety of reconstruction baselines on both the KITTI
and Middlebury datasets, and the results indicate its superiority over
other 3D space representation methods in accuracy, memory requirements and
further applications.
Lachlan Tychsen-Smith, Lars Petersson
Comments: 8 pages, currently under review for ICCV2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We define the problem of object detection from imagery as estimating a very
large but extremely sparse bounding box dependent probability distribution.
Subsequently we develop a novel sparse distribution estimation scheme called
Directed Sparse Sampling, and employ it in a single end-to-end CNN based
detection model. This methodology extends and formalizes previous
state-of-the-art detection models with an additional emphasis on high
evaluation rates and reduced manual engineering. The resulting model is scene
adaptive, does not require manually defined reference bounding boxes and
produces highly competitive results on MSCOCO, Pascal VOC 2007 and Pascal VOC
2012 with real-time evaluation rates. Further analysis suggests our model
performs particularly well when fine-grained object localization is desirable.
We argue that this advantage stems from the much larger set of available
regions-of-interest relative to other methods.
Alireza Fathi, Zbigniew Wojna, Vivek Rathod, Peng Wang, Hyun Oh Song, Sergio Guadarrama, Kevin P. Murphy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a new method for semantic instance segmentation, by first
computing how likely two pixels are to belong to the same object, and then by
grouping similar pixels together. Our similarity metric is based on a deep,
fully convolutional embedding model. Our grouping method is based on selecting
all points that are sufficiently similar to a set of “seed points”, chosen from
a deep, fully convolutional scoring model. We show competitive results on the
Pascal VOC instance segmentation benchmark.
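One plausible reading of the grouping step, sketched with toy numpy data (not the authors' code): pixels whose embeddings are sufficiently similar to a seed-point embedding are collected into one instance mask per seed.

import numpy as np

rng = np.random.default_rng(0)
H, W, D = 32, 32, 8
embeddings = rng.normal(size=(H, W, D))            # toy per-pixel embeddings
embeddings /= np.linalg.norm(embeddings, axis=-1, keepdims=True)

seeds = [(5, 5), (20, 25)]                          # pixels that a scoring model might pick
threshold = 0.7
masks = []
for (r, c) in seeds:
    seed_vec = embeddings[r, c]
    sim = embeddings @ seed_vec                     # cosine similarity (unit vectors)
    masks.append(sim > threshold)                   # all pixels similar enough to this seed
print([int(m.sum()) for m in masks])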
Kiana Ehsani, Roozbeh Mottaghi, Ali Farhadi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Objects often occlude each other in scenes; inferring their appearance beyond
their visible parts plays an important role in scene understanding, depth
estimation, object interaction and manipulation. In this paper, we study the
challenging problem of completing the appearance of occluded objects. Doing so
requires knowing which pixels to paint (segmenting the invisible parts of
objects) and what color to paint them (generating the invisible parts). Our
proposed novel solution, SeGAN, jointly optimizes for both segmentation and
generation of the invisible parts of objects. Our experimental results show
that: (a) SeGAN can learn to generate the appearance of the occluded parts of
objects; (b) SeGAN outperforms state-of-the-art segmentation baselines for the
invisible parts of objects; (c) trained on synthetic photo realistic images,
SeGAN can reliably segment natural images; (d) by reasoning about occluder
occludee relations, our method can infer depth layering.
Ali Y. Mutlu, Volkan Kılıç, Gizem K. Özdemir, Abdullah Bayram, Nesrin Horzum, Mehmet E. Solmaz
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We report the application of machine learning to smartphone-based
colorimetric detection of pH values. The strip images were used as the training
set for Least Squares-Support Vector Machine (LS-SVM) classifier algorithms
that were able to successfully classify the distinct pH values. The difference
in the obtained image formats was found not to significantly affect the
performance of the proposed machine learning approach. Moreover, the influence
of the illumination conditions on the perceived color of pH strips was
investigated, and further experiments were carried out to study the effect of color
change on the learning model. Test results on JPEG, RAW and RAW-corrected image
formats captured in different lighting conditions led to perfect
classification accuracy, sensitivity and specificity, which shows that
colorimetric detection using machine learning-based systems is able to adapt to
various experimental conditions and is a strong candidate for smartphone-based
sensing in paper-based colorimetric assays.
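A hedged sketch of the classification step with made-up strip colours; scikit-learn's standard SVC stands in here for the LS-SVM used in the paper, and the colour model is purely illustrative.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_class, ph_values = 30, [4, 5, 6, 7, 8]
X, y = [], []
for ph in ph_values:
    base = np.array([200 - 15 * ph, 80 + 10 * ph, 60 + 12 * ph])   # made-up mean RGB per pH
    X.append(base + rng.normal(0, 5, size=(n_per_class, 3)))       # lighting / sensor noise
    y += [ph] * n_per_class
X, y = np.vstack(X), np.array(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))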
Jinsong Zhang, Jean-François Lalonde
Comments: 8 pages + 2 pages of citations, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Outdoor lighting has extremely high dynamic range. This makes the process of
capturing outdoor environment maps notoriously challenging since special
equipment must be used. In this work, we propose an alternative approach. We
first capture lighting with a regular, LDR omnidirectional camera, and aim to
recover the HDR after the fact via a novel, learning-based tonemapping method.
We propose a deep autoencoder framework which regresses linear, high dynamic
range data from non-linear, saturated, low dynamic range panoramas. We validate
our method through a wide set of experiments on synthetic data, as well as on a
novel dataset of real photographs with ground truth. Our approach finds
applications in a variety of settings, ranging from outdoor light capture to
image matching.
Edward Boyda, Colin McCormick, Dan Hammer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present an algorithm capable of identifying a wide variety of
human-induced change on the surface of the planet by analyzing matches between
local features in time-sequenced remote sensing imagery. We evaluate feature
sets, match protocols, and the statistical modeling of feature matches. With
application of KAZE features, k-nearest-neighbor descriptor matching, and
geometric proximity and bi-directional match consistency checks, average match
rates increase more than two-fold over the previous standard. In testing our
platform, we developed a small, labeled benchmark dataset expressing
large-scale residential, industrial, and civic construction, along with null
instances, in California between the years 2010 and 2012. On the benchmark set,
our algorithm makes precise, accurate change proposals on two-thirds of scenes.
Further, the detection threshold can be tuned so that all or almost all
proposed detections are true positives.
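A sketch of the matching pipeline described above using OpenCV, with synthetic images standing in for the time-sequenced remote sensing scenes; the ratio threshold and preprocessing are illustrative only, not the paper's settings.

import cv2
import numpy as np

rng = np.random.default_rng(0)
img1 = cv2.GaussianBlur((rng.random((256, 256)) * 255).astype(np.uint8), (3, 3), 0)
img2 = cv2.GaussianBlur(img1, (5, 5), 1.0)           # "later" image: same scene, slightly changed

kaze = cv2.KAZE_create()
kp1, des1 = kaze.detectAndCompute(img1, None)
kp2, des2 = kaze.detectAndCompute(img2, None)

bf = cv2.BFMatcher(cv2.NORM_L2)
fwd = bf.knnMatch(des1, des2, k=2)                    # two nearest neighbours per descriptor
bwd = {m.queryIdx: m.trainIdx for m, _ in bf.knnMatch(des2, des1, k=2)}

good = [m for m, n in fwd
        if m.distance < 0.75 * n.distance             # Lowe-style ratio test
        and bwd.get(m.trainIdx) == m.queryIdx]        # bi-directional consistency check
print(len(kp1), len(kp2), "keypoints;", len(good), "consistent matches")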
Lingyu Lyu, Mehmed Kantardzic
Comments: 8 pages, 13 figures, the paper is accepted by ICCSE 2016
Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
With the popularity of massive open online courses, grading through
crowdsourcing has become a prevalent approach for large-scale classes.
However, for complex tasks, which require specific skills and effort to grade,
crowdsourcing is constrained by the insufficient knowledge of the workers in
the crowd. Because of this knowledge limitation of the crowd graders, grading
based on partial perspectives becomes a major challenge when evaluating complex
tasks through crowdsourcing, especially for tasks that not only need specific
knowledge for grading but also should be graded as a whole rather than being
decomposed into smaller and simpler subtasks. We propose a framework for
grading complex tasks via multiple views, which are different grading
perspectives defined by experts for the task, to provide uniformity. An
aggregation algorithm based on grader variances is used to combine the grades
for each view. We also detect bias patterns of the graders and debias them with
respect to each view of the task. A bias pattern determines how behavior is
biased among graders and is detected by a statistical technique. The proposed
approach is analyzed on a synthetic data set. We show that our model gives more
accurate results compared to grading approaches without multiple views and
debiasing.
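One plausible reading of variance-based aggregation, sketched with toy numbers (the paper's exact algorithm may differ): each grader is weighted inversely to their variance within a view, and the per-view aggregates are then averaged into a final grade.

import numpy as np

# grades[v][g] = grade that grader g gave on view v of one submission (toy numbers)
grades = np.array([
    [7.0, 8.0, 6.5],    # view 1, e.g. "correctness"
    [6.0, 9.0, 7.0],    # view 2, e.g. "clarity"
])
# variance of each grader across many submissions, per view (toy numbers)
grader_var = np.array([
    [0.5, 2.0, 1.0],
    [0.8, 2.5, 0.6],
])

weights = 1.0 / grader_var
weights /= weights.sum(axis=1, keepdims=True)        # normalise within each view
per_view = (weights * grades).sum(axis=1)             # aggregated grade per view
print("per-view grades:", per_view, "final grade:", per_view.mean())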
Alejandro Ramos-Soto, Jose M. Alonso, Ehud Reiter, Kees van Deemter, Albert Gatt
Comments: Conference paper: Accepted for FUZZIEEE-2017. One column version for arXiv (8 pages)
Subjects: Artificial Intelligence (cs.AI)
We present a novel heuristic approach that defines fuzzy geographical
descriptors using data gathered from a survey with human subjects. The
participants were asked to provide graphical interpretations of the descriptors
'north' and 'south' for the Galician region (Spain). Based on these
interpretations, our approach builds fuzzy descriptors that are able to compute
membership degrees for geographical locations. We evaluated our approach in
terms of efficiency and precision. The fuzzy descriptors are meant to be used
as the cornerstones of a geographical referring expression generation algorithm
that is able to linguistically characterize geographical locations and regions.
This work is also part of a general research effort that intends to establish a
methodology which reunites the empirical studies traditionally practiced in
data-to-text and the use of fuzzy sets to model imprecision and vagueness in
words and expressions for text generation purposes.
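As a toy illustration only (the survey-derived Galician descriptors are not reproduced here), a fuzzy membership function for 'north' over latitude might be sketched as follows.

def north_membership(lat, lower=42.4, upper=43.0):
    """Membership degree for 'north': 0 below `lower`, 1 above `upper`, linear in between (toy bounds)."""
    if lat <= lower:
        return 0.0
    if lat >= upper:
        return 1.0
    return (lat - lower) / (upper - lower)

for lat in (42.2, 42.6, 43.2):   # sample latitudes around the region
    print(lat, "->", round(north_membership(lat), 2))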
Katharina Eggensperger, Marius Lindauer, Holger H. Hoos, Frank Hutter, Kevin Leyton-Brown
Subjects: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
The optimization of algorithm (hyper-)parameters is crucial for achieving
peak performance across a wide range of domains, ranging from deep neural
networks to solvers for hard combinatorial problems. The resulting algorithm
configuration (AC) problem has attracted much attention from the machine
learning community. However, the proper evaluation of new AC procedures is
hindered by two key hurdles. First, AC benchmarks are hard to set up. Second
and even more significantly, they are computationally expensive: a single run
of an AC procedure involves many costly runs of the target algorithm whose
performance is to be optimized in a given AC benchmark scenario. One common
workaround is to optimize cheap-to-evaluate artificial benchmark functions
(e.g., Branin) instead of actual algorithms; however, these have different
properties than realistic AC problems. Here, we propose an alternative
benchmarking approach that is similarly cheap to evaluate but much closer to
the original AC problem: replacing expensive benchmarks by surrogate benchmarks
constructed from AC benchmarks. These surrogate benchmarks approximate the
response surface corresponding to true target algorithm performance using a
regression model, and the original and surrogate benchmark share the same
(hyper-)parameter space. In our experiments, we construct and evaluate
surrogate benchmarks for hyperparameter optimization as well as for AC problems
that involve performance optimization of solvers for hard combinatorial
problems, drawing training data from the runs of existing AC procedures. We
show that our surrogate benchmarks capture overall important characteristics of
the AC scenarios, such as high- and low-performing regions, from which they
were derived, while being much easier to use and orders of magnitude cheaper to
evaluate.
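A minimal sketch of the surrogate idea with synthetic data: a regression model is fit on (configuration, measured runtime) pairs gathered from earlier runs and then queried instead of the expensive target algorithm. The paper's actual regression models and AC scenarios are not reproduced here.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
configs = rng.random((500, 4))                                   # 4 hypothetical hyperparameters
runtimes = 10 * (configs[:, 0] - 0.3) ** 2 + configs[:, 1] + rng.normal(0, 0.1, 500)

surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(configs, runtimes)

def surrogate_benchmark(config):
    """Cheap stand-in for running the target algorithm with this configuration."""
    return surrogate.predict(np.asarray(config).reshape(1, -1))[0]

print("predicted runtime:", surrogate_benchmark([0.3, 0.1, 0.5, 0.5]))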
Denghui Zhang, Manling Li, Yantao Jia, Yuanzhuo Wang
Subjects: Artificial Intelligence (cs.AI)
Knowledge graph embedding aims to embed entities and relations of knowledge
graphs into low-dimensional vector spaces. Translating embedding methods regard
relations as the translation from head entities to tail entities, which achieve
the state-of-the-art results among knowledge graph embedding methods. However,
a major limitation of these methods is the time-consuming training process,
which may take several days or even weeks for large knowledge graphs, and
which causes great difficulty in practical applications. In this paper, we propose
an efficient parallel framework for translating embedding methods, called
ParTrans-X, which enables the methods to be parallelized without locks by
utilizing the distinctive structures of knowledge graphs. Experiments on two
datasets with three typical translating embedding methods, i.e., TransE [3],
TransH [17], and a more efficient variant TransE-AdaGrad [10], validate that
ParTrans-X can speed up the training process by more than an order of
magnitude.
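For context, a serial toy sketch of the translating-embedding idea that these methods share (a triple is scored by how close h + r is to t, and embeddings are nudged by a margin loss); ParTrans-X's lock-free parallel scheme itself is not shown.

import numpy as np

rng = np.random.default_rng(0)
n_ent, n_rel, dim, margin, lr = 100, 10, 16, 1.0, 0.01
E = rng.normal(size=(n_ent, dim))             # entity embeddings
R = rng.normal(size=(n_rel, dim))             # relation embeddings

def score(h, r, t):
    return np.linalg.norm(E[h] + R[r] - E[t])   # smaller means more plausible

h, r, t, t_neg = 3, 2, 7, 42                  # a toy "true" triple and a corrupted tail
loss = max(0.0, margin + score(h, r, t) - score(h, r, t_neg))
if loss > 0:                                   # only the positive-triple part of the update is shown
    grad = (E[h] + R[r] - E[t]) / (score(h, r, t) + 1e-9)
    E[h] -= lr * grad
    R[r] -= lr * grad
    E[t] += lr * grad
print("margin loss:", round(loss, 3))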
Mark O. Riedl, Brent Harrison
Comments: 7 pages, 1 figure
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)
Robots and autonomous systems that operate around humans will likely always
rely on kill switches that stop their execution and allow them to be
remote-controlled for the safety of humans or to prevent damage to the system.
It is theoretically possible for an autonomous system with sufficient sensor
and effector capability and using reinforcement learning to learn that the kill
switch deprives it of long-term reward and learn to act to disable the switch
or otherwise prevent a human operator from using the switch. This is referred
to as the big red button problem. We present a technique which prevents a
reinforcement learning agent from learning to disable the big red button. Our
technique interrupts the agent or robot by placing it in a virtual simulation
where it continues to receive reward. We illustrate our technique in a simple
grid world environment.
Aram Ter-Sarkisov, Robert Ross, John Kelleher
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
This paper introduces a new approach to the long-term tracking of an object
in a challenging environment. The object is a cow and the environment is an
enclosure in a cowshed. Some of the key challenges in this domain are a
cluttered background, low contrast and high similarity between moving objects
which greatly reduces the efficiency of most existing approaches, including
those based on background subtraction. Our approach is split into object
localization, instance segmentation, learning and tracking stages. Our solution
is compared to a range of semi-supervised object tracking algorithms and we
show that the performance is strong and well suited to subsequent analysis. We
present our solution as a first step towards broader tracking and behavior
monitoring for cows in precision agriculture with the ultimate objective of
early detection of lameness.
Srijan Kumar, Bryan Hooi, Disha Makhija, Mohit Kumar, Christos Faloutsos, V.S. Subrahamanian
Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Rating platforms enable large-scale collection of user opinion about items
(products, other users, etc.). However, many untrustworthy users give
fraudulent ratings for excessive monetary gain. In this paper, we present
FairJudge, a system to identify such fraudulent users. We propose three
metrics: (i) the fairness of a user that quantifies how trustworthy the user is
in rating the products, (ii) the reliability of a rating that measures how
reliable the rating is, and (iii) the goodness of a product that measures the
quality of the product. Intuitively, a user is fair if it provides reliable
ratings that are close to the goodness of the product. We formulate a mutually
recursive definition of these metrics, and further address cold start problems
and incorporate behavioral properties of users and products in the formulation.
We propose an iterative algorithm, FairJudge, to predict the values of the
three metrics. We prove that FairJudge is guaranteed to converge in a bounded
number of iterations, with linear time complexity. By conducting five different
experiments on five rating platforms, we show that FairJudge significantly
outperforms nine existing algorithms in predicting fair and unfair users. We
reported the 100 most unfair users in the Flipkart network to their review
fraud investigators, and 80 users were correctly identified (80% accuracy). The
FairJudge algorithm is already being deployed at Flipkart.
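A toy sketch of the mutually recursive intuition stated above (not the published FairJudge formulation, which also handles cold start and behavioral properties): fairness, reliability and goodness are updated in turn until they stabilize.

import numpy as np

# (user, product, rating in [-1, 1]) toy edges
edges = [(0, 0, 1.0), (0, 1, 0.8), (1, 0, 1.0), (1, 1, 0.9), (2, 0, -1.0), (2, 1, -1.0)]
n_users, n_products = 3, 2

fairness = np.ones(n_users)
goodness = np.zeros(n_products)
reliability = {e: 1.0 for e in edges}

for _ in range(20):
    for p in range(n_products):                 # goodness: reliability-weighted mean rating
        num = sum(reliability[e] * e[2] for e in edges if e[1] == p)
        den = sum(reliability[e] for e in edges if e[1] == p) + 1e-9
        goodness[p] = num / den
    for e in edges:                              # reliability: fair rater, rating close to goodness
        u, p, r = e
        reliability[e] = 0.5 * (fairness[u] + 1 - abs(r - goodness[p]) / 2)
    for u in range(n_users):                     # fairness: mean reliability of the user's ratings
        fairness[u] = float(np.mean([reliability[e] for e in edges if e[0] == u]))

print("fairness:", fairness.round(2), "goodness:", goodness.round(2))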
Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
While strong progress has been made in image captioning in recent years,
machine and human captions are still quite distinct. A closer look reveals that
this is due to the deficiencies in the generated word distribution, vocabulary
size, and strong bias in the generators towards frequent captions. Furthermore,
humans — rightfully so — generate multiple, diverse captions, due to the
inherent ambiguity in the captioning task which is not considered in today’s
systems.
To address these challenges, we change the training objective of the caption
generator from reproducing groundtruth captions to generating a set of captions
that is indistinguishable from human generated captions. Instead of
handcrafting such a learning target, we employ adversarial training in
combination with an approximate Gumbel sampler to implicitly match the
generated distribution to the human one. While our method achieves comparable
performance to the state-of-the-art in terms of the correctness of the
captions, we generate a set of diverse captions that are significantly less
biased and match the word statistics better in several aspects.
Andrea Soltoggio, Kenneth O. Stanley, Sebastian Risi
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Biological neural networks are systems of extraordinary computational
capabilities shaped by evolution, development, and lifetime learning. The
interplay of these elements leads to the emergence of adaptive behavior and
intelligence, but the complexity of the whole system of interactions is an
obstacle to the understanding of the key factors at play. Inspired by such
intricate natural phenomena, Evolved Plastic Artificial Neural Networks
(EPANNs) use simulated evolution in-silico to breed plastic neural networks,
artificial systems composed of sensors, outputs, and plastic components that
change in response to sensory-output experiences in an environment. These
systems may reveal key algorithmic ingredients of adaptation, autonomously
discover novel adaptive algorithms, and lead to hypotheses on the emergence of
biological adaptation. EPANNs have seen considerable progress over the last two
decades. Current scientific and technological advances in artificial neural
networks are now setting the conditions for radically new approaches and
results. In particular, the limitations of hand-designed structures and
algorithms currently used in most deep neural networks could be overcome by
more flexible and innovative solutions. This paper brings together a variety of
inspiring ideas that define the field of EPANNs. The main computational methods
and results are reviewed. Finally, new opportunities and developments are
presented.
Dale McConachie, Dmitry Berenson
Comments: Presented at the Workshop on the Algorithmic Foundations of Robotics, 2016, San Francisco, CA
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
We present a novel approach to deformable object manipulation that does not
rely on highly-accurate modeling. The key contribution of this paper is to
formulate the task as a Multi-Armed Bandit problem, with each arm representing
a model of the deformable object. To “pull” an arm and evaluate its utility, we
use the arm’s model to generate a velocity command for the gripper(s) holding
the object and execute it. As the task proceeds and the object deforms, the
utility of each model can change. Our framework estimates these changes and
balances exploration of the model set with exploitation of high-utility models.
We also propose an approach based on Kalman Filtering for Non-stationary
Multi-armed Normal Bandits (KF-MANB) to leverage the coupling between models to
learn more from each arm pull. We demonstrate that our method outperforms
previous methods on synthetic trials, and performs competitively on several
manipulation tasks in simulation.
Dimitrios Kartsaklis, Sanjaye Ramgoolam, Mehrnoosh Sadrzadeh
Comments: 32 pages, 3 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); High Energy Physics – Theory (hep-th); Combinatorics (math.CO)
Recent research in computational linguistics has developed algorithms which
associate matrices with adjectives and verbs, based on the distribution of
words in a corpus of text. These matrices are linear operators on a vector
space of context words. They are used to construct the meaning of composite
expressions from that of the elementary constituents, forming part of a
compositional distributional approach to semantics. We propose a Matrix Theory
approach to this data, based on permutation symmetry along with Gaussian
weights and their perturbations. A simple Gaussian model is tested against word
matrices created from a large corpus of text. We characterize the cubic and
quartic departures from the model, which we propose, alongside the Gaussian
parameters, as signatures for comparison of linguistic corpora. We propose that
perturbed Gaussian models with permutation symmetry provide a promising
framework for characterizing the nature of universality in the statistical
properties of word matrices. The matrix theory framework developed here
exploits the view of statistics as zero dimensional perturbative quantum field
theory. It perceives language as a physical system realizing a universality
class of matrix statistics characterized by permutation symmetry.
A. Mani
Comments: 57 pages. This paper is scheduled to appear as two separate papers (because of length) with some overlap and enhancements
Subjects: Logic (math.LO); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Logic in Computer Science (cs.LO)
In one perspective, the central problem pursued in this research is that of
the inverse problem in the context of general rough sets. The problem is about
the existence of rough basis for given approximations in a context. Granular
operator spaces were recently introduced by the present author as an optimal
framework for anti-chain based algebraic semantics of general rough sets and
the inverse problem. In the framework, various subtypes of crisp and non-crisp
objects are identifiable that may be missed in more restrictive formalisms. This
is also because in the latter cases the concepts of complementation and negation
are taken for granted. This opens the door for a general approach to
dialectical rough sets building on previous work of the present author and on
figures of opposition. In this paper dialectical rough logics are developed
from a semantic perspective, the concept of dialectical predicates is formalized,
connections with dialetheias and glutty negation are established, and parthood is
analyzed and studied from the point of view of classical and dialectical figures of
opposition. Potential semantics through dialectical counting based on these
figures are proposed, building on earlier work by the present author. Her
methods become more geometrical and encompass parthood as a primary relation
(as opposed to roughly equivalent objects) for algebraic semantics. Dialectical
counting strategies over antichains (a specific form of dialectical structure)
for semantics are also proposed.
Besnik Fetahu, Ujwal Gadiraju, Stefan Dietze
Subjects: Information Retrieval (cs.IR)
The increasing amount of data on the Web, in particular of Linked Data, has
led to a diverse landscape of datasets, which make entity retrieval a
challenging task. Explicit cross-dataset links, for instance to indicate
co-references or related entities, can significantly improve entity retrieval.
However, only a small fraction of entities are interlinked through explicit
statements. In this paper, we propose a two-fold entity retrieval approach. In
a first, offline preprocessing step, we cluster entities based on the
x-means and spectral clustering algorithms. In the second step,
we propose an optimized retrieval model which takes advantage of our
precomputed clusters. For a given set of entities retrieved by the BM25F
retrieval approach and a given user query, we further expand the result set
with relevant entities by considering features of the queries, entities and the
precomputed clusters. Finally, we re-rank the expanded result set with respect
to the relevance to the query. We perform a thorough experimental evaluation on
the Billions Triple Challenge (BTC12) dataset. The proposed approach shows
significant improvements compared to the baseline and state of the art
approaches.
Besnik Fetahu, Abhijit Anand, Avishek Anand
Subjects: Information Retrieval (cs.IR)
Wikipedia, rich in entities and events, is an invaluable resource for various
knowledge harvesting, extraction and mining tasks. Numerous resources like
DBpedia, YAGO and other knowledge bases are based on extracting entity and
event based knowledge from it. Online news, on the other hand, is an
authoritative and rich source for emerging entities, events and facts relating
to existing entities. In this work, we study the creation of entities in
Wikipedia with respect to news by studying how entity and event based
information flows from news to Wikipedia.
We analyze the lag of Wikipedia (based on the revision history of the English
Wikipedia) against 20 years of The New York Times dataset (NYT). We model
and analyze the lag of entities and events, namely their first appearance in
Wikipedia and in NYT, respectively. In our extensive experimental analysis, we
find that almost 20% of the external references in entity pages are news
articles, encoding the importance of news to Wikipedia. Second, we observe that
the entity-based lag follows a normal distribution with a high standard
deviation, whereas the lag for news-based events is typically very low.
Finally, we find that events are responsible for the creation of emergent entities,
with as many as 12% of the entities mentioned in an event page being created
after the creation of the event page.
Besnik Fetahu, Katja Markert, Avishek Anand
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Wikipedia entity pages are a valuable source of information for direct
consumption and for knowledge-base construction, update and maintenance. Facts
in these entity pages are typically supported by references. Recent studies
show that as much as 20% of the references are from online news sources.
However, many entity pages are incomplete even if relevant information is
already available in existing news articles. Even for the already present
references, there is often a delay between the news article publication time
and the reference time. In this work, we therefore look at Wikipedia through
the lens of news and propose a novel news-article suggestion task to improve
news coverage in Wikipedia, and reduce the lag of newsworthy references. Our
work finds direct application, as a precursor, to Wikipedia page generation and
knowledge-base acceleration tasks that rely on relevant and high quality input
sources.
We propose a two-stage supervised approach for suggesting news articles to
entity pages for a given state of Wikipedia. First, we suggest news articles to
Wikipedia entities (article-entity placement) relying on a rich set of features
which take into account the salience and relative authority of
entities, and the novelty of news articles to entity pages. Second, we
determine the exact section in the entity page for the input article
(article-section placement) guided by class-based section templates. We perform
an extensive evaluation of our approach based on ground-truth data that is
extracted from external references in Wikipedia. We achieve a high precision
value of up to 93% in the article-entity suggestion stage and up to 84%
for the article-section placement. Finally, we compare our approach
against competitive baselines and show significant improvements.
Besnik Fetahu, Katja Markert, Wolfgang Nejdl, Avishek Anand
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Social and Information Networks (cs.SI)
An important editing policy in Wikipedia is to provide citations for added
statements in Wikipedia pages, where statements can be arbitrary pieces of
text, ranging from a sentence to a paragraph. In many cases citations are
either outdated or missing altogether.
In this work we address the problem of finding and updating news citations
for statements in entity pages. We propose a two-stage supervised approach for
this problem. In the first step, we construct a classifier to find out whether
statements need a news citation or other kinds of citations (web, book,
journal, etc.). In the second step, we develop a news citation algorithm for
Wikipedia statements, which recommends appropriate citations from a given news
collection. Apart from IR techniques that use the statement to query the news
collection, we also formalize three properties of an appropriate citation,
namely: (i) the citation should entail the Wikipedia statement, (ii) the
statement should be central to the citation, and (iii) the citation should be
from an authoritative source.
We perform an extensive evaluation of both steps, using 20 million articles
from a real-world news collection. Our results are quite promising, and show
that we can perform this task with high precision and at scale.
Dimitrios Kartsaklis, Sanjaye Ramgoolam, Mehrnoosh Sadrzadeh
Comments: 32 pages, 3 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); High Energy Physics – Theory (hep-th); Combinatorics (math.CO)
Recent research in computational linguistics has developed algorithms which
associate matrices with adjectives and verbs, based on the distribution of
words in a corpus of text. These matrices are linear operators on a vector
space of context words. They are used to construct the meaning of composite
expressions from that of the elementary constituents, forming part of a
compositional distributional approach to semantics. We propose a Matrix Theory
approach to this data, based on permutation symmetry along with Gaussian
weights and their perturbations. A simple Gaussian model is tested against word
matrices created from a large corpus of text. We characterize the cubic and
quartic departures from the model, which we propose, alongside the Gaussian
parameters, as signatures for comparison of linguistic corpora. We propose that
perturbed Gaussian models with permutation symmetry provide a promising
framework for characterizing the nature of universality in the statistical
properties of word matrices. The matrix theory framework developed here
exploits the view of statistics as zero dimensional perturbative quantum field
theory. It perceives language as a physical system realizing a universality
class of matrix statistics characterized by permutation symmetry.
Will Monroe, Robert X.D. Hawkins, Noah D. Goodman, Christopher Potts
Comments: 12 pages, 3 tables, 5 figures. To appear in TACL (pre-camera-ready draft)
Subjects: Computation and Language (cs.CL)
We present a model of pragmatic referring expression interpretation in a
grounded communication task (identifying colors from descriptions) that draws
upon predictions from two recurrent neural network classifiers, a speaker and a
listener, unified by a recursive pragmatic reasoning framework. Experiments
show that this combined pragmatic model interprets color descriptions more
accurately than the classifiers from which it is built. We observe that
pragmatic reasoning helps primarily in the hardest cases: when the model must
distinguish very similar colors, or when few utterances adequately express the
target color. Our findings make use of a newly-collected corpus of human
utterances in color reference games, which exhibit a variety of pragmatic
behaviors. We also show that the embedded speaker model reproduces many of
these pragmatic behaviors.
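A toy sketch of the recursive pragmatic reasoning step (in the RSA style) with hand-picked tables standing in for the paper's neural speaker and listener: the pragmatic listener scores each colour in proportion to how likely a speaker would be to choose the observed utterance for that colour.

import numpy as np

utterances = ["blue", "teal", "bright blue"]
colours = ["navy", "teal", "cyan"]

# literal listener: P(colour | utterance), rows sum to 1 (toy values)
literal = np.array([[0.5, 0.3, 0.2],
                    [0.1, 0.7, 0.2],
                    [0.2, 0.2, 0.6]])

alpha = 1.0
speaker = np.exp(alpha * np.log(literal + 1e-9))       # S(utterance | colour), unnormalised
speaker /= speaker.sum(axis=0, keepdims=True)           # normalise over utterances per colour

prior = np.ones(len(colours)) / len(colours)
pragmatic = speaker * prior                              # L(colour | utt) proportional to S(utt | colour) P(colour)
pragmatic /= pragmatic.sum(axis=1, keepdims=True)

print("P(colour | 'bright blue'):", dict(zip(colours, pragmatic[2].round(2))))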
Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
While strong progress has been made in image captioning in recent years,
machine and human captions are still quite distinct. A closer look reveals that
this is due to the deficiencies in the generated word distribution, vocabulary
size, and strong bias in the generators towards frequent captions. Furthermore,
humans — rightfully so — generate multiple, diverse captions, due to the
inherent ambiguity in the captioning task which is not considered in today’s
systems.
To address these challenges, we change the training objective of the caption
generator from reproducing groundtruth captions to generating a set of captions
that is indistinguishable from human generated captions. Instead of
handcrafting such a learning target, we employ adversarial training in
combination with an approximate Gumbel sampler to implicitly match the
generated distribution to the human one. While our method achieves comparable
performance to the state-of-the-art in terms of the correctness of the
captions, we generate a set of diverse captions that are significantly less
biased and match the word statistics better in several aspects.
Lior Fritz, David Burshtein
Comments: Submitted to Interspeech 2017
Subjects: Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
A hybrid of a hidden Markov model (HMM) and a deep neural network (DNN) is
considered. End-to-end training using gradient descent is suggested, similarly
to the training of connectionist temporal classification (CTC). We use a
maximum a-posteriori (MAP) criterion with a simple language model in the
training stage, and a standard HMM decoder without approximations. Recognition
results are presented using speech databases. Our method compares favorably to
CTC in terms of performance, robustness and quality of alignments.
Besnik Fetahu, Katja Markert, Avishek Anand
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Wikipedia entity pages are a valuable source of information for direct
consumption and for knowledge-base construction, update and maintenance. Facts
in these entity pages are typically supported by references. Recent studies
show that as much as 20% of the references are from online news sources.
However, many entity pages are incomplete even if relevant information is
already available in existing news articles. Even for the already present
references, there is often a delay between the news article publication time
and the reference time. In this work, we therefore look at Wikipedia through
the lens of news and propose a novel news-article suggestion task to improve
news coverage in Wikipedia, and reduce the lag of newsworthy references. Our
work finds direct application, as a precursor, to Wikipedia page generation and
knowledge-base acceleration tasks that rely on relevant and high quality input
sources.
We propose a two-stage supervised approach for suggesting news articles to
entity pages for a given state of Wikipedia. First, we suggest news articles to
Wikipedia entities (article-entity placement) relying on a rich set of features
which take into account the salience and relative authority of
entities, and the novelty of news articles to entity pages. Second, we
determine the exact section in the entity page for the input article
(article-section placement) guided by class-based section templates. We perform
an extensive evaluation of our approach based on ground-truth data that is
extracted from external references in Wikipedia. We achieve a high precision
value of up to 93% in the article-entity suggestion stage and up to 84%
for the article-section placement. Finally, we compare our approach
against competitive baselines and show significant improvements.
Besnik Fetahu, Katja Markert, Wolfgang Nejdl, Avishek Anand
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Social and Information Networks (cs.SI)
An important editing policy in Wikipedia is to provide citations for added
statements in Wikipedia pages, where statements can be arbitrary pieces of
text, ranging from a sentence to a paragraph. In many cases citations are
either outdated or missing altogether.
In this work we address the problem of finding and updating news citations
for statements in entity pages. We propose a two-stage supervised approach for
this problem. In the first step, we construct a classifier to find out whether
statements need a news citation or other kinds of citations (web, book,
journal, etc.). In the second step, we develop a news citation algorithm for
Wikipedia statements, which recommends appropriate citations from a given news
collection. Apart from IR techniques that use the statement to query the news
collection, we also formalize three properties of an appropriate citation,
namely: (i) the citation should entail the Wikipedia statement, (ii) the
statement should be central to the citation, and (iii) the citation should be
from an authoritative source.
We perform an extensive evaluation of both steps, using 20 million articles
from a real-world news collection. Our results are quite promising, and show
that we can perform this task with high precision and at scale.
Miguel E. Coimbra, Alexandre P. Francisco, Luis Veiga
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
Community network micro-clouds (CNMCs) have seen increasing adoption over the last
fifteen years. Their members contact nodes which operate Internet proxies, web
servers, user file storage and video streaming services, to name a few.
Detecting communities of nodes with properties (such as co-location) and
assessing node eligibility for service placement is thus a key factor in
optimizing the experience of users. We present an approach for community
finding using a label propagation graph algorithm to address the
multi-objective challenge of optimizing service placement in CNMCs. Herein we:
i) highlight the applicability of leader election heuristics which are
important for service placement in community networks and scheduler-dependent
scenarios; ii) present a novel decentralized solution designed as a scalable
alternative for the problem of service placement, which has mostly seen
computational approaches based on centralization.
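A minimal sketch of the community-finding step using networkx's label propagation on a stand-in graph; the paper's multi-objective service-placement and leader-election logic is not shown.

import networkx as nx
from networkx.algorithms.community import label_propagation_communities

G = nx.karate_club_graph()                     # stand-in for a community network topology
communities = list(label_propagation_communities(G))
for i, nodes in enumerate(communities):
    print(f"community {i}: {sorted(nodes)}")   # candidate groups for co-located services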
Antonio Tadeu A. Gomes, Weslley S. Pereira, Frederic Valentin, Diego Paredes
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (math.NA)
The family of Multiscale Hybrid-Mixed (MHM) finite element methods has
received considerable attention from the mathematics and engineering community
in the last few years. The MHM methods allow solving highly heterogeneous
problems on coarse meshes while providing solutions with high-order precision.
The methods embed independent local problems which are responsible for upscaling
unresolved scales into the numerical solution. These local contributions are
brought together through a global problem defined on the skeleton of the coarse
partition. Since the local problems are completely independent, they can be
easily computed in parallel. In this paper, we present two simulator prototypes
specifically crafted for the MHM methods, which adopt two different
implementation strategies: (i) a multi-programming language approach, each
language tackling different simulation issues; and (ii) a classical,
single-programming language approach. Specifically, we use C++ for numerical
computation of the global and local problems in a modular way; for process
distribution in the simulator, we adopt the Erlang concurrent language in the
first approach, and the MPI standard in the second approach. The aim of
exploring these different approaches is twofold: (i) allow for the deployment
of the simulator both in high-performance computing (with MPI) and in cloud
computing environments (with Erlang); and (ii) pave the way for further
exploration of quality attributes related to software productivity and
fault-tolerance, which are key to Exascale systems. We present a performance
evaluation of the two simulator prototypes taking into account their
efficiency.
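A minimal sketch of the single-language strategy: the independent local problems are scattered over MPI ranks, solved in parallel, and gathered to assemble the global skeleton problem. The paper uses C++ with MPI (or Erlang for process distribution); Python with mpi4py is used here only to keep the illustration short, and solve_local/assemble_global are trivial placeholders:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

def solve_local(problem_id):
    # Placeholder for the independent local solve on one coarse element.
    return problem_id ** 2

def assemble_global(contributions):
    # Placeholder for assembling and solving the global skeleton problem.
    return sum(sum(c) for c in contributions)

if rank == 0:
    local_problems = list(range(100))                       # one per coarse element (toy)
    chunks = [local_problems[i::size] for i in range(size)]
else:
    chunks = None

my_problems = comm.scatter(chunks, root=0)
my_results = [solve_local(p) for p in my_problems]          # embarrassingly parallel
gathered = comm.gather(my_results, root=0)

if rank == 0:
    print("global skeleton problem assembled from", size, "ranks:",
          assemble_global(gathered))
```

Run with, e.g., mpirun -n 4 python mhm_sketch.py.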
Sung-Han Lin, Ranjan Pal, Marco Paolieri, Leana Golubchik
Comments: To be published in ICDCS 2017
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Small-scale clouds (SCs) often suffer from resource under-provisioning during
peak demand, leading to an inability to satisfy service level agreements (SLAs)
and consequent loss of customers. One approach to address this problem is for a
set of autonomous SCs to share resources among themselves in a cost-induced
cooperative fashion, thereby increasing their individual capacities (when
needed) without having to significantly invest in more resources. A central
problem (in this context) is how to properly share resources (for a price) to
achieve profitable service while maintaining customer SLAs. To address this
problem, in this paper, we propose the SC-Share framework that utilizes two
interacting models: (i) a stochastic performance model that estimates the
achieved performance characteristics under given SLA requirements, and (ii) a
market-based game-theoretic model that (as shown empirically) converges to
efficient resource sharing decisions at market equilibrium. Our results include
extensive evaluations that illustrate the utility of the proposed framework.
Robert Grandl, Arjun Singhvi, Aditya Akella
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Existing data analytics frameworks are intrinsically compute-centric in
nature. Their computation structure is complex and determined early, and they
take decisions that bind early to this structure. This impacts expressiveness,
job performance, and cluster efficiency.
We present F2, a new analytics framework that separates computation from data
management, making the latter an equal first-class entity. We argue that this
separation enables more flexibility in expressing analytics jobs and enables
data driven optimizations. Furthermore, it enables a new kind of “tasks” with
loose semantics that can multiplex their execution across different sets of
data and multiple jobs.
David Richie, James Ross
Comments: 7 pages, 2 figures, example code, accepted for publication at the 7th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-17) workshop in conjunction with the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS 17)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Programming Languages (cs.PL)
A novel approach is presented to teach the parallel and distributed computing
concepts of synchronization and remote memory access. The single program
multiple data (SPMD) partitioned global address space (PGAS) model presented in
this paper uses a procedural programming language appealing to undergraduate
students. We propose that the amusing nature of the approach may engender
creativity and interest in using these concepts later in more sober environments.
Specifically, we implement parallel extensions to LOLCODE within a
source-to-source compiler sufficient for the development of parallel and
distributed algorithms normally implemented using conventional high-performance
computing languages and APIs.
Joseph Gomes, Bharath Ramsundar, Evan N. Feinberg, Vijay S. Pande
Subjects: Learning (cs.LG); Chemical Physics (physics.chem-ph); Machine Learning (stat.ML)
Empirical scoring functions based on either molecular force fields or
cheminformatics descriptors are widely used, in conjunction with molecular
docking, during the early stages of drug discovery to predict potency and
binding affinity of a drug-like molecule to a given target. These models
require expert-level knowledge of physical chemistry and biology to be encoded
as hand-tuned parameters or features rather than allowing the underlying model
to select features in a data-driven procedure. Here, we develop a general
3-dimensional spatial convolution operation for learning atomic-level chemical
interactions directly from atomic coordinates and demonstrate its application
to structure-based bioactivity prediction. The atomic convolutional neural
network is trained to predict the experimentally determined binding affinity of
a protein-ligand complex by direct calculation of the energy associated with
the complex, protein, and ligand given the crystal structure of the binding
pose. Non-covalent interactions present in the complex that are absent in the
protein-ligand sub-structures are identified and the model learns the
interaction strength associated with these features. We test our model by
predicting the binding free energy of a subset of protein-ligand complexes
found in the PDBBind dataset and compare with state-of-the-art cheminformatics
and machine learning-based approaches. We find that all methods achieve
experimental accuracy and that atomic convolutional networks either outperform
or perform competitively with the cheminformatics based methods. Unlike all
previous protein-ligand prediction systems, atomic convolutional networks are
end-to-end and fully-differentiable. They represent a new data-driven,
physics-based deep learning model paradigm that offers a strong foundation for
future improvements in structure-based bioactivity prediction.
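A hedged sketch of the energy-difference idea (the featurization and network below are toy placeholders, not the paper's atomic convolution): the same network scores the complex, the protein, and the ligand, and the predicted binding affinity is the difference, which remains end-to-end differentiable:

```python
import torch
import torch.nn as nn

class AtomicEnergyNet(nn.Module):
    """Toy per-atom energy model; total energy is a sum of atomic contributions."""
    def __init__(self, n_features=32):
        super().__init__()
        self.atom_net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                      nn.Linear(64, 1))
    def forward(self, atom_features):            # shape (n_atoms, n_features)
        return self.atom_net(atom_features).sum()

net = AtomicEnergyNet()
complex_x = torch.randn(50, 32)                  # placeholder atomic features
protein_x = torch.randn(40, 32)
ligand_x = torch.randn(10, 32)

# Predicted binding energy as the energy of the complex minus its parts;
# this quantity is differentiable, so it can be regressed against
# experimentally measured affinities.
delta_g = net(complex_x) - net(protein_x) - net(ligand_x)
print(delta_g.item())
```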
Jiashi Feng
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
We consider the problems of robust PAC learning from distributed and
streaming data, which may contain malicious errors and outliers, and analyze
their fundamental complexity questions. In particular, we establish lower
bounds on the communication complexity for distributed robust learning
performed on multiple machines, and on the space complexity for robust learning
from streaming data on a single machine. These results demonstrate that gaining
robustness of learning algorithms is usually at the expense of increased
complexities. As far as we know, this work gives the first complexity results
for distributed and online robust PAC learning.
Lior Fritz, David Burshtein
Comments: Submitted to Interspeech 2017
Subjects: Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
A hybrid of a hidden Markov model (HMM) and a deep neural network (DNN) is
considered. End-to-end training using gradient descent is suggested, similarly
to the training of connectionist temporal classification (CTC). We use a
maximum a-posteriori (MAP) criterion with a simple language model in the
training stage, and a standard HMM decoder without approximations. Recognition
results are presented using speech databases. Our method compares favorably to
CTC in terms of performance, robustness and quality of alignments.
Senjian An, Farid Boussaid, Mohammed Bennamoun, Jiankun Hu
Comments: Technical Report
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
In this paper, we introduce transformations of deep rectifier networks,
enabling the conversion of deep rectifier networks into shallow rectifier
networks. We subsequently prove that any rectifier net of any depth can be
represented as the maximum of a number of functions, each realizable by a
shallow network with a single hidden layer. The transformations of both deep
rectifier nets and deep residual nets are conducted to demonstrate the
advantages of the residual nets over the conventional neural nets and the
advantages of the deep neural nets over the shallow neural nets. In summary,
for two rectifier nets with different depths but with the same total number of hidden units, the corresponding single hidden layer representation of the deeper net is much more complex than the corresponding single hidden layer representation of the shallower net. Similarly, for a residual net and a
conventional rectifier net with the same structure except for the skip
connections in the residual net, the corresponding single hidden layer
representation of the residual net is much more complex than the corresponding
single hidden layer representation of the conventional net.
Aram Ter-Sarkisov, Robert Ross, John Kelleher
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
This paper introduces a new approach to the long-term tracking of an object
in a challenging environment. The object is a cow and the environment is an
enclosure in a cowshed. Some of the key challenges in this domain are a
cluttered background, low contrast, and high similarity between moving objects, which greatly reduce the efficiency of most existing approaches, including
those based on background subtraction. Our approach is split into object
localization, instance segmentation, learning and tracking stages. Our solution
is compared to a range of semi-supervised object tracking algorithms and we
show that the performance is strong and well suited to subsequent analysis. We
present our solution as a first step towards broader tracking and behavior
monitoring for cows in precision agriculture with the ultimate objective of
early detection of lameness.
Zhaoqiang Liu, Vincent Y. F. Tan
Comments: 10 pages, 3 figures
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Methodology (stat.ME)
The learning of Gaussian mixture models (GMMs) is a classical problem in
machine learning and applied statistics. This can also be interpreted as a
clustering problem. Indeed, given data samples independently generated from a
GMM, we would like to find the correct target clustering of the samples
according to which Gaussian they were generated from. Despite the large number
of algorithms designed to find the correct target clustering, many
practitioners prefer to use the k-means algorithm because of its simplicity.
k-means tries to find an optimal clustering which minimizes the sum of squared
distances between each point and its cluster center. In this paper, we provide
sufficient conditions for the closeness of any optimal clustering and the
correct target clustering of the samples which are independently generated from
a GMM. Moreover, to achieve significantly faster running time and reduced
memory usage, we show that under weaker conditions on the GMM, any optimal
clustering for the samples with reduced dimensionality is also close to the
correct target clustering. These results provide intuition for the
informativeness of k-means as an algorithm for learning a GMM, further
substantiating the conclusions in Kumar and Kannan [2010]. We verify the
correctness of our theorems using numerical experiments and show, using
datasets with reduced dimensionality, significant speed ups for the time
required to perform clustering.
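A small numerical illustration in the spirit of the result (dimensions, separations, and the use of PCA for the dimensionality reduction are illustrative choices, not the paper's setup): samples are drawn from a well-separated spherical GMM, clustered with k-means in the original and in a reduced dimension, and compared against the generating components:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
means = np.stack([np.zeros(50), 8.0 * np.ones(50), -8.0 * np.ones(50)])
labels_true = rng.integers(0, 3, size=3000)
X = means[labels_true] + rng.normal(size=(3000, 50))      # spherical components

for name, data in [("full dimension", X),
                   ("reduced dimension", PCA(n_components=3).fit_transform(X))]:
    labels_km = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(data)
    print(name, "ARI vs. target clustering:",
          round(adjusted_rand_score(labels_true, labels_km), 3))
```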
Mark O. Riedl, Brent Harrison
Comments: 7 pages, 1 figure
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)
Robots and autonomous systems that operate around humans will likely always
rely on kill switches that stop their execution and allow them to be
remote-controlled for the safety of humans or to prevent damage to the system.
It is theoretically possible for an autonomous system with sufficient sensor
and effector capability and using reinforcement learning to learn that the kill
switch deprives it of long-term reward and learn to act to disable the switch
or otherwise prevent a human operator from using the switch. This is referred
to as the big red button problem. We present a technique which prevents a
reinforcement learning agent from learning to disable the big red button. Our
technique interrupts the agent or robot by placing it in a virtual simulation
where it continues to receive reward. We illustrate our technique in a simple
grid world environment.
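A hedged sketch of the interruption mechanism as described (the environment classes and gym-like step interface are hypothetical): when the button is pressed, the agent is silently switched to a simulated copy of the environment, so its reward stream, and hence its learned values, are unaffected by button presses:

```python
import copy

class TinyEnv:
    """Toy stand-in environment with a gym-like step interface (hypothetical)."""
    def __init__(self):
        self.t = 0
    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 10           # observation, reward, done

class ButtonWrapper:
    """When the big red button is pressed, the agent keeps acting in a simulated
    copy of the environment, so pressing the button never looks costly to it."""
    def __init__(self, real_env):
        self.real_env = real_env
        self.sim_env = None
        self.button_pressed = False
    def press_button(self):
        self.button_pressed = True
        self.sim_env = copy.deepcopy(self.real_env)  # agent is moved into simulation
    def release_button(self):
        self.button_pressed = False
        self.sim_env = None
    def step(self, action):
        env = self.sim_env if self.button_pressed else self.real_env
        return env.step(action)                      # reward flows either way

env = ButtonWrapper(TinyEnv())
print(env.step(0))        # acting in the real environment
env.press_button()        # operator interrupts the real robot
print(env.step(0))        # agent keeps acting (and being rewarded) in simulation
```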
Xuejian Xu, Meixia Tao
Comments: Part of this work is accepted by IEEE ICC 2017
Subjects: Information Theory (cs.IT)
Coded caching is able to exploit the accumulated cache size and is hence superior to uncoded caching by distributing different fractions of a file across different
nodes. This work investigates coded caching in a large-scale small-cell network
(SCN) where the locations of small base stations (SBSs) are modeled by
stochastic geometry. We first propose a content delivery framework, where
multiple SBSs that cache different coded packets of a desired file transmit
concurrently upon a user request and the user decodes the signals using
successive interference cancellation (SIC). We characterize the performance of
coded caching by two performance metrics, average fractional offloaded traffic
(AFOT) and average ergodic rate (AER), for which a closed-form expression and a
tractable expression are derived, respectively, in the high signal-to-noise
ratio region. We then formulate the coded cache placement problem for AFOT
maximization as a multiple-choice knapsack problem (MCKP). By utilizing the
analytical properties of AFOT, a greedy but optimal algorithm is proposed. We
also consider the coded cache placement problem for AER maximization. By
converting this problem into a standard MCKP, a heuristic algorithm is
proposed. Analytical and numerical results reveal several design and
performance insights of coded caching in conjunction with SIC receiver in
interference-limited SCNs.
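For intuition, cache placement cast as a multiple-choice knapsack can be attacked greedily by repeatedly taking the best incremental value per unit of cache; this greedy is optimal only under structural assumptions such as concave per-file value profiles (the kind of analytical property of AFOT the paper exploits). The sketch below uses synthetic numbers and is not the paper's algorithm or data:

```python
import heapq

def greedy_mckp(options, budget):
    """options[f]: list of (cache_cost, value) choices for file f, sorted by
    strictly increasing cost and starting with the zero choice (0, 0);
    budget: total cache size. Returns the chosen option index per file."""
    choice = {f: 0 for f in options}
    used = sum(options[f][0][0] for f in options)
    heap = []
    for f in options:
        if len(options[f]) > 1:
            dc = options[f][1][0] - options[f][0][0]
            dv = options[f][1][1] - options[f][0][1]
            heapq.heappush(heap, (-dv / dc, f))         # most valuable increment first
    while heap:
        _, f = heapq.heappop(heap)
        i = choice[f]
        dc = options[f][i + 1][0] - options[f][i][0]
        if used + dc > budget:
            continue                                    # increment does not fit
        used += dc
        choice[f] = i + 1
        if choice[f] + 1 < len(options[f]):
            dc2 = options[f][choice[f] + 1][0] - options[f][choice[f]][0]
            dv2 = options[f][choice[f] + 1][1] - options[f][choice[f]][1]
            heapq.heappush(heap, (-dv2 / dc2, f))
    return choice

# Toy example: two files, options = cache none, half, or all of the coded packets.
options = {"A": [(0, 0.0), (1, 0.6), (2, 0.9)],
           "B": [(0, 0.0), (1, 0.4), (2, 0.7)]}
print(greedy_mckp(options, budget=3))                   # -> {'A': 2, 'B': 1}
```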
Shuai Huang, Trac D. Tran
Subjects: Information Theory (cs.IT)
Compressive sensing relies on the sparse prior imposed on the signal to solve
the ill-posed recovery problem in an under-determined linear system. The
objective function that enforces the sparse prior information should be both
effective and easily optimizable. Motivated by the entropy concept from
information theory, in this paper we propose the generalized Shannon entropy
function and Rényi entropy function of the signal as the sparsity-promoting
objectives. Both entropy functions are nonconvex, and their local minimums only
occur on the boundaries of the orthants in the Euclidean space. Compared to
other popular objective functions such as the (l_1)-norm (\|x\|_1) and the (l_p)-norm (\|x\|_p^p), minimizing the proposed entropy functions not
only promotes sparsity in the recovered signals, but also encourages the signal
energy to be concentrated towards a few significant entries. The corresponding
optimization problem can be converted into a series of reweighted (l_1)
minimization problems and solved efficiently. Sparse signal recovery
experiments on both the simulated and real data show the proposed entropy
function minimization approaches are better than other popular approaches and
achieve state-of-the-art performances.
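A minimal sketch of the reweighted (l_1) loop, using the generic 1/(|x|+eps) reweighting rule rather than the paper's entropy-derived weights, and sklearn's Lasso as the weighted-(l_1) solver via column rescaling:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, m, k = 50, 200, 5
A = rng.normal(size=(n, m)) / np.sqrt(n)                # under-determined system
x_true = np.zeros(m)
x_true[rng.choice(m, size=k, replace=False)] = rng.normal(size=k)
y = A @ x_true

eps = 1e-3
w = np.ones(m)                                          # first pass is a plain l1 problem
for _ in range(10):
    # Weighted l1 via column rescaling: with z = w * x, the weighted penalty
    # sum_i w_i |x_i| becomes ||z||_1 and the data term uses A / w.
    lasso = Lasso(alpha=1e-3, fit_intercept=False, max_iter=100000).fit(A / w, y)
    x = lasso.coef_ / w
    w = 1.0 / (np.abs(x) + eps)                         # generic reweighting rule

print("recovered support:", np.nonzero(np.abs(x) > 1e-3)[0])
print("true support:     ", np.nonzero(x_true)[0])
```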
Sahar Imtiaz, Hadi Ghauch, Muhammad Mahboob Ur Rahman, George Koudouridis, James Gross
Subjects: Information Theory (cs.IT)
Next generation cellular networks will have to leverage large cell
densifications to accomplish the ambitious goals for aggregate multi-user sum
rates, for which the CRAN architecture is a favored network design. This shifts attention back to resource allocation (RA) schemes, which need to be applicable to very short radio frames, large and dense sets of radio heads, and large user populations in the coordination area. So far, mainly CSI-based
RA schemes have been proposed for this task. However, they have considerable
complexity and also incur a significant CSI acquisition overhead on the system.
In this paper, we study an alternative approach which promises lower complexity
with also a lower overhead. We propose to base the RA in multi-antenna CRAN
systems on the position information of user terminals only. We use Random
Forests as a supervised machine learning approach to determine the multi-user
RAs. This likely leads to lower overhead costs, as the acquisition of position
information requires fewer radio resources in comparison to the acquisition of
instantaneous CSI. The results show the following findings: I) In general, learning-based RA schemes can achieve spectral efficiency comparable to the CSI-based scheme; II) once the system overhead is taken into account, the learning-based RA scheme utilizing position information outperforms the legacy CSI-based scheme by up to 100%; III) despite its dependency on the training data, the Random Forests-based RA scheme is robust against position inaccuracies and changes in the propagation scenario; IV) the most important factor influencing the performance of the learning-based RA scheme is the antenna orientation, for which we present three approaches that restore most of the original performance. To the best of our knowledge, these insights are
new and indicate a novel as well as promising approach to master the complexity
in future cellular networks.
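A self-contained toy version of the core idea, learning an RA decision from position features only with a Random Forest; the synthetic layout and the binary "serve from RRH 0 vs. RRH 1" decision are illustrative stand-ins for the paper's multi-user RA decisions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
rrh = np.array([[0.0, 0.0], [100.0, 0.0]])          # two remote radio head positions
pos = rng.uniform(0, 100, size=(5000, 2))           # user positions (features)

# Ground-truth label: nearest radio head, plus noise to mimic shadowing effects.
dist = np.linalg.norm(pos[:, None, :] - rrh[None, :, :], axis=2)
label = (dist[:, 1] + rng.normal(0, 5, 5000) < dist[:, 0]).astype(int)

Xtr, Xte, ytr, yte = train_test_split(pos, label, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, ytr)
print("position-only RA accuracy:", round(clf.score(Xte, yte), 3))
```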
Mehdi Salehi Heydar Abad, Ozgur Ercetin, Deniz Gündüz
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI); Probability (math.PR)
We consider an energy harvesting (EH) transmitter communicating over a
time-correlated wireless channel. The transmitter is capable of sensing the
current channel state, albeit at the cost of both energy and transmission time.
The EH transmitter aims to maximize its long-term throughput by choosing one of
the following actions: (i) defer its transmission to save energy for future use, (ii) transmit reliably at a low rate, (iii) transmit at a high rate, and (iv) sense the channel to reveal the channel state information at a cost of
energy and transmission time, and then decide to defer or to transmit. The
problem is formulated as a partially observable Markov decision process with a
belief on the channel state. The optimal policy is shown to exhibit a threshold
behavior on the belief state, with battery-dependent threshold values. The
optimal threshold values and performance are characterized numerically via the
value iteration algorithm. Our results demonstrate that, despite the associated
time and energy cost, sensing the channel intelligently to track the channel
state improves the achievable long-term throughput significantly as compared to
the performance of those protocols lacking this ability as well as the one that
always senses the channel.
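For intuition, a toy sketch of the belief dynamics and a battery-dependent belief-threshold policy for a two-state (Gilbert-Elliott-style) channel; the transition probabilities and threshold values below are made up, whereas the paper obtains the optimal thresholds numerically via value iteration on the POMDP:

```python
import numpy as np

# Channel transition matrix (illustrative): rows = current state, columns = next
# state, with state 0 = good and state 1 = bad.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

def predict_belief(b_good):
    """One-step prior belief that the channel is good, absent a new observation."""
    return b_good * P[0, 0] + (1.0 - b_good) * P[1, 0]

def action(b_good, battery, thresholds):
    """Battery-dependent belief thresholds, mimicking the threshold structure of
    the optimal policy; the numbers in `thresholds` are made up."""
    t_sense, t_tx = thresholds[battery]
    if b_good >= t_tx and battery >= 2:
        return "transmit_high"
    if b_good >= t_sense and battery >= 1:
        return "sense_then_decide"
    return "defer"

thresholds = {0: (1.1, 1.1),      # empty battery: can only defer
              1: (0.6, 0.9),      # low battery: conservative
              2: (0.4, 0.7)}      # full battery: more aggressive

b = 0.5
for t in range(5):
    b = predict_belief(b)
    print(t, round(b, 3), action(b, battery=2, thresholds=thresholds))
```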
Kilavo Hassan, Kisangiri Michael, Salehe I. Mrutu
Journal-ref: International Journal of Computer Science, Engineering and
Applications (IJCSEA) Vol. 7, No. 1, February 2017
Subjects: Information Theory (cs.IT)
The Viterbi Algorithm Decoder Enhanced with Non-transmittable Codewords (NTCs) is one of the best decoding algorithms and effectively improves forward error correction performance. However, the Viterbi decoder enhanced with NTCs has not yet been designed to work in storage media devices. Currently, the Reed-Solomon (RS) algorithm is the dominant algorithm used for error correction in storage media. Nevertheless, recent studies show that data reliability in storage media remains low while the demand for storage media increases drastically. This study proposes a design of the Soft Viterbi Algorithm decoder enhanced with Non-transmittable Codewords (SVAD-NTCs) to be used in storage media for error correction. Matlab simulation was used to investigate the behavior and effectiveness of SVAD-NTCs in correcting errors in data retrieved from storage media. Sample data of one million bits were randomly generated, additive white Gaussian noise (AWGN) was used as the data distortion model, and binary phase-shift keying (BPSK) was applied as the modulation scheme. Results show that SVAD-NTC performance improves as the number of NTCs increases, but beyond 6 NTCs there is no significant change; the SVAD-NTC design drastically reduces the total residual errors from 216,878 (Reed-Solomon) to 23,900.
Lianfeng Zou, Shulabh Gupta, Christophe Caloz
Subjects: Information Theory (cs.IT)
We model, demonstrate and characterize Dispersion Code Multiple Access (DCMA)
and hence show the applicability of this purely analog and real-time multiple
access scheme to high-speed wireless communications. We first mathematically
describe DCMA and show the appropriateness of Chebyshev dispersion coding in
this technology. We next provide an experimental proof-of-concept in a 2 X 2
DCMA system. Finally, we statistically characterize DCMA in terms of bandwidth,
dispersive group delay swing, system dimension and signal-to-noise ratio.
Mehdi Ganji, Hamid Jafarkhani
Subjects: Information Theory (cs.IT)
There has been extensive research recently on large-scale multi-user multiple-input multiple-output (MU-MIMO) systems. Researchers have shown that there are great opportunities in this area; however, many obstacles stand in the way of achieving the full potential of a large number of receive antennas. One of the main issues, which will be investigated thoroughly in this paper, is timing asynchrony among the signals of different users. Most works in the literature assume that the received signals are perfectly aligned, which is not practical. We show that neglecting the asynchrony can significantly degrade
the performance of existing designs, particularly maximum ratio combining
(MRC). We quantify the uplink achievable rates obtained by MRC receiver with
perfect channel state information (CSI) and imperfect CSI while the system is
impaired by unknown time delays among received signals. We then use these
results to design new algorithms in order to alleviate the effects of timing
mismatch. We also analyze the performance of the introduced receiver design, which
is called MRC-ZF, with perfect and imperfect CSI. For performing MRC-ZF, the
only required information is the distribution of timing mismatch which
circumvents the necessity of time delay acquisition or synchronization. To
verify our analytical results, we present extensive simulation results which
thoroughly investigate the performance of the traditional MRC receiver and the
introduced MRC-ZF receiver.
Meysam Sadeghi, Luca Sanguinetti, Chau Yuen
Comments: 5 pages, 5 figures, submitted to IEEE GLOBECOM 2017, Singapore, Dec. 2017
Subjects: Information Theory (cs.IT)
The next generation of wireless networks will likely rely on large-scale antenna
systems, either in the form of massive multi-input-multi-output (MIMO) or
millimeter wave (mmWave) systems. Therefore, the conventional fully-digital
precoders are not suitable for physical layer multicasting as they require a
dedicated radio frequency chain per antenna element. In this paper, we show
that in a multi-group multicasting system with an arbitrary number of transmit
antennas, (G) multicasting groups, and an arbitrary number of users in each
group, one can achieve the performance of any fully-digital precoder with just
(G) radio frequency chains using the proposed hybrid multi-group multicasting
structure.
Yodai Watanabe
Subjects: Information Theory (cs.IT); Quantum Physics (quant-ph)
Randomness extraction against side information is the art of distilling from
a given source a key which is almost uniform conditioned on the side
information. This paper provides randomness extraction against quantum side
information whose extractable key length is given by a quantum generalization
of the conditional collision entropy defined without the conventional
smoothing. Based on the fact that the collision entropy is not subadditive, the
quantum conditional collision entropy maximized with respect to additional side
information is introduced, and is shown to be asymptotically optimal. The lower
bound on it derived there ensures faster convergence to the conditional von
Neumann entropy than that on the smooth min-entropy.
Jialing Liao, Muhammad R. A. Khandaker, Kai-Kit Wong
Comments: Presented in IEEE SPAWC 2016
Journal-ref: Proc. 17th IEEE Int. Workshop Signal Process. Adv. Wireless
Commun., SPAWC 2016, Edinburgh, Scotland, UK, July 3 – 6, 2016
Subjects: Information Theory (cs.IT)
This paper considers a multiple-input multiple-output (MIMO) relay system
with an energy harvesting relay node. All nodes are equipped with multiple
antennas, and the relay node depends on the harvested energy from the received
signal to support information forwarding. In particular, the relay node deploys
a power-splitting-based energy harvesting scheme. The capacity maximization
problem subject to power constraints at both the source and relay nodes is
considered for both fixed source covariance matrix and optimal source
covariance matrix cases. Instead of using existing software solvers, iterative
approaches using dual decomposition technique are developed based on the
structures of the optimal relay precoding and source covariance matrices.
Simulation results demonstrate the performance gain of the joint optimization
against the fixed source covariance matrix case.
George R. MacCartney Jr., Theodore S. Rappaport
Comments: To be published in 2017 IEEE International Conference on Communications (ICC), Paris, France, May 2017
Subjects: Information Theory (cs.IT)
Little research has been done to reliably model millimeter wave (mmWave) path
loss in rural macrocell settings, yet models have been hastily adopted without
substantial empirical evidence. This paper studies past rural macrocell (RMa)
path loss models and exposes concerns with the current 3rd Generation
Partnership Project (3GPP) TR 38.900 (Release 14) RMa path loss models adopted
from the International Telecommunications Union – Radiocommunications (ITU-R)
Sector. This paper shows how the 3GPP RMa large-scale path loss models were
derived for frequencies below 6 GHz, yet they are being asserted for use up to
30 GHz, even though there has not been sufficient work or published data to
support their validity at frequencies above 6 GHz or in the mmWave bands. We
present the background of the 3GPP RMa path loss models and their use of odd
correction factors not suitable for rural scenarios, and show that the
multi-frequency close-in free space reference distance (CI) path loss model is
more accurate and reliable than current 3GPP and ITU-R RMa models. Using field
data and simulations, we introduce a new close-in free space reference distance
with height-dependent path loss exponent (CIH) model, which predicts rural
macrocell path loss using an effective path loss exponent that is a function of
base station antenna height. This work shows the CI and CIH models can be used
from 500 MHz to 100 GHz for rural mmWave coverage and interference analysis,
without any discontinuity at 6 GHz as exists in today’s 3GPP and ITU-R RMa
models.
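A worked numerical sketch of a CI-style model, PL(f, d) = FSPL(f, 1 m) + 10 n log10(d / 1 m) with FSPL(f, 1 m) = 20 log10(4 pi f / c); the height-dependent exponent used for the CIH variant below is an illustrative linear form with made-up coefficients, not the fitted parameters from the paper. By construction, there is no discontinuity when crossing 6 GHz:

```python
import numpy as np

C = 299_792_458.0                      # speed of light in m/s

def fspl_1m_db(f_hz):
    """Free-space path loss at the 1 m close-in reference distance."""
    return 20 * np.log10(4 * np.pi * f_hz / C)

def path_loss_ci_db(f_hz, d_m, n=2.3):
    """CI model: physically anchored at 1 m, single path loss exponent n."""
    return fspl_1m_db(f_hz) + 10 * n * np.log10(d_m)

def path_loss_cih_db(f_hz, d_m, h_bs_m, n0=2.3, slope=-0.05, h_ref=35.0):
    """CIH-style variant with a height-dependent exponent (coefficients made up)."""
    n_eff = n0 * (1 + slope * (h_bs_m - h_ref) / h_ref)
    return fspl_1m_db(f_hz) + 10 * n_eff * np.log10(d_m)

for f in (0.5e9, 6e9, 28e9, 73e9):
    print(f"{f / 1e9:5.1f} GHz, 1 km: "
          f"{path_loss_ci_db(f, 1000.0):6.1f} dB (CI), "
          f"{path_loss_cih_db(f, 1000.0, h_bs_m=50.0):6.1f} dB (CIH)")
```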
Omid Taghizadeh, Ali Cagatay Cirik, Rudolf Mathar
Comments: To be submitted to IEEE for possible publication
Subjects: Information Theory (cs.IT)
In this work we study the behavior of a full-duplex (FD) and
amplify-and-forward (AF) relay with multiple antennas, where hardware
impairments of the FD relay transceiver are taken into account. Due to the
inter-dependency of the transmit relay power on each antenna and the residual
self-interference in an AF-FD relay, we observe a distortion loop that degrades
the system performance when the relay dynamic range is not high. In this
regard, we analyze the relay function in the presence of the hardware inaccuracies
and an optimization problem is formulated to maximize the signal to
distortion-plus-noise ratio (SDNR), under relay and source transmit power
constraints. Due to the problem complexity, we propose a
gradient-projection-based (GP) algorithm to obtain an optimal solution.
Moreover, a nonalternating sub-optimal solution is proposed by assuming a
rank-1 relay amplification matrix, and separating the design of the relay
process into multiple stages (MuStR1). The proposed MuStR1 method is then
enhanced by introducing an alternating update over the optimization variables,
denoted as AltMuStR1 algorithm. It is observed that compared to GP, (Alt)MuStR1
algorithms significantly reduce the required computational complexity at the
expense of a slight performance degradation. Finally, the proposed methods are
evaluated under various system conditions, and compared with the methods
available in the current literature. In particular, it is observed that as the
hardware impairments increase, or for a system with a high transmit power, the
impact of applying a distortion-aware design is significant.
Le Zheng, Marco Lops, Xiaodong Wang, Emanuele Grossi
Subjects: Information Theory (cs.IT)
The focus of this paper is on co-existence between a communication system and
a pulsed radar sharing the same bandwidth. Based on the fact that the
interference generated by the radar onto the communication receiver is
intermittent and depends on the pulse train duty cycle and on the density of
scattering objects (such as, e.g., targets), we first show that the
communication system is equivalent to a set of independent parallel channels,
whereby pre-coding on each channel can be introduced as a new degree of
freedom. We introduce a new figure of merit, named the compound rate,
which is a convex combination of rates with and without interference, to be
optimized under constraints concerning the Signal-to-Interference-plus-Noise
Ratio (SINR) experienced by the radar and obviously the powers emitted by the
two systems: the degrees of freedom are the radar waveform and the
afore-mentioned encoding matrix for the communication symbols. We provide
closed-form solution for the optimum transmit policies for both systems under a
variety of conditions, including arbitrary correlation of the interference
impinging the radar and/or two basic models for the covariance matrix of the
scattering of the interfering objects towards the communication system. We also
discuss the region of the achievable communication rates with and without
interference. A thorough performance assessment shows the potentials and the
limitations of the proposed co-existing architecture.
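As a notational aside (our reading of the abstract, not a formula quoted from the paper), the compound rate can be written as

  R_compound(alpha) = alpha * R_int + (1 - alpha) * R_no_int,   alpha in [0, 1],

where R_int and R_no_int are the communication rates achieved with and without radar interference, and alpha would reflect how often a communication symbol is hit by the intermittent interference (e.g., via the pulse-train duty cycle and the density of scattering objects).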
Benny Van Houdt
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
In this paper we study how to estimate the back-off rates in an idealized
CSMA network to achieve a given throughput vector using free energy
approximations. More specifically, we introduce the class of region-based free
energy approximations with clique belief and present a closed form expression
for the back-off rates based on the zero gradient points of the free energy
approximation (in terms of the conflict graph, target throughput vector and
counting numbers).
Next we introduce the size (k_{max}) clique free energy approximation as a
special case and derive an explicit expression for the counting numbers, as
well as a recursion to compute the back-off rates. We subsequently show that
the size (k_{max}) clique approximation coincides with a Kikuchi free energy
approximation and prove that it is exact on chordal conflict graphs. As a
by-product these results provide us with an explicit expression of a fixed
point of the inverse generalized belief propagation algorithm for CSMA
networks.
A. Mani
Comments: 57 pages. This paper is scheduled to appear as two separate papers (because of length) with some overlap and enhancements
Subjects: Logic (math.LO); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Logic in Computer Science (cs.LO)
In one perspective, the central problem pursued in this research is that of
the inverse problem in the context of general rough sets. The problem concerns the existence of a rough basis for given approximations in a context. Granular
operator spaces were recently introduced by the present author as an optimal
framework for anti-chain based algebraic semantics of general rough sets and
the inverse problem. In this framework, various subtypes of crisp and non-crisp objects are identifiable that may be missed in more restrictive formalisms. This is also because, in the latter cases, the concepts of complementation and negation are taken for granted. This opens the door to a general approach to dialectical rough sets building on previous work of the present author and figures of opposition. In this paper, dialectical rough logics are developed from a semantic perspective, the concept of dialectical predicates is formalized, the connection with dialetheias and glutty negation is established, and parthood is analyzed
and studied from the point of view of classical and dialectical figures of
opposition. Potential semantics through dialectical counting based on these
figures are proposed building on earlier work by the present author. Her
methods become more geometrical and encompass parthood as a primary relation
(as opposed to roughly equivalent objects) for algebraic semantics. Dialectical
counting strategies over anti-chains (a specific form of dialectical structure)
for semantics are also proposed.