Thomas Laurent, James von Brecht
Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL); Learning (cs.LG)
We introduce an exceptionally simple gated recurrent neural network (RNN)
that achieves performance comparable to well-known gated architectures, such as
LSTMs and GRUs, on the word-level language modeling task. We prove that our
model has simple, predictable and non-chaotic dynamics. This stands in stark
contrast to more standard gated architectures, whose underlying dynamical
systems exhibit chaotic behavior.
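Such a cell can be sketched as follows; the particular gating form below is a hypothetical illustration, since the abstract does not give the update equations:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def minimal_gated_step(h, x, params):
    """One step of a minimal gated recurrent cell (hypothetical form).

    The hidden state is a gated blend of the squashed previous state and the
    squashed current input -- no full recurrent weight matrix acts on the
    candidate, which keeps the dynamics simple and bounded.
    """
    Wf, Uf, bf, Wi, Ui, bi, Wx = params
    f = sigmoid(Wf @ x + Uf @ h + bf)   # forget gate
    i = sigmoid(Wi @ x + Ui @ h + bi)   # input gate
    return f * np.tanh(h) + i * np.tanh(Wx @ x)

rng = np.random.default_rng(0)
d_h, d_x = 4, 3
params = (rng.standard_normal((d_h, d_x)), rng.standard_normal((d_h, d_h)),
          np.zeros(d_h),
          rng.standard_normal((d_h, d_x)), rng.standard_normal((d_h, d_h)),
          np.zeros(d_h),
          rng.standard_normal((d_h, d_x)))
h = np.zeros(d_h)
for _ in range(5):
    h = minimal_gated_step(h, rng.standard_normal(d_x), params)
print(h.shape)
```

Because both gates lie in (0, 1) and both candidates are squashed by tanh, the state stays bounded, which is the intuition behind the non-chaotic behavior claimed above.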
Min Jiang, Zhongqiang Huang, Liming Qiu, Wenzhen Huang, Gary G. Yen
Subjects: Neural and Evolutionary Computing (cs.NE)
One of the major distinguishing features of dynamic multiobjective
optimization problems (DMOPs) is that the optimization objectives change over
time, so tracking the varying Pareto-optimal front becomes a challenge. One
promising solution is to reuse past “experiences” to construct a
prediction model via statistical machine learning approaches. However, most
existing methods ignore the non-independent and identically distributed
nature of the data used to construct the prediction model. In this paper, we
propose an algorithmic framework, called Tr-DMOEA, which integrates transfer
learning with population-based evolutionary algorithms for solving DMOPs.
The approach uses transfer learning as a tool to reuse past experience and
speed up the evolutionary process; at the same time, any population-based
multiobjective algorithm can benefit from this integration without extensive
modification. To verify this, we incorporate
the proposed approach into the development of three well-known algorithms,
nondominated sorting genetic algorithm II (NSGA-II), multiobjective particle
swarm optimization (MOPSO), and the regularity model-based multiobjective
estimation of distribution algorithm (RM-MEDA), and then employ twelve
benchmark functions to test these algorithms and to compare them with some
chosen state-of-the-art designs. The experimental results confirm the
effectiveness of exploiting machine learning technology in the proposed
method.
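The plug-in structure of the framework can be sketched as follows; both the MOEA step and the transfer step below are naive stand-ins for the actual NSGA-II/MOPSO/RM-MEDA operators and the learned transfer mapping:

```python
import numpy as np

def evolve(population, evaluate, steps=10):
    """Stand-in for any population-based MOEA generation loop (hypothetical).

    Mutates solutions and keeps the better half by the first objective only;
    a real NSGA-II / MOPSO / RM-MEDA step would go here unchanged.
    """
    rng = np.random.default_rng(0)
    for _ in range(steps):
        children = population + 0.1 * rng.standard_normal(population.shape)
        merged = np.vstack([population, children])
        scores = np.array([evaluate(x)[0] for x in merged])
        population = merged[np.argsort(scores)[: len(population)]]
    return population

def transfer_seed(past_front, rng):
    """Hypothetical transfer step: perturb the previous population to seed
    the search after an environment change (Tr-DMOEA instead learns a
    transfer mapping from past experience)."""
    return past_front + 0.05 * rng.standard_normal(past_front.shape)

rng = np.random.default_rng(1)
pop = rng.random((20, 2))
for t in range(3):                      # three environment changes
    f = lambda x, t=t: (np.sum((x - 0.1 * t) ** 2),
                        np.sum((x - 1 + 0.1 * t) ** 2))  # drifting objectives
    pop = evolve(pop, f)
    pop = transfer_seed(pop, rng)       # reuse past experience as the new seed
print(pop.shape)
```

The point of the sketch is the interface: the MOEA is called unmodified, and only the seeding between environment changes uses past experience.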
Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
This paper presents a novel yet intuitive approach to unsupervised feature
learning. Inspired by the human visual system, we explore whether low-level
motion-based grouping cues can be used to learn an effective visual
representation. Specifically, we use unsupervised motion-based segmentation on
videos to obtain segments, which we use as ‘pseudo ground truth’ to train a
convolutional network to segment objects from a single frame. Given the
extensive evidence that motion plays a key role in the development of the human
visual system, we hope that this straightforward approach to unsupervised
learning will be more effective than cleverly designed ‘pretext’ tasks studied
in the literature. Indeed, our extensive experiments show that this is the
case. When used for transfer learning on object detection, our representation
significantly outperforms previous unsupervised approaches across multiple
settings, especially when training data for the target task is scarce.
Shervin Ardeshir, Krishna Regmi, Ali Borji
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Mirror neurons have been observed in the primary motor cortex of primate
species, in particular in humans and monkeys. A mirror neuron fires when a
person performs a certain action, and also when the person observes the same
action being performed by another person. A crucial step towards building
fully autonomous intelligent systems with human-like learning abilities is
the capability to model the mirror neuron. On one hand, the abundance of
egocentric cameras in the past few years has offered the opportunity to study
many vision problems from the first-person perspective. A great deal of
interesting research has been done during the past few years exploring
various computer vision tasks from the perspective of the self. On the other
hand, videos recorded by traditional static cameras capture humans performing
different actions from an exocentric third-person perspective. In this work, we
take the first step towards relating motion information across these two
perspectives. We train models that predict motion in an egocentric view by
observing it from an exocentric view, and vice versa. This allows models to
predict what an egocentric motion would look like from the outside. To do so, we
train linear and nonlinear models and evaluate their performance in terms of
retrieving the egocentric (exocentric) motion features, while having access to
an exocentric (egocentric) motion feature. Our experimental results demonstrate
that motion information can be successfully transferred across the two views.
Daniel Crawford, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, Pooya Ronagh
Comments: 16 pages
Subjects: Quantum Physics (quant-ph); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)
We investigate whether quantum annealers with select chip layouts can
outperform classical computers in reinforcement learning tasks. We associate a
transverse field Ising spin Hamiltonian with a layout of qubits similar to that
of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to
numerically simulate quantum sampling from this system. We design a
reinforcement learning algorithm in which the set of visible nodes representing
the states and actions of an optimal policy are the first and last layers of
the deep network. In the absence of a transverse field, our simulations show that
DBMs train more effectively than restricted Boltzmann machines (RBM) with the
same number of weights. Since sampling from Boltzmann distributions of a DBM is
not classically feasible, this is evidence of the supremacy of a non-Turing
sampling oracle. We then develop a framework for training the network as a
quantum Boltzmann machine (QBM) in the presence of a significant transverse
field for reinforcement learning. The latter method further improves the
reinforcement learning method using DBMs.
Gunnar A. Sigurdsson, Santosh Divvala, Ali Farhadi, Abhinav Gupta
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Actions are more than just movements and trajectories: we cook to eat and we
hold a cup to drink from it. A thorough understanding of videos requires going
beyond appearance modeling and necessitates reasoning about the sequence of
activities, as well as the higher-level constructs such as intentions. But how
do we model and reason about these? We propose a fully-connected temporal CRF
model for reasoning over various aspects of activities that includes objects,
actions, and intentions, where the potentials are predicted by a deep network.
End-to-end training of such structured models is a challenging endeavor: For
inference and learning we need to construct mini-batches consisting of whole
videos, leading to mini-batches with only a few videos. This causes
high correlation between data points, leading to breakdown of the backprop
algorithm. To address this challenge, we present an asynchronous variational
inference method that allows efficient end-to-end training. Our method achieves
a classification mAP of 21.9% on the Charades benchmark, outperforming the
state-of-the-art (17.2% mAP), and offers equal gains on the task of temporal
localization.
Aron Yu, Kristen Grauman
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Distinguishing subtle differences in attributes is valuable, yet learning to
make visual comparisons remains non-trivial. Not only is the number of possible
comparisons quadratic in the number of training images, but also access to
images adequately spanning the space of fine-grained visual differences is
limited. We propose to overcome the sparsity of supervision problem via
synthetically generated images. Building on a state-of-the-art image generation
engine, we sample pairs of training images exhibiting slight modifications of
individual attributes. Augmenting real training image pairs with these
examples, we then train attribute ranking models to predict the relative
strength of an attribute in novel pairs of real images. Our results on datasets
of faces and fashion images show the great promise of bootstrapping imperfect
image generators to counteract sample sparsity for learning to rank.
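As a sketch of the pairwise ranking idea, assuming a linear ranker and a hinge-style margin update (the paper's actual model and features are more sophisticated, and the synthetic pairs here are simulated feature vectors):

```python
import numpy as np

def train_ranker(pairs, dim, lr=0.1, epochs=50):
    """Fit a linear attribute ranker w so that w.x_strong > w.x_weak, using a
    pairwise hinge (margin) update -- a common choice for relative-attribute
    models; the paper's actual ranking model may differ."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for x_strong, x_weak in pairs:
            if w @ x_strong - w @ x_weak < 1.0:   # margin violated
                w += lr * (x_strong - x_weak)
    return w

rng = np.random.default_rng(0)
true_w = np.array([1.0, -0.5, 0.25])              # hidden attribute direction
# "real" pairs plus densely sampled "synthetic" pairs exhibiting slight
# attribute modifications (here both are simulated feature vectors)
xs = rng.standard_normal((200, 3))
pairs = []
for a, b in zip(xs[::2], xs[1::2]):
    strong, weak = (a, b) if true_w @ a > true_w @ b else (b, a)
    pairs.append((strong, weak))
w = train_ranker(pairs, dim=3)
acc = np.mean([w @ s > w @ k for s, k in pairs])  # agreement with ordering
print(round(float(acc), 2))
```

Augmenting the pair list with generated near-duplicates is what densifies supervision in the sparse fine-grained regime the abstract describes.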
Hyeonwoo Noh, Andre Araujo, Jack Sim, Bohyung Han
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We introduce a local feature descriptor for large-scale image retrieval
applications, called DELF (DEep Local Feature). The new feature is based on
convolutional neural networks, which are trained without object- and
patch-level annotations on a landmark image dataset. To enhance DELF’s image
retrieval performance, we also propose an attention mechanism for keypoint
selection, which shares most network layers with the descriptor. This new
framework can be used in image retrieval as a drop-in replacement for other
keypoint detectors and descriptors, enabling more accurate feature matching and
geometric verification. Our technique is particularly useful for the
large-scale setting, where a system must operate with high precision. In this
case, our system produces reliable confidence scores to reject false positives
effectively—in particular, our system is robust against queries that have no
correct match in the database. We present an evaluation methodology for this
challenging retrieval setting, using standard and large-scale datasets. We show
that recently proposed methods do not perform well in this setup; DELF
outperforms several recent global and local descriptors by substantial margins.
Dimitris Spathis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Researchers try to model the aesthetic quality of photographs with low- and
high-level features, drawing inspiration from art theory, psychology and
marketing. We attempt to describe every feature extraction measure employed in
the above process. The contribution of this literature review is the taxonomy
of each feature by its implementation complexity, considering real-world
applications and integration in mobile apps and digital cameras. Also, we
discuss the machine learning results along with some unexplored research areas
as future work.
Lars M. Mescheder, Dirk A. Lorenz
Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA); Machine Learning (stat.ML)
The Perona-Malik model has been very successful at restoring images from
noisy input. In this paper, we reinterpret the Perona-Malik model in the
language of Gaussian scale mixtures and derive some extensions of the model.
Specifically, we show that the expectation-maximization (EM) algorithm applied
to Gaussian scale mixtures leads to the lagged-diffusivity algorithm for
computing stationary points of the Perona-Malik diffusion equations. Moreover,
we show how mean field approximations to these Gaussian scale mixtures lead to
a modification of the lagged-diffusivity algorithm that better captures the
uncertainties in the restoration. Since this modification can be hard to
compute in practice we propose relaxations to the mean field objective to make
the algorithm computationally feasible. Our numerical experiments show that
this modified lagged-diffusivity algorithm often performs better at restoring
textured areas and fuzzy edges than the unmodified algorithm. As a second
application of the Gaussian scale mixture framework, we show how an efficient
sampling procedure can be obtained for the probabilistic model, making the
computation of the conditional mean and other expectations algorithmically
feasible. Again, the resulting algorithm has a strong resemblance to the
lagged-diffusivity algorithm. Finally, we show that a probabilistic version of
the Mumford-Shah segmentation model can be obtained in the same framework with
a discrete edge-prior.
Weiya Ren
Comments: 23 pages. 10 figures. 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)
Collective motions of crowd systems have attracted a great deal of
attention in recent years. In this paper, we try to measure the
collectiveness of a crowd system with the proposed node clique learning
method. The proposed method is graph based and investigates the influence
of one node on other nodes. A node is represented by a set of nodes, named a
clique, which is obtained by spreading information from this node to other
nodes in the graph; only nodes that receive sufficient information are
selected into the clique of this node. The motion coherence between two nodes
is defined by comparing their node cliques. The collectiveness of a node and
the collectiveness of the crowd system are defined by the node coherence.
The self-driven particle
(SDP) model and the crowd motion database are used to test the ability of the
proposed method in measuring collectiveness.
Zhongwen Xu, Linchao Zhu, Yi Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
With the tremendous advances of Convolutional Neural Networks (ConvNets) on
object recognition, we can now obtain reliable enough machine-labeled
annotations easily by predictions from off-the-shelf ConvNets. In this work, we
present an abstraction memory based framework for few-shot learning, building
upon machine-labeled image annotations. Our method takes some large-scale
machine-annotated datasets (e.g., OpenImages) as an external memory bank. In
the external memory bank, the information is stored in the memory slots with
the form of key-value, where image feature is regarded as key and label
embedding serves as value. When queried by the few-shot examples, our model
selects visually similar data from the external memory bank, and writes the
useful information obtained from related external data into another memory
bank, i.e., abstraction memory. Long Short-Term Memory (LSTM) controllers and
attention mechanisms are utilized to guarantee the data written to the
abstraction memory is correlated to the query example. The abstraction memory
concentrates information from the external memory bank, so that it makes the
few-shot recognition effective. In the experiments, we first confirm that our
model can learn to conduct few-shot object recognition on clean human-labeled
data from the ImageNet dataset. Then, we demonstrate that with our model,
machine-labeled image annotations are very effective and abundant resources to
perform object recognition on novel categories. Experimental results show that
our proposed model with machine-labeled annotations achieves strong
performance, with only a 1% gap relative to the model trained with
human-labeled annotations.
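The key-value attention read at the heart of this design can be sketched as follows; the temperature, dimensions, and nearest-slot selection are illustrative assumptions, and the LSTM controllers and second ("abstraction") memory are omitted:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def read_memory(query, keys, values, k=8, tau=0.05):
    """Attention read over a key-value memory bank (sketch of the general
    mechanism). Keys are image features, values are label embeddings; the
    query's k most similar keys are softly combined by attention."""
    sims = keys @ query / (np.linalg.norm(keys, axis=1)
                           * np.linalg.norm(query) + 1e-8)  # cosine similarity
    top = np.argsort(sims)[-k:]                 # visually similar slots
    attn = softmax(sims[top] / tau)             # sharp attention over top-k
    return attn @ values[top]                   # attended label embedding

rng = np.random.default_rng(0)
keys = rng.standard_normal((1000, 64))          # machine-labeled image features
values = rng.standard_normal((1000, 16))        # label embeddings
query = keys[42] + 0.01 * rng.standard_normal(64)  # query near slot 42
out = read_memory(query, keys, values)
print(int(np.argmin(np.linalg.norm(values - out, axis=1))))  # → 42
```

The retrieved embedding is dominated by the matching slot, which is what lets a few-shot query pull in label information from the large machine-annotated bank.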
Christoph Käding, Erik Rodner, Alexander Freytag, Joachim Denzler
Comments: accepted contribution at NIPS 2016 Workshop on Continual Learning and Deep Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The demands on visual recognition systems do not end with the complexity
offered by current large-scale image datasets, such as ImageNet. In
consequence, we need curious and continuously learning algorithms that actively
acquire knowledge about semantic concepts which are present in available
unlabeled data. As a step towards this goal, we show how to perform continuous
active learning and exploration, where an algorithm actively selects relevant
batches of unlabeled examples for annotation. These examples could either
belong to already known or to yet undiscovered classes. Our algorithm is based
on a new generalization of the Expected Model Output Change principle for deep
architectures and is especially tailored to deep neural networks. Furthermore,
we show easy-to-implement approximations that yield efficient techniques for
active selection. Empirical experiments show that our method outperforms
currently used heuristics.
Sailesh Conjeti, Anees Kazi, Nassir Navab, Amin Katouzian
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper presents a new scalable algorithm for cross-modal similarity
preserving retrieval in a learnt manifold space. Unlike existing approaches
that compromise between preserving global and local geometries, the proposed
technique respects both simultaneously during manifold alignment. The global
topologies are maintained by recovering underlying mapping functions in the
joint manifold space by deploying partially corresponding instances. The
inter- and intra-modality affinity matrices are then computed to reinforce the
original data skeleton using a perturbed minimum spanning tree (pMST) and to
maximize the affinity among similar cross-modal instances, respectively. The
performance of the proposed algorithm is evaluated on two multimodal image
datasets (coronary atherosclerosis histology and brain MRI) for two
applications: classification, and regression. Our exhaustive validations and
results demonstrate the superiority of our technique over comparative methods
and its feasibility for improving computer-assisted diagnosis systems, where
disease-specific complementary information shall be aggregated and interpreted
across modalities to form the final decision.
Shadi Albarqouni, Javad Fotouhi, Nassir Navab
Subjects: Computer Vision and Pattern Recognition (cs.CV)
X-ray radiography is the most readily available imaging modality and has a
broad range of applications that spans from diagnosis to intra-operative
guidance in cardiac, orthopedics, and trauma procedures. Proper interpretation
of the hidden and obscured anatomy in X-ray images remains a challenge and
often requires high radiation dose and imaging from several perspectives.
In this work, we aim at decomposing the conventional X-ray image into d X-ray
components of independent, non-overlapped, clipped sub-volumes using a deep
learning approach. Despite the challenging aspects of modelling such a highly
ill-posed problem, exciting and encouraging results are obtained, paving the
path for further contributions in this direction.
Mihir Mongia, Kundan Kumar, Akram Erraqabi, Yoshua Bengio
Comments: ICASSP 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Recent work in the literature has shown experimentally that one can use the
lower layers of a trained convolutional neural network (CNN) to model natural
textures. More interestingly, it has also been experimentally shown that only
one layer with random filters can also model textures although with less
variability. In this paper we ask why one-layer CNNs with random filters are
so effective at generating textures. We theoretically show that one-layer
convolutional architectures (without a non-linearity), paired with the energy
function used in the previous literature, can in fact preserve and modulate
frequency coefficients in such a manner that random weights and pretrained
weights generate the same type of images. Based on the results of this
analysis, we question whether similar properties hold when one uses one
convolution layer with a non-linearity. We show that with a ReLU
non-linearity there are situations where only one input gives the minimum
possible energy, whereas without a non-linearity there are always infinitely
many solutions that give the minimum possible energy. Thus we can show that
in certain situations adding a ReLU non-linearity generates less variable
images.
Zhizhen Chi, Hongyang Li, Huchuan Lu, Ming-Hsuan Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Visual tracking addresses the problem of identifying and localizing an
unknown target in a video given the target specified by a bounding box in the
first frame. In this paper, we propose a dual network to better utilize
features among layers for visual tracking. It is observed that features in
higher layers encode semantic context while their counterparts in lower layers
are sensitive to discriminative appearance. Thus we exploit the hierarchical
features in different layers of a deep model and design a dual structure to
obtain better feature representation from various streams, which is rarely
investigated in previous work. To highlight geometric contours of the target,
we integrate the hierarchical feature maps with an edge detector as the coarse
prior maps to further embed local details around the target. To leverage the
robustness of our dual network, we train it with random patches measuring the
similarities between the network activation and target appearance, which serves
as a regularization to enforce the dual network to focus on the target object. The
proposed dual network is updated online in a unique manner based on the
observation that the target being tracked in consecutive frames should share
more similar feature representations than those in the surrounding background.
It is also found that for a target object, the prior maps can help further
enhance performance by passing message into the output maps of the dual
network. Therefore, an independent component analysis with reference algorithm
(ICA-R) is employed to extract target context using prior maps as guidance.
Online tracking is conducted by maximizing the posterior estimate on the final
maps with stochastic and periodic update. Quantitative and qualitative
evaluations on two large-scale benchmark data sets show that the proposed
algorithm performs favourably against state-of-the-art methods.
Victor Yurchenko, Victor Lempitsky
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This work is motivated by the mostly unsolved task of parsing biological
images with multiple overlapping articulated model organisms (such as worms or
larvae). We present a general approach that separates the two main challenges
associated with such data, individual object shape estimation and object groups
disentangling. At the core of the approach is a deep feed-forward singling-out
network (SON) that is trained to map each local patch to a vectorial descriptor
that is sensitive to the characteristics (e.g. shape) of a central object,
while being invariant to the variability of all other surrounding elements.
Given a SON, a local image patch can be matched to a gallery of isolated
elements using their SON-descriptors, thus producing a hypothesis about the
shape of the central element in that patch. The image-level optimization based
on integer programming can then pick a subset of the hypotheses to explain
(parse) the whole image and disentangle groups of organisms.
While sharing many similarities with existing “analysis-by-synthesis”
approaches, our method avoids the need for stochastic search in the
high-dimensional configuration space and numerous rendering operations at
test-time. We show that our approach can parse microscopy images of three
popular model organisms (C. elegans roundworms, Drosophila larvae, and
E. coli bacteria) even under significant crowding and overlaps between
organisms. We speculate that the overall approach is applicable to a wider
class of image parsing problems concerned with crowded articulated objects, for
which rendering training images is possible.
Wentao Zhu, Xiaohui Xie
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Mass segmentation is an important task in mammogram analysis, providing
effective morphological features and regions of interest (ROI) for mass
detection and classification. Inspired by the success of using deep
convolutional features for natural image analysis and conditional random fields
(CRF) for structural learning, we propose an end-to-end network for
mammographic mass segmentation. The network employs a fully convolutional
network (FCN) to model the potential function, followed by a CRF to perform
structural learning. Because the mass distribution varies greatly with pixel
position, the FCN is combined with a position prior for the task. Due to the
small size of mammogram datasets, we use adversarial training to control
over-fitting. Four models with different convolutional kernels are further
fused to improve the segmentation results. Experimental results on two public
datasets, INbreast and DDSM-BCRP, show that our end-to-end network combined
with adversarial training achieves state-of-the-art results.
Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Mammogram classification is directly related to computer-aided diagnosis of
breast cancer. Traditional methods require great effort to annotate the
training data through costly manual labeling, and specialized computational
models to detect these annotations during testing. Inspired by the success of
using deep
convolutional features for natural image analysis and multi-instance learning
for labeling a set of instances/patches, we propose end-to-end trained deep
multi-instance networks for mass classification based on whole mammograms,
without the aforementioned costly need to annotate the training data. We
explore three different schemes to construct deep multi-instance networks for
whole mammogram classification. Experimental results on the INbreast dataset
demonstrate the robustness of the proposed deep networks compared to previous
work that uses segmentation and detection annotations during training.
Chao Ma, Chih-Yuan Yang, Xiaokang Yang, Ming-Hsuan Yang
Comments: Accepted by Computer Vision and Image Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Numerous single-image super-resolution algorithms have been proposed in the
literature, but few studies address the problem of performance evaluation based
on visual perception. While most super-resolution images are evaluated by
full-reference metrics, their effectiveness is not clear and the required
ground-truth images are not always available in practice. To address these
problems, we conduct human subject studies using a large set of
super-resolution images and propose a no-reference metric learned from visual
perceptual scores. Specifically, we design three types of low-level statistical
features in both spatial and frequency domains to quantify super-resolved
artifacts, and learn a two-stage regression model to predict the quality scores
of super-resolution images without referring to ground-truth images. Extensive
experimental results show that the proposed metric is effective and efficient
to assess the quality of super-resolution images based on human perception.
Zhiwu Huang, Chengde Wan, Thomas Probst, Luc Van Gool
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In recent years, skeleton-based action recognition has become a popular 3D
classification problem. State-of-the-art methods typically first represent each
motion sequence as a high-dimensional trajectory on a Lie group with an
additional dynamic time warping, and then shallowly learn favorable Lie group
features. In this paper we incorporate the Lie group structure into a deep
network architecture to learn more appropriate Lie group features for 3D action
recognition. Within the network structure, we design rotation mapping layers to
transform the input Lie group features into desirable ones, which are aligned
better in the temporal domain. To reduce the high feature dimensionality, the
architecture is equipped with rotation pooling layers for the elements on the
Lie group. Furthermore, we propose a logarithm mapping layer to map the
resulting manifold data into a tangent space that facilitates the application
of regular output layers for the final classification. Evaluations of the
proposed network for standard 3D human action recognition datasets clearly
demonstrate its superiority over existing shallow Lie group feature learning
methods as well as most conventional deep learning methods.
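The logarithm mapping layer has a closed form on the rotation group SO(3); a sketch of the standard log map (the network's learned rotation mapping and pooling layers are not shown):

```python
import numpy as np

def so3_log(R):
    """Logarithm map from SO(3) to its tangent space so(3).

    Returns the skew-symmetric matrix log(R); a logarithm layer applies this
    kind of map so that ordinary Euclidean output layers can operate on the
    result for the final classification.
    """
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)               # rotation angle
    if np.isclose(theta, 0.0):
        return np.zeros((3, 3))                # log of the identity
    return theta / (2.0 * np.sin(theta)) * (R - R.T)

# rotation by 90 degrees about the z-axis
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
L = so3_log(Rz)
print(np.round(L[1, 0], 3))   # rotation angle about z: 1.571 (= pi/2)
```

Mapping each (pooled) rotation into the flat tangent space is what makes regular fully connected and softmax layers applicable to the manifold-valued features.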
Matheus Gadelha, Subhransu Maji, Rui Wang
Comments: Submitted to CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper we investigate the problem of inducing a distribution over
three-dimensional structures given two-dimensional views of multiple objects
taken from unknown viewpoints. Our approach called “projective generative
adversarial networks” (PrGANs) trains a deep generative model of 3D shapes
whose projections match the distributions of the input 2D views. The addition
of a projection module allows us to infer the underlying 3D shape distribution
without using any 3D or viewpoint information or annotations during the learning
phase. We show that our approach produces 3D shapes of comparable quality to
GANs trained on 3D data for a number of shape categories including chairs,
airplanes, and cars. Experiments also show that the disentangled representation
of 2D shapes into geometry and viewpoint leads to a good generative model of 2D
shapes. The key advantage is that our model allows us to predict the 3D shape
and viewpoint of an input image, and to generate novel views from it, in a
completely unsupervised manner.
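The projection module can be illustrated in a few lines; this sketch assumes a binary voxel grid and a fixed axis-aligned viewpoint, whereas PrGANs rotate the grid by a latent viewpoint before projecting:

```python
import numpy as np

def project(voxels, axis=0):
    """Minimal projection module: collapse a binary voxel grid to a 2D
    silhouette by taking the maximum occupancy along a viewing axis.
    (Rotation by the latent viewpoint is omitted here for brevity.)"""
    return voxels.max(axis=axis)

# a 4x4x4 grid containing a 2x2x2 solid cube in one corner
v = np.zeros((4, 4, 4))
v[:2, :2, :2] = 1.0
silhouette = project(v, axis=2)   # view down the z-axis
print(int(silhouette.sum()))      # → 4 (a 2x2 square silhouette)
```

Because the operation is differentiable almost everywhere, a discriminator on 2D views can pass gradients back through the projection to the 3D generator, which is what lets the model train from 2D views alone.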
Sajad Mousavi, Ali Borji, Nasser Mozayani
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Bottom-Up (BU) saliency models do not perform well in complex interactive
environments where humans are actively engaged in tasks (e.g., sandwich making
and playing video games). In this paper, we leverage Reinforcement Learning
(RL) to highlight task-relevant locations of input frames. We propose a soft
attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL
agent how to play a game and where to look by focusing on the most pertinent
parts of its visual input. Our evaluations on several Atari 2600 games show
that the soft attention based model could predict fixation locations
significantly better than bottom-up models such as the Itti-Koch saliency and
Graph-Based Visual Saliency (GBVS) models.
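The attention-plus-DQN idea described above can be sketched at toy scale. The following is a minimal illustration, not the paper's architecture: random numpy arrays stand in for the convolutional features of a game frame and for the Q-network head, and all sizes are invented.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy spatial feature map: 16 locations x 8 channels (stand-ins for
# convolutional features of a game frame; sizes are illustrative).
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 8))

# Soft attention: a scoring vector (random here, learned in practice)
# assigns each location a weight; the softmax weights act as a
# fixation-like saliency map over the frame.
w_score = rng.normal(size=8)
attn = softmax(features @ w_score)      # shape (16,), sums to 1

# The attended context vector feeds the Q-value head; a random linear
# map over 4 hypothetical actions stands in for the DQN output layer.
context = attn @ features               # shape (8,)
W_q = rng.normal(size=(8, 4))
q_values = context @ W_q
greedy_action = int(np.argmax(q_values))
```

The attention weights themselves are what such a model would compare against human fixation maps.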
Xiangfei Kong, Lin Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a robust image enhancement algorithm dedicated to muscle fiber
specimen images captured by optical microscopes. Blur and out-of-focus problems
are prevalent in muscle images during the image acquisition stage. Traditional
image deconvolution methods do not work here, since they assume the blur
kernels are known, and they also produce ringing artifacts. We provide a
compact framework involving a novel spatially non-uniform blind deblurring
approach, specialized to muscle images, which automatically detects and
alleviates degraded regions. Ringing artifacts are addressed, and a kernel
propagation strategy is proposed to speed up the algorithm and deal with the
high non-uniformity of the blur
kernels on muscle images. Experiments show that the proposed framework performs
well on muscle images taken with modern advanced optical microscopes. Our
framework is free of laborious parameter settings and is computationally
efficient.
Liao Ni, Yi Zhang, He Zheng, Shilei Liu, Houjun Huang, Wenxin Li
Comments: 5 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Finger vein verification has developed considerably since it was first
proposed, but no single algorithm is yet perfect. It has been shown that
algorithms with the same overall accuracy may have different misclassification
patterns. We can exploit this complementarity by fusing individual algorithms
for a more precise result. According to our observations, an algorithm has
different confidence in its different decisions, but this is seldom considered
in fusion methods. Our work is the first to define the decision reliability
ratio to quantify this confidence, and then to propose the Maximum Decision
Reliability Ratio (MDRR) fusion method incorporating Weighted Voting. An
experiment conducted on a data set of 1000
fingers and 5 images per finger proves the effectiveness of the method. The
classifier obtained by MDRR method gets an accuracy of 99.42% while the maximum
accuracy of the original individual classifiers is 97.77%. The experimental
results also show that MDRR outperforms traditional fusion methods such as
Voting, Weighted Voting, Sum, and Weighted Sum.
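Confidence-weighted fusion of the kind described above can be sketched as follows. The exact definition of the decision reliability ratio is in the paper; the confidence values below are illustrative stand-ins, and both rules shown are generic sketches rather than the paper's precise MDRR procedure.

```python
# Each base verifier returns a decision ("genuine"/"forged") plus a
# confidence score standing in for its decision reliability ratio.
votes = [
    ("genuine", 0.95),
    ("forged",  0.55),
    ("genuine", 0.80),
]

def weighted_vote(votes):
    """Confidence-weighted voting over base classifier decisions."""
    tally = {}
    for decision, reliability in votes:
        tally[decision] = tally.get(decision, 0.0) + reliability
    return max(tally, key=tally.get)

def max_reliability_decision(votes):
    """Trust the single most confident classifier (MDRR-style rule)."""
    return max(votes, key=lambda v: v[1])[0]

fused = weighted_vote(votes)            # "genuine": 1.75 vs 0.55
mdrr = max_reliability_decision(votes)  # "genuine": top ratio is 0.95
```

Plain majority voting would also choose "genuine" here; the confidence weights matter precisely when the base classifiers split evenly but with unequal reliability.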
Ben Nassi, Alona Levy, Yuval Elovici, Erez Shmueli
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
Online signature verification technologies, such as those available in banks
and post offices, rely on dedicated digital devices such as tablets or smart
pens to capture, analyze and verify signatures. In this paper, we suggest a
novel method for online signature verification that relies on the increasingly
available hand-worn devices, such as smartwatches or fitness trackers, instead
of dedicated ad-hoc devices. Our method uses a set of known genuine and forged
signatures, recorded using the motion sensors of a hand-worn device, to train a
machine learning classifier. Then, given the recording of an unknown signature
and a claimed identity, the classifier can determine whether the signature is
genuine or forged. In order to validate our method, we applied it to 1980
recordings of genuine and forged signatures collected from 66 subjects
in our institution. Using our method, we were able to successfully distinguish
between genuine and forged signatures with a high degree of accuracy (0.98 AUC
and 0.05 EER).
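The two reported metrics (AUC and EER) can be computed directly from classifier scores. A minimal sketch with invented scores, not the paper's data: AUC is estimated as the probability that a genuine recording outscores a forged one, and EER as the operating point where false-accept and false-reject rates meet under a coarse threshold sweep.

```python
# Toy classifier scores (higher = more likely genuine); illustrative.
genuine_scores = [0.9, 0.8, 0.75, 0.7, 0.6]
forged_scores  = [0.65, 0.4, 0.35, 0.3, 0.2]

def auc(pos, neg):
    """AUC as the probability a genuine score outranks a forged one."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def eer(pos, neg):
    """Equal error rate: threshold where FAR ~= FRR (coarse sweep)."""
    best = 1.0
    for t in sorted(pos + neg):
        far = sum(n >= t for n in neg) / len(neg)   # forged accepted
        frr = sum(p < t for p in pos) / len(pos)    # genuine rejected
        best = min(best, max(far, frr))
    return best

a = auc(genuine_scores, forged_scores)   # 0.96 on this toy data
e = eer(genuine_scores, forged_scores)   # 0.20 on this toy data
```

On the paper's real data the same two statistics come out at 0.98 AUC and 0.05 EER.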
Evan Schwab, René Vidal, Nicolas Charon
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
Diffusion MRI (dMRI) can reconstruct neuronal fibers in the brain, in vivo,
by measuring water diffusion along angular gradient directions in q-space. High
angular resolution diffusion imaging (HARDI) can produce better estimates of
fiber orientation than the popularly used diffusion tensor imaging, but the
high number of samples needed to estimate diffusivity requires lengthy patient
scan times. To accelerate dMRI, compressed sensing (CS) has been utilized by
exploiting a sparse representation of the data, discovered through sparse
coding. The sparser the representation, the fewer samples are needed to
reconstruct a high resolution signal with limited information loss, and so a
focus of much dMRI research has been finding the sparsest possible dictionary
representation. All prior methods, however, rely on an angular model of q-space
signals in each voxel which fundamentally limits the global sparsity level
since at least one dictionary atom is needed for each voxel. In contrast, we
formulate a global spatial-angular representation of dMRI that will allow us to
sparsely model an entire dMRI brain signal below the limit of one atom per
voxel using joint spatial-angular sparse coding. But a main challenge is
optimizing over large-scale dMRI data. In this work, we present extensions to a
number of sparse coding algorithms that are better suited for large-scale
problems by exploiting the separable Kronecker structure of our global
spatial-angular dictionary. We compare the complexity and speed of our methods
with prior Kronecker sparse coding algorithms and show promising sparsity
results on phantom and real HARDI brain data for various dictionary choices.
With great efficiency, our method achieves significantly sparser HARDI
representations than the state of the art, which has the potential to achieve
new levels of HARDI acceleration within a unified (k,q)-CS framework.
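The separable Kronecker structure the abstract exploits rests on a standard identity: a Kronecker-product dictionary applied to a vectorized coefficient matrix can be evaluated with two small matrix products, never forming the large matrix. A minimal numpy sketch with illustrative sizes (the real spatial and angular dictionaries are far larger):

```python
import numpy as np

rng = np.random.default_rng(0)

# Spatial dictionary A and angular dictionary B (sizes illustrative);
# the global spatial-angular dictionary is their Kronecker product.
A = rng.normal(size=(6, 4))   # spatial atoms
B = rng.normal(size=(5, 3))   # angular atoms
X = rng.normal(size=(4, 3))   # coefficient matrix (sparse in practice)

# Naive: materialize the (30 x 12) Kronecker dictionary explicitly.
naive = np.kron(A, B) @ X.ravel()

# Separable trick: (A kron B) vec(X) = vec(A X B^T) under numpy's
# row-major flattening -- the large matrix is never formed.
fast = (A @ X @ B.T).ravel()
```

For an m x p spatial and n x q angular dictionary, the trick drops the cost of one multiply from O(mnpq) with O(mnpq) memory to two small products, which is what makes large-scale joint spatial-angular sparse coding tractable.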
Diederik Aerts, Jonito Aerts Arguëlles, Lester Beltran, Lyneth Beltran, Massimiliano Sassoli de Bianchi, Sandro Sozzo, Tomas Veloz
Comments: 12 pages, no figures
Subjects: Artificial Intelligence (cs.AI); Quantum Physics (quant-ph)
The mathematical formalism of quantum theory exhibits significant
effectiveness when applied to cognitive phenomena that have resisted
traditional (set theoretical) modeling. Relying on a decade of research on the
operational foundations of micro-physical and conceptual entities, we present a
theoretical framework for the representation of concepts and their conjunctions
and disjunctions that uses the quantum formalism. This framework provides a
unified solution to the ‘conceptual combinations problem’ of cognitive
psychology, explaining the observed deviations from classical (Boolean, fuzzy
set and Kolmogorovian) structures in terms of genuine quantum effects. In
particular, natural concepts ‘interfere’ when they combine to form more
complex conceptual entities, and they also exhibit a ‘quantum-type
context-dependence’; these effects are responsible for the ‘over- and
under-extension’ systematically observed in experiments on membership
judgments.
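In the two-state formulation standard in this line of work, the membership weight of a disjunction deviates from the classical average through an interference term. Schematically (a sketch, with unit vectors \(\psi_A, \psi_B\) representing the concepts and \(M\) the membership projector; the paper's full framework is more general):

```latex
\mu(A \text{ or } B) \;=\; \frac{\mu(A) + \mu(B)}{2}
  \;+\; \mathrm{Re}\,\langle \psi_A | M | \psi_B \rangle
```

When the interference term is positive the combination is over-extended relative to the classical average, and when negative it is under-extended.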
Ahmed M. Alaa, Mihaela van der Schaar
Subjects: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Modeling continuous-time physiological processes that manifest a patient’s
evolving clinical states is a key step in approaching many problems in
healthcare. In this paper, we develop the Hidden Absorbing Semi-Markov Model
(HASMM): a versatile probabilistic model that is capable of capturing the
modern electronic health record (EHR) data. Unlike existing models, an HASMM
accommodates irregularly sampled, temporally correlated, and informatively
censored physiological data, and can describe non-stationary clinical state
transitions. Learning an HASMM from the EHR data is achieved via a novel
forward-filtering backward-sampling Monte-Carlo EM algorithm that exploits the
knowledge of the end-point clinical outcomes (informative censoring) in the EHR
data, and implements the E-step by sequentially sampling the patients’ clinical
states in the reverse-time direction while conditioning on the future states.
Real-time inferences are drawn via a forward-filtering algorithm that operates
on a virtually constructed discrete-time embedded Markov chain that mirrors the
patient’s continuous-time state trajectory. We demonstrate the diagnostic and
prognostic utility of the HASMM in a critical care prognosis setting using a
real-world dataset for patients admitted to the Ronald Reagan UCLA Medical
Center.
Kavosh Asadi, Jason D. Williams
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Representing a dialog policy as a recurrent neural network (RNN) is
attractive because it handles partial observability, infers a latent
representation of state, and can be optimized with supervised learning (SL) or
reinforcement learning (RL). For RL, a policy gradient approach is natural, but
is sample inefficient. In this paper, we present 3 methods for reducing the
number of dialogs required to optimize an RNN-based dialog policy with RL. The
key idea is to maintain a second RNN which predicts the value of the current
policy, and to apply experience replay to both networks. On two tasks, these
methods reduce the number of dialogs/episodes required by about a third, vs.
standard policy gradient methods.
Valentina Franzoni, Giulio Biondi, Alfredo Milani, Yuanxi Li
Comments: Authors preprint, including revision differences with respect to the main publication ‘Web-based Similarity for Emotion Recognition in Web Objects’ published in the UCC ’16 workshop in IEEE UCC, December 06 – 09, 2016, Shanghai, China. DOI: this http URL
Journal-ref: In Proc 9th International Conference on Utility and Cloud
Computing (UCC 2016). ACM, New York, NY, USA, 327-332
Subjects: Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)
In this project we propose a new approach for emotion recognition using
web-based similarity (e.g. confidence, PMI and PMING). We aim to extract basic
emotions from short sentences with emotional content (e.g. news titles, tweets,
captions), performing a web-based quantitative evaluation of semantic proximity
between each word of the analyzed sentence and each emotion of a psychological
model (e.g. Plutchik, Ekman, Lovheim). The phases of the extraction include:
text preprocessing (tokenization, stop words, filtering), search engine
automated query, HTML parsing of results (i.e. scraping), estimation of
semantic proximity, ranking of emotions according to proximity measures. The
main idea is that, since semantic similarity can be generalized under the
assumption that similar concepts co-occur in documents indexed in search
engines, emotions can be generalized in the same way, through the tags or
terms that express them in a particular language.
Training results are compared to human evaluation, then additional comparative
tests on results are performed, both for the global ranking correlation (e.g.
Kendall, Spearman, Pearson) and for the evaluation of the emotion linked to
each single word. Different from sentiment analysis, our approach works at a
deeper level of abstraction, aiming at recognizing specific emotions and not
only the positive/negative sentiment, in order to predict emotions as semantic
data.
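Web-based similarity measures such as PMI can be estimated from search-engine hit counts. A minimal sketch of the co-occurrence idea, with invented counts (not real engine results) and a hypothetical word/emotion pair:

```python
import math

# Hypothetical search-engine hit counts: total indexed pages N,
# single-term counts, and word-emotion co-occurrence counts.
N = 1_000_000
hits = {"victory": 12_000, "joy": 30_000, "fear": 25_000}
cooc = {("victory", "joy"): 900, ("victory", "fear"): 60}

def pmi(word, emotion):
    """Pointwise mutual information from document co-occurrence:
    log( p(w,e) / (p(w) p(e)) ) estimated via hit counts."""
    return math.log((cooc[(word, emotion)] * N)
                    / (hits[word] * hits[emotion]))

# Rank candidate emotions for a word by semantic proximity.
ranking = sorted(["joy", "fear"], key=lambda e: pmi("victory", e),
                 reverse=True)
```

A positive PMI means the word and the emotion term co-occur more often than chance; ranking emotions by such proximity scores is the core of the proposed extraction pipeline.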
Hang Ma, Sven Koenig
Comments: In AAMAS 2016
Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA); Robotics (cs.RO)
We study the TAPF (combined target-assignment and path-finding) problem for
teams of agents in known terrain, which generalizes both the anonymous and
non-anonymous multi-agent path-finding problems. Each of the teams is given the
same number of targets as there are agents in the team. Each agent has to move
to exactly one target given to its team such that all targets are visited. The
TAPF problem is to first assign agents to targets and then plan collision-free
paths for the agents to their targets in a way such that the makespan is
minimized. We present the CBM (Conflict-Based Min-Cost-Flow) algorithm, a
hierarchical algorithm that solves TAPF instances optimally by combining ideas
from anonymous and non-anonymous multi-agent path-finding algorithms. On the
low level, CBM uses a min-cost max-flow algorithm on a time-expanded network to
assign all agents in a single team to targets and plan their paths. On the high
level, CBM uses conflict-based search to resolve collisions among agents in
different teams. Theoretically, we prove that CBM is correct, complete and
optimal. Experimentally, we show the scalability of CBM to TAPF instances with
dozens of teams and hundreds of agents and adapt it to a simulated warehouse
system.
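The makespan objective driving TAPF can be illustrated at toy scale. The sketch below brute-forces a single-team assignment with Manhattan distances and ignores collisions entirely; CBM instead solves this step with a min-cost max-flow on a time-expanded network and handles inter-team collisions via conflict-based search. Coordinates are invented.

```python
from itertools import permutations

# Toy single-team instance on a grid: agents and targets as (x, y).
agents = [(0, 0), (4, 0), (0, 4)]
targets = [(1, 1), (5, 1), (2, 4)]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Makespan = longest individual travel time; minimize it over all
# ways of assigning this team's agents to its targets.
best_makespan, best_assignment = min(
    (max(manhattan(a, t) for a, t in zip(agents, perm)), perm)
    for perm in permutations(targets)
)
```

Note the objective is the maximum (not the sum) of the path lengths, which is why anonymous flow-based assignment, rather than a classical min-sum assignment, is the right low-level building block.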
Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
This paper presents a novel yet intuitive approach to unsupervised feature
learning. Inspired by the human visual system, we explore whether low-level
motion-based grouping cues can be used to learn an effective visual
representation. Specifically, we use unsupervised motion-based segmentation on
videos to obtain segments, which we use as ‘pseudo ground truth’ to train a
convolutional network to segment objects from a single frame. Given the
extensive evidence that motion plays a key role in the development of the human
visual system, we hope that this straightforward approach to unsupervised
learning will be more effective than cleverly designed ‘pretext’ tasks studied
in the literature. Indeed, our extensive experiments show that this is the
case. When used for transfer learning on object detection, our representation
significantly outperforms previous unsupervised approaches across multiple
settings, especially when training data for the target task is scarce.
Sam Ganzfried, Farzana Yusuf
Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Learning (cs.LG); Multiagent Systems (cs.MA)
Algorithms for equilibrium computation generally make no attempt to ensure
that the computed strategies are understandable by humans. For instance, the
strategies for the strongest poker agents are represented as massive binary
files. In many situations, we would like to compute strategies that can
actually be implemented by humans, who may have computational limitations and
may only be able to remember a small number of features or components of the
strategies that have been computed. We study poker games where private
information distributions can be arbitrary. We create a large training set of
game instances and solutions, by randomly selecting the private information
probabilities, and present algorithms that learn from the training instances in
order to perform well in games with unseen information distributions. One
approach first clusters the training points into a small number of clusters and
then creates a small decision tree based on the cluster centers. This approach
produces low test error and could be easily implemented by humans since it only
requires memorizing a small number of “if-then” rules.
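The cluster-then-rules approach can be sketched in one dimension. The sketch below is illustrative only: invented training values stand in for game instances and their solutions, a tiny k-means compresses them into two centers, and a single memorizable "if-then" rule replaces the full solution table.

```python
# Toy stand-in for the pipeline: 1-D "private information" training
# values, clustered into two centers, then turned into one rule.
train = [0.05, 0.1, 0.15, 0.7, 0.8, 0.9]

def kmeans_1d(xs, centers, iters=20):
    """Plain Lloyd iterations on scalars."""
    for _ in range(iters):
        groups = {c: [] for c in centers}
        for x in xs:
            nearest = min(centers, key=lambda c: abs(x - c))
            groups[nearest].append(x)
        centers = [sum(g) / len(g) for g in groups.values() if g]
    return sorted(centers)

lo, hi = kmeans_1d(train, centers=[0.0, 1.0])
boundary = (lo + hi) / 2         # midpoint between cluster centers

def human_rule(p):
    # The entire "strategy" a human must memorize (labels invented).
    return "check" if p < boundary else "bet"
```

A small decision tree over cluster centers generalizes this to several features while staying within the handful of rules a person can remember.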
Ganesh J, Manish Gupta, Vasudeva Varma
Comments: To be presented at European Conference on Information Retrieval (ECIR) 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
In this work we propose a novel representation learning model which computes
semantic representations for tweets accurately. Our model systematically
exploits the chronologically adjacent tweets (‘context’) from users’ Twitter
timelines for this task. Further, we make our model user-aware so that it can
better model the target tweet by exploiting rich knowledge about the user,
such as the way the user writes posts and the topics
on which the user writes. We empirically demonstrate that the proposed models
outperform the state-of-the-art models in predicting the user profile
attributes like spouse, education and job by 19.66%, 2.27% and 2.22%
respectively.
Raphael Shu, Hideki Nakayama
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Recently, the attention mechanism has played a key role in achieving high
performance in Neural Machine Translation models. It applies a score function to the
encoder states to obtain alignment weights. However, as this computation is
done for all positions in each decoding step, the attention mechanism greatly
increases the computational complexity. In this paper we propose a novel
attention model which can reduce redundant attentional computations in a
flexible manner. The proposed mechanism tracks the center of attention in each
decoding step, and computes position-based penalties. At test time, the
computations of the score function for heavily penalized positions are skipped.
In our experiments, we found that the computations in the attention model can
be reduced by 54% on average with almost no loss of accuracy.
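The skip-by-penalty idea can be sketched with numpy. This is a sketch of the mechanism only, not the paper's exact penalty or tracking rule: a quadratic penalty grows with distance from a tracked attention center, and positions whose penalty exceeds a budget are never scored at all.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
enc = rng.normal(size=(50, 16))   # encoder states (length x dim)
query = rng.normal(size=16)       # current decoder state

# Position-based penalty around the tracked center of attention;
# center/strength/budget values are illustrative.
center, strength, budget = 20, 0.05, 3.0
positions = np.arange(len(enc))
penalty = strength * (positions - center) ** 2
active = penalty <= budget        # only these positions get scored

# Score only active positions; the rest stay at -inf (weight zero),
# so their dot products are genuinely skipped.
scores = np.full(len(enc), -np.inf)
scores[active] = enc[active] @ query - penalty[active]
weights = softmax(scores)
context = weights @ enc

saved = 1 - active.mean()         # fraction of score computations skipped
```

On this toy setup 70% of the score-function evaluations are skipped; the paper reports an average 54% reduction with almost no accuracy loss.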
Erik Talvitie
Comments: To appear in Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017. This version incorporates the appendix into document (rather than as supplementary material)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
When an agent cannot represent a perfectly accurate model of its
environment’s dynamics, model-based reinforcement learning (MBRL) can fail
catastrophically. Planning involves composing the predictions of the model;
when flawed predictions are composed, even minor errors can compound and render
the model useless for planning. Hallucinated Replay (Talvitie 2014) trains the
model to “correct” itself when it produces errors, substantially improving MBRL
with flawed models. This paper theoretically analyzes this approach,
illuminates settings in which it is likely to be effective or ineffective, and
presents a novel error bound, showing that a model’s ability to self-correct is
more tightly related to MBRL performance than one-step prediction error. These
results inspire an MBRL algorithm for deterministic MDPs with performance
guarantees that are robust to model class limitations.
Fanlin Meng, Xiao-Jun Zeng, Yan Zhang, Chris J. Dent, Dunwei Gong
Comments: 9 pages, 5 figures
Subjects: Systems and Control (cs.SY); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Optimization and Control (math.OC)
In this paper, we consider a realistic scenario in the context of smart grids
where an electricity retailer serves three different types of customers, i.e.,
customers with home energy management system embedded in their smart meters
(C-HEMS), customers with only smart meters (C-SM), and customers without smart
meters (C-NONE). The main objective of this paper is to support the retailer to
make optimal day-ahead dynamic pricing decisions subject to realistic market
constraints in such a mixed customer pool. To this end, we firstly propose an
optimal energy management model for each C-HEMS, two appliance-level customer
behaviour learning models for each C-SM, and an aggregated demand forecasting
model for the C-NONE population as a whole. With these demand models capturing
the energy consumption patterns of the different customer types, we then adopt
a genetic algorithm (GA) based solution framework to obtain the optimal
dynamic prices
for the retailer. In addition to smart meters and HEMS, the penetration of
Photovoltaic (PV) generation and energy storage in the households is further
considered in the pricing optimization model. Simulation results indicate that
the proposed pricing optimization algorithms are effective.
Karl Schlechta (LIF)
Subjects: Logic (math.LO); Artificial Intelligence (cs.AI)
We use the theory of defaults and their meaning of [GS16] to develop (the
outline of a) new theory of argumentation.
Mirko Polato, Fabio Aiolli
Comments: Under revision for Neurocomputing (Elsevier Journal)
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Learning (cs.LG)
The increasing availability of implicit feedback datasets has raised the
interest in developing effective collaborative filtering techniques able to
deal asymmetrically with unambiguous positive feedback and ambiguous negative
feedback. In this paper, we propose a principled kernel-based collaborative
filtering method for top-N item recommendation with implicit feedback. We
present an efficient implementation using the linear kernel, and we show how to
generalize it to kernels of the dot product family preserving the efficiency.
We also investigate the elements that influence the sparsity of a standard
cosine kernel. This analysis shows that the sparsity of the kernel strongly
depends on the properties of the dataset, in particular on the long tail
distribution. We compare our method with state-of-the-art algorithms achieving
good results both in terms of efficiency and effectiveness.
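The efficient linear-kernel case can be sketched on a toy implicit-feedback matrix. This is a generic item-similarity sketch of the linear kernel, not the paper's full method: the kernel between items is computed as R^T R without materializing anything larger, and consumed items are excluded from the top-N list.

```python
import numpy as np

# Binary implicit-feedback matrix: rows = users, cols = items
# (1 = unambiguous positive feedback, 0 = ambiguous). Toy data.
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

# Linear kernel between items, computed efficiently as R^T R.
K = R.T @ R

user = 0
scores = R[user] @ K              # similarity-sum score per item
scores[R[user] == 1] = -np.inf    # never re-recommend consumed items
top_n = np.argsort(scores)[::-1][:1]
```

Normalizing K by item norms would give the cosine kernel discussed in the abstract, whose sparsity depends on how long-tailed the feedback distribution is.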
Daniel Crawford, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, Pooya Ronagh
Comments: 16 pages
Subjects: Quantum Physics (quant-ph); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)
We investigate whether quantum annealers with select chip layouts can
outperform classical computers in reinforcement learning tasks. We associate a
transverse field Ising spin Hamiltonian with a layout of qubits similar to that
of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to
numerically simulate quantum sampling from this system. We design a
reinforcement learning algorithm in which the set of visible nodes representing
the states and actions of an optimal policy are the first and last layers of
the deep network. In the absence of a transverse field, our simulations show that
DBMs train more effectively than restricted Boltzmann machines (RBM) with the
same number of weights. Since sampling from Boltzmann distributions of a DBM is
not classically feasible, this is evidence of the supremacy of a non-Turing
sampling oracle. We then develop a framework for training the network as a
quantum Boltzmann machine (QBM) in the presence of a significant transverse
field for reinforcement learning. The latter method further improves the
reinforcement learning method using DBMs.
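For the classical RBM baseline mentioned above, a common formulation in Boltzmann-machine RL (used in this line of work) takes the action value to be the negative free energy of the clamped visible units. A sketch under that assumption, with random untrained weights and invented sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny RBM whose visible layer one-hot encodes a state and an action.
n_states, n_actions, n_hidden = 3, 2, 4
W = rng.normal(scale=0.1, size=(n_hidden, n_states + n_actions))
b_vis = np.zeros(n_states + n_actions)
b_hid = np.zeros(n_hidden)

def q_value(state, action):
    """Q(s, a) = -F(v): negative RBM free energy of clamped visibles,
    F(v) = -b.v - sum_j log(1 + exp(c_j + W_j . v))."""
    v = np.zeros(n_states + n_actions)
    v[state] = 1.0
    v[n_states + action] = 1.0
    free_energy = -b_vis @ v - np.sum(np.logaddexp(0.0, b_hid + W @ v))
    return -free_energy

greedy = max(range(n_actions), key=lambda a: q_value(0, a))
```

For deep Boltzmann machines the free energy is no longer available in closed form, which is exactly why the paper resorts to (simulated) quantum annealing to sample the hidden configurations.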
Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen
Comments: 14 pages
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Despite widespread interests in reinforcement-learning for task-oriented
dialogue systems, several obstacles can frustrate research and development
progress. First, reinforcement learners typically require interaction with the
environment, so conventional dialogue corpora cannot be used directly. Second,
each task presents specific challenges, requiring a separate corpus of
task-specific annotated data. Third, collecting and annotating human-machine or
human-human conversations for task-oriented dialogues requires extensive domain
knowledge. Because building an appropriate dataset can be both financially
costly and time-consuming, one popular approach is to build a user simulator
based upon a corpus of example dialogues. Then, one can train reinforcement
learning agents in an online fashion as they interact with the simulator.
Dialogue agents trained on these simulators can serve as an effective starting
point. Once agents master the simulator, they may be deployed in a real
environment to interact with humans, and continue to be trained online. To ease
empirical algorithmic comparisons in dialogues, this paper introduces a new,
publicly available simulation framework, where our simulator, designed for the
movie-booking domain, leverages both rules and collected data. The simulator
supports two tasks: movie ticket booking and movie seeking. Finally, we
demonstrate several agents and detail the procedure to add and test your own
agent in the proposed framework.
Nuno Moniz, Luís Torgo, João Vinagre
Comments: 12 pages, 4 figures
Subjects: Information Retrieval (cs.IR)
Ranking evaluation metrics are a fundamental element of design and
improvement efforts in information retrieval. We observe that most popular
metrics disregard information portrayed in the scores used to derive rankings,
when available. This may pose a numerical scaling problem, causing an under- or
over-estimation of the evaluation depending on the degree of divergence between
the scores of ranked items. The purpose of this work is to propose a principled
way of quantifying multi-graded relevance judgments of items and enable a more
accurate penalization of ordering errors in rankings. We propose a data-driven
generation of relevance functions based on the degree of the divergence amongst
a set of items’ scores and its application in the evaluation metric Normalized
Discounted Cumulative Gain (nDCG). We use synthetic data to demonstrate the
interest of our proposal and a combination of data on news items from Google
News and their respective popularity in Twitter to show its performance in
comparison to the standard nDCG. Results show that our proposal is capable of
providing a more fine-grained evaluation of rankings when compared to the
standard nDCG, and that the latter frequently under- or over-estimates its
evaluation scores in light of the divergence of items’ scores.
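The effect the abstract describes can be made concrete with a small nDCG computation. The grading rule below (min-max normalizing raw scores into gains) is an illustrative stand-in for the paper's data-driven relevance functions, with invented scores:

```python
import math

def dcg(gains):
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains_in_ranked_order):
    ideal = sorted(gains_in_ranked_order, reverse=True)
    return dcg(gains_in_ranked_order) / dcg(ideal)

# Score-derived grades: nearly-tied items get nearly-equal gains.
scores = [100.0, 98.0, 10.0]         # raw scores of the ranked items
lo, hi = min(scores), max(scores)
graded = [(s - lo) / (hi - lo) for s in scores]

# Swapping the two nearly-tied top items is barely penalized...
mild = ndcg([graded[1], graded[0], graded[2]])
# ...but ranking the clearly worst item first costs much more.
bad = ndcg([graded[2], graded[0], graded[1]])
```

A fixed-grade nDCG would penalize both orderings similarly; deriving gains from the divergence of the underlying scores is what yields the finer-grained penalization of ordering errors.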
Gerhard Gossen, Elena Demidova, Thomas Risse
Comments: Published in the Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries 2015
Journal-ref: Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital
Libraries (pp. 75–84) (2015)
Subjects: Digital Libraries (cs.DL); Information Retrieval (cs.IR)
Researchers in the Digital Humanities and journalists need to monitor,
collect and analyze fresh online content regarding current events such as the
Ebola outbreak or the Ukraine crisis on demand. However, existing focused
crawling approaches only consider topical aspects while ignoring temporal
aspects and therefore cannot achieve thematically coherent and fresh Web
collections. Social Media in particular provide a rich source of fresh content,
which is not used by state-of-the-art focused crawlers. In this paper we
address the issues of enabling the collection of fresh and relevant Web and
Social Web content for a topic of interest through seamless integration of Web
and Social Media in a novel integrated focused crawler. The crawler collects
Web and Social Media content in a single system and exploits the stream of
fresh Social Media content for guiding the crawler.
Ciprian-Octavian Truică, Jérôme Darmont (ERIC), Julien Velcin (ERIC)
Journal-ref: 12th International Conference on Advanced Data Mining and
Applications (ADMA 2016), Dec 2016, Gold Coast, Australia. Springer, 10086,
pp.481-494, 2016, Lecture Notes in Artificial Intelligence
Subjects: Databases (cs.DB); Information Retrieval (cs.IR)
Analyzing textual data is a very challenging task because of the huge volume
of data generated daily. Fundamental issues in text analysis include the lack
of structure in document datasets, the need for various preprocessing steps
(e.g., stem or lemma extraction, part-of-speech tagging, named entity
recognition…), and performance and scaling issues. Existing text analysis
architectures partly solve these issues, providing restrictive data schemas,
addressing only one aspect of text preprocessing and focusing on one single
task when dealing with performance optimization. As a result, no definite
solution is currently available. Thus, we propose in this paper a new generic
text analysis architecture, where document structure is flexible, many
preprocessing techniques are integrated and textual datasets are indexed for
efficient access. We implement our conceptual architecture using both a
relational and a document-oriented database. Our experiments demonstrate the
feasibility of our approach and the superiority of the document-oriented
logical and physical implementation.
Gerhard Gossen, Elena Demidova, Thomas Risse
Comments: Published in the Proceedings of the European Conference on Information Retrieval (ECIR) 2015
Subjects: Digital Libraries (cs.DL); Information Retrieval (cs.IR)
Collections of Web documents about specific topics are needed for many areas
of current research. Focused crawling enables the creation of such collections
on demand. Current focused crawlers require the user to manually specify
starting points for the crawl (seed URLs). These are also used to describe the
expected topic of the collection. The choice of seed URLs influences the
quality of the resulting collection and requires a lot of expertise. In this
demonstration we present the iCrawl Wizard, a tool that assists users in
defining focused crawls efficiently and semi-automatically. Our tool uses major
search engines and Social Media APIs as well as information extraction
techniques to find seed URLs and a semantic description of the crawl intent.
Using the iCrawl Wizard even non-expert users can create semantic
specifications for focused crawlers interactively and efficiently.
Christophe Servan, Josep Crego, Jean Senellart
Comments: Submitted to EACL 2017 short paper
Subjects: Computation and Language (cs.CL)
Domain adaptation is a key feature in Machine Translation. It generally
encompasses terminology, domain and style adaptation, especially for human
post-editing workflows in Computer Assisted Translation (CAT). With Neural
Machine Translation (NMT), we introduce a new notion of domain adaptation that
we call “specialization” and which is showing promising results both in the
learning speed and in adaptation accuracy. In this paper, we propose to explore
this approach under several perspectives.
Catherine Kobus, Josep Crego, Jean Senellart
Comments: Submitted to EACL 2017 short paper
Subjects: Computation and Language (cs.CL)
Machine translation systems are very sensitive to the domains they were
trained on. Several domain adaptation techniques have been deeply studied. We
propose a new technique for neural machine translation (NMT) that we call
domain control which is performed at runtime using a unique neural network
covering multiple domains. The presented approach shows quality improvements
over dedicated per-domain models when translating in any of the covered
domains, and even on out-of-domain data. In addition, model parameters do not
need to be re-estimated for each domain, making the approach practical for
real use cases.
Evaluation is carried out on English-to-French translation for two different
testing scenarios. We first consider the case where an end-user performs
translations on a known domain. Secondly, we consider the scenario where the
domain is not known and predicted at the sentence level before translating.
Results show consistent accuracy improvements for both conditions.
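The abstract does not spell out how domain control is implemented; one common way to realize runtime domain control in a single multi-domain NMT model is to tag each source sentence with a domain token, as in this hypothetical sketch (the token format and the fallback behaviour are assumptions, not the paper's exact mechanism):

```python
def add_domain_token(source_sentence, domain, known_domains):
    """Prefix a source sentence with a domain token so a single NMT model
    covering multiple domains can condition its output on the requested
    domain. `known_domains` is the set of domains seen during training;
    an unseen domain falls back to a generic token (an assumption here).
    """
    token = f"<{domain}>" if domain in known_domains else "<generic>"
    return f"{token} {source_sentence}"

# At runtime the domain may be supplied by the end-user (known-domain
# scenario) or predicted per sentence (unknown-domain scenario).
tagged = add_domain_token("the patient shows no symptoms", "medical",
                          {"medical", "legal", "news"})
```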
Josep Crego, Jean Senellart
Comments: Submitted to EACL 2017 Short paper
Subjects: Computation and Language (cs.CL)
Text simplification aims at reducing the lexical, grammatical and structural
complexity of a text while keeping the same meaning. In the context of machine
translation, we introduce the idea of simplified translations in order to boost
the learning ability of deep neural translation models. We conduct preliminary
experiments showing that, when a neural machine translation (NMT) system is
learned on a bi-text, its translation of the source side is less complex than
the target reference of that same bi-text. Based on the idea of knowledge
distillation, we then train an NMT system
using the simplified bi-text, and show that it outperforms the initial system
that was built over the reference data set. Performance is further boosted when
both reference and automatic translations are used to learn the network. We
perform an elementary analysis of the translated corpus and report accuracy
results of the proposed approach on English-to-French and English-to-German
translation tasks.
Dakun Zhang, Jungi Kim, Josep Crego, Jean Senellart
Comments: Submitted to EACL 2017 short paper
Subjects: Computation and Language (cs.CL)
Training efficiency is one of the main problems for Neural Machine
Translation (NMT). Deep networks, very large data and many training iterations
are necessary to achieve state-of-the-art performance for NMT. This results in
very high computation cost and slows down research and industrialization. In
this paper, we first investigate the training instability caused by
randomization in NMT, and further propose an efficient training method based on
data boosting and bootstrapping with no modifications to the neural network.
Experiments show that this method can converge much faster compared with a
baseline system and achieve a stable improvement of up to 2.36 BLEU points with
only 80% of the training cost.
Ganesh J, Manish Gupta, Vasudeva Varma
Comments: To be presented at European Conference on Information Retrieval (ECIR) 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
In this work we propose a novel representation learning model which computes
semantic representations for tweets accurately. Our model systematically
exploits the chronologically adjacent tweets (‘context’) from users’ Twitter
timelines for this task. Further, we make our model user-aware so that it can
do well in modeling the target tweet by exploiting the rich knowledge about the
user such as the way the user writes the post and also summarizing the topics
on which the user writes. We empirically demonstrate that the proposed models
outperform the state-of-the-art models in predicting the user profile
attributes like spouse, education and job by 19.66%, 2.27% and 2.22%
respectively.
Raphael Shu, Hideki Nakayama
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Recently, the attention mechanism has played a key role in achieving high
performance for Neural Machine Translation models. It applies a score function
to the
encoder states to obtain alignment weights. However, as this computation is
done for all positions in each decoding step, the attention mechanism greatly
increases the computational complexity. In this paper we propose a novel
attention model which can reduce redundant attentional computations in a
flexible manner. The proposed mechanism tracks the center of attention in each
decoding step, and computes position-based penalties. At test time, the
computations of the score function for heavily penalized positions are skipped.
In our experiments, we found that the computations in the attention model can
be reduced by 54% on average with almost no loss of accuracy.
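The described mechanism (track a center of attention, penalize distant positions, skip heavily penalized ones) can be sketched as follows; the quadratic penalty form, the window size, and the skip threshold are illustrative assumptions, not the paper's exact definitions:

```python
import math

def local_attention_weights(enc_states, query_score, center,
                            window=5, penalty_threshold=4.0):
    """Compute attention weights only for positions whose position-based
    penalty is below a threshold; heavily penalized positions are skipped,
    so their score function is never evaluated.

    enc_states: sequence of encoder states (anything query_score accepts)
    query_score: function(state) -> unnormalized attention score
    center: current center of attention (float position)
    """
    scores = {}
    for pos, state in enumerate(enc_states):
        # Quadratic penalty growing with distance from the attention center.
        penalty = (pos - center) ** 2 / (2 * (window / 2) ** 2)
        if penalty > penalty_threshold:
            continue  # skip the score-function computation entirely
        scores[pos] = query_score(state) - penalty
    # Softmax over the surviving positions only.
    m = max(scores.values())
    exps = {p: math.exp(s - m) for p, s in scores.items()}
    z = sum(exps.values())
    return {p: e / z for p, e in exps.items()}
```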
Katharina Kann, Ryan Cotterell, Hinrich Schütze
Comments: EACL 2017
Subjects: Computation and Language (cs.CL)
We explore the task of multi-source morphological reinflection, which
generalizes the standard, single-source version. The input consists of (i) a
target tag and (ii) multiple pairs of source form and source tag for a lemma.
The motivation is that it is beneficial to have access to more than one source
form since different source forms can provide complementary information, e.g.,
different stems. We further present a novel extension to the encoder-decoder
recurrent neural architecture, consisting of multiple encoders, to better solve
the task. We show that our new architecture outperforms single-source
reinflection models and publish our dataset for multi-source morphological
reinflection to facilitate future research.
Thomas Laurent, James von Brecht
Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL); Learning (cs.LG)
We introduce an exceptionally simple gated recurrent neural network (RNN)
that achieves performance comparable to well-known gated architectures, such as
LSTMs and GRUs, on the word-level language modeling task. We prove that our
model has simple, predictable and non-chaotic dynamics. This stands in stark
contrast to more standard gated architectures, whose underlying dynamical
systems exhibit chaotic behavior.
Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen
Comments: 14 pages
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Despite widespread interest in reinforcement learning for task-oriented
dialogue systems, several obstacles can frustrate research and development
progress. First, reinforcement learners typically require interaction with the
environment, so conventional dialogue corpora cannot be used directly. Second,
each task presents specific challenges, requiring a separate corpus of
task-specific annotated data. Third, collecting and annotating human-machine or
human-human conversations for task-oriented dialogues requires extensive domain
knowledge. Because building an appropriate dataset can be both financially
costly and time-consuming, one popular approach is to build a user simulator
based upon a corpus of example dialogues. Then, one can train reinforcement
learning agents in an online fashion as they interact with the simulator.
Dialogue agents trained on these simulators can serve as an effective starting
point. Once agents master the simulator, they may be deployed in a real
environment to interact with humans, and continue to be trained online. To ease
empirical algorithmic comparisons in dialogues, this paper introduces a new,
publicly available simulation framework, where our simulator, designed for the
movie-booking domain, leverages both rules and collected data. The simulator
supports two tasks: movie ticket booking and movie seeking. Finally, we
demonstrate several agents and detail the procedure to add and test your own
agent in the proposed framework.
Tadeusz Kobus, Maciej Kokociński, Paweł T. Wojciechowski
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
We propose Hybrid Transactional Replication (HTR), a novel replication scheme
for highly dependable services. It combines two schemes: a transaction is
executed either optimistically by only one service replica in the deferred
update mode (DU), or deterministically by all replicas in the state machine
mode (SM); the choice is made by an oracle. The DU mode allows for parallelism
and thus takes advantage of multicore hardware. In contrast to DU, the SM mode
guarantees abort-free execution, so it is suitable for irrevocable operations
and transactions generating high contention. For expressiveness, transactions
can be discarded or retried on demand. We formally prove that the higher
flexibility of the scheme does not come at the cost of weaker guarantees for
clients: HTR satisfies strong consistency guarantees akin to those provided by
other popular transactional replication schemes such as Deferred Update
Replication. We developed HTR-enabled Paxos STM, an object-based distributed
transactional memory system, and evaluated it thoroughly under various
workloads. We show the benefits of using a novel oracle, which relies on
machine learning techniques for automatic adaptation to changing conditions. In
our tests, the ML-based oracle provides up to 50% improvement in throughput
when compared to the system running with DU-only or SM-only oracles. Our
approach is inspired by a well-known algorithm used in the context of the
multi-armed bandit problem.
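As a rough illustration of a bandit-style oracle choosing between the DU and SM modes, here is a minimal epsilon-greedy sketch (the reward signal, e.g. observed throughput, and the epsilon-greedy rule are assumptions for illustration; the paper's ML-based oracle is more sophisticated):

```python
import random

class ModeOracle:
    """Epsilon-greedy two-armed bandit choosing between the DU and SM
    transaction execution modes. A reward (e.g. observed throughput) is
    fed back after each transaction. A sketch, not the paper's oracle.
    """
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {"DU": 0, "SM": 0}
        self.value = {"DU": 0.0, "SM": 0.0}

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(["DU", "SM"])  # explore
        return max(self.value, key=self.value.get)  # exploit

    def feedback(self, mode, reward):
        self.counts[mode] += 1
        n = self.counts[mode]
        # Incremental running mean of the observed rewards per mode.
        self.value[mode] += (reward - self.value[mode]) / n
```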
Fabio Baruffa, Luigi Iapichino, Nicolay J. Hammer, Vasileios Karakasis
Comments: 18 pages, 5 figures, submitted
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Instrumentation and Methods for Astrophysics (astro-ph.IM); Computational Physics (physics.comp-ph)
We describe a strategy for code modernisation of Gadget, a widely used
community code for computational astrophysics. The focus of this work is on
node-level performance optimisation, targeting current multi/many-core Intel
architectures. We identify and isolate a sample code kernel, which is
representative of a typical Smoothed Particle Hydrodynamics (SPH) algorithm.
The code modifications include threading parallelism optimisation, change of
the data layout into Structure of Arrays (SoA), auto-vectorisation and
algorithmic improvements in the particle sorting. We measure lower execution
time and improved threading scalability both on Intel Xeon (2.6x on Ivy
Bridge) and Xeon Phi (13.7x on Knights Corner) systems. First tests on
second generation Xeon Phi (Knights Landing) demonstrate the portability of the
devised optimisation solutions to upcoming architectures.
Afsin Akdogan, Hien To
Comments: Survey paper that covers data processing frameworks for big graph data
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Recently we create so much data (2.5 quintillion bytes every day) that 90% of
the data in the world today has been created in the last two years alone [1].
This data comes from sensors used to gather traffic or climate information,
posts to social media sites, photos, videos, emails, purchase transaction
records, call logs of cellular networks, etc. This data is big data. In this
report, we first briefly discuss the programming models used for big data
processing, then focus on graph data and survey the programming models and
frameworks used to solve graph problems at very large scale. In section 2, we
introduce the programming models which are not
specifically designed to handle graph data but we include them in this survey
because we believe these are important frameworks and/or there have been
studies to customize them for more efficient graph processing. In section 3, we
discuss some techniques that yield up to 1340 times speedup for certain graph
problems when applied to Hadoop. In section 4, we discuss the vertex-based
programming model, which is designed specifically to process large graphs, and
the frameworks adopting it. In section 5, we implement two fundamental graph
algorithms (PageRank and Weighted Bipartite Matching), and run them on a single
node as the baseline approach to see how fast they are for large datasets and
whether it is worth partitioning them.
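For reference, the single-node PageRank baseline mentioned above can be sketched in a few lines of power iteration (the damping factor and iteration count are conventional choices, not the report's exact settings):

```python
def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank on an adjacency dict {node: [out-neighbors]}.
    Dangling nodes (no out-edges) distribute their rank uniformly.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        # Teleportation term plus the redistributed dangling mass.
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new[w] += damping * rank[v] / len(graph[v])
        rank = new
    return rank
```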
Marjorie Bournat (Regal), Swan Dubois (Regal), Franck Petit (Regal)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
We consider systems made of autonomous mobile robots evolving in highly
dynamic discrete environments, i.e., graphs whose edges may appear and
disappear unpredictably, without any recurrence, stability, or periodicity
assumption.
Robots are uniform (they execute the same algorithm), they are anonymous (they
are devoid of any observable ID), they have no means of communicating with
each other, they share no common sense of direction, and they have no
global knowledge related to the size of the environment. However, each of them
is endowed with persistent memory and is able to detect whether it stands alone
at its current location. A highly dynamic environment is modeled by a graph
such that its topology keeps continuously changing over time. In this paper, we
consider only dynamic graphs in which nodes are anonymous, each of them is
infinitely often reachable from any other one, and such that its underlying
graph (i.e., the static graph made of the same set of nodes and that includes
all edges that are present at least once over time) forms a ring of arbitrary
size. In this context, we consider the fundamental problem of perpetual
exploration: each node is required to be infinitely often visited by a robot.
This paper analyzes the computability of this problem in (fully) synchronous
settings, i.e., we study the deterministic solvability of the problem with
respect to the number of robots. We provide three algorithms and two
impossibility results that characterize, for any ring size, the necessary and
sufficient number of robots to perform perpetual exploration of highly dynamic
rings.
Syed Arshad Ali, Mansaf Alam
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Cloud Computing is a paradigm of both parallel processing and distributed
computing. It offers computing facilities as a utility service in a
pay-as-you-use manner. Virtualization, self-service provisioning, elasticity
and pay-per-use are the key features of Cloud Computing. It provides different
types of resources over the Internet to perform user-submitted tasks. In a
cloud environment, huge numbers of tasks are executed simultaneously, so
effective task scheduling is required to achieve better performance of the
cloud system. Various cloud-based task scheduling algorithms are available that
schedule users' tasks to resources for execution. Due to the novelty of Cloud
Computing, traditional scheduling algorithms cannot satisfy the needs of the
cloud, so researchers are trying to modify traditional algorithms to fulfill
cloud requirements like rapid elasticity, resource pooling and on-demand
self-service. In this paper the current state of task scheduling algorithms is
discussed and compared on the basis of various scheduling parameters like
execution time, throughput, makespan, resource utilization, quality of
service, energy consumption, response time and cost.
Afsin Akdogan
Comments: PhD Dissertation – University of Southern California
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
The number of mobile devices (e.g., smartphones, wearable technologies) is
rapidly growing. In line with this trend, a massive amount of spatial data is
being collected since these devices allow users to geo-tag user-generated
content. Clearly, a scalable computing infrastructure is needed to manage such
large datasets. Meanwhile, Cloud Computing service providers (e.g., Amazon,
Google, and Microsoft) allow users to lease computing resources. However, most
of the existing spatial indexing techniques are designed for the centralized
paradigm, which is limited to the capabilities of a single server. To address
the scalability shortcomings of existing approaches, we provide a study that
focuses on generating a distributed spatial index structure that not only
scales out to multiple servers but also scales up, since it fully exploits the
multi-core CPUs available on each server, using the Voronoi diagram as the
partitioning and indexing technique, which we also use to process spatial
queries effectively. More
specifically, since the data objects continuously move and issue position
updates to the index structure, we collect the latest positions of objects and
periodically generate a read-only index to eliminate costly distributed
updates. Our approach scales near-linearly in index construction and query
processing, and can efficiently construct an index for millions of objects
within a few seconds. In addition to scalability and efficiency, we also aim
to maximize server utilization, supporting the same workload with fewer
servers. Server utilization is a crucial point when using Cloud Computing
because users are charged based on the total amount of time they reserve each
server, with no consideration of utilization.
Yihan Sun, Daniel Ferizovic, Guy E. Blelloch
Subjects: Data Structures and Algorithms (cs.DS); Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
In this paper we introduce an interface for supporting ordered maps that are
augmented to support quick “sums” of values over ranges of the keys. We have
implemented this interface as part of a C++ library called PAM (Parallel and
Persistent Augmented Map library). This library supports a wide variety of
functions on maps ranging from basic insertion and deletion to more interesting
functions such as union, intersection, difference, filtering, extracting
ranges, splitting, and range-sums. The functions in the library are parallel,
persistent (meaning that functions do not affect their inputs), and
work-efficient. The underlying data structure is the augmented balanced binary
search tree, which is a binary search tree in which each node is augmented with
a value keeping the “sum” of its subtree with respect to some user supplied
function. With this augmentation the library can be directly applied to many
settings such as to 2D range trees, interval trees, word index searching, and
segment trees. The interface greatly simplifies the implementation of such data
structures while it achieves efficiency that is significantly better than
previous libraries.
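A minimal sketch of the augmentation idea (an unbalanced, non-persistent toy, unlike PAM's balanced, persistent trees): each node caches the combine of its whole subtree, so a range query can consume entire subtrees from the cache instead of visiting every key.

```python
class AugNode:
    """BST node augmented with `aug`, the combine() of all values in its
    subtree, plus the subtree's min/max key for range pruning."""
    def __init__(self, key, value, combine):
        self.key, self.value, self.combine = key, value, combine
        self.left = self.right = None
        self.aug = value
        self.min_key = self.max_key = key

    def _recompute(self):
        self.aug = self.value
        self.min_key = self.max_key = self.key
        if self.left:
            self.aug = self.combine(self.aug, self.left.aug)
            self.min_key = self.left.min_key
        if self.right:
            self.aug = self.combine(self.aug, self.right.aug)
            self.max_key = self.right.max_key

    def insert(self, key, value):
        side = 'left' if key < self.key else 'right'
        child = getattr(self, side)
        if child:
            child.insert(key, value)
        else:
            setattr(self, side, AugNode(key, value, self.combine))
        self._recompute()

    def range_sum(self, lo, hi):
        """Combine the values of all keys in [lo, hi]; subtrees entirely
        inside the range contribute their cached `aug` without descent."""
        if self.max_key < lo or self.min_key > hi:
            return None  # subtree disjoint from the query range
        if lo <= self.min_key and self.max_key <= hi:
            return self.aug  # whole subtree covered: use the cached sum
        parts = [self.value] if lo <= self.key <= hi else []
        for child in (self.left, self.right):
            if child is not None:
                p = child.range_sum(lo, hi)
                if p is not None:
                    parts.append(p)
        if not parts:
            return None
        out = parts[0]
        for p in parts[1:]:
            out = self.combine(out, p)
        return out
```

With `combine` set to addition this gives classic range sums; other user-supplied functions (max, interval counts) yield the range trees and interval trees mentioned above.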
We tested our library and its corresponding applications. Experiments show
that our implementation of set functions achieves speedups of over 50 on 72
cores. As for our range tree implementation, the sequential running time is
more efficient than that of existing libraries such as CGAL, and we achieve
speedups of over 42 on 72 cores.
Nina Narodytska, Shiva Prasad Kasiviswanathan
Comments: 19 Pages
Subjects: Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
Deep neural networks are powerful and popular learning models that achieve
state-of-the-art pattern recognition performance on many computer vision,
speech, and language processing tasks. However, these networks have also been
shown to be susceptible to carefully crafted adversarial perturbations which force
misclassification of the inputs. Adversarial examples enable adversaries to
subvert the expected system behavior leading to undesired consequences and
could pose a security risk when these systems are deployed in the real world.
In this work, we focus on deep convolutional neural networks and demonstrate
that adversaries can easily craft adversarial examples even without any
internal knowledge of the target network. Our attacks treat the network as an
oracle (black-box) and only assume that the output of the network can be
observed on the probed inputs. Our first attack is based on a simple idea of
adding perturbation to a randomly selected single pixel or a small set of them.
We then improve the effectiveness of this attack by carefully constructing a
small set of pixels to perturb by using the idea of greedy local-search. Our
proposed attacks also naturally extend to a stronger notion of
misclassification. Our extensive experimental results illustrate that even
these elementary attacks can reveal a deep neural network’s vulnerabilities.
The simplicity and effectiveness of our proposed schemes mean that they could
serve as a litmus test for designing robust networks.
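The first attack described above, perturbing a randomly selected pixel while treating the network as an oracle, can be sketched as follows (the query budget and the perturbation magnitude are illustrative assumptions):

```python
import random

def random_pixel_attack(image, classify, true_label,
                        perturb=1.0, trials=100, seed=0):
    """Black-box attack sketch: repeatedly perturb one randomly chosen
    pixel of a fresh copy of the image and query the classifier (oracle);
    return the first adversarial image that forces misclassification.

    image: 2D list of floats; classify: function(image) -> label.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    for _ in range(trials):
        candidate = [row[:] for row in image]
        i, j = rng.randrange(h), rng.randrange(w)
        candidate[i][j] += rng.choice([-perturb, perturb])
        if classify(candidate) != true_label:
            return candidate  # misclassification achieved
    return None  # attack failed within the query budget
```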
Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert E. Schapire
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
We study the problem of combining multiple bandit algorithms (that is, online
learning algorithms with partial feedback) with the goal of creating a master
algorithm that performs almost as well as the best base algorithm if it were to
be run on its own. The main challenge is that when run with a master, base
algorithms unavoidably receive much less feedback and it is thus critical that
the master not starve a base algorithm that might perform uncompetitively
initially but would eventually outperform others if given enough feedback. We
address this difficulty by devising a version of Online Mirror Descent with a
special mirror map together with a sophisticated learning rate scheme. We show
that this approach manages to achieve a more delicate balance between
exploiting and exploring base algorithms than previous works, yielding
superior regret bounds.
Our results are applicable to many settings, such as multi-armed bandits,
contextual bandits, and convex bandits. As examples, we present two main
applications. The first is to create an algorithm that enjoys worst-case
robustness while at the same time performing much better when the environment
is relatively easy. The second is to create an algorithm that works
simultaneously under different assumptions of the environment, such as
different priors or different loss structures.
Penghang Yin, Shuai Zhang, Jack Xin, Yingyong Qi
Subjects: Learning (cs.LG)
In this paper, we propose a stochastic proximal gradient method to train
ternary weight neural networks (TNN). The proposed method features weight
ternarization via an exact formula for the proximal operator. Our experiments
show that our trained TNNs are able to preserve state-of-the-art performance on
the MNIST and CIFAR-10 benchmark datasets.
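The abstract does not give the proximal formula; a common threshold-based ternarization onto {-alpha, 0, +alpha} can be sketched as below (the threshold rule and the choice of alpha are assumptions for illustration, not the paper's exact operator):

```python
def ternarize(weights, threshold_ratio=0.5):
    """Project float weights onto the ternary set {-alpha, 0, +alpha}.

    Weights with magnitude below a threshold go to 0; the scale alpha is
    the mean magnitude of the surviving weights. Both choices are
    illustrative assumptions, not the paper's exact proximal formula.
    """
    max_mag = max(abs(w) for w in weights)
    t = threshold_ratio * max_mag
    kept = [abs(w) for w in weights if abs(w) > t]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return [0.0 if abs(w) <= t else (alpha if w > 0 else -alpha)
            for w in weights]
```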
Erik Talvitie
Comments: To appear in Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017. This version incorporates the appendix into document (rather than as supplementary material)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
When an agent cannot represent a perfectly accurate model of its
environment’s dynamics, model-based reinforcement learning (MBRL) can fail
catastrophically. Planning involves composing the predictions of the model;
when flawed predictions are composed, even minor errors can compound and render
the model useless for planning. Hallucinated Replay (Talvitie 2014) trains the
model to “correct” itself when it produces errors, substantially improving MBRL
with flawed models. This paper theoretically analyzes this approach,
illuminates settings in which it is likely to be effective or ineffective, and
presents a novel error bound, showing that a model’s ability to self-correct is
more tightly related to MBRL performance than one-step prediction error. These
results inspire an MBRL algorithm for deterministic MDPs with performance
guarantees that are robust to model class limitations.
Bin Gu, Zhouyuan Huo, Heng Huang
Subjects: Learning (cs.LG)
Non-convex and non-smooth optimization plays an important role in machine
learning. The proximal gradient method is one of the most important methods
for solving non-convex and non-smooth problems, where a proximal operator needs
to be solved exactly at each step. However, in many problems the proximal
operator has no analytic solution, or an exact solution is expensive to obtain.
In this paper, we propose inexact proximal gradient methods (not only
a basic inexact proximal gradient method (IPG), but also a Nesterov’s
accelerated inexact proximal gradient method (AIPG)) for non-convex and
non-smooth optimization, which tolerate an error in the calculation of the
proximal operator. Theoretical analysis shows that IPG and AIPG have the same
convergence rates as in the error-free case, provided that the errors decrease
at appropriate rates.
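A minimal sketch of the basic IPG idea on an l1-regularized scalar problem, where the proximal step is evaluated only up to an error that decays with the iteration counter (the error model and the test problem are illustrative assumptions):

```python
def soft_threshold(x, lam):
    """Exact proximal operator of lam * |x| (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def inexact_proximal_gradient(grad, prox_err, x0,
                              step=0.1, lam=0.05, iters=200):
    """Basic IPG sketch for min f(x) + lam*|x|: a gradient step followed
    by a proximal step evaluated only up to an error prox_err(k), which
    must decrease with the iteration counter k for convergence to hold.
    """
    x = x0
    for k in range(1, iters + 1):
        x = soft_threshold(x - step * grad(x), step * lam)
        x += prox_err(k)  # inexactness in the proximal evaluation
    return x
```

For f(x) = 0.5*(x-1)^2 with lam = 0.05, the exact minimizer is x = 0.95; with an O(1/k^2) error the iterates still converge to it, matching the theory that error-free rates are preserved when errors decay fast enough.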
Jiuyong Li, Lin Liu, Jixue Liu, Ryan Green
Comments: 12 pages
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
It is common for a trained classification model to be applied to operating
data that deviates from the training data because of noise. This paper
demonstrates that an ensemble classifier, Diversified Multiple Trees (DMT), is
more robust in classifying noisy data than other widely used ensemble methods.
DMT is tested on three real-world biological data sets from different
laboratories in comparison with four benchmark ensemble classifiers.
Experimental results show that DMT is significantly more accurate than the
other benchmark ensemble classifiers on noisy test data. We also discuss a
limitation of DMT and its possible variations.
Zeeshan Khawar Malik, Zain U. Hussain, Ziad Kobti, Charlie W. Lees, Newton Howard, Amir Hussain
Subjects: Learning (cs.LG)
Faecal Calprotectin (FC) is a surrogate marker for intestinal inflammation,
termed Inflammatory Bowel Disease (IBD), but not for cancer. In this
retrospective study of 804 patients, an enhanced benchmark predictive model for
analyzing FC is developed, based on a novel state-of-the-art Echo State Network
(ESN), an advanced dynamic recurrent neural network which implements a
biologically plausible architecture, and a supervised learning mechanism. The
proposed machine learning driven predictive model is benchmarked against a
conventional logistic regression model, demonstrating statistically significant
performance improvements.
B. Pavlyshenko
Subjects: Learning (cs.LG)
In this work, we study the use of logistic regression for manufacturing
failure detection. As a data set for the analysis, we used the data from the
Kaggle competition Bosch Production Line Performance. We considered the use of
machine learning, linear and Bayesian models. For the machine learning
approach, we analyzed the XGBoost tree-based classifier to obtain a
high-scoring classification.
Using the generalized linear model for logistic regression makes it possible to
analyze the influence of the factors under study. The Bayesian approach for
logistic regression gives the statistical distribution for the parameters of
the model. It can be useful in the probabilistic analysis, e.g. risk
assessment.
Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
This paper presents a novel yet intuitive approach to unsupervised feature
learning. Inspired by the human visual system, we explore whether low-level
motion-based grouping cues can be used to learn an effective visual
representation. Specifically, we use unsupervised motion-based segmentation on
videos to obtain segments, which we use as ‘pseudo ground truth’ to train a
convolutional network to segment objects from a single frame. Given the
extensive evidence that motion plays a key role in the development of the human
visual system, we hope that this straightforward approach to unsupervised
learning will be more effective than cleverly designed ‘pretext’ tasks studied
in the literature. Indeed, our extensive experiments show that this is the
case. When used for transfer learning on object detection, our representation
significantly outperforms previous unsupervised approaches across multiple
settings, especially when training data for the target task is scarce.
Sam Ganzfried, Farzana Yusuf
Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Learning (cs.LG); Multiagent Systems (cs.MA)
Algorithms for equilibrium computation generally make no attempt to ensure
that the computed strategies are understandable by humans. For instance, the
strategies for the strongest poker agents are represented as massive binary
files. In many situations, we would like to compute strategies that can
actually be implemented by humans, who may have computational limitations and
may only be able to remember a small number of features or components of the
strategies that have been computed. We study poker games where private
information distributions can be arbitrary. We create a large training set of
game instances and solutions, by randomly selecting the private information
probabilities, and present algorithms that learn from the training instances in
order to perform well in games with unseen information distributions. One
approach first clusters the training points into a small number of clusters and
then creates a small decision tree based on the cluster centers. This approach
produces low test error and could be easily implemented by humans since it only
requires memorizing a small number of “if-then” rules.
Clément Gaultier (PANAMA), Saurabh Kataria (PANAMA, IIT Kanpur), Antoine Deleforge (PANAMA)
Comments: International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Feb 2017, Grenoble, France. International Conference on Latent Variable Analysis and Signal Separation
Subjects: Sound (cs.SD); Learning (cs.LG)
This paper introduces a new paradigm for sound source localization, referred
to as virtual acoustic space traveling (VAST), and presents a first dataset
designed for this purpose. Existing sound source localization methods are
either based on an approximate physical model (physics-driven) or on a
specific-purpose calibration set (data-driven). With VAST, the idea is to learn
a mapping from audio features to desired audio properties using a massive
dataset of simulated room impulse responses. This virtual dataset is designed
to be maximally representative of the potential audio scenes that the
considered system may be evolving in, while remaining reasonably compact. We
show that virtually-learned mappings on this dataset generalize to real data,
overcoming some intrinsic limitations of traditional binaural sound
localization methods based on time differences of arrival.
Thomas Laurent, James von Brecht
Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL); Learning (cs.LG)
We introduce an exceptionally simple gated recurrent neural network (RNN)
that achieves performance comparable to well-known gated architectures, such as
LSTMs and GRUs, on the word-level language modeling task. We prove that our
model has simple, predicable and non-chaotic dynamics. This stands in stark
contrast to more standard gated architectures, whose underlying dynamical
systems exhibit chaotic behavior.
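The dynamical distinction can be illustrated with a toy example (not the paper's RNN): a chaotic map amplifies a tiny perturbation of the hidden state exponentially, while a contractive update washes it out, giving predictable dynamics.

```python
# Toy illustration (not the paper's model): chaotic vs. contractive updates.
def chaotic(h):
    return (2.0 * h) % 1.0        # Bernoulli doubling map: doubles any error

def contractive(h):
    return 0.5 * h + 0.1          # contraction: halves any error each step

a, b = 0.3, 0.3 + 1e-9            # two nearly identical initial states
c, d = a, b
for _ in range(25):
    a, b = chaotic(a), chaotic(b)
    c, d = contractive(c), contractive(d)
print(abs(a - b) > 1e-2, abs(c - d) < 1e-12)  # True True
```

After 25 steps the chaotic trajectories have diverged by a visible amount, while the contractive ones agree to machine precision.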
Mihir Mongia, Kundan Kumar, Akram Erraqabi, Yoshua Bengio
Comments: ICASSP 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Recent work in the literature has shown experimentally that one can use the
lower layers of a trained convolutional neural network (CNN) to model natural
textures. More interestingly, it has also been shown experimentally that a
single layer with random filters can model textures, albeit with less
variability. In this paper we ask why one-layer CNNs with random filters are
so effective at generating textures. We theoretically show that one-layer
convolutional architectures (without a non-linearity), paired with an energy
function used in previous literature, can in fact preserve
and modulate frequency coefficients in a manner so that random weights and
pretrained weights will generate the same type of images. Based on the results
of this analysis we question whether similar properties hold in the case where
one uses a single convolution layer with a non-linearity. We show that, in the
case of the ReLU non-linearity, there are situations where only one input
attains the minimum possible energy, whereas without the non-linearity there
are always infinitely many solutions attaining the minimum possible energy.
Thus, in certain situations, adding a ReLU non-linearity generates less
variable images.
Kavosh Asadi, Jason D. Williams
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Representing a dialog policy as a recurrent neural network (RNN) is
attractive because it handles partial observability, infers a latent
representation of state, and can be optimized with supervised learning (SL) or
reinforcement learning (RL). For RL, a policy gradient approach is natural, but
is sample inefficient. In this paper, we present 3 methods for reducing the
number of dialogs required to optimize an RNN-based dialog policy with RL. The
key idea is to maintain a second RNN which predicts the value of the current
policy, and to apply experience replay to both networks. On two tasks, these
methods reduce the number of dialogs/episodes required by about a third
compared to standard policy gradient methods.
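The variance-reduction idea behind maintaining a second value-predicting network can be sketched as follows (an illustrative toy, not the paper's networks): subtract the value estimate from the Monte-Carlo return before weighting the policy gradient.

```python
import numpy as np

# Hypothetical sketch: the second network's value predictions act as a
# baseline; policy-gradient weights become advantages (return - value),
# which lowers gradient variance without changing its expectation.
returns = np.array([1.0, 0.0, 2.0, 1.0])   # Monte-Carlo returns per episode
values  = np.array([0.8, 0.2, 1.5, 1.1])   # predictions from a value network
advantages = returns - values
# a policy-gradient step would weight log-prob gradients by these advantages
print(advantages)
```

With experience replay, both the policy and the value network would be updated from stored transitions rather than only the freshest dialog.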
Wentao Zhu, Xiaohui Xie
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Mass segmentation is an important task in mammogram analysis, providing
effective morphological features and regions of interest (ROI) for mass
detection and classification. Inspired by the success of using deep
convolutional features for natural image analysis and conditional random fields
(CRF) for structural learning, we propose an end-to-end network for
mammographic mass segmentation. The network employs a fully convolutional
network (FCN) to model the potential function, followed by a CRF to perform
structural learning. Because the mass distribution varies greatly with pixel
position, the FCN is combined with a position prior for this task. Due to the
small size of mammogram datasets, we use adversarial training to control
over-fitting. Four models with different convolutional kernels are further
fused to improve the segmentation results. Experimental results on two public
datasets, INbreast and DDSM-BCRP, show that our end-to-end network combined
with adversarial training achieves state-of-the-art results.
Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Mammogram classification is directly related to computer-aided diagnosis of
breast cancer. Traditional methods require great effort to annotate the
training data through costly manual labeling, as well as specialized
computational models to detect these annotations at test time. Inspired by the success of using deep
convolutional features for natural image analysis and multi-instance learning
for labeling a set of instances/patches, we propose end-to-end trained deep
multi-instance networks for mass classification based on whole mammogram
without the aforementioned costly need to annotate the training data. We
explore three different schemes to construct deep multi-instance networks for
whole mammogram classification. Experimental results on the INbreast dataset
demonstrate the robustness of proposed deep networks compared to previous work
using segmentation and detection annotations in the training.
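The multi-instance principle the abstract relies on can be sketched in one line (an illustrative toy, not the paper's networks): the whole-image label is driven by the most suspicious patch.

```python
import numpy as np

# Hypothetical sketch of multi-instance pooling: each patch of the
# mammogram gets a malignancy score; max pooling over patches yields the
# image-level score, so no patch-level annotations are needed.
patch_scores = np.array([0.05, 0.10, 0.92, 0.20])  # one score per patch
image_score = patch_scores.max()                   # instance-level max pooling
print(image_score >= 0.5)                          # image flagged positive
```

Training end-to-end then only requires image-level labels, since the gradient flows through the pooled maximum back to the responsible patch.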
Shervin Ardeshir, Krishna Regmi, Ali Borji
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Mirror neurons have been observed in the primary motor cortex of primate
species, in particular in humans and monkeys. A mirror neuron fires when a
person performs a certain action, and also when the person observes the same action
being performed by another person. A crucial step towards building fully
autonomous intelligent systems with human-like learning abilities is the
capability in modeling the mirror neuron. On one hand, the abundance of
egocentric cameras in the past few years has offered the opportunity to study a
lot of vision problems from the first-person perspective. A great deal of
interesting research has been done during the past few years, trying to explore
various computer vision tasks from the perspective of the self. On the other
hand, videos recorded by traditional static cameras, capture humans performing
different actions from an exocentric third-person perspective. In this work, we
take the first step towards relating motion information across these two
perspectives. We train models that predict motion in an egocentric view, by
observing it from an exocentric view, and vice versa. This allows models to
predict how an egocentric motion would look from the outside. To do so, we
train linear and nonlinear models and evaluate their performance in terms of
retrieving the egocentric (exocentric) motion features, while having access to
an exocentric (egocentric) motion feature. Our experimental results demonstrate
that motion information can be successfully transferred across the two views.
Sajad Mousavi, Ali Borji, Nasser Mozayani
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Bottom-Up (BU) saliency models do not perform well in complex interactive
environments where humans are actively engaged in tasks (e.g., making a
sandwich or playing a video game). In this paper, we leverage Reinforcement Learning
(RL) to highlight task-relevant locations of input frames. We propose a soft
attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL
agent how to play a game and where to look by focusing on the most pertinent
parts of its visual input. Our evaluations on several Atari 2600 games show
that the soft attention based model could predict fixation locations
significantly better than bottom-up models such as the Itti-Koch saliency and
Graph-Based Visual Saliency (GBVS) models.
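The soft-attention step alone can be sketched in NumPy (names and shapes are illustrative assumptions, not the paper's architecture): score each spatial location, softmax the scores, and form an attention-weighted glimpse vector.

```python
import numpy as np

# Hypothetical sketch of soft spatial attention over a CNN feature map.
rng = np.random.default_rng(4)
feats = rng.normal(size=(49, 64))       # 7x7 spatial grid, 64-dim features
w = rng.normal(size=64)                 # scoring vector (learned in practice)
scores = feats @ w                      # one relevance score per location
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                    # softmax attention weights
glimpse = alpha @ feats                 # weighted combination of locations
print(alpha.sum(), glimpse.shape)
```

In the full model, the glimpse would feed the DQN's Q-value head, and the attention weights themselves serve as the predicted fixation map.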
Snehasis Banerjee, Tanushyam Chattopadhyay, Swagata Biswas, Rohan Banerjee, Anirban Dutta Choudhury, Arpan Pal
Comments: 4 pages, Machine Learning for Health Workshop, NIPS 2016
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
In this paper, a Wide Learning architecture is proposed that attempts to
automate the feature engineering portion of the machine learning (ML) pipeline.
Feature engineering is widely considered as the most time consuming and expert
knowledge demanding portion of any ML task. The proposed feature recommendation
approach is tested on 3 healthcare datasets: a) PhysioNet Challenge 2016
dataset of phonocardiogram (PCG) signals, b) MIMIC II blood pressure
classification dataset of photoplethysmogram (PPG) signals and c) an emotion
classification dataset of PPG signals. While the proposed method beats the
state-of-the-art techniques on the 2nd and 3rd datasets, it reaches 94.38% of the
accuracy level of the winner of the PhysioNet Challenge 2016. In all cases, the
effort to reach a satisfactory performance was drastically less (a few days)
than manual feature engineering.
Mirko Polato, Fabio Aiolli
Comments: Under revision for Neurocomputing (Elsevier Journal)
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Learning (cs.LG)
The increasing availability of implicit feedback datasets has raised the
interest in developing effective collaborative filtering techniques able to
deal asymmetrically with unambiguous positive feedback and ambiguous negative
feedback. In this paper, we propose a principled kernel-based collaborative
filtering method for top-N item recommendation with implicit feedback. We
present an efficient implementation using the linear kernel, and we show how to
generalize it to kernels of the dot product family preserving the efficiency.
We also investigate the elements that influence the sparsity of a standard
cosine kernel. This analysis shows that the sparsity of the kernel strongly
depends on the properties of the dataset, in particular on its long-tail
distribution. We compare our method with state-of-the-art algorithms, achieving
good results both in terms of efficiency and effectiveness.
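A minimal sketch of linear-kernel scoring for implicit feedback (an assumed toy setup, not the paper's method): with binary interactions, the linear kernel between items is just the dot product of their interaction vectors.

```python
import numpy as np

# Hypothetical sketch: top-N scoring from binary implicit feedback with a
# linear kernel, i.e. plain dot products between item interaction columns.
R = np.array([[1, 1, 0, 0],     # user-item matrix (1 = observed interaction)
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)
K = R.T @ R                     # linear kernel between items
np.fill_diagonal(K, 0.0)        # ignore self-similarity
scores = R @ K                  # each user's score for every item
user = 0
ranked = np.argsort(-scores[user])
print(ranked)                   # top item for user 0 is item 2
```

Kernels of the dot-product family would replace K with an elementwise function of these dot products, which is what makes the efficient generalization possible.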
Jacob S. Hunter (1), Nathan O. Hodas (1) ((1) Pacific Northwest National Laboratory)
Subjects: Optimization and Control (math.OC); Learning (cs.LG); Machine Learning (stat.ML)
Deep nonlinear models pose a challenge for fitting parameters due to lack of
knowledge of the hidden layer and the potentially non-affine relation of the
initial and observed layers. In the present work we investigate the use of
information theoretic measures such as mutual information and Kullback-Leibler
(KL) divergence as objective functions for fitting such models without
knowledge of the hidden layer. We investigate one model as a proof of concept
and one application to cognitive performance. We further investigate the use of
optimizers with these methods. Mutual information is largely successful as an
objective, depending on the parameters. KL divergence is found to be similarly
successful, given some knowledge of the statistics of the hidden layer.
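As background for using KL divergence as a fitting objective, the univariate Gaussian case has a convenient closed form (a standard identity, not the paper's models):

```python
import numpy as np

# Closed-form KL divergence KL(N(mu1, s1^2) || N(mu2, s2^2)); minimizing it
# over (mu2, s2) is a simple instance of a divergence-based fitting objective.
def kl_gauss(mu1, s1, mu2, s2):
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

print(kl_gauss(0.0, 1.0, 0.0, 1.0))   # identical distributions -> 0.0
print(kl_gauss(0.0, 1.0, 1.0, 1.0))   # shifted mean -> 0.5
```

The divergence vanishes only when the two distributions coincide, which is what makes it usable as a loss.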
Daniel Crawford, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, Pooya Ronagh
Comments: 16 pages
Subjects: Quantum Physics (quant-ph); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)
We investigate whether quantum annealers with select chip layouts can
outperform classical computers in reinforcement learning tasks. We associate a
transverse field Ising spin Hamiltonian with a layout of qubits similar to that
of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to
numerically simulate quantum sampling from this system. We design a
reinforcement learning algorithm in which the set of visible nodes representing
the states and actions of an optimal policy are the first and last layers of
the deep network. In the absence of a transverse field, our simulations show that
DBMs train more effectively than restricted Boltzmann machines (RBM) with the
same number of weights. Since sampling from Boltzmann distributions of a DBM is
not classically feasible, this is evidence of supremacy of a non-Turing
sampling oracle. We then develop a framework for training the network as a
quantum Boltzmann machine (QBM) in the presence of a significant transverse
field for reinforcement learning. The latter method further improves the
reinforcement learning method using DBMs.
Mohammed E. Eltayeb, Tareq Y. Al-Naffouri, Robert W. Heath Jr
Subjects: Information Theory (cs.IT)
The radiation pattern of an antenna array depends on the excitation weights
and the geometry of the array. Due to wind and atmospheric conditions, outdoor
millimeter wave antenna elements are subject to full or partial blockages from
a plethora of particles like dirt, salt, ice, and water droplets. Handheld
devices are also subject to blockages from random finger placement and/or
finger prints. These blockages cause absorption and scattering to the signal
incident on the array, and change the array geometry. This distorts the
far-field radiation pattern of the array leading to an increase in the sidelobe
level and decrease in gain. This paper studies the effects of blockages on the
far-field radiation pattern of linear arrays and proposes two array diagnosis
techniques for millimeter wave antenna arrays. The proposed techniques jointly
estimate the locations of the blocked antennas and the induced attenuation and
phase shifts. Numerical results show that the proposed techniques provide
satisfactory results in terms of fault detection with reduced number of
measurements (diagnosis time) provided that the number of blockages is small
compared to the array size.
Armin Eftekhari, Ping Li, Michael B. Wakin, Rachel A. Ward
Subjects: Information Theory (cs.IT)
Consider an open set (\mathbb{D} \subseteq \mathbb{R}^n), equipped with a
probability measure (\mu). An important characteristic of a smooth function
(f: \mathbb{D} \rightarrow \mathbb{R}) is its differential correlation matrix
(\Sigma_{\mu} := \int \nabla f(x) (\nabla f(x))^* \mu(dx) \in \mathbb{R}^{n \times n}),
where (\nabla f(x) \in \mathbb{R}^n) is the gradient of (f(\cdot)) at
(x \in \mathbb{D}). For instance, the span of the leading (r) eigenvectors of
(\Sigma_{\mu}) forms an active subspace of (f(\cdot)), thereby extending the
concept of principal component analysis to the problem of ridge approximation.
In this work, we propose a simple algorithm for estimating (\Sigma_{\mu}) from
point values of (f(\cdot)), without imposing any structural assumptions on
(f(\cdot)). Theoretical guarantees for this algorithm are provided with the
aid of the same technical tools that have proved valuable in the context of
covariance matrix estimation from partial measurements.
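The active-subspace idea can be demonstrated with a Monte Carlo estimate of the correlation matrix (an assumed toy setup with exact gradients, not the paper's point-value algorithm): for a ridge function, the leading eigenvector recovers the ridge direction.

```python
import numpy as np

# Hypothetical sketch: estimate Sigma_mu = E[grad f (grad f)^T] by sampling,
# then extract the active subspace from its leading eigenvector.
rng = np.random.default_rng(1)
n = 5
a = np.array([3.0, 1.0, 0.0, 0.0, 0.0])     # f(x) = sin(a.x): a ridge function

def grad_f(x):
    return np.cos(a @ x) * a                # exact gradient of f

X = rng.normal(size=(2000, n))              # samples from mu = standard Gaussian
G = np.array([grad_f(x) for x in X])
Sigma = G.T @ G / len(X)                    # Monte Carlo estimate of Sigma_mu
w, V = np.linalg.eigh(Sigma)
top = V[:, -1]                              # leading eigenvector
print(abs(top @ a) / np.linalg.norm(a))     # alignment with ridge direction: 1.0
```

Since every gradient here is parallel to a, the estimated matrix is rank one and the active subspace is exactly the span of a.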
Venkatesan Guruswami, Ray Li
Subjects: Information Theory (cs.IT); Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS)
We prove the existence of binary codes of positive rate that can correct an
arbitrary pattern of a (p) fraction of deletions, for any (p < 1), when the bit
positions to delete are picked obliviously of the codeword. Formally, we prove
that for every (p < 1), there exists (\mathcal{R} > 0) and a code family with a
randomized encoder (\mathsf{Enc}: \{0,1\}^{\mathcal{R} n} \to \{0,1\}^n) and
deterministic decoder (\mathsf{Dec}: \{0,1\}^{(1-p)n} \to \{0,1\}^{\mathcal{R} n})
such that for all deletion patterns (\tau) with (pn) deletions and all messages
(m \in \{0,1\}^{\mathcal{R} n}), (\Pr[\mathsf{Dec}(\tau(\mathsf{Enc}(m))) \neq m] \le o(1)),
where the probability is over the randomness of the encoder (which is private
to the encoder and not known to the decoder). The oblivious model is a
significant generalization of the random deletion channel, where each bit is
deleted independently with probability (p). For the random deletion channel,
the existence of codes of rate ((1-p)/9), and thus bounded away from (0) for
any (p < 1), was already known. We give an explicit construction with
polynomial-time encoding and deletion correction algorithms with rate
(c_0 (1-p)) for an absolute constant (c_0 > 0).
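The contrast between the two deletion models can be made concrete with a small simulation (illustrative only; the coding construction itself is not sketched here):

```python
import numpy as np

# Toy contrast: the random deletion channel drops each bit independently
# with probability p, while an oblivious adversary fixes the deleted
# positions in advance, without seeing the codeword.
rng = np.random.default_rng(5)
p, n = 0.5, 20
word = rng.integers(0, 2, size=n)

# random deletion channel: a random number of bits survive
keep = rng.random(n) >= p
random_out = word[keep]

# oblivious pattern: exactly p*n positions, chosen independently of `word`
pattern = np.sort(rng.choice(n, size=int(p * n), replace=False))
oblivious_out = np.delete(word, pattern)
print(len(oblivious_out))   # exactly (1 - p) * n = 10 bits survive
```

The oblivious adversary can concentrate deletions on any fixed positions, which is why codes for this model must use encoder randomness hidden from the adversary.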
Jiaheng Wang, Wei Guan, Yongming Huang, Robert Schober, Xiaohu You
Comments: Accepted by IEEE Journal on Selected Areas in Communications
Subjects: Information Theory (cs.IT)
Deployment of small cell base stations (SBSs) overlaying the coverage area of
a macrocell BS (MBS) results in a two-tier hierarchical small cell network.
Cross-tier and inter-tier interference not only jeopardize primary macrocell
communication but also limit the spectral efficiency of small cell
communication. This paper focuses on distributed interference management for
downlink small cell networks. We address the optimization of transmit
strategies from both the game theoretical and the network utility maximization
(NUM) perspectives and show that they can be unified in a generalized Nash
equilibrium problem (GNEP) framework. Specifically, the small cell network
design is first formulated as a GNEP, where the SBSs and MBS compete for the
spectral resources by maximizing their own rates while satisfying global
quality of service (QoS) constraints. We analyze the GNEP via variational
inequality theory and propose distributed algorithms, which only require the
broadcasting of some pricing information, to achieve a generalized Nash
equilibrium (GNE). Then, we also consider a nonconvex NUM problem that aims to
maximize the sum rate of all BSs subject to global QoS constraints. We
establish the connection between the NUM problem and a penalized GNEP and show
that its stationary solution can be obtained via a fixed point iteration of the
GNE. We propose GNEP-based distributed algorithms that achieve a stationary
solution of the NUM problem at the expense of additional signaling overhead and
complexity. The convergence of the proposed algorithms is proved and guaranteed
for properly chosen algorithm parameters. The proposed GNEP framework can scale
from a QoS constrained game to a NUM design for small cell networks by trading
off signaling overhead and complexity.
Yahya Sattar, Zubair Khalid, Rodney A. Kennedy
Comments: 5 Pages, 1 Figure, Submitted for review in IEEE Signal Processing Letters
Subjects: Information Theory (cs.IT)
We develop a method for the accurate reconstruction of non-bandlimited finite
rate of innovation signals on the sphere. For signals consisting of a finite
number of Dirac functions on the sphere, we develop an annihilating filter
based method for the accurate recovery of parameters of the Dirac functions
using a finite number of observations of the bandlimited signal. In comparison
to existing techniques, the proposed method enables more accurate
reconstruction primarily due to better conditioning of systems involved in the
recovery of parameters. For the recovery of (K) Diracs on the sphere, the
proposed method requires samples of the signal bandlimited in the spherical
harmonic (SH) domain at SH degree equal to or greater than
(K + \sqrt{K + \frac{1}{4}} - \frac{1}{2}). In comparison to the existing
state-of-the-art technique, the required bandlimit, and consequently the
number of samples, of the proposed method is the same or less. We also conduct
numerical experiments to demonstrate that the proposed technique is more
accurate than the existing methods by a factor of (10^{7}) or more for
(2 \le K \le 20).
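The spherical setting is involved, but the annihilating-filter principle the method builds on can be shown in its classical 1-D form (a standard textbook construction, not the paper's spherical algorithm): recover (K) Dirac locations from (2K) Fourier samples.

```python
import numpy as np

# Classical 1-D annihilating filter: F[m] = sum_k c_k u_k^m with
# u_k = exp(-2j*pi*t_k). A filter h with roots u_k annihilates F, and its
# null-space solution reveals the Dirac locations t_k.
t = np.array([0.2, 0.7])            # unknown Dirac locations in [0, 1)
c = np.array([1.0, 2.0])            # amplitudes
K = len(t)
m = np.arange(2 * K)                # 2K Fourier samples suffice
u = np.exp(-2j * np.pi * t)
F = (c * u**m[:, None]).sum(axis=1)

# Toeplitz system T h = 0 for the length-(K+1) annihilating filter
T = np.array([[F[i + K - j] for j in range(K + 1)] for i in range(K)])
_, _, Vh = np.linalg.svd(T)
h = Vh[-1].conj()                   # null vector of T
roots = np.roots(h)                 # roots are exactly the u_k
t_rec = np.sort(np.mod(-np.angle(roots) / (2 * np.pi), 1))
print(t_rec)                        # ~ [0.2, 0.7]
```

The conditioning of this Toeplitz system is what the abstract's accuracy claim is about: better-conditioned systems give far more accurate parameter recovery.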
MinKeun Chung, Min Soo Sim, Dong Ku Kim, Chan-Byoung Chae
Comments: Submitted to IEEE Journal on Selected Areas in Communications
Subjects: Information Theory (cs.IT)
This paper considers the implementation and application possibilities of a
compact full duplex multiple-input multiple-output (MIMO) architecture for
users in a network where direct communication exists between users, e.g.,
device-to-device (D2D) and cellular link coexist in the same spectrum. For the
architecture of the compact full duplex radio, we combine a dual-polarization,
high cross-polarization discrimination (XPD)-based analog self-interference
canceler with a Long Term Evolution (LTE)-based per-subcarrier
digital self-interference canceler. While we consider the compactness and power
efficiency of an analog solution, we focus on the digital canceler design with
robustness to a frequency-selective channel and high compatibility with a
conventional LTE system. For an over-the-air wireless experiment through a full
duplex link prototype with a two-user pair, we implement a full duplex MIMO
physical layer (PHY), supporting a 20 MHz bandwidth, on an FPGA-based
software-defined radio platform. Further, we propose a novel timing
synchronization strategy to construct a more viable full duplex MIMO link. By
having the full duplex link prototype fully operating in real-time, we present
the first characterization of the proposed compact full duplex MIMO performance
depending on the transmit power of the full duplex node, and the link quality
between nodes. One of the crucial insights of this work is that a full duplex
user can acquire a throughput gain if it has self-interference cancellation
capability with guaranteed performance.
Frederic Gabry, Valerio Bioglio, Ingmar Land, Jean-Claude Belfiore
Comments: submitted to Workshop on Channel Coding for 5G and Future Networks – IEEE ICC 2017
Subjects: Information Theory (cs.IT)
We propose a generalized construction for binary polar codes based on mixing
multiple kernels of different sizes, in order to construct polar codes of block
lengths that are not only powers of integers. This results in a multi-kernel
polar code with very good performance, while the encoding complexity remains low
and the decoding follows the same general structure as for the original Arikan
polar codes. The construction provides numerous practical advantages, as more
code lengths can be achieved without puncturing or shortening. We observe
numerically that the error-rate performance of our construction outperforms
state-of-the-art constructions using puncturing methods.
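The gain in length flexibility is easy to quantify (a hedged illustration of the motivation, not the code construction itself): mixing kernels of sizes 2 and 3, for example, yields every block length of the form 2^a * 3^b rather than only powers of a single kernel size.

```python
# Block lengths reachable by mixing size-2 and size-3 kernels (assumed
# example kernel sizes), up to 100 -- far denser than powers of two alone.
lengths = sorted({2**a * 3**b
                  for a in range(7) for b in range(5)
                  if 2**a * 3**b <= 100})
print(lengths)
```

Powers of two alone give only {2, 4, 8, 16, 32, 64} in this range; the mixed family adds lengths like 6, 12, 24, 48 and 96 without puncturing or shortening.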
D. Ciuonzo, P. Salvo Rossi
Comments: Accepted in Elsevier Information Fusion, Special Issue on Event-Based Distributed Information Fusion Over Sensor Networks (invited paper)
Subjects: Information Theory (cs.IT)
In this paper we tackle distributed detection of a non-cooperative target
with a Wireless Sensor Network (WSN). When the target is present, sensors
observe an unknown random signal with amplitude attenuation depending on the
distance between the sensor and the target (unknown) positions, embedded in
white Gaussian noise. The Fusion Center (FC) receives sensors decisions through
error-prone Binary Symmetric Channels (BSCs) and is in charge of performing a
(potentially) more accurate global decision. The resulting problem is a
one-sided test with nuisance parameters present only under the
target-present hypothesis. We first focus on fusion rules based on the Generalized
Likelihood Ratio Test (GLRT), Bayesian and hybrid approaches. Then, aimed at
reducing the computational complexity, we develop fusion rules based on
generalizations of the well-known Locally-Optimum Detection (LOD) framework.
Finally, all the proposed rules are compared in terms of performance and
complexity.
Marjan Kashani, Ali Khaleghi
Subjects: Information Theory (cs.IT)
The Rotman lens is a true-time-delay beamforming network with a parallel-plate
structure. Owing to its frequency independence and wide-band operation, it has
found wide application as a feeding element for linear array antennas. However,
the lens suffers from high intrinsic power loss, and the problem worsens in
microstrip structures due to their inevitable dielectric loss. Waveguides, on
the other hand, exhibit far lower loss and also have higher power capacity.
Thus, waveguides can be advantageously utilized in the Rotman lens structure. Here,
a waveguide Rotman lens for the frequency range of 8 to 12 GHz is designed and
simulated in CST Microwave Studio. Phase error and power efficiency of the
proposed structure are satisfactory.
Sarah S. Bawazir, Paschalis C. Sofotasios, Sami Muhaidat, Yousof Al-Hammadi, George K. Karagiannidis
Subjects: Information Theory (cs.IT)
The ever-increasing demand for mobile Internet and multimedia services poses
unique and significant challenges for current and future generation wireless
networks. These challenges mainly relate to the support of massive
ubiquitous connectivity, low latency, and highly efficient utilization of
spectrum resources. Therefore, it is vital to consider them extensively prior
to the design and deployment of future wireless networks. To this end, the present
article provides a comprehensive overview of a particularly promising and
effective wireless technology, namely, visible light communication (VLC). In
this context, we initially provide a thorough overview of frequency domain
multiple access techniques for single and multi-carrier systems, which is then
followed by an in-depth discussion of the technical considerations of optical
code division multiple access techniques and their adoption in indoor VLC
applications. Furthermore, we address space division multiple access and,
finally, we revisit and analyze the distinct characteristics of a new
technology, namely, non-orthogonal multiple access, which has been proposed
recently as a particularly effective solution.
Ahmed El Shafie, Naofal Al-Dhahir
Comments: Published in IEEE Wireless Communications Letters in October 2015, this http URL
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
We consider a relay-assisted wireless network, where the energy-harvesting
buffer-aided relay node is powered by radio-frequency signals from a source
node wishing to communicate with its destination. We propose two secure
cooperative protocols for a network composed of a source node equipped with a
data buffer communicating with its destination in the presence of a
buffer-aided relay node and an eavesdropper. Our proposed protocols are
designed based on the channel state information and the buffer state
information at the source and relay nodes. The burstiness of data at the source
node queue (buffer) and the energy recycling process at the relay are taken
into account in our analysis. In addition, we take the decoding and signal
processing circuit power consumption constraints into consideration. For the
limiting case of an infinite-size battery at the relay, we derive a sufficient
condition for the energy queue to saturate. Our numerical results demonstrate
the throughput gains of our proposed protocols.
M. Alaee, A. Aubry, A. De Maio, M. M. Naghsh, M. Modarres-Hashemi
Subjects: Information Theory (cs.IT)
This paper is focused on the design of phase sequences with good (aperiodic)
autocorrelation properties in terms of Peak Sidelobe Level (PSL) and Integrated
Sidelobe Level (ISL). The problem is formulated as a bi-objective Pareto
optimization forcing either a continuous or a discrete phase constraint at the
design stage. An iterative procedure based on the coordinate descent method is
introduced to deal with the resulting optimization problems which are
non-convex and NP-hard in general. Each iteration of the devised method
requires the solution of a non-convex min-max problem. It is handled either
through a novel bisection or an FFT-based method for the continuous and the
discrete phase constraint, respectively. Additionally, a heuristic approach to
initialize the procedures, employing the (l_p)-norm minimization technique, is
proposed. Simulation results illustrate that the proposed methodologies can
outperform some counterparts providing sequences with good autocorrelation
features especially in the discrete phase/binary case.
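The two design metrics are straightforward to compute for any candidate sequence (a standard definition check, not the paper's optimization procedure); the length-13 Barker code is the classic low-PSL binary example.

```python
import numpy as np

# Aperiodic autocorrelation sidelobes, PSL and ISL, of the Barker-13 code.
x = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=float)
r = np.correlate(x, x, mode="full")   # full aperiodic autocorrelation
side = r[len(x):]                     # positive-lag sidelobes (peak excluded)
psl = np.abs(side).max()              # Peak Sidelobe Level
isl = 2 * (side**2).sum()             # Integrated Sidelobe Level (both sides)
print(psl, isl)                       # Barker-13: PSL = 1, ISL = 12
```

Coordinate-descent designs of the kind the abstract describes iterate phase-by-phase updates to push exactly these two quantities down.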
Jiantao Jiao, Yanjun Han, Tsachy Weissman
Subjects: Information Theory (cs.IT)
We propose a framework to analyze and quantify the bias in adaptive data
analysis. It generalizes that proposed by Russo and Zou’15, applying to all
measurements whose moment generating function exists, and to all measurements
with a finite (p)-norm. We introduce a new class of dependence measures which
retain key properties of mutual information while more effectively quantifying
the exploration bias for heavy tailed distributions. We provide examples of
cases where our bounds are nearly tight in situations where the original
framework of Russo and Zou’15 does not apply.
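The bias being quantified can be seen in a two-line toy (illustrative only, not the paper's bounds): adaptively selecting the largest of several noisy zero-mean measurements produces a positively biased estimate.

```python
import numpy as np

# Exploration bias in miniature: each trial measures 20 quantities whose
# true means are all 0, then reports the largest -- the report is biased up.
rng = np.random.default_rng(2)
trials = rng.normal(size=(10000, 20))   # 20 noisy measurements per trial
bias = trials.max(axis=1).mean()        # average of the selected maximum
print(round(bias, 2))                   # noticeably above the true mean of 0
```

For heavy-tailed measurement noise this bias is larger still, which is the regime where the proposed dependence measures improve on mutual-information-based bounds.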
Mohamed Gaafar, Osama Amin, Rafael F. Schaefer, Mohamed-Slim Alouini
Comments: Submitted to 2017 ICC Workshops
Subjects: Information Theory (cs.IT)
Inter-relay interference (IRI) challenges the operation of two-path relaying
systems. Furthermore, the unavailability of channel state information (CSI)
at the source and the limited detection capabilities at the relays make it
possible neither to eliminate the interference nor to adopt joint detection at
the relay nodes. Improper signaling is a powerful signaling scheme that reduces the
interference impact at the receiver side and improves the achievable rate
performance. Therefore, improper signaling is adopted at both relays, which
have access to the global CSI of the proposed system. Then, the improper signal
characteristics are designed to maximize the total end-to-end achievable rate
at the relays. To this end, both the power and the circularity coefficient, a
measure of the impropriety degree of the signal, are optimized at the relays.
Although the optimization problem is not convex, optimal power allocation for
both relays for a fixed circularity coefficient is obtained. Moreover, the
circularity coefficient is tuned to maximize the rate for a given power
allocation. Finally, a joint solution of the optimization problem is proposed
using a coordinate descent method based on alternate optimization. The
simulation results show that employing improper signaling improves the
achievable rate at medium and high IRI.
Chang Lv
Comments: 16 pages, git20160928a/43dd4da
Subjects: Information Theory (cs.IT); Number Theory (math.NT)
We obtain new non-existence results for perfect p-ary sequences with period (n)
(called type ([p, n])). The first case is a class with type
([p \equiv 5 \pmod 8, p^a q n']). The second case contains five types
([p \equiv 3 \pmod 4, p^a q^l n']) for certain ((p, q)) and (l). Moreover, we
also have similar non-existence results for perfect almost p-ary sequences.
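For context on the defining property (a standard example, not part of the paper's non-existence arguments): a sequence is perfect when all out-of-phase periodic autocorrelations vanish, as the classical odd-length Zadoff-Chu sequence demonstrates for p = 7.

```python
import numpy as np

# Zadoff-Chu sequence of odd length N = 7: entries are 7th roots of unity
# (since k(k+1) is even), and its periodic autocorrelation is perfect.
N, u = 7, 1
k = np.arange(N)
x = np.exp(-1j * np.pi * u * k * (k + 1) / N)
r = np.array([np.vdot(x, np.roll(x, s)) for s in range(N)])
print(np.allclose(np.abs(r[1:]), 0))   # all out-of-phase correlations vanish
```

The non-existence results in the abstract say that for the stated period types no such perfect p-ary sequence can exist.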
Mihailo Stojnic
Subjects: Optimization and Control (math.OC); Information Theory (cs.IT); Probability (math.PR)
In our companion work [Stojnicl1RegPosasymldp] we revisited random
under-determined linear systems with sparse solutions. The main emphasis was on
the performance analysis of the (\ell_1) heuristic in the so-called asymptotic
regime, i.e. in the regime where the systems' dimensions are large. Through an
earlier sequence of works [DonohoPol, DonohoUnsigned, StojnicCSetam09,
StojnicUpper10], it is now well known that in such a regime the (\ell_1)
heuristic exhibits the so-called phase transition (PT) phenomenon.
[Stojnicl1RegPosasymldp] then went much further and established the so-called
large deviations principle (LDP) type of behavior that characterizes not only
the breaking points of the (\ell_1)'s success but also the behavior in the
entire so-called transition zone around these points. Both of these concepts,
the PTs and the LDPs, are in fact defined so that one can use them to
characterize the asymptotic behavior. In this paper we complement the results
of [Stojnicl1RegPosasymldp] by providing an exact detailed analysis in the
non-asymptotic regime. Of course, not only do the non-asymptotic results
complement those from [Stojnicl1RegPosasymldp], they actually are the ones
that ultimately fully characterize the (\ell_1)'s behavior in the most general
sense. We introduce several novel high-dimensional-geometry type strategies
that enable us to eventually determine the (\ell_1)'s behavior.
Yan Shuo Tan
Subjects: Probability (math.PR); Information Theory (cs.IT)
For any Borel probability measure on (mathbb{R}^n), we may define a family
of eccentricity tensors. This new notion, together with a tensorization trick,
allows us to prove an energy minimization property for rotationally invariant
probability measures. We use this theory to give a new proof of the Welch
bounds, and to improve upon them for collections of real vectors. In addition,
we are able to give elementary proofs for two theorems characterizing
probability measures optimizing one-parameter families of energy integrals on
the sphere. We are also able to explain why a phase transition occurs for
optimizers of these two families.
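The Welch bound the abstract refers to can be checked numerically (a standard result used as background, not the paper's tensor proof): for m unit vectors in dimension n, the maximal cross-correlation is at least sqrt((m - n)/(n(m - 1))), and an equiangular tight frame meets it.

```python
import numpy as np

# Three unit vectors in R^2 at 120-degree spacing (the "Mercedes-Benz"
# frame) meet the Welch bound with equality.
n, m = 2, 3
angles = 2 * np.pi * np.arange(m) / m
V = np.stack([np.cos(angles), np.sin(angles)])   # columns are unit vectors
G = np.abs(V.T @ V)                              # absolute Gram matrix
coherence = G[~np.eye(m, dtype=bool)].max()      # max off-diagonal entry
welch = np.sqrt((m - n) / (n * (m - 1)))         # Welch lower bound
print(np.isclose(coherence, welch))              # bound met with equality
```

The paper's contribution is an improvement of such bounds specifically for collections of real vectors, where the complex bound is not tight in general.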
Ahmed El Shafie, Kamel Tourki, Naofal Al-Dhahir
Comments: Accepted for publication in IEEE Communications Letters Dec 2016
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
We propose a new artificial-noise aided hybrid time-switching/power-splitting
scheme for orthogonal frequency-division multiplexing (OFDM) systems to
securely transmit data and transfer energy to a legitimate receiving node. In
our proposed scheme, the cyclic prefix has two more benefits in addition to the
cancellation of the inter-symbol interference between the OFDM blocks. Firstly,
it enables the legitimate transmitter to send artificial-noise (AN) vectors
such that the interference can be canceled at the legitimate receiver
prior to information decoding. Secondly, its power is used to energize the
legitimate receiver. We optimize the cyclic prefix length, the time-switching
and power-splitting parameters, and the power allocation ratio between the data
and AN signals at the legitimate transmitter to maximize the average secrecy
rate subject to a constraint on the average energy transfer rate at the
legitimate receiver. Our numerical results demonstrate that our proposed scheme
can achieve up to 23% average secrecy rate gain relative to a pure
power-splitting scheme.
Moises Delgado, Heeralal Janwa
Subjects: Algebraic Geometry (math.AG); Cryptography and Security (cs.CR); Information Theory (cs.IT); Combinatorics (math.CO); Number Theory (math.NT)
Let \(f\) be a function on a finite field \(F\). The decomposition of the
generalized Fermat variety \(X\) defined by the multivariate polynomial of
degree \(n\), \(\phi(x,y,z)=f(x)+f(y)+f(z)\), in
\(\mathbb{P}^3(\overline{\mathbb{F}}_2)\),
plays a crucial role in the study of almost perfect non-linear (APN) functions
and exceptional APN functions. Their structure depends fundamentally on the
Fermat varieties corresponding to the monomial functions of exceptional degrees
\(n=2^k+1\) and \(n=2^{2k}-2^k+1\) (Gold and Kasami-Welch numbers, respectively).
Very important results for these have been obtained by Janwa, McGuire and
Wilson in [12,13]. In this paper we study \(X\) related to the Kasami-Welch
degree monomials and its decomposition into absolutely irreducible components.
We show that, in this decomposition, the components intersect transversally at
a singular point.
This structural fact implies that the corresponding generalized Fermat
hypersurfaces, related to Kasami-Welch degree polynomial families, are
absolutely irreducible. In particular, we prove that if
\(f(x)=x^{2^{2k}-2^k+1}+h(x)\), where \(\deg(h)\equiv 3 \pmod{4}\), then the
corresponding APN multivariate hypersurface is absolutely irreducible, and
hence \(f(x)\) is not an exceptional APN function. We also prove a conditional
result in the case when \(\deg(h)\equiv 5 \pmod{8}\). Since for odd degree
\(f(x)\) the conjecture needs to be resolved only for the Gold degree and the
Kasami-Welch degree cases, our results contribute substantially to the proof
of the conjecture on exceptional APN functions in the hardest case: the
Kasami-Welch degree.
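The APN property itself admits a direct finite check on small fields, which makes the objects above concrete. The toy sketch below (an illustration of the definition, not of the paper's algebraic-geometry methods) verifies that \(x^3\) is APN over \(\mathbb{F}_{2^3}\); note that for \(k=1\) the Kasami-Welch number \(2^{2k}-2^k+1=3\) coincides with the Gold number \(2^k+1=3\):

```python
# Arithmetic in GF(2^3) = F_2[x]/(x^3 + x + 1); elements are 3-bit integers.
def gf_mul(a, b):
    r = 0
    for _ in range(3):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # reduce modulo x^3 + x + 1
            a ^= 0b1011
    return r

def is_apn(f, q=8):
    # f is APN iff for every nonzero a the derivative x -> f(x ^ a) ^ f(x)
    # is 2-to-1, i.e. every value it attains is attained exactly twice.
    for a in range(1, q):
        counts = {}
        for x in range(q):
            d = f(x ^ a) ^ f(x)
            counts[d] = counts.get(d, 0) + 1
        if any(c != 2 for c in counts.values()):
            return False
    return True

cube = lambda x: gf_mul(gf_mul(x, x), x)   # x^3: Gold/Kasami-Welch for k=1
square = lambda x: gf_mul(x, x)            # x^2: F_2-linear, so not APN

assert is_apn(cube) and not is_apn(square)
```

The exceptional-APN question asks for which exponents such checks succeed over infinitely many extensions, which is exactly where the irreducibility of the Fermat hypersurface enters.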
Abhinav Aggarwal, Varsha Dani, Nico Doettling, Thomas P. Hayes, Jared Saia
Comments: 15 pages, 2 figures
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)
A group of \(n\) users wants to run an asynchronous protocol, whose length
\(L\) is unknown a priori. An oblivious bit-flipping adversary, who does not
have access to the private random bits of the users, enters the scene with an
unbounded budget \(T\). This paper presents an algorithm that compiles the
noise-free protocol \(\pi\) into a robust protocol \(\pi'\) that can deal with
this adversary and achieve successful communication with probability
\(1 - O(1/n)\), sending \(O((L + T) \log n(L + T))\) bits in expectation. It
does so without any a priori knowledge of \(L\) or \(T\). Our result is,
hence, within logarithmic factors of the optimal.
Theo van Uem
Comments: 11 pages. arXiv admin note: text overlap with arXiv:1612.00276
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM); Information Theory (cs.IT)
Winning probabilities of the Hat Game (Ebert’s Hat Problem) with three
players and three colors are only known in the symmetric case, where all
colors are equally likely. This paper solves the asymmetric case, in which the
color probabilities may differ. We find winning probabilities and optimal
strategies in all cases.
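For context, the classical two-color symmetric version of Ebert’s problem is small enough to solve by exhaustive search. The sketch below (a toy baseline, not the paper’s three-color asymmetric analysis) searches all symmetric deterministic strategies for three players and recovers the well-known optimal winning probability of 3/4:

```python
from itertools import product

# Three players, two equally likely hat colors {0, 1}. A symmetric
# deterministic strategy maps the ordered pair of hats a player sees
# (4 cases, encoded as 2*a + b) to a guess in {0, 1} or a pass (2).
best = 0
for strat in product(range(3), repeat=4):
    wins = 0
    for hats in product(range(2), repeat=3):
        correct, wrong = False, False
        for i in range(3):
            seen = 2 * hats[(i + 1) % 3] + hats[(i + 2) % 3]
            g = strat[seen]
            if g == 2:
                continue                 # this player passes
            if g == hats[i]:
                correct = True
            else:
                wrong = True
        # The team wins iff at least one player guesses and nobody errs.
        if correct and not wrong:
            wins += 1
    best = max(best, wins)

assert best == 6   # 6 of the 8 equally likely assignments: probability 3/4
```

The maximizing strategy is the familiar one: guess the opposite color when the two visible hats match, pass otherwise. The asymmetric and multi-color cases the paper treats blow up this search space, which is why closed-form analysis is needed.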
Ahmed El Shafie, Dusit Niyato, Naofal Al-Dhahir
Comments: IEEE Wireless Communications Letters 2016, this http URL
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
In this letter, we investigate the security of a single-antenna rechargeable
source node in the presence of a multi-antenna rechargeable cooperative jammer
and a potential single-antenna eavesdropper. The batteries at the legitimate
transmitting nodes (i.e. the source node and the jamming node) are assumed to
be limited in capacity and are modeled as queueing systems. We investigate the
impact of the energy arrival rates at the batteries on the achievable secrecy
rates. In our energy-constrained network, we propose an efficient scheme to
enhance the system’s security by optimizing the transmission times of the
source node. The jammer uses a subset of its antennas (and transmit
radio-frequency chains) to create a beamformer which maximizes the system’s
secrecy rate while completely canceling the artificial noise at the legitimate
destination. Our numerical results demonstrate the significant average secrecy
rate gain of our proposed scheme.
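The nulling idea used by the jammer (beamforming the artificial noise into the null space of the jammer-to-destination channel so it vanishes at the legitimate receiver) can be sketched in a few lines. The channel realization, antenna count, and variable names below are hypothetical placeholders, not quantities from the letter:

```python
import numpy as np

rng = np.random.default_rng(1)
Nt = 4  # hypothetical number of jammer antennas (and RF chains) in use
# Hypothetical flat-fading jammer-to-destination channel (complex Gaussian).
h_d = rng.standard_normal(Nt) + 1j * rng.standard_normal(Nt)

# The right singular vectors beyond the first span the null space of h_d^H,
# so AN transmitted along them contributes zero power at the destination.
_, _, Vh = np.linalg.svd(h_d.conj().reshape(1, -1))
null_basis = Vh[1:].conj().T                    # Nt x (Nt - 1) orthonormal
coeffs = rng.standard_normal(Nt - 1) + 1j * rng.standard_normal(Nt - 1)
an = null_basis @ coeffs                        # artificial-noise vector

# The AN is invisible to the legitimate destination ...
assert abs(h_d.conj() @ an) < 1e-9
# ... yet carries nonzero power in other directions (e.g. toward Eve).
assert np.linalg.norm(an) > 1e-6
```

With \(N_t\) antennas the null space has dimension \(N_t - 1\), which is why the scheme needs a multi-antenna jammer: a single-antenna node has no null space to hide the noise in.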
Ahmed El Shafie, Ahmed Sultan, Naofal Al-Dhahir
Comments: Published in IEEE Comm Letters this http URL
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
This letter proposes a novel hybrid half-/full-duplex relaying scheme to
enhance the relay channel security. A source node (Alice) communicates with her
destination node (Bob) in the presence of a buffer-aided full-duplex relay node
(Rooney) and a potential eavesdropper (Eve). Rooney adopts two different
relaying strategies, namely randomize-and-forward and decode-and-forward
relaying strategies, to improve the security of the legitimate system. In the
first relaying strategy, Rooney uses a codebook different from that used at
Alice. In the second relaying strategy, Rooney and Alice use the same
codebooks. In addition, Rooney switches between half-duplex and full-duplex
modes to further enhance the security of the legitimate system. The numerical
results demonstrate that our proposed scheme achieves a significant average
secrecy end-to-end throughput improvement relative to the conventional
bufferless full-duplex relaying scheme.