Benjamin Doerr, Huu Phuoc Le, Régis Makhmara, Ta Duy Nguyen
Subjects: Neural and Evolutionary Computing (cs.NE)
For genetic algorithms using a bit-string representation of length~(n), the
general recommendation is to take (1/n) as mutation rate. In this work, we
discuss whether this is really justified for multimodal functions. Taking jump
functions and the ((1+1)) evolutionary algorithm as the simplest example, we
observe that larger mutation rates give significantly better runtimes. For the
(jump_{m,n}) function, any mutation rate between (2/n) and (m ln(m/2) / n)
leads to a speed-up at least exponential in (m) compared to the standard
choice.
The asymptotically best runtime, obtained from using the mutation rate (m/n)
and leading to a speed-up super-exponential in (m), is very sensitive to small
changes of the mutation rate. Any deviation by a small ((1 pm eps)) factor
leads to a slow-down exponential in (m). Consequently, any fixed mutation rate
gives strongly sub-optimal results for most jump functions.
Building on this observation, we propose to use a random mutation rate
(alpha/n), where (alpha) is chosen from a power-law distribution. We prove
that the ((1+1)) EA with this heavy-tailed mutation rate optimizes any
(jump_{m,n}) function in a time that is only a small polynomial (in~(m))
factor above the one stemming from the optimal rate for this (m).
Our heavy-tailed mutation operator yields similar speed-ups (over the best
known performance guarantees) for the vertex cover problem in bipartite graphs
and the matching problem in general graphs.
Following the example of fast simulated annealing, fast evolution strategies,
and fast evolutionary programming, we propose to call genetic algorithms using
a heavy-tailed mutation operator emph{fast genetic algorithms}.
Chelsea Finn, Pieter Abbeel, Sergey Levine
Comments: Videos of the reinforcement learning results are at sites.google.com/view/maml
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
We propose an algorithm for meta-learning that is model-agnostic, in the
sense that it is compatible with any model trained with gradient descent and
applicable to a variety of different learning problems, including
classification, regression, and reinforcement learning. The goal of
meta-learning is to train a model on a variety of learning tasks, such that it
can solve new learning tasks using only a small number of training samples. In
our approach, the parameters of the model are explicitly trained such that a
small number of gradient steps with a small amount of training data from a new
task will produce good generalization performance on that task. In effect, our
method trains the model to be easy to fine-tune. We demonstrate that this
approach leads to state-of-the-art performance on a few-shot image
classification benchmark, produces good results on few-shot regression, and
accelerates fine-tuning for policy gradient reinforcement learning with neural
network policies.
Dhanesh Ramachandram, Terrance DeVries
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We present a method for skin lesion segmentation for the ISIC 2017 Skin
Lesion Segmentation Challenge. Our approach is based on a Fully Convolutional
Network architecture which is trained end to end, from scratch, on a limited
dataset. Our semantic segmentation architecture utilizes several recent
innovations in particularly in the combined use of (i) use of emph{atrous}
convolutions to increase the effective field of view of the network’s receptive
field without increasing the number of parameters, (ii) the use of
network-in-network (1 imes1) convolution layers to increase network capacity
without incereasing the number of parameters and (iii) state-of-art
super-resolution upsampling of predictions using subpixel CNN layers for
accurate and efficient upsampling of predictions. We achieved a IOU score of
0.642 on the validation set provided by the organisers.
Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio
Comments: 15 pages with appendix, 7 figures, 4 tables. Conference paper in 5th International Conference on Learning Representations (ICLR 2017)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
This paper proposes a new model for extracting an interpretable sentence
embedding by introducing self-attention. Instead of using a vector, we use a
2-D matrix to represent the embedding, with each row of the matrix attending on
a different part of the sentence. We also propose a self-attention mechanism
and a special regularization term for the model. As a side effect, the
embedding comes with an easy way of visualizing what specific parts of the
sentence are encoded into the embedding. We evaluate our model on 3 different
tasks: author profiling, sentiment classification, and textual entailment.
Results show that our model yields a significant performance gain compared to
other sentence embedding methods in all of the 3 tasks.
Dhanesh Ramachandram, Terrance DeVries
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We present a method for skin lesion segmentation for the ISIC 2017 Skin
Lesion Segmentation Challenge. Our approach is based on a Fully Convolutional
Network architecture which is trained end to end, from scratch, on a limited
dataset. Our semantic segmentation architecture utilizes several recent
innovations in particularly in the combined use of (i) use of emph{atrous}
convolutions to increase the effective field of view of the network’s receptive
field without increasing the number of parameters, (ii) the use of
network-in-network (1 imes1) convolution layers to increase network capacity
without incereasing the number of parameters and (iii) state-of-art
super-resolution upsampling of predictions using subpixel CNN layers for
accurate and efficient upsampling of predictions. We achieved a IOU score of
0.642 on the validation set provided by the organisers.
Limin Wang, Yuanjun Xiong, Dahua Lin, Luc Van Gool
Comments: Preliminary version to appear in CVPR2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Current action recognition methods heavily rely on trimmed videos for model
training. However, it is very expensive and time-consuming to acquire a
large-scale trimmed video dataset. This paper presents a new weakly supervised
architecture, called UntrimmedNet, which is able to directly learn from
untrimmed videos without the need of temporal annotations of action instances.
Our UntrimmedNet couples two important components, the classification module
and the selection module, to learn the action models and reason about the
temporal duration of action instances, respectively. These two modules are
implemented with feed-forward networks. UntrimmedNet is essentially an
end-to-end trainable architecture, which allows for the joint optimization of
model parameters of both components. We exploit the learned models for the
problems of action recognition (WSR) and detection (WSD) on the untrimmed video
datasets of THUMOS14 and ActivityNet. Although our UntrimmedNet only employs
weak supervision, our method achieves performance superior or comparable to
that of strongly supervised approaches on these two datasets.
Umut Güçlü, Yağmur Güçlütürk, Meysam Madadi, Sergio Escalera, Xavier Baró, Jordi González, Rob van Lier, Marcel A. J. van Gerven
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recent years have seen a sharp increase in the number of related yet distinct
advances in semantic segmentation. Here, we tackle this problem by leveraging
the respective strengths of these advances. That is, we formulate a conditional
random field over a four-connected graph as end-to-end trainable convolutional
and recurrent networks, and estimate them via an adversarial process.
Importantly, our model learns not only unary potentials but also pairwise
potentials, while aggregating multi-scale contexts and controlling higher-order
inconsistencies. We evaluate our model on two standard benchmark datasets for
semantic face segmentation, achieving state-of-the-art results on both of them.
Jing Huo, Wenbin Li, Yinghuan Shi, Yang Gao, Hujun Yin
Comments: 8 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Caricatures are facial drawings by artists with exaggeration on certain
facial parts. The exaggerations are often beyond realism and yet the
caricatures are still recognizable by humans. With the advent of deep learning,
recognition performances by computers on real-world faces has become comparable
to human performance even under unconstrained situations. However, there is
still a gap in caricature recognition performance between computer and human.
This is mainly due to the lack of publicly available caricature datasets of
large scale. To facilitate the research in caricature recognition, a new
caricature dataset is built. All the caricature images and face images were
collected from the web.Compared with two existing datasets, this dataset is of
larger size and has various artistic styles. We also offer evaluation protocols
and present baseline performances on the dataset. Specifically, four evaluation
protocols are provided: restricted and unrestricted caricature verifications,
caricature to photo and photo to caricature face identifications. Based on the
evaluation protocols, three face alignment methods together with five kinds of
features and nine subspace and metric learning algorithms have been applied to
provide the baseline performances on this dataset. Main conclusion is that
there is still a space for improvement in caricature face recognition.
Amin Fehri (CMM), Santiago Velasco-Forero (CMM), Fernand Meyer (CMM)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Image segmentation is the process of partitioning an image into a set of
meaningful regions according to some criteria. Hierarchical segmentation has
emerged as a major trend in this regard as it favors the emergence of important
regions at different scales. On the other hand, many methods allow us to have
prior information on the position of structures of interest in the images. In
this paper, we present a versatile hierarchical segmentation method that takes
into account any prior spatial information and outputs a hierarchical
segmentation that emphasizes the contours or regions of interest while
preserving the important structures in the image. Several applications are
presented that illustrate the method versatility and efficiency.
Mario Rosario Guarracino, Lucia Maddalena
Comments: 4 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose an automatic algorithm, named SDI, for the segmentation of skin
lesions in dermoscopic images, articulated into three main steps: selection of
the image ROI, selection of the segmentation band, and segmentation. We present
extensive experimental results achieved by the SDI algorithm on the lesion
segmentation dataset made available for the ISIC 2017 challenge on Skin Lesion
Analysis Towards Melanoma Detection, highlighting its advantages and
disadvantages.
Thomas Vandal, Evan Kodra, Sangram Ganguly, Andrew Michaelis, Ramakrishna Nemani, Auroop R Ganguly
Comments: 9 pages, 5 Figures, 2 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The impacts of climate change are felt by most critical systems, such as
infrastructure, ecological systems, and power-plants. However, contemporary
Earth System Models (ESM) are run at spatial resolutions too coarse for
assessing effects this localized. Local scale projections can be obtained using
statistical downscaling, a technique which uses historical climate observations
to learn a low-resolution to high-resolution mapping. Depending on statistical
modeling choices, downscaled projections have been shown to vary significantly
terms of accuracy and reliability. The spatio-temporal nature of the climate
system motivates the adaptation of super-resolution image processing techniques
to statistical downscaling. In our work, we present DeepSD, a generalized
stacked super resolution convolutional neural network (SRCNN) framework for
statistical downscaling of climate variables. DeepSD augments SRCNN with
multi-scale input channels to maximize predictability in statistical
downscaling. We provide a comparison with Bias Correction Spatial
Disaggregation as well as three Automated-Statistical Downscaling approaches in
downscaling daily precipitation from 1 degree (~100km) to 1/8 degrees (~12.5km)
over the Continental United States. Furthermore, a framework using the NASA
Earth Exchange (NEX) platform is discussed for downscaling more than 20 ESM
models with multiple emission scenarios.
Kazuhisa Matsunaga, Akira Hamada, Akane Minagawa, Hiroshi Koga
Comments: 4 pages. 3 figures. ISIC2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This short paper reports the method and the evaluation results of Casio and
Shinshu University joint team for the ISBI Challenge 2017 – Skin Lesion
Analysis Towards Melanoma Detection – Part 3: Lesion Classification hosted by
ISIC. Our online validation score was 0.958 with melanoma classifier AUC 0.924
and seborrheic keratosis classifier AUC 0.993.
Yu Xiang, Dieter Fox
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
3D scene understanding is important for robots to interact with the 3D world
in a meaningful way. Most previous works on 3D scene understanding focus on
recognizing geometrical or semantic properties of the scene independently. In
this work, we introduce Data Associated Recurrent Neural Networks (DA-RNNs), a
novel framework for joint 3D scene mapping and semantic labeling. DA-RNNs use a
new recurrent neural network architecture for semantic labeling on RGB-D
videos. The output of the network is integrated with mapping techniques such as
KinectFusion in order to inject semantic information into the reconstructed 3D
scene. Experiments conducted on a real world dataset and a synthetic dataset
with RGB-D videos demonstrate the ability of our method in semantic 3D scene
mapping.
Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing
Comments: To appear in CVPR 2017 as a spotlight paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
This paper develops a general framework for learning interpretable data
representation via Long Short-Term Memory (LSTM) recurrent neural networks over
hierarchal graph structures. Instead of learning LSTM models over the pre-fixed
structures, we propose to further learn the intermediate interpretable
multi-level graph structures in a progressive and stochastic way from data
during the LSTM network optimization. We thus call this model the
structure-evolving LSTM. In particular, starting with an initial element-level
graph representation where each node is a small data element, the
structure-evolving LSTM gradually evolves the multi-level graph representations
by stochastically merging the graph nodes with high compatibilities along the
stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two
connected nodes from their corresponding LSTM gate outputs, which is used to
generate a merging probability. The candidate graph structures are accordingly
generated where the nodes are grouped into cliques with their merging
probabilities. We then produce the new graph structure with a
Metropolis-Hasting algorithm, which alleviates the risk of getting stuck in
local optimums by stochastic sampling with an acceptance probability. Once a
graph structure is accepted, a higher-level graph is then constructed by taking
the partitioned cliques as its nodes. During the evolving process,
representation becomes more abstracted in higher-levels where redundant
information is filtered out, allowing more efficient propagation of long-range
data dependencies. We evaluate the effectiveness of structure-evolving LSTM in
the application of semantic object parsing and demonstrate its advantage over
state-of-the-art LSTM models on standard benchmarks.
Xiaodan Liang, Lisa Lee, Eric P. Xing
Comments: This manuscript is accepted by CVPR 2017 as a spotlight paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Despite progress in visual perception tasks such as image classification and
detection, computers still struggle to understand the interdependency of
objects in the scene as a whole, e.g., relations between objects or their
attributes. Existing methods often ignore global context cues capturing the
interactions among different object instances, and can only recognize a handful
of types by exhaustively training individual detectors for all possible
relationships. To capture such global interdependency, we propose a deep
Variation-structured Reinforcement Learning (VRL) framework to sequentially
discover object relationships and attributes in the whole image. First, a
directed semantic action graph is built using language priors to provide a rich
and compact representation of semantic correlations between object categories,
predicates, and attributes. Next, we use a variation-structured traversal over
the action graph to construct a small, adaptive action set for each step based
on the current state and historical actions. In particular, an ambiguity-aware
object mining scheme is used to resolve semantic ambiguity among object
categories that the object detector fails to distinguish. We then make
sequential predictions using a deep RL framework, incorporating global context
cues and semantic embeddings of previously extracted phrases in the state
vector. Our experiments on the Visual Relationship Detection (VRD) dataset and
the large-scale Visual Genome dataset validate the superiority of VRL, which
can achieve significantly better detection results on datasets involving
thousands of relationship and attribute types. We also demonstrate that VRL is
able to predict unseen types embedded in our action graph by learning
correlations on shared graph nodes.
Chelsea Finn, Pieter Abbeel, Sergey Levine
Comments: Videos of the reinforcement learning results are at sites.google.com/view/maml
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
We propose an algorithm for meta-learning that is model-agnostic, in the
sense that it is compatible with any model trained with gradient descent and
applicable to a variety of different learning problems, including
classification, regression, and reinforcement learning. The goal of
meta-learning is to train a model on a variety of learning tasks, such that it
can solve new learning tasks using only a small number of training samples. In
our approach, the parameters of the model are explicitly trained such that a
small number of gradient steps with a small amount of training data from a new
task will produce good generalization performance on that task. In effect, our
method trains the model to be easy to fine-tune. We demonstrate that this
approach leads to state-of-the-art performance on a few-shot image
classification benchmark, produces good results on few-shot regression, and
accelerates fine-tuning for policy gradient reinforcement learning with neural
network policies.
Morris Antonello, Marco Carraro, Marco Pierobon, Emanuele Menegatti
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
This paper deals with the problem of detecting fallen people lying on the
floor by means of a mobile robot equipped with a 3D depth sensor. In the
proposed algorithm, inspired by semantic segmentation techniques, the 3D scene
is over-segmented into small patches. Fallen people are then detected by means
of two SVM classifiers: the first one labels each patch, while the second one
captures the spatial relations between them. This novel approach showed to be
robust and fast. Indeed, thanks to the use of small patches, fallen people in
real cluttered scenes with objects side by side are correctly detected.
Moreover, the algorithm can be executed on a mobile robot fitted with a
standard laptop making it possible to exploit the 2D environmental map built by
the robot and the multiple points of view obtained during the robot navigation.
Additionally, this algorithm is robust to illumination changes since it does
not rely on RGB data but on depth data. All the methods have been thoroughly
validated on the IASLAB-RGBD Fallen Person Dataset, which is published online
as a further contribution. It consists of several static and dynamic sequences
with 15 different people and 2 different environments.
Chaitanya Mitash, Kostas E. Bekris, Abdeslam Boularias
Comments: Submitted to International Conference on Intelligent Robots and Systems (IROS 2017)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Impressive progress has been achieved in object detection with the use of
deep learning. Nevertheless, such tools typically require a large amount of
training data and significant manual effort for labeling objects. This limits
their applicability in robotics, where it is necessary to scale solutions to a
large number of objects and a variety of conditions. This work proposes a fully
autonomous process to train a Convolutional Neural Network (CNN) for object
detection and pose estimation in robotic setups. The application involves
detection of objects placed in a clutter and in tight environments, such as a
shelf. In particular, given access to 3D object models, several aspects of the
environment are simulated and the models are placed in physically realistic
poses with respect to their environment to generate a labeled synthetic
dataset. To further improve object detection, the network self-trains over real
images that are labeled using a robust multi-view pose estimation process. The
proposed training process is evaluated on several existing datasets and on a
dataset that we collected with a Motoman robotic manipulator. Results show that
the proposed process outperforms popular training processes relying on
synthetic data generation and manual annotation.
Enes Kocabey, Mustafa Camurcu, Ferda Ofli, Yusuf Aytar, Javier Marin, Antonio Torralba, Ingmar Weber
Comments: This is a preprint of a short paper accepted at ICWSM’17. Please cite that version instead
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
A person’s weight status can have profound implications on their life,
ranging from mental health, to longevity, to financial income. At the societal
level, “fat shaming” and other forms of “sizeism” are a growing concern, while
increasing obesity rates are linked to ever raising healthcare costs. For these
reasons, researchers from a variety of backgrounds are interested in studying
obesity from all angles. To obtain data, traditionally, a person would have to
accurately self-report their body-mass index (BMI) or would have to see a
doctor to have it measured. In this paper, we show how computer vision can be
used to infer a person’s BMI from social media images. We hope that our tool,
which we release, helps to advance the study of social aspects related to body
weight.
Liangzhen Lai, Naveen Suda, Vikas Chandra
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Deep convolutional neural network (CNN) inference requires significant amount
of memory and computation, which limits its deployment on embedded devices. To
alleviate these problems to some extent, prior research utilize low precision
fixed-point numbers to represent the CNN weights and activations. However, the
minimum required data precision of fixed-point weights varies across different
networks and also across different layers of the same network. In this work, we
propose using floating-point numbers for representing the weights and
fixed-point numbers for representing the activations. We show that using
floating-point representation for weights is more efficient than fixed-point
representation for the same bit-width and demonstrate it on popular large-scale
CNNs such as AlexNet, SqueezeNet, GoogLeNet and VGG-16. We also show that such
a representation scheme enables compact hardware multiply-and-accumulate (MAC)
unit design. Experimental results show that the proposed scheme reduces the
weight storage by up to 36% and power consumption of the hardware multiplier by
up to 50%.
Niki Pfeifer, Hiroshi Yama
Subjects: Artificial Intelligence (cs.AI); Logic (math.LO); Probability (math.PR)
In this paper we study selected argument forms involving counterfactuals and
indicative conditionals under uncertainty. We selected argument forms to
explore whether people with an Eastern cultural background reason differently
about conditionals compared to Westerners, because of the differences in the
location of negations. In a 2×2 between-participants design, 63 Japanese
university students were allocated to four groups, crossing indicative
conditionals and counterfactuals, and each presented in two random task orders.
The data show close agreement between the responses of Easterners and
Westerners. The modal responses provide strong support for the hypothesis that
conditional probability is the best predictor for counterfactuals and
indicative conditionals. Finally, the grand majority of the responses are
probabilistically coherent, which endorses the psychological plausibility of
choosing coherence-based probability logic as a rationality framework for
psychological reasoning research.
Niki Pfeifer, Leena Tulkki
Subjects: Artificial Intelligence (cs.AI); Probability (math.PR)
We study abductive, causal, and non-causal conditionals in indicative and
counterfactual formulations using probabilistic truth table tasks under
incomplete probabilistic knowledge (N = 80). We frame the task as a
probability-logical inference problem. The most frequently observed response
type across all conditions was a class of conditional event interpretations of
conditionals; it was followed by conjunction interpretations. An interesting
minority of participants neglected some of the relevant imprecision involved in
the premises when inferring lower or upper probability bounds on the target
conditional/counterfactual (“halfway responses”). We discuss the results in the
light of coherence-based probability logic and the new paradigm psychology of
reasoning.
Niki Pfeifer, Hanna Pankka
Subjects: Artificial Intelligence (cs.AI); Logic (math.LO); Probability (math.PR)
We present a formal measure of argument strength, which combines the ideas
that conclusions of strong arguments are (i) highly probable and (ii) their
uncertainty is relatively precise. Likewise, arguments are weak when their
conclusion probability is low or when it is highly imprecise. We show how the
proposed measure provides a new model of the Ellsberg paradox. Moreover, we
further substantiate the psychological plausibility of our approach by an
experiment (N = 60). The data show that the proposed measure predicts human
inferences in the original Ellsberg task and in corresponding argument strength
tasks. Finally, we report qualitative data taken from structured interviews on
folk psychological conceptions on what argument strength means.
Taisuke Sato
Comments: 7 pages, AAAI-17 Workshop on Symbolic Inference and Optimization
Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
We propose a new linear algebraic approach to the computation of Tarskian
semantics in logic. We embed a finite model M in first-order logic with N
entities in N-dimensional Euclidean space R^N by mapping entities of M to N
dimensional one-hot vectors and k-ary relations to order-k adjacency tensors
(multi-way arrays). Second given a logical formula F in prenex normal form, we
compile F into a set Sigma_F of algebraic formulas in multi-linear algebra with
a nonlinear operation. In this compilation, existential quantifiers are
compiled into a specific type of tensors, e.g., identity matrices in the case
of quantifying two occurrences of a variable. It is shown that a systematic
evaluation of Sigma_F in R^N gives the truth value, 1(true) or 0(false), of F
in M. Based on this framework, we also propose an unprecedented way of
computing the least models defined by Datalog programs in linear spaces via
matrix equations and empirically show its effectiveness compared to
state-of-the-art approaches.
Chelsea Finn, Pieter Abbeel, Sergey Levine
Comments: Videos of the reinforcement learning results are at sites.google.com/view/maml
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
We propose an algorithm for meta-learning that is model-agnostic, in the
sense that it is compatible with any model trained with gradient descent and
applicable to a variety of different learning problems, including
classification, regression, and reinforcement learning. The goal of
meta-learning is to train a model on a variety of learning tasks, such that it
can solve new learning tasks using only a small number of training samples. In
our approach, the parameters of the model are explicitly trained such that a
small number of gradient steps with a small amount of training data from a new
task will produce good generalization performance on that task. In effect, our
method trains the model to be easy to fine-tune. We demonstrate that this
approach leads to state-of-the-art performance on a few-shot image
classification benchmark, produces good results on few-shot regression, and
accelerates fine-tuning for policy gradient reinforcement learning with neural
network policies.
Dhanesh Ramachandram, Terrance DeVries
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We present a method for skin lesion segmentation for the ISIC 2017 Skin
Lesion Segmentation Challenge. Our approach is based on a Fully Convolutional
Network architecture which is trained end to end, from scratch, on a limited
dataset. Our semantic segmentation architecture utilizes several recent
innovations in particularly in the combined use of (i) use of emph{atrous}
convolutions to increase the effective field of view of the network’s receptive
field without increasing the number of parameters, (ii) the use of
network-in-network (1 imes1) convolution layers to increase network capacity
without incereasing the number of parameters and (iii) state-of-art
super-resolution upsampling of predictions using subpixel CNN layers for
accurate and efficient upsampling of predictions. We achieved a IOU score of
0.642 on the validation set provided by the organisers.
Thi Thanh Van Nguyen, Manh Duong Phung, Quang Vinh Tran
Journal-ref: International Journal of Control and Automation, Vol. 10, No. 2
(2017), pp.349-364
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Systems and Control (cs.SY)
This study proposes behavior-based navigation architecture, named BBFM, to
deal with the problem of navigating the mobile robot in unknown environments in
the presence of obstacles and local minimum regions. In the architecture, the
complex navigation task is split into principal sub-tasks or behaviors. Each
behavior is implemented by a fuzzy controller and executed independently to
deal with a specific problem of navigation. The fuzzy controller is modified to
contain only the fuzzification and inference procedures so that its output is a
membership function representing the behavior’s objective. The membership
functions of all controllers are then used as the objective functions for a
multi-objective optimization process to coordinate all behaviors. The result of
this process is an overall control signal, which is Pareto-optimal, used to
control the robot. A number of simulations, comparisons, and experiments were
conducted. The results show that the proposed architecture outperforms some
popular behavior-based architectures in term of accuracy, smoothness, traveled
distance, and time response.
Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio
Comments: 15 pages with appendix, 7 figures, 4 tables. Conference paper in 5th International Conference on Learning Representations (ICLR 2017)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
This paper proposes a new model for extracting an interpretable sentence
embedding by introducing self-attention. Instead of using a vector, we use a
2-D matrix to represent the embedding, with each row of the matrix attending on
a different part of the sentence. We also propose a self-attention mechanism
and a special regularization term for the model. As a side effect, the
embedding comes with an easy way of visualizing what specific parts of the
sentence are encoded into the embedding. We evaluate our model on 3 different
tasks: author profiling, sentiment classification, and textual entailment.
Results show that our model yields a significant performance gain compared to
other sentence embedding methods in all of the 3 tasks.
Mayank Kejriwal, Pedro Szekely
Comments: 10 pages, ACM WWW 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Extracting useful entities and attribute values from illicit domains such as
human trafficking is a challenging problem with the potential for widespread
social impact. Such domains employ atypical language models, have `long tails’
and suffer from the problem of concept drift. In this paper, we propose a
lightweight, feature-agnostic Information Extraction (IE) paradigm specifically
designed for such domains. Our approach uses raw, unlabeled text from an
initial corpus, and a few (12-120) seed annotations per domain-specific
attribute, to learn robust IE models for unobserved pages and websites.
Empirically, we demonstrate that our approach can outperform feature-centric
Conditional Random Field baselines by over 18\% F-Measure on five annotated
sets of real-world human trafficking datasets in both low-supervision and
high-supervision settings. We also show that our approach is demonstrably
robust to concept drift, and can be efficiently bootstrapped even in a serial
computing environment.
Gelin Gao, Bud Mishra, Daniele Ramazzotti
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE)
The most recent financial upheavals have cast doubt on the adequacy of some
of the conventional quantitative risk management strategies, such as VaR (Value
at Risk), in many common situations. Consequently, there has been an increasing
need for verisimilar financial stress testings, namely simulating and analyzing
financial portfolios in extreme, albeit rare scenarios. Unlike conventional
risk management which exploits statistical correlations among financial
instruments, here we focus our analysis on the notion of probabilistic
causation, which is embodied by Suppes-Bayes Causal Networks (SBCNs), SBCNs are
probabilistic graphical models that have many attractive features in terms of
more accurate causal analysis for generating financial stress scenarios. In
this paper, we present a novel approach for conducting stress testing of
financial portfolios based on SBCNs in combination with classical machine
learning classification tools. The resulting method is shown to be capable of
correctly discovering the causal relationships among financial factors that
affect the portfolios and thus, simulating stress testing scenarios with a
higher accuracy and lower computational complexity than conventional Monte
Carlo Simulations.
Daniele Ramazzotti, Marco S. Nobile, Marco Antoniotti, Alex Graudenzi
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
One of the critical issues when adopting Bayesian networks (BNs) to model
dependencies among random variables is to “learn” their structure, given the
huge search space of possible solutions, i.e., all the possible direct acyclic
graphs. This is a well-known NP-hard problem, which is also complicated by
known pitfalls such as the issue of I-equivalence among different structures.
In this work we restrict the investigations on BN structure learning to a
specific class of networks, i.e., those representing the dynamics of phenomena
characterized by the monotonic accumulation of events. Such phenomena allow to
set specific structural constraints based on Suppes’ theory of probabilistic
causation and, accordingly, to define constrained BNs, named Suppes-Bayes
Causal Networks (SBCNs). We here investigate the structure learning of SBCNs
via extensive simulations with various state-of-the-art search strategies, such
as canonical local search techniques and Genetic Algorithms. Among the main
results we show that Suppes’ constraints deeply simplify the learning task, by
reducing the solution search space and providing a temporal ordering on the
variables.
Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing
Comments: To appear in CVPR 2017 as a spotlight paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
This paper develops a general framework for learning interpretable data
representation via Long Short-Term Memory (LSTM) recurrent neural networks over
hierarchal graph structures. Instead of learning LSTM models over the pre-fixed
structures, we propose to further learn the intermediate interpretable
multi-level graph structures in a progressive and stochastic way from data
during the LSTM network optimization. We thus call this model the
structure-evolving LSTM. In particular, starting with an initial element-level
graph representation where each node is a small data element, the
structure-evolving LSTM gradually evolves the multi-level graph representations
by stochastically merging the graph nodes with high compatibilities along the
stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two
connected nodes from their corresponding LSTM gate outputs, which is used to
generate a merging probability. The candidate graph structures are accordingly
generated where the nodes are grouped into cliques with their merging
probabilities. We then produce the new graph structure with a
Metropolis-Hasting algorithm, which alleviates the risk of getting stuck in
local optimums by stochastic sampling with an acceptance probability. Once a
graph structure is accepted, a higher-level graph is then constructed by taking
the partitioned cliques as its nodes. During the evolving process,
representation becomes more abstracted in higher-levels where redundant
information is filtered out, allowing more efficient propagation of long-range
data dependencies. We evaluate the effectiveness of structure-evolving LSTM in
the application of semantic object parsing and demonstrate its advantage over
state-of-the-art LSTM models on standard benchmarks.
Xiaodan Liang, Lisa Lee, Eric P. Xing
Comments: This manuscript is accepted by CVPR 2017 as a spotlight paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Despite progress in visual perception tasks such as image classification and
detection, computers still struggle to understand the interdependency of
objects in the scene as a whole, e.g., relations between objects or their
attributes. Existing methods often ignore global context cues capturing the
interactions among different object instances, and can only recognize a handful
of types by exhaustively training individual detectors for all possible
relationships. To capture such global interdependency, we propose a deep
Variation-structured Reinforcement Learning (VRL) framework to sequentially
discover object relationships and attributes in the whole image. First, a
directed semantic action graph is built using language priors to provide a rich
and compact representation of semantic correlations between object categories,
predicates, and attributes. Next, we use a variation-structured traversal over
the action graph to construct a small, adaptive action set for each step based
on the current state and historical actions. In particular, an ambiguity-aware
object mining scheme is used to resolve semantic ambiguity among object
categories that the object detector fails to distinguish. We then make
sequential predictions using a deep RL framework, incorporating global context
cues and semantic embeddings of previously extracted phrases in the state
vector. Our experiments on the Visual Relationship Detection (VRD) dataset and
the large-scale Visual Genome dataset validate the superiority of VRL, which
can achieve significantly better detection results on datasets involving
thousands of relationship and attribute types. We also demonstrate that VRL is
able to predict unseen types embedded in our action graph by learning
correlations on shared graph nodes.
Stefano Beretta, Mauro Castelli, Ivo Goncalves, Ivan Merelli, Daniele Ramazzotti
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
Gene and protein networks are very important to model complex large-scale
systems in molecular biology. Inferring or reverseengineering such networks can
be defined as the process of identifying gene/protein interactions from
experimental data through computational analysis. However, this task is
typically complicated by the enormously large scale of the unknowns in a rather
small sample size. Furthermore, when the goal is to study causal relationships
within the network, tools capable of overcoming the limitations of correlation
networks are required. In this work, we make use of Bayesian Graphical Models
to attach this problem and, specifically, we perform a comparative study of
different state-of-the-art heuristics, analyzing their performance in inferring
the structure of the Bayesian Network from breast cancer data.
Shuai Zhang, Lina Yao
Comments: 5 pages
Subjects: Information Retrieval (cs.IR)
Recommender systems have been actively and extensively studied over past
decades. In the meanwhile, the boom of Big Data is driving fundamental changes
in the development of recommender systems. In this paper, we propose a dynamic
intention-aware recommender system to better facilitate users to find desirable
products and services. Compare to prior work, our proposal possesses the
following advantages: (1) it takes user intentions and demands into account
through intention mining techniques. By unearthing user intentions from the
historical user-item interactions, and various user digital traces harvested
from social media and Internet of Things, it is capable of delivering more
satisfactory recommendations by leveraging rich online and offline user data;
(2) it embraces the benefits of embedding heterogeneous source information and
shared representations of multiple domains to provide accurate and effective
recommendations comprehensively; (3) it recommends products or services
proactively and timely by capturing the dynamic influences, which can
significantly reduce user involvements and efforts.
Jürgen Bernard, Christian Ritter, David Sessler, Matthias Zeppelzauer, Jörn Kohlhammer, Dieter Fellner
Subjects: Learning (cs.LG); Information Retrieval (cs.IR)
The definition of similarity is a key prerequisite when analyzing complex
data types in data mining, information retrieval, or machine learning. However,
the meaningful definition is often hampered by the complexity of data objects
and particularly by different notions of subjective similarity latent in
targeted user groups. Taking the example of soccer players, we present a
visual-interactive system that learns users’ mental models of similarity. In a
visual-interactive interface, users are able to label pairs of soccer players
with respect to their subjective notion of similarity. Our proposed similarity
model automatically learns the respective concept of similarity using an active
learning strategy. A visual-interactive retrieval technique is provided to
validate the model and to execute downstream retrieval tasks for soccer player
analysis. The applicability of the approach is demonstrated in different
evaluation strategies, including usage scenarions and cross-validation tests.
Buru Can, Ahmet Üstün, Murathan Kurfalı
Comments: 13 pages, accepted and presented in 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLING)
Subjects: Computation and Language (cs.CL)
Sparsity is one of the major problems in natural language processing. The
problem becomes even more severe in agglutinating languages that are highly
prone to be inflected. We deal with sparsity in Turkish by adopting
morphological features for part-of-speech tagging. We learn inflectional and
derivational morpheme tags in Turkish by using conditional random fields (CRF)
and we employ the morpheme tags in part-of-speech (PoS) tagging by using hidden
Markov models (HMMs) to mitigate sparsity. Results show that using morpheme
tags in PoS tagging helps alleviate the sparsity in emission probabilities. Our
model outperforms other hidden Markov model based PoS tagging models for small
training datasets in Turkish. We obtain an accuracy of 94.1% in morpheme
tagging and 89.2% in PoS tagging on a 5K training dataset.
Marjan Hosseinia, Arjun Mukherjee
Comments: 18 pages, Accepted at CICLing 2017, 18th International Conference on Intelligent Text Processing and Computational Linguistics
Subjects: Computation and Language (cs.CL)
This paper explores the problem of sockpuppet detection in deceptive opinion
spam using authorship attribution and verification approaches. Two methods are
explored. The first is a feature subsampling scheme that uses the KL-Divergence
on stylistic language models of an author to find discriminative features. The
second is a transduction scheme, spy induction that leverages the diversity of
authors in the unlabeled test set by sending a set of spies (positive samples)
from the training set to retrieve hidden samples in the unlabeled test set
using nearest and farthest neighbors. Experiments using ground truth sockpuppet
data show the effectiveness of the proposed schemes.
Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio
Comments: 15 pages with appendix, 7 figures, 4 tables. Conference paper in 5th International Conference on Learning Representations (ICLR 2017)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
This paper proposes a new model for extracting an interpretable sentence
embedding by introducing self-attention. Instead of using a vector, we use a
2-D matrix to represent the embedding, with each row of the matrix attending on
a different part of the sentence. We also propose a self-attention mechanism
and a special regularization term for the model. As a side effect, the
embedding comes with an easy way of visualizing what specific parts of the
sentence are encoded into the embedding. We evaluate our model on 3 different
tasks: author profiling, sentiment classification, and textual entailment.
Results show that our model yields a significant performance gain compared to
other sentence embedding methods in all of the 3 tasks.
Mayank Kejriwal, Pedro Szekely
Comments: 10 pages, ACM WWW 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Extracting useful entities and attribute values from illicit domains such as
human trafficking is a challenging problem with the potential for widespread
social impact. Such domains employ atypical language models, have `long tails’
and suffer from the problem of concept drift. In this paper, we propose a
lightweight, feature-agnostic Information Extraction (IE) paradigm specifically
designed for such domains. Our approach uses raw, unlabeled text from an
initial corpus, and a few (12-120) seed annotations per domain-specific
attribute, to learn robust IE models for unobserved pages and websites.
Empirically, we demonstrate that our approach can outperform feature-centric
Conditional Random Field baselines by over 18\% F-Measure on five annotated
sets of real-world human trafficking datasets in both low-supervision and
high-supervision settings. We also show that our approach is demonstrably
robust to concept drift, and can be efficiently bootstrapped even in a serial
computing environment.
Marc Moreno Lopez, Jugal Kalita
Subjects: Computation and Language (cs.CL)
Convolutional Neural Network (CNNs) are typically associated with Computer
Vision. CNNs are responsible for major breakthroughs in Image Classification
and are the core of most Computer Vision systems today. More recently CNNs have
been applied to problems in Natural Language Processing and gotten some
interesting results. In this paper, we will try to explain the basics of CNNs,
its different variations and how they have been applied to NLP.
William L. Hamilton, Justine Zhang, Cristian Danescu-Niculescu-Mizil, Dan Jurafsky, Jure Leskovec
Comments: Extended version of a paper that will appear in ICWSM 2017 (under the same title)
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL)
Loyalty is an essential component of multi-community engagement. When users
have the choice to engage with a variety of different communities, they often
become loyal to just one, focusing on that community at the expense of others.
However, it is unclear how loyalty is manifested in user behavior, or whether
loyalty is encouraged by certain community characteristics.
In this paper we operationalize loyalty as a user-community relation: users
loyal to a community consistently prefer it over all others; loyal communities
retain their loyal users over time. By exploring this relation using a large
dataset of discussion communities from Reddit, we reveal that loyalty is
manifested in remarkably consistent behaviors across a wide spectrum of
communities. Loyal users employ language that signals collective identity and
engage with more esoteric, less popular content, indicating they may play a
curational role in surfacing new material. Loyal communities have denser
user-user interaction networks and lower rates of triadic closure, suggesting
that community-level loyalty is associated with more cohesive interactions and
less fragmentation into subgroups. We exploit these general patterns to predict
future rates of loyalty. Our results show that a user’s propensity to become
loyal is apparent from their first interactions with a community, suggesting
that some users are intrinsically loyal from the very beginning.
Stéphane Devismes, David Ilcinkas (LaBRI), Colette Johnen (LaBRI)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
We deal with the problem of maintaining a shortest-path tree rooted at some
process r in a network that may be disconnected after topological changes. The
goal is then to maintain a shortest-path tree rooted at r in its connected
component, Vr, and make all processes of other components detecting that r is
not part of their connected component. We propose, in the composite atomicity
model, a silent self-stabilizing algorithm for this problem working in
semi-anonymous networks, where edges have strictly positive weights. This
algorithm does not require any a priori knowledge about global parameters of
the network. We prove its correctness assuming the distributed unfair daemon,
the most general daemon. Its stabilization time in rounds is at most 3nmaxCC +
D, where nmaxCC is the maximum number of non-root processes in a connected
component and D is the hop-diameter of Vr. Furthermore, if we additionally
assume that edge weights are positive integers, then it stabilizes in a
polynomial number of steps: namely, we exhibit a bound in O(WmaxnmaxCC 3 n),
where Wmax is the maximum weight of an edge and n is the number of processes.
Sai Xie, Zhe Chen
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)
In the era of big data and Internet of things, massive sensor data are
gathered with Internet of things. Quantity of data captured by sensor networks
are considered to contain highly useful and valuable information. However, for
a variety of reasons, received sensor data often appear abnormal. Therefore,
effective anomaly detection methods are required to guarantee the quality of
data collected by those sensor nodes. Since sensor data are usually correlated
in time and space, not all the gathered data are valuable for further data
processing and analysis. Preprocessing is necessary for eliminating the
redundancy in gathered massive sensor data. In this paper, the proposed work
defines a sensor data preprocessing framework. It is mainly composed of two
parts, i.e., sensor data anomaly detection and sensor data redundancy
elimination. In the first part, methods based on principal statistic analysis
and Bayesian network is proposed for sensor data anomaly detection. Then,
approaches based on static Bayesian network (SBN) and dynamic Bayesian networks
(DBNs) are proposed for sensor data redundancy elimination. Static sensor data
redundancy detection algorithm (SSDRDA) for eliminating redundant data in
static datasets and real-time sensor data redundancy detection algorithm
(RSDRDA) for eliminating redundant sensor data in real-time are proposed. The
efficiency and effectiveness of the proposed methods are validated using
real-world gathered sensor datasets.
Arnaud Casteigts, Swan Dubois, Franck Petit, John Michael Robson
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Discrete Mathematics (cs.DM)
We investigate a special case of hereditary property that we refer to as {em
robustness}. A property is {em robust} in a given graph if it is inherited by
all connected spanning subgraphs of this graph. We motivate this definition in
different contexts, showing that it plays a central role in highly dynamic
networks, although the problem is defined in terms of classical (static) graph
theory. In this paper, we focus on the robustness of {em maximal independent
sets} (MIS). Following the above definition, a MIS is said to be {em robust}
(RMIS) if it remains a valid MIS in all connected spanning subgraphs of the
original graph. We characterize the class of graphs in which {em all} possible
MISs are robust. We show that, in these particular graphs, the problem of
finding a robust MIS is {em local}; that is, we present an RMIS algorithm
using only a sublogarithmic number of rounds (in the number of nodes (n)) in
the ({cal LOCAL}) model. On the negative side, we show that, in general
graphs, the problem is not local. Precisely, we prove a (Omega(n)) lower bound
on the number of rounds required for the nodes to decide consistently in some
graphs. This result implies a separation between the RMIS problem and the MIS
problem in general graphs. It also implies that any strategy in this case is
asymptotically (in order) as bad as collecting all the network information at
one node and solving the problem in a centralized manner. Motivated by this
observation, we present a centralized algorithm that computes a robust MIS in a
given graph, if one exists, and rejects otherwise. Significantly, this
algorithm requires only a polynomial amount of local computation time, despite
the fact that exponentially many MISs and exponentially many connected spanning
subgraphs may exist.
Chelsea Finn, Pieter Abbeel, Sergey Levine
Comments: Videos of the reinforcement learning results are at sites.google.com/view/maml
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
We propose an algorithm for meta-learning that is model-agnostic, in the
sense that it is compatible with any model trained with gradient descent and
applicable to a variety of different learning problems, including
classification, regression, and reinforcement learning. The goal of
meta-learning is to train a model on a variety of learning tasks, such that it
can solve new learning tasks using only a small number of training samples. In
our approach, the parameters of the model are explicitly trained such that a
small number of gradient steps with a small amount of training data from a new
task will produce good generalization performance on that task. In effect, our
method trains the model to be easy to fine-tune. We demonstrate that this
approach leads to state-of-the-art performance on a few-shot image
classification benchmark, produces good results on few-shot regression, and
accelerates fine-tuning for policy gradient reinforcement learning with neural
network policies.
Jürgen Bernard, Christian Ritter, David Sessler, Matthias Zeppelzauer, Jörn Kohlhammer, Dieter Fellner
Subjects: Learning (cs.LG); Information Retrieval (cs.IR)
The definition of similarity is a key prerequisite when analyzing complex
data types in data mining, information retrieval, or machine learning. However,
the meaningful definition is often hampered by the complexity of data objects
and particularly by different notions of subjective similarity latent in
targeted user groups. Taking the example of soccer players, we present a
visual-interactive system that learns users’ mental models of similarity. In a
visual-interactive interface, users are able to label pairs of soccer players
with respect to their subjective notion of similarity. Our proposed similarity
model automatically learns the respective concept of similarity using an active
learning strategy. A visual-interactive retrieval technique is provided to
validate the model and to execute downstream retrieval tasks for soccer player
analysis. The applicability of the approach is demonstrated in different
evaluation strategies, including usage scenarions and cross-validation tests.
Ksenia Konyushkova, Raphael Sznitman, Pascal Fua
Subjects: Learning (cs.LG)
In this paper, we suggest a novel data-driven approach to active learning:
Learning Active Learning (LAL). The key idea behind LAL is to train a regressor
that predicts the expected error reduction for a potential sample in a
particular learning state. By treating the query selection procedure as a
regression problem we are not restricted to dealing with existing AL
heuristics; instead, we learn strategies based on experience from previous
active learning experiments. We show that LAL can be learnt from a simple
artificial 2D dataset and yields strategies that work well on real data from a
wide range of domains. Moreover, if some domain-specific samples are available
to bootstrap active learning, the LAL strategy can be tailored for a particular
problem.
Łukasz Kaiser, Ofir Nachum, Aurko Roy, Samy Bengio
Comments: Conference paper accepted for ICLR’17
Subjects: Learning (cs.LG)
Despite recent advances, memory-augmented deep neural networks are still
limited when it comes to life-long and one-shot learning, especially in
remembering rare events. We present a large-scale life-long memory module for
use in deep learning. The module exploits fast nearest-neighbor algorithms for
efficiency and thus scales to large memory sizes. Except for the
nearest-neighbor query, the module is fully differentiable and trained
end-to-end with no extra supervision. It operates in a life-long manner, i.e.,
without the need to reset it during training.
Our memory module can be easily added to any part of a supervised neural
network. To show its versatility we add it to a number of networks, from simple
convolutional ones tested on image classification to deep sequence-to-sequence
and recurrent-convolutional models. In all cases, the enhanced network gains
the ability to remember and do life-long one-shot learning. Our module
remembers training examples shown many thousands of steps in the past and it
can successfully generalize from them. We set new state-of-the-art for one-shot
learning on the Omniglot dataset and demonstrate, for the first time, life-long
one-shot learning in recurrent neural networks on a large-scale machine
translation task.
Hoang M. Le, Yisong Yue, Peter Carr
Subjects: Learning (cs.LG)
We study the problem of imitation learning from demonstrations of multiple
coordinating agents. One key challenge in this setting is that learning a good
model of coordination can be difficult, since coordination is often implicit in
the demonstrations and must be inferred as a latent variable. We propose a
joint approach that simultaneously learns a latent coordination model along
with the individual policies. In particular, our method integrates unsupervised
structure learning with conventional imitation learning. We illustrate the
power of our approach on a difficult problem of learning multiple policies for
fine-grained behavior modeling in team sports, where different players occupy
different roles in the coordinated team strategy. We show that having a
coordination model to infer the roles of players yields substantially improved
imitation loss compared to conventional baselines.
Gelin Gao, Bud Mishra, Daniele Ramazzotti
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE)
The most recent financial upheavals have cast doubt on the adequacy of some
of the conventional quantitative risk management strategies, such as VaR (Value
at Risk), in many common situations. Consequently, there has been an increasing
need for verisimilar financial stress testings, namely simulating and analyzing
financial portfolios in extreme, albeit rare scenarios. Unlike conventional
risk management which exploits statistical correlations among financial
instruments, here we focus our analysis on the notion of probabilistic
causation, which is embodied by Suppes-Bayes Causal Networks (SBCNs), SBCNs are
probabilistic graphical models that have many attractive features in terms of
more accurate causal analysis for generating financial stress scenarios. In
this paper, we present a novel approach for conducting stress testing of
financial portfolios based on SBCNs in combination with classical machine
learning classification tools. The resulting method is shown to be capable of
correctly discovering the causal relationships among financial factors that
affect the portfolios and thus, simulating stress testing scenarios with a
higher accuracy and lower computational complexity than conventional Monte
Carlo Simulations.
Daniele Ramazzotti, Marco S. Nobile, Marco Antoniotti, Alex Graudenzi
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
One of the critical issues when adopting Bayesian networks (BNs) to model
dependencies among random variables is to “learn” their structure, given the
huge search space of possible solutions, i.e., all the possible direct acyclic
graphs. This is a well-known NP-hard problem, which is also complicated by
known pitfalls such as the issue of I-equivalence among different structures.
In this work we restrict the investigations on BN structure learning to a
specific class of networks, i.e., those representing the dynamics of phenomena
characterized by the monotonic accumulation of events. Such phenomena allow to
set specific structural constraints based on Suppes’ theory of probabilistic
causation and, accordingly, to define constrained BNs, named Suppes-Bayes
Causal Networks (SBCNs). We here investigate the structure learning of SBCNs
via extensive simulations with various state-of-the-art search strategies, such
as canonical local search techniques and Genetic Algorithms. Among the main
results we show that Suppes’ constraints deeply simplify the learning task, by
reducing the solution search space and providing a temporal ordering on the
variables.
Liangzhen Lai, Naveen Suda, Vikas Chandra
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Deep convolutional neural network (CNN) inference requires significant amount
of memory and computation, which limits its deployment on embedded devices. To
alleviate these problems to some extent, prior research utilize low precision
fixed-point numbers to represent the CNN weights and activations. However, the
minimum required data precision of fixed-point weights varies across different
networks and also across different layers of the same network. In this work, we
propose using floating-point numbers for representing the weights and
fixed-point numbers for representing the activations. We show that using
floating-point representation for weights is more efficient than fixed-point
representation for the same bit-width and demonstrate it on popular large-scale
CNNs such as AlexNet, SqueezeNet, GoogLeNet and VGG-16. We also show that such
a representation scheme enables compact hardware multiply-and-accumulate (MAC)
unit design. Experimental results show that the proposed scheme reduces the
weight storage by up to 36% and power consumption of the hardware multiplier by
up to 50%.
Maher Al-Shoukairi, Philip Schniter, Bhaskar D. Rao
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
In this paper, we present an algorithm for the sparse signal recovery problem
that incorporates damped Gaussian generalized approximate message passing
(GGAMP) into Expectation-Maximization (EM)-based sparse Bayesian learning
(SBL). In particular, GGAMP is used to implement the E-step in SBL in place of
matrix inversion, leveraging the fact that GGAMP is guaranteed to converge with
appropriate damping. The resulting GGAMP-SBL algorithm is much more robust to
arbitrary measurement matrix (oldsymbol{A}) than the standard damped GAMP
algorithm while being much lower complexity than the standard SBL algorithm. We
then extend the approach from the single measurement vector (SMV) case to the
temporally correlated multiple measurement vector (MMV) case, leading to the
GGAMP-TSBL algorithm. We verify the robustness and computational advantages of
the proposed algorithms through numerical experiments.
Stefano Beretta, Mauro Castelli, Ivo Goncalves, Ivan Merelli, Daniele Ramazzotti
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
Gene and protein networks are very important to model complex large-scale
systems in molecular biology. Inferring or reverseengineering such networks can
be defined as the process of identifying gene/protein interactions from
experimental data through computational analysis. However, this task is
typically complicated by the enormously large scale of the unknowns in a rather
small sample size. Furthermore, when the goal is to study causal relationships
within the network, tools capable of overcoming the limitations of correlation
networks are required. In this work, we make use of Bayesian Graphical Models
to attach this problem and, specifically, we perform a comparative study of
different state-of-the-art heuristics, analyzing their performance in inferring
the structure of the Bayesian Network from breast cancer data.
Daniele Ramazzotti, Marco S. Nobile, Paolo Cazzaniga, Giancarlo Mauri, Marco Antoniotti
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
The emergence and development of cancer is a consequence of the accumulation
over time of genomic mutations involving a specific set of genes, which
provides the cancer clones with a functional selective advantage. In this work,
we model the order of accumulation of such mutations during the progression,
which eventually leads to the disease, by means of probabilistic graphic
models, i.e., Bayesian Networks (BNs). We investigate how to perform the task
of learning the structure of such BNs, according to experimental evidence,
adopting a global optimization meta-heuristics. In particular, in this work we
rely on Genetic Algorithms, and to strongly reduce the execution time of the
inference — which can also involve multiple repetitions to collect
statistically significant assessments of the data — we distribute the
calculations using both multi-threading and a multi-node architecture. The
results show that our approach is characterized by good accuracy and
specificity; we also demonstrate its feasibility, thanks to a 84x reduction of
the overall execution time with respect to a traditional sequential
implementation.
Stephen Giguere, Francisco Garcia, Sridhar Mahadevan
Comments: 9 pages, 3 Figures
Subjects: Learning (cs.LG)
Although many machine learning algorithms involve learning subspaces with
particular characteristics, optimizing a parameter matrix that is constrained
to represent a subspace can be challenging. One solution is to use Riemannian
optimization methods that enforce such constraints implicitly, leveraging the
fact that the feasible parameter values form a manifold. While Riemannian
methods exist for some specific problems, such as learning a single subspace,
there are more general subspace constraints that offer additional flexibility
when setting up an optimization problem, but have not been formulated as a
manifold.
We propose the partitioned subspace (PS) manifold for optimizing matrices
that are constrained to represent one or more subspaces. Each point on the
manifold defines a partitioning of the input space into mutually orthogonal
subspaces, where the number of partitions and their sizes are defined by the
user. As a result, distinct groups of features can be learned by defining
different objective functions for each partition. We illustrate the properties
of the manifold through experiments on multiple dataset analysis and domain
adaptation.
Insu Han, Prabhanjan Kambadur, Kyoungsoo Park, Jinwoo Shin
Subjects: Discrete Mathematics (cs.DM); Learning (cs.LG)
Determinantal point processes (DPPs) are popular probabilistic models that
arise in many machine learning tasks, where distributions of diverse sets are
characterized by matrix determinants. In this paper, we develop fast algorithms
to find the most likely configuration (MAP) of large-scale DPPs, which is
NP-hard in general. Due to the submodular nature of the MAP objective, greedy
algorithms have been used with empirical success. Greedy implementations
require computation of log-determinants, matrix inverses or solving linear
systems at each iteration. We present faster implementations of the greedy
algorithms by utilizing the complementary benefits of two log-determinant
approximation schemes: (a) first-order expansions to the matrix log-determinant
function and (b) high-order expansions to the scalar log function with
stochastic trace estimators. In our experiments, our algorithms are orders of
magnitude faster than their competitors, while sacrificing marginal accuracy.
Ashish Bora, Ajil Jalal, Eric Price, Alexandros G. Dimakis
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Learning (cs.LG)
The goal of compressed sensing is to estimate a vector from an
underdetermined system of noisy linear measurements, by making use of prior
knowledge on the structure of vectors in the relevant domain. For almost all
results in this literature, the structure is represented by sparsity in a
well-chosen basis. We show how to achieve guarantees similar to standard
compressed sensing but without employing sparsity at all. Instead, we suppose
that vectors lie near the range of a generative model (G: mathbb{R}^k o
mathbb{R}^n). Our main theorem is that, if (G) is (L)-Lipschitz, then roughly
(O(k log L)) random Gaussian measurements suffice for an (ell_2/ell_2)
recovery guarantee. We demonstrate our results using generative models from
published variational autoencoder and generative adversarial networks. Our
method can use (5)-(10)x fewer measurements than Lasso for the same accuracy.
Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio
Comments: 15 pages with appendix, 7 figures, 4 tables. Conference paper in 5th International Conference on Learning Representations (ICLR 2017)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
This paper proposes a new model for extracting an interpretable sentence
embedding by introducing self-attention. Instead of using a vector, we use a
2-D matrix to represent the embedding, with each row of the matrix attending on
a different part of the sentence. We also propose a self-attention mechanism
and a special regularization term for the model. As a side effect, the
embedding comes with an easy way of visualizing what specific parts of the
sentence are encoded into the embedding. We evaluate our model on 3 different
tasks: author profiling, sentiment classification, and textual entailment.
Results show that our model yields a significant performance gain compared to
other sentence embedding methods in all of the 3 tasks.
Eric Balkanski, Umar Syed, Sergei Vassilvitskii
Subjects: Computer Science and Game Theory (cs.GT); Learning (cs.LG)
We study the cost sharing problem for cooperative games in situations where
the cost function (C) is not available via oracle queries, but must instead be
derived from data, represented as tuples ((S, C(S))), for different subsets (S)
of players. We formalize this approach, which we call statistical cost sharing,
and consider the computation of the core and the Shapley value, when the tuples
are drawn from some distribution (mathcal{D}).
Previous work by Balcan et al. in this setting showed how to compute cost
shares that satisfy the core property with high probability for limited classes
of functions. We expand on their work and give an algorithm that computes such
cost shares for any function with a non-empty core. We complement these results
by proving an inapproximability lower bound for a weaker relaxation.
We then turn our attention to the Shapley value. We first show that when cost
functions come from the family of submodular functions with bounded curvature,
(kappa), the Shapley value can be approximated from samples up to a (sqrt{1 –
kappa}) factor, and that the bound is tight. We then define statistical
analogues of the Shapley axioms, and derive a notion of statistical Shapley
value. We show that these can always be approximated arbitrarily well for
general functions over any distribution (mathcal{D}).
Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing
Comments: To appear in CVPR 2017 as a spotlight paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
This paper develops a general framework for learning interpretable data
representation via Long Short-Term Memory (LSTM) recurrent neural networks over
hierarchal graph structures. Instead of learning LSTM models over the pre-fixed
structures, we propose to further learn the intermediate interpretable
multi-level graph structures in a progressive and stochastic way from data
during the LSTM network optimization. We thus call this model the
structure-evolving LSTM. In particular, starting with an initial element-level
graph representation where each node is a small data element, the
structure-evolving LSTM gradually evolves the multi-level graph representations
by stochastically merging the graph nodes with high compatibilities along the
stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two
connected nodes from their corresponding LSTM gate outputs, which is used to
generate a merging probability. The candidate graph structures are accordingly
generated where the nodes are grouped into cliques with their merging
probabilities. We then produce the new graph structure with a
Metropolis-Hasting algorithm, which alleviates the risk of getting stuck in
local optimums by stochastic sampling with an acceptance probability. Once a
graph structure is accepted, a higher-level graph is then constructed by taking
the partitioned cliques as its nodes. During the evolving process,
representation becomes more abstracted in higher-levels where redundant
information is filtered out, allowing more efficient propagation of long-range
data dependencies. We evaluate the effectiveness of structure-evolving LSTM in
the application of semantic object parsing and demonstrate its advantage over
state-of-the-art LSTM models on standard benchmarks.
Xiaodan Liang, Lisa Lee, Eric P. Xing
Comments: This manuscript is accepted by CVPR 2017 as a spotlight paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Despite progress in visual perception tasks such as image classification and
detection, computers still struggle to understand the interdependency of
objects in the scene as a whole, e.g., relations between objects or their
attributes. Existing methods often ignore global context cues capturing the
interactions among different object instances, and can only recognize a handful
of types by exhaustively training individual detectors for all possible
relationships. To capture such global interdependency, we propose a deep
Variation-structured Reinforcement Learning (VRL) framework to sequentially
discover object relationships and attributes in the whole image. First, a
directed semantic action graph is built using language priors to provide a rich
and compact representation of semantic correlations between object categories,
predicates, and attributes. Next, we use a variation-structured traversal over
the action graph to construct a small, adaptive action set for each step based
on the current state and historical actions. In particular, an ambiguity-aware
object mining scheme is used to resolve semantic ambiguity among object
categories that the object detector fails to distinguish. We then make
sequential predictions using a deep RL framework, incorporating global context
cues and semantic embeddings of previously extracted phrases in the state
vector. Our experiments on the Visual Relationship Detection (VRD) dataset and
the large-scale Visual Genome dataset validate the superiority of VRL, which
can achieve significantly better detection results on datasets involving
thousands of relationship and attribute types. We also demonstrate that VRL is
able to predict unseen types embedded in our action graph by learning
correlations on shared graph nodes.
Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrerro Moreno, Ben Glocker, Daniel Rueckert
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Exploiting the wealth of imaging and non-imaging information for disease
prediction tasks requires models capable of representing, at the same time,
individual features as well as data associations between subjects from
potentially large populations. Graphs provide a natural framework for such
tasks, yet previous graph-based approaches focus on pairwise similarities
without modelling the subjects’ individual characteristics and features. On the
other hand, relying solely on subject-specific imaging feature vectors fails to
model the interaction and similarity between subjects, which can reduce
performance. In this paper, we introduce the novel concept of Graph
Convolutional Networks (GCN) for brain analysis in populations, combining
imaging and non-imaging data. We represent populations as a sparse graph where
its vertices are associated with image-based feature vectors and the edges
encode phenotypic information. This structure was used to train a GCN model on
partially labelled graphs, aiming to infer the classes of unlabelled nodes from
the node features and pairwise associations between subjects. We demonstrate
the potential of the method on the challenging ADNI and ABIDE databases, as a
proof of concept of the benefit from integrating contextual information in
classification tasks. This has a clear impact on the quality of the
predictions, leading to 69.5% accuracy for ABIDE (outperforming the current
state of the art of 66.8%) and 77% for ADNI for prediction of MCI conversion,
significantly outperforming standard linear classifiers where only individual
features are considered.
Themistoklis Charalambous, Su Min Kim, Nikolaos Nomikos, Mats Bengtsson, Mikael Johansson
Comments: 13 pages, 7 figures, submitted to IEEE Transactions on Wireless Communications
Subjects: Information Theory (cs.IT)
We study a cooperative network with a buffer-aided multi-antenna source,
multiple half-duplex (HD) buffer-aided relays and a single destination. Such a
setup could represent a cellular downlink scenario, in which the source can be
a more powerful wireless device with a buffer and multiple antennas, while a
set of intermediate less powerful devices are used as relays to reach the
destination. The main target is to recover the multiplexing loss of the network
by having the source and a relay to simultaneously transmit their information
to another relay and the destination, respectively. Successive transmissions in
such a cooperative network, however, cause inter-relay interference (IRI).
First, by assuming global channel state information (CSI), we show that the
detrimental effect of IRI can be alleviated by precoding at the source,
mitigating or even fully canceling the interference. A cooperative relaying
policy is proposed that employs a joint precoding design and relay-pair
selection. Note that both fixed rate and adaptive rate transmissions can be
considered. For the case when channel state information is only available at
the receiving side (CSIR), we propose a relay selection policy that employs a
phase alignment technique to reduce the IRI. The performance of the two
proposed relay pair selection policies are evaluated and compared with other
state-of-the-art relaying schemes in terms of outage and throughput. The
results show that the use of a powerful source can provide considerable
performance improvements.
Shanxiang Lyu, Cong Ling
Comments: submitted to a journal in Nov. 2016
Subjects: Information Theory (cs.IT)
There exists two issues among popular lattice reduction (LR) algorithms that
should cause our concern. The first one is Korkine Zolotarev (KZ) and Lenstra
Lenstra Lovasz (LLL) algorithms may increase the lengths of basis vectors. The
other is KZ reduction suffers much worse performance than Minkowski reduction
in terms of providing short basis vectors, despite its superior theoretical
upper bounds. To address these limitations, we improve the size reduction steps
in KZ and LLL to set up two new efficient algorithms, referred to as boosted KZ
and LLL, for solving the shortest basis problem (SBP) with exponential and
polynomial complexity, respectively. Both of them offer better actual
performance than their classic counterparts, and the performance bounds for KZ
are also improved. We apply them to designing integer-forcing (IF) linear
receivers for multi-input multi-output (MIMO) communications. Our simulations
confirm their rate and complexity advantages.
Alessandro Neri, Joachim Rosenthal, Davide Schipani
Comments: to appear in Cryptography and Physical Layer Security, Lecture Notes in Electrical Engineering, Springer
Subjects: Information Theory (cs.IT)
Fuzzy authentication allows authentication based on the fuzzy matching of two
objects, for example based on the similarity of two strings in the Hamming
metric, or on the similiarity of two sets in the set difference metric. Aim of
this paper is to show other models and algorithms of secure fuzzy
authentication, which can be performed using the rank metric. A few schemes are
presented which can then be applied in different scenarios and applications.
Divya U. S., B. Sundar Rajan
Comments: 9 pages, 6 figures and 2 tables
Subjects: Information Theory (cs.IT)
Index coded PSK modulation over an AWGN broadcast channel, for a given index
coding problem (ICP) is studied. For a chosen index code and an arbitrary
mapping (of broadcast vectors to PSK signal points), we have derived a decision
rule for the maximum likelihood (ML) decoder. The message error performance of
a receiver at high SNR is characterized by a parameter called PSK Index Coding
Gain (PSK-ICG). The PSK-ICG of a receiver is determined by a metric called
minimum inter-set distance. For a given ICP with an order of priority among the
receivers, and a chosen (2^N)-PSK constellation we propose an algorithm to find
(index code, mapping) pairs, each of which gives the best performance in terms
of PSK-ICG of the receivers. No other pair of index code (of length (N) with
(2^N) broadcast vectors) and mapping can give a better PSK-ICG for the highest
priority receiver. Also, given that the highest priority receiver achieves its
best performance, the next highest priority receiver achieves its maximum gain
possible and so on in the specified order or priority.
Jie Gong, Xiang Chen
Comments: 30 pages, 10 figures, submitted for possible publication
Subjects: Information Theory (cs.IT)
Non-orthogonal multiple access (NOMA) is a candidate multiple access scheme
in 5G systems for the simultaneous access of tremendous number of wireless
nodes. On the other hand, RF-enabled wireless energy harvesting is a promising
technology for self-sustainable wireless nodes. In this paper, we consider a
NOMA system where the near user harvests energy from the strong radio signal to
power-on the information decoder. A generalized energy harvesting scheme is
proposed by combining the conventional time switching and power splitting
scheme. The achievable rate region of the proposed scheme is characterized
under both constant and dynamic decoding power consumption models. If the
decoding power is constant, the achievable rate region can be found by solving
two convex optimization subproblems, and the regions for two special cases:
time switching and power splitting, are characterized in closed-form. If the
decoding power is proportional to data rate, the achievable rate region can be
found by exhaustive search algorithm. Numerical results show that the
achievable rate region of the proposed generalized scheme is larger than those
of time switching scheme and power splitting scheme, and rate-dependent decoder
design helps to enlarge the achievable rate region.
Sha Hu, Fredrik Rusek
Comments: 31 pages, 13 figures, submitted to IEEE Transactions on Wireless Communications in Aug. 2016
Subjects: Information Theory (cs.IT)
In this paper, we consider precoder designs for multiuser
multiple-input-single-output (MISO) broadcasting channels. Instead of using a
traditional linear zero-forcing (ZF) precoder, we propose a generalized ZF
(GZF) precoder in conjunction with successive dirty-paper coding (DPC) for
data-transmissions, namely, the GZF-DP precoder, where the suffix lq{}DP
q{}
stands for lq{}dirty-paper
q{}. The GZF-DP precoder is designed to generate a
band-shaped and lower-triangular effective channel (vec{F}) such that only the
entries along the main diagonal and the (
u) first lower-diagonals can take
non-zero values. Utilizing the successive DPC, the known non-causal inter-user
interferences from the other (up to) (
u) users are canceled through
successive encoding. We analyze optimal GZF-DP precoder designs both for
sum-rate and minimum user-rate maximizations. Utilizing Lagrange multipliers,
the optimal precoders for both cases are solved in closed-forms in relation to
optimal power allocations. For the sum-rate maximization, the optimal power
allocation can be found through water-filling, but with modified water-levels
depending on the parameter (
u). While for the minimum user-rate maximization
that measures the quality of the service (QoS), the optimal power allocation is
directly solved in closed-form which also depends on (
u). Moreover, we
propose two low-complexity user-ordering algorithms for the GZF-DP precoder
designs for both maximizations, respectively. We show through numerical results
that, the proposed GZF-DP precoder with a small (
u) ((leq!3)) renders
significant rate increments compared to the previous precoder designs such as
the linear ZF and user-grouping based DPC (UG-DP) precoders.
Siqing Xiong, Lijun Wang, Kyung Sup Kwak, Zhiquan Bai, Jiang Wang, Qiang Li, Tao Han
Comments: 6 pages, 4 figures, submitted to IWCMC 2017
Subjects: Information Theory (cs.IT)
Based on the distinguishing features of multi-tier millimeter wave (mmWave)
networks such as different transmit powers, different directivity gains from
directional beamforming alignment and path loss laws for line-of-sight (LOS)
and non-line-of-sight (NLOS) links, we introduce a normalization model to
simplify the analysis of multi-tier mmWave cellular networks. The highlight of
the model is that we convert a multi-tier mmWave cellular network into a
single-tier mmWave network, where all the base stations (BSs) have the same
normalized transmit power 1 and the densities of BSs scaled by LOS or NLOS
scaling factors respectively follow piecewise constant function which have
multiple demarcation points. On this basis, expressions for computing the
coverage probability are obtained in general case with beamforming alignment
errors and in the special case with perfect beamforming alignment in the
communication. According to corresponding numerical exploration, we conclude
that the normalization model for multi-tier mmWave cellular networks fully
meets requirements of network performance analysis, and it is simpler and
clearer than the untransformed model. Besides, an unexpected but sensible
finding is that there is an optimal beamwidth that maximizes coverage
probability in the case with beamforming alignment errors.
Sergey V. Zhidkov
Subjects: Information Theory (cs.IT)
In the context of orthogonal frequency division multiplexing (OFDM),
nonlinearity is usually considered a highly undesirable phenomenon. In this
letter, we demonstrate that if a suitable memoryless nonlinearity is applied to
the uncoded OFDM signal on the transmitter side, the performance of the
communication system may be similar to that of state-of-the-art,
capacity-approaching coded modulation schemes without requiring any additional
forward error correction coding. The proposed modulation technique could be
used directly or as a precoder for a conventional OFDM modulator, resulting in
a system possessing all benefits of OFDM along with reduced crest-factor (CF).
Sung Sik Nam, Duckdong Hwang, Janghoon Yang
Subjects: Information Theory (cs.IT)
In this paper, we deal with the performance analysis of full-duplex relaying
in decode-&-forward cooperative networks with multiple-antenna terminals. More
specifically, by analyzing the end-to-end statistics, we derive the accurate
closed-form expressions of the end-to-end outage probability for both transmit
and receive ZFBF scheme over Rayleigh fading environments. Some selected
results show some interesting observations useful for system designers.
Specifically, we observe that the outage performance can be improved by
adopting the joint ZF-based precoding with different antenna configurations.
Adel Alahmadi, Cem Güneri, Buket Özkaya, Hatoon Shoaib, Patrick Solé
Comments: arXiv admin note: text overlap with arXiv:1606.00815
Subjects: Information Theory (cs.IT)
Linear codes with complementary-duals (LCD) are linear codes that intersect
with their dual trivially. Multinegacirculant codes of index (2) that are LCD
are characterized algebraically and some good codes are found in this family.
Exact enumeration is performed for indices 2 and 3, and for all indices (t) for
a special case of the co-index by using their concatenated structure.
Asymptotic existence results are derived for the special class of such codes
that are one-generator and have co-index a power of two by means of Dickson
polynomials. This shows that there are infinite families of LCD
multinegacirculant codes with relative distance satisfying a modified
Varshamov-Gilbert bound.
Fei Liu, Marina Petrova
Comments: This is the author’s version of an article that has been submitted to IEEE Journal on Selected Areas in Communications
Subjects: Information Theory (cs.IT)
In this paper, we present the first analytical solution for performance
analysis of proportional fair scheduling (PFS) in downlink non-orthogonal
multiple access (NOMA) systems. Assuming an ideal NOMA system with an arbitrary
number of multiplexed users, we derive a closed-form solution of the optimal
power allocation for PFS and design a low-complexity algorithm for joint power
allocation and user set selection. We develop an analytical model to analyze
the throughput performance of this optimal solution based on stochastic channel
modeling. Our analytical performance is proved to be the upper bound for PFS
and is used to estimate user data rates and overall throughput in practical
NOMA systems. We conduct system-level simulations to evaluate the accuracy of
our data rate estimation. The simulation results verify our analysis on the
upper bound of PFS performance in NOMA and confirm that using the analytical
performance for data rate estimation guarantees high accuracy. The impact of
partial and imperfect channel information on the estimation performance is
investigated as well.
Adel Alahmadi, Cem Güneri, Hatoon Shoaib, Patrick Solé
Comments: 12 pages
Subjects: Information Theory (cs.IT)
We study complementary information set codes of length (tn) and dimension (n)
of order (t) called ((t-)CIS code for short). Quasi-cyclic and quasi-twisted
(t)-CIS codes are enumerated by using their concatenated structure. Asymptotic
existence results are derived for one-generator and have co-index (n) by
Artin’s conjecture for quasi cyclic and special case for quasi twisted. This
shows that there are infinite families of long QC and QT (t)-CIS codes with
relative distance satisfying a modified Varshamov-Gilbert bound for rate (1/t)
codes.
Similar results are defined for the new and more general class of
quasi-polycyclic codes introduced recently by Berger and Amrani.
Gokhan M. Guvensen, Ender Ayanoglu
Comments: 7 pages, 3 figures, accepted by IEEE ICC 2017 Wireless Communications Symposium
Subjects: Information Theory (cs.IT)
In this paper, the problem of sequential beam construction and adaptive
channel estimation based on reduced rank (RR) Kalman filtering for
frequency-selective massive multiple-input multiple-output (MIMO) systems
employing single-carrier (SC) in time division duplex (TDD) mode are
considered. In two-stage beamforming, a new algorithm for statistical
pre-beamformer design is proposed for spatially correlated time-varying
wideband MIMO channels under the assumption that the channel is a stationary
Gauss-Markov random process. The proposed algorithm yields a nearly optimal
pre-beamformer whose beam pattern is designed sequentially with low complexity
by taking the user-grouping into account, and exploiting the properties of
Kalman filtering and associated prediction error covariance matrices. The
resulting design, based on the second order statistical properties of the
channel, generates beamspace on which the RR Kalman estimator can be realized
as accurately as possible. It is observed that the adaptive channel estimation
technique together with the proposed sequential beamspace construction shows
remarkable robustness to the pilot interference. This comes with significant
reduction in both pilot overhead and dimension of the pre-beamformer lowering
both hardware complexity and power consumption.
Alireza Zaeemzadeh, Mohsen Joneidi, Nazanin Rahnavard
Comments: 6 pages, 8 figures, Conference on Information Sciences and Systems (CISS 2017) Baltimore, Maryland
Subjects: Applications (stat.AP); Information Theory (cs.IT)
In this paper, adaptive non-uniform compressive sampling (ANCS) of
time-varying signals, which are sparse in a proper basis, is introduced. ANCS
employs the measurements of previous time steps to distribute the sensing
energy among coefficients more intelligently. To this aim, a Bayesian inference
method is proposed that does not require any prior knowledge of importance
levels of coefficients or sparsity of the signal. Our numerical simulations
show that ANCS is able to achieve the desired non-uniform recovery of the
signal. Moreover, if the signal is sparse in canonical basis, ANCS can reduce
the number of required measurements significantly.
Ashish Bora, Ajil Jalal, Eric Price, Alexandros G. Dimakis
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Learning (cs.LG)
The goal of compressed sensing is to estimate a vector from an
underdetermined system of noisy linear measurements, by making use of prior
knowledge on the structure of vectors in the relevant domain. For almost all
results in this literature, the structure is represented by sparsity in a
well-chosen basis. We show how to achieve guarantees similar to standard
compressed sensing but without employing sparsity at all. Instead, we suppose
that vectors lie near the range of a generative model (G: mathbb{R}^k o
mathbb{R}^n). Our main theorem is that, if (G) is (L)-Lipschitz, then roughly
(O(k log L)) random Gaussian measurements suffice for an (ell_2/ell_2)
recovery guarantee. We demonstrate our results using generative models from
published variational autoencoder and generative adversarial networks. Our
method can use (5)-(10)x fewer measurements than Lasso for the same accuracy.
Daniele Bartoli, Maria Montanucci, Giovanni Zini
Subjects: Combinatorics (math.CO); Information Theory (cs.IT)
In this paper, algebraic-geometric (AG) codes associated with the GGS maximal
curve are investigated. The Weierstrass semigroup at all (mathbb
F_{q^2})-rational points of the curve is determined; the Feng-Rao designed
minimum distance is computed for infinite families of such codes, as well as
the automorphism group. As a result, some linear codes with better relative
parameters with respect to one-point Hermitian codes are discovered. Classes of
quantum and convolutional codes are provided relying on the constructed AG
codes; a quantum MDS code with parameters ([[3968,3786,92]]_{2^{10}}) is
obtained.
L.L. Frumin, A.A. Gelash, S.K. Turitsyn
Comments: 6 pages plus supplementary materials, 4 figures
Subjects: Exactly Solvable and Integrable Systems (nlin.SI); Information Theory (cs.IT); Optics (physics.optics)
Remarkable mathematical properties of the integrable nonlinear Schr”odinger
equation (NLSE) can offer advanced solutions for the mitigation of nonlinear
signal distortions in optical fibre links. Fundamental optical soliton,
continuous and discrete eigenvalues of the nonlinear spectrum have already been
considered for transmission of information in fibre-optic channels. Here we
propose to apply signal modulation to the kernel of the
Gelfand-Levitan-Marchenko equations (GLME) that offers the advantage of a
relatively simple decoder design. First, we describe an approach based on
exploiting the general N-soliton solution of the NLSE for simultaneous coding
of (N) symbols involving (4 imes N) coding parameters. As a specific elegant
sub-class of the general schemes we introduce a soliton orthogonal frequency
division multiplexing (SOFDM) method. This method is based on the choice of
identical imaginary parts of the N-soliton solution eigenvalues, corresponding
to equidistant soliton frequencies making it similar to the conventional OFDM
scheme, thus, allowing use of the efficient fast Fourier transform algorithm to
recover the data. Then, we demonstrate how to use this new approach to control
signal parameters in the case of the continuous spectrum.
Xinran Chen, Zhe Chen, Sai Xie, Yongshuai Shao
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
Cognitive radio technology enables improving the utilization efficiency of
the precious and scarce radio spectrum. How to maximize the overall spectrum
efficiency while minimizing the conflicts with primary users is vital to
cognitive radio. The key is to make the right decisions of accessing the
spectrum. Spectrum prediction can be employed to predict the future states of a
spectrum band using previous states of the spectrum band, whereas spectrum
recommendation recommends secondary users a subset of available spectrum bands
based on secondary user’s previous experiences of accessing the available
spectrum bands. In this paper, a framework for spectrum decision based on
spectrum prediction and spectrum recommendation is proposed. As a benchmark, a
method based on extreme learning machine (ELM) for single-user spectrum
prediction and a method based on Q-learning for multiple-user spectrum
prediction are proposed. At the stage of spectrum decision, two methods based
on Q-learning andMarkov decision process (MDP), respectively, are also proposed
to enhance the overall performance of spectrum decision. Experimental results
show that the performance of the spectrum decision framework is much better.