Hokchhay Tann, Soheil Hashemi, Iris Bahar, Sherief Reda
Comments: 6 pages
Subjects: Neural and Evolutionary Computing (cs.NE)
While Deep Neural Networks (DNNs) push the state-of-the-art in many machine
learning applications, they often require millions of expensive floating-point
operations for each input classification. This computation overhead limits the
applicability of DNNs to low-power, embedded platforms and incurs high cost in
data centers. This motivates recent interest in designing low-power,
low-latency DNNs based on fixed-point, ternary, or even binary data precision.
While recent works in this area offer promising results, they often lead to
large accuracy drops when compared to the floating-point networks. We propose a
novel approach to map floating-point based DNNs to 8-bit dynamic fixed-point
networks with integer power-of-two weights with no change in network
architecture. Our dynamic fixed-point DNNs allow different radix points between
layers. During inference, power-of-two weights allow multiplications to be
replaced with arithmetic shifts, while the 8-bit fixed-point representation
simplifies both the buffer and adder design. In addition, we propose a hardware
accelerator design to achieve low-power, low-latency inference with
insignificant degradation in accuracy. Using our custom accelerator design with
the CIFAR-10 and ImageNet datasets, we show that our method achieves
significant power and energy savings while increasing the classification
accuracy.
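To illustrate the core arithmetic trick, the following is a minimal Python sketch (not the authors' implementation) of rounding weights to signed integer powers of two and replacing a multiplication by a shift; the exponent range is a hypothetical stand-in for the per-layer radix points described above.

import numpy as np

def quantize_power_of_two(w, exp_min=-7, exp_max=0):
    # Round each weight to a power of two in the log domain (a common
    # approximation); exponent bounds are assumed here, not the paper's
    # exact 8-bit dynamic fixed-point configuration.
    sign = np.sign(w)
    mag = np.abs(w)
    exp = np.clip(np.round(np.log2(np.maximum(mag, 2.0 ** exp_min))),
                  exp_min, exp_max).astype(int)
    q = sign * 2.0 ** exp
    q[mag == 0] = 0.0
    return q, exp

def shift_multiply(x_fixed, exp):
    # Multiply an integer fixed-point value by 2**exp using shifts only.
    return x_fixed << exp if exp >= 0 else x_fixed >> -exp

w = np.array([0.07, -0.4, 0.9])
q, e = quantize_power_of_two(w)
print(q, e)                    # [ 0.0625 -0.5  1. ] [-4 -1  0]
print(shift_multiply(96, -1))  # 96 * 2**-1 = 48

Because every weight is a power of two, each multiply-accumulate reduces to a shift and an add, which is the source of the hardware savings claimed above.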
Thangarajah Akilan (1), Q.M. Jonathan Wu (1), Wei Jiang (2) ((1) Department of Electrical and Computer Engineering, University of Windsor, Canada, (2) Department of Control Science and Engineering, Zhejiang University, China)
Comments: 5 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
With the rapidly growing use of digital images, automatic image
categorization has become a preeminent research area. The field has broadened
and adopted many algorithms over time, and multi-feature (generally,
hand-engineered feature) based image characterization has come in handy for
improving accuracy. Recently, it has been shown in machine learning that
features extracted from pre-trained deep convolutional neural networks (DCNNs
or ConvNets) can improve classification accuracy. Hence, in this paper, we
further investigate a feature embedding strategy to exploit cues from multiple
DCNNs. We derive a generalized feature space by embedding three different DCNN
bottleneck features, weighted according to their softmax cross-entropy loss.
Test outcomes on six different object classification datasets and an action
classification dataset show that, regardless of variation in image statistics
and tasks, the proposed multi-DCNN bottleneck feature fusion is well suited to
image classification tasks and is an effective complement to individual DCNNs.
Comparisons with existing fusion-based image classification approaches show
that the proposed method surpasses the state-of-the-art methods and produces
results competitive with fully trained DCNNs as well.
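As a concrete illustration of the embedding strategy, here is a minimal Python sketch (under assumed details) that concatenates L2-normalized bottleneck features from three networks, weighting each by the inverse of its softmax cross-entropy loss; the paper's exact weighting rule may differ.

import numpy as np

def fuse_bottleneck_features(feats, ce_losses):
    # Assumed rule: lower validation cross-entropy loss -> larger weight.
    losses = np.asarray(ce_losses, dtype=float)
    weights = (1.0 / losses) / np.sum(1.0 / losses)
    parts = []
    for f, w in zip(feats, weights):
        f = f / (np.linalg.norm(f) + 1e-12)  # per-network L2 normalization
        parts.append(w * f)
    return np.concatenate(parts)

# Toy bottleneck vectors from three hypothetical DCNNs.
f1, f2, f3 = np.random.rand(2048), np.random.rand(1536), np.random.rand(1024)
fused = fuse_bottleneck_features([f1, f2, f3], ce_losses=[0.8, 1.2, 1.0])
print(fused.shape)  # (4608,) generalized feature space

A downstream classifier would then be trained on the fused vectors.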
Nan Yang, Rui Wang, Daniel Cremers
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Monocular visual odometry (VO), which incrementally estimates camera poses
and local 3D maps, is the key component of monocular simultaneous
localization and mapping (SLAM). The two main branches of the VO family, namely
feature-based and so-called direct methods, have seen tremendous improvements
in accuracy, robustness and efficiency, and have grown rapidly in popularity
over recent years. Nevertheless, no comprehensive evaluation between them has
been performed. In this paper, we compare the state-of-the-art feature-based
and direct VO methods, aiming at an exhaustive evaluation under diverse
real-world settings. For the first time, we evaluate their performance on
images with and without photometric calibration, their sensitivity to camera
motion bias, and their robustness to the rolling-shutter effect. We discuss
possible reasons for, and give explanations of, all our experimental results.
Yair Rivenson, Yibo Zhang, Harun Gunaydin, Da Teng, Aydogan Ozcan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Learning (cs.LG); Applied Physics (physics.app-ph); Optics (physics.optics)
Phase recovery from intensity-only measurements forms the heart of coherent
imaging techniques and holography. Here we demonstrate that a neural network
can learn to perform phase recovery and holographic image reconstruction after
appropriate training. This deep learning-based approach provides an entirely
new framework to conduct holographic imaging by rapidly eliminating twin-image
and self-interference related spatial artifacts. Compared to existing
approaches, this neural-network-based method is significantly faster to
compute, and it reconstructs improved phase and amplitude images of the objects
using only one hologram; i.e., it requires fewer measurements in addition to
being computationally faster. We validated this method by reconstructing
phase and amplitude images of various samples, including blood and Pap smears,
and tissue sections. These results are broadly applicable to any phase recovery
problem, and highlight that, through machine learning, challenging problems in
imaging science can be overcome, providing new avenues to design powerful
computational imaging systems.
Amanda Song, Linjie Li, Chad Atalla, Garrison Cottrell
Comments: 10 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Humans make complex inferences on faces, ranging from objective properties
(gender, ethnicity, expression, age, identity, etc) to subjective judgments
(facial attractiveness, trustworthiness, sociability, friendliness, etc). While
the objective aspects of face perception have been extensively studied,
relatively fewer computational models have been developed for the social
impressions of faces. Bridging this gap, we develop a method to predict human
impressions of faces in 40 subjective social dimensions, using deep
representations from state-of-the-art neural networks. We find that model
performance grows as the human consensus on a face trait increases, and that
model predictions outperform human groups in correlation with human averages.
This illustrates the learnability of subjective social perception of faces,
especially when there is high human consensus. Our system can be used to decide
which photographs from a personal collection will make the best impression. The
results are significant for the field of social robotics, demonstrating that
robots can learn the subjective judgments defining the underlying fabric of
human interaction.
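The general recipe (a regularized linear readout on deep face representations, evaluated against averaged human ratings) can be sketched in Python with scikit-learn; the features, ratings and hyperparameters below are synthetic placeholders, not the paper's setup.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2048))   # stand-in for deep face representations
# Stand-in for mean human ratings on one social dimension.
y = X[:, :10].sum(axis=1) + rng.normal(scale=0.5, size=500)

model = Ridge(alpha=10.0)          # regularized linear readout on deep features
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("mean cross-validated R^2:", scores.mean())

Correlating such predictions per trait with held-out human averages yields the consensus-dependent performance pattern described above.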
Hsiou-Yuan Liu, Dehong Liu, Hassan Mansour, Petros T. Boufounos, Laura Waller, Ulugbek S. Kamilov
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Multiple scattering of an electromagnetic wave as it passes through an object
is a fundamental problem that limits the performance of current imaging
systems. In this paper, we describe a new technique, called Series Expansion
with Accelerated Gradient Descent on the Lippmann-Schwinger Equation (SEAGLE),
for
robust imaging under multiple scattering based on a combination of a new
nonlinear forward model and a total variation (TV) regularizer. The proposed
forward model can account for multiple scattering, which makes it advantageous
in applications where linear models are inaccurate. Specifically, it
corresponds to a series expansion of the scattered wave with an
accelerated-gradient method. This expansion guarantees convergence even for
strongly scattering objects. One of our key insights is that it is possible to
obtain an explicit formula for computing the gradient of our nonlinear forward
model with respect to the unknown object, thus enabling fast image
reconstruction with the state-of-the-art fast iterative shrinkage/thresholding
algorithm (FISTA). The proposed method is validated on both simulated and
experimentally measured data.
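For reference, the generic FISTA iteration used for the reconstruction can be sketched as follows; the gradient and proximal operators below are illustrative (least squares plus an L1 prox standing in for the paper's nonlinear scattering gradient and TV regularizer).

import numpy as np

def fista(grad_f, prox_g, x0, step, n_iter=200):
    # Standard FISTA (Beck & Teboulle): gradient step on the smooth term,
    # proximal step on the regularizer, plus momentum extrapolation.
    x, z, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        x_new = prox_g(z - step * grad_f(z), step)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

# Toy instance: sparse recovery with an L1 prox (soft-thresholding).
A = np.random.randn(30, 50)
b = A @ (np.random.randn(50) * (np.random.rand(50) < 0.1))
grad = lambda x: A.T @ (A @ x - b)
prox = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - 0.1 * s, 0.0)
x_hat = fista(grad, prox, np.zeros(50), step=1.0 / np.linalg.norm(A, 2) ** 2)
print(np.count_nonzero(np.abs(x_hat) > 1e-6), "nonzero coefficients")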
U. A. Nnolim
Comments: 22 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The experimental results of improved underwater image enhancement algorithms
based on partial differential equations (PDEs) are presented in this report.
This second work extends the previous study and incorporates several
improvements into the revised algorithm. Experiments show evidence of the
improvements when the results are compared to previously proposed approaches
and other conventional algorithms found in the literature.
Dufan Wu, Kyungsang Kim, Georges El Fakhri, Quanzheng Li
Comments: 9 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Image denoising techniques are essential to reducing noise levels and
enhancing diagnosis reliability in low-dose computed tomography (CT). Machine
learning based denoising methods have shown great potential in removing the
complex and spatially variant noise in CT images. However, residual artifacts
can appear in the denoised image due to the complexity of the noise. A cascaded
training network was proposed in this work, where the trained CNN was applied
to the training dataset to initiate new training stages and remove artifacts
induced by denoising. A cascade of convolutional neural networks (CNNs) was
built iteratively to achieve better performance with simple CNN structures.
Experiments were carried out on the 2016 Low-dose CT Grand Challenge datasets
to evaluate the method’s performance.
Amelie Royer, Alexander Kolesnikov, Christoph H. Lampert
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We develop a probabilistic technique for colorizing grayscale natural images.
In light of the intrinsic uncertainty of this task, the proposed probabilistic
framework has numerous desirable properties. In particular, our model is able
to produce multiple plausible and vivid colorizations for a given grayscale
image and is one of the first colorization models to provide a proper
stochastic sampling scheme. Moreover, our training procedure is supported by a
rigorous theoretical framework that does not require any ad hoc heuristics and
allows for efficient modeling and learning of the joint pixel color
distribution. We demonstrate strong quantitative and qualitative experimental
results on the CIFAR-10 dataset and the challenging ILSVRC 2012 dataset.
Amir Rosenfeld, John K. Tsotsos
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Given an existing trained neural network, it is often desirable to be able to
add new capabilities without hindering performance of already learned tasks.
Existing approaches either learn sub-optimal solutions, require joint training,
or incur a substantial increment in the number of parameters for each added
task, typically as many as the original network. We propose a method which
fully preserves performance on the original task, with only a small increase
(around 20%) in the number of required parameters while performing on par with
more costly fine-tuning procedures, which typically double the number of
parameters. The learned architecture can be controlled to switch between
various learned representations, enabling a single network to solve a task from
multiple different domains. We conduct extensive experiments showing the
effectiveness of our method and explore different aspects of its behavior.
Akkas Uddin Haque, Ashkan Nejadpak
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper we present a novel method for obstacle avoidance using a
stereo camera. Conventional obstacle avoidance methods and their limitations
are discussed. A new algorithm is developed for real-time obstacle avoidance
which responds faster to unexpected obstacles. In this approach the depth map
is divided into an optimized number of regions and the minimum depth within
each region is assigned as the depth of that region. A fuzzy
controller is designed to create the drive commands for the robot/quadcopter.
The system was tested on multiple paths with different obstacles and the
results demonstrated the high accuracy of the developed system.
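The region-minimum step is straightforward; here is a small Python sketch (the grid size is an assumed parameter, which the paper optimizes):

import numpy as np

def region_depths(depth_map, rows=3, cols=3):
    # Split the depth map into a rows x cols grid and assign each region
    # the minimum depth inside it (the nearest obstacle in that region).
    h, w = depth_map.shape
    mins = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            block = depth_map[i * h // rows:(i + 1) * h // rows,
                              j * w // cols:(j + 1) * w // cols]
            mins[i, j] = block.min()
    return mins

depth = np.random.uniform(0.5, 10.0, size=(120, 160))  # toy stereo depth (m)
print(region_depths(depth))  # 3x3 grid fed to the fuzzy controller

The resulting grid of nearest-obstacle distances is what a fuzzy controller, as described above, would map to drive commands.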
Christoph Lassner, Gerard Pons-Moll, Peter V. Gehler
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present the first image-based generative model of people in clothing in a
full-body setting. We sidestep the commonly used complex graphics rendering
pipeline and the need for high-quality 3D scans of dressed people. Instead, we
learn generative models from a large image database. The main challenge is to
cope with the high variance in human pose, shape and appearance. For this
reason, pure image-based approaches have not been considered so far. We show
that this challenge can be overcome by splitting the generation process into
two parts. First, we learn to generate a semantic segmentation of the body and
clothing. Second, we learn a conditional model on the resulting segments that
creates realistic images. The full model is differentiable and can be
conditioned on pose, shape or color. The results are samples of people in
different clothing items and styles. The proposed model can generate entirely
new people with realistic clothing. In several experiments we present
encouraging results that suggest an entirely data-driven approach to people
generation is possible.
Carlos Guindel, Jorge Beltrán, David Martín, Fernando García
Comments: Submitted to International Conference on Intelligent Transportation Systems 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Sensor setups consisting of a combination of 3D range scanner lasers and
stereo vision systems are becoming a popular choice for on-board perception
systems in vehicles; however, the combined use of both sources of information
implies a tedious calibration process. We present a method for extrinsic
calibration of lidar-stereo camera pairs without user intervention. Our
calibration approach is designed to cope with the constraints commonly found in
automotive setups, such as low sensor resolution and specific sensor poses. To
demonstrate the performance of our method, we also introduce a novel approach
for the quantitative assessment of the calibration results, based on a
simulation environment. Tests using real devices have been conducted as well,
demonstrating the usability of the system and its improvement over existing
approaches. Code is available at
this https URL
Yongcheng Jing, Yezhou Yang, Zunlei Feng, Jingwen Ye, Mingli Song
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The recent work of Gatys et al. demonstrated the power of Convolutional
Neural Networks (CNNs) in creating artistic imagery by separating and
recombining image content and style. This process of using CNNs to migrate the
semantic content of one image to different styles is referred to as Neural
Style Transfer. Since then, Neural Style Transfer has become a trending topic
in both academic literature and industrial applications. It is receiving
increasing attention from computer vision researchers, and several methods have
been proposed to either improve or extend the original neural algorithm
proposed by Gatys et al. However, there is no comprehensive survey presenting and
summarizing recent Neural Style Transfer literature. This review aims to
provide an overview of the current progress towards Neural Style Transfer, as
well as discussing its various applications and open problems for future
research.
Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper addresses the problem of establishing semantic correspondences
between images depicting different instances of the same object or scene
category. Previous approaches focus on either combining a spatial regularizer
with hand-crafted features, or learning a correspondence model for appearance
only. We propose instead a convolutional neural network architecture, called
SCNet, for learning a geometrically plausible model for semantic
correspondence. SCNet uses region proposals as matching primitives, and
explicitly incorporates geometric consistency in its loss function. It is
trained on image pairs obtained from the PASCAL VOC 2007 keypoint dataset, and
a comparative evaluation on several standard benchmarks demonstrates that the
proposed approach substantially outperforms both recent deep learning
architectures and previous methods based on hand-crafted features.
Mieczysław A. Kłopotek
Comments: 20 pages, 7 figures
Journal-ref: Machine Graphics and Vision, 1995, Vol. 4, No 1-2 (preliminary
version)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper is concerned with recovery of motion and structure parameters from
multiframes under orthogonal projection when only points are traced. The main
question is how many points and/or how many frames are necessary for the task.
It is demonstrated that 3 frames and 3 points are the absolute minimum.
A closed-form solution is presented. Furthermore, it is shown that the task may
be linearized if either four points or four frames are available. It is
demonstrated that no increase in the number of points can lead to recovery of
structure and motion parameters from two frames only. It is shown that,
instead, an increase in the number of points may support the task of tracing
the points from frame to frame.
David Novotny, Diane Larlus, Andrea Vedaldi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Traditional approaches for learning 3D object categories use either synthetic
data or manual supervision. In this paper, we propose instead an unsupervised
method that is cued by observing objects from a moving vantage point. Our
system builds on two innovations: a Siamese viewpoint factorization network
that robustly aligns different videos together without explicitly comparing 3D
shapes; and a 3D shape completion network that can extract the full shape of an
object from partial observations. We also demonstrate the benefits of
configuring networks to perform probabilistic predictions as well as of
geometry-aware data augmentation schemes. State-of-the-art results are
demonstrated on publicly-available benchmarks.
Angana Chakraborty, Sanghamitra Bandyopadhyay
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Locality Sensitive Hashing (LSH) based algorithms have already shown their
promise in finding approximate nearest neighbors in high dimensional data
space. However, there are certain scenarios, as in sequential data, where the
proximity of a pair of points cannot be captured without considering their
surroundings or context. In videos, for example, a particular frame is
meaningful only when it is seen in the context of its preceding and following
frames. LSH has no mechanism to handle the contexts of the data points. In
this article, a novel scheme of Context based Locality Sensitive Hashing
(conLSH) has been introduced, in which points are hashed together not only
based on their closeness, but also because of similar context. The contribution
made in this article is threefold. First, conLSH is integrated with a recently
proposed fast optimal sequence alignment algorithm (FOGSAA) using a layered
approach. The resultant method is applied to video retrieval for extracting
similar sequences. The proposed algorithm yields more than 80% accuracy on
average across different datasets. It has been found to save 36.3% of the total
time consumed by exhaustive search. conLSH reduces the search space to
approximately 42% of the entire dataset, when compared with an exhaustive
search by the aforementioned FOGSAA, the Bag of Words method and standard LSH
implementations. Secondly, the effectiveness of conLSH is demonstrated in
action recognition of video clips, which yields an average gain of 12.83%
in terms of classification accuracy over the state-of-the-art methods using
STIP descriptors. Last, but of great significance, this article
provides a way of automatically annotating long and composite real-life videos.
The source code of conLSH is made available at
this http URL
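To convey the idea of context-sensitive hashing, here is a toy Python sketch in the spirit of conLSH (the actual construction differs in detail): a frame is hashed together with its neighboring frames using sign random projections, so that similar frames in similar contexts tend to collide.

import numpy as np

def context_hash(seq, index, context=1, n_bits=16, seed=0):
    # Hash the descriptor at `index` concatenated with its neighbors.
    d = seq.shape[1]
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(n_bits, (2 * context + 1) * d))
    padded = np.pad(seq, ((context, context), (0, 0)))  # zero-pad the ends
    window = padded[index:index + 2 * context + 1]      # shifted by the padding
    bits = (planes @ window.ravel()) > 0                # sign random projections
    return int(np.packbits(bits).view(np.uint16)[0])

frames = np.random.rand(100, 32)  # per-frame descriptors of a toy video
print(context_hash(frames, index=10))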
Ali Hasan, Ebrahim M. Kolahdouz, Andinet Enquobahrie, Thomas G. Caranasos, John P. Vavalle, Boyce E. Griffith
Subjects: Medical Physics (physics.med-ph); Computational Engineering, Finance, and Science (cs.CE); Computer Vision and Pattern Recognition (cs.CV)
Each year, approximately 300,000 heart valve repair or replacement procedures
are performed worldwide, including approximately 70,000 aortic valve
replacement surgeries in the United States alone. This paper describes progress
in constructing anatomically and physiologically realistic immersed boundary
(IB) models of the dynamics of the aortic root and ascending aorta. This work
builds on earlier IB models of fluid-structure interaction (FSI) in the aortic
root, which previously achieved realistic hemodynamics over multiple cardiac
cycles, but which also were limited to simplified aortic geometries and
idealized descriptions of the biomechanics of the aortic valve cusps. By
contrast, the model described herein uses an anatomical geometry reconstructed
from patient-specific computed tomography angiography (CTA) data, and employs a
description of the elasticity of the aortic valve leaflets based on a
fiber-reinforced constitutive model fit to experimental tensile test data.
Numerical tests show that the model is able to resolve the leaflet biomechanics
in diastole and early systole at practical grid spacings. The model is also
used to examine differences in the mechanics and fluid dynamics yielded by
fresh valve leaflets and glutaraldehyde-fixed leaflets similar to those used in
bioprosthetic heart valves. Although there are large differences in the leaflet
deformations during diastole, the differences in the open configurations of the
valve models are relatively small, and nearly identical hemodynamics are
obtained in all cases considered.
Sina Ghiassian, Banafsheh Rafiee, Richard S. Sutton
Comments: 5 pages, Accepted to NIPS Continual Learning and Deep Networks workshop, 2016
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)
In this paper we present the first empirical study of the emphatic
temporal-difference learning algorithm (ETD), comparing it with conventional
temporal-difference learning, in particular, with linear TD(0), on on-policy
and off-policy variations of the Mountain Car problem. The initial motivation
for developing ETD was that it has good convergence properties under
off-policy training (Sutton, Mahmood & White 2016), but it is also a
new algorithm for the on-policy case. In both our on-policy and
off-policy experiments, we found that each method converged to a characteristic
asymptotic level of error, with ETD better than TD(0). TD(0) achieved a still
lower error level temporarily before falling back to its higher asymptote,
whereas ETD never showed this kind of “bounce”. In the off-policy case (in
which TD(0) is not guaranteed to converge), ETD was significantly slower.
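For concreteness, the linear TD(0) baseline used in the comparison performs the following update (a minimal sketch; ETD additionally scales each update by an emphasis weighting derived from followon traces, which is not shown here):

import numpy as np

def td0_update(w, phi, reward, phi_next, alpha=0.1, gamma=0.99):
    # w <- w + alpha * delta * phi, with delta the one-step TD error.
    delta = reward + gamma * (w @ phi_next) - (w @ phi)
    return w + alpha * delta * phi

w = np.zeros(2)
w = td0_update(w, phi=np.array([1.0, 0.0]), reward=1.0,
               phi_next=np.array([0.0, 1.0]))
print(w)  # [0.1 0. ]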
Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)
Solving algebraic word problems requires executing a series of arithmetic
operations—a program—to obtain a final answer. However, since programs can
be arbitrarily complicated, inducing them directly from question-answer pairs
is a formidable challenge. To make this task more feasible, we solve these
problems by generating answer rationales, sequences of natural language and
human-readable mathematical expressions that derive the final answer through a
series of small steps. Although rationales do not explicitly specify programs,
they provide a scaffolding for their structure via intermediate milestones. To
evaluate our approach, we have created a new 100,000-sample dataset of
questions, answers and rationales. Experimental results show that indirect
supervision of program learning via answer rationales is a promising strategy
for inducing arithmetic programs.
Yangming Zhou, Jin-Kao Hao, Fred Glover
Subjects: Artificial Intelligence (cs.AI)
Critical node problems involve identifying a subset of critical nodes from an
undirected graph whose removal results in optimizing a pre-defined measure over
the residual graph. As useful models for a variety of practical applications,
these problems are computationally challenging. In this paper, we study the
classic critical node problem (CNP) and introduce an effective memetic
algorithm for solving CNP. The proposed algorithm combines a double
backbone-based crossover operator (to generate promising offspring solutions),
a component-based neighborhood search procedure (to find high-quality local
optima) and a rank-based pool updating strategy (to guarantee a healthy
population). Specifically, the component-based neighborhood search integrates two
key techniques, i.e., two-phase node exchange strategy and node weighting
scheme. The double backbone-based crossover extends the idea of general
backbone-based crossovers. Extensive evaluations on 42 synthetic and real-world
benchmark instances show that the proposed algorithm discovers 21 new upper
bounds and matches 18 previous best-known upper bounds. We also demonstrate the
relevance of our algorithm for effectively solving a variant of the classic
CNP, called the cardinality-constrained critical node problem. Finally, we
investigate the usefulness of each key algorithmic component.
Amanda Song, Linjie Li, Chad Atalla, Garrison Cottrell
Comments: 10 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Humans make complex inferences on faces, ranging from objective properties
(gender, ethnicity, expression, age, identity, etc) to subjective judgments
(facial attractiveness, trustworthiness, sociability, friendliness, etc). While
the objective aspects of face perception have been extensively studied,
relatively fewer computational models have been developed for the social
impressions of faces. Bridging this gap, we develop a method to predict human
impressions of faces in 40 subjective social dimensions, using deep
representations from state-of-the-art neural networks. We find that model
performance grows as the human consensus on a face trait increases, and that
model predictions outperform human groups in correlation with human averages.
This illustrates the learnability of subjective social perception of faces,
especially when there is high human consensus. Our system can be used to decide
which photographs from a personal collection will make the best impression. The
results are significant for the field of social robotics, demonstrating that
robots can learn the subjective judgments defining the underlying fabric of
human interaction.
Tiep Le, Tran Cao Son, Enrico Pontelli, William Yeoh
Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)
This paper explores the use of Answer Set Programming (ASP) in solving
Distributed Constraint Optimization Problems (DCOPs). The paper provides the
following novel contributions: (1) It shows how one can formulate DCOPs as
logic programs; (2) It introduces ASP-DPOP, the first DCOP algorithm that is
based on logic programming; (3) It experimentally shows that ASP-DPOP can be up
to two orders of magnitude faster than DPOP (its imperative programming
counterpart) as well as solve some problems that DPOP fails to solve, due to
memory limitations; and (4) It demonstrates the applicability of ASP in a wide
array of multi-agent problems currently modeled as DCOPs. Under consideration
in Theory and Practice of Logic Programming (TPLP).
Barun Patra
Subjects: Information Retrieval (cs.IR)
With the advent of numerous community forums, tasks associated with them
have gained importance in the recent past. Given the influx of new questions
every day on these forums, identifying methods to find answers to those
questions, or even to detect duplicate questions, is of practical importance
and challenging in its own right. This paper surveys some of the
aforementioned issues and the methods proposed for tackling them.
Yair Rivenson, Yibo Zhang, Harun Gunaydin, Da Teng, Aydogan Ozcan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Learning (cs.LG); Applied Physics (physics.app-ph); Optics (physics.optics)
Phase recovery from intensity-only measurements forms the heart of coherent
imaging techniques and holography. Here we demonstrate that a neural network
can learn to perform phase recovery and holographic image reconstruction after
appropriate training. This deep learning-based approach provides an entirely
new framework to conduct holographic imaging by rapidly eliminating twin-image
and self-interference related spatial artifacts. Compared to existing
approaches, this neural-network-based method is significantly faster to
compute, and it reconstructs improved phase and amplitude images of the objects
using only one hologram; i.e., it requires fewer measurements in addition to
being computationally faster. We validated this method by reconstructing
phase and amplitude images of various samples, including blood and Pap smears,
and tissue sections. These results are broadly applicable to any phase recovery
problem, and highlight that, through machine learning, challenging problems in
imaging science can be overcome, providing new avenues to design powerful
computational imaging systems.
Romain Paulus, Caiming Xiong, Richard Socher
Subjects: Computation and Language (cs.CL)
Attentional, RNN-based encoder-decoder models for abstractive summarization
have achieved good performance on short input and output sequences. However,
for longer documents and summaries, these models often include repetitive and
incoherent phrases. We introduce a neural network model with intra-attention
and a new training method. This method combines standard supervised word
prediction and reinforcement learning (RL). Models trained only with the former
often exhibit “exposure bias” — they assume ground truth is provided at each
step during training. However, when standard word prediction is combined with
the global sequence prediction training of RL the resulting summaries become
more readable. We evaluate this model on the CNN/Daily Mail and New York Times
datasets. Our model obtains a 41.16 ROUGE-1 score on the CNN/Daily Mail
dataset, a 5.7 absolute points improvement over previous state-of-the-art
models. It also performs well as the first abstractive model on the New York
Times corpus. Human evaluation also shows that our model produces higher
quality summaries.
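The mixed objective can be written compactly; the sketch below (with made-up numbers and an assumed mixing weight) combines the maximum-likelihood loss with a self-critical policy-gradient term that rewards a sampled summary relative to a baseline:

def mixed_summarization_loss(ml_nll, sampled_logprob, sampled_reward,
                             baseline_reward, gamma=0.98):
    # Policy-gradient term: push up sequences that beat the baseline reward.
    rl_loss = -(sampled_reward - baseline_reward) * sampled_logprob
    # Weighted sum of the RL loss and the maximum-likelihood loss.
    return gamma * rl_loss + (1.0 - gamma) * ml_nll

# Toy numbers: sampled summary scores 0.42 ROUGE vs a 0.40 baseline.
print(mixed_summarization_loss(ml_nll=35.2, sampled_logprob=-40.0,
                               sampled_reward=0.42, baseline_reward=0.40))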
Behrang QasemiZadeh, Laura Kallmeyer, Peyman Passban
Subjects: Computation and Language (cs.CL)
We propose a new fast word embedding technique using hash functions. The
method is a derandomization of a new type of random projections: By
disregarding the classic constraint used in designing random projections (i.e.,
preserving pairwise distances in a particular normed space), our solution
exploits extremely sparse non-negative random projections. Our experiments show
that the proposed method can achieve results comparable to those of neural
embedding learning techniques, with only a fraction of their computational
complexity. While the proposed derandomization improves the computational and
space complexity of our method, the possibility of applying weighting methods
such as positive pointwise mutual information (PPMI) to our models after their
construction (and at a reduced dimensionality) imparts a high discriminatory
power to the resulting embeddings. Naturally, this method comes with the other
known benefits of random projection-based
techniques such as ease of update.
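The core trick, realizing an extremely sparse non-negative projection with a hash function, can be sketched in a few lines of Python (parameters and hashing details are illustrative, not the paper's):

import hashlib
import numpy as np

def hashed_embedding(context_words, dim=256):
    # Hash each co-occurring context word to one of `dim` coordinates
    # and accumulate counts: a sparse, non-negative random projection.
    v = np.zeros(dim)
    for w in context_words:
        h = int(hashlib.md5(w.encode("utf-8")).hexdigest(), 16)
        v[h % dim] += 1.0
    return v

contexts = ["bank", "money", "loan", "river", "money"]
vec = hashed_embedding(contexts)
print(vec.sum(), np.count_nonzero(vec))  # 5 counts in at most 4 buckets

Weightings such as PPMI can then be applied to the reduced-dimensional vectors after construction, as the abstract notes.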
Camilo Akimushkin, Diego R. Amancio, Osvaldo N. Oliveira Jr
Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Well-established automatic analyses of texts mainly consider frequencies of
linguistic units, e.g. letters, words and bigrams, while methods based on
co-occurrence networks consider the structure of texts regardless of the node
labels (i.e., the word semantics). In this paper, we reconcile these distinct
viewpoints by introducing a generalized similarity measure to compare texts
which accounts for both the network structure of texts and the role of
individual words in the networks. We use the similarity measure for authorship
attribution of three collections of books, each composed of 8 authors and 10
books per author. High accuracy rates were obtained with typical values from
90% to 98.75%, much higher than with the traditional TF-IDF approach for
the same collections. These accuracies are also higher than taking only the
topology of networks into account. We conclude that the different properties of
specific words on the macroscopic scale structure of a whole text are as
relevant as their frequency of appearance; conversely, considering the identity
of nodes brings further knowledge about a piece of text represented as a
network.
Pengfei Liu, Xipeng Qiu, Xuanjing Huang
Comments: Accepted by IJCAI 2017
Subjects: Computation and Language (cs.CL)
Tree-structured neural networks have proven to be effective in learning
semantic representations by exploiting syntactic information. In spite of their
success, most existing models suffer from the underfitting problem: they
recursively use the same shared compositional function throughout the whole
compositional process and lack expressive power due to their inability to
capture the richness of compositionality. In this paper, we address this issue
by introducing dynamic compositional neural networks over tree structures
(DC-TreeNN), in which the compositional function is dynamically generated by a
meta network. The role of the meta network is to capture the meta-knowledge
across different compositional rules and to formulate them. Experimental
results on
two typical tasks show the effectiveness of the proposed models.
Thai-Hoang Pham, Phuong Le-Hong
Comments: 6 pages, submitted to PACLING 2017
Subjects: Computation and Language (cs.CL)
This paper demonstrates end-to-end neural network architectures for
Vietnamese named entity recognition. Our best model is a combination of
bidirectional Long Short-Term Memory (Bi-LSTM), Convolutional Neural Network
(CNN) and Conditional Random Field (CRF) layers, using pre-trained word
embeddings as input, which achieves an F1 score of 88.59% on a standard test
set. Our system achieves performance comparable to that of the first-ranked
system of the VLSP campaign without using any syntactic or hand-crafted
features. We also present an extensive empirical study on using common deep
learning models for Vietnamese NER, at both the word and character levels.
Thai-Hoang Pham, Xuan-Khoai Pham, Phuong Le-Hong
Comments: 8 pages, ICDIM 2015
Subjects: Computation and Language (cs.CL)
Semantic role labelling (SRL) is a task in natural language processing which
detects and classifies the semantic arguments associated with the predicates of
a sentence. It is an important step towards understanding the meaning of a
natural language. SRL systems exist for well-studied languages like English,
Chinese or Japanese, but there is no such system for the Vietnamese language.
In this paper, we present the first SRL system for Vietnamese, with encouraging
accuracy. We first demonstrate that a simple application of SRL techniques
developed for English could not give good accuracy for Vietnamese. We then
introduce a new algorithm for extracting candidate syntactic constituents,
which is much more accurate than the common node-mapping algorithm usually used
in the identification step. Finally, in the classification step, in addition to
the common linguistic features, we propose novel and useful features for use in
SRL. Our SRL system achieves an F1 score of 73.53% on the Vietnamese PropBank
corpus. This system, including
software and corpus, is available as an open source project and we believe that
it is a good baseline for the development of future Vietnamese SRL systems.
Thai-Hoang Pham, Phuong Le-Hong
Comments: 4 pages, IALP 2016
Subjects: Computation and Language (cs.CL)
Short Message Service (SMS) spam is a serious problem in Vietnam because of
the availability of very cheap pre-paid SMS packages. There are some systems to
detect and filter spam messages for English, most of which use machine learning
techniques to analyze the content of messages and classify them. For
Vietnamese, there is some research on spam email filtering but none focused on
SMS. In this work, we propose the first system for filtering Vietnamese spam
SMS. We first propose an appropriate preprocessing method since existing tools
for Vietnamese preprocessing cannot give good accuracy on our dataset. We then
experiment with vector representations and classifiers to find the best model
for this problem. Our system achieves an accuracy of 94% when labelling spam
messages, while the misclassification rate of legitimate messages is relatively
small, only about 0.4%. This is an encouraging result compared to that for
English and can serve as a strong baseline for the future development of
Vietnamese SMS spam prevention systems.
Bingfeng Luo, Yansong Feng, Zheng Wang, Zhanxing Zhu, Songfang Huang, Rui Yan, Dongyan Zhao
Comments: 10 pages, accepted by ACL 2017
Subjects: Computation and Language (cs.CL)
Distant supervision significantly reduces human efforts in building training
data for many classification tasks. While promising, this technique often
introduces noise to the generated training data, which can severely affect the
model performance. In this paper, we take a deep look at the application of
distant supervision in relation extraction. We show that the dynamic transition
matrix can effectively characterize the noise in the training data built by
distant supervision. The transition matrix can be effectively trained using a
novel curriculum learning based method without any direct supervision about the
noise. We thoroughly evaluate our approach under a wide range of extraction
scenarios. Experimental results show that our approach consistently improves
the extraction results and outperforms the state-of-the-art in various
evaluation scenarios.
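The mechanism can be illustrated with a fixed transition matrix (in the paper it is dynamic and learned without direct supervision about the noise); the model's distribution over true labels is mapped to a distribution over possibly noisy observed labels:

import numpy as np

def noisy_label_distribution(pred, transition):
    # transition[i, j] = P(observed label j | true label i); the product
    # gives the distribution the noisy training data is matched against.
    return pred @ transition

pred = np.array([0.7, 0.2, 0.1])   # P(true relation) from the model
T = np.array([[0.80, 0.10, 0.10],
              [0.05, 0.90, 0.05],
              [0.10, 0.10, 0.80]])
print(noisy_label_distribution(pred, T))  # [0.58 0.26 0.16]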
Mitchell Stern, Jacob Andreas, Dan Klein
Comments: To appear in ACL 2017
Subjects: Computation and Language (cs.CL)
In this work, we present a minimal neural model for constituency parsing
based on independent scoring of labels and spans. We show that this model is
not only compatible with classical dynamic programming techniques, but also
admits a novel greedy top-down inference algorithm based on recursive
partitioning of the input. We demonstrate empirically that both prediction
schemes are competitive with recent work, and when combined with basic
extensions to the scoring model are capable of achieving state-of-the-art
single-model performance on the Penn Treebank (91.79 F1) and strong performance
on the French Treebank (82.23 F1).
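The greedy top-down inference can be sketched recursively (the scorer below is an assumed toy; in the paper, splits are scored by the neural model over independently scored spans and labels):

def greedy_parse(split_score, i, j):
    # Pick the best split point of span (i, j), then recurse on both halves.
    if j - i <= 1:
        return (i, j)  # single-word span: leaf
    k = max(range(i + 1, j), key=lambda s: split_score(i, s, j))
    return ((i, j), greedy_parse(split_score, i, k),
            greedy_parse(split_score, k, j))

# Toy scorer preferring balanced splits of a 5-word sentence.
score = lambda i, k, j: -abs(k - (i + j) / 2.0)
print(greedy_parse(score, 0, 5))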
Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)
Solving algebraic word problems requires executing a series of arithmetic
operations—a program—to obtain a final answer. However, since programs can
be arbitrarily complicated, inducing them directly from question-answer pairs
is a formidable challenge. To make this task more feasible, we solve these
problems by generating answer rationales, sequences of natural language and
human-readable mathematical expressions that derive the final answer through a
series of small steps. Although rationales do not explicitly specify programs,
they provide a scaffolding for their structure via intermediate milestones. To
evaluate our approach, we have created a new 100,000-sample dataset of
questions, answers and rationales. Experimental results show that indirect
supervision of program learning via answer rationales is a promising strategy
for inducing arithmetic programs.
Tom Vander Aa, Imen Chakroun, Tom Haber
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Matrix factorization is a common machine learning technique for recommender
systems. Despite its high prediction accuracy, the Bayesian Probabilistic
Matrix Factorization algorithm (BPMF) has not been widely used on large scale
data because of its high computational cost. In this paper we propose a
distributed high-performance parallel implementation of BPMF on shared memory
and distributed architectures. We show that, by using efficient load balancing
based on work stealing on a single node, and by using asynchronous
communication in the distributed version, we beat state-of-the-art
implementations.
Laurent Feuilloley, Pierre Fraigniaud
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Proof-labeling schemes are known mechanisms providing nodes of networks with
certificates that can be verified locally by distributed algorithms. Given a
boolean predicate on network states, such schemes enable checking whether the
predicate is satisfied by the actual state of the network, by having nodes
interacting with their neighbors only. Proof-labeling schemes are typically
designed for enforcing fault-tolerance, by making sure that if the current
state of the network is illegal with respect to some given predicate, then at
least one node will detect it. Such a node can raise an alarm, or launch a
recovery procedure enabling the system to return to a legal state. In this
paper, we introduce error-sensitive proof-labeling schemes. These are
proof-labeling schemes which guarantee that the number of nodes detecting
illegal states is linearly proportional to the edit-distance between the
current state and the set of legal states. By using error-sensitive
proof-labeling schemes, states which are far from satisfying the predicate will
be detected by many nodes, enabling fast return to legality. We provide a
structural characterization of the set of boolean predicates on network states
for which there exist error-sensitive proof-labeling schemes. This
characterization allows us to show that classical predicates such as, e.g.,
acyclicity, and leader admit error-sensitive proof-labeling schemes, while
others like regular subgraphs don’t. We also focus on compact error-sensitive
proof-labeling schemes. In particular, we show that the known proof-labeling
schemes for spanning tree and minimum spanning tree, using certificates on
O(log n) bits and on O(log^2 n) bits, respectively, are error-sensitive, as
long as the trees are locally represented by adjacency lists, and not just by
parent pointers.
Christoph Lenzen, Moti Medina
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Designing routing schemes is a multidimensional and complex task that depends
on the objective function, the computational model (centralized vs.
distributed), and the amount of uncertainty (online vs. offline). Nevertheless,
there are quite a few well-studied general techniques, for a large variety of
network problems. In contrast, in our view, practical techniques for designing
robust routing schemes are scarce; while fault-tolerance has been studied from
a number of angles, existing approaches are concerned with dealing with faults
after the fact by rerouting, self-healing, or similar techniques. We argue that
this comes at a high burden for the designer, as in such a system any algorithm
must account for the effects of faults on communication.
With the goal of initiating efforts towards addressing this issue, we
showcase simple and generic transformations that can be used as a blackbox to
increase resilience against (independently distributed) faults. Given a network
and a routing scheme, we determine a reinforced network and corresponding
routing scheme that faithfully preserves the specification and behavior of the
original scheme. We show that reasonably small constant overheads in terms of
size of the new network compared to the old are sufficient for substantially
relaxing the reliability requirements on individual components. The main
message in this paper is that the task of designing a robust routing scheme can
be decoupled into (i) designing a routing scheme that meets the specification
in a fault-free environment, (ii) ensuring that nodes correspond to
fault-containment regions, i.e., fail (approximately) independently, and (iii)
applying our transformation to obtain a reinforced network and a robust routing
scheme that is fault-tolerant.
Dylan Fagot, Cédric Févotte, Herwig Wendt
Subjects: Learning (cs.LG)
Traditional NMF-based signal decomposition relies on the factorization of
spectral data which is typically computed by means of the short-time Fourier
transform. In this paper we propose to relax the choice of a pre-fixed
transform and learn a short-time unitary transform together with the
factorization, using a novel block-descent algorithm. This improves the fit
between the processed data and its approximation and is in turn shown to induce
better separation performance in a speech enhancement experiment.
Yue Yu, Longbo Huang
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
In this paper, we consider the stochastic composition optimization problem
proposed in Wang et al. (2017), which has applications ranging from estimation
to statistical and machine learning. We propose the first ADMM-based algorithm,
named com-SVR-ADMM, and show that com-SVR-ADMM converges linearly for strongly
convex and Lipschitz smooth objectives, and achieves a convergence rate of
O(log S / S), which improves upon the known O(S^{-4/9}) rate, when the
objective is convex and Lipschitz smooth. Moreover, it possesses a rate of
O(1/√S) when the objective is convex but without Lipschitz smoothness.
We also conduct experiments and show that it outperforms existing algorithms.
YaoGong Zhang, YingJie Xu, Xin Fan, YuXiang Hong, Jiahui Liu, ZhiCheng He, YaLou Huang, MaoQiang Xie
Subjects: Learning (cs.LG); Quantitative Methods (q-bio.QM)
Background: Mining gene modules from genomic data is an important step to
detect gene members of pathways or other relations such as protein-protein
interactions. In this work, we explore the plausibility of detecting gene
modules by factorizing gene-phenotype associations from a phenotype ontology
rather than the conventionally used gene expression data. In particular, the
hierarchical structure of ontology has not been sufficiently utilized in
clustering genes while functionally related genes are consistently associated
with phenotypes on the same path in the phenotype ontology. Results: We propose
a hierarchical Nonnegative Matrix Factorization (NMF)-based method, called
Consistent Multiple Nonnegative Matrix Factorization (CMNMF), to factorize the
genome-phenome association matrix at two levels of the hierarchical structure
in the phenotype ontology for mining gene functional modules. CMNMF constrains the
gene clusters from the association matrices at two consecutive levels to be
consistent since the genes are annotated with both the child phenotype and the
parent phenotype in the consecutive levels. CMNMF also restricts the identified
phenotype clusters to be densely connected in the phenotype ontology hierarchy.
In the experiments on mining functionally related genes from mouse phenotype
ontology and human phenotype ontology, CMNMF effectively improved clustering
performance over the baseline methods. Gene ontology enrichment analysis was
also conducted to reveal interesting gene modules. Conclusions: Utilizing the
information in the hierarchical structure of phenotype ontology, CMNMF can
identify functional gene modules with more biological significance than the
conventional methods. CMNMF could also be a better tool for predicting members
of gene pathways and protein-protein interactions. Availability:
this https URL
Adam White, Richard S. Sutton
Subjects: Learning (cs.LG)
This document should serve as a quick reference for, and guide to, the
implementation of linear GQ(lambda), a gradient-based off-policy
temporal-difference learning algorithm. Explanations of the intuition and
theory behind the algorithm are provided elsewhere (e.g., Maei & Sutton 2010,
Maei 2011). If you have questions or concerns about the content of this
document or the attached Java code, please email Adam White
(adam.white@ualberta.ca).
The code is provided as part of the source files in the arXiv submission.
Ronny Ronen
Comments: This paper is the preface part of the “Why & When Deep Learning works: looking inside Deep Learning” ICRI-CI paper bundle
Subjects: Learning (cs.LG)
The Intel Collaborative Research Institute for Computational Intelligence
(ICRI-CI) has been heavily supporting Machine Learning and Deep Learning
research from its foundation in 2012. We have asked six leading ICRI-CI Deep
Learning researchers to address the challenge of “Why & When Deep Learning
works”, with the goal of looking inside Deep Learning, providing insights on
how deep networks function, and uncovering key observations on their
expressiveness, limitations, and potential. The output of this challenge
resulted in five papers that address different facets of deep learning. These
facets include a high-level understanding of why and when deep networks work
(and do not work), the impact of geometry on the expressiveness of deep
networks, and making deep networks interpretable.
Ho Chung Leon Law, Dougal J. Sutherland, Dino Sejdinovic, Seth Flaxman
Comments: 10 pages
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Distribution regression has recently attracted much interest as a generic
solution to the problem of supervised learning where labels are available at
the group level, rather than at the individual level. Current approaches,
however, do not propagate the uncertainty in observations due to sampling
variability in the groups. This effectively assumes that small and large groups
are estimated equally well, and should have equal weight in the final
regression. We construct a Bayesian distribution regression formalism that
accounts for this uncertainty, improving the robustness and performance of the
model when group sizes vary. We frame the model in a neural network style,
allowing for simple MAP inference using backpropagation to learn the
parameters, as well as MCMC-based inference which can fully propagate
uncertainty. We demonstrate our approach on illustrative toy datasets, as well
as on an astrostatistics problem in which velocity distributions are used to
predict galaxy cluster masses, quantifying the distribution of dark matter in
the universe.
Yair Rivenson, Yibo Zhang, Harun Gunaydin, Da Teng, Aydogan Ozcan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Learning (cs.LG); Applied Physics (physics.app-ph); Optics (physics.optics)
Phase recovery from intensity-only measurements forms the heart of coherent
imaging techniques and holography. Here we demonstrate that a neural network
can learn to perform phase recovery and holographic image reconstruction after
appropriate training. This deep learning-based approach provides an entirely
new framework to conduct holographic imaging by rapidly eliminating twin-image
and self-interference related spatial artifacts. Compared to existing
approaches, this neural-network-based method is significantly faster to
compute, and it reconstructs improved phase and amplitude images of the objects
using only one hologram; i.e., it requires fewer measurements in addition to
being computationally faster. We validated this method by reconstructing
phase and amplitude images of various samples, including blood and Pap smears,
and tissue sections. These results are broadly applicable to any phase recovery
problem, and highlight that, through machine learning, challenging problems in
imaging science can be overcome, providing new avenues to design powerful
computational imaging systems.
Amanda Song, Linjie Li, Chad Atalla, Garrison Cottrell
Comments: 10 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Humans make complex inferences on faces, ranging from objective properties
(gender, ethnicity, expression, age, identity, etc) to subjective judgments
(facial attractiveness, trustworthiness, sociability, friendliness, etc). While
the objective aspects of face perception have been extensively studied,
relatively fewer computational models have been developed for the social
impressions of faces. Bridging this gap, we develop a method to predict human
impressions of faces in 40 subjective social dimensions, using deep
representations from state-of-the-art neural networks. We find that model
performance grows as the human consensus on a face trait increases, and that
model predictions outperform human groups in correlation with human averages.
This illustrates the learnability of subjective social perception of faces,
especially when there is high human consensus. Our system can be used to decide
which photographs from a personal collection will make the best impression. The
results are significant for the field of social robotics, demonstrating that
robots can learn the subjective judgments defining the underlying fabric of
human interaction.
Cheng-Shang Chang, Chia-Tai Chang, Duan-Shin Lee, Li-Heng Liou
Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)
In this paper, we first propose a new iterative algorithm, called the K-sets+
algorithm for clustering data points in a semi-metric space, where the distance
measure does not necessarily satisfy the triangle inequality. We show that
the K-sets+ algorithm converges in a finite number of iterations and that it
retains the same performance guarantee as the K-sets algorithm for clustering
data points in a metric space. We then extend the applicability of the K-sets+
algorithm from data points in a semi-metric space to data points that only have
a symmetric similarity measure. Such an extension leads to a great reduction in
computational complexity. In particular, for an n * n similarity matrix with m
nonzero elements in the matrix, the computational complexity of the K-sets+
algorithm is O((Kn + m)I), where I is the number of iterations. The memory
complexity to achieve that computational complexity is O(Kn + m). As such, both
the computational complexity and the memory complexity are linear in n when the
n * n similarity matrix is sparse, i.e., m = O(n). We also conduct various
experiments to show the effectiveness of the K-sets+ algorithm by using a
synthetic dataset from the stochastic block model and a real network from the
WonderNetwork website.
Amir Rosenfeld, John K. Tsotsos
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Given an existing trained neural network, it is often desirable to be able to
add new capabilities without hindering performance of already learned tasks.
Existing approaches either learn sub-optimal solutions, require joint training,
or incur a substantial increment in the number of parameters for each added
task, typically as many as the original network. We propose a method which
fully preserves performance on the original task, with only a small increase
(around 20%) in the number of required parameters while performing on par with
more costly fine-tuning procedures, which typically double the number of
parameters. The learned architecture can be controlled to switch between
various learned representations, enabling a single network to solve a task from
multiple different domains. We conduct extensive experiments showing the
effectiveness of our method and explore different aspects of its behavior.
Sina Ghiassian, Banafsheh Rafiee, Richard S. Sutton
Comments: 5 pages, Accepted to NIPS Continual Learning and Deep Networks workshop, 2016
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)
In this paper we present the first empirical study of the emphatic
temporal-difference learning algorithm (ETD), comparing it with conventional
temporal-difference learning, in particular, with linear TD(0), on on-policy
and off-policy variations of the Mountain Car problem. The initial motivation
for developing ETD was that it has good convergence properties under
off-policy training (Sutton, Mahmood & White 2016), but it is also a
new algorithm for the on-policy case. In both our on-policy and
off-policy experiments, we found that each method converged to a characteristic
asymptotic level of error, with ETD better than TD(0). TD(0) achieved a still
lower error level temporarily before falling back to its higher asymptote,
whereas ETD never showed this kind of “bounce”. In the off-policy case (in
which TD(0) is not guaranteed to converge), ETD was significantly slower.
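For reference, the two linear updates being compared have the following shape. This is a minimal sketch following the usual presentation of off-policy TD(0) and ETD(0) with a follow-on trace (Sutton, Mahmood & White 2016), with the interest fixed to 1 for every state; it is not the authors' experimental code.

import numpy as np

def td0_step(w, phi, phi_next, r, rho, alpha=0.01, gamma=0.99):
    # Conventional off-policy linear TD(0); rho is the importance-sampling ratio.
    delta = r + gamma * (w @ phi_next) - w @ phi
    return w + alpha * rho * delta * phi

def etd0_step(w, F, phi, phi_next, r, rho, rho_prev, alpha=0.01, gamma=0.99):
    # Emphatic TD(0): the follow-on trace F weights the update (emphasis M_t = F_t).
    F = gamma * rho_prev * F + 1.0
    delta = r + gamma * (w @ phi_next) - w @ phi
    return w + alpha * F * rho * delta * phi, F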
Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)
Solving algebraic word problems requires executing a series of arithmetic
operations—a program—to obtain a final answer. However, since programs can
be arbitrarily complicated, inducing them directly from question-answer pairs
is a formidable challenge. To make this task more feasible, we solve these
problems by generating answer rationales, sequences of natural language and
human-readable mathematical expressions that derive the final answer through a
series of small steps. Although rationales do not explicitly specify programs,
they provide a scaffolding for their structure via intermediate milestones. To
evaluate our approach, we have created a new 100,000-sample dataset of
questions, answers and rationales. Experimental results show that indirect
supervision of program learning via answer rationales is a promising strategy
for inducing arithmetic programs.
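To illustrate the data format being described (question, step-by-step rationale, final answer), here is a hypothetical sample; it is constructed for illustration and is not drawn from the released dataset.

sample = {
    "question": "A shop sells pens at $3 each. Tom buys 4 pens and pays with "
                "a $20 bill. How much change does he get?",
    "rationale": [
        "Cost of the pens: 4 * 3 = 12 dollars.",
        "Change: 20 - 12 = 8 dollars.",
    ],
    "answer": "8",
}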
Roberto Gonzalez, Filipe Manco, Alberto Garcia-Duran, Jose Mendes, Felipe Huici, Saverio Niccolini, Mathias Niepert
Subjects: Networking and Internet Architecture (cs.NI); Learning (cs.LG)
We present Net2Vec, a flexible high-performance platform that allows the
execution of deep learning algorithms in the communication network. Net2Vec is
able to capture data from the network at more than 60 Gbps, transform it into
meaningful tuples and apply predictions over the tuples in real time. This
platform can be used for different purposes ranging from traffic classification
to network performance analysis.
Finally, we showcase the use of Net2Vec by implementing and testing a
solution able to profile network users at line rate using traces coming from a
real network. We show that the use of deep learning for this case outperforms
the baseline method in terms of both accuracy and performance.
Palash Goyal, Emilio Ferrara
Comments: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence for review
Subjects: Social and Information Networks (cs.SI); Learning (cs.LG); Data Analysis, Statistics and Probability (physics.data-an)
Graphs, such as social networks, word co-occurrence networks, and
communication networks, occur naturally in various real-world applications.
Analyzing them yields insight into the structure of society, language, and
different patterns of communication. Many approaches have been proposed to
perform the analysis. Recently, methods which use the representation of graph
nodes in vector space have gained traction in the research community. In this
survey, we provide a comprehensive and structured analysis of various graph
embedding techniques proposed in the literature. We first introduce the
embedding task and its challenges, such as scalability, choice of
dimensionality, and features to be preserved, along with their possible solutions. We
then present three categories of approaches based on factorization methods,
random walks, and deep learning, with examples of representative algorithms in
each category and analysis of their performance on various tasks. We evaluate
these state-of-the-art methods on a few common datasets and compare their
performance against one another and versus non-embedding based models. Our
analysis concludes by suggesting some potential applications and future
directions. Finally, we present GEM (Graph Embedding Methods), an open-source
Python library we developed that provides all of the presented algorithms
within a unified interface, to foster and facilitate research on the topic.
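As a concrete instance of the factorization category discussed in the survey, the simplest embedding is a truncated SVD of the adjacency matrix. The sketch below is a generic illustration and does not reproduce the GEM library's API.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

def svd_embedding(adj, d):
    # d-dimensional node vectors from a rank-d SVD of the adjacency matrix.
    u, s, _ = svds(adj.asfptype(), k=d)
    return u * np.sqrt(s)

# Toy 4-node path graph.
A = csr_matrix(np.array([[0, 1, 0, 0],
                         [1, 0, 1, 0],
                         [0, 1, 0, 1],
                         [0, 0, 1, 0]], dtype=float))
print(svd_embedding(A, d=2).shape)   # (4, 2)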
Ali Bereyhi, Mohammad Ali Sedaghat, Ralf R. Müller
Comments: 5 pages; 2 figures; to be presented at ISIT 2017
Subjects: Information Theory (cs.IT)
This paper studies the large-system performance of Least Square Error (LSE)
precoders which minimize the input-output distortion over an arbitrary support
subject to a general penalty function. The asymptotics are determined via the
replica method in a general form which encloses the Replica Symmetric (RS) and
Replica Symmetry Breaking (RSB) ansätze. As a result, the “marginal
decoupling property” of LSE precoders for $b$ steps of RSB is derived. The
generality of the studied setup enables us to address special cases in which
the number of active transmit antennas is constrained. Our numerical
investigations show that computationally efficient forms of LSE precoders
based on $\ell_1$-norm minimization perform close to those with a zero-norm
penalty function, which offer a considerable improvement over random antenna
selection. For the case of BPSK signals and a restricted
number of active antennas, the results show that RS fails to predict the
performance while the RSB ansatz is consistent with theoretical bounds.
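For orientation, the precoder family under study can be written schematically as a penalized least-squares problem over the transmit support; the symbols below ($H$ the channel, $s$ the data vector, $\mathbb{X}$ the per-antenna support, $u(\cdot)$ the penalty with weight $\lambda$) are illustrative stand-ins rather than the paper's exact notation.

x(s) = \arg\min_{v \in \mathbb{X}^N} \|Hv - s\|_2^2 + \lambda\, u(v).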
Burak Çakmak, Manfred Opper, Ole Winther, Bernard H. Fleury
Comments: 5 pages, accepted for ISIT 2017
Subjects: Information Theory (cs.IT)
We introduce a theoretical approach for designing generalizations of the
approximate message passing (AMP) algorithm for compressed sensing which are
valid for large observation matrices that are drawn from an invariant random
matrix ensemble. By design, the fixed points of the algorithm obey the
Thouless-Anderson-Palmer (TAP) equations corresponding to the ensemble. Using a
dynamical functional approach we are able to derive an effective stochastic
process for the marginal statistics of a single component of the dynamics. This
allows us to design memory terms in the algorithm in such a way that the
resulting fields become Gaussian random variables, allowing for an explicit
analysis. The asymptotic statistics of these fields are consistent with the
replica ansatz of the compressed sensing problem.
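As a point of reference for what is being generalized, the standard AMP iteration for an i.i.d. sensing matrix, with soft thresholding and the usual Onsager memory term, looks as follows. This plain version does not cover the invariant ensembles treated in the paper, and the fixed threshold is an illustrative choice.

import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def amp(A, y, n_iter=30, thresh=0.1):
    n, N = A.shape
    x, z = np.zeros(N), y.copy()
    for _ in range(n_iter):
        x = soft(x + A.T @ z, thresh)             # denoise the effective data
        onsager = (z / n) * np.count_nonzero(x)   # memory term: (1/delta) z <eta'>
        z = y - A @ x + onsager                   # corrected residual
    return x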
Mohammad Shehab, Endrit Dosti, Hirley Alves, Matti Latva-aho
Comments: To appear at EUCNC’2017
Subjects: Information Theory (cs.IT)
This paper analyzes the effective capacity (EC) of delay constrained machine
type communication (MTC) networks operating in the finite blocklength (FB)
regime. First, we derive a closed-form mathematical approximation for the EC in
Rayleigh block fading channels. We characterize the optimum error probability
to maximize the concave EC function and study the effect of SINR variations for
different delay constraints. Our analysis reveals that SINR variations have
less impact on EC in strictly delay-constrained networks. We present an
exemplary massive MTC access scenario to analyze the interference effect, and
we propose three methods to restore the EC of a given node: power control,
graceful degradation of the delay constraint, and joint compensation. Joint
compensation combines both power control and graceful degradation of delay
constraint, where we perform maximization of an objective function whose
parameters are determined according to delay and SINR priorities. Our results
show that networks with stringent delay constraints favor power controlled
compensation and compensation is generally performed at higher costs for
shorter packets.
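Schematically, the analysis combines the finite-blocklength achievable rate with the effective-capacity definition; in the forms below, $V$ is the channel dispersion, $\theta$ the QoS (delay) exponent, and a block is lost with probability $\epsilon$. The notation is illustrative, not the paper's exact formulation.

r(n,\epsilon) \approx \log_2(1+\rho) - \sqrt{V/n}\, Q^{-1}(\epsilon), \qquad
\mathrm{EC} = -\frac{1}{n\theta} \ln \mathbb{E}\!\left[\epsilon + (1-\epsilon)\, e^{-n\theta\, r(n,\epsilon)}\right].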
Mohammed Al-Imari, Pei Xiao, Muhammad Ali Imran, Rahim Tafazolli
Comments: Presented in the International Symposium on Wireless Communications Systems (ISWCS), 2014
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
Orthogonal Frequency Division Multiple Access (OFDMA) and other orthogonal
multiple access techniques fail to achieve the system capacity limit
in the uplink due to the exclusivity in resource allocation. This issue is more
prominent when fairness among the users is considered in the system. Current
Non-Orthogonal Multiple Access (NOMA) techniques introduce redundancy by
coding/spreading to facilitate the separation of the users’ signals at the
receiver, which degrades the system's spectral efficiency. Hence, in order to achieve higher
capacity, more efficient NOMA schemes need to be developed. In this paper, we
propose a NOMA scheme for uplink that removes the resource allocation
exclusivity and allows more than one user to share the same subcarrier without
any coding/spreading redundancy. Joint processing is implemented at the
receiver to detect the users’ signals. However, to control the receiver
complexity, an upper limit on the number of users per subcarrier needs to be
imposed. In addition, a novel subcarrier and power allocation algorithm is
proposed for the new NOMA scheme that maximizes the users’ sum-rate. The
link-level performance evaluation shows that the proposed scheme achieves a
bit error rate close to that of the single-user case. Numerical results show
that the proposed NOMA scheme can significantly improve system performance in
terms of spectral efficiency and fairness compared to OFDMA.
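The complexity-limiting constraint described above (at most a few users per subcarrier) can be illustrated with a deliberately simple greedy attachment rule; this sketch only shows the constraint structure and is not the paper's sum-rate-maximizing allocation algorithm.

import numpy as np

def greedy_share(gains, L):
    # gains: (n_users, n_subcarriers); each subcarrier accepts at most L users.
    n_users, n_sc = gains.shape
    load = np.zeros(n_sc, dtype=int)
    alloc = {u: None for u in range(n_users)}
    for idx in np.argsort(gains, axis=None)[::-1]:   # strongest gains first
        u, s = np.unravel_index(idx, gains.shape)
        if load[s] < L and alloc[u] is None:
            alloc[u], load[s] = int(s), load[s] + 1
    return alloc

print(greedy_share(np.random.rand(6, 3), L=3))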
Martin Uecker
Comments: 3 pages, 3 figures, 12th International Conference on Sampling Theory and Applications, Tallinn 2017
Subjects: Information Theory (cs.IT); Computational Engineering, Finance, and Science (cs.CE); Optimization and Control (math.OC); Instrumentation and Detectors (physics.ins-det); Medical Physics (physics.med-ph)
Modern reconstruction methods for magnetic resonance imaging (MRI) exploit
the spatially varying sensitivity profiles of receive-coil arrays as an
additional source of information. This makes it possible to reduce the number of time-consuming
Fourier-encoding steps by undersampling. The receive sensitivities are a priori
unknown and influenced by the geometry and electric properties of the (moving)
subject. For optimal results, they need to be estimated jointly with the image
from the same undersampled measurement data. Formulated as an inverse problem,
this leads to a bilinear reconstruction problem related to multi-channel blind
deconvolution. In this work, we will discuss some recently developed approaches
for the solution of this problem.
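The bilinear structure mentioned above comes from the standard parallel-imaging forward model, written schematically below with illustrative notation: the measured data of coil $j$ is the undersampled Fourier transform of the image multiplied pointwise by that coil's sensitivity.

y_j = P\,\mathcal{F}\,(c_j \odot m) + n_j, \qquad j = 1,\dots,J,

where $m$ is the image, $c_j$ the sensitivity of coil $j$, $\mathcal{F}$ the Fourier transform, $P$ the undersampling operator, and $n_j$ noise; estimating $m$ and the $c_j$ jointly makes the inverse problem bilinear.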
Seok-Hwan Park, Osvaldo Simeone, Wonju Lee, Shlomo Shamai (Shitz)
Comments: to appear in Proc. IEEE SPAWC 2017
Subjects: Information Theory (cs.IT)
This work studies the advantages of coded multicasting for the downlink of a
Fog Radio Access Network (F-RAN) system equipped with a multicast fronthaul
link. In this system, a control unit (CU) in the baseband processing unit (BBU)
pool is connected to distributed edge nodes (ENs) through a multicast fronthaul
link of finite capacity, and the ENs have baseband processing and caching
capabilities. Each user equipment (UE) requests a file in a content library
which is available at the CU, and the requested files are served by the closest
ENs based on the cached contents and on the information received on the
multicast fronthaul link. The performance of coded multicast fronthauling is
investigated in terms of the delivery latency of the requested contents under
the assumption of pipelined transmission on the fronthaul and edge links and of
single-user encoding and decoding strategies based on the hard transfer of
files on the fronthaul links. Extensive numerical results are provided to
validate the advantages of the coded multicasting scheme compared to uncoded
unicast and multicast strategies.
Victor Quintero, Samir M. Perlaza, Jean-Marie Gorce, H. Vincent Poor
Comments: 5 pages, 2 figures, to appear in ISIT 2017
Subjects: Information Theory (cs.IT)
In this paper, the $\eta$-Nash equilibrium ($\eta$-NE) region of the two-user
linear deterministic interference channel (IC) with noisy channel-output
feedback is characterized for all $\eta > 0$. The $\eta$-NE region, a subset
of the capacity region, contains the set of all achievable information rate
pairs that are stable in the sense of an $\eta$-NE. More specifically, given
an $\eta$-NE coding scheme, there does not exist an alternative coding scheme
for either transmitter-receiver pair that increases the individual rate by
more than $\eta$ bits per channel use. Existing results, such as the
$\eta$-NE regions of the linear deterministic IC without feedback and with
perfect output feedback, are obtained as particular cases of the result
presented in this paper.
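In schematic form (with illustrative notation), the stability notion reads: a strategy pair $(s_1^*, s_2^*)$ is an $\eta$-NE if, for each pair $i$ and every alternative coding scheme $s_i$,

R_i(s_i, s_{-i}^*) \le R_i(s_i^*, s_{-i}^*) + \eta,

so that no unilateral deviation improves the individual rate by more than $\eta$ bits per channel use.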
Zhiyong Zhou, Jun Yu
Subjects: Information Theory (cs.IT)
We study the recovery conditions of weighted $\ell_1$ minimization for real
signal reconstruction from phaseless compressed sensing measurements when
partial support information is available. A Strong Restricted Isometry
Property (SRIP) condition is provided to ensure stable recovery. Moreover, we
present the weighted null space property as the sufficient and necessary
condition for the success of $k$-sparse phase retrieval via weighted $\ell_1$
minimization.
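Schematically, the program under study takes the following form, with smaller weights placed on the estimated support to encode the partial support information (the notation is illustrative, not the paper's exact formulation):

\min_{x \in \mathbb{R}^n} \sum_{i=1}^{n} w_i\,|x_i| \quad \text{subject to} \quad \big\|\,|Ax| - y\,\big\|_2 \le \epsilon.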
Yi Lou, Qiyue Yu, Julian Cheng, Honglin Zhao
Comments: 5 pages, 2 figures, submitted for possible publication
Subjects: Information Theory (cs.IT)
Two weighted selection combining (WSC) schemes are proposed for a
differential decode-and-forward relaying system in Rayleigh fading channels.
Unlike the conventional selection combining scheme, both schemes multiply the
decision variable of the relay link by a scale factor to combat the error
propagation phenomenon. Average bit-error rate (ABER) expressions of the two
proposed WSC schemes are derived in closed-form and verified by simulation
results. For the second WSC scheme, an asymptotic ABER expression and the
diversity order are derived to gain more insight into this scheme. Moreover, it is
demonstrated that both WSC schemes can overcome the extra noise amplification
induced by the link adaptive relaying scheme. The first WSC scheme is slightly
inferior to the second one, which has a higher complexity. Both proposed WSC
schemes outperform the conventional selection combining scheme.
Yi Lou, Julian Cheng, Yan Zheng, Honglin Zhao
Comments: 12 pages, 3 figures, submitted for possible publication
Subjects: Information Theory (cs.IT)
A novel asymptotic closed-form probability density function (pdf) of the
two-hop (TH) link is derived for a simultaneous wireless information and power
transfer based differential amplify-and-forward system. Based on this pdf,
asymptotic closed-form average bit-error rate expressions are derived for
both the single TH link and the TH link combined with the direct link via a
linear combining scheme. Monte Carlo simulations verify the analytical
expressions.
Batu K. Chalise, Himal A. Suraweera, Gan Zheng, George K. Karagiannidis
Comments: 14 pages, accepted
Subjects: Information Theory (cs.IT)
We propose techniques for optimizing transmit beamforming in a full-duplex
multiple-input-multiple-output (MIMO) wireless-powered communication system,
which consists of two phases. In the first phase, the wireless-powered mobile
station (MS) harvests energy using signals from the base station (BS), whereas
in the second phase, both MS and BS communicate to each other in a full-duplex
mode. When complete instantaneous channel state information (CSI) is available,
the BS beamformer and the time-splitting (TS) parameter of energy harvesting
are jointly optimized in order to obtain the BS-MS rate region. The joint
optimization problem is non-convex, however, a computationally efficient
optimum technique, based upon semidefinite relaxation and line-search, is
proposed to solve the problem. A sub-optimum zero-forcing approach is also
proposed, in which a closed-form solution for the TS parameter is obtained.
When only second-order statistics of the transmit CSI are available, we propose to
maximize the ergodic information rate at the MS, while maintaining the outage
probability at the BS below a certain threshold. An upper bound for the outage
probability is also derived and an approximate convex optimization framework is
proposed for efficiently solving the underlying non-convex problem. Simulations
demonstrate the advantages of the proposed methods over the sub-optimum and
half-duplex ones.
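The role of the time-splitting parameter can be previewed with a deliberately simplified scalar model: a larger harvesting fraction stores more energy but leaves less time to transmit, so the rate peaks at an interior point, motivating the line search. The sketch below uses hypothetical constants and omits the MIMO beamforming and SDR machinery of the paper.

import numpy as np

tau = np.linspace(0.01, 0.99, 99)        # fraction of the block spent harvesting
P_bs, h, eff, noise = 1.0, 0.8, 0.6, 0.01
E = eff * P_bs * abs(h)**2 * tau         # energy harvested at the MS
P_ms = E / (1 - tau)                     # MS transmit power in the remaining time
rate = (1 - tau) * np.log2(1 + P_ms * abs(h)**2 / noise)
print("best tau ~", tau[rate.argmax()])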
Masato Tajima
Comments: 25 pages, 3 figures
Subjects: Information Theory (cs.IT)
Basic properties of a characteristic matrix for a tail-biting convolutional
code are investigated. A tail-biting convolutional code can be regarded as a
linear block code. Since the corresponding scalar generator matrix Gt has a
kind of cyclic structure, an associated characteristic matrix also has a cyclic
structure, from which basic properties of a characteristic matrix are obtained.
Next, using the derived results, we discuss the possibility of trellis
reduction for a given tail-biting convolutional code. There are cases where we
can find a scalar generator matrix Gs equivalent to Gt based on a
characteristic matrix. In this case, if the polynomial generator matrix
corresponding to Gs has been reduced, or can be reduced by using appropriate
transformations, then trellis reduction for the original tail-biting
convolutional code is realized. In many cases, the polynomial generator matrix
corresponding to Gs has a monomial factor in some column and is reduced by
dividing the column by the factor. Note that this transformation corresponds to
cyclically shifting the associated code subsequence (a tail-biting path is
regarded as a code sequence) to the left. Thus if we allow partial cyclic
shifts of a tail-biting path, then trellis reduction is accomplished.
Kumar Vijay Mishra, Yonina C. Eldar
Comments: 5 pages, 5 figures, SampTA 2017 conference
Subjects: Information Theory (cs.IT)
Nowadays, millimeter-wave communication centered at the 60 GHz radio
frequency band is increasingly the preferred technology for near-field
communication since it provides transmission bandwidth that is several GHz
wide. The IEEE 802.11ad standard has been developed for commercial wireless
local area networks in the 60 GHz transmission environment. Receivers designed
to process IEEE 802.11ad waveforms employ very high rate analog-to-digital
converters, and therefore, reducing the receiver sampling rate can be useful.
In this work, we study the problem of low-rate channel estimation over the IEEE
802.11ad 60 GHz communication link by harnessing sparsity in the channel
impulse response. In particular, we focus on single carrier modulation and
exploit the special structure of the 802.11ad waveform embedded in the channel
estimation field of its single carrier physical layer frame. We examine various
sub-Nyquist sampling methods for this problem and recover the channel using
compressed sensing techniques. Our numerical experiments show the feasibility
of our procedures at rates as low as one-seventh of the Nyquist rate, with
minimal performance deterioration.
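One standard recovery routine of the kind such experiments could use is orthogonal matching pursuit for the sparse channel taps; the sketch below is a generic textbook implementation and does not model the 802.11ad frame structure or the sub-Nyquist front ends studied in the paper.

import numpy as np

def omp(A, y, k):
    # Greedily recover a k-sparse x from y = A x (A: n x N measurement matrix).
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        As = A[:, support]
        coef, *_ = np.linalg.lstsq(As, y, rcond=None)
        residual = y - As @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x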
Ryutaroh Matsumoto
Comments: svjour3.cls, 5 pages, no figure
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
In this note we report two versions of Gilbert-Varshamov type existential
bounds for asymmetric quantum error-correcting codes.