Janez Aleš
Comments: Predictor weights can be provided upon request
Subjects: Neural and Evolutionary Computing (cs.NE)
The architecture of neural Turing machines is differentiable end-to-end and
is trainable with gradient descent methods. Due to their large unfolded depth,
neural Turing machines are hard to train, and because they access the complete
memory linearly, they do not scale. Other architectures have been studied to
overcome these difficulties. In this report we focus on improving the
prediction quality of the original linear-memory architecture on the copy and
repeat-copy tasks. Copy-task predictions on sequences six times longer than
those the neural Turing machine was trained on prove to be highly accurate,
and so do repeat-copy predictions for sequences with twice the repetition
number and twice the sequence length the machine was trained on.
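For concreteness, the copy task can be set up as in the common NTM literature: feed a random binary sequence followed by a delimiter, then ask the network to reproduce the sequence. A minimal sketch of the data generation (the sizes and extra delimiter channel follow the usual convention, not details from this report):

    import numpy as np

    def make_copy_batch(batch_size, seq_len, n_bits=8, rng=np.random):
        # Random binary sequences; one extra input channel carries the
        # end-of-sequence delimiter.
        seq = rng.randint(0, 2, (batch_size, seq_len, n_bits)).astype(np.float32)
        inp = np.zeros((batch_size, 2 * seq_len + 1, n_bits + 1), np.float32)
        inp[:, :seq_len, :n_bits] = seq      # presentation phase
        inp[:, seq_len, n_bits] = 1.0        # delimiter flag
        target = np.zeros((batch_size, 2 * seq_len + 1, n_bits), np.float32)
        target[:, seq_len + 1:, :] = seq     # network must reproduce seq
        return inp, target

Generalization of the kind reported above is probed by training with short seq_len and evaluating with seq_len up to six times larger.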
Anmol Biswas, Sidharth Prasad, Sandip Lashkare, Udayan Ganguly
Comments: 9 page conference paper submitted at IJCNN 2017
Subjects: Neural and Evolutionary Computing (cs.NE)
Spiking Neural Networks (SNN) are closely related to brain-like computation
and lend themselves to hardware implementation, which is enabled by small
networks that give high performance on standard classification problems. In
the literature, typical SNNs are deep and complex in terms of network
structure, weight update rules and learning algorithms, which makes them
difficult to translate into hardware. In this paper, we first develop a simple
2-layered network in software which is competitive with the state of the art
among SNNs on four standard datasets while being more efficient. For example,
it uses 3x fewer neurons, 3.5x fewer synapses and 30x fewer training epochs
for the Fisher Iris classification problem. The efficient network is based on
effective population coding and synapse-neuron co-design. Second, we develop a
computationally efficient (15000x) and accurate (correlation of 0.98) method
to evaluate the performance of the network without standard recognition tests.
Third, we show that the method produces a robustness metric that can be used
to evaluate noise tolerance.
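As an illustration of population coding, the sketch below encodes a scalar with a bank of Gaussian tuning curves; the resulting graded activations can then be converted into spike rates or latencies. The curve shapes and parameter values are illustrative assumptions, not the paper's design:

    import numpy as np

    def population_encode(x, n_neurons=10, lo=0.0, hi=1.0, sigma=0.1):
        # Each neuron responds most strongly near its preferred value;
        # together the bank represents x as a distributed activity pattern.
        centers = np.linspace(lo, hi, n_neurons)
        return np.exp(-((x - centers) ** 2) / (2 * sigma ** 2))

    rates = population_encode(0.37)   # e.g. one normalized Iris feature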
Tong Che, Yanran Li, Athul Paul Jacob, Yoshua Bengio, Wenjie Li
Comments: Under review as a conference paper at ICLR 2017
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Although Generative Adversarial Networks achieve state-of-the-art results on
a variety of generative tasks, they are regarded as highly unstable and prone
to missing modes. We argue that these bad behaviors of GANs are due to the
very particular functional shape of the trained discriminators in
high-dimensional spaces, which can easily make training get stuck or push
probability mass in the wrong direction, towards regions of higher
concentration than the data generating distribution.
We introduce several ways of regularizing the objective, which can
dramatically stabilize the training of GAN models. We also show that our
regularizers help distribute probability mass fairly across the modes of the
data generating distribution during the early phases of training, thus
providing a unified solution to the missing modes problem.
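One concrete regularizer in this spirit pairs the generator with an encoder so that reconstructions of real data are both faithful and plausible to the discriminator, which rewards covering all data modes. A minimal PyTorch sketch (the modules D, G, E, the output conventions and the weights lam1, lam2 are illustrative assumptions, not the paper's exact objective):

    import torch

    def generator_loss(D, G, E, x_real, z, lam1=0.1, lam2=0.1):
        # D is assumed to output probabilities in (0, 1).
        gan_term = -torch.log(D(G(z)) + 1e-8).mean()      # non-saturating GAN loss
        x_rec = G(E(x_real))                              # reconstruct real data
        recon_term = ((x_real - x_rec) ** 2).mean()       # stay close to x_real
        fool_term = -torch.log(D(x_rec) + 1e-8).mean()    # reconstructions look real
        return gan_term + lam1 * recon_term + lam2 * fool_term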
Benjamin Ummenhofer, Huizhong Zhou, Jonas Uhrig, Nikolaus Mayer, Eddy Ilg, Alexey Dosovitskiy, Thomas Brox
Comments: Supplementary material included. Project page: this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper we formulate structure from motion as a learning problem. We
train a convolutional network end-to-end to compute depth and camera motion
from successive, unconstrained image pairs. The architecture is composed of
multiple stacked encoder-decoder networks, the core part being an iterative
network that is able to improve its own predictions. The network estimates not
only depth and motion, but additionally surface normals, optical flow between
the images and confidence of the matching. A crucial component of the approach
is a training loss based on spatial relative differences. Compared to
traditional two-frame structure from motion methods, results are more accurate
and more robust. In contrast to the popular depth-from-single-image networks,
DeMoN learns the concept of matching and, thus, better generalizes to
structures not seen during training.
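A minimal sketch of a loss on spatial relative differences, in the spirit described: compare finite differences of prediction and ground truth at several spacings (NCHW tensors; the exact form and scales used by DeMoN may differ):

    import torch

    def relative_spatial_loss(pred, gt, spacings=(1, 2, 4)):
        # Penalizing differences of neighboring-pixel differences makes the
        # loss sensitive to relative structure rather than absolute values.
        loss = 0.0
        for s in spacings:
            dx = (pred[:, :, :, s:] - pred[:, :, :, :-s]) - (gt[:, :, :, s:] - gt[:, :, :, :-s])
            dy = (pred[:, :, s:, :] - pred[:, :, :-s, :]) - (gt[:, :, s:, :] - gt[:, :, :-s, :])
            loss = loss + dx.abs().mean() + dy.abs().mean()
        return loss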
Shashank Jaiswal, Michel Valstar, Alinda Gillott, David Daley
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Attention Deficit Hyperactivity Disorder (ADHD) and Autism Spectrum Disorder
(ASD) are neurodevelopmental conditions that affect a significant number of
children and adults. Currently, the diagnosis of such disorders is done by
experts who employ standard questionnaires and look for certain behavioural
markers through manual observation. Such diagnostic methods are not only
subjective, difficult to repeat, and costly but also extremely
time-consuming. In this work, we present a novel methodology to aid diagnostic
predictions about the presence/absence of ADHD and ASD by automatic visual
analysis of a person’s behaviour. To do so, we conduct the questionnaires in a
computer-mediated way while recording participants with modern RGBD
(Colour+Depth) sensors. In contrast to previous automatic approaches, which
have focused only on detecting certain behavioural markers, our approach provides a
fully automatic end-to-end system for directly predicting ADHD and ASD in
adults. Using state-of-the-art facial expression analysis based on Dynamic
Deep Learning and 3D analysis of behaviour, we attain classification rates of
96% for the Controls vs. Condition (ADHD/ASD) group and 94% for the Comorbid
(ADHD+ASD) vs. ASD-only group. We show that our system is a potentially
useful, time-saving contribution to the diagnostic field of ADHD and ASD.
Jia Xue, Hang Zhang, Kristin Dana, Ko Nishino
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Material recognition for real-world outdoor surfaces has become increasingly
important for computer vision to support its operation “in the wild.”
Computational surface modeling that underlies material recognition has
transitioned from reflectance modeling using in-lab controlled radiometric
measurements to image-based representations based on internet-mined images of
materials captured in the scene. We propose to take a middle-ground approach
for material recognition that takes advantage of both rich radiometric cues and
flexible image capture. We realize this by developing a framework for
differential angular imaging, where small angular variations in image capture
provide an enhanced appearance representation and significant recognition
improvement. We build a large-scale material database, the Ground Terrain in
Outdoor Scenes (GTOS) database, geared towards real use for autonomous agents.
The database consists of over 30,000 images covering 40 classes of outdoor
ground terrain under varying weather and lighting conditions. We develop a
novel approach for material recognition called a Differential Angular Imaging
Network (DAIN) to fully leverage this large dataset. With this novel network
architecture, we extract characteristics of materials encoded in the angular
and spatial gradients of their appearance. Our results show that DAIN achieves
recognition performance that surpasses methods using single views or coarsely
quantized multiview images. These results demonstrate the effectiveness of differential
angular imaging as a means for flexible, in-place material recognition.
Yu-Chuan Su, Dinesh Jayaraman, Kristen Grauman
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We introduce the novel task of Pano2Vid: automatic cinematography in
panoramic 360° videos. Given a 360° video, the goal is to
direct an imaginary camera to virtually capture natural-looking normal
field-of-view (NFOV) video. By selecting “where to look” within the panorama at
each time step, Pano2Vid aims to free both the videographer and the end viewer
from the task of determining what to watch. Towards this goal, we first compile
a dataset of 360° videos downloaded from the web, together with
human-edited NFOV camera trajectories to facilitate evaluation. Next, we
propose AutoCam, a data-driven approach to solve the Pano2Vid task. AutoCam
leverages NFOV web video to discriminatively identify space-time “glimpses” of
interest at each time instant, and then uses dynamic programming to select
optimal human-like camera trajectories. Through experimental evaluation on
multiple newly defined Pano2Vid performance measures against several baselines,
we show that our method successfully produces informative videos that could
conceivably have been captured by human videographers.
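The trajectory-selection step can be pictured as a Viterbi-style dynamic program over discretized viewing directions. A minimal sketch, assuming per-time-step capture-worthiness scores and a constant penalty for switching direction (AutoCam's actual scoring and smoothness model are richer):

    import numpy as np

    def best_trajectory(scores, switch_cost=1.0):
        # scores[t, d]: how capture-worthy direction d looks at time t.
        T, D = scores.shape
        dp = scores[0].copy()                 # best score ending at each d
        back = np.zeros((T, D), dtype=int)
        same = np.arange(D)[:, None] == np.arange(D)[None, :]
        for t in range(1, T):
            trans = dp[:, None] - switch_cost * (~same)   # (prev, cur) scores
            back[t] = trans.argmax(axis=0)
            dp = trans.max(axis=0) + scores[t]
        path = [int(dp.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t, path[-1]]))
        return path[::-1]                     # one direction per time step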
Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, Ruslan Salakhutdinov
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
This paper proposes a deep learning architecture based on Residual Network
that dynamically adjusts the number of executed layers for the regions of the
image. This architecture is end-to-end trainable, deterministic and
problem-agnostic. It is therefore applicable without any modifications to a
wide range of computer vision problems such as image classification, object
detection and image segmentation. We present experimental results showing that
this model improves the computational efficiency of Residual Networks on the
challenging ImageNet classification and COCO object detection datasets.
Additionally, we evaluate the computation time maps on the visual saliency
dataset cat2000 and find that they correlate surprisingly well with human eye
fixation positions.
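A minimal sketch of the underlying idea: attach a small halting head to each residual block and stop updating spatial positions whose cumulative halting score has crossed a threshold. The module shapes, the threshold, and the training of the halting scores are illustrative assumptions, not the paper's exact mechanism:

    import torch

    def adaptive_residual_forward(x, blocks, halt_heads, threshold=0.99):
        # blocks: residual blocks; halt_heads: e.g. 1x1 convolutions that map
        # features to a single halting logit per spatial position.
        cum = torch.zeros(x.size(0), 1, x.size(2), x.size(3), device=x.device)
        for block, head in zip(blocks, halt_heads):
            active = (cum < threshold).float()   # positions still computing
            x = x + active * block(x)            # halted positions stay frozen
            cum = cum + active * torch.sigmoid(head(x))
        return x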
Frank Michel, Alexander Kirillov, Eric Brachmann, Alexander Krull, Stefan Gumhold, Bogdan Savchynskyy, Carsten Rother
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper addresses the task of estimating the 6D pose of a known 3D object
from a single RGB-D image. Most modern approaches solve this task in three
steps: i) Compute local features; ii) Generate a pool of pose-hypotheses; iii)
Select and refine a pose from the pool. This work focuses on the second step.
While all existing approaches generate the hypotheses pool via local reasoning,
e.g. RANSAC or Hough-voting, we are the first to show that global reasoning is
beneficial at this stage. In particular, we formulate a novel fully-connected
Conditional Random Field (CRF) that outputs a very small number of
pose-hypotheses. Although the potential functions of the CRF are non-Gaussian,
we give a new and efficient two-step optimization procedure with some
guarantees of optimality. We utilize our global hypothesis generation
procedure to produce results that exceed the state of the art on the
challenging “Occluded Object Dataset”.
Wim Abbeloos, Toon Goedemé
Comments: Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics
Journal-ref: Proceedings of the International Conference on Informatics in
Control, Automation and Robotics (2013) 464-470
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Combining new, low-cost thermal infrared and time-of-flight range sensors
provides new opportunities. In this position paper we explore the possibilities
of combining these sensors and using their fused data for person detection. The
proposed calibration approach for this sensor combination differs from the
traditional stereo camera calibration in two fundamental ways. A first
distinction is that the spectral sensitivity of the two sensors differs
significantly. In fact, there is no sensitivity range overlap at all. A second
distinction is that their resolution is typically very low, which requires
special attention. We assume a situation in which the sensors’ relative
position is known, but their orientation is unknown. In addition, some of the
typical measurement errors are discussed, and methods to compensate for them
are proposed. We discuss how the fused data could allow increased accuracy and
robustness without the need for complex algorithms requiring large amounts of
computational power and training data.
Matthias Faes, Wim Abbeloos, Frederik Vogeler, Hans Valkenaers, Kurt Coppens, Toon Goedemé, Eleonora Ferraris
Comments: International Conference on Polymers and Moulds Innovations(PMI) 2014
Journal-ref: Conference Proceedings PMI 6 (2014) 363-367
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Extrusion based 3D Printing (E3DP) is an Additive Manufacturing (AM)
technique that extrudes thermoplastic polymer in order to build up components
using a layerwise approach. However, AM typically requires long production times
in comparison to mass-production processes such as Injection Molding. Failures
during the AM process are often only noticed after build completion and
frequently lead to part rejection because of dimensional inaccuracy or lack of
mechanical performance, resulting in an important loss of time and material. A
solution to improve the accuracy and robustness of a manufacturing technology
is the integration of sensors to monitor and control process state-variables
online. In this way, errors can be rapidly detected and possibly compensated at
an early stage. To achieve this, we integrated a modular 2D laser triangulation
scanner into an E3DP machine and analyzed feedback signals. A 2D laser
triangulation scanner was selected here owing to the very compact size,
achievable accuracy and the possibility of capturing geometrical 3D data. Thus,
our implemented system is able to provide both quantitative and qualitative
information. Also, in this work, first steps towards the development of a
quality control loop for E3DP processes are presented and opportunities are
discussed.
Stef Van Wolputte, Wim Abbeloos, Stijn Helsen, Abdellatif Bey-Temsamani, Toon Goedemé
Comments: 2015 International Conference on Image Processing Theory, Tools and Applications (IPTA)
Journal-ref: Proceedings of the International Conference on Image Processing
Theory, Tools and Applications (2015) 543-549
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper we propose a low-cost high-speed line scan imaging system. We
replace an expensive industrial line scan camera and illumination with a
custom-built set-up of cheap off-the-shelf components, yielding a measurement
system of comparable quality while costing about 20 times less. We use a
low-cost linear (1D) image sensor, cheap optics including LED-based or
laser-based lighting, and an embedded platform to process the images. A
step-by-step method to design such a custom high-speed imaging system and to
select proper components is proposed. Simulations that predict the final
image quality obtained by the set-up have been developed. Finally, we
applied our method in a lab setting closely representing real-life cases. Our
results show that our simulations are very accurate and that our low-cost line
scan set-up acquires image quality comparable to that of a high-end commercial
vision system, at a fraction of the price.
Enrique Sánchez-Lozano, Georgios Tzimiropoulos, Brais Martinez, Fernando De la Torre, Michel Valstar
Comments: Manuscript submitted to review at TPAMI, extending ECCV 2012 and ECCV 2016 papers on Continuous Regression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Linear regression is a fundamental building block in many face detection and
tracking algorithms, typically used to predict shape displacements from image
features through a linear mapping. This paper presents a Functional Regression
solution to the least squares problem, which we coin Continuous Regression,
resulting in the first real-time incremental face tracker. Contrary to prior
work in Functional Regression, in which B-splines or Fourier series were used,
we propose to approximate the input space by its first-order Taylor expansion,
yielding a closed-form solution for the continuous domain of displacements. We
then extend the continuous least squares problem to correlated variables, and
demonstrate the generalisation of our approach. We incorporate Continuous
Regression into the cascaded regression framework, and show its computational
benefits for both training and testing. We then present a fast approach for
incremental learning within Cascaded Continuous Regression, coined iCCR, and
show that its complexity allows real-time face tracking, being 20 times faster
than the state of the art. To the best of our knowledge, this is the first
incremental face tracker that is shown to operate in real-time. We show that
iCCR achieves state-of-the-art performance on the 300-VW dataset, the most
recent large-scale benchmark for face tracking.
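To sketch the closed form that the first-order Taylor expansion buys (a reading of the abstract, not the paper's exact derivation): with the features of a perturbed shape approximated as x(s + δs) ≈ x(s) + J δs, where J is the feature Jacobian, the continuous least-squares problem

    \min_R \int p(\delta s)\, \| \delta s - R\, x(s + \delta s) \|^2 \, d(\delta s)

becomes an expected least-squares problem in a = x + J δs, whose minimizer is

    R^{*} = (\mu x^\top + \Sigma J^\top)\,(x x^\top + x \mu^\top J^\top + J \mu x^\top + J \Sigma J^\top)^{-1},

with \mu = E[\delta s] and \Sigma = E[\delta s\, \delta s^\top] the first and second moments of the displacement distribution p.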
Itamar Talmi, Roey Mechrez, Lihi Zelnik-Manor
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a novel measure for template matching named Deformable Diversity
Similarity, based on the diversity of feature matches between a target image
window and the template. We rely on both local appearance and geometric
information that jointly lead to a powerful approach for matching. Our key
contribution is a similarity measure that is robust to complex deformations,
significant background clutter, and occlusions. Empirical evaluation on the
most up-to-date benchmark shows that our method outperforms the current
state-of-the-art in detection accuracy while improving computational
complexity.
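A minimal sketch of the diversity idea: match every patch of the candidate window to its nearest template patch and score by how many distinct template patches are hit, so many window patches collapsing onto one template patch score low. The full DDIS measure additionally weights matches by their spatial deformation, which this sketch omits:

    import numpy as np

    def diversity_score(window_feats, template_feats):
        # window_feats, template_feats: (n_patches, feat_dim) arrays.
        d = ((window_feats[:, None, :] - template_feats[None, :, :]) ** 2).sum(-1)
        nearest = d.argmin(axis=1)            # NN template patch per window patch
        return len(set(nearest.tolist()))     # number of distinct matches

    score = diversity_score(np.random.rand(64, 32), np.random.rand(64, 32))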
Roey Mechrez, Eli Shechtman, Lihi Zelnik-Manor
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Have you ever taken a picture only to find out that an unimportant background
object ended up being overly salient? Or one of those team sports photos where
your favorite player blends with the rest? Wouldn’t it be nice if you could
tweak these pictures just a little bit so that the distractor would be
attenuated and your favorite player would stand out among her peers?
Manipulating images in order to control the saliency of objects is the goal of
this paper. We propose an approach that considers the internal color and
saliency properties of the image. It changes the saliency map via an
optimization framework that relies on patch-based manipulation using only
patches from within the same image to achieve realistic looking results.
Applications include object enhancement, distractor attenuation and background
decluttering. Comparing our method to previous ones shows significant
improvement, both in the achieved saliency manipulation and in the realistic
appearance of the resulting images.
Wim Abbeloos, Toon Goedemé
Comments: VII International Conference on Electrical Engineering FIE 2014, Santiago de Cuba
Journal-ref: Proceedings Conferencia Internacional de Ingeniería Eléctrica
7 (2014) 1-4
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Detecting people in images is a challenging problem. Differences in pose,
clothing and lighting, along with other factors, cause a lot of variation in
their appearance. To overcome these issues, we propose a system based on fused
range and thermal infrared images. These measurements show considerably less
variation and provide more meaningful information. We provide a brief
introduction to the sensor technology used and propose a calibration method.
Several data fusion algorithms are compared and their performance is assessed
on a simulated data set. The results of initial experiments on real data are
analyzed and the measurement errors and the challenges they present are
discussed. The resulting fused data are used to efficiently detect people in a
fixed camera set-up. The system is extended to include person tracking.
Seungjun Nah, Tae Hyun Kim, Kyoung Mu Lee
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Non-uniform blind deblurring for general dynamic scenes is a challenging
computer vision problem since blurs are caused by camera shake, scene depth as
well as multiple object motions. To remove these complicated motion blurs,
conventional energy-optimization based methods rely on simplifying
assumptions, such as the blur kernel being partially uniform or locally
linear. Moreover, recent machine-learning based methods also depend on
synthetic blur datasets generated under these assumptions. Consequently,
conventional deblurring methods fail to remove blurs where the blur kernel is
difficult to approximate or parameterize (e.g. object motion boundaries). In
this work, we propose a multi-scale convolutional neural network that restores
blurred images caused by various sources in an end-to-end manner. Furthermore,
we present a multi-scale loss function that mimics conventional coarse-to-fine
approaches. Moreover, we propose a new large-scale dataset that provides pairs
of realistic blurry images and the corresponding ground-truth sharp images
obtained with a high-speed camera. With the proposed model trained on this
dataset, we demonstrate empirically that our method achieves state-of-the-art
performance in dynamic scene deblurring, not only qualitatively but also
quantitatively.
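A minimal sketch of such a multi-scale loss: each scale's restored image is compared with a correspondingly resized sharp image, so coarse scales learn global motion while fine scales recover detail (the pyramid construction and weighting are illustrative assumptions):

    import torch.nn.functional as F

    def multi_scale_loss(preds, sharp):
        # preds: list of restored images at increasing resolutions (NCHW).
        loss = 0.0
        for pred in preds:
            gt = F.interpolate(sharp, size=pred.shape[-2:], mode='bilinear',
                               align_corners=False)
            loss = loss + F.mse_loss(pred, gt)
        return loss / len(preds)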
Dwarikanath Mahapatra
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Medical image segmentation requires consensus ground truth segmentations to
be derived from multiple expert annotations. A novel approach is proposed that
obtains consensus segmentations from experts using graph cuts (GC) and
semi-supervised learning (SSL). Popular approaches use iterative Expectation
Maximization (EM) to estimate the final annotation and quantify annotator’s
performance. Such techniques pose the risk of getting trapped in local minima.
We propose a self-consistency (SC) score to quantify annotator consistency
using low level image features. SSL is used to predict missing annotations by
considering global features and local image consistency. The SC score also
serves as the penalty cost in a second order Markov random field (MRF) cost
function optimized using graph cuts to derive the final consensus label. Graph
cuts obtain a global optimum without an iterative procedure. Experimental
results on synthetic images, real data of Crohn’s disease patients and retinal
images show our final segmentation to be accurate and more consistent than
competing methods.
Shayan Modiri Assari, Haroon Idrees, Mubarak Shah
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper addresses the problem of human re-identification across
non-overlapping cameras in crowds. Re-identification in crowded scenes is a
challenging problem due to the large number of people and frequent occlusions,
coupled with changes in their appearance due to different properties and
exposure of cameras. To solve this problem, we model multiple Personal, Social
and Environmental (PSE) constraints on human motion across cameras. The
personal constraints include appearance and preferred speed of each individual
assumed to be similar across the non-overlapping cameras. The social influences
(constraints) are quadratic in nature, i.e. occur between pairs of individuals,
and modeled through grouping and collision avoidance. Finally, the
environmental constraints capture the transition probabilities between gates
(entrances / exits) in different cameras, defined as multi-modal distributions
of transition time and destination between all pairs of gates. We incorporate
these constraints into an energy minimization framework for solving human
re-identification. Assigning one-to-one correspondences while modeling PSE
constraints is NP-hard. We present a stochastic local search algorithm to
restrict the search space of hypotheses, and obtain a one-to-one solution in the
presence of linear and quadratic PSE constraints. Moreover, we present an
alternate optimization using the Frank-Wolfe algorithm that solves the convex
approximation of the objective function with linear relaxation on binary
variables, and yields an order of magnitude speed up over stochastic local
search with a minor drop in performance. We evaluate our approach using
Cumulative Matching Curves as well as one-to-one assignment on several thousand frames
of Grand Central, PRID and DukeMTMC datasets, and obtain significantly better
results compared to existing re-identification methods.
Aditya Balu, Kin Gwn Lore, Gavin Young, Adarsh Krishnamurthy, Soumik Sarkar
Comments: 9 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Deep 3D Convolutional Neural Networks (3D-CNN) are traditionally used for
object recognition, video data analytics and human gesture recognition. In this
paper, we present a novel application of 3D-CNNs in understanding
difficult-to-manufacture features from computer-aided design (CAD) models to
develop a decision support tool for cyber-enabled manufacturing. Traditionally,
design for manufacturability (DFM) rules are hand-crafted and used to
accelerate the engineering product design cycle by integrating
manufacturability analysis during the design stage. Such a practice relies on
the experience and training of the designer to create a complex component that
is manufacturable. However, even after careful design, the inclusion of certain
features might cause the part to be non-manufacturable. In this paper, we
develop a framework using Deep 3D-CNNs to learn salient features from a CAD
model of a mechanical part and determine if the part can be manufactured or
not. CAD models of different manufacturable and non-manufacturable parts are
generated using a solid modeling kernel and then converted into 3D voxel data
using a fast GPU-accelerated voxelization algorithm. The voxel data is used to
train a 3D-CNN model for manufacturability classification. Feature space and
filter visualization is also performed to understand the learning capability in
the context of manufacturability features. We demonstrate that the proposed
3D-CNN based DFM framework is able to learn the DFM rules for
non-manufacturable features without a human prior. The framework can be
extended to identify a large variety of difficult-to-manufacture features at
multiple spatial scales leading to a real-time decision support system for DFM.
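A minimal sketch of the classification stage, assuming parts have already been voxelized into a binary occupancy grid (the layer sizes and the 64^3 resolution are illustrative, not the paper's architecture):

    import torch
    import torch.nn as nn

    model = nn.Sequential(                      # input: (batch, 1, 64, 64, 64)
        nn.Conv3d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
        nn.Conv3d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        nn.Linear(32, 2),                       # manufacturable vs. not
    )
    logits = model(torch.zeros(4, 1, 64, 64, 64))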
Yun Liu, Ming-Ming Cheng, Xiaowei Hu, Kai Wang, Xiang Bai
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we propose an accurate edge detector using richer
convolutional features (RCF). Since objects in natural images have various
scales and aspect ratios, the rich hierarchical representations automatically
learned by CNNs are critical and effective for detecting edges and object
boundaries. Moreover, the convolutional features gradually become coarser as
receptive fields increase. Based on these observations, our proposed network
architecture makes full use of multiscale and multi-level information to
perform image-to-image edge prediction by combining all of the useful
convolutional features into a holistic framework. It is the first attempt to
adopt such rich convolutional features in computer vision tasks. Using the
VGG16 network, we achieve state-of-the-art results on several available
datasets. When evaluating on the well-known BSDS500 benchmark, we achieve an
ODS F-measure of 0.811 while retaining a fast speed (8 FPS). Besides, our fast
version of RCF achieves an ODS F-measure of 0.806 at 30 FPS.
Qinbin Hou, Puneet Kumar Dokania, Daniela Massiceti, Yunchao Wei, Ming-Ming Cheng, Philip Torr
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We consider the task of learning a classifier for semantic segmentation using
weak supervision, in this case, image labels specifying the objects within the
image. Our method uses deep convolutional neural networks (CNNs) and adopts an
Expectation-Maximization (EM) based approach maintaining the uncertainty on
pixel labels. We focus on the following three crucial aspects of the EM based
approach: (i) initialization; (ii) latent posterior estimation (E step) and
(iii) the parameter update (M step). We show that saliency and attention maps
provide good cues for learning an initialization model and allow us to avoid
the bad local minima to which EM methods are otherwise prone. In order to
update the parameters, we propose minimizing the combination of the standard
softmax loss and the KL divergence
between the true latent posterior and the likelihood given by the CNN. We argue
that this combination is more robust to wrong predictions made by the
expectation step of the EM method. We support this argument with empirical and
visual results. We additionally incorporate an approximate
intersection-over-union (IoU) term into the loss function for better parameter
estimation. Extensive experiments and discussions show that: (i) our method is
very simple and intuitive; (ii) requires only image-level labels; and (iii)
consistently outperforms other weakly supervised state-of-the-art methods by
a large margin on the PASCAL VOC 2012 dataset.
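In symbols, a sketch of the described M-step objective (the weighting \lambda and the per-pixel form are assumptions of this sketch): with q(y) the latent posterior estimated in the E step and p_\theta(y|x) the CNN likelihood,

    \mathcal{L}(\theta) = -\sum_i \log p_\theta(\hat{y}_i \mid x_i)
        + \lambda \sum_i \mathrm{KL}\big( q(y_i) \,\|\, p_\theta(y_i \mid x_i) \big),

to which the abstract adds an approximate intersection-over-union term. The softmax term anchors the update to the current labels \hat{y}, while the KL term keeps the update robust to wrong E-step predictions.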
Evan Racah, Christopher Beckham, Tegan Maharaj, Prabhat, Christopher Pal
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
The detection and identification of extreme weather events in large scale
climate simulations is an important problem for risk management, informing
governmental policy decisions and advancing our basic understanding of the
climate system. Recent work has shown that fully supervised convolutional
neural networks (CNNs) can yield acceptable accuracy for classifying well-known
types of extreme weather events when large amounts of labeled data are
available. However, there are many different types of spatially localized
climate patterns of interest (including hurricanes, extra-tropical cyclones,
weather fronts, blocking events, etc.) found in simulation data for which
labeled data is not available at large scale for all simulations of interest.
We present a multichannel spatiotemporal encoder-decoder CNN architecture for
semi-supervised bounding box prediction and exploratory data analysis. This
architecture is designed to fully model multi-channel simulation data, temporal
dynamics and unlabelled data within a reconstruction and prediction framework
so as to improve the detection of a wide range of extreme weather events. Our
architecture can be viewed as a 3D convolutional autoencoder with an additional
modified one-pass bounding box regression loss. We demonstrate that our
approach is able to leverage temporal information and unlabelled data to
improve localization of extreme weather events. Further, we explore the
representations learned by our model in order to better understand this
important data, and facilitate further work in understanding and mitigating the
effects of climate change.
Ji Feng, Qingsheng Zhu, Jinlong Huang, Lijun Yang
Comments: 10 pages, 2 figures, 2 tables
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)
Various kinds of k-nearest-neighbor (KNN) based classification methods are
the basis of many well-established and high-performance pattern-recognition
techniques, but they are all vulnerable to the choice of their parameter.
Essentially, the challenge is to detect the neighborhood structure of various
data sets while remaining ignorant of the data characteristics. This article
introduces a new supervised classification method, the extended natural
neighbor (ENaN) method, and shows that it provides better classification
results without choosing the neighborhood parameter artificially. Unlike the
original KNN-based methods, which need a prior k, the ENaN method predicts a
different k at different stages. Therefore, the ENaN method is able to learn
more from flexible neighbor information in both the training and testing
stages, and provide better classification results.
Armando Vieira
Subjects: Artificial Intelligence (cs.AI)
Knowledge Graphs (KG) constitute a flexible representation of complex
relationships between entities particularly useful for biomedical data. These
KGs, however, are very sparse, with many missing edges (facts), and the
visualisation of the mesh of interactions is nontrivial. Here we apply a
compositional model to embed nodes and relationships into a vectorised semantic
space to perform graph completion. A visualisation tool based on Convolutional
Neural Networks and Self-Organised Maps (SOM) is proposed to extract high-level
insights from the KG. We apply this technique to a subset of CTD, containing
interactions of compounds with human genes/proteins, and show that the
performance is comparable to that obtained by structural models.
Marco F. Cusumano-Towner, Vikash K. Mansinghka
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
A key limitation of sampling algorithms for approximate inference is that it
is difficult to quantify their approximation error. Widely used sampling
schemes, such as sequential importance sampling with resampling and
Metropolis-Hastings, produce output samples drawn from a distribution that may
be far from the target posterior distribution. This paper shows how to
upper-bound the symmetric KL divergence between the output distribution of a
broad class of sequential Monte Carlo (SMC) samplers and their target posterior
distributions, subject to assumptions about the accuracy of a separate
gold-standard sampler. The proposed method applies to samplers that combine
multiple particles, multinomial resampling, and rejuvenation kernels. The
experiments show the technique being used to estimate bounds on the divergence
of SMC samplers for posterior inference in a Bayesian linear regression model
and a Dirichlet process mixture model.
Shuai Ma, Jiayuan Yu
Comments: 23 pages, 4 figures
Subjects: Artificial Intelligence (cs.AI)
This paper studies Value-at-Risk problems in finite-horizon Markov decision
processes (MDPs) with finite state space and two forms of reward function.
Firstly, we study the effect of the reward function on two criteria in a
short-horizon MDP. Secondly, for long-horizon MDPs, we estimate the total
reward distribution in a finite-horizon Markov chain (MC) with the help of
spectral theory and the central limit theorem, and present a transformation
algorithm for the MCs with a three-argument reward function and a salvage
reward.
Yaron Meirovitch, Alexander Matveev, Hayk Saribekyan, David Budden, David Rolnick, Gergely Odor, Seymour Knowles-Barley, Thouis Raymond Jones, Hanspeter Pfister, Jeff William Lichtman, Nir Shavit
Comments: 18 pages, 10 figures
Subjects: Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)
The field of connectomics faces unprecedented “big data” challenges. To
reconstruct neuronal connectivity, automated pixel-level segmentation is
required for petabytes of streaming electron microscopy data. Existing
algorithms provide relatively good accuracy but are unacceptably slow, and
would require years to extract connectivity graphs from even a single cubic
millimeter of neural tissue. Here we present a viable real-time solution, a
multi-pass pipeline optimized for shared-memory multicore systems, capable of
processing data at near the terabyte-per-hour pace of multi-beam electron
microscopes. The pipeline makes an initial fast-pass over the data, and then
makes a second slow-pass to iteratively correct errors in the output of the
fast-pass. We demonstrate the accuracy of a sparse slow-pass reconstruction
algorithm and suggest new methods for detecting morphological errors. Our
fast-pass approach posed many algorithmic challenges, including the design
and implementation of novel shallow convolutional neural nets and the
parallelization of watershed and object-merging techniques. We use it to
reconstruct, from image stack to skeletons, the full dataset of Kasthuri et al.
(463 GB capturing 120,000 cubic microns) in a matter of hours on a single
multicore machine rather than the weeks it has taken in the past on much larger
distributed systems.
Francisco Raposo, David Martins de Matos, Ricardo Ribeiro
Comments: 5 pages, 1 table
Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Sound (cs.SD)
Applying generic media-agnostic summarization to music allows for higher
efficiency in automatic processing, storage, and communication of datasets
while also alleviating copyright issues. This process has already been proven
useful in the context of music genre classification. In this paper, we
generalize conclusions from previous work by evaluating the impact of generic
summarization of music from a probabilistic perspective, agnostic of
particular tasks. We estimate Gaussian distributions for original and
summarized songs and compute their relative entropy to measure how much
information is lost in the summarization process. Based on this observation, we
further propose a simple yet expressive summarization method that objectively
outperforms previous methods and is better suited to avoid copyright issues. We
present results suggesting that relative entropy is a good predictor of
summarization performance in the context of tasks relying on a bag-of-features
assumption.
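For reference, the relative entropy between two fitted Gaussians has a closed form; a small sketch, where fitting each song version with its empirical mean and covariance is an assumption of the illustration:

    import numpy as np

    def gaussian_kl(mu0, cov0, mu1, cov1):
        # KL( N(mu0, cov0) || N(mu1, cov1) ), in nats.
        k = mu0.shape[0]
        inv1 = np.linalg.inv(cov1)
        diff = mu1 - mu0
        return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - k
                      + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

    def song_kl(feats_orig, feats_summary):
        # feats_*: (n_frames, n_features) audio-feature matrices.
        mu0, cov0 = feats_orig.mean(0), np.cov(feats_orig, rowvar=False)
        mu1, cov1 = feats_summary.mean(0), np.cov(feats_summary, rowvar=False)
        return gaussian_kl(mu0, cov0, mu1, cov1)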
Héctor Martínez Alonso, Barbara Plank
Comments: To appear in EACL 2017
Subjects: Computation and Language (cs.CL)
Multitask learning has been applied successfully to a range of tasks, mostly
morphosyntactic. However, little is known about when MTL works and whether
there are data characteristics that help determine the success of MTL. In this
paper we evaluate a range of semantic sequence labeling tasks in a MTL setup.
We examine different auxiliary task configurations, including a novel
setup, and correlate their impact with data-dependent conditions. Our results
show that MTL is not always effective: significant improvements are
obtained only for 1 out of 5 tasks. When successful, auxiliary tasks with
compact and more uniform label distributions are preferable.
Sébastien Bouchard, Marjorie Bournat, Yoann Dieudonné, Swan Dubois, Franck Petit
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
In this paper we study the task of approach of two mobile agents having the
same limited range of vision and moving asynchronously in the plane. This task
consists in getting them in finite time within each other’s range of vision.
The agents execute the same deterministic algorithm and are assumed to have a
compass showing the cardinal directions as well as a unit measure. On the other
hand, they do not share any global coordinate system (like GPS), cannot
communicate and have distinct labels. Each agent knows its label but does not
know the label of the other agent or the initial position of the other agent
relative to its own. The route of an agent is a sequence of segments that are
subsequently traversed in order to achieve approach. For each agent, the
computation of its route depends only on its algorithm and its label. An
adversary chooses the initial positions of both agents in the plane and
controls the way each of them moves along every segment of the routes, in
particular by arbitrarily varying the speeds of the agents. A deterministic
approach algorithm is a deterministic algorithm that always allows two agents
with any distinct labels to solve the task of approach regardless of the
choices and the behavior of the adversary. The cost of a complete execution of
an approach algorithm is the total length of the routes travelled by both
agents until approach is completed. Let Δ and l be the initial distance
separating the agents and the length of the shortest label, respectively.
Assuming that Δ and l are unknown to both agents, does there exist a
deterministic approach algorithm always working at a cost polynomial in Δ and
l? In this paper, we provide a positive answer to the above question by
designing such an algorithm.
Binghong Chen, Jun Zhu
Comments: 7 pages
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Group-Lasso (gLasso) identifies important explanatory factors in predicting
the response variable by considering the grouping structure over input
variables. However, most existing algorithms for gLasso are not scalable to
large datasets, which are becoming the norm in many applications.
In this paper, we present a divide-and-conquer based parallel algorithm
(DC-gLasso) to scale up gLasso in the tasks of regression with grouping
structures. DC-gLasso only needs two iterations to collect and aggregate the
local estimates on subsets of the data, and is provably correct to recover the
true model under certain conditions. We further extend it to deal with
overlaps between groups. Empirical results on a wide range of synthetic and
real-world datasets show that DC-gLasso can significantly improve the time
efficiency without sacrificing regression accuracy.
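A minimal sketch of the two-pass divide-and-conquer idea. The local group-lasso solver is left abstract (the signature solver(X, y, groups) -> coefficient vector is hypothetical), and the aggregate-then-refit rule is a plausible simplification rather than the paper's exact procedure:

    import numpy as np

    def dc_glasso(X, y, groups, solver, n_chunks=8):
        # groups: dict mapping group id -> array of column indices.
        chunks = np.array_split(np.arange(len(y)), n_chunks)
        local = [solver(X[idx], y[idx], groups) for idx in chunks]  # pass 1
        avg = np.mean(local, axis=0)                                # aggregate
        kept = [groups[g] for g in sorted(groups)
                if np.abs(avg[groups[g]]).max() > 1e-8]             # selected groups
        keep = np.concatenate(kept)
        coef = np.zeros(X.shape[1])                                 # pass 2: refit
        coef[keep] = np.linalg.lstsq(X[:, keep], y, rcond=None)[0]
        return coef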
Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang
Comments: Published in ICML 2016. Revised some typos
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Cross-entropy loss together with softmax is arguably one of the most commonly
used supervision components in convolutional neural networks (CNNs). Despite
its simplicity, popularity and excellent performance, the component does not
explicitly encourage discriminative learning of features. In this paper, we
propose a generalized large-margin softmax (L-Softmax) loss which explicitly
encourages intra-class compactness and inter-class separability between learned
features. Moreover, L-Softmax not only can adjust the desired margin but also
can avoid overfitting. We also show that the L-Softmax loss can be optimized by
typical stochastic gradient descent. Extensive experiments on four benchmark
datasets demonstrate that the deeply-learned features with L-softmax loss
become more discriminative, hence significantly boosting the performance on a
variety of visual classification and verification tasks.
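Concretely, writing the class-j logit as \|W_j\| \|x_i\| \cos(\theta_j), the L-Softmax loss replaces the target-class angle term with a margin-scaled version:

    L_i = -\log \frac{ e^{\|W_{y_i}\| \|x_i\| \psi(\theta_{y_i})} }
                     { e^{\|W_{y_i}\| \|x_i\| \psi(\theta_{y_i})} + \sum_{j \neq y_i} e^{\|W_j\| \|x_i\| \cos(\theta_j)} },
    \qquad \psi(\theta) = \cos(m\theta) \ \text{for}\ \theta \in [0, \pi/m],

where the integer m \geq 1 controls the margin (m = 1 recovers plain softmax) and \psi is extended to remain monotonically decreasing on [0, \pi]. Requiring \cos(m\theta_{y_i}) to dominate forces a smaller angle to the correct class, which yields the intra-class compactness and inter-class separability described above.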
Sergey Bartunov, Dmitry P. Vetrov
Comments: Submitted to ICLR 2017
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Despite recent advances, the remaining bottlenecks in deep generative models
are the necessity of extensive training and difficulty generalizing from a
small number of training examples. Both problems may be addressed by
conditional generative models that are trained to adapt the generative
distribution to additional input data. So far this idea was explored only under
certain limitations such as restricting the input data to be a single object or
multiple objects representing the same concept. In this work we develop a new
class of deep generative models, called generative matching networks, which is
inspired by the recently proposed matching networks for one-shot learning in
discriminative tasks and the ideas from meta-learning. By conditioning on the
additional input dataset, generative matching networks may instantly learn new
concepts that were not available during the training but conform to a similar
generative process, without explicit limitations on the number of additional
input objects or the number of concepts they represent. Our experiments on the
Omniglot dataset demonstrate that generative matching networks can
significantly improve predictive performance on the fly as more additional data
is available to the model and also adapt the latent space which is beneficial
in the context of feature extraction.
Nir Baram, Oron Anschel, Shie Mannor
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Generative adversarial learning is a popular new approach to training
generative models which has been proven successful for other related problems
as well. The general idea is to maintain an oracle D that discriminates
between the expert’s data distribution and that of the generative model G.
The generative model is trained to capture the expert’s distribution by
maximizing the probability of D misclassifying the data it generates.
Overall, the system is differentiable end-to-end and is trained using
basic backpropagation. This type of learning was successfully applied to the
problem of policy imitation in a model-free setup. However, a model-free
approach does not allow the system to be differentiable, which requires the use
of high-variance gradient estimations. In this paper we introduce the
Model-based Adversarial Imitation Learning (MAIL) algorithm, a model-based
approach to the problem of adversarial imitation learning. We show how to use
a forward model to make the system fully differentiable, which enables us to
train policies using the (stochastic) gradient of D. Moreover, our approach
requires relatively few environment interactions and fewer hyper-parameters to
tune. We test our method on the MuJoCo physics simulator and report initial
results that surpass the current state-of-the-art.
Marco F. Cusumano-Towner, Vikash K. Mansinghka
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
A key limitation of sampling algorithms for approximate inference is that it
is difficult to quantify their approximation error. Widely used sampling
schemes, such as sequential importance sampling with resampling and
Metropolis-Hastings, produce output samples drawn from a distribution that may
be far from the target posterior distribution. This paper shows how to
upper-bound the symmetric KL divergence between the output distribution of a
broad class of sequential Monte Carlo (SMC) samplers and their target posterior
distributions, subject to assumptions about the accuracy of a separate
gold-standard sampler. The proposed method applies to samplers that combine
multiple particles, multinomial resampling, and rejuvenation kernels. The
experiments show the technique being used to estimate bounds on the divergence
of SMC samplers for posterior inference in a Bayesian linear regression model
and a Dirichlet process mixture model.
Niek Tax, Ilya Verenich, Marcello La Rosa, Marlon Dumas
Subjects: Applications (stat.AP); Databases (cs.DB); Learning (cs.LG); Machine Learning (stat.ML)
Predictive business process monitoring methods exploit logs of completed
cases of a process in order to make predictions about running cases thereof.
Existing methods in this space are tailor-made for specific prediction tasks.
Moreover, their relative accuracy is highly sensitive to the dataset at hand,
thus requiring users to engage in trial-and-error and tuning when applying them
in a specific setting. This paper investigates Long Short-Term Memory (LSTM)
neural networks as an approach to build consistently accurate models for a wide
range of predictive process monitoring tasks. First, we show that LSTMs
outperform existing techniques to predict the next event of a running case and
its timestamp. Next, we show how to use models for predicting the next task in
order to predict the full continuation of a running case. Finally, we apply the
same approach to predict the remaining time, and show that this approach
outperforms existing tailor-made methods.
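A minimal sketch of such a model: a single LSTM over encoded event prefixes with two heads, one classifying the next activity and one regressing the time until it (the feature encoding and sizes are illustrative assumptions):

    import torch.nn as nn

    class NextEventModel(nn.Module):
        def __init__(self, n_activities, hidden=64):
            super().__init__()
            # input per event: one-hot activity plus one elapsed-time feature
            self.lstm = nn.LSTM(n_activities + 1, hidden, batch_first=True)
            self.next_act = nn.Linear(hidden, n_activities)
            self.next_dt = nn.Linear(hidden, 1)

        def forward(self, x):                  # x: (batch, prefix_len, feats)
            out, _ = self.lstm(x)
            h = out[:, -1]                     # state after last seen event
            return self.next_act(h), self.next_dt(h)

Predicting the full continuation of a running case then amounts to feeding each predicted event back into the model autoregressively until an end-of-case symbol is produced.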
Yu Lu, Harrison H. Zhou
Subjects: Statistics Theory (math.ST); Learning (cs.LG); Machine Learning (stat.ML)
Clustering is a fundamental problem in statistics and machine learning.
Lloyd’s algorithm, proposed in 1957, is still possibly the most widely used
clustering algorithm in practice due to its simplicity and empirical
performance. However, there has been little theoretical investigation on the
statistical and computational guarantees of Lloyd’s algorithm. This paper is an
attempt to bridge this gap between practice and theory. We investigate the
performance of Lloyd’s algorithm on clustering sub-Gaussian mixtures. Under an
appropriate initialization for labels or centers, we show that Lloyd’s
algorithm converges to an exponentially small clustering error after on the
order of log n iterations, where n is the sample size. The error rate is shown
to be minimax optimal. For the two-mixture case, we only require the
initializer to be slightly better than a random guess.
In addition, we extend Lloyd’s algorithm and its analysis to community
detection and crowdsourcing, two problems that have received a lot of attention
recently in statistics and machine learning. Two variants of Lloyd’s algorithm
are proposed respectively for community detection and crowdsourcing. On the
theoretical side, we provide statistical and computational guarantees of the
two algorithms, and the results improve upon some previous signal-to-noise
ratio conditions in the literature for both problems. Experimental results on
simulated and real data sets demonstrate competitive performance of our
algorithms to the state-of-the-art methods.
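For reference, the iteration under analysis in a minimal NumPy sketch; per the results above, what matters is the initialization of the centers, which is assumed given:

    import numpy as np

    def lloyd(X, centers, n_iters=20):
        # Alternate: assign points to the nearest center, then recompute means.
        for _ in range(n_iters):
            d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            labels = d.argmin(axis=1)
            for k in range(centers.shape[0]):
                pts = X[labels == k]
                if len(pts) > 0:
                    centers[k] = pts.mean(axis=0)
        return labels, centers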
Xingguo Li, Jarvis Haupt
Comments: 16 pages, 4 figures
Subjects: Information Theory (cs.IT)
This paper examines the problem of locating outlier columns in a large,
otherwise low-rank matrix, in settings where the data are noisy, or where
the overall matrix has missing elements. We propose a randomized two-step
inference framework, and establish sufficient conditions on the required sample
complexities under which these methods succeed (with high probability) in
accurately locating the outliers for each task. Comprehensive numerical
experimental results are provided to verify the theoretical bounds and
demonstrate the computational efficiency of the proposed algorithm.
Muris Sarajlić, Liang Liu, Ove Edfors
Comments: To be published in Proceedings of 50th Asilomar Conference on Signals, Systems and Computers
Subjects: Information Theory (cs.IT)
One of the basic aspects of Massive MIMO (MaMi) that is in the focus of
current investigations is its potential of using low-cost and energy-efficient
hardware. It is often claimed that MaMi will allow for using analog-to-digital
converters (ADCs) with very low resolutions and that this will result in
overall improvement of energy efficiency. In this contribution, we perform a
parametric energy efficiency analysis of MaMi uplink for the entire base
station receiver system with varying ADC resolutions. The analysis shows that,
for a wide variety of system parameters, ADCs with intermediate bit resolutions
(4-10 bits) are optimal in the energy-efficiency sense, and that using very low
bit resolutions results in a degradation of energy efficiency.
Ramakrishna Bandi, Alexandre Fotue Tabue, Edgar Martínez-Moro
Subjects: Information Theory (cs.IT)
Let R be a finite principal ideal ring and S the Galois extension of R
of degree m. For positive integers k and k_0, we determine the number of
free S-linear codes B of length l with the property that k = rank_S(B) and
k_0 = rank_R(B ∩ R^l). This corrects a wrong result which was given in the
case of finite fields.
Ted Hurley
Comments: arXiv admin note: text overlap with arXiv:1205.0703
Subjects: Information Theory (cs.IT)
Orthogonal sets of idempotents are used to design sets of unitary matrices,
known as constellations, such that the modulus of the determinant of the
difference of any two distinct elements is greater than 0. It is shown that
unitary matrices in general are derived from orthogonal sets of idempotents
reducing the design problem to a construction problem of unitary matrices from
such sets. The quality of the constellations constructed in this way and the
actual differences between the unitary matrices can be determined algebraically
from the idempotents used.
This has applications to the design of unitary space time constellations.
A. Agustin, S. Lagen, J. Vidal, O. Muñoz, A. Pascual-Iserte, G. Zhiheng, W. Ronghui
Comments: submitted to IEEE Communications Magazine
Subjects: Information Theory (cs.IT)
Traditionally, wireless cellular systems have been designed to operate in
Frequency Division Duplexing (FDD) paired bands that allocate the same amount
of spectrum to both downlink (DL) and uplink (UL) communication. Such a design
is very convenient under symmetric DL/UL traffic conditions, as was the case
when voice transmission was the predominant service. However, with the
overwhelming advent of data services, bringing along large asymmetries
between DL and UL, the conventional FDD solution becomes inefficient. In this
regard, flexible duplexing concepts aim to derive procedures for improving the
spectrum utilization, by adjusting resources to the actual traffic demand. In
this work we review these concepts and propose the use of unpaired Time
Division Duplexing (TDD) spectrum on the unused resources for small eNBs
(SeNB), so that user equipment (UEs) associated to those SeNB could be served
either in DL or UL. This proposal alleviates the saturated DL in FDD-based
system through user offloading towards the TDD-based system. The flexible
duplexing concept is analyzed from three points of view: a) regulation, b) Long
Term Evolution (LTE) standardization, and c) technical solutions.
Mauro Girotto, Andrea M. Tonello
Comments: A version of this manuscript has been submitted to the IEEE Access for possible publication
Subjects: Information Theory (cs.IT)
This paper considers Electromagnetic Compatibility (EMC) aspects in the
context of Power Line Communication (PLC) systems. It offers a complete
overview of both narrow-band PLC and broadband PLC EMC norms. How to interpret
and translate such norms and measurement procedures into typical constraints
used by designers of communication systems is discussed. In particular, the
constraints to the modulated signal spectrum are considered and the ability of
pulse shaped OFDM (PS-OFDM), used in most of the PLC standards as IEEE P1901
and P1901.2, to fulfill them is analyzed. In addition, aiming to improve the
spectrum management ability, a novel scheme named Pulse Shaped Cyclic Block
Filtered Multitone modulation (PS-CB-FMT) is introduced and compared to
PS-OFDM. It is shown that PS-CB-FMT offers a better ability to fulfill the
norms, which translates into higher system capacity.
Md. Abdul Latif Sarker
Comments: 4
Subjects: Information Theory (cs.IT)
We address the problem of the bit error rate (BER) performance gap between
the sub-optimal and optimal linear precoder (LP) for multiuser (MU)
multiple-input multiple-output (MIMO) broadcast systems. In particular, mobile
users suffer a noise enhancement effect under a sub-optimal LP that can be
suppressed by an optimal LP matrix. A sub-optimal LP matrix such as the linear
zero-forcing (LZF) precoder performs well only in the high signal-to-noise
ratio (SNR) regime; in contrast, an optimal precoder such as the linear
minimum mean-square-error (LMMSE) precoder performs well in both low and high
SNR scenarios. Used on their own in a MU MIMO system, these precoders exhibit
a BER gap of at least 0.1. Thus, we propose and design a unified linear
precoding (ULP) matrix using a precoder selection technique that combines the
sub-optimal and optimal LP matrices for MU MIMO systems to ensure a zero BER
performance gap. The numerical results show that our proposed ULP technique
offers significant performance gains in both low and high SNR scenarios.
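For concreteness, minimal sketches of the two constituent precoders for a K x M channel matrix H (users x base-station antennas); the K/snr loading in the MMSE variant is the standard regularized-ZF choice, and the selection rule that unifies them is the paper's contribution, not shown here:

    import numpy as np

    def lzf_precoder(H):
        # Zero-forcing: cancels inter-user interference, but amplifies
        # noise at low SNR (the noise enhancement effect noted above).
        return H.conj().T @ np.linalg.inv(H @ H.conj().T)

    def lmmse_precoder(H, snr):
        # Regularized ZF / MMSE: the loading term trades interference
        # suppression against noise enhancement, helping at low SNR.
        K = H.shape[0]
        return H.conj().T @ np.linalg.inv(H @ H.conj().T + (K / snr) * np.eye(K))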
Victoria Kostina, Babak Hassibi
Subjects: Information Theory (cs.IT); Systems and Control (cs.SY)
Consider a distributed control problem with a communication channel
connecting the observer of a linear stochastic system to the controller. The
goal of the controller is to minimize a quadratic cost function in the state
variables and control signal, known as the linear quadratic regulator (LQR). We
study the fundamental tradeoff between the communication rate r bits/sec and
the limsup of the expected cost b. In the companion paper, which can be read
independently of the current one, we show a lower bound on a certain cost
function, which quantifies the minimum mutual information between the channel
input and output, given the past, that is compatible with a target LQR cost.
The bound applies as long as the system noise has a probability density
function, and it holds for a general class of codes that can take full
advantage of the memory of the data observed so far and that are not
constrained to have any particular structure. In this paper, we prove that the
bound can be approached by a simple variable-length lattice quantization
scheme, as long as the system noise satisfies a smoothness condition. The
quantization scheme only quantizes the innovation, that is, the difference
between the controller’s belief about the current state and the encoder’s state
estimate. Our proof technique leverages some recent results on nonasymptotic
high resolution vector quantization.
Victoria Kostina, Babak Hassibi
Subjects: Information Theory (cs.IT); Systems and Control (cs.SY)
Consider a distributed control problem with a communication channel
connecting the observer of a linear stochastic system to the controller. The
goal of the controller is to minimize a quadratic cost function in the state
variables and control signal, known as the linear quadratic regulator (LQR). We
study the fundamental tradeoff between the communication rate r bits/sec and
the limsup of the expected cost b. We obtain a lower bound on a certain cost
function, which quantifies the minimum mutual information between the channel
input and output, given the past, that is compatible with a target LQR cost.
The rate-cost function has operational significance in multiple scenarios of
interest: among others, it allows us to lower bound the minimum communication
rate for fixed and variable length quantization, and for control over a noisy
channel. Our results extend and generalize an earlier explicit expression, due
to Tatikonda et al., for the scalar Gaussian case to the vector, non-Gaussian,
and partially observed one. The bound applies as long as the system noise has a
probability density function. Apart from standard dynamic programming
arguments, our proof technique leverages the Shannon lower bound on the
rate-distortion function and proposes new estimates for information measures of
linear combinations of random vectors.
Matthew Kokshoorn, He Chen, Yonghui Li, Branka Vucetic
Comments: Submitted for publication
Subjects: Information Theory (cs.IT)
This paper develops a novel channel estimation approach for multi-user
millimeter wave (mmWave) wireless systems with large antenna arrays. By
exploiting the inherent mmWave channel sparsity, we propose a novel
simultaneous-estimation with iterative fountain training (SWIFT) framework, in
which the average number of channel measurements is adapted to various channel
conditions. To this end, the base station (BS) and each user continue to
measure the channel with a random subset of transmit/receive beamforming
directions until the channel estimate converges. We formulate the channel
estimation process as a compressed sensing problem and apply a sparse
estimation approach to recover the virtual channel information. As SWIFT does
not adapt the BS’s transmitting beams to any single user, we are able to
estimate all user channels simultaneously. Simulation results show that SWIFT
can significantly outperform existing random-beamforming based approaches that
use a fixed number of measurements, over a range of signal-to-noise ratios and
channel coherence times.