Timothée Masquelier
Comments: 10 pages, 5 figures, 1 table
Subjects: Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)
By recording multiple cells simultaneously, electrophysiologists have found
evidence for repeating spatiotemporal spike patterns. In sensory systems in
particular, repeating a sensory sequence typically elicits a reproducible spike
pattern, which carries information about the sensory sequence. How this
information is read out by downstream neurons is unclear. In this theoretical
paper, we investigate to what extent a single cell could detect a given spike
pattern, and what the optimal parameters to do so are, in particular the
membrane time constant (tau). Using a leaky integrate-and-fire (LIF) neuron
with instantaneous synapses and homogeneous Poisson inputs, we computed this
optimum analytically. Our results indicate that a relatively small (tau) (at
most a few tens of ms) is usually optimal, even when the pattern is longer.
This is somewhat surprising as the resulting detector ignores most of the
pattern, due to its fast memory decay. Next, we wondered if
spike-timing-dependent plasticity (STDP) could enable a neuron to reach the
theoretical optimum. We simulated a LIF neuron equipped with additive STDP, and
repeatedly exposed to a given input spike pattern. As in previous studies, the
LIF progressively became selective to the repeating pattern with no
supervision, even when the pattern was embedded in Poisson activity. Here we
show that, using certain STDP parameters, the resulting pattern detector can be
optimal. Taken together, these results may explain how humans can learn
repeating visual or auditory sequences. Long sequences could be recognized
thanks to coincidence detectors working at a much shorter timescale. This is
consistent with the fact that recognition is still possible if a sound sequence
is compressed or played backward, or scrambled using 10 ms bins. Coincidence
detection is a simple yet powerful mechanism, which could be the main function
of neurons in the brain.
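
To make the setting concrete, the following is a minimal toy simulation of the model class analyzed above: a LIF neuron with instantaneous synapses driven by homogeneous Poisson inputs, with one synchronous volley standing in for a learned spike pattern. All parameter values are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, rate = 100, 5.0        # afferents and Poisson rate per afferent (Hz)
tau, w, theta = 0.01, 0.1, 1.0   # membrane time constant (s), synaptic weight, threshold
dt, T = 1e-4, 1.0                # simulation step and duration (s)

v, out_spikes = 0.0, []
for step in range(int(T / dt)):
    v *= np.exp(-dt / tau)                        # exponential leak
    v += w * rng.binomial(n_inputs, rate * dt)    # instantaneous synaptic jumps
    if step == 5000:                              # synchronous volley at t = 0.5 s,
        v += w * 30                               # standing in for the learned pattern
    if v >= theta:
        out_spikes.append(step * dt)
        v = 0.0                                   # reset after an output spike
print(out_spikes)  # background activity alone rarely crosses threshold; the volley does
```

With a small tau, the neuron's memory decays within tens of milliseconds, so it responds to near-coincident input rather than to the full pattern duration, which is the coincidence-detection regime the abstract argues is optimal.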
Zhiguang Wang, Wei Song, Lu Liu, Fan Zhang, Junxiao Xue, Yangdong Ye, Ming Fan, Mingliang Xu
Comments: Submitted to NeuroComputing. arXiv admin note: text overlap with arXiv:1505.04366 by other authors
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We propose a new model based on deconvolutional networks and SAX
discretization to learn representations for multivariate time series.
Deconvolutional networks fully exploit the powerful expressiveness of deep
neural networks in an unsupervised learning setting.
We design a network structure specifically to capture the cross-channel
correlation with deconvolution, forcing the pooling operation to perform the
dimension reduction along each position in the individual channel.
Discretization based on Symbolic Aggregate Approximation is applied to the
feature vectors to further extract a bag of features. We show how this
representation and bag of features help with classification. A full comparison
with the sequence distance based approach is provided to demonstrate the
effectiveness of our approach on standard datasets. We further build a
Markov matrix from the discretized representation obtained from the
deconvolution to visualize the time series as complex networks, which show
more class-specific statistical properties and clearer structures with respect
to different labels.
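
As background, here is a minimal sketch of the SAX discretization step mentioned above (z-normalization, piecewise aggregation, Gaussian breakpoints); the paper applies it to learned feature vectors rather than raw series, and the parameters below are arbitrary.

```python
import numpy as np
from scipy.stats import norm

def sax(series, n_segments=8, alphabet_size=4):
    """Symbolic Aggregate Approximation of a 1-D series."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-8)            # z-normalization
    x = x[: len(x) - len(x) % n_segments]            # trim to a multiple of n_segments
    paa = x.reshape(n_segments, -1).mean(axis=1)     # piecewise aggregate approximation
    # breakpoints chosen so a standard normal falls into each bin equally often
    breakpoints = norm.ppf(np.linspace(0, 1, alphabet_size + 1)[1:-1])
    return "".join(chr(ord("a") + s) for s in np.digitize(paa, breakpoints))

print(sax(np.sin(np.linspace(0, 2 * np.pi, 64))))    # a short symbolic word
```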
Jiajun Wu, Chengkai Zhang, Tianfan Xue, William T. Freeman, Joshua B. Tenenbaum
Comments: NIPS 2016. The first two authors contributed equally to this work
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
We study the problem of 3D object generation. We propose a novel framework,
namely 3D Generative Adversarial Network (3D-GAN), which generates 3D objects
from a probabilistic space by leveraging recent advances in volumetric
convolutional networks and generative adversarial nets. The benefits of our
model are three-fold: first, the use of an adversarial criterion, instead of
traditional heuristic criteria, enables the generator to capture object
structure implicitly and to synthesize high-quality 3D objects; second, the
generator establishes a mapping from a low-dimensional probabilistic space to
the space of 3D objects, so that we can sample objects without a reference
image or CAD models, and explore the 3D object manifold; third, the adversarial
discriminator provides a powerful 3D shape descriptor which, learned without
supervision, has wide applications in 3D object recognition. Experiments
demonstrate that our method generates high-quality 3D objects, and our
unsupervisedly learned features achieve impressive performance on 3D object
recognition, comparable with those of supervised learning methods.
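
For intuition, here is a schematic volumetric generator in the spirit of 3D-GAN, mapping a 200-d latent vector to a 64^3 occupancy grid; the layer sizes are our guess, loosely following the abstract, and not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class Generator3D(nn.Module):
    def __init__(self, z_dim=200):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose3d(z_dim, 256, 4, stride=1), nn.BatchNorm3d(256), nn.ReLU(),           # 1 -> 4^3
            nn.ConvTranspose3d(256, 128, 4, stride=2, padding=1), nn.BatchNorm3d(128), nn.ReLU(),  # 8^3
            nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1), nn.BatchNorm3d(64), nn.ReLU(),    # 16^3
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.BatchNorm3d(32), nn.ReLU(),     # 32^3
            nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),  # 64^3 occupancy probabilities
        )

    def forward(self, z):
        # treat the latent vector as a 1x1x1 volume with z_dim channels
        return self.net(z.view(z.size(0), -1, 1, 1, 1))

g = Generator3D()
voxels = g(torch.randn(2, 200))   # sample two objects without any reference image
print(voxels.shape)               # torch.Size([2, 1, 64, 64, 64])
```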
Christoforos C. Charalambous, Anil A. Bharath
Comments: The paper and supplementary material are available at this http URL. The dataset is available at this http URL. In Proceedings of the BMVC 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
There are several confounding factors that can reduce the accuracy of gait
recognition systems. These factors can reduce the distinctiveness of, or alter,
the features used to characterise gait; they include variations in clothing,
lighting, pose and environment, such as the walking surface. Full invariance to
all confounding factors is challenging in the absence of high-quality labelled
training data. We introduce a simulation-based methodology and a
subject-specific dataset which can be used for generating synthetic video
frames and sequences for data augmentation. With this methodology, we generated
a multi-modal dataset. In addition, we supply simulation files that provide the
ability to simultaneously sample from several confounding variables. The basis
of the data is real motion capture data of subjects walking and running on a
treadmill at different speeds. Results from gait recognition experiments
suggest that information about the identity of subjects is retained within
synthetically generated examples. The dataset and methodology allow studies
into fully-invariant identity recognition spanning a far greater number of
observation conditions than would otherwise be possible.
Sohini Roychowdhury, Dara D. Koozekanani, Michael Reinsbach, Keshab K. Parhi
Comments: 31 pages, 7 figures, CRC Press Book Chapter, 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper presents a novel automated system that segments six sub-retinal
layers from optical coherence tomography (OCT) image stacks of healthy patients
and patients with diabetic macular edema (DME). First, each image in the OCT
stack is denoised using a Wiener deconvolution algorithm that estimates the
additive speckle noise variance using a novel Fourier-domain based structural
error. This denoising method enhances the image SNR by an average of 12 dB.
Next, the denoised images are subjected to an iterative multi-resolution
high-pass filtering algorithm that detects seven sub-retinal surfaces in six
iterative steps. The thicknesses of each sub-retinal layer for all scans from a
particular OCT stack are then compared to the manually marked groundtruth. The
proposed system uses adaptive thresholds for denoising and segmenting each
image and hence it is robust to disruptions in the retinal micro-structure due
to DME. The proposed denoising and segmentation system has an average error of
1.2-5.8 μm and 3.5-26 μm for segmenting sub-retinal surfaces in normal
and abnormal images with DME, respectively. For estimating the sub-retinal
layer thicknesses, the proposed system has an average error of 0.2-2.5 μm
and 1.8-18 μm in normal and abnormal images, respectively. Additionally,
the average inner sub-retinal layer thickness in abnormal images is estimated
as 275 μm (r=0.92) with an average error of 9.3 μm, while the average
thickness of the outer layers in abnormal images is estimated as 57.4 μm
(r=0.74) with an average error of 3.5 μm. The proposed system can be
useful for tracking the disease progression for DME over a period of time.
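
As a point of reference, generic frequency-domain Wiener deconvolution looks as follows; the paper's contribution is the Fourier-domain estimate of the noise variance, which this sketch simply takes as a given input.

```python
import numpy as np

def wiener_deconvolve(image, psf, noise_var):
    """Frequency-domain Wiener deconvolution of a 2-D image (toy version)."""
    H = np.fft.fft2(psf, s=image.shape)      # transfer function of the blur kernel
    G = np.fft.fft2(image)
    signal_power = np.abs(G) ** 2 / image.size
    # Wiener filter: conj(H) / (|H|^2 + noise-to-signal ratio)
    W = np.conj(H) / (np.abs(H) ** 2 + noise_var / (signal_power + 1e-12))
    return np.real(np.fft.ifft2(W * G))

img = np.random.rand(64, 64)
psf = np.outer(np.hanning(5), np.hanning(5))
psf /= psf.sum()
print(wiener_deconvolve(img, psf, noise_var=0.01).shape)   # (64, 64)
```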
Mohammad-Parsa Hosseini, Mohammad-Reza Nazem-Zadeh, Dario Pompili, Kourosh Jafari-Khouzani, Kost Elisevich, Hamid Soltanian-Zadeh
Comments: Presented in the 6th Annual New York Medical Imaging Informatics Symposium (NYMIIS), New York, NY, USA, Sept. 2015
Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
The hippocampus is a seminal structure in the most common surgically-treated
form of epilepsy. Accurate segmentation of the hippocampus aids in establishing
asymmetry regarding size and signal characteristics in order to disclose the
likely site of epileptogenicity. With sufficient refinement, it may ultimately
aid in the avoidance of invasive monitoring with its expense and risk for the
patient. To this end, a reliable and consistent method for segmentation of the
hippocampus from magnetic resonance imaging (MRI) is needed. In this work, we
present a systematic and statistical analysis approach for evaluation of
automated segmentation methods in order to establish one that reliably
approximates the results achieved by manual tracing of the hippocampus.
Yu Song, Yiquan Wu
Comments: 17 pages, 4 figures, 5 tables. The paper is submitted to “computer vision and image understanding”. arXiv admin note: text overlap with arXiv:1610.03604
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The problem of fitting a union of subspaces to a collection of data points
drawn from multiple subspaces is considered in this paper. In the traditional
low rank representation model, the dictionary used to represent the data points
is chosen as the data points themselves and thus the dictionary is corrupted
with noise. This problem is solved in the low rank subspace clustering model
which decomposes the corrupted data matrix as the sum of a clean and
self-expressive dictionary plus a matrix of noise and gross errors. Also, the
clustering results of the low rank representation model can be enhanced by
using a graph of data similarity. This model is called Laplacian regularized
low rank representation model with a graph regularization term added to the
objective function. Inspired from the above two ideas, in this paper a
Laplacian regularized low rank subspace clustering model is proposed. This
model uses a clean dictionary to represent the data points and a graph
regularization term is also incorporated in the objective function.
Experimental results show that, compared with the traditional low rank
representation model, low rank subspace clustering model and several other
state-of-the-art subspace clustering model, the model proposed in this paper
can get better subspace clustering results with lower clustering error.
Siqi Bao, Albert C. S. Chung
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, a novel label fusion method is proposed for brain magnetic
resonance image segmentation. This label fusion method is formulated on a
graph, which embraces both label priors from the atlases and anatomical priors
from the target image. To represent a pixel in a comprehensive way, three kinds of
feature vectors are generated, including intensity, gradient and structural
signature. To select candidate atlas nodes for fusion, rather than exact
searching, a randomized k-d tree with a spatial constraint is introduced as an
efficient approximation for high-dimensional feature matching. Feature
Sensitive Label Prior (FSLP), which takes both the consistency and variety of
different features into consideration, is proposed to gather atlas priors. As
FSLP is a non-convex problem, one heuristic approach is further designed to
solve it efficiently. Moreover, based on the anatomical knowledge, parts of the
target pixels are also employed as graph seeds to assist the label fusion
process and an iterative strategy is utilized to gradually update the label
map. Comprehensive experiments carried out on two publicly available
databases demonstrate that the proposed method obtains better segmentation
quality.
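
For the approximate feature-matching step, SciPy's k-d tree gives the flavor; it stands in here for the randomized k-d tree with spatial constraint used in the paper, and the dimensions and counts below are made up.

```python
import numpy as np
from scipy.spatial import cKDTree

atlas_features = np.random.rand(10000, 64)   # per-pixel feature vectors from the atlases
tree = cKDTree(atlas_features)

target_feature = np.random.rand(64)          # intensity/gradient/structural signature of one pixel
# eps > 0 makes the query approximate, trading exactness for speed
dists, idx = tree.query(target_feature, k=10, eps=0.5)
print(idx)                                   # candidate atlas nodes for label fusion
```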
Mohsen Ghafoorian, Nico Karssemeijer, Tom Heskes, Mayra Bergkamp, Joost Wissink, Jiri Obels, Karlijn Keizer, Frank-Erik de Leeuw, Bram van Ginneken, Elena Marchiori, Bram Platel
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Lacunes of presumed vascular origin (lacunes) are associated with an
increased risk of stroke, gait impairment, and dementia and are a primary
imaging feature of small vessel disease. Quantification of lacunes may be
of great importance to elucidate the mechanisms behind neuro-degenerative
disorders and is recommended as part of study standards for small vessel
disease research. However, due to the different appearance of lacunes in
various brain regions and the existence of other similar-looking structures,
such as perivascular spaces, manual annotation is a difficult, laborious and
subjective task, which can potentially be greatly improved by reliable and
consistent computer-aided detection (CAD) routines.
In this paper, we propose an automated two-stage method using deep
convolutional neural networks (CNN). We show that this method has good
performance and can considerably benefit readers. We first use a fully
convolutional neural network to detect initial candidates. In the second step,
we employ a 3D CNN as a false positive reduction tool. As the location
information is important to the analysis of candidate structures, we further
equip the network with contextual information using multi-scale analysis and
integration of explicit location features. We trained, validated and tested our
networks on a large dataset of 1075 cases obtained from two different studies.
Subsequently, we conducted an observer study with four trained observers and
compared our method with them using a free-response operating characteristic
analysis. On a test set of 111 cases, the resulting CAD system exhibits
performance similar to the trained human observers and achieves a sensitivity
of 0.974 with 0.13 false positives per slice. A feasibility study also showed
that a trained human observer would considerably benefit once aided by the CAD
system.
Samuele Capobianco, Simone Marinai
Comments: Deep Learning for Pattern Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we investigate the use of Convolutional Neural Networks for
counting the number of records in historical handwritten documents. With this
work we demonstrate that training the networks only with synthetic images
allows us to perform a near-perfect evaluation of the number of records in
historical documents. The experiments were performed on a benchmark dataset
composed of marriage records, and our approach outperforms previous results on
this dataset.
Christos Sakaridis, Kimon Drakopoulos, Petros Maragos
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Active contour models based on partial differential equations have proved
successful in image segmentation, yet the study of their geometric formulation
on arbitrary geometric graphs is still at an early stage. In this paper, we
introduce geometric approximations of gradient and curvature, which are used in
the geodesic active contour model. We prove convergence in probability of our
gradient approximation to the true gradient value and derive an asymptotic
upper bound for the error of this approximation for the class of random
geometric graphs. Two different approaches for the approximation of curvature
are presented and both are also proved to converge in probability in the case
of random geometric graphs. We propose neighborhood-based filtering on graphs
to improve the accuracy of the aforementioned approximations and define two
variants of Gaussian smoothing on graphs which include normalization in order
to adapt to graph non-uniformities. The performance of our active contour
framework on graphs is demonstrated in the segmentation of regular images and
geographical data defined on arbitrary graphs.
Steffen Urban, Stefan Hinz
Comments: 15 pages, 8 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The basis for most vision based applications like robotics, self-driving cars
and potentially augmented and virtual reality is a robust, continuous
estimation of the position and orientation of a camera system w.r.t. the
observed environment (scene). In recent years many vision based systems that
perform simultaneous localization and mapping (SLAM) have been presented and
released as open source. In this paper, we extend and improve upon a
state-of-the-art SLAM system to make it applicable to arbitrary, rigidly coupled
multi-camera systems (MCS) using the MultiCol model. In addition, we include a
performance evaluation on accurate ground truth and compare the robustness of
the proposed method to a single camera version of the SLAM system. An open
source implementation of the proposed multi-fisheye camera SLAM system can be
found online at this https URL
Xiaoshui Huang, Jian Zhang, Qiang Wu, Lixin Fan, Chun Yuan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
With the development of numerous 3D sensing technologies, object registration
on cross-source point clouds has aroused researchers' interest. When the point
clouds are captured by different kinds of sensors, they exhibit large and
varied discrepancies. In this study, we address an even more challenging case
in which the differently-sourced point clouds are acquired from a real street
view. One is produced directly by a LiDAR system and the other is generated by
running VSFM software on an image sequence captured by RGB cameras. When
confronted with large-scale point clouds, previous methods mostly focus on
point-to-point registration, and such methods have many limitations: the
least-mean-error strategy shows poor ability in registering highly variable
cross-source point clouds. In this paper, departing from previous ICP-based
methods and taking a statistical view, we propose an effective coarse-to-fine
algorithm to detect and register a small-scale SFM point cloud within a
large-scale LiDAR point cloud. The experimental results show that the model
runs successfully on LiDAR and SFM point clouds, and hence can contribute to
many applications, such as robotics and smart city development.
François-Xavier Derue, Guillaume-Alexandre Bilodeau, Robert Bergevin
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In visual tracking, part-based trackers are attractive since they are robust
against occlusion and deformation. However, a part represented by a rectangular
patch does not account for the shape of the target, while a superpixel does
thanks to its boundary evidence. Nevertheless, tracking superpixels is
difficult due to their lack of discriminative power. Therefore, to enable
superpixels to be tracked discriminatively as object parts, we propose to
enhance them with keypoints. By combining properties of these two features, we
build a novel element designated as a Superpixel-Keypoints structure (SPiKeS).
Being discriminative, these new object parts can be located efficiently by a
simple nearest neighbor matching process. Then, in a tracking process, each
match votes for the target’s center to give its location. In addition, the
interesting properties of our new feature allow the development of an
efficient model update for more robust tracking. According to experimental
results, our SPiKeS-based tracker proves to be robust in many challenging
scenarios by performing favorably against the state-of-the-art.
Nazanin Sadat Hashemi, Roya Babaie Aghdam, Atieh Sadat Bayat Ghiasi, Parastoo Fatemi
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
In most computer vision and image analysis problems, it is necessary to
define a similarity measure between two or more different objects or images.
Template matching is a classic and fundamental method used to score
similarities between objects using certain mathematical algorithms. In this
paper, we reviewed the basic concept of matching, as well as advances in
template matching and applications such as invariant features or novel
applications in medical image analysis. Additionally, deformable models and
templates originating from classic template matching were discussed. These
models have broad applications in image registration, and they are a
fundamental aspect of novel machine vision or deep learning algorithms, such as
convolutional neural networks (CNN), which perform shift and scale invariant
functions followed by classification. In general, although template matching
methods have restrictions which limit their application, they are recommended
for use with other object recognition methods as pre- or post-processing steps.
Combining a template matching technique such as normalized cross-correlation or
dice coefficient with a robust decision-making algorithm yields a significant
improvement in the accuracy rate for object detection and recognition.
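
For concreteness, normalized cross-correlation, the classic template-matching score named above, can be sketched as follows (a brute-force toy version).

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between a patch and a same-sized template."""
    p, t = patch - patch.mean(), template - template.mean()
    return float((p * t).sum() / (np.sqrt((p**2).sum() * (t**2).sum()) + 1e-12))

def match_template(image, template):
    """Slide the template over the image and return the best-scoring offset."""
    th, tw = template.shape
    best, best_xy = -2.0, (0, 0)
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            s = ncc(image[y:y+th, x:x+tw], template)
            if s > best:
                best, best_xy = s, (y, x)
    return best_xy, best

img = np.random.rand(32, 32)
tpl = img[10:18, 5:13].copy()
print(match_template(img, tpl))   # recovers offset (10, 5) with score ~1.0
```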
Jiawei Zhang, Jianbo Jiao, Mingliang Chen, Liangqiong Qu, Xiaobin Xu, Qingxiong Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
3D hand pose tracking/estimation will be very important in the next
generation of human-computer interaction. Most of the currently available
algorithms rely on low-cost active depth sensors. However, these sensors are
easily disturbed by other active sources and require relatively high power
consumption. As a result, they are currently not suitable for outdoor
environments and mobile devices. This paper aims at tracking/estimating hand
poses using passive stereo which avoids these limitations. A benchmark with
18,000 stereo image pairs and 18,000 depth images captured from different
scenarios and the ground-truth 3D positions of palm and finger joints (obtained
from manual labeling) is thus proposed. This paper demonstrates that the
performance of the state-of-the-art tracking/estimation algorithms can be
maintained with most stereo matching algorithms on the proposed benchmark, as
long as the hand segmentation is correct. As a result, a novel stereo-based
hand segmentation algorithm specially designed for hand tracking/estimation is
proposed. The quantitative evaluation demonstrates that the proposed algorithm
is suitable for the state-of-the-art hand pose tracking/estimation algorithms
and the tracking quality is comparable to the use of active depth sensors under
different challenging scenarios.
Lucas Thies, Michael Zollhöfer, Christian Richardt, Christian Theobalt, Günther Greiner
Comments: Proc. of the International Conference on 3D Vision 2016 (3DV 2016)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a novel approach for real-time joint reconstruction of 3D scene
motion and geometry from binocular stereo videos. Our approach is based on a
novel variational halfway-domain scene flow formulation, which allows us to
obtain highly accurate spatiotemporal reconstructions of shape and motion. We
solve the underlying optimization problem at real-time frame rates using a
novel data-parallel robust non-linear optimization strategy. Fast convergence
and large displacement flows are achieved by employing a novel hierarchy that
stores delta flows between hierarchy levels. High performance is obtained by
the introduction of a coarser warp grid that decouples the number of unknowns
from the input resolution of the images. We demonstrate our approach in a live
setup that is based on two commodity webcams, as well as on publicly available
video data. Our extensive experiments and evaluations show that our approach
produces high-quality dense reconstructions of 3D geometry and scene flow at
real-time frame rates, and compares favorably to the state of the art.
Yuan Xie, Dacheng Tao, Wensheng Zhang, Lei Zhang
Comments: 13pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we address the multi-view subspace clustering problem. Our
method utilizes the circulant algebra for tensors: a tensor is constructed by
stacking the subspace representation matrices of different views and then
shifting it, in order to explore the high-order correlations underlying
multi-view data. By
introducing a recently proposed tensor factorization, namely the tensor
Singular Value Decomposition (t-SVD) [kilmer13], we can impose a new type of
low-rank tensor constraint on the shifted tensor to capture the complementary
information from multiple views. Different from traditional unfolding based
tensor norm, this low-rank tensor constraint has optimality properties similar
to that of matrix rank derived from SVD, so that complementary information
among views can be explored more efficiently and thoroughly. The established
model, called t-SVD based Multi-view Subspace Clustering (t-SVD-MSC), falls
into the applicable scope of augmented Lagrangian method, and its minimization
problem can be efficiently solved with theoretical convergence guarantee and
relatively low computational complexity. Extensive experimental testing on
eight challenging image datasets shows that the proposed method achieves
highly competitive performance compared to several state-of-the-art
multi-view clustering methods.
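
For readers unfamiliar with the t-SVD, a bare-bones version is: FFT along the third mode, an ordinary SVD on each frontal slice, then an inverse FFT. The sketch below follows this recipe (exploiting conjugate symmetry so the result stays real) and is only meant to illustrate the factorization the model builds on.

```python
import numpy as np

def t_svd(A):
    """t-SVD of a real n1 x n2 x n3 tensor: A = U * S * V^T under the t-product."""
    n1, n2, n3 = A.shape
    Af = np.fft.fft(A, axis=2)
    Uf = np.zeros((n1, n1, n3), dtype=complex)
    Sf = np.zeros((n1, n2, n3), dtype=complex)
    Vf = np.zeros((n2, n2, n3), dtype=complex)
    for k in range(n3 // 2 + 1):                 # SVD each frontal slice in the FFT domain
        u, s, vh = np.linalg.svd(Af[:, :, k])
        Uf[:, :, k] = u
        np.fill_diagonal(Sf[:, :, k], s)
        Vf[:, :, k] = vh.conj().T
        if 0 < k < (n3 + 1) // 2:                # mirror slices by conjugate symmetry
            Uf[:, :, n3 - k] = u.conj()
            np.fill_diagonal(Sf[:, :, n3 - k], s)
            Vf[:, :, n3 - k] = vh.T
    return tuple(np.real(np.fft.ifft(T, axis=2)) for T in (Uf, Sf, Vf))

U, S, V = t_svd(np.random.rand(5, 4, 6))
print(U.shape, S.shape, V.shape)                 # (5, 5, 6) (5, 4, 6) (4, 4, 6)
```

The low-rank tensor constraint used in the model is defined through the singular values obtained in this transformed domain, which is what gives it the matrix-like optimality properties the abstract refers to.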
Gwenolé Quellec, Katia Charrière, Yassine Boudi, Béatrice Cochener, Mathieu Lamard
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Deep learning is quickly becoming the leading methodology for medical image
analysis. Given a large medical archive, where each image is associated with a
diagnosis, efficient pathology detectors or classifiers can be trained with
virtually no expert knowledge about the target pathologies. However, deep
learning algorithms, including the popular ConvNets, are black boxes: little is
known about the local patterns analyzed by ConvNets to make a decision at the
image level. A solution is proposed in this paper to create heatmaps showing
which pixels in images play a role in the image-level predictions. In other
words, a ConvNet trained for image-level classification can be used to detect
lesions as well. A generalization of the backpropagation method is proposed in
order to train ConvNets that produce high-quality heatmaps. The proposed
solution is applied to diabetic retinopathy (DR) screening in a dataset of
almost 90,000 fundus photographs from the 2015 Kaggle Diabetic Retinopathy
competition. For the task of detecting referable DR, the proposed system
outperforms state-of-the-art methods based on deep learning: A_z = 0.954, as
opposed to A_z = 0.946 for Colas et al. (2016) in particular. Performance at
the pixel level was evaluated in the DiaretDB1 dataset, where four types of
lesions are manually segmented: microaneurysms, hemorrhages, exudates and
cotton-wool spots. For all lesion types, the proposed detector, trained with
image-level supervision, outperforms recent algorithms, even though they were
trained with pixel-level supervision. This detector is part of the RetinOpTIC
system for mobile eye pathology screening. Because it does not rely on expert
knowledge or manual segmentation for detecting relevant patterns, the proposed
solution can potentially discover new biomarkers in images, which makes it a
promising image mining tool.
Terry Taewoong Um, Vahid Babakeshizadeh, Dana Kulic
Comments: submitted to ICRA2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
The ability to accurately observe human motion and identify human activities
is essential for developing automatic rehabilitation and sports training
systems. In this paper, large-scale exercise motion data obtained from a
forearm-worn wearable sensor are classified with a convolutional neural network
(CNN). Time series data consisting of accelerometer and orientation
measurements are formatted as “images”, allowing the CNN to automatically
extract discriminative features. The resulting CNN classifies 50 gym exercises
with 92.14% accuracy. A comparative study on the effects of image formatting
and different CNN architectures is also presented.
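
One simple way to realize the "images" formatting described above is to stack the sensor channels as rows over a sliding window; this is a hedged sketch of the idea, not necessarily the paper's exact layout.

```python
import numpy as np

def to_images(signals, window=128, step=64):
    """signals: (n_channels, n_samples) -> list of (n_channels, window) 'images'."""
    n_ch, n = signals.shape
    out = []
    for start in range(0, n - window + 1, step):
        win = signals[:, start:start + window]
        # per-channel normalization so intensities are comparable across channels
        win = (win - win.mean(axis=1, keepdims=True)) / (win.std(axis=1, keepdims=True) + 1e-8)
        out.append(win)
    return out

acc_and_orient = np.random.randn(6, 1000)    # e.g. 3 accelerometer + 3 orientation channels
images = to_images(acc_and_orient)
print(len(images), images[0].shape)          # 14 (6, 128)
```

A 2-D CNN can then convolve across time and channels jointly, which is what lets it extract cross-channel motion signatures automatically.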
Mete Ozay, Takayuki Okatani
Comments: 9 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Kernel normalization methods have been employed to improve robustness of
optimization methods to reparametrization of convolution kernels, covariate
shift, and to accelerate training of Convolutional Neural Networks (CNNs).
However, our understanding of theoretical properties of these methods has
lagged behind their success in applications. We develop a geometric framework
to elucidate underlying mechanisms of a diverse range of kernel normalization
methods. Our framework enables us to expound and identify geometry of space of
normalized kernels. We analyze and delineate how state-of-the-art kernel
normalization methods affect the geometry of search spaces of the stochastic
gradient descent (SGD) algorithms in CNNs. Following our theoretical results,
we propose an SGD algorithm with an assurance of almost sure convergence to a
solution at a single minimum of the classification loss of CNNs.
Experimental results show that the proposed method achieves state-of-the-art
performance for major image classification benchmarks with CNNs.
Utsav B. Gewali, Sildomar T. Monteiro
Comments: In Proc. 8th IEEE Workshop on Hyperspectral Image and Signal Processing : Evolution in Remote Sensing (WHISPERS), August 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Statistical models have been successful in accurately estimating the
biochemical contents of vegetation from reflectance spectra. However, their
performance deteriorates when there is a scarcity of ground truth data for
modeling the complex non-linear relationship between the spectrum and the
biochemical quantity. We propose a novel Gaussian process
based multitask learning method for improving the prediction of a biochemical
through the transfer of knowledge from the learned models for predicting
related biochemicals. This method is most advantageous when there are few
ground truth data for the biochemical of interest, but plenty of ground truth
data for related biochemicals. The proposed multitask Gaussian process
hypothesizes that the inter-relationship between the biochemical quantities is
better modeled by using a combination of two or more covariance functions and
inter-task correlation matrices. In the experiments, our method outperformed
the current methods on two real-world datasets.
Utsav B. Gewali, Sildomar T. Monteiro
Comments: In Proc. 8th IEEE Workshop on Hyperspectral Image and Signal Processing : Evolution in Remote Sensing (WHISPERS), August 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we propose and compare two spectral angle based approaches for
spatial-spectral classification. Our methods use the spectral angle to generate
unary energies in a grid-structured Markov random field defined over the pixel
labels of a hyperspectral image. The first approach is to use the exponential
spectral angle mapper (ESAM) kernel/covariance function, a spectral angle based
function, with the support vector machine and the Gaussian process classifier.
The second approach is to directly use the minimum spectral angle between the
test pixel and the training pixels as the unary energy. We compare the proposed
methods with the state-of-the-art Markov random field methods that use support
vector machines and Gaussian processes with squared exponential
kernel/covariance function. In our experiments with two datasets, it is seen
that using minimum spectral angle as unary energy produces better or comparable
results to the existing methods at a smaller running time.
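
To spell out the second approach, the spectral angle and the resulting per-class unary energy can be computed as below (a minimal sketch; the MRF wiring around it is omitted).

```python
import numpy as np

def spectral_angle(a, b):
    """Angle (radians) between two spectra: arccos of the normalized dot product."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def min_angle_unary(test_pixel, training_pixels_by_class):
    """Unary energy per class = minimum spectral angle to that class's training pixels."""
    return {c: min(spectral_angle(test_pixel, t) for t in train)
            for c, train in training_pixels_by_class.items()}

x = np.random.rand(50)                                     # a 50-band test spectrum
train = {"grass": [np.random.rand(50) for _ in range(5)],
         "road":  [np.random.rand(50) for _ in range(5)]}
print(min_angle_unary(x, train))                           # lower energy = better class fit
```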
Nakka Krishna Kanth
Comments: Master Thesis, EE IIT KGP, May 2015
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Tourists and Wild-life photographers are often hindered in capturing their
cherished images or videos by a fence that limits accessibility to the scene of
interest. The situation has been exacerbated by growing concerns of security at
public places and a need exists to provide a tool that can be used for
post-processing such fenced videos to produce a de-fenced image. There are
several challenges in this problem, which we identify as: robust detection of
fences/occlusions, estimating the pixel motion of background scenes, and
filling in the fences/occlusions by utilizing information from multiple frames
of the input video. In this work, we aim to build an automatic post-processing tool
that can efficiently rid the input video of occlusion artifacts like fences.
Our work is distinguished by two major contributions. The first is the
introduction of a learning-based technique to detect fence patterns against
complicated backgrounds. The second is the formulation of an objective function
and its minimization through loopy belief propagation to fill in the fence
pixels. We observe that grids of Histogram of Oriented Gradients descriptors
with a Support Vector Machine classifier significantly outperform the
detection accuracy of texels in a lattice. We present results of experiments
using several real-world videos to demonstrate the effectiveness of the
proposed fence detection and de-fencing algorithm.
Douwe Kiela, Luana Bulat, Anita L. Vero, Stephen Clark
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Meaning has been called the “holy grail” of a variety of scientific
disciplines, ranging from linguistics to philosophy, psychology and the
neurosciences. The field of Artificial Intelligence (AI) is very much a part of
that list: the development of sophisticated natural language semantics is a
sine qua non for achieving a level of intelligence comparable to humans.
Embodiment theories in cognitive science hold that human semantic
representation depends on sensori-motor experience; the abundant evidence that
human meaning representation is grounded in the perception of physical reality
leads to the conclusion that meaning must depend on a fusion of multiple
(perceptual) modalities. Despite this, AI research in general, and its
subdisciplines such as computational linguistics and computer vision in
particular, have focused primarily on tasks that involve a single modality.
Here, we propose virtual embodiment as an alternative, long-term strategy for
AI research that is multi-modal in nature and that allows for the kind of
scalability required to develop the field coherently and incrementally, in an
ethically responsible fashion.
J. Albericio, P. Judd, A. Delmás, S. Sharify, A. Moshovos
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
We quantify a source of ineffectual computations when processing the
multiplications of the convolutional layers in Deep Neural Networks (DNNs) and
propose Pragmatic (PRA), an architecture that exploits it to improve performance
and energy efficiency. The source of these ineffectual computations is best
understood in the context of conventional multipliers which generate internally
multiple terms, that is, products of the multiplicand and powers of two, which
added together produce the final product [1]. At runtime, many of these terms
are zero as they are generated when the multiplicand is combined with the
zero-bits of the multiplicator. While conventional bit-parallel multipliers
calculate all terms in parallel to reduce individual product latency, PRA
calculates only the non-zero terms using a) on-the-fly conversion of the
multiplicator representation into an explicit list of powers of two, and b)
hybrid bit-parallel multiplicand/bit-serial multiplicator processing units. PRA
exploits two sources of ineffectual computations: 1) the aforementioned zero
product terms which are the result of the lack of explicitness in the
multiplicator representation, and 2) the excess in the representation precision
used for both multiplicands and multiplicators, e.g., [2]. Measurements
demonstrate that for the convolutional layers, a straightforward variant of PRA
improves performance by 2.6x over the DaDianNao (DaDN) accelerator [3] and by
1.4x over STR [4]. Similarly, PRA improves energy efficiency by 28% and 10% on
average compared to DaDN and STR. An improved cross-lane synchronization scheme
boosts performance improvements to 3.1x over DaDN. Finally, Pragmatic benefits
persist even with an 8-bit quantized representation [5].
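
A small worked example of the zero-term observation: only the 1-bits of the multiplicator contribute partial products, so an 8-bit value with two 1-bits needs only two shift-and-add terms.

```python
def essential_terms(multiplicator: int) -> list:
    """Exponents of the powers of two actually present in the multiplicator."""
    return [i for i in range(multiplicator.bit_length()) if (multiplicator >> i) & 1]

m, x = 0b00010010, 13                         # multiplicator with two 1-bits; any multiplicand
terms = essential_terms(m)
print(terms)                                  # [1, 4] -> only 2 of 8 partial products are non-zero
assert sum(x << i for i in terms) == x * m    # the non-zero terms alone give the product
```

This is, in spirit, the conversion PRA performs on the fly: the multiplicator becomes an explicit list of powers of two, and only those terms are processed.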
Ahmed M. Alaa, Mihaela van der Schaar
Subjects: Artificial Intelligence (cs.AI)
We develop a Bayesian model for decision-making under time pressure with
endogenous information acquisition. In our model, the decision maker decides
when to observe (costly) information by sampling an underlying continuous-time
stochastic process (time series) that conveys information about the potential
occurrence or non-occurrence of an adverse event which will terminate the
decision-making process. In her attempt to predict the occurrence of the
adverse event, the decision-maker follows a policy that determines when to
acquire information from the time series (continuation), and when to stop
acquiring information and make a final prediction (stopping). We show that the
optimal policy has a rendezvous structure, i.e. a structure in which whenever a
new information sample is gathered from the time series, the optimal “date” for
acquiring the next sample becomes computable. The optimal interval between two
information samples balances a trade-off between the decision maker’s surprise,
i.e. the drift in her posterior belief after observing new information, and
suspense, i.e. the probability that the adverse event occurs in the time
interval between two information samples. Moreover, we characterize the
continuation and stopping regions in the decision-maker’s state-space, and show
that they depend not only on the decision-maker’s beliefs, but also on the
context, i.e. the current realization of the time series.
László Csató
Comments: 11 pages
Subjects: Artificial Intelligence (cs.AI)
Pairwise comparisons between alternatives are a well-known method of
decision-making. Since preferences often do not exhibit consistency, a
number of inconsistency indices have been defined in order to measure the
deviation from this ideal case. Recently, some axioms have been proposed to
identify indices that evaluate inconsistency correctly. Inconsistency rankings
are weak orders on the set of pairwise comparison matrices with respect to
their inconsistency level and can be defined by any inconsistency index. This
study characterizes a specific inconsistency ranking by providing six
independent axioms such that they determine a unique weak order on the set of
all pairwise comparison matrices.
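
For readers new to the area, one classical inconsistency index of the kind these axioms are meant to evaluate is Saaty's consistency index, sketched below with a consistent and an inconsistent 3x3 reciprocal matrix.

```python
import numpy as np

def saaty_ci(A):
    """Saaty's consistency index CI = (lambda_max - n) / (n - 1)."""
    n = A.shape[0]
    lam_max = max(np.linalg.eigvals(A).real)
    return (lam_max - n) / (n - 1)

consistent = np.array([[1, 2, 4], [1/2, 1, 2], [1/4, 1/2, 1]], dtype=float)    # a13 = a12 * a23
inconsistent = np.array([[1, 2, 1/2], [1/2, 1, 2], [2, 1/2, 1]], dtype=float)  # cyclic preferences
print(round(saaty_ci(consistent), 6))    # 0.0 for a fully consistent matrix
print(round(saaty_ci(inconsistent), 4))  # > 0, grows with the deviation from consistency
```

Any such index induces a weak order on pairwise comparison matrices, which is exactly the kind of inconsistency ranking the axioms characterize.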
Yi Tay, Cong-Minh Phan, Tuan-Anh Nguyen Pham
Comments: 4 pages Competition Report for CIKM Cup
Subjects: Artificial Intelligence (cs.AI)
We describe the 1st place winning approach for the CIKM Cup 2016 Challenge.
In this paper, we provide an approach to reliably identify the same users across
multiple devices based on browsing logs. Our approach casts the candidate
ranking problem as pairwise classification and utilizes an unsupervised neural
feature ensemble approach to learn latent features of users. Combined with
traditional hand-crafted features, each user-pair feature vector is fed into a
supervised classifier in order to perform pairwise classification. Lastly, we
propose supervised and unsupervised inference techniques.
Dominik Meyer, Johannes Feldmaier, Hao Shen
Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Robotics (cs.RO)
In this work, we investigate the application of Reinforcement Learning to two
well known decision dilemmas, namely Newcomb’s Problem and Prisoner’s Dilemma.
These problems exemplify dilemmas that autonomous agents face when
interacting with humans. Furthermore, we argue that a Newcomb-like
formulation is more adequate in the human-machine interaction case and
demonstrate empirically that the unmodified Reinforcement Learning algorithms
end up with the well known maximum expected utility solution.
Julie Yixuan Zhu, Chao Zhang, Shi Zhi, Victor O.K. Li, Jiawei Han, Yu Zheng
Subjects: Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS)
Many countries are suffering from severe air pollution. Understanding how
different air pollutants accumulate and propagate is critical to making
relevant public policies. In this paper, we use urban big data (air quality
data and meteorological data) to identify the spatiotemporal (ST) causal
pathways for air pollutants. This problem is challenging because: (1) there
are numerous noisy and low-pollution periods in the raw air quality data, which
may lead to unreliable causality analysis, (2) for large-scale data in the ST
space, the computational complexity of constructing a causal structure is very
high, and (3) the ST causal pathways are complex due to the interactions
of multiple pollutants and the influence of environmental factors. Therefore,
we present p-Causality, a novel pattern-aided causality analysis
approach that combines the strengths of pattern mining and
Bayesian learning to efficiently and faithfully identify the ST
causal pathways. First, pattern mining helps suppress the noise by
capturing frequent evolving patterns (FEPs) of each monitoring sensor, and
greatly reduces the complexity by selecting the pattern-matched sensors as
"causers". Then, Bayesian learning carefully encodes the local and ST
causal relations with a Gaussian Bayesian network (GBN)-based graphical model,
which also integrates environmental influences to minimize biases in the final
results. We evaluate our approach with three real-world data sets containing
982 air quality sensors, in three regions of China from 01-Jun-2013 to
19-Dec-2015. Results show that our approach outperforms the traditional causal
structure learning methods in time efficiency, inference accuracy and
interpretability.
Himabindu Lakkaraju, Cynthia Rudin
Subjects: Artificial Intelligence (cs.AI)
Decision makers, such as doctors and judges, make crucial decisions, such as
recommending treatments to patients and granting bail to defendants, on a
daily basis. Such decisions typically involve weighing the potential benefits
of taking an action against the costs involved. In this work, we aim to
automate this task of learning cost-effective, interpretable and
actionable treatment regimes. We formulate this as a problem of learning a
decision list, a sequence of if-then-else rules, which maps characteristics
of subjects (e.g., diagnostic test results of patients) to treatments. We
propose a novel objective to construct a decision list which maximizes outcomes
for the population, and minimizes overall costs. We model the problem of
learning such a list as a Markov Decision Process (MDP) and employ a variant of
the Upper Confidence Bound for Trees (UCT) strategy which leverages customized
checks for pruning the search space effectively. Experimental results on real
world observational data capturing judicial bail decisions and treatment
recommendations for asthma patients demonstrate the effectiveness of our
approach.
Xiaowei Huang, Marta Kwiatkowska, Sen Wang, Min Wu
Subjects: Artificial Intelligence (cs.AI)
Deep neural networks have achieved impressive experimental results in image
classification, but can surprisingly be unstable with respect to adversarial
perturbations, that is, minimal changes to the input image that cause the
network to misclassify it. With potential applications including perception
modules and end-to-end controllers for self-driving cars, this raises concerns
about their safety. We develop the first SMT-based automated verification
framework for feed-forward multi-layer neural networks that works directly with
the code of the network, exploring it layer by layer. We define safety for a
region around a data point in a given layer by requiring that all points in the
region are assigned the same class label. Working with a notion of a
manipulation, a mapping between points that intuitively corresponds to a
modification of an image, we employ discretisation to enable exhaustive search
of the region. Our method can guarantee that adversarial examples are found for
the given region and set of manipulations. If found, adversarial examples can
be shown to human testers and/or used to fine-tune the network, and otherwise
the network is declared safe for the given parameters. We implement the
techniques using Z3 and evaluate them on state-of-the-art networks, including
regularised and deep learning networks.
Thierry Poibeau (LaTTICe), Shravan Vasishth
Journal-ref: Traitement Automatique des Langues, ATALA, 2014, Traitement
Automatique des Langues et Sciences Cognitives, 55 (3), pp.7-19
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
This special issue is dedicated to getting a better picture of the relationships
between computational linguistics and cognitive science. It specifically raises
two questions: “what is the potential contribution of computational language
modeling to cognitive science?” and conversely: “what is the influence of
cognitive science in contemporary computational linguistics?”
Wei Tan, Liangliang Cao, Liana Fong
Comments: 12 pages, 11 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Performance (cs.PF)
Matrix factorization (MF) is employed by many popular algorithms, e.g.,
collaborative filtering. The emerging GPU technology, massively multicore
with high intra-chip memory bandwidth but limited memory capacity, presents an
opportunity for accelerating MF much further when the GPU architectural
characteristics are appropriately exploited.
This paper presents cuMF, a CUDA-based matrix factorization library that
implements a memory-optimized alternating least squares (ALS) method to solve
very large-scale MF. cuMF uses a variety of techniques to maximize the
performance on either single or multiple GPUs. These techniques include smart
access of sparse data leveraging GPU memory hierarchy, using data parallelism
in conjunction with model parallelism, minimizing the communication overhead
between computing units, and utilizing a novel topology-aware parallel
reduction scheme.
With only a single machine with four Nvidia GPU cards, cuMF can be 6-10 times
as fast, and 33-100 times as cost-efficient, compared with state-of-the-art
distributed CPU solutions. Moreover, cuMF can solve the largest matrix
factorization problem yet reported in the literature, while
maintaining impressively good performance.
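
As a reference point for what ALS computes, a dense single-machine toy version of one alternating sweep is shown below; cuMF's contribution is doing this at scale on GPUs, which this sketch does not attempt.

```python
import numpy as np

def als_step(R, X, Y, reg=0.1):
    """One ALS sweep for R ≈ X @ Y.T: ridge-regression solves per row."""
    f = X.shape[1]
    for u in range(X.shape[0]):   # fix Y, solve for each row of X
        X[u] = np.linalg.solve(Y.T @ Y + reg * np.eye(f), Y.T @ R[u])
    for i in range(Y.shape[0]):   # fix X, solve for each row of Y
        Y[i] = np.linalg.solve(X.T @ X + reg * np.eye(f), X.T @ R[:, i])
    return X, Y

rng = np.random.default_rng(1)
R = rng.random((20, 15))                          # toy ratings matrix
X, Y = rng.random((20, 4)), rng.random((15, 4))   # rank-4 factors
for _ in range(20):
    X, Y = als_step(R, X, Y)
print(np.linalg.norm(R - X @ Y.T) / np.linalg.norm(R))   # relative error decreases
```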
Chen Luo, Anshumali Shrivastava
Subjects: Information Retrieval (cs.IR)
Similarity search on time series is a frequent operation in large-scale
data-driven applications. Sophisticated similarity measures are standard for
time series matching, as the series are usually misaligned. Dynamic Time Warping or
DTW is the most widely used similarity measure for time series because it
combines alignment and matching at the same time. However, the alignment makes
DTW slow. To speed up the expensive similarity search with DTW,
branch-and-bound pruning strategies are adopted. However, branch-and-bound
pruning is only useful for very short queries (low-dimensional time series),
and the bounds are quite weak for longer queries. Due to the loose bounds,
the branch-and-bound pruning strategy boils down to a brute-force search.
To circumvent this issue, we design SSH (Sketch, Shingle, & Hashing), an
efficient and approximate hashing scheme which is much faster than the
state-of-the-art branch and bound searching technique: the UCR suite. SSH uses
a novel combination of sketching, shingling and hashing techniques to produce
(probabilistic) indexes which align (near perfectly) with DTW similarity
measure. The generated indexes are then used to create hash buckets for
sub-linear search. Our results show that SSH is very effective for longer time
sequences and prunes around 95% of candidates, leading to a massive speedup in
search with DTW. Empirical results on two large-scale benchmark time series
data show that our proposed method can be around 20 times faster than the
state-of-the-art package (UCR suite) without any significant loss in accuracy.
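
A loose toy rendering of the Sketch-Shingle-Hash idea (our simplified reading, not the authors' exact construction): sign bits of sliding-window projections onto a random filter form a sketch, n-grams of those bits form shingles, and hashed shingles give bucket keys that tend to collide for misaligned copies of the same series.

```python
import numpy as np

rng = np.random.default_rng(0)

def ssh_keys(series, filt, n=4, stride=2, n_buckets=1024):
    x = (series - series.mean()) / (series.std() + 1e-8)
    # sketch: signs of sliding-window inner products with a random filter
    bits = [int(np.dot(x[i:i + len(filt)], filt) >= 0)
            for i in range(0, len(x) - len(filt) + 1, stride)]
    # shingle + hash: n-grams of the bit sequence, each hashed to a bucket
    return {hash(tuple(bits[i:i + n])) % n_buckets for i in range(len(bits) - n + 1)}

filt = rng.standard_normal(8)
a = np.sin(np.linspace(0.0, 12.0, 200))
b = np.sin(np.linspace(0.3, 12.3, 200))      # slightly shifted (misaligned) copy of a
c = rng.standard_normal(200)                 # unrelated noise

jaccard = lambda s, t: len(s & t) / len(s | t)
print(jaccard(ssh_keys(a, filt), ssh_keys(b, filt)))  # relatively high overlap
print(jaccard(ssh_keys(a, filt), ssh_keys(c, filt)))  # much lower overlap
```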
Arkaitz Zubiaga, Maria Liakata, Rob Procter
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)
Breaking news leads to situations of fast-paced reporting in social media,
producing all kinds of updates related to news stories, albeit with the caveat
that some of those early updates tend to be rumours, i.e., information with an
unverified status at the time of posting. Flagging information that is
unverified can be helpful to avoid the spread of information that may turn out
to be false. Detection of rumours can also feed a rumour tracking system that
ultimately determines their veracity. In this paper we introduce a novel
approach to rumour detection that learns from the sequential dynamics of
reporting during breaking news in social media to detect rumours in new
stories. Using Twitter datasets collected during five breaking news stories, we
experiment with Conditional Random Fields as a sequential classifier that
leverages context learnt during an event for rumour detection, which we compare
with the state-of-the-art rumour detection system as well as other baselines.
In contrast to existing work, our classifier does not need to observe tweets
querying a piece of information to deem it a rumour, but instead we detect
rumours from the tweet alone by exploiting context learnt during the event. Our
classifier achieves competitive performance, beating the state-of-the-art
classifier that relies on querying tweets with improved precision and recall,
as well as outperforming our best baseline with nearly 40% improvement in terms
of F1 score. The scale and diversity of our experiments reinforce the
generalisability of our classifier.
Jiaqi Mu, Suma Bhat, Pramod Viswanath
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)
Vector representations of words have heralded a transformational approach to
classical problems in NLP; the most popular example is word2vec. However, a
single vector does not suffice to model the polysemous nature of many
(frequent) words, i.e., words with multiple meanings. In this paper, we propose
a three-fold approach for unsupervised polysemy modeling: (a) context
representations, (b) sense induction and disambiguation and (c) lexeme (as a
word and sense pair) representations. A key feature of our work is the finding
that a sentence containing a target word is well represented by a low rank
subspace, instead of a point in a vector space. We then show that the subspaces
associated with a particular sense of the target word tend to intersect over a
line (one-dimensional subspace), which we use to disambiguate senses using a
clustering algorithm that harnesses the Grassmannian geometry of the
representations. The disambiguation algorithm, which we call (K)-Grassmeans,
leads to a procedure to label the different senses of the target word in the
corpus — yielding lexeme vector representations, all in an unsupervised manner
starting from a large (Wikipedia) corpus in English. Apart from several
prototypical target (word,sense) examples and a host of empirical studies to
intuit and justify the various geometric representations, we validate our
algorithms on standard sense induction and disambiguation datasets and present
new state-of-the-art results.
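A minimal numpy sketch of the two geometric primitives described above; the
rank of 3 and the mean-centering are illustrative choices, not necessarily the
paper's exact recipe.

import numpy as np

def context_subspace(word_vectors, rank=3):
    # Represent a sentence (one word vector per row) by the top-`rank`
    # principal directions: an orthonormal basis of a low-rank subspace
    # rather than a single point in the vector space.
    _, _, Vt = np.linalg.svd(word_vectors - word_vectors.mean(0),
                             full_matrices=False)
    return Vt[:rank].T              # d x rank orthonormal basis

def subspace_affinity(U1, U2):
    # Sum of squared cosines of the principal angles between two
    # subspaces; larger means closer on the Grassmannian, which is the
    # kind of affinity a K-means-style loop over subspaces can use.
    return np.sum(np.linalg.svd(U1.T @ U2, compute_uv=False) ** 2)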
Raj Nath Patel, Rohit Gupta, Prakash B. Pimpale, Sasikumar M
Comments: 8 pages, Published at the Second Workshop on Hybrid Approaches to Translation, ACL 2013
Subjects: Computation and Language (cs.CL)
Reordering is a preprocessing stage for Statistical Machine Translation (SMT)
system where the words of the source sentence are reordered as per the syntax
of the target language. We are proposing a rich set of rules for better
reordering. The idea is to facilitate the training process by better alignments
and parallel phrase extraction for a phrase-based SMT system. Reordering also
helps the decoding process and hence improves the machine translation quality.
We have observed significant improvements in the translation quality by using
our approach over the baseline SMT. We have judged the improvements using
BLEU, NIST, multi-reference word error rate, and multi-reference
position-independent error rate. The system was developed using the
open-source SMT toolkit MOSES.
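As a toy illustration of source-side reordering (real rule sets operate on
full parse trees, not this flat POS heuristic, and the rules below are
hypothetical):

def reorder_svo_to_sov(tagged_tokens):
    # One toy rule for an English -> SOV-target system: move verbs to
    # the end of the clause so source word order matches the target.
    verbs = [t for t in tagged_tokens if t[1].startswith('VB')]
    rest = [t for t in tagged_tokens if not t[1].startswith('VB')]
    return rest + verbs

print(reorder_svo_to_sov([('John', 'NNP'), ('ate', 'VBD'), ('rice', 'NN')]))
# [('John', 'NNP'), ('rice', 'NN'), ('ate', 'VBD')]

Reordered source sentences of this kind then feed the aligner and the
phrase-extraction stage described above.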
Raj Nath Patel, Prakash B. Pimpale, Sasikumar M
Comments: 5 pages, Published at NLP Tools Contest: Statistical Machine Translation in Indian Languages, ICON-2015
Subjects: Computation and Language (cs.CL)
This paper discusses Centre for Development of Advanced Computing Mumbai’s
(CDACM) submission to the NLP Tools Contest on Statistical Machine Translation
in Indian Languages (ILSMT) 2014 (collocated with ICON 2014). The objective of
the contest was to explore the effectiveness of Statistical Machine Translation
(SMT) for Indian language to Indian language and English-Hindi machine
translation. In this paper, we show that suffix separation and word
splitting for SMT from agglutinative languages to Hindi significantly improve
over the baseline (BL). We have also shown that the factored model with
reordering outperforms the phrase-based SMT for English-Hindi (enhi). We
report our work on all five pairs of languages, namely Bengali-Hindi (bnhi),
Marathi-Hindi (mrhi), Tamil-Hindi (tahi), Telugu-Hindi (tehi), and enhi for
Health, Tourism, and General domains.
Thierry Poibeau (LaTTICe), Shravan Vasishth
Journal-ref: Traitement Automatique des Langues, ATALA, 2014, Traitement
Automatique des Langues et Sciences Cognitives, 55 (3), pp.7-19
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
This special issue is dedicated to getting a better picture of the relationships
between computational linguistics and cognitive science. It specifically raises
two questions: “what is the potential contribution of computational language
modeling to cognitive science?” and conversely: “what is the influence of
cognitive science in contemporary computational linguistics?”
Jiajun Zhang, Chengqing Zong
Comments: 10 pages, 2 figures
Subjects: Computation and Language (cs.CL)
Neural Machine Translation (NMT) has become the new state-of-the-art in
several language pairs. However, how to integrate NMT with a bilingual
dictionary, which mainly contains words rarely or never seen in the bilingual
training data, remains a challenging problem. In this paper, we propose two
methods to bridge NMT and bilingual dictionaries. The core idea is
to design novel models that transform the bilingual dictionaries into adequate
sentence pairs, so that NMT can distil latent bilingual mappings from the ample
and repetitive phenomena. One method leverages a mixed word/character model and
the other attempts at synthesizing parallel sentences guaranteeing massive
occurrence of the translation lexicon. Extensive experiments demonstrate that
the proposed methods can remarkably improve the translation quality, and most
of the rare words in the test sentences can obtain correct translations if they
are covered by the dictionary.
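A toy sketch of the second method, synthesizing pseudo-parallel pairs so that
every dictionary entry occurs in sentence contexts; the template sentences and
the dictionary entry below are hypothetical and only illustrate the idea.

def synthesize_pairs(dictionary, templates):
    # Plant each dictionary entry into slotted template sentence pairs,
    # guaranteeing massive occurrence of the translation lexicon in the
    # synthetic training data.
    pairs = []
    for src_word, tgt_word in dictionary.items():
        for src_tpl, tgt_tpl in templates:
            pairs.append((src_tpl.format(src_word),
                          tgt_tpl.format(tgt_word)))
    return pairs

pairs = synthesize_pairs({'aardvark': 'Erdferkel'},
                         [('I saw a {} today .', 'Ich sah heute ein {} .')])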
Yiping Song, Rui Yan, Xiang Li, Dongyan Zhao, Ming Zhang
Subjects: Computation and Language (cs.CL)
Open-domain human-computer conversation has attracted much attention in the
field of NLP. Contrary to rule- or template-based domain-specific dialog
systems, open-domain conversation usually requires data-driven approaches,
which can be roughly divided into two categories: retrieval-based and
generation-based systems. Retrieval systems search a user-issued utterance
(called a query) in a large database, and return a reply that best matches the
query. Generative approaches, typically based on recurrent neural networks
(RNNs), can synthesize new replies, but they suffer from the problem of
generating short, meaningless utterances. In this paper, we propose a novel
ensemble of retrieval-based and generation-based dialog systems in the open
domain. In our approach, the retrieved candidate, in addition to the original
query, is fed to an RNN-based reply generator, so that the neural model is
aware of more information. The generated reply is then fed back as a new
candidate for post-reranking. Experimental results show that such an ensemble
outperforms each of its components by a large margin.
Aditya Joshi, Pranav Goel, Pushpak Bhattacharyya, Mark Carman
Comments: This paper is currently under review at EACL 2017. In case of rejection from the conference, it may be submitted to another conference in the future. This comment may be updated accordingly
Subjects: Computation and Language (cs.CL)
Past work in computational sarcasm deals primarily with sarcasm detection. In
this paper, we introduce a novel, related problem: sarcasm target
identification (i.e., extracting the target of ridicule in a sarcastic
sentence). We present an introductory approach for sarcasm target
identification. Our approach employs two types of extractors: one based on
rules, and another consisting of a statistical classifier. To compare our
approach, we use two baselines: a naïve baseline and another baseline based
on work in sentiment target identification. We perform our experiments on book
snippets and tweets, and show that our hybrid approach performs better than the
two baselines as well as better than either extractor used individually. Our
introductory approach establishes the viability of sarcasm
target identification, and will serve as a baseline for future work.
Douwe Kiela, Luana Bulat, Anita L. Vero, Stephen Clark
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Meaning has been called the “holy grail” of a variety of scientific
disciplines, ranging from linguistics to philosophy, psychology and the
neurosciences. The field of Artificial Intelligence (AI) is very much a part of
that list: the development of sophisticated natural language semantics is a
sine qua non for achieving a level of intelligence comparable to humans.
Embodiment theories in cognitive science hold that human semantic
representation depends on sensori-motor experience; the abundant evidence that
human meaning representation is grounded in the perception of physical reality
leads to the conclusion that meaning must depend on a fusion of multiple
(perceptual) modalities. Despite this, AI research in general, and its
subdisciplines such as computational linguistics and computer vision in
particular, have focused primarily on tasks that involve a single modality.
Here, we propose virtual embodiment as an alternative, long-term strategy for
AI research that is multi-modal in nature and that allows for the kind of
scalability required to develop the field coherently and incrementally, in an
ethically responsible fashion.
Theo Jepsen, Leandro Pacheco de Sousa, Huynh Tu Dang, Fernando Pedone, Robert Soulé
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Network latency can have a significant impact on the performance of
transactional storage systems, particularly in wide area or geo-distributed
deployments. To reduce latency, systems typically rely on a cache to service
read-requests closer to the client. However, caches are not effective for
write-heavy workloads, which have to be processed by the storage system in
order to maintain serializability.
This paper presents a new technique, called optimistic abort, which reduces
network latency for high-contention, write-heavy workloads by identifying
transactions that will abort as early as possible, and aborting them before
they reach the store. We have implemented optimistic abort in a system called
Gotthard, which leverages recent advances in network data plane programmability
to execute transaction processing logic directly in network devices. Gotthard
examines network traffic to observe and log transaction requests. If Gotthard
suspects that a transaction is likely to be aborted at the store, it aborts the
transaction early by re-writing the packet header, and routing the packets back
to the client. Gotthard significantly reduces the overall latency and improves
the throughput for high-contention workloads.
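The following Python sketch models the early-abort decision at a purely
conceptual level; it is an illustration only, since Gotthard itself executes
this logic in the network data plane, and the transaction format here is
hypothetical.

# The "switch" logs the latest version of each key it forwards; a
# transaction whose read set is already stale is bounced back to the
# client without ever reaching the store.
recent_writes = {}   # key -> latest version seen in forwarded traffic

def forward(txn):
    # txn = {'reads': {key: version_read}, 'writes': [keys]}
    for key, version in txn['reads'].items():
        if recent_writes.get(key, version) != version:
            return 'abort-early'        # rewrite header, route to client
    for key in txn['writes']:
        recent_writes[key] = recent_writes.get(key, 0) + 1
    return 'forward-to-store'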
Cristóbal A. Navarro, Benjamín Bustos, Nancy Hitschfeld
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
The problem of parallel thread mapping is studied for the case of discrete
orthogonal (m)-simplices. The possibility of an (O(1))-time recursive
block-space map (\lambda: \mathbb{Z}^m \mapsto \mathbb{Z}^m) is analyzed from
the point of view of parallel space efficiency and potential performance
improvement. The (2)-simplex and (3)-simplex are analyzed as special cases,
where constant-time maps are found, providing a potential improvement of up to
(2\times) and (6\times) over a bounding-box approach, respectively. For the
general case, it is shown that finding an efficient recursive parallel space
for an (m)-simplex depends on the choice of two parameters, for which some
insights are provided; suitable choices can lead to a volume that matches the
(m)-simplex for (n>n_0), making the parallel space approximately (m!) times
more efficient than a bounding-box approach.
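For intuition, the textbook closed-form map from a linear thread index to the
discrete 2-simplex looks as follows; this is the well-known triangular-number
construction, not necessarily the paper's (\lambda) map.

from math import floor, sqrt

def linear_to_triangle(k):
    # O(1) map from linear index k to coordinates (i, j) in the
    # discrete 2-simplex {(i, j) : 0 <= j <= i}, so a 1-D grid of
    # threads covers a triangular domain with no idle threads.
    i = int(floor((sqrt(8.0 * k + 1.0) - 1.0) / 2.0))
    j = k - i * (i + 1) // 2
    return i, j

assert [linear_to_triangle(k) for k in range(6)] == \
       [(0, 0), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2)]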
Emanuele Carlini, Massimo Coppola, Patrizio Dazzi, Matteo Mordacchini
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
This paper sketches the challenges that must be addressed to realise an
Ephemeral Cloud Federation, an innovative cloud computing paradigm that
enables the exploitation of a dynamic, personalised and context-aware set of
resources.
The aim of the Ephemeral Federation is to answer the need of combining private
data-centres with both federations of cloud providers and resources on the
edge of the network.
The goal of the Ephemeral Federation is to deliver context-aware and
personalised federations of computational, data and network resources, able to
manage their heterogeneity in a highly distributed deployment, which can
dynamically bring data and computation close to the final user.
Fanny Pascual, Krzysztof Rzadca
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
In data centers, up to dozens of tasks are colocated on a single physical
machine. Machines are used more efficiently, but tasks’ performance
deteriorates, as colocated tasks compete for shared resources. As tasks are
heterogeneous (CPU-, memory-, network- or disk-intensive), the resulting
performance dependencies are complex.
We explore a new combinatorial optimization model that uses two parameters of
a task – its size and its type – to characterize how a task influences the
performance of the other tasks allocated to the same machine. We study the
egalitarian optimization goal: maximizing the worst-off performance. This
problem generalizes the classic makespan minimization on multiple processors
(P||Cmax).
We prove that polynomially-solvable variants of multiprocessor scheduling
become NP-hard and hard to approximate when the number of types is not
constant. We propose a PTAS and a series of fast approximation algorithms when
the number of types is constant. By simulation on instances derived from a
trace of one of Google's clusters, we show that our algorithms that take
types into account lead to lower costs compared with the P||Cmax baseline.
The notion of type enables us to model degeneration of performance caused by
colocation using standard combinatorial optimization methods. Types add a layer
of additional complexity. However, our results – approximation algorithms and
good average-case performance – show that types can be handled efficiently.
George M Slota, Sivasankaran Rajamanickam, Karen Devine, Kamesh Madduri
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
We introduce XtraPuLP, a new distributed-memory graph partitioner designed to
process trillion-edge graphs. XtraPuLP is based on the scalable label
propagation community detection technique, which has been demonstrated as a
viable means to produce high quality partitions with minimal computation time.
On a collection of large sparse graphs, we show that XtraPuLP partitioning
quality is comparable to state-of-the-art partitioning methods. We also
demonstrate that XtraPuLP can produce partitions of real-world graphs with
billion+ vertices in minutes. Further, we show that using XtraPuLP partitions
for distributed-memory graph analytics leads to significant end-to-end
execution time reduction.
Soumitra Pal, Tingyang Xu, Tianbao Yang, Sanguthevar Rajasekaran, Jinbo Bi
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
In prior works, stochastic dual coordinate ascent (SDCA) has been
parallelized in a multi-core environment where the cores communicate through
shared memory, or in a multi-processor distributed memory environment where the
processors communicate through message passing. In this paper, we propose a
hybrid SDCA framework for multi-core clusters, the most common high performance
computing environment that consists of multiple nodes each having multiple
cores and its own shared memory. We distribute data across nodes where each
node solves a local problem in an asynchronous parallel fashion on its cores,
and then the local updates are aggregated via an asynchronous across-node
update scheme. The proposed double asynchronous method converges to a global
solution for (L)-Lipschitz continuous loss functions, and at a linear
convergence rate if a smooth convex loss function is used. Extensive empirical
comparison has shown that our algorithm scales better than the best known
shared-memory methods and runs faster than previous distributed-memory methods.
Big datasets, such as one of 280 GB from the LIBSVM repository, cannot be
accommodated on a single node and hence cannot be solved by a purely
shared-memory parallel algorithm. For such a dataset, our hybrid algorithm
takes 30 seconds to achieve
a duality gap of (10^{-6}) on 16 nodes each using 8 cores, which is
significantly faster than the best known distributed algorithms, such as
CoCoA+, that take more than 300 seconds on 16 nodes.
Rodrigo Canales, Elmar Peise, Paolo Bientinesi
Comments: 16 pages, 5 figures
Subjects: Computation (stat.CO); Distributed, Parallel, and Cluster Computing (cs.DC); Mathematical Software (cs.MS)
Even though in recent years the scale of statistical analysis problems has
increased tremendously, many statistical software tools are still limited to
single-node computations. However, statistical analyses are largely based on
dense linear algebra operations, which have been deeply studied, optimized and
parallelized in the high-performance-computing community. To make
high-performance distributed computations available for statistical analysis,
and thus enable large scale statistical computations, we introduce RElem, an
open source package that integrates the distributed dense linear algebra
library Elemental into R. While on the one hand, RElem provides direct wrappers
of Elemental’s routines, on the other hand, it overloads various operators and
functions to provide an entirely native R experience for distributed
computations. We showcase how simple it is to port existing R programs to
RElem and demonstrate that RElem indeed makes it possible to scale beyond the
single-node limitation of R with the full performance of Elemental and without
any overhead.
Tian Jin, Nirmal Prajapati, Waruna Ranasinghe, Guillaume Iooss, Yun Zou, Sanjay Rajopadhye, David Wonnacott
Subjects: Programming Languages (cs.PL); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
Polyhedral compilers perform optimizations such as tiling and
parallelization; when doing both, they usually generate code that executes
“barrier-synchronized wavefronts” of tiles. We present a system to express and
generate code for hybrid schedules, where some constraints are automatically
satisfied through the structure of the code, and the remainder are dynamically
enforced at run-time with data flow mechanisms. We prove bounds on the added
overheads that are better, by at least one polynomial degree, than those of
previous techniques.
We propose a generic mechanism to implement the needed synchronization, and
show it can be easily realized for a variety of targets: OpenMP, Pthreads, GPU
(CUDA or OpenCL) code, languages like X10, Habanero, Cilk, as well as data flow
platforms like DAGuE and OpenStream, and MPI. We also provide a simple concrete
implementation that works without the need of any sophisticated run-time
mechanism.
Our experiments show our simple implementation to be competitive or better
than the wavefront-synchronized code generated by other systems. We also show
how the proposed mechanism can achieve 24% to 70% reduction in energy.
Xin Wang, Jinbo Bi, Shipeng Yu, Jiangwen Sun
Comments: Advances in Neural Information Processing Systems 2014
Subjects: Learning (cs.LG)
We investigate a general framework of multiplicative multitask feature
learning which decomposes each task’s model parameters into a multiplication of
two components. One of the components is used across all tasks and the other
component is task-specific. Several previous methods have been proposed as
special cases of our framework. We study the theoretical properties of this
framework when different regularization conditions are applied to the two
decomposed components. We prove that this framework is mathematically
equivalent to the widely used multitask feature learning methods that are based
on a joint regularization of all model parameters, but with a more general form
of regularizers. Further, an analytical formula is derived for the across-task
component as related to the task-specific component for all these regularizers,
leading to a better understanding of the shrinkage effect. Study of this
framework motivates new multitask learning algorithms. We propose two new
learning formulations by varying the parameters in the proposed framework.
Empirical studies have revealed the relative advantages of the two new
formulations by comparing with the state of the art, which provides instructive
insights into the feature learning problem with multiple tasks.
Lu Liu, Zhiguang Wang
Comments: Submitted to PaciVis 2017
Subjects: Learning (cs.LG); Human-Computer Interaction (cs.HC)
Time series are attracting more attention across statistics, machine learning
and pattern recognition as they appear widely in both industry and academia,
but few advances have been achieved in effective time series visualization due
to their temporal dimensionality and complex dynamics. Inspired by recent
efforts on using network metrics to characterize time series for
classification, we present an approach to visualize time series as complex
networks based on a first-order Markov process and temporal ordering. Unlike
classical bar charts, line plots and other statistics-based graphs, our
approach delivers more
intuitive visualization that better preserves both the temporal dependency and
frequency structures. It provides a natural inverse operation to map the graph
back to time series, making it possible to use graph statistics to characterize
time series for better visual exploration and statistical analysis. Our
experimental results suggest the effectiveness on various tasks such as system
identification, classification and anomaly detection on both synthetic and the
real time series data.
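A compact numpy sketch of the core transformation (the bin count is an
illustrative choice, and real pipelines typically z-normalize first):

import numpy as np

def series_to_markov_network(series, n_bins=8):
    # Quantile-bin the series, count first-order transitions between
    # consecutive bins, and row-normalize: the result is a Markov
    # transition matrix that doubles as the weighted adjacency matrix
    # of the network to be visualized.
    edges = np.quantile(series, np.linspace(0, 1, n_bins + 1)[1:-1])
    states = np.digitize(series, edges)
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(states[:-1], states[1:]):
        W[a, b] += 1
    return W / np.maximum(W.sum(axis=1, keepdims=True), 1)

Because the bin identities are kept, a path through this graph can be mapped
back to a representative time series, which is the inverse operation the
abstract refers to.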
Zhiguang Wang, Wei Song, Lu Liu, Fan Zhang, Junxiao Xue, Yangdong Ye, Ming Fan, Mingliang Xu
Comments: Submitted to NeuroComputing. arXiv admin note: text overlap with arXiv:1505.04366 by other authors
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We propose a new model based on the deconvolutional networks and SAX
discretization to learn the representation for multivariate time series.
Deconvolutional networks fully exploit the powerful expressiveness of deep
neural networks in an unsupervised manner.
We design a network structure specifically to capture the cross-channel
correlation with deconvolution, forcing the pooling operation to perform the
dimension reduction along each position in the individual channel.
Discretization based on Symbolic Aggregate Approximation is applied on the
feature vectors to further extract a bag of features. We show how this
representation and bag of features help with classification. A full comparison
with the sequence-distance-based approach is provided to demonstrate the
effectiveness of our approach on the standard datasets. We further build the
Markov matrix from the discretized representation from the deconvolution to
visualize the time series as complex networks, which show more class-specific
statistical properties and clear structures with respect to different labels.
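For reference, a minimal SAX discretization in Python with the standard
Gaussian breakpoints for a four-letter alphabet (the segment count is an
illustrative choice):

import numpy as np

def sax(series, n_segments=8, breakpoints=(-0.67, 0.0, 0.67)):
    # Symbolic Aggregate Approximation: z-normalize, piecewise-aggregate
    # into n_segments means, then map each mean to a symbol using
    # breakpoints chosen so the symbols are equiprobable under a
    # standard Gaussian.
    x = (series - np.mean(series)) / (np.std(series) + 1e-12)
    paa = x[: len(x) // n_segments * n_segments] \
            .reshape(n_segments, -1).mean(axis=1)
    return ''.join('abcd'[np.digitize(v, breakpoints)] for v in paa)

In the pipeline above, SAX is applied to the deconvolutional feature vectors
rather than to the raw series, and the resulting symbol strings form the bag
of features used for classification.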
L. Elisa Celis, Amit Deshpande, Tarun Kathuria, Nisheeth K. Vishnoi
Subjects: Learning (cs.LG)
Due to the recent cases of algorithmic bias in data-driven decision-making,
machine learning methods are being put under the microscope in order to
understand the root cause of these biases and how to correct them. Here, we
consider a basic algorithmic task that is central in machine learning:
subsampling from a large data set. Subsamples are used both as an end-goal in
data summarization (where fairness could either be a legal, political or moral
requirement) and to train algorithms (where biases in the samples are often a
source of bias in the resulting model). Consequently, there is a growing effort
to modify either the subsampling methods or the algorithms themselves in order
to ensure fairness. However, in doing so, a question that seems to be
overlooked is whether it is possible to produce fair subsamples that are also
adequately representative of the feature space of the data set – an important
and classic requirement in machine learning. Can diversity and fairness be
simultaneously ensured? We start by noting that, in some applications,
guaranteeing one does not necessarily guarantee the other, and a new approach
is required. Subsequently, we present an algorithmic framework which allows us
to produce both fair and diverse samples. Our experimental results on an image
summarization task show marked improvements in fairness without compromising
feature diversity by much, giving us the best of both the worlds.
Andre G. C. Pacheco, Renato A. Krohling
Comments: 16 pages, 8 figures and 11 tables
Subjects: Learning (cs.LG)
In classification problems, when multiple algorithms are applied to different
benchmarks, a difficult issue arises, i.e., how can we rank the algorithms? In
machine learning it is common to run the algorithms several times and then
report a statistic in terms of means and standard deviations. In order to
compare the performance of the algorithms, it is very common to employ
statistical tests. However, these tests may also present limitations, since
they consider only the means and not the standard deviations of the obtained
results. In this paper, we present the so-called A-TOPSIS, based on TOPSIS
(Technique for Order Preference by Similarity to Ideal Solution), to solve the
problem of ranking and comparing classification algorithms in terms of means
and standard deviations. We use two case studies to illustrate the A-TOPSIS for
ranking classification algorithms and the results show the suitability of
A-TOPSIS to rank the algorithms. The presented approach is general and can be
applied to compare the performance of stochastic algorithms in machine
learning. Finally, to encourage researchers to use A-TOPSIS for ranking
algorithms, we also present an easy-to-use A-TOPSIS web
framework.
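The flavor of the method can be seen in a plain TOPSIS sketch over a decision
matrix whose criteria are mean accuracy (benefit) and standard deviation
(cost); A-TOPSIS builds on this idea, so treat the following as an
illustration with made-up numbers rather than the paper's exact procedure.

import numpy as np

def topsis(scores, benefit, weights=None):
    # Rows = algorithms, columns = criteria. Normalize, weight, then
    # rank by relative closeness to the ideal solution and distance
    # from the anti-ideal one.
    M = scores / np.linalg.norm(scores, axis=0)
    M = M * (weights if weights is not None else 1.0 / scores.shape[1])
    ideal = np.where(benefit, M.max(0), M.min(0))
    anti = np.where(benefit, M.min(0), M.max(0))
    d_pos = np.linalg.norm(M - ideal, axis=1)
    d_neg = np.linalg.norm(M - anti, axis=1)
    return d_neg / (d_pos + d_neg)   # closeness: higher ranks better

# Two algorithms x (mean accuracy, std of accuracy):
print(topsis(np.array([[0.92, 0.05], [0.90, 0.01]]),
             benefit=np.array([True, False])))

In this toy example the second algorithm, with a slightly lower mean but a
much lower standard deviation, ranks first, which is exactly the trade-off a
means-only statistical test would miss.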
J. Albericio, P. Judd, A. Delmás, S. Sharify, A. Moshovos
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
We quantify a source of ineffectual computations when processing the
multiplications of the convolutional layers in Deep Neural Networks (DNNs) and
propose Pragmatic (PRA), an architecture that exploits it to improve performance
and energy efficiency. The source of these ineffectual computations is best
understood in the context of conventional multipliers which generate internally
multiple terms, that is, products of the multiplicand and powers of two, which
added together produce the final product [1]. At runtime, many of these terms
are zero as they are generated when the multiplicand is combined with the
zero-bits of the multiplicator. While conventional bit-parallel multipliers
calculate all terms in parallel to reduce individual product latency, PRA
calculates only the non-zero terms using a) on-the-fly conversion of the
multiplicator representation into an explicit list of powers of two, and b)
hybrid bit-parallel multiplicand/bit-serial multiplicator processing units. PRA
exploits two sources of ineffectual computations: 1) the aforementioned zero
product terms which are the result of the lack of explicitness in the
multiplicator representation, and 2) the excess in the representation precision
used for both multiplicands and multiplicators, e.g., [2]. Measurements
demonstrate that for the convolutional layers, a straightforward variant of PRA
improves performance by 2.6x over the DaDiaNao (DaDN) accelerator [3] and by
1.4x over STR [4]. Similarly, PRA improves energy efficiency by 28% and 10% on
average compared to DaDN and STR. An improved cross-lane synchronization scheme
boosts performance improvements to 3.1x over DaDN. Finally, Pragmatic's benefits
persist even with an 8-bit quantized representation [5].
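The source of slack is easy to see in code: only the one-bits of the
multiplicator generate non-zero terms, so a sparse operand needs far fewer
serial steps than its bit width suggests.

def essential_terms(multiplicator):
    # Positions of the one-bits: the only terms a serial multiplier
    # actually needs to process. A 16-bit value like 0b1010 contributes
    # 2 terms instead of 16, which is the slack PRA exploits.
    return [i for i in range(multiplicator.bit_length())
            if (multiplicator >> i) & 1]

print(essential_terms(0b0000000000001010))  # [1, 3]: x*10 = (x<<1)+(x<<3)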
Jiajun Wu, Chengkai Zhang, Tianfan Xue, William T. Freeman, Joshua B. Tenenbaum
Comments: NIPS 2016. The first two authors contributed equally to this work
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
We study the problem of 3D object generation. We propose a novel framework,
namely 3D Generative Adversarial Network (3D-GAN), which generates 3D objects
from a probabilistic space by leveraging recent advances in volumetric
convolutional networks and generative adversarial nets. The benefits of our
model are three-fold: first, the use of an adversarial criterion, instead of
traditional heuristic criteria, enables the generator to capture object
structure implicitly and to synthesize high-quality 3D objects; second, the
generator establishes a mapping from a low-dimensional probabilistic space to
the space of 3D objects, so that we can sample objects without a reference
image or CAD models, and explore the 3D object manifold; third, the adversarial
discriminator provides a powerful 3D shape descriptor which, learned without
supervision, has wide applications in 3D object recognition. Experiments
demonstrate that our method generates high-quality 3D objects, and our
unsupervisedly learned features achieve impressive performance on 3D object
recognition, comparable with those of supervised learning methods.
Jiaqi Mu, Suma Bhat, Pramod Viswanath
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)
Vector representations of words have heralded a transformational approach to
classical problems in NLP; the most popular example is word2vec. However, a
single vector does not suffice to model the polysemous nature of many
(frequent) words, i.e., words with multiple meanings. In this paper, we propose
a three-fold approach for unsupervised polysemy modeling: (a) context
representations, (b) sense induction and disambiguation and (c) lexeme (as a
word and sense pair) representations. A key feature of our work is the finding
that a sentence containing a target word is well represented by a low rank
subspace, instead of a point in a vector space. We then show that the subspaces
associated with a particular sense of the target word tend to intersect over a
line (one-dimensional subspace), which we use to disambiguate senses using a
clustering algorithm that harnesses the Grassmannian geometry of the
representations. The disambiguation algorithm, which we call (K)-Grassmeans,
leads to a procedure to label the different senses of the target word in the
corpus — yielding lexeme vector representations, all in an unsupervised manner
starting from a large (Wikipedia) corpus in English. Apart from several
prototypical target (word,sense) examples and a host of empirical studies to
intuit and justify the various geometric representations, we validate our
algorithms on standard sense induction and disambiguation datasets and present
new state-of-the-art results.
Felipe C. Pinheiro, Cassio G. Lopes
Subjects: Systems and Control (cs.SY); Learning (cs.LG)
This work proposes a low complexity nonlinearity model and develops adaptive
algorithms over it. The model is based on the decomposable—or rank-one, in
tensor language—Volterra kernels. It may also be described as a product of
FIR filters, which explains its low complexity. The rank-one model is also
interesting because it comes from a well-posed problem in approximation theory.
The paper uses such model in an estimation theory context to develop an exact
gradient-type algorithm, from which adaptive algorithms such as the least mean
squares (LMS) filter and its data-reuse version—the TRUE-LMS—are derived.
Stability and convergence issues are addressed. The algorithms are then tested
in simulations, which show their good performance when compared to other
nonlinear processing algorithms in the literature.
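A minimal sketch of a gradient-type LMS for a second-order rank-one model,
where the output is the product of two length-L FIR filter outputs; this is an
illustration under simplifying assumptions (1-D numpy arrays x and d, made-up
step size), not the paper's exact TRUE-LMS.

import numpy as np

def rank_one_volterra_lms(x, d, L=4, mu=0.01):
    # Model: y(n) = (h1 * x)(n) * (h2 * x)(n), a decomposable
    # (rank-one) second-order Volterra kernel.
    h1, h2 = np.zeros(L), np.zeros(L)
    h1[0] = h2[0] = 1.0          # avoid the all-zero stationary point
    for n in range(L, len(x)):
        xw = x[n - L:n][::-1]    # current regressor window
        y1, y2 = h1 @ xw, h2 @ xw
        e = d[n] - y1 * y2
        h1 += mu * e * y2 * xw   # chain rule through the product
        h2 += mu * e * y1 * xw
    return h1, h2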
Udi Margolin, Alberto Mozo, Bruno Ordozgoiti, Danny Raz, Elisha Rosensweig, Itai Segall
Subjects: Networking and Internet Architecture (cs.NI); Learning (cs.LG)
5G networks are expected to be more dynamic and chaotic in their structure
than current networks. With the advent of Network Function Virtualization
(NFV), Network Functions (NF) will no longer be tightly coupled with the
hardware they are running on, which poses new challenges in network management.
Noisy neighbor is a term commonly used to describe situations in NFV
infrastructure where an application experiences degradation in performance due
to the fact that some of the resources it needs are occupied by other
applications in the same cloud node. These situations cannot be easily
identified using straightforward approaches, which calls for the use of
sophisticated methods for NFV infrastructure management. In this paper we
demonstrate how Machine Learning (ML) techniques can be used to identify such
events. Through experiments using data collected on a real NFV infrastructure, we
show that standard models for automated classification can detect the noisy
neighbor phenomenon with an accuracy of more than 90% in a simple scenario.
Ilija Bogunovic, Jonathan Scarlett, Andreas Krause, Volkan Cevher
Comments: Accepted to NIPS 2016
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Learning (cs.LG)
We present a new algorithm, truncated variance reduction (TruVaR), that
treats Bayesian optimization (BO) and level-set estimation (LSE) with Gaussian
processes in a unified fashion. The algorithm greedily shrinks a sum of
truncated variances within a set of potential maximizers (BO) or unclassified
points (LSE), which is updated based on confidence bounds. TruVaR is effective
in several important settings that are typically non-trivial to incorporate
into myopic algorithms, including pointwise costs and heteroscedastic noise. We
provide a general theoretical guarantee for TruVaR covering these aspects, and
use it to recover and strengthen existing results on BO and LSE. Moreover, we
provide a new result for a setting where one can select from a number of noise
levels having associated costs. We demonstrate the effectiveness of the
algorithm on both synthetic and real-world data sets.
Terry Taewoong Um, Vahid Babakeshizadeh, Dana Kulic
Comments: submitted to ICRA2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
The ability to accurately observe human motion and identify human activities
is essential for developing automatic rehabilitation and sports training
systems. In this paper, large-scale exercise motion data obtained from a
forearm-worn wearable sensor are classified with a convolutional neural network
(CNN). Time series data consisting of accelerometer and orientation
measurements are formatted as “images”, allowing the CNN to automatically
extract discriminative features. The resulting CNN classifies 50 gym exercises
with 92.14% accuracy. A comparative study on the effects of image formatting
and different CNN architectures is also presented.
Rashad Eletreby, Osman Yağan
Comments: In proceedings of the 55th IEEE Conference on Decision and Control (CDC 2016). arXiv admin note: substantial text overlap with arXiv:1604.00460
Subjects: Information Theory (cs.IT)
We consider the network reliability problem in wireless sensor networks
secured by the heterogeneous random key predistribution scheme. This scheme
generalizes the Eschenauer-Gligor scheme by considering the cases where the network
comprises sensor nodes with varying level of resources; e.g., regular nodes vs.
cluster heads. The scheme induces the inhomogeneous random key graph, denoted
(\mathbb{G}(n;\pmb{\mu},\pmb{K},P)). We analyze the reliability of
(\mathbb{G}(n;\pmb{\mu},\pmb{K},P)) against random link failures. Namely, we
consider (\mathbb{G}(n;\pmb{\mu},\pmb{K},P,\alpha)) formed by deleting each
edge of (\mathbb{G}(n;\pmb{\mu},\pmb{K},P)) independently with probability
(1-\alpha), and study the probability that the resulting graph i) has no
isolated node; and ii) is connected. We present scaling conditions on
(\pmb{K}), (P), and (\alpha) such that both events take place with probability
zero or one, respectively, as the number of nodes gets large. We present
numerical results to support these conditions in the finite-node regime.
Rashad Eletreby, Osman Yağan
Comments: In Proceedings of the 54th Annual Allerton Conference on Communication, Control, and Computing
Subjects: Information Theory (cs.IT)
We investigate the secure connectivity of wireless sensor networks under a
heterogeneous random key predistribution scheme and a heterogeneous channel
model. In particular, we study a random graph formed by the intersection of an
inhomogeneous random key graph with an inhomogeneous Erdős–Rényi graph.
The former graph is naturally induced by the heterogeneous random key
predistribution scheme, while the latter graph constitutes a heterogeneous
on/off channel model wherein the wireless channel between a class-(i) node
and a class-(j) node is on with probability (\alpha_{ij}), independently. We
present conditions (in the form of zero-one laws) on how to scale the
parameters of the intersection model so that it has no isolated node with high
probability as the number of nodes gets large. We also present numerical
results to support these zero-one laws in the finite-node regime.
Sven Jacobsson, Giuseppe Durisi, Mikael Coldrey, Tom Goldstein, Christoph Studer
Subjects: Information Theory (cs.IT)
Massive multiuser (MU) multiple-input multiple-output (MIMO) is foreseen to
be one of the key technologies in fifth-generation wireless communication
systems. In this paper, we investigate the problem of downlink precoding for a
narrowband massive MU-MIMO system with low-resolution digital-to-analog
converters (DACs) at the base station (BS). We analyze the performance of
linear precoders, such as maximal-ratio transmission and zero-forcing, subject
to coarse quantization. Using Bussgang’s theorem, we derive a closed-form
approximation of the achievable rate of the coarsely quantized system. Our
results reveal that the infinite-resolution performance can be approached with
DACs using only 3 to 4 bits of resolution, depending on the number of BS
antennas and the number of user equipments (UEs). For the case of 1-bit DACs,
we also propose novel nonlinear precoding algorithms that significantly
outperform linear precoders at the cost of an increased computational
complexity. Specifically, we show that nonlinear precoding incurs only a 3 dB
penalty compared to the infinite-resolution case for an uncoded bit error rate
of (10^{-3}) in a system with 128 BS antennas that uses 1-bit DACs and serves 16
single-antenna UEs; in contrast, the penalty is about 8 dB for linear
precoders.
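A toy numpy experiment illustrating the setting: zero-forcing precoding
followed by per-antenna 1-bit DACs. The 128x16 dimensions follow the example
above; everything else (QPSK symbols, the crude error proxy) is an
illustrative simplification of the linear-precoder baseline, not the paper's
nonlinear precoders.

import numpy as np

rng = np.random.default_rng(0)
B, U = 128, 16                          # BS antennas, single-antenna UEs
H = (rng.standard_normal((U, B)) +
     1j * rng.standard_normal((U, B))) / np.sqrt(2)
s = (rng.choice([-1, 1], U) + 1j * rng.choice([-1, 1], U)) / np.sqrt(2)

# Zero-forcing precoder, then 1-bit DACs: only the signs of the real and
# imaginary parts of each antenna signal survive quantization.
x = H.conj().T @ np.linalg.solve(H @ H.conj().T, s)
x_q = (np.sign(x.real) + 1j * np.sign(x.imag)) / np.sqrt(2 * B)
y = H @ x_q
print(np.mean(np.abs(np.sign(y.real) - np.sign(s.real))) / 2)  # crude proxy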
Tom Goldstein, Christoph Studer
Subjects: Information Theory (cs.IT); Optimization and Control (math.OC)
We consider the recovery of a (real- or complex-valued) signal from
magnitude-only measurements, known as phase retrieval. We formulate phase
retrieval as a convex optimization problem, which we call PhaseMax. Unlike
other convex methods that use semidefinite relaxation and lift the phase
retrieval problem to a higher dimension, PhaseMax operates in the original
signal dimension. We show that the dual problem to PhaseMax is Basis Pursuit,
which implies that phase retrieval can be performed using algorithms initially
designed for sparse signal recovery. We develop sharp lower bounds on the
success probability of PhaseMax for a broad range of random measurement
ensembles, and we analyze the impact of measurement noise on the solution
accuracy. We use numerical results to demonstrate the accuracy of our recovery
guarantees, and we showcase the efficacy and limits of PhaseMax in practice.
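In the notation common in the PhaseMax literature, with an anchor (initial
guess) vector \hat{x} and measured magnitudes b_i = |\langle a_i, x_0\rangle|,
the program reads:

\max_{x \in \mathbb{C}^n} \; \operatorname{Re}\,\langle \hat{x}, x \rangle
\quad \text{subject to} \quad |\langle a_i, x \rangle| \le b_i,
\quad i = 1, \dots, m.

Each magnitude constraint is convex, and the linear objective pushes the
solution to the boundary of their intersection, which is why no semidefinite
lifting is needed and the problem stays in the original signal dimension.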
Alex Jones, Ben Adcock, Anders Hansen
Subjects: Information Theory (cs.IT)
Recently it has been established that asymptotic incoherence can be used to
facilitate subsampling, in order to optimize reconstruction quality, in a
variety of continuous compressed sensing problems, and the coherence structure
of certain one-dimensional Fourier sampling problems was determined. This paper
extends the analysis of asymptotic incoherence to cover multidimensional
reconstruction problems. It is shown that Fourier sampling and separable
wavelet sparsity in any dimension can yield the same optimal asymptotic
incoherence as in the one-dimensional case. Moreover, in two dimensions the
coherence structure is compatible with many standard two dimensional sampling
schemes that are currently in use. However, in higher dimensional problems with
poor wavelet smoothness we demonstrate that there are considerable restrictions
on how one can subsample from the Fourier basis with optimal incoherence. This
can be remedied by using a sufficiently smooth generating wavelet. It is also
shown that using tensor bases will always provide suboptimal decay marred by
problems associated with dimensionality. The impact of asymptotic incoherence
on the ability to subsample is demonstrated with some simple two dimensional
numerical experiments.
Kim Mahler, Wilhelm Keusgen, Fredrik Tufvesson, Thomas Zemen, Giuseppe Caire
Subjects: Information Theory (cs.IT)
A detailed understanding of the dynamic processes of vehicular radio channels
is crucial for their realistic modeling. In this paper, we present multipath
components (MPCs) tracking results from a channel sounder measurement with 1
GHz bandwidth at a carrier frequency of 5.7 GHz. We describe in detail the
applied algorithms and perform a tracking performance evaluation based on
artificial channels and on measurement data from a tunnel scenario. The
tracking performance of the proposed algorithm is comparable to the tracking
performance of the state-of-the-art Gaussian mixture probability hypothesis
density filter, yet with a significantly lower complexity. The fluctuation of
the measured channel gain is followed very well by the proposed tracking
algorithm, with a power loss of only 2.5 dB. We present statistical
distributions for the number of MPCs and the birth/death rate. The applied
algorithms and tracking results can be used to enhance the development of
geometry-based channel models.
Soo-Jin Kim, Gee-Yong Suk, Jong-Seok Lee, Chan-Byoung Chae
Subjects: Information Theory (cs.IT); Multimedia (cs.MM)
An important concept in wireless systems has been quality of experience
(QoE)-aware video transmission. Such communications are considered not only
connection-based communications but also content-aware communications, since
the video quality is closely related to the content itself. It becomes
therefore necessary for video communications to utilize a cross-layer design
(also known as joint source and channel coding). To provide efficient methods
of allocating network resources, the wireless network uses its cross-layer
knowledge to perform unequal error protection (UEP) solutions. In this article,
we summarize the latest video transmission technologies that are based on
scalable video coding (SVC) over multiple-input multiple-output (MIMO) systems
with cross-layer designs. To provide insight into video transmission in
wireless networks, we investigate UEP solutions in the delivering of video over
massive MIMO systems. Our results show that in terms of quality of experience
(QoE), SVC layer prioritization, which was considered important in the prior
work, is not always beneficial in massive MIMO systems; consideration must be
given to the content characteristics.
Xiaoke Zhang, Qian Gao, Chen Gong, Zhengyuan Xu
Comments: 5 pages, 4 figures. This article has been submitted to IEEE Communications Letters for publication on July 27, 2016
Subjects: Information Theory (cs.IT)
To design an efficient interference management and multiple access scheme for
a visible light communication (VLC) network, this letter leverages
non-orthogonal multiple access (NOMA), which has received significant attention
in (5^{th})-generation wireless communication.
interference from the successive interference cancellation in NOMA taken into
account, we optimize the power allocation for the NOMA VLC network to improve
the achievable user rate under user quality of service (QoS) constraints. The
performance of the proposed approaches is evaluated by numerical results.
Roy Timo, Shirin Saeedi Bidokhti, Michèle Wigger, Bernhard C. Geiger
Subjects: Information Theory (cs.IT)
This paper takes a rate-distortion approach to understanding the
information-theoretic laws governing cache-aided communications systems.
Specifically, we characterise the optimal tradeoffs between the delivery rate,
cache capacity and reconstruction distortions for a single-user problem and
some special cases of a two-user problem. Our analysis considers discrete
memoryless sources, expected- and excess-distortion constraints, and separable
and f-separable distortion functions. We also establish a strong converse for
separable-distortion functions, and we show that lossy versions of common
information (Gács–Körner and Wyner) play an important role in caching.
Finally, we illustrate and explicitly evaluate these laws for multivariate
Gaussian sources and binary symmetric sources.
Mine Alsan
Comments: Submitted to IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
Arıkan’s polar coding is by now a well-studied technique that allows
achieving the symmetric capacity of binary input memoryless channels with low
complexity encoding and decoding, provided that the polar decoding architecture
is used and the decoding metric is matched to the true channel. In this paper,
we analyze communication rates that are achievable when the polar
coding/decoding architecture is used with the decoder using an incorrect model
of the channel. We define the ‘polar mismatched capacity’ as an analogue of the
classical mismatched capacity, give an expression for it, and derive bounds on
it.
Alessandro Biason, Subhrakanti Dey, Michele Zorzi
Comments: 6 pages, 6 figures, submitted to IEEE Wireless Communications and Networking Conference (WCNC)
Subjects: Information Theory (cs.IT)
The problem of finding decentralized transmission policies in a wireless
communication network with energy harvesting constraints is formulated and
solved using the decentralized Markov decision process framework. The proposed
policy defines the transmission probabilities of all devices so as to correctly
balance the collision probabilities with the energy constraints. After an
initial coordination phase, in which the network parameters are initialized for
all devices, every node proceeds in a fully decentralized fashion. We
numerically show that, because of the harvesting, a fully orthogonal scheme
(e.g., TDMA-like) is sub-optimal in this scenario, and that the optimal
trade-off lies between an orthogonal and a completely symmetric system.
Ahmad Salim, Tolga M. Duman
Journal-ref: Wirel. Commun. Mob. Comput., 16: 2422 to 2435 (2016)
Subjects: Information Theory (cs.IT)
In this paper, we propose two schemes for asynchronous multi-relay two-way
relay (MR-TWR) systems in which neither the users nor the relays know the
channel state information (CSI). In an MR-TWR system, two users exchange their
messages with the help of (N_R) relays. Most of the existing works on MR-TWR
systems based on differential modulation assume perfect symbol-level
synchronization between all communicating nodes. However, this assumption is
not valid in many practical systems, which makes the design of differentially
modulated schemes more challenging. Therefore, we design differential
modulation schemes that can tolerate timing misalignment under
frequency-selective fading. We investigate the performance of the proposed
schemes in terms of either probability of bit error or pairwise error
probability. Through numerical examples, we show that the proposed schemes
outperform existing competing solutions in the literature, especially for high
signal-to-noise ratio (SNR) values.
Maria Angeles Vazquez-Castro, Masahito Hayashi
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)
Shannon introduced the classic model of a cryptosystem in 1949, where Eve has
access to an identical copy of the cyphertext that Alice sends to Bob. Shannon
defined perfect secrecy to be the case when the mutual information between the
plaintext and the cyphertext is zero. Perfect secrecy is motivated by
error-free transmission and requires that Bob and Alice share a secret key.
Wyner in 1975, and later I. Csiszár and J. Körner in 1978, modified the
Shannon model assuming that the channels are noisy and proved that secrecy can
be achieved without sharing a secret key. This model is called wiretap channel
model and secrecy capacity is known when Eve’s channel is noisier than Bob’s
channel.
In this paper we review the concept of wiretap coding from the satellite
channel viewpoint. We also review subsequently introduced stronger secrecy
levels which can be numerically quantified and are keyless unconditionally
secure under certain assumptions. We introduce the general construction of
wiretap coding and analyse its applicability for a typical satellite channel.
From our analysis we discuss the potential of keyless information theoretic
physical layer security for satellite channels based on wiretap coding. We also
identify system design implications for enabling simultaneous operation with
additional information theoretic security protocols.
Rana Abbas, Mahyar Shirvanimoghaddam, Yonghui Li, Branka Vucetic
Comments: 13 pages, 13 figures, 1 table
Subjects: Information Theory (cs.IT)
We propose a novel random multiple access (RMA) scheme with quality of
service (QoS) guarantees for machine-to-machine (M2M) communications. We
consider a slotted uncoordinated data transmission period during which machine
type communication (MTC) devices transmit over the same radio channel. Based on
the latency requirements, MTC devices are divided into groups of different
sizes, and the transmission frame is divided into subframes of different
lengths. In each subframe, each group is assigned an access probability based
on which an MTC device decides to transmit replicas of its packet or remain
silent. The base station (BS) employs successive interference cancellation
(SIC) to recover all the superposed packets. We derive the closed form
expressions for the average probability of device resolution for each group,
and we use these expressions to design the access probabilities. The accuracy
of the expressions is validated through Monte Carlo simulations. We show that
the designed access probabilities can guarantee the QoS requirements with high
reliability and high energy efficiency. Finally, we show that RMA can
outperform standard coordinated access schemes as well as some of the recently
proposed M2M access schemes for cellular networks.
Stefano Buzzi, Carmen D'Andrea
Comments: 6 pages. To be presented at the 2016 GLOBECOM Workshop on Emerging Technologies for 5G Wireless Cellular Networks (ET5G)
Subjects: Information Theory (cs.IT)
Future cellular systems based on the use of above-6 GHz frequencies, the
so-called millimeter wave (mmWave) bandwidths, will heavily rely on the use of
antenna arrays both at the transmitter and at the receiver, possibly with a
large number of elements. For complexity reasons, fully digital precoding and
postcoding structures may turn out to be unfeasible, and thus suboptimal
structures, making use of simplified hardware and a limited number of RF
chains, have been investigated. This paper considers and makes a comparative
assessment, both from a spectral efficiency and energy efficiency point of
view, of several suboptimal precoding and postcoding beamforming structures for
the downlink of a cellular multiuser MIMO (MU-MIMO) system. Based on the most
recently available data for the energy consumption of phase shifters and
switches, we show that there are cases where fully-digital beamformers may
achieve a larger energy efficiency than lower-complexity solutions, as well as
that structures based on the exclusive use of switches achieve quite
unsatisfactory performance in realistic scenarios.
Eric Chaumette, Alexandre Renaux, Mohammed Nabil El Korso
Subjects: Information Theory (cs.IT)
A fairly general class of Bayesian “large-error” lower bounds of the
Weiss-Weinstein family, essentially free from regularity conditions on the
support of the probability density functions, and for which a limiting form
yields a generalized Bayesian Cramér-Rao bound (BCRB), is introduced. In a
large number of cases, the generalized BCRB appears to be the
Bobrovsky-Mayer-Wolf-Zakai bound (BMZB). Interestingly enough, a regularized
form of the Bobrovsky-Zakai bound (BZB), applicable when the support of the
prior is a constrained parameter set, is obtained. A modified Weiss-Weinstein
bound and a modified BZB, whose limiting form is the BMZB, are proposed, in
expectation of an increased tightness in the threshold region. Some of the
proposed results are exemplified
with a reference problem in signal processing: the Gaussian observation model
with parameterized mean and uniform prior.
X. Lu, R. C. de Lamare
Comments: 8 figures
Subjects: Information Theory (cs.IT)
In this paper, we propose novel non-linear precoders for the downlink of a
multi-user MIMO system with the existence of multiple eavesdroppers. The
proposed non-linear precoders are designed to improve the physical-layer
secrecy rate. Specifically, we combine the non-linear successive optimization
Tomlinson-Harashima precoding (SO-THP) with generalized matrix inversion (GMI)
technique to maximize the physical-layer secrecy rate. For the purpose of
comparison, we examine different traditional precoders with the proposed
algorithm in terms of secrecy rate as well as BER performance. We also
investigate simplified generalized matrix inversion (S-GMI) and
lattice-reduction (LR) techniques in order to efficiently compute the
parameters of the precoders. We further conduct computational complexity and
secrecy rate analysis of the proposed and existing algorithms. In addition, in
the scenario without knowledge of channel state information (CSI) to the
eavesdroppers, a strategy of injecting artificial noise (AN) prior to the
transmission is employed to enhance the physical-layer secrecy rate. Simulation
results show that the proposed non-linear precoders outperform existing
precoders in terms of BER and secrecy rate performance.
Hina Tabassum, Ekram Hossain, Md. Jahangir Hossain
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
Non-orthogonal multiple access (NOMA) serves multiple users by superposing
their distinct message signals. The desired message signal is decoded at the
receiver by applying successive interference cancellation (SIC). Using the
theory of Poisson cluster process (PCP), we provide a framework to analyze
multi-cell uplink NOMA systems. Specifically, we characterize the rate coverage
probability of a NOMA user who is at rank (m) (in terms of the distance from
its serving BS) among all users in a cell and the mean rate coverage
probability of all users in a cell. Since the signal-to-interference-plus-noise
ratio (SINR) of the (m)-th user relies on efficient SIC, we consider three
scenarios, i.e., perfect SIC (in which the signals of the (m-1) interferers
who are stronger than the (m)-th user are decoded successfully), imperfect SIC
(in which the signals of the (m-1) interferers who are stronger than the
(m)-th user may or may not be decoded successfully), and imperfect worst-case
SIC (in which the decoding of the signal of the (m)-th user is always
unsuccessful whenever the decoding of its (m-1) stronger users is
unsuccessful). The derived
expressions are customized to capture the performance of a user at rank (m) in
an equivalent orthogonal multiple access (OMA) system. Finally, numerical
results are presented to validate the derived expressions.
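As a toy illustration of the worst-case SIC scenario described above (the paper's analysis is analytical, built on the Poisson cluster process, rather than a per-link computation like this one), the sketch below decodes users in descending received-power order and stops at the first failure; the function name and fixed SINR threshold are our simplifications.

    import numpy as np

    def sic_decode(p_rx, noise, threshold):
        # p_rx: received powers of the users sharing the resource block.
        # Users are decoded strongest-first; each successfully decoded
        # signal is subtracted from the residual interference.
        order = np.argsort(p_rx)[::-1]             # strongest first
        residual = p_rx.sum() + noise
        decoded = np.zeros(len(p_rx), dtype=bool)
        for u in order:
            sinr = p_rx[u] / (residual - p_rx[u])  # other users + noise
            if sinr < threshold:                   # worst-case SIC: one
                break                              # failure blocks the rest
            decoded[u] = True
            residual -= p_rx[u]                    # cancel decoded signal
        return decoded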
Mariia Sorokina, Stylianos Sygletos, Sergei Turitsyn
Subjects: Information Theory (cs.IT); Mathematical Physics (math-ph)
Since Shannon proved that the Gaussian distribution is optimal for a linear
channel with additive white Gaussian noise and calculated the corresponding
channel capacity, it has remained the most widely applied distribution in
optical communications, while the capacity result is celebrated as the seminal
linear Shannon limit. Yet, when the Gaussian distribution is applied to
nonlinear channels (e.g., fiber optics), it has been shown to be non-optimal,
yielding the same result as uncoded transmission in the highly nonlinear
regime. This has led to the notion of a nonlinear Shannon limit, which
predicts vanishing capacity at high nonlinearity. However, recent findings
indicate that non-Gaussian distributions may lead to improved capacity
estimates, prompting a search for novel methods in nonlinear optical
communications. Here, for the first time, we show that it is possible to
transmit information above the existing limits by using a novel probabilistic
shaping of the input signal, which we call the ripple distribution.
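For reference, the linear Shannon limit invoked above is the capacity of the additive white Gaussian noise channel (quoted here only as standard background):

    C = \log_2(1 + \mathrm{SNR}) \quad \text{bits per complex symbol},

achieved by a Gaussian input. Capacity estimates for the nonlinear fiber channel computed with the same Gaussian input peak and then decay as the launch power grows, which is the vanishing-capacity behavior that the proposed ripple distribution is designed to overcome.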
Hye Won Chung, Saikat Guha, Lizhong Zheng
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
We study the problem of designing optical receivers to discriminate between
multiple coherent states using coherent processing receivers, i.e., receivers
that use arbitrary coherent feedback control and quantum-noise-limited direct
detection, a class shown by Dolinar to achieve the minimum error probability
in discriminating any two coherent states. We first derive and re-interpret
Dolinar’s binary-hypothesis minimum-probability-of-error receiver as the one
that optimizes the information efficiency at each time instant, based on
recursive Bayesian updates within the receiver. Using this viewpoint, we
propose a natural generalization of Dolinar’s receiver design to discriminate
(M) coherent states each of which could now be a codeword, i.e., a sequence of
(n) coherent states each drawn from a modulation alphabet. We analyze the
channel capacity of the pure-loss optical channel with a general coherent
processing receiver in the low-photon number regime and compare it with the
capacity achievable with direct detection and the Holevo limit (achieving the
latter would require a quantum joint-detection receiver). We show compelling
evidence that, despite the optimal performance of Dolinar's receiver for the
binary coherent-state hypothesis test (whether measured by error probability
or by mutual information), the asymptotic communication rate achievable by
such a coherent processing receiver is only as good as that of direct
detection. This suggests that in
the infinitely-long codeword limit, all potential benefits of coherent
processing at the receiver can be obtained by designing a good code and direct
detection, with no feedback within the receiver.
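For the binary case that anchors this discussion, both the minimum error probability achieved by Dolinar's receiver (the Helstrom bound) and the error probability of the displace-and-count Kennedy receiver have simple closed forms for the BPSK coherent states (|\pm\alpha\rangle). The sketch below evaluates these standard expressions; it is background only, not the paper's (M)-ary generalization.

    import numpy as np

    def helstrom_bpsk(alpha):
        # Minimum error probability for |+a> vs |-a> with equal priors,
        # achieved by the Dolinar receiver; |<a|-a>|^2 = exp(-4|a|^2).
        overlap_sq = np.exp(-4.0 * np.abs(alpha) ** 2)
        return 0.5 * (1.0 - np.sqrt(1.0 - overlap_sq))

    def kennedy_bpsk(alpha):
        # Displace |+/-a> to |0>/|2a>, then photon-count: an error occurs
        # only when |2a> yields no click, with probability exp(-4|a|^2).
        return 0.5 * np.exp(-4.0 * np.abs(alpha) ** 2)

    for a in (0.2, 0.5, 1.0):
        print(a, helstrom_bpsk(a), kennedy_bpsk(a))

For weak overlap the two expressions share the same error exponent and differ only by a factor of two in the prefactor, which is the binary advantage that, per the result above, washes out at the level of asymptotic communication rate.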
Ilija Bogunovic, Jonathan Scarlett, Andreas Krause, Volkan Cevher
Comments: Accepted to NIPS 2016
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Learning (cs.LG)
We present a new algorithm, truncated variance reduction (TruVaR), that
treats Bayesian optimization (BO) and level-set estimation (LSE) with Gaussian
processes in a unified fashion. The algorithm greedily shrinks a sum of
truncated variances within a set of potential maximizers (BO) or unclassified
points (LSE), which is updated based on confidence bounds. TruVaR is effective
in several important settings that are typically non-trivial to incorporate
into myopic algorithms, including pointwise costs and heteroscedastic noise. We
provide a general theoretical guarantee for TruVaR covering these aspects, and
use it to recover and strengthen existing results on BO and LSE. Moreover, we
provide a new result for a setting where one can select from a number of noise
levels having associated costs. We demonstrate the effectiveness of the
algorithm on both synthetic and real-world data sets.
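The structural fact the greedy rule exploits is that a Gaussian process posterior variance depends only on where one samples, not on the observed values, so the effect of a prospective measurement on the truncated-variance objective can be computed before sampling. Below is a minimal sketch of that step under an RBF kernel; the fixed truncation level eta and the function names are our simplifications of the paper's confidence-bound-driven set updates.

    import numpy as np

    def rbf(A, B, ls=0.5):
        # Squared-exponential kernel between row-stacked points.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / ls ** 2)

    def posterior_var(Xq, Xs, noise=1e-2):
        # GP posterior variance at Xq given sampling locations Xs;
        # note that it does not depend on the observed values.
        if len(Xs) == 0:
            return np.ones(len(Xq))
        Kss = rbf(Xs, Xs) + noise * np.eye(len(Xs))
        Kqs = rbf(Xq, Xs)
        return 1.0 - np.einsum('ij,ji->i', Kqs, np.linalg.solve(Kss, Kqs.T))

    def truvar_pick(cand, pool, sampled, eta=0.05, noise=1e-2):
        # Greedily choose the pool point whose (hypothetical) observation
        # most shrinks the sum of variances truncated at eta over cand.
        best, best_obj = None, np.inf
        for i, x in enumerate(pool):
            Xs = np.vstack([sampled, x[None, :]]) if len(sampled) else x[None, :]
            obj = np.maximum(posterior_var(cand, Xs, noise), eta).sum()
            if obj < best_obj:
                best, best_obj = i, obj
        return best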
Samet Oymak, Mahdi Soltanolkotabi
Comments: 23 pages, 4 figures
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Optimization and Control (math.OC)
In this paper we study the problem of recovering a structured but unknown
parameter (\mathbf{\theta}^*) from (n) nonlinear observations of the form
(y_i = f(\langle \mathbf{x}_i, \mathbf{\theta}^* \rangle)) for
(i = 1, 2, \ldots, n). We
develop a framework for characterizing time-data tradeoffs for a variety of
parameter estimation algorithms when the nonlinear function (f) is unknown.
This framework includes many popular heuristics such as projected/proximal
gradient descent and stochastic schemes. For example, we show that a projected
gradient descent scheme converges at a linear rate to a reliable solution with
a near minimal number of samples. We provide a sharp characterization of the
convergence rate of such algorithms as a function of the sample size, the
amount of a priori knowledge available about the parameter, and a measure of
the nonlinearity of the function (f). These results provide a precise
understanding of the various tradeoffs between statistical and computational
resources, as well as the a priori side information available, for such
nonlinear parameter estimation problems.
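To illustrate the kind of heuristic the framework covers, the sketch below runs projected gradient descent on a least-squares surrogate with a hard-thresholding projection onto (s)-sparse vectors, ignoring the unknown link function (f) entirely; for Gaussian designs such iterations are known to recover the parameter up to an unknown scale. The names and the step-size choice are ours, not the paper's.

    import numpy as np

    def hard_threshold(v, s):
        # Projection onto the set of s-sparse vectors.
        out = np.zeros_like(v)
        idx = np.argpartition(np.abs(v), -s)[-s:]
        out[idx] = v[idx]
        return out

    def projected_gd(X, y, s, iters=200):
        # Minimize 0.5 * ||y - X t||^2 over s-sparse t by projected
        # gradient descent, pretending the observations were linear.
        n, p = X.shape
        step = 1.0 / np.linalg.norm(X, 2) ** 2     # conservative step size
        t = np.zeros(p)
        for _ in range(iters):
            grad = X.T @ (X @ t - y)
            t = hard_threshold(t - step * grad, s)
        return t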