Edwin Simonnet (LIUM), Sahar Ghannay (LIUM), Nathalie Camelin (LIUM), Yannick Estève (LIUM), Renato De Mori (LIA)
Comments: Interspeech 2017, Aug 2017, Stockholm, Sweden. 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
This paper addresses the problem of automatic speech recognition (ASR) error
detection and their use for improving spoken language understanding (SLU)
systems. In this study, the SLU task consists in automatically extracting, from
ASR transcriptions , semantic concepts and concept/values pairs in a e.g
touristic information system. An approach is proposed for enriching the set of
semantic labels with error specific labels and by using a recently proposed
neural approach based on word embeddings to compute well calibrated ASR
confidence measures. Experimental results are reported showing that it is
possible to decrease significantly the Concept/Value Error Rate with a state of
the art system, outperforming previously published results performance on the
same experimental data. It also shown that combining an SLU approach based on
conditional random fields with a neural encoder/decoder attention based
architecture , it is possible to effectively identifying confidence islands and
uncertain semantic output segments useful for deciding appropriate error
handling actions by the dialogue manager strategy .
Meysam Madadi, Sergio Escalera, Xavier Baro, Jordi Gonzalez
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Despite recent advances in 3D pose estimation of human hands, especially
thanks to the advent of CNNs and depth cameras, this task is still far from
being solved. This is mainly due to the highly non-linear dynamics of fingers,
which makes hand model training a challenging task. In this paper, we exploit a
novel hierarchical tree-like structured CNN, in which branches are trained to
become specialized in predefined subsets of hand joints, called local poses. We
further fuse local pose features, extracted from hierarchical CNN branches, to
learn higher order dependencies among joints in the final pose by end-to-end
training. Lastly, the loss function used is also defined to incorporate
appearance and physical constraints about doable hand motion and deformation.
Experimental results suggest that feeding a tree-shaped CNN, specialized in
local poses, into a fusion network for modeling joints correlations, helps to
increase the precision of final estimations, outperforming state-of-the-art
results on NYU and MSRA datasets.
Elena Burceanu, Marius Leordeanu
Comments: 9.5 pages of main content, 2.5 of bibliography, 2 pages of appendix, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Object tracking is an essential task in computer vision that has been studied
since the early days of the field. Being able to follow objects that undergo
different transformations in the video sequence, including changes in scale,
illumination, shape and occlusions, makes the problem extremely difficult. One
of the real challenges is to keep track of the changes in objects appearance
and not drift towards the background clutter. Different from previous
approaches, we obtain robustness against background with a tracker model that
is composed of many different parts. They are classifiers that respond at
different scales and locations. The tracker system functions as a society of
parts, each having its own role and level of credibility. Reliable classifiers
decide the tracker’s next move, while newcomers are first monitored before
gaining the necessary level of reliability to participate in the decision
process. Some parts that loose their consistency are rejected, while others
that show consistency for a sufficiently long time are promoted to permanent
roles. The tracker system, as a whole, could also go through different phases,
from the usual, normal functioning to states of weak agreement and even crisis.
The tracker system has different governing rules in each state. What truly
distinguishes our work from others is not necessarily the strength of
individual tracking parts, but the way in which they work together and build a
strong and robust organization. We also propose an efficient way to learn
simultaneously many tracking parts, with a single closed-form formulation. We
obtain a fast and robust tracker with state of the art performance on the
challenging OTB50 dataset.
Russell Bates, Benjamin Irving, Bostjan Markelc, Jakob Kaeppler, Ruth Muschel, Vicente Grau, Julia A. Schnabel
Comments: The article has been submitted to IEEE TMI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Vasculature is known to be of key biological significance, especially in the
study of cancer. As such, considerable effort has been focused on the automated
measurement and analysis of vasculature in medical and pre-clinical images. In
tumors in particular, the vascular networks may be extremely irregular and the
appearance of the individual vessels may not conform to classical descriptions
of vascular appearance. Typically, vessels are extracted by either a
segmentation and thinning pipeline, or by direct tracking. Neither of these
methods are well suited to microscopy images of tumor vasculature. In order to
address this we propose a method to directly extract a medial representation of
the vessels using Convolutional Neural Networks. We then show that these
two-dimensional centerlines can be meaningfully extended into 3D in anisotropic
and complex microscopy images using the recently popularized Convolutional Long
Short-Term Memory units (ConvLSTM). We demonstrate the effectiveness of this
hybrid convolutional-recurrent architecture over both 2D and 3D convolutional
comparators.
Jisoo Jeong, Hyojin Park, Nojun Kwak
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose an object detection method that improves the accuracy of the
conventional SSD (Single Shot Multibox Detector), which is one of the top
object detection algorithms in both aspects of accuracy and speed. The
performance of a deep network is known to be improved as the number of feature
maps increases. However, it is difficult to improve the performance by simply
raising the number of feature maps. In this paper, we propose and analyze how
to use feature maps effectively to improve the performance of the conventional
SSD. The enhanced performance was obtained by changing the structure close to
the classifier network, rather than growing layers close to the input data,
e.g., by replacing VGGNet with ResNet. The proposed network is suitable for
sharing the weights in the classifier networks, by which property, the training
can be faster with better generalization power. For the Pascal VOC 2007 test
set trained with VOC 2007 and VOC 2012 training sets, the proposed network with
the input size of 300 x 300 achieved 78.5% mAP (mean average precision) at the
speed of 35.0 FPS (frame per second), while the network with a 512 x 512 sized
input achieved 80.8% mAP at 16.6 FPS using Nvidia Titan X GPU. The proposed
network shows state-of-the-art mAP, which is better than those of the
conventional SSD, YOLO, Faster-RCNN and RFCN. Also, it is faster than
Faster-RCNN and RFCN.
Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard, Stefano Soatto
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Deep networks have recently been shown to be vulnerable to universal
perturbations: there exist very small image-agnostic perturbations that cause
most natural images to be misclassified by such classifiers. In this paper, we
propose the first quantitative analysis of the robustness of classifiers to
universal perturbations, and draw a formal link between the robustness to
universal perturbations, and the geometry of the decision boundary.
Specifically, we establish theoretical bounds on the robustness of classifiers
under two decision boundary models (flat and curved models). We show in
particular that the robustness of deep networks to universal perturbations is
driven by a key property of their curvature: there exists shared directions
along which the decision boundary of deep networks is systematically positively
curved. Under such conditions, we prove the existence of small universal
perturbations. Our analysis further provides a novel geometric method for
computing universal perturbations, in addition to explaining their properties.
Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Stefano Soatto
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
The goal of this paper is to analyze the geometric properties of deep neural
network classifiers in the input space. We specifically study the topology of
classification regions created by deep networks, as well as their associated
decision boundary. Through a systematic empirical investigation, we show that
state-of-the-art deep nets learn connected classification regions, and that the
decision boundary in the vicinity of datapoints is flat along most directions.
We further draw an essential connection between two seemingly unrelated
properties of deep networks: their sensitivity to additive perturbations in the
inputs, and the curvature of their decision boundary. The directions where the
decision boundary is curved in fact remarkably characterize the directions to
which the classifier is the most vulnerable. We finally leverage a fundamental
asymmetry in the curvature of the decision boundary of deep nets, and propose a
method to discriminate between original images, and images perturbed with small
adversarial examples. We show the effectiveness of this purely geometric
approach for detecting small adversarial perturbations in images, and for
recovering the labels of perturbed images.
Daiki Ikami, Toshihiko Yamasaki, Kiyoharu Aizawa
Comments: Accepted to CVPR2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose the residual expansion (RE) algorithm: a global (or near-global)
optimization method for nonconvex least squares problems. Unlike most existing
nonconvex optimization techniques, the RE algorithm is not based on either
stochastic or multi-point searches; therefore, it can achieve fast global
optimization. Moreover, the RE algorithm is easy to implement and successful in
high-dimensional optimization. The RE algorithm exhibits excellent empirical
performance in terms of k-means clustering, point-set registration, optimized
product quantization, and blind image deblurring.
Guang Yang, Xiahai Zhuang, Habib Khan, Shouvik Haldar, Eva Nyktari, Lei Li, Rick Wage, Xujiong Ye, Greg Slabaugh, Raad Mohiaddin, Tom Wong, Jennifer Keegan, David Firmin
Comments: 39 pages, 8 figure, 2 tables, submitted to MRM journal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Purpose: Atrial fibrillation (AF) is the most common cardiac arrhythmia and
is correlated with increased morbidity and mortality. It is associated with
atrial fibrosis, which may be assessed non-invasively using late
gadolinium-enhanced (LGE) magnetic resonance imaging (MRI) where scar tissue is
visualised as a region of signal enhancement. In this study, we proposed a
novel fully automatic pipeline to achieve an accurate and objective atrial
scarring segmentation and assessment of LGE MRI scans for the AF patients.
Methods: Our fully automatic pipeline uniquely combined: (1) a multi-atlas
based whole heart segmentation (MA-WHS) to determine the cardiac anatomy from
an MRI Roadmap acquisition which is then mapped to LGE MRI, and (2) a
super-pixel and supervised learning based approach to delineate the
distribution and extent of atrial scarring in LGE MRI. Results: Both our MA-WHS
and atrial scarring segmentation showed accurate delineations of cardiac
anatomy (mean Dice = 89%) and atrial scarring (mean Dice =79%) respectively
compared to the established ground truth from manual segmentation. Compared
with previously studied methods with manual interventions, our innovative
pipeline demonstrated comparable results, but was computed fully automatically.
Conclusion: The proposed segmentation methods allow LGE MRI to be used as an
objective assessment tool for localisation, visualisation and quantification of
atrial scarring.
Ruben Gomez-Ojeda, Francisco-Angel Moreno, Davide Scaramuzza, Javier Gonzalez-Jimenez
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Traditional approaches to stereo visual SLAM rely on point features to
estimate the camera trajectory and build a map of the environment. In
low-textured environments, though, it is often difficult to find a sufficient
number of reliable point features and, as a consequence, the performance of
such algorithms degrades. This paper proposes PL-SLAM, a stereo visual SLAM
system that combines both points and line segments to work robustly in a wider
variety of scenarios, particularly in those where point features are scarce or
not well-distributed in the image. PL-SLAM leverages both points and segments
at all the instances of the process: visual odometry, keyframe selection,
bundle adjustment, etc. We contribute also with a loop closure procedure
through a novel bag-of-words approach that exploits the combined descriptive
power of the two kinds of features. Additionally, the resulting map is richer
and more diverse in 3D elements, which can be exploited to infer valuable,
high-level scene structures like planes, empty spaces, ground plane, etc. (not
addressed in this work). Our proposal has been tested with several popular
datasets (such as KITTI and EuRoC), and is compared to state of the art methods
like ORB-SLAM, revealing superior performance in most of the experiments, while
still running in real-time. An open source version of the PL-SLAM C++ code will
be released for the benefit of the community.
Yanan Li, Donghui Wang
Comments: This work was completed in Oct, 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Zero-shot learning, which studies the problem of object classification for
categories for which we have no training examples, is gaining increasing
attention from community. Most existing ZSL methods exploit deterministic
transfer learning via an in-between semantic embedding space. In this paper, we
try to attack this problem from a generative probabilistic modelling
perspective. We assume for any category, the observed representation, e.g.
images or texts, is developed from a unique prototype in a latent space, in
which the semantic relationship among prototypes is encoded via linear
reconstruction. Taking advantage of this assumption, virtual instances of
unseen classes can be generated from the corresponding prototype, giving rise
to a novel ZSL model which can alleviate the domain shift problem existing in
the way of direct transfer learning. Extensive experiments on three benchmark
datasets show our proposed model can achieve state-of-the-art results.
Yichao Yan, Bingbing Ni, Xiaokang Yang
Comments: To appear in IJCAI 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Predicting human interaction is challenging as the on-going activity has to
be inferred based on a partially observed video. Essentially, a good algorithm
should effectively model the mutual influence between the two interacting
subjects. Also, only a small region in the scene is discriminative for
identifying the on-going interaction. In this work, we propose a relative
attention model to explicitly address these difficulties. Built on a
tri-coupled deep recurrent structure representing both interacting subjects and
global interaction status, the proposed network collects spatio-temporal
information from each subject, rectified with global interaction information,
yielding effective interaction representation. Moreover, the proposed network
also unifies an attention module to assign higher importance to the regions
which are relevant to the on-going action. Extensive experiments have been
conducted on two public datasets, and the results demonstrate that the proposed
relative attention network successfully predicts informative regions between
interacting subjects, which in turn yields superior human interaction
prediction accuracy.
Y Qian, P Giaccone, M Sasdelli, E Vasquez, B Sengupta
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper we detail Cortexica’s (this https URL)
recommendation framework — particularly, we describe how a hybrid visual
recommender system can be created by combining conditional random fields for
segmentation and deep neural networks for object localisation and feature
representation. The recommendation system that is built after localisation,
segmentation and classification has two properties — first, it is knowledge
based in the sense that it learns pairwise preference/occurrence matrix by
utilising knowledge from experts (images from fashion blogs) and second, it is
content-based as it utilises a deep learning based framework for learning
feature representation. Such a construct is especially useful when there is a
scarcity of user preference data, that forms the foundation of many
collaborative recommendation algorithms.
Ruwan Tennakoon, Alireza Sadri, Reza Hoseinnezhad, Alireza Bab-Hadiashar
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Identifying the underlying models in a set of data points contaminated by
noise and outliers, leads to a highly complex multi-model fitting problem. This
problem can be posed as a clustering problem by the projection of higher order
affinities between data points into a graph, which can then be clustered using
spectral clustering. Calculating all possible higher order affinities is
computationally expensive. Hence in most cases only a subset is used. In this
paper, we propose an effective sampling method to obtain a highly accurate
approximation of the full graph required to solve multi-structural model
fitting problems in computer vision. The proposed method is based on the
observation that the usefulness of a graph for segmentation improves as the
distribution of hypotheses (used to build the graph) approaches the
distribution of actual parameters for the given data. In this paper, we
approximate this actual parameter distribution using a k-th order statistics
based cost function and the samples are generated using a greedy algorithm
coupled with a data sub-sampling strategy. The experimental analysis shows that
the proposed method is both accurate and computationally efficient compared to
the state-of-the-art robust multi-model fitting techniques. The code is
publicly available from this https URL
Kingsley Kuan, Mathieu Ravaut, Gaurav Manek, Huiling Chen, Jie Lin, Babar Nazir, Cen Chen, Tse Chiang Howe, Zeng Zeng, Vijay Chandrasekhar
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a deep learning framework for computer-aided lung cancer
diagnosis. Our multi-stage framework detects nodules in 3D lung CAT scans,
determines if each nodule is malignant, and finally assigns a cancer
probability based on these results. We discuss the challenges and advantages of
our framework. In the Kaggle Data Science Bowl 2017, our framework ranked 41st
out of 1972 teams.
Yao Qin, Mengyang Feng, Huchuan Lu, Garrison W. Cottrell
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Saliency detection, finding the most important parts of an image, has become
increasingly popular in computer vision. In this paper, we introduce
Hierarchical Cellular Automata (HCA) — a temporally evolving model to
intelligently detect salient objects. HCA consists of two main components:
Single-layer Cellular Automata (SCA) and Cuboid Cellular Automata (CCA). As an
unsupervised propagation mechanism, Single-layer Cellular Automata can exploit
the intrinsic relevance of similar regions through interactions with neighbors.
Low-level image features as well as high-level semantic information extracted
from deep neural networks are incorporated into the SCA to measure the
correlation between different image patches. With these hierarchical deep
features, an impact factor matrix and a coherence matrix are constructed to
balance the influences on each cell’s next state. The saliency values of all
cells are iteratively updated according to a well-defined update rule.
Furthermore, we propose CCA to integrate multiple saliency maps generated by
SCA at different scales in a Bayesian framework. Therefore, single-layer
propagation and multi-layer integration are jointly modeled in our unified HCA.
Surprisingly, we find that the SCA can improve all existing methods that we
applied it to, resulting in a similar precision level regardless of the
original results. The CCA can act as an efficient pixel-wise aggregation
algorithm that can integrate state-of-the-art methods, resulting in even better
results. Extensive experiments on four challenging datasets demonstrate that
the proposed algorithm outperforms state-of-the-art conventional methods and is
competitive with deep learning based approaches.
Amirsina Torfi, Nasser M. Nasrabadi, Jeremy Dawson
Comments: submitted to 5th IEEE Global Conference on Signal and Information Processing(2017)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, a 3D Convolutional Neural Network (3D-CNN) architecture has
been utilized for text-independent speaker verification. At the development
phase, a CNN is trained to classify speakers at the utterance-level. In the
enrollment stage, the trained network is utilized to directly create a speaker
model for each speaker based on the extracted features. Finally, in the
evaluation phase, the extracted features from the test utterance will be
compared to the stored speaker model to verify the claimed identity. One of the
main challenges is the creation of the speaker models. Previously-reported
approaches create speaker models based on averaging the extracted features from
utterances of the speaker, which is known as a d-vector system. In our paper,
we propose to use the 3D-CNNs for direct speaker model creation in which, for
both development and enrollment phases, an identical number of speaker
utterances is fed to the network for representing the speaker utterances and
creation of the speaker model. This leads to simultaneously capturing the
speaker-related information and building a more robust system to cope with
within-speaker variation. We demonstrate that the proposed method significantly
outperforms the d-vector verification system.
Vincent Christlein, Martin Gropp, Stefan Fiel, Andreas Maier
Comments: Submitted to ICDAR2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Deep Convolutional Neural Networks (CNN) have shown great success in
supervised classification tasks such as character classification or dating.
Deep learning methods typically need a lot of annotated training data, which is
not available in many scenarios. In these cases, traditional methods are often
better than or equivalent to deep learning methods. In this paper, we propose a
simple, yet effective, way to learn CNN activation features in an unsupervised
manner. Therefore, we train a deep residual network using surrogate classes.
The surrogate classes are created by clustering the training dataset, where
each cluster index represents one surrogate class. The activations from the
penultimate CNN layer serve as features for subsequent classification tasks. We
evaluate the feature representations on two publicly available datasets. The
focus lies on the ICDAR17 competition dataset on historical document writer
identification (Historical-WI). We show that the activation features we trained
without supervision are superior to descriptors of state-of-the-art writer
identification methods. Additionally, we achieve comparable results in the case
of handwriting classification using the ICFHR16 competition dataset on
historical Latin script types (CLaMM16).
Liqian Ma, Xu Jia, Qianru Sun, Bernt Schiele, Tinne Tuytelaars, Luc Van Gool
Comments: Xu Jia and Qianru Sun contribute equally to this work
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper proposes the novel Pose Guided Person Generation Network (PG(^2))
that allows to synthesize person images in arbitrary poses, based on an image
of that person and a novel pose. Our generation framework PG(^2) utilizes the
pose information explicitly and consists of two key stages: pose integration
and image refinement. In the first stage the condition image and the target
pose are fed into a U-Net-like network to generate an initial but coarse image
of the person with the target pose. The second stage then refines the initial
and blurry result by training a U-Net-like generator in an adversarial way.
Extensive experimental results on both 128( imes)64 re-identification images
and 256( imes)256 fashion photos show that our model generates high-quality
person images with convincing details.
Benjamin Hepp, Matthias Nießner, Otmar Hilliges
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We introduce a new method that efficiently computes a set of rich viewpoints
and trajectories for high-quality 3D reconstructions in outdoor environments.
The input images of the reconstruction are taken with a commodity RGB camera
which is mounted on an autonomously navigated quadcopter, and the obtained
recordings are fed into a multi-view stereo reconstruction pipeline that
produces high-quality results but is computationally expensive. Our goal is to
automatically explore an unknown area, and obtain a complete 3D scan of a
region of interest (e.g., a large building). In this process, the scan is
constraint by the restricted flight time of quadcopters and the heavy compute
costs of the subsequent 3D reconstruction — i.e., only a small number of
images can be recorded and processed. To this end, we introduce a novel
optimization strategy that respects these constraints by maximizing the
information gain from sparsely-sampled view points while limiting the total
number of captured images. The core of this strategy is based on the concept of
tri-state space classification, which is common in volumetric fusion
approaches, and includes labels for unknown, free, and occupied space. Our
optimization leverages a hierarchical and sparse volumetric data structure that
takes advantage of the implicit representation, where its main objective is to
convert unknown space into known regions. In addition to the surface geometry,
we utilize the free-space information to avoid obstacles and determine feasible
flight paths. A simple tool can be used to specify the region of interest and
to plan trajectories. We demonstrate our method by obtaining a number of
compelling 3D reconstructions, and provide a thorough quantitative evaluation
for our optimization strategy.
Wufeng Xue, Ali Islam, Mousumi Bhaduri, Shuo Li
Comments: accepted by IEEE Transactions on Medical Imaging
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cardiac indices estimation is of great importance during identification and
diagnosis of cardiac disease in clinical routine. However, estimation of
multitype cardiac indices with consistently reliable and high accuracy is still
a great challenge due to the high variability of cardiac structures and
complexity of temporal dynamics in cardiac MR sequences. While efforts have
been devoted into cardiac volumes estimation through feature engineering
followed by a independent regression model, these methods suffer from the
vulnerable feature representation and incompatible regression model. In this
paper, we propose a semi-automated method for multitype cardiac indices
estimation. After manual labelling of two landmarks for ROI cropping, an
integrated deep neural network Indices-Net is designed to jointly learn the
representation and regression models. It comprises two tightly-coupled
networks: a deep convolution autoencoder (DCAE) for cardiac image
representation, and a multiple output convolution neural network (CNN) for
indices regression. Joint learning of the two networks effectively enhances the
expressiveness of image representation with respect to cardiac indices, and the
compatibility between image representation and indices regression, thus leading
to accurate and reliable estimations for all the cardiac indices.
When applied with five-fold cross validation on MR images of 145 subjects,
Indices-Net achieves consistently low estimation error for LV wall thicknesses
(1.44(pm)0.71mm) and areas of cavity and myocardium (204(pm)133mm(^2)). It
outperforms, with significant error reductions, segmentation method (55.1% and
17.4%) and two-phase direct volume-only methods (12.7% and 14.6%) for wall
thicknesses and areas, respectively. These advantages endow the proposed method
a great potential in clinical cardiac function assessment.
Yunus Saatchi, Andrew Gordon Wilson
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Generative adversarial networks (GANs) can implicitly learn rich
distributions over images, audio, and data which are hard to model with an
explicit likelihood. We present a practical Bayesian formulation for
unsupervised and semi-supervised learning with GANs, in conjunction with
stochastic gradient Hamiltonian Monte Carlo to marginalize the weights of the
generator and discriminator networks. The resulting approach is straightforward
and obtains good performance without any standard interventions such as feature
matching, or mini-batch discrimination. By exploring an expressive posterior
over the parameters of the generator, the Bayesian GAN avoids mode-collapse,
produces interpretable candidate samples with notable variability, and in
particular provides state-of-the-art quantitative results for semi-supervised
learning on benchmarks including SVHN, CelebA, and CIFAR-10, outperforming
DCGAN, Wasserstein GANs, and DCGAN ensembles.
Yanan Li, Donghui Wang
Comments: This work was completed in Feb, 2015
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Automatically learning features, especially robust features, has attracted
much attention in the machine learning community. In this paper, we propose a
new method to learn non-linear robust features by taking advantage of the data
manifold structure. We first follow the commonly used trick of the trade, that
is learning robust features with artificially corrupted data, which are
training samples with manually injected noise. Following the idea of the
auto-encoder, we first assume features should contain much information to well
reconstruct the input from its corrupted copies. However, merely reconstructing
clean input from its noisy copies could make data manifold in the feature space
noisy. To address this problem, we propose a new method, called Incremental
Auto-Encoders, to iteratively denoise the extracted features. We assume the
noisy manifold structure is caused by a diffusion process. Consequently, we
reverse this specific diffusion process to further contract this noisy
manifold, which results in an incremental optimization of model parameters .
Furthermore, we show these learned non-linear features can be stacked into a
hierarchy of features. Experimental results on real-world datasets demonstrate
the proposed method can achieve better classification performances.
B Ravi Kiran, Senthil Yogamani
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV)
Background-Foreground classification is a fundamental well-studied problem in
computer vision. Due to the pixel-wise nature of modeling and processing in the
algorithm, it is usually difficult to satisfy real-time constraints. There is a
trade-off between the speed (because of model complexity) and accuracy.
Inspired by the rejection cascade of Viola-Jones classifier, we decompose the
Gaussian Mixture Model (GMM) into an adaptive cascade of classifiers. This way
we achieve a good improvement in speed without compensating for accuracy. In
the training phase, we learn multiple KDEs for different durations to be used
as strong prior distribution and detect probable oscillating pixels which
usually results in misclassifications. We propose a confidence measure for the
classifier based on temporal consistency and the prior distribution. The
confidence measure thus derived is used to adapt the learning rate and the
thresholds of the model, to improve accuracy. The confidence measure is also
employed to perform temporal and spatial sampling in a principled way. We
demonstrate a speed-up factor of 5x to 10x and 17 percent average improvement
in accuracy over several standard videos.
Fred Glover, Mark Lewis, Gary Kochenberger
Comments: 30 pages + 6 pages of Appendices
Subjects: Artificial Intelligence (cs.AI)
The quadratic unconstrained binary optimization (QUBO) problem arises in
diverse optimization applications ranging from Ising spin problems to classical
problems in graph theory and binary discrete optimization. The use of
preprocessing to transform the graph representing the QUBO problem into a
smaller equivalent graph is important for improving solution quality and time
for both exact and metaheuristic algorithms and is a step towards mapping large
scale QUBO to hardware graphs used in quantum annealing computers. In an
earlier paper (Lewis and Glover, 2016) a set of rules was introduced that
achieved significant QUBO reductions as verified through computational testing.
Here this work is extended with additional rules that provide further
reductions that succeed in exactly solving 10% of the benchmark QUBO problems.
An algorithm and associated data structures to efficiently implement the entire
set of rules is detailed and computational experiments are reported that
demonstrate their efficacy.
Kosetsu Tsukuda, Masataka Goto
Comments: Accepted by The 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017)
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)
Online music services are increasing in popularity. They enable us to analyze
people’s music listening behavior based on play logs. Although it is known that
people listen to music based on topic (e.g., rock or jazz), we assume that when
a user is addicted to an artist, s/he chooses the artist’s songs regardless of
topic. Based on this assumption, in this paper, we propose a probabilistic
model to analyze people’s music listening behavior. Our main contributions are
three-fold. First, to the best of our knowledge, this is the first study
modeling music listening behavior by taking into account the influence of
addiction to artists. Second, by using real-world datasets of play logs, we
showed the effectiveness of our proposed model. Third, we carried out
qualitative experiments and showed that taking addiction into account enables
us to analyze music listening behavior from a new viewpoint in terms of how
people listen to music according to the time of day, how an artist’s songs are
listened to by people, etc. We also discuss the possibility of applying the
analysis results to applications such as artist similarity computation and song
recommendation.
Pavel Naumov, Jia Tao
Comments: An extended abstract of this paper will appear in Proceedings of 16th conference on Theoretical Aspects of Rationality and Knowledge (TARK-17), Liverpool, United Kingdom, July 24-26, 2017
Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Logic in Computer Science (cs.LO); Multiagent Systems (cs.MA)
The existence of a coalition strategy to achieve a goal does not necessarily
mean that the coalition has enough information to know how to follow the
strategy. Neither does it mean that the coalition knows that such a strategy
exists. The article studies an interplay between the distributed knowledge,
coalition strategies, and coalition “know-how” strategies. The main technical
result is a sound and complete trimodal logical system that describes the
properties of this interplay.
Xiaoran Liu, Qin Lin, Sicco Verwer, Dmitri Jarnikov
Comments: This paper has been accepted by the Thirty-Second Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) Workshop on Learning and Automata (LearnAut)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Formal Languages and Automata Theory (cs.FL); Logic in Computer Science (cs.LO)
This paper focuses on detecting anomalies in a digital video broadcasting
(DVB) system from providers’ perspective. We learn a probabilistic
deterministic real timed automaton profiling benign behavior of encryption
control in the DVB control access system. This profile is used as a one-class
classifier. Anomalous items in a testing sequence are detected when the
sequence is not accepted by the learned model.
AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash, Kun Zhang
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
We study causal inference in a multi-environment setting, in which the
functional relations for producing the variables from their direct causes
remain the same across environments, while the distribution of exogenous noises
may vary. We introduce the idea of using the invariance of the functional
relations of the variables to their causes across a set of environments. We
define a notion of completeness for a causal inference algorithm in this
setting and prove the existence of such algorithm by proposing the baseline
algorithm. Additionally, we present an alternate algorithm that has
significantly improved computational and sample complexity compared to the
baseline algorithm. The experiment results show that the proposed algorithm
outperforms the other existing algorithms.
Agostino Capponi, Reza Ghanadan, Matt Stern
Comments: 15 pages, 10 figures
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI)
Autonomous systems can substantially enhance a human’s efficiency and
effectiveness in complex environments. Machines, however, are often unable to
observe the preferences of the humans that they serve. Despite the fact that
the human’s and machine’s objectives are aligned, asymmetric information, along
with heterogeneous sensitivities to risk by the human and machine, make their
joint optimization process a game with strategic interactions. We propose a
framework based on risk-sensitive dynamic games; the human seeks to optimize
her risk-sensitive criterion according to her true preferences, while the
machine seeks to adaptively learn the human’s preferences and at the same time
provide a good service to the human. We develop a class of performance measures
for the proposed framework based on the concept of regret. We then evaluate
their dependence on the risk-sensitivity and the degree of uncertainty. We
present applications of our framework to self-driving taxis, and robo-financial
advising.
Yunus Saatchi, Andrew Gordon Wilson
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Generative adversarial networks (GANs) can implicitly learn rich
distributions over images, audio, and data which are hard to model with an
explicit likelihood. We present a practical Bayesian formulation for
unsupervised and semi-supervised learning with GANs, in conjunction with
stochastic gradient Hamiltonian Monte Carlo to marginalize the weights of the
generator and discriminator networks. The resulting approach is straightforward
and obtains good performance without any standard interventions such as feature
matching, or mini-batch discrimination. By exploring an expressive posterior
over the parameters of the generator, the Bayesian GAN avoids mode-collapse,
produces interpretable candidate samples with notable variability, and in
particular provides state-of-the-art quantitative results for semi-supervised
learning on benchmarks including SVHN, CelebA, and CIFAR-10, outperforming
DCGAN, Wasserstein GANs, and DCGAN ensembles.
Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard, Stefano Soatto
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Deep networks have recently been shown to be vulnerable to universal
perturbations: there exist very small image-agnostic perturbations that cause
most natural images to be misclassified by such classifiers. In this paper, we
propose the first quantitative analysis of the robustness of classifiers to
universal perturbations, and draw a formal link between the robustness to
universal perturbations, and the geometry of the decision boundary.
Specifically, we establish theoretical bounds on the robustness of classifiers
under two decision boundary models (flat and curved models). We show in
particular that the robustness of deep networks to universal perturbations is
driven by a key property of their curvature: there exists shared directions
along which the decision boundary of deep networks is systematically positively
curved. Under such conditions, we prove the existence of small universal
perturbations. Our analysis further provides a novel geometric method for
computing universal perturbations, in addition to explaining their properties.
Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Stefano Soatto
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
The goal of this paper is to analyze the geometric properties of deep neural
network classifiers in the input space. We specifically study the topology of
classification regions created by deep networks, as well as their associated
decision boundary. Through a systematic empirical investigation, we show that
state-of-the-art deep nets learn connected classification regions, and that the
decision boundary in the vicinity of datapoints is flat along most directions.
We further draw an essential connection between two seemingly unrelated
properties of deep networks: their sensitivity to additive perturbations in the
inputs, and the curvature of their decision boundary. The directions where the
decision boundary is curved in fact remarkably characterize the directions to
which the classifier is the most vulnerable. We finally leverage a fundamental
asymmetry in the curvature of the decision boundary of deep nets, and propose a
method to discriminate between original images, and images perturbed with small
adversarial examples. We show the effectiveness of this purely geometric
approach for detecting small adversarial perturbations in images, and for
recovering the labels of perturbed images.
Edwin Simonnet (LIUM), Sahar Ghannay (LIUM), Nathalie Camelin (LIUM), Yannick Estève (LIUM), Renato De Mori (LIA)
Comments: Interspeech 2017, Aug 2017, Stockholm, Sweden. 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
This paper addresses the problem of automatic speech recognition (ASR) error
detection and their use for improving spoken language understanding (SLU)
systems. In this study, the SLU task consists in automatically extracting, from
ASR transcriptions , semantic concepts and concept/values pairs in a e.g
touristic information system. An approach is proposed for enriching the set of
semantic labels with error specific labels and by using a recently proposed
neural approach based on word embeddings to compute well calibrated ASR
confidence measures. Experimental results are reported showing that it is
possible to decrease significantly the Concept/Value Error Rate with a state of
the art system, outperforming previously published results performance on the
same experimental data. It also shown that combining an SLU approach based on
conditional random fields with a neural encoder/decoder attention based
architecture , it is possible to effectively identifying confidence islands and
uncertain semantic output segments useful for deciding appropriate error
handling actions by the dialogue manager strategy .
Daksh Varshneya, G. Srinivasaraghavan
Comments: 10 pages, 5 figures, Submitted to 31st Conference on Neural Information Processing Systems (NIPS 2017)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
Trajectory Prediction of dynamic objects is a widely studied topic in the
field of artificial intelligence. Thanks to a large number of applications like
predicting abnormal events, navigation system for the blind, etc. there have
been many approaches to attempt learning patterns of motion directly from data
using a wide variety of techniques ranging from hand-crafted features to
sophisticated deep learning models for unsupervised feature learning. All these
approaches have been limited by problems like inefficient features in the case
of hand crafted features, large error propagation across the predicted
trajectory and no information of static artefacts around the dynamic moving
objects. We propose an end to end deep learning model to learn the motion
patterns of humans using different navigational modes directly from data using
the much popular sequence to sequence model coupled with a soft attention
mechanism. We also propose a novel approach to model the static artefacts in a
scene and using these to predict the dynamic trajectories. The proposed method,
tested on trajectories of pedestrians, consistently outperforms previously
proposed state of the art approaches on a variety of large scale data sets. We
also show how our architecture can be naturally extended to handle multiple
modes of movement (say pedestrians, skaters, bikers and buses) simultaneously.
Panagiotis Mandros, Mario Boley, Jilles Vreeken
Comments: Accepted to the Proceedings of ACM SIGKDD conference, Halifax, Canada, August 2017
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Theory (cs.IT)
Given a database and a target attribute of interest, how can we tell whether
there exists a functional, or approximately functional dependence of the target
on any set of other attributes in the data? How can we reliably, without bias
to sample size or dimensionality, measure the strength of such a dependence?
And, how can we efficiently discover the optimal or (alpha)-approximate
top-(k) dependencies? These are exactly the questions we answer in this paper.
As we want to be agnostic on the form of the dependence, we adopt an
information-theoretic approach, and construct a reliable, bias correcting score
that can be efficiently computed. Moreover, we give an effective optimistic
estimator of this score, by which for the first time we can mine the
approximate functional dependencies from data with guarantees of optimality.
Empirical evaluation shows that the derived score achieves a good bias for
variance trade-off, can be used within an efficient discovery algorithm, and
indeed discovers meaningful dependencies. Most important, it remains reliable
in the face of data sparsity.
Vahan Huroyan, Gilad Lerman
Subjects: Numerical Analysis (math.NA); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (cs.NA)
We study Robust Subspace Recovery (RSR) in distributed settings. We consider
a huge data set in an ad hoc network without a central processor, where each
node has access only to one chunk of the data set. We assume that part of the
whole data set lies around a low-dimensional subspace and the other part is
composed of outliers that lie away from that subspace. The goal is to recover
the underlying subspace for the whole data set, without transferring the data
itself between the nodes. We apply the Consensus Based Gradient method for the
Geometric Median Subspace algorithm for RSR. We propose an iterative solution
for the local dual minimization problem and establish its (r)-linear
convergence. We show that this mathematical framework also extends to two
simpler problems: Principal Component Analysis and the geometric median. We
also explain how to distributedly implement the Reaper and Fast Median Subspace
algorithms for RSR. We demonstrate the competitive performance of our
algorithms for both synthetic and real data.
Niek Tax, Emin Alasgarov, Natalia Sidorova, Wil M.P. van der Aalst, Reinder Haakma
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB)
Process mining is a research field focused on the analysis of event data with
the aim of extracting insights related to dynamic behavior. Applying process
mining techniques on data from smart home environments has the potential to
provide valuable insights in (un)healthy habits and to contribute to ambient
assisted living solutions. Finding the right event labels to enable the
application of process mining techniques is however far from trivial, as simply
using the triggering sensor as the label for sensor events results in
uninformative models that allow for too much behavior (overgeneralizing).
Refinements of sensor level event labels suggested by domain experts have been
shown to enable discovery of more precise and insightful process models.
However, there exists no automated approach to generate refinements of event
labels in the context of process mining. In this paper we propose a framework
for the automated generation of label refinements based on the time attribute
of events, allowing us to distinguish behaviourally different instances of the
same event type based on their time attribute. We show on a case study with
real life smart home event data that using automatically generated refined
labels in process discovery, we can find more specific, and therefore more
insightful, process models. We observe that one label refinement could have an
effect on the usefulness of other label refinements when used together.
Therefore, we explore four strategies to generate useful combinations of
multiple label refinements and evaluate those on three real life smart home
event logs.
Gabriele Farina, John P. Dickerson, Tuomas Sandholm
Comments: Published at IJCAI-17
Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI)
A kidney exchange is a centrally-administered barter market where patients
swap their willing yet incompatible donors. Modern kidney exchanges use
2-cycles, 3-cycles, and chains initiated by non-directed donors (altruists who
are willing to give a kidney to anyone) as the means for swapping.
We propose significant generalizations to kidney exchange. We allow more than
one donor to donate in exchange for their desired patient receiving a kidney.
We also allow for the possibility of a donor willing to donate if any of a
number of patients receive kidneys. Furthermore, we combine these notions and
generalize them. The generalization is to exchange among organ clubs, where a
club is willing to donate organs outside the club if and only if the club
receives organs from outside the club according to given specifications. We
prove that unlike in the standard model, the uncapped clearing problem is
NP-complete.
We also present the notion of operation frames that can be used to sequence
the operations across batches, and present integer programming formulations for
the market clearing problems for these new types of organ exchanges.
Experiments show that in the single-donation setting, operation frames
improve planning by 34%–51%. Allowing up to two donors to donate in exchange
for one kidney donated to their designated patient yields a further increase in
social welfare.
Terrence Szymanski, Claudia Orellana-Rodriguez, Mark T. Keane
Journal-ref: Natural Language Processing meets Journalism. IJCAI-16 workshop.
Pages 30-34. 2016
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)
We present a software tool that employs state-of-the-art natural language
processing (NLP) and machine learning techniques to help newspaper editors
compose effective headlines for online publication. The system identifies the
most salient keywords in a news article and ranks them based on both their
overall popularity and their direct relevance to the article. The system also
uses a supervised regression model to identify headlines that are likely to be
widely shared on social media. The user interface is designed to simplify and
speed the editor’s decision process on the composition of the headline. As
such, the tool provides an efficient way to combine the benefits of automated
predictors of engagement and search-engine optimization (SEO) with human
judgments of overall headline quality.
Terrence Szymanski, Claudia Orellana-Rodriguez, Mark T. Keane
Journal-ref: Natural Language Processing meets Journalism. IJCAI-16 workshop.
Pages 30-34. 2016
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)
We present a software tool that employs state-of-the-art natural language
processing (NLP) and machine learning techniques to help newspaper editors
compose effective headlines for online publication. The system identifies the
most salient keywords in a news article and ranks them based on both their
overall popularity and their direct relevance to the article. The system also
uses a supervised regression model to identify headlines that are likely to be
widely shared on social media. The user interface is designed to simplify and
speed the editor’s decision process on the composition of the headline. As
such, the tool provides an efficient way to combine the benefits of automated
predictors of engagement and search-engine optimization (SEO) with human
judgments of overall headline quality.
Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi Jaakkola
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
This paper focuses on style transfer on the basis of non-parallel text. This
is an instance of a broader family of problems including machine translation,
decipherment, and sentiment modification. The key technical challenge is to
separate the content from desired text characteristics such as sentiment. We
leverage refined alignment of latent representations across mono-lingual text
corpora with different characteristics. We deliberately modify encoded examples
according to their characteristics, requiring the reproduced instances to match
available examples with the altered characteristics as a population. We
demonstrate the effectiveness of this cross-alignment method on three tasks:
sentiment modification, decipherment of word substitution ciphers, and recovery
of word order.
Rohan Kshirsagar, Robert Morris, Sam Bowman
Comments: Accepted at CLPsych, ACL workshop. 8 pages, 5 figures
Subjects: Computation and Language (cs.CL)
Individuals on social media may reveal themselves to be in various states of
crisis (e.g. suicide, self-harm, abuse, or eating disorders). Detecting crisis
from social media text automatically and accurately can have profound
consequences. However, detecting a general state of crisis without explaining
why has limited applications. An explanation in this context is a coherent,
concise subset of the text that rationalizes the crisis detection. We explore
several methods to detect and explain crisis using a combination of neural and
non-neural techniques. We evaluate these techniques on a unique data set
obtained from Koko, an anonymous emotional support network available through
various messaging applications. We annotate a small subset of the samples
labeled with crisis with corresponding explanations. Our best technique
significantly outperforms the baseline for detection and explanation.
Patchigolla V S S Rahul, Sunil Kumar Sahu, Ashish Anand
Comments: The work has been accepted in BioNLP at ACL-2017
Subjects: Computation and Language (cs.CL)
Biomedical events describe complex interactions between various biomedical
entities. Event trigger is a word or a phrase which typically signifies the
occurrence of an event. Event trigger identification is an important first step
in all event extraction methods. However many of the current approaches either
rely on complex hand-crafted features or consider features only within a
window. In this paper we propose a method that takes the advantage of recurrent
neural network (RNN) to extract higher level features present across the
sentence. Thus hidden state representation of RNN along with word and entity
type embedding as features avoid relying on the complex hand-crafted features
generated using various NLP toolkits. Our experiments have shown to achieve
state-of-art F1-score on Multi Level Event Extraction (MLEE) corpus. We have
also performed category-wise analysis of the result and discussed the
importance of various features in trigger identification task.
Edwin Simonnet (LIUM), Sahar Ghannay (LIUM), Nathalie Camelin (LIUM), Yannick Estève (LIUM), Renato De Mori (LIA)
Comments: Interspeech 2017, Aug 2017, Stockholm, Sweden. 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
This paper addresses the problem of automatic speech recognition (ASR) error
detection and their use for improving spoken language understanding (SLU)
systems. In this study, the SLU task consists in automatically extracting, from
ASR transcriptions , semantic concepts and concept/values pairs in a e.g
touristic information system. An approach is proposed for enriching the set of
semantic labels with error specific labels and by using a recently proposed
neural approach based on word embeddings to compute well calibrated ASR
confidence measures. Experimental results are reported showing that it is
possible to decrease significantly the Concept/Value Error Rate with a state of
the art system, outperforming previously published results performance on the
same experimental data. It also shown that combining an SLU approach based on
conditional random fields with a neural encoder/decoder attention based
architecture , it is possible to effectively identifying confidence islands and
uncertain semantic output segments useful for deciding appropriate error
handling actions by the dialogue manager strategy .
Dallas Card, Chenhao Tan, Noah A. Smith
Comments: 11 pages, 2 figures, 4 tables
Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL)
Topic models for text corpora comprise a popular family of methods that have
inspired many extensions to encode properties such as sparsity, interactions
with covariates, and the gradual evolution of topics. In this paper, we combine
certain motivating ideas behind variations on topic models with modern
techniques for variational inference to produce a flexible framework for topic
modeling that allows for rapid exploration of different models. We first
discuss how our framework relates to existing models, and then demonstrate that
it achieves strong performance, with the introduction of sparsity controlling
the trade off between perplexity and topic coherence.
Saeed Akhoondian Amiri, Stefan Schmid, Sebastian Siebertz
Comments: arXiv admin note: substantial text overlap with arXiv:1602.02991
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Discrete Mathematics (cs.DM); Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)
The Minimum Dominating Set (MDS) problem is one of the most fundamental and
challenging problems in distributed computing. While it is well-known that
minimum dominating sets cannot be approximated locally on general graphs, over
the last years, there has been much progress on computing local approximations
on sparse graphs, and in particular planar graphs.
In this paper we study distributed and deterministic MDS approximation
algorithms for graph classes beyond planar graphs. In particular, we show that
existing approximation bounds for planar graphs can be lifted to bounded genus
graphs, and present (1) a local constant-time, constant-factor MDS
approximation algorithm and (2) a local (mathcal{O}(log^*{n}))-time
approximation scheme. Our main technical contribution is a new analysis of a
slightly modified variant of an existing algorithm by Lenzen et al.
Interestingly, unlike existing proofs for planar graphs, our analysis does not
rely on direct topological arguments.
Bruna Peres
Comments: Thesis project presented to the Graduate Program in Computer Science of the Federal University of Minas Gerais in partial fulfillment of the requirements for the degree of Doctor in Computer Science
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
SplayNets are a distributed generalization of the classic splay tree data
structures. Given a set of communication requests and a network comprised of n
nodes, such that any pair of nodes is capable of establishing a direct
connection, the goal is to dynamically find a (locally routable) binary tree
topology, which connects all nodes and optimizes the routing cost for that
communication pattern, making local topology transformations (rotations) before
each request is served. In this work we present a distributed and concurrent
implementation of SplayNets. Analytical results show that our proposed
algorithm prevents loops and deadlocks from occurring between concurrent
rotations. We compute the total amortized average cost of a splay request in
number of rounds and number of time-slots and as a function of the empirical
entropies of source and destination nodes of the splay requests.
Erik Saule, Dinesh Panchananam, Alexander Hohl, Wenwu Tang, Eric Delmelle
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
The exponential growth of available data has increased the need for
interactive exploratory analysis. Dataset can no longer be understood through
manual crawling and simple statistics. In Geographical Information Systems
(GIS), the dataset is often composed of events localized in space and time; and
visualizing such a dataset involves building a map of where the events
occurred.
We focus in this paper on events that are localized among three dimensions
(latitude, longitude, and time), and on computing the first step of the
visualization pipeline, space-time kernel density estimation (STKDE), which is
most computationally expensive. Starting from a gold standard implementation,
we show how algorithm design and engineering, parallel decomposition, and
scheduling can be applied to bring near real-time computing to space-time
kernel density estimation. We validate our techniques on real world datasets
extracted from infectious disease, social media, and ornithology.
Raphael Kimmig, Henning Meyerhenke, Darren Strash
Comments: 18 pages, 12 figures, To appear at the 7th IEEE Workshop on Parallel / Distributed Computing and Optimization (PDCO 2017)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
The subgraph enumeration problem asks us to find all subgraphs of a target
graph that are isomorphic to a given pattern graph. Determining whether even
one such isomorphic subgraph exists is NP-complete—and therefore finding all
such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration
has applications in many fields, including biochemistry and social networks,
and interestingly the fastest algorithms for solving the problem for
biochemical inputs are sequential. Since they depend on depth-first tree
traversal, an efficient parallelization is far from trivial. Nevertheless,
since important applications produce data sets with increasing difficulty,
parallelism seems beneficial.
We thus present here a shared-memory parallelization of the state-of-the-art
subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs)
by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing
and our implementation demonstrates a significant speedup on real-world
biochemical data—despite a highly irregular data access pattern. We also
improve RI-DS by pruning the search space better; this further improves the
empirical running times compared to the already highly tuned RI-DS.
Calvin Newport
Comments: Extended Abstract to Appear in the Proceedings of the ACM Conference on the Principles of Distributed Computing (PODC 2017)
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
In this paper, we study the fundamental problem of gossip in the mobile
telephone model: a recently introduced variation of the classical telephone
model modified to better describe the local peer-to-peer communication services
implemented in many popular smartphone operating systems. In more detail, the
mobile telephone model differs from the classical telephone model in three
ways: (1) each device can participate in at most one connection per round; (2)
the network topology can undergo a parameterized rate of change; and (3)
devices can advertise a parameterized number of bits about their state to their
neighbors in each round before connection attempts are initiated. We begin by
describing and analyzing new randomized gossip algorithms in this model under
the harsh assumption of a network topology that can change completely in every
round. We prove a significant time complexity gap between the case where nodes
can advertise (0) bits to their neighbors in each round, and the case where
nodes can advertise (1) bit. For the latter assumption, we present two
solutions: the first depends on a shared randomness source, while the second
eliminates this assumption using a pseudorandomness generator we prove to exist
with a novel generalization of a classical result from the study of two-party
communication complexity. We then turn our attention to the easier case where
the topology graph is stable, and describe and analyze a new gossip algorithm
that provides a substantial performance improvement for many parameters. We
conclude by studying a relaxed version of gossip in which it is only necessary
for nodes to each learn a specified fraction of the messages in the system.
Andrea Clementi, Luciano Gualà, Guido Proietti, Giacomo Scornavacca
Comments: Accepted at IPDPS’17
Subjects: Computer Science and Game Theory (cs.GT); Distributed, Parallel, and Cluster Computing (cs.DC)
The emph{rational fair consensus problem} can be informally defined as
follows. Consider a network of (n) (selfish) emph{rational agents}, each of
them initially supporting a emph{color} chosen from a finite set ( Sigma).
The goal is to design a protocol that leads the network to a stable
monochromatic configuration (i.e. a consensus) such that the probability that
the winning color is (c) is equal to the fraction of the agents that initially
support (c), for any (c in Sigma). Furthermore, this fairness property must
be guaranteed (with high probability) even in presence of any fixed
emph{coalition} of rational agents that may deviate from the protocol in order
to increase the winning probability of their supported colors. A protocol
having this property, in presence of coalitions of size at most (t), is said to
be a emph{whp,-(t)-strong equilibrium}. We investigate, for the first time,
the rational fair consensus problem in the GOSSIP communication model where, at
every round, every agent can actively contact at most one neighbor via a
emph{push(/)pull} operation. We provide a randomized GOSSIP protocol that,
starting from any initial color configuration of the complete graph, achieves
rational fair consensus within (O(log n)) rounds using messages of
(O(log^2n)) size, w.h.p. More in details, we prove that our protocol is a
whp,-(t)-strong equilibrium for any (t = o(n/log n)) and, moreover, it
tolerates worst-case permanent faults provided that the number of non-faulty
agents is (Omega(n)). As far as we know, our protocol is the first solution
which avoids any all-to-all communication, thus resulting in (o(n^2)) message
complexity.
Vahan Huroyan, Gilad Lerman
Subjects: Numerical Analysis (math.NA); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (cs.NA)
We study Robust Subspace Recovery (RSR) in distributed settings. We consider
a huge data set in an ad hoc network without a central processor, where each
node has access only to one chunk of the data set. We assume that part of the
whole data set lies around a low-dimensional subspace and the other part is
composed of outliers that lie away from that subspace. The goal is to recover
the underlying subspace for the whole data set, without transferring the data
itself between the nodes. We apply the Consensus Based Gradient method for the
Geometric Median Subspace algorithm for RSR. We propose an iterative solution
for the local dual minimization problem and establish its (r)-linear
convergence. We show that this mathematical framework also extends to two
simpler problems: Principal Component Analysis and the geometric median. We
also explain how to distributedly implement the Reaper and Fast Median Subspace
algorithms for RSR. We demonstrate the competitive performance of our
algorithms for both synthetic and real data.
Xiaoran Liu, Qin Lin, Sicco Verwer, Dmitri Jarnikov
Comments: This paper has been accepted by the Thirty-Second Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) Workshop on Learning and Automata (LearnAut)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Formal Languages and Automata Theory (cs.FL); Logic in Computer Science (cs.LO)
This paper focuses on detecting anomalies in a digital video broadcasting
(DVB) system from providers’ perspective. We learn a probabilistic
deterministic real timed automaton profiling benign behavior of encryption
control in the DVB control access system. This profile is used as a one-class
classifier. Anomalous items in a testing sequence are detected when the
sequence is not accepted by the learned model.
AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash, Kun Zhang
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
We study causal inference in a multi-environment setting, in which the
functional relations for producing the variables from their direct causes
remain the same across environments, while the distribution of exogenous noises
may vary. We introduce the idea of using the invariance of the functional
relations of the variables to their causes across a set of environments. We
define a notion of completeness for a causal inference algorithm in this
setting and prove the existence of such algorithm by proposing the baseline
algorithm. Additionally, we present an alternate algorithm that has
significantly improved computational and sample complexity compared to the
baseline algorithm. The experiment results show that the proposed algorithm
outperforms the other existing algorithms.
James A. Grant, David S. Leslie, Kevin Glazebrook, Roberto Szechtman
Comments: 16 pages
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Motivated by problems in search and detection we present a solution to a
Combinatorial Multi-Armed Bandit (CMAB) problem with both heavy-tailed reward
distributions and a new class of feedback, filtered semibandit feedback. In a
CMAB problem an agent pulls a combination of arms from a set ({1,…,k}) in
each round, generating random outcomes from probability distributions
associated with these arms and receiving an overall reward. Under semibandit
feedback it is assumed that the random outcomes generated are all observed.
Filtered semibandit feedback allows the outcomes that are observed to be
sampled from a second distribution conditioned on the initial random outcomes.
This feedback mechanism is valuable as it allows CMAB methods to be applied to
sequential search and detection problems where combinatorial actions are made,
but the true rewards (number of objects of interest appearing in the round) are
not observed, rather a filtered reward (the number of objects the searcher
successfully finds, which must by definition be less than the number that
appear). We present an upper confidence bound type algorithm, Robust-F-CUCB,
and associated regret bound of order (mathcal{O}(ln(n))) to balance
exploration and exploitation in the face of both filtering of reward and heavy
tailed reward distributions.
Aamir Anis, Aly El Gamal, Salman Avestimehr, Antonio Ortega
Subjects: Learning (cs.LG)
Graph-based methods have been quite successful in solving unsupervised and
semi-supervised learning problems, as they provide a means to capture the
underlying geometry of the dataset. It is often desirable for the constructed
graph to satisfy two properties: first, data points that are similar in the
feature space should be strongly connected on the graph, and second, the class
label information should vary smoothly with respect to the graph, where
smoothness is measured using the spectral properties of the graph Laplacian
matrix. Recent works have justified some of these smoothness conditions by
showing that they are strongly linked to the semi-supervised smoothness
assumption and its variants. In this work, we reinforce this connection by
viewing the problem from a graph sampling theoretic perspective, where class
indicator functions are treated as bandlimited graph signals (in the
eigenvector basis of the graph Laplacian) and label prediction as a bandlimited
reconstruction problem. Our approach involves analyzing the bandwidth of class
indicator signals generated from statistical data models with separable and
nonseparable classes. These models are quite general and mimic the nature of
most real-world datasets. Our results show that in the asymptotic limit, the
bandwidth of any class indicator is also closely related to the geometry of the
dataset. This allows one to theoretically justify the assumption of
bandlimitedness of class indicator signals, thereby providing a sampling
theoretic interpretation of graph-based semi-supervised classification.
Yanan Li, Donghui Wang
Comments: This work was completed in Feb, 2015
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Automatically learning features, especially robust features, has attracted
much attention in the machine learning community. In this paper, we propose a
new method to learn non-linear robust features by taking advantage of the data
manifold structure. We first follow the commonly used trick of the trade, that
is learning robust features with artificially corrupted data, which are
training samples with manually injected noise. Following the idea of the
auto-encoder, we first assume features should contain much information to well
reconstruct the input from its corrupted copies. However, merely reconstructing
clean input from its noisy copies could make data manifold in the feature space
noisy. To address this problem, we propose a new method, called Incremental
Auto-Encoders, to iteratively denoise the extracted features. We assume the
noisy manifold structure is caused by a diffusion process. Consequently, we
reverse this specific diffusion process to further contract this noisy
manifold, which results in an incremental optimization of model parameters .
Furthermore, we show these learned non-linear features can be stacked into a
hierarchy of features. Experimental results on real-world datasets demonstrate
the proposed method can achieve better classification performances.
Daksh Varshneya, G. Srinivasaraghavan
Comments: 10 pages, 5 figures, Submitted to 31st Conference on Neural Information Processing Systems (NIPS 2017)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
Trajectory Prediction of dynamic objects is a widely studied topic in the
field of artificial intelligence. Thanks to a large number of applications like
predicting abnormal events, navigation system for the blind, etc. there have
been many approaches to attempt learning patterns of motion directly from data
using a wide variety of techniques ranging from hand-crafted features to
sophisticated deep learning models for unsupervised feature learning. All these
approaches have been limited by problems like inefficient features in the case
of hand crafted features, large error propagation across the predicted
trajectory and no information of static artefacts around the dynamic moving
objects. We propose an end to end deep learning model to learn the motion
patterns of humans using different navigational modes directly from data using
the much popular sequence to sequence model coupled with a soft attention
mechanism. We also propose a novel approach to model the static artefacts in a
scene and using these to predict the dynamic trajectories. The proposed method,
tested on trajectories of pedestrians, consistently outperforms previously
proposed state of the art approaches on a variety of large scale data sets. We
also show how our architecture can be naturally extended to handle multiple
modes of movement (say pedestrians, skaters, bikers and buses) simultaneously.
Giuseppe Nuti
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
K-Nearest Neighbours (k-NN) is a popular classification and regression
algorithm, yet one of its main limitations is the difficulty in choosing the
number of neighbours. We present a Bayesian algorithm to compute the posterior
probability distribution for k given a target point within a data-set,
efficiently and without the use of Markov Chain Monte Carlo (MCMC) methods or
simulation – alongside an exact solution for distributions within the
exponential family. The central idea is that data points around our target are
generated by the same probability distribution, extending outwards over the
appropriate, though unknown, number of neighbours. Once the data is projected
onto a distance metric of choice, we can transform the choice of k into a
change-point detection problem, for which there is an efficient solution: we
recursively compute the probability of the last change-point as we move towards
our target, and thus de facto compute the posterior probability distribution
over k. Applying this approach to both a classification and a regression UCI
data-sets, we compare favourably and, most importantly, by removing the need
for simulation, we are able to compute the posterior probability of k exactly
and rapidly. As an example, the computational time for the Ripley data-set is a
few milliseconds compared to a few hours when using a MCMC approach.
Tadas Baltrušaitis, Chaitanya Ahuja, Louis-Philippe Morency
Subjects: Learning (cs.LG)
Our experience of the world is multimodal – we see objects, hear sounds, feel
texture, smell odors, and taste flavors. Modality refers to the way in which
something happens or is experienced and a research problem is characterized as
multimodal when it includes multiple such modalities. In order for Artificial
Intelligence to make progress in understanding the world around us, it needs to
be able to interpret such multimodal signals together. Multimodal machine
learning aims to build models that can process and relate information from
multiple modalities. It is a vibrant multi-disciplinary field of increasing
importance and with extraordinary potential. Instead of focusing on specific
multimodal applications, this paper surveys the recent advances in multimodal
machine learning itself and presents them in a common taxonomy. We go beyond
the typical early and late fusion categorization and identify broader
challenges that are faced by multimodal machine learning, namely:
representation, translation, alignment, fusion, and co-learning. This new
taxonomy will enable researchers to better understand the state of the field
and identify directions for future research.
Kevin Roth, Aurelien Lucchi, Sebastian Nowozin, Thomas Hofmann
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Deep generative models based on Generative Adversarial Networks (GANs) have
demonstrated impressive sample quality but in order to work they require a
careful choice of architecture, parameter initialization, and selection of
hyper-parameters. This fragility is in part due to a dimensional mismatch
between the model distribution and the true distribution, causing their density
ratio and the associated f-divergence to be undefined. We overcome this
fundamental limitation and propose a new regularization approach with low
computational cost that yields a stable GAN training procedure. We demonstrate
the effectiveness of this approach on several datasets including common
benchmark image generation tasks. Our approach turns GAN models into reliable
building blocks for deep learning.
Niek Tax, Emin Alasgarov, Natalia Sidorova, Wil M.P. van der Aalst, Reinder Haakma
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB)
Process mining is a research field focused on the analysis of event data with
the aim of extracting insights related to dynamic behavior. Applying process
mining techniques on data from smart home environments has the potential to
provide valuable insights in (un)healthy habits and to contribute to ambient
assisted living solutions. Finding the right event labels to enable the
application of process mining techniques is however far from trivial, as simply
using the triggering sensor as the label for sensor events results in
uninformative models that allow for too much behavior (overgeneralizing).
Refinements of sensor level event labels suggested by domain experts have been
shown to enable discovery of more precise and insightful process models.
However, there exists no automated approach to generate refinements of event
labels in the context of process mining. In this paper we propose a framework
for the automated generation of label refinements based on the time attribute
of events, allowing us to distinguish behaviourally different instances of the
same event type based on their time attribute. We show on a case study with
real life smart home event data that using automatically generated refined
labels in process discovery, we can find more specific, and therefore more
insightful, process models. We observe that one label refinement could have an
effect on the usefulness of other label refinements when used together.
Therefore, we explore four strategies to generate useful combinations of
multiple label refinements and evaluate those on three real life smart home
event logs.
Ahmed Touati, Pierre-Luc Bacon, Doina Precup, Pascal Vincent
Comments: NIPS 2017 submission
Subjects: Learning (cs.LG)
Off-policy learning is key to scaling up reinforcement learning as it allows
to learn about a target policy from the experience generated by a different
behavior policy. Unfortunately, it has been challenging to combine off-policy
learning with function approximation and multi-step bootstrapping in a way that
leads to both stable and efficient algorithms. In this paper, we show that the
Tree Backup and Retrace algorithms are unstable with linear function
approximation, both in theory and with specific examples. Based on our
analysis, we then derive stable and efficient gradient-based algorithms,
compatible with accumulating or Dutch traces, using a novel methodology based
on proximal methods. In addition to convergence proofs, we provide
sample-complexity bounds.
Jean Lafond, Nicolas Vasilache, Léon Bottou
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
We define a second-order neural network stochastic gradient training
algorithm whose block-diagonal structure effectively amounts to normalizing the
unit activations. Investigating why this algorithm lacks in robustness then
reveals two interesting insights. The first insight suggests a new way to scale
the stepsizes, clarifying popular algorithms such as RMSProp as well as old
neural network tricks such as fanin stepsize scaling. The second insight
stresses the practical importance of dealing with fast changes of the curvature
of the cost.
Matt Feiszli
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
It can be difficult to tell whether a trained generative model has learned to
generate novel examples or has simply memorized a specific set of outputs. In
published work, it is common to attempt to address this visually, for example
by displaying a generated example and its nearest neighbor(s) in the training
set (in, for example, the L2 metric). As any generative model induces a
probability density on its output domain, we propose studying this density
directly. We first study the geometry of the latent representation and
generator, relate this to the output density, and then develop techniques to
compute and inspect the output density. As an application, we demonstrate that
“memorization” tends to a density made of delta functions concentrated on the
memorized examples. We note that without first understanding the geometry, the
measurement would be essentially impossible to make.
Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi Jaakkola
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
This paper focuses on style transfer on the basis of non-parallel text. This
is an instance of a broader family of problems including machine translation,
decipherment, and sentiment modification. The key technical challenge is to
separate the content from desired text characteristics such as sentiment. We
leverage refined alignment of latent representations across mono-lingual text
corpora with different characteristics. We deliberately modify encoded examples
according to their characteristics, requiring the reproduced instances to match
available examples with the altered characteristics as a population. We
demonstrate the effectiveness of this cross-alignment method on three tasks:
sentiment modification, decipherment of word substitution ciphers, and recovery
of word order.
Lev V. Utkin, Mikhail A. Ryabinin
Comments: arXiv admin note: substantial text overlap with arXiv:1704.08715
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
A Discriminative Deep Forest (DisDF) as a metric learning algorithm is
proposed in the paper. It is based on the Deep Forest or gcForest proposed by
Zhou and Feng and can be viewed as a gcForest modification. The case of the
fully supervised learning is studied when the class labels of individual
training examples are known. The main idea underlying the algorithm is to
assign weights to decision trees in random forest in order to reduce distances
between objects from the same class and to increase them between objects from
different classes. The weights are training parameters. A specific objective
function which combines Euclidean and Manhattan distances and simplifies the
optimization problem for training the DisDF is proposed. The numerical
experiments illustrate the proposed distance metric algorithm.
Yunus Saatchi, Andrew Gordon Wilson
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Generative adversarial networks (GANs) can implicitly learn rich
distributions over images, audio, and data which are hard to model with an
explicit likelihood. We present a practical Bayesian formulation for
unsupervised and semi-supervised learning with GANs, in conjunction with
stochastic gradient Hamiltonian Monte Carlo to marginalize the weights of the
generator and discriminator networks. The resulting approach is straightforward
and obtains good performance without any standard interventions such as feature
matching, or mini-batch discrimination. By exploring an expressive posterior
over the parameters of the generator, the Bayesian GAN avoids mode-collapse,
produces interpretable candidate samples with notable variability, and in
particular provides state-of-the-art quantitative results for semi-supervised
learning on benchmarks including SVHN, CelebA, and CIFAR-10, outperforming
DCGAN, Wasserstein GANs, and DCGAN ensembles.
Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard, Stefano Soatto
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Deep networks have recently been shown to be vulnerable to universal
perturbations: there exist very small image-agnostic perturbations that cause
most natural images to be misclassified by such classifiers. In this paper, we
propose the first quantitative analysis of the robustness of classifiers to
universal perturbations, and draw a formal link between the robustness to
universal perturbations, and the geometry of the decision boundary.
Specifically, we establish theoretical bounds on the robustness of classifiers
under two decision boundary models (flat and curved models). We show in
particular that the robustness of deep networks to universal perturbations is
driven by a key property of their curvature: there exists shared directions
along which the decision boundary of deep networks is systematically positively
curved. Under such conditions, we prove the existence of small universal
perturbations. Our analysis further provides a novel geometric method for
computing universal perturbations, in addition to explaining their properties.
Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Stefano Soatto
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
The goal of this paper is to analyze the geometric properties of deep neural
network classifiers in the input space. We specifically study the topology of
classification regions created by deep networks, as well as their associated
decision boundary. Through a systematic empirical investigation, we show that
state-of-the-art deep nets learn connected classification regions, and that the
decision boundary in the vicinity of datapoints is flat along most directions.
We further draw an essential connection between two seemingly unrelated
properties of deep networks: their sensitivity to additive perturbations in the
inputs, and the curvature of their decision boundary. The directions where the
decision boundary is curved in fact remarkably characterize the directions to
which the classifier is the most vulnerable. We finally leverage a fundamental
asymmetry in the curvature of the decision boundary of deep nets, and propose a
method to discriminate between original images, and images perturbed with small
adversarial examples. We show the effectiveness of this purely geometric
approach for detecting small adversarial perturbations in images, and for
recovering the labels of perturbed images.
Marco Cristoforetti, Giuseppe Jurman, Andrea I. Nardelli, Cesare Furlanello
Subjects: High Energy Physics – Lattice (hep-lat); Statistical Mechanics (cond-mat.stat-mech); Learning (cs.LG)
In several physical systems, important properties characterizing the system
itself are theoretically related with specific degrees of freedom. Although
standard Monte Carlo simulations provide an effective tool to accurately
reconstruct the physical configurations of the system, they are unable to
isolate the different contributions corresponding to different degrees of
freedom. Here we show that unsupervised deep learning can become a valid
support to MC simulation, coupling useful insights in the phases detection task
with good reconstruction performance. As a testbed we consider the 2D XY model,
showing that a deep neural network based on variational autoencoders can detect
the continuous Kosterlitz-Thouless (KT) transitions, and that, if endowed with
the appropriate constrains, they generate configurations with meaningful
physical content.
Kosetsu Tsukuda, Masataka Goto
Comments: Accepted by The 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017)
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)
Online music services are increasing in popularity. They enable us to analyze
people’s music listening behavior based on play logs. Although it is known that
people listen to music based on topic (e.g., rock or jazz), we assume that when
a user is addicted to an artist, s/he chooses the artist’s songs regardless of
topic. Based on this assumption, in this paper, we propose a probabilistic
model to analyze people’s music listening behavior. Our main contributions are
three-fold. First, to the best of our knowledge, this is the first study
modeling music listening behavior by taking into account the influence of
addiction to artists. Second, by using real-world datasets of play logs, we
showed the effectiveness of our proposed model. Third, we carried out
qualitative experiments and showed that taking addiction into account enables
us to analyze music listening behavior from a new viewpoint in terms of how
people listen to music according to the time of day, how an artist’s songs are
listened to by people, etc. We also discuss the possibility of applying the
analysis results to applications such as artist similarity computation and song
recommendation.
Wei Su, Yang Yang, Cuiling Fan
Subjects: Information Theory (cs.IT)
Binary sequences with optimal autocorrelation play important roles in radar,
communication, and cryptography. Finding new binary sequences with optimal
autocorrelation has been an interesting research topic in sequence design.
Ding-Helleseth-Lam sequences are such a class of binary sequences of period
(p), where (p) is an odd prime with (pequiv 1(mod~4)). The objective of this
letter is to present a construction of binary sequences of period (4p) via
interleaving four suitable Ding-Helleseth-Lam sequences. This construction
generates new binary sequences with optimal autocorrelation which can not be
produced by earlier ones.
Fadhil Firyaguna, Jacek Kibilda, Carlo Galiotto, Nicola Marchetti
Comments: 7 pages, 6 figures, Submitted to IEEE GLOBECOM 2017
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
Provisioning of high throughput millimetre-wave signal to indoor areas that
potentially serve large number of users, such as transportation hubs or
convention centres, will require dedicated indoor millimetre-wave access point
deployments. In this article we study dense deployments of millimetre-wave
access points mounted on the ceiling, and illuminating selected spots on the
ground with the use of fixed directional antennas. In this setup, the main
factor limiting signal propagation are blockages by human bodies. We evaluate
our system under a number of scenarios that take into account beamwidth of the
main-lobe, access point density, and positioning of a mobile device with
respect to the user’s body. We find that both coverage and area spectral
efficiency curves exhibit non-trivial behaviour which can be classified into
four regions related to the selection of access point density, beamwidth, and
height values. Furthermore, we observe a trade-off in beamwidth design, as the
optimal beamwidth maximizes either coverage or area spectral efficiency but not
both. Finally, when we consider different body shadowing scenarios, our network
design will optimize coverage and area spectral efficiency performance towards
either devices held in hand or worn directly against the body, as each of the
scenarios requires mutually exclusive settings of access point density and
beamwidth.
Tamir Bendory, Robert Beinert, Yonina C. Eldar
Subjects: Information Theory (cs.IT)
The problem of recovering a signal from its phaseless Fourier transform
measurements, called Fourier phase retrieval, arises in many applications in
engineering and science. Fourier phase retrieval poses fundamental theoretical
and algorithmic challenges. In general, there is no unique mapping between a
one-dimensional signal and its Fourier magnitude and therefore the problem is
ill-posed. Additionally, while almost all multidimensional signals are uniquely
mapped to their Fourier magnitude, the performance of existing algorithms is
generally not well-understood. In this chapter we survey methods to guarantee
uniqueness in Fourier phase retrieval. We then present different algorithmic
approaches to retrieve the signal in practice. We conclude by outlining some of
the main open questions in this field.
Serge Kas Hanna, Salim El Rouayheb
Comments: Submitted to IEEE Transactions on Information Theory. arXiv admin note: text overlap with arXiv:1702.04466
Subjects: Information Theory (cs.IT)
We consider the problem of constructing codes that can correct (delta)
deletions occurring in an arbitrary binary string of length (n) bits.
Varshamov-Tenengolts (VT) codes are zero-error single deletion ((delta=1))
correcting codes, and have an asymptotically optimal redundancy. Finding
similar codes for (delta geq 2) deletions is an open problem. We propose a
new family of codes, that we call Guess & Check (GC) codes, that can correct,
with high probability, up to (delta) deletions occurring in a binary string.
Moreover, we show that GC codes can also correct up to (delta) insertions. GC
codes are based on MDS codes and have an asymptotically optimal redundancy that
is (Theta(delta log n)). We provide deterministic polynomial time encoding
and decoding schemes for these codes. We also describe the applications of GC
codes to file synchronization.
Irina E. Bocharova, Boris D. Kudryashov, Vitaly Skachek, Yauhen Yakimenka
Comments: Submitted to IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
A new method for low-complexity near-maximum-likelihood (ML) decoding of
low-density parity-check (LDPC) codes over the additive white Gaussian noise
channel is presented. The proposed method termed belief-propagation–list
erasure decoding (BP-LED) is based on erasing carefully chosen unreliable bits
performed in case of BP decoding failure. A strategy of introducing erasures
into the received vector and a new erasure decoding algorithm are proposed. The
new erasure decoding algorithm, called list erasure decoding, combines ML
decoding over the BEC with list decoding applied if the ML decoder fails to
find a unique solution. The asymptotic exponent of the average list size for
random regular LDPC codes from the Gallager ensemble is analyzed. Furthermore,
a few examples of regular and irregular quasi-cyclic LDPC codes of short and
moderate lengths are studied by simulations and their performance is compared
with the upper bound on the LDPC ensemble-average performance and the upper
bound on the average performance of random linear codes under ML decoding. A
comparison with the BP decoding performance of the WiMAX standard codes and
performance of the near-ML BEAST decoding are presented. The new algorithm is
applied to decoding a short nonbinary LDPC code over the extension of the
binary Galois field. The obtained simulation results are compared to the upper
bound on the ensemble-average performance of the binary image of regular
nonbinary LDPC codes.
Lan V. Truong
Comments: 24 pages, 5 figures
Subjects: Information Theory (cs.IT)
In this paper, we investigate the performance of the Viterbi decoding
algorithm with/without automatic repeat request (ARQ) over a Rician flat fading
channel with unlimited interleaving. We show that the decay rate of the average
bit error probability with respect to the bit energy to noise ratio is of order
between (d_f) and (d_f+1) at high bit energy to noise ratio for both cases
(with ARQ and without ARQ), where (d_f) is the free distance of the
convolutional code. The Yamamoto-Itoh flag helps to reduce the average bit
error probability by a factor of (4^{d_f}) with a negligible retransmission
rate. Besides, the error exponent with respect to the Rician factor is shown to
be (d_f) for any fixed bit energy per noise ratio.
Suzie Brown, Oliver Johnson, Andrea Tassi
Comments: Manuscript submitted as a correspondence to IEEE Transactions on Vehicular Technology
Subjects: Information Theory (cs.IT); Performance (cs.PF)
Ultra-reliable Point-to-Multipoint (PtM) communications are expected to
become pivotal in networks offering future dependable services for smart
cities. In this regard, sparse Random Linear Network Coding (RLNC) techniques
have been widely employed to provide an efficient way to improve the
reliability of broadcast and multicast data streams. This paper addresses the
pressing concern of providing a tight approximation to the probability of a
user recovering a data stream protected by this kind of coding technique. In
particular, by exploiting the Stein-Chen method, we provide a novel and general
performance framework applicable to any combination of system and service
parameters, such as finite field sizes, lengths of the data stream and level of
sparsity. The deviation of the proposed approximation from Monte Carlo
simulations is negligible, improving significantly on the state of the art
performance bound.
Alexander Span, Vahid Aref, Henning Buelow, Stephan ten Brink
Comments: Accepted for ISIT 2017
Subjects: Information Theory (cs.IT)
Multi-soliton pulses are potential candidates for fiber optical transmission
where the information is modulated and recovered in the so-called nonlinear
Fourier domain. While this is an elegant technique to account for the channel
nonlinearity, the obtained spectral efficiency, so far, is not competitive with
the classic Nyquist-based schemes. In this paper, we study the evolution of the
time-bandwidth product of multi-solitons as they propagate along the optical
fiber. For second and third order soliton pulses, we numerically optimize the
pulse shapes to achieve the smallest time-bandwidth product when the phase of
the spectral amplitudes is used for modulation. Moreover, we analytically
estimate the pulse-duration and bandwidth of multi-solitons in some practically
important cases. Those estimations enable us to approximate the time-bandwidth
product for higher order solitons.
Fabio Benatti, Samad Khabbazi Oskouei, Ahmad Shafiei Deh Abad
Subjects: Information Theory (cs.IT); Dynamical Systems (math.DS); Quantum Physics (quant-ph)
We study the relations between the recently proposed machine-independent
quantum complexity of P. Gacs~cite{Gacs} and the entropy of classical and
quantum systems. On one hand, by restricting Gacs complexity to ergodic
classical dynamical systems, we retrieve the equality between the Kolmogorov
complexity rate and the Shannon entropy rate derived by A.A.
Brudno~cite{Brudno}. On the other hand, using the quantum Shannon-Mc Millan
theorem~cite{BSM}, we show that such an equality holds densely in the case of
ergodic quantum spin chains.
Zaidao Wen, Biao Hou, Licheng Jiao
Comments: Code is available
Journal-ref: IEEE Signal Processing Letters (Volume: 24, Issue: 5, May 2017)
Subjects: Information Theory (cs.IT)
Discrete multiple signal classification (MUSIC) with its low computational
cost and mild condition requirement becomes a significant noniterative
algorithm for joint sparse recovery (JSR). However, it fails in rank defective
problem caused by coherent or limited amount of multiple measurement vectors
(MMVs). In this letter, we provide a novel sight to address this problem by
interpreting JSR as a binary classification problem with respect to atoms.
Meanwhile, MUSIC essentially constructs a supervised classifier based on the
labeled MMVs so that its performance will heavily depend on the quality and
quantity of these training samples. From this viewpoint, we develop a
semisupervised MUSIC (SS-MUSIC) in the spirit of machine learning, which
declares that the insufficient supervised information in the training samples
can be compensated from those unlabeled atoms. Instead of constructing a
classifier in a fully supervised manner, we iteratively refine a semisupervised
classifier by exploiting the labeled MMVs and some reliable unlabeled atoms
simultaneously. Through this way, the required conditions and iterations can be
greatly relaxed and reduced. Numerical experimental results demonstrate that
SS-MUSIC can achieve much better recovery performances than other MUSIC
extended algorithms as well as some typical greedy algorithms for JSR in terms
of iterations and recovery probability.
Jae-Won Kim, Jong-Seon No
Subjects: Information Theory (cs.IT)
In this paper, new equivalence relationships between a network code with link
errors (NCLE) and an index code with side information errors (ICSIE) are
studied. First, for a given network coding instance, the equivalent index
coding instance is derived, where an NCLE is converted to the corresponding
ICSIE and vice versa. Next, for a given index coding instance, the equivalent
network coding instance is also derived, where an ICSIE is converted to the
corresponding NCLE and vice versa if a pair of encoding functions of an
original link and the duplicated link are functionally related in the network
code. Finally, several properties of an NCLE are derived from those of the
equivalent ICSIE using the fact that the NCLE and the ICSIE are equivalent.
Haoran Sun, Xiangyi Chen, Qingjiang Shi, Mingyi Hong, Xiao Fu, Nikos D. Sidiropoulos
Comments: This paper has been submitted to SPAWC 2017 for publication on March 10th, 2017
Subjects: Information Theory (cs.IT)
For decades, optimization has played a central role in addressing wireless
resource management problems such as power control and beamformer design.
However, these algorithms often require a considerable number of iterations for
convergence, which poses challenges for real-time processing. In this work, we
propose a new learning-based approach for wireless resource management. The key
idea is to treat the input and output of a resource allocation algorithm as an
unknown non-linear mapping and to use a deep neural network (DNN) to
approximate it. If the non-linear mapping can be learned accurately and
effectively by a DNN of moderate size, then such DNN can be used for resource
allocation in almost emph{real time}, since passing the input through a DNN to
get the output only requires a small number of simple operations. In this work,
we first characterize a class of `learnable algorithms’ and then design DNNs to
approximate some algorithms of interest in wireless communications. We use
extensive numerical simulations to demonstrate the superior ability of DNNs for
approximating two considerably complex algorithms that are designed for power
allocation in wireless transmit signal design, while giving orders of magnitude
speedup in computational time.
Jiahui Li, Yin Sun, Limin Xiao, Shidong Zhou, C. Emre Koksal
Comments: 15 pages, 11 figures
Subjects: Information Theory (cs.IT)
Fast and accurate analog beam tracking is an important and yet challenging
issue in 5G wireless networks, due to the inherent non-convexity of the
problem. In this paper, we develop a low-complexity recursive beam tracking
algorithm. In static beam tracking scenarios, this algorithm converges to the
Cram’er-Rao lower bound (CRLB) with very high probability. In dynamic beam
tracking scenarios, if combined with a simple TDMA pilot pattern, this
algorithm has the potential to track hundreds of independent beams, generated
by highly-mobile transmitters/reflectors, with low pilot overhead. Simulations
are provided to illustrate the performance gain of this algorithm.
Felipe Gómez-Cuba, Elza Erkip, Sundeep Rangan, Francisco J. González-Castaño
Comments: 30 pages, 4 figures, 1 table. Submitted to IEEE TWC
Subjects: Information Theory (cs.IT)
The availability of very wide spectrum in millimeter wave bands combined with
large antenna arrays and ultra dense networks raises two basic questions: What
is the true value of overly abundant degrees of freedom and how can networks be
designed to fully exploit them? This paper determines the capacity scaling of
large cellular networks as a function of bandwidth, area, number of antennas
and base station density. It is found that the network capacity has a
fundamental bandwidth scaling limit, beyond which the network becomes
power-limited. An infrastructure multi-hop protocol achieves the optimal
network capacity scaling for all network parameters. In contrast, current
protocols that use only single-hop direct transmissions can not achieve the
capacity scaling in wideband regimes except in the special case when the
density of base stations is taken to impractical extremes. This finding
suggests that multi-hop communication will be important to fully realize the
potential of next-generation cellular networks. Dedicated relays, if
sufficiently dense, can also perform this task, relieving user nodes from the
battery drain of cooperation. On the other hand, more sophisticated strategies
such as hierarchical cooperation, that are essential for achieving capacity
scaling in ad hoc networks, are unnecessary in the cellular context.
Salman Salamatian, Ahmad Beirami, Asaf Cohen, Muriel Médard
Comments: Accepted at IEEE International Symposium on Information Theory (ISIT) 2017
Subjects: Information Theory (cs.IT)
We study a notion of guesswork, where multiple agents intend to launch a
coordinated brute-force attack to find a single binary secret string, and each
agent has access to side information generated through either a BEC or a BSC.
The average number of trials required to find the secret string grows
exponentially with the length of the string, and the rate of the growth is
called the guesswork exponent. We compute the guesswork exponent for several
multi-agent attacks. We show that a multi-agent attack reduces the guesswork
exponent compared to a single agent, even when the agents do not exchange
information to coordinate their attack, and try to individually guess the
secret string using a predetermined scheme in a decentralized fashion. Further,
we show that the guesswork exponent of two agents who do coordinate their
attack is strictly smaller than that of any finite number of agents
individually performing decentralized guesswork.
Panagiotis Mandros, Mario Boley, Jilles Vreeken
Comments: Accepted to the Proceedings of ACM SIGKDD conference, Halifax, Canada, August 2017
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Theory (cs.IT)
Given a database and a target attribute of interest, how can we tell whether
there exists a functional, or approximately functional dependence of the target
on any set of other attributes in the data? How can we reliably, without bias
to sample size or dimensionality, measure the strength of such a dependence?
And, how can we efficiently discover the optimal or (alpha)-approximate
top-(k) dependencies? These are exactly the questions we answer in this paper.
As we want to be agnostic on the form of the dependence, we adopt an
information-theoretic approach, and construct a reliable, bias correcting score
that can be efficiently computed. Moreover, we give an effective optimistic
estimator of this score, by which for the first time we can mine the
approximate functional dependencies from data with guarantees of optimality.
Empirical evaluation shows that the derived score achieves a good bias for
variance trade-off, can be used within an efficient discovery algorithm, and
indeed discovers meaningful dependencies. Most important, it remains reliable
in the face of data sparsity.