Travis Desell
Comments: 17 pages, 13 figures. Submitted to the 2017 Genetic and Evolutionary Computation Conference (GECCO 2017)
Subjects: Neural and Evolutionary Computing (cs.NE)
This work presents a new algorithm called evolutionary exploration of
augmenting convolutional topologies (EXACT), which is capable of evolving the
structure of convolutional neural networks (CNNs). EXACT is in part modeled
after the neuroevolution of augmenting topologies (NEAT) algorithm, with
notable exceptions to allow it to scale to large scale distributed computing
environments and evolve networks with convolutional filters. In addition to
multithreaded and MPI versions, EXACT has been implemented as part of a BOINC
volunteer computing project, allowing large scale evolution. During a period of
two months, over 4,500 volunteered computers on the Citizen Science Grid
trained over 120,000 CNNs and evolved networks reaching 98.32% test data
accuracy on the MNIST handwritten digits dataset. These results are even
stronger as the backpropagation strategy used to train the CNNs was fairly
rudimentary (ReLU units, L2 regularization and Nesterov momentum) and these
were initial test runs done without refinement of the backpropagation
hyperparameters. Further, the EXACT evolutionary strategy is independent of the
method used to train the CNNs, so they could be further improved by advanced
techniques like elastic distortions, pretraining and dropout. The evolved
networks are also quite interesting, showing “organic” structures and
significant differences from standard human designed architectures.
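To illustrate the flavor of the evolutionary loop (a minimal sketch with hypothetical genome and fitness representations, not the authors' EXACT code):

    import random

    def evolve_step(population, train_and_score, max_size=100):
        # One asynchronous step: mutate a parent genome, train the child
        # CNN (e.g., backprop, then measure test accuracy), and insert the
        # child into the population if it beats the worst member.
        parent = random.choice(population)
        child = {**parent, "mutations": list(parent.get("mutations", []))}
        child["mutations"].append(random.choice(
            ["add_filter", "add_edge", "disable_edge"]))  # topology mutation
        child["fitness"] = train_and_score(child)
        worst = min(population, key=lambda g: g["fitness"])
        if len(population) < max_size:
            population.append(child)
        elif child["fitness"] > worst["fitness"]:
            population[population.index(worst)] = child

Because each child is trained independently, steps like this can be farmed out to thousands of volunteered hosts, which is what makes the approach a natural fit for BOINC.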
Thomas E. Potok, Catherine Schuman, Steven R. Young, Robert M. Patton, Federico Spedalieri, Jeremy Liu, Ke-Thia Yao, Garrett Rose, Gangotree Chakma
Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)
Current Deep Learning approaches have been very successful using
convolutional neural networks (CNN) trained on large graphical processing units
(GPU)-based computers. Three limitations of this approach are: 1) they are
based on a simple layered network topology, i.e., highly connected layers,
without intra-layer connections; 2) the networks are manually configured to
achieve optimal results; and 3) the implementation of the neuron model is expensive
in both cost and power. In this paper, we evaluate deep learning models using
three different computing architectures to address these problems: quantum
computing to train complex topologies, high performance computing (HPC) to
automatically determine network topology, and neuromorphic computing for a
low-power hardware implementation. We use the MNIST dataset for our experiment,
due to input size limitations of current quantum computers. Our results show
the feasibility of using the three architectures in tandem to address the above
deep learning limitations. We show a quantum computer can find high quality
values of intra-layer connection weights in tractable time as the
complexity of the network increases; a high performance computer can find
optimal layer-based topologies; and a neuromorphic computer can represent the
complex topology and weights derived from the other architectures in low power
memristive hardware.
Sailesh Conjeti, Magdalini Paschali, Amin Katouzian, Nassir Navab
Comments: 10 pages, 7 figures, under review at MICCAI 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, for the first time, we introduce a multiple instance (MI) deep
hashing technique for learning discriminative hash codes with weak bag-level
supervision suited for large-scale retrieval. We learn such hash codes by
aggregating deeply learnt hierarchical representations across bag members
through a dedicated MI pool layer. For better trainability and retrieval
quality, we propose a two-pronged approach that includes robust optimization
and training with an auxiliary single instance hashing arm which is
down-regulated gradually. We pose retrieval for tumor assessment as an MI
problem because tumors often coexist with benign masses and could exhibit
complementary signatures when scanned from different anatomical views.
Experimental validations on benchmark mammography and histology datasets
demonstrate improved retrieval performance over the state-of-the-art methods.
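As a rough sketch of the bag-level code generation (illustrative only: max-pooling stands in for the dedicated MI pool layer, and a random projection stands in for the learned deep representation):

    import numpy as np

    def bag_hash(instance_features, W):
        # instance_features: (n_instances, d) features of one bag's members
        # W: (d, n_bits) hashing projection (learned in the real system)
        codes = instance_features @ W        # per-instance real-valued codes
        pooled = codes.max(axis=0)           # MI pooling across bag members
        return (pooled > 0).astype(np.int8)  # binarize to the bag-level hash

    rng = np.random.default_rng(0)
    print(bag_hash(rng.normal(size=(5, 64)), rng.normal(size=(64, 32))))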
Yifan Sun, Liang Zheng, Weijian Deng, Shengjin Wang
Comments: This version is not fully edited and will be updated soon
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper proposes the SVDNet for retrieval problems, with focus on the
application of person re-identification (re-ID). We view the weight matrix $W$
as a set of weight vectors or a basis. It is observed that the weight vectors are
usually highly correlated. This correlation leads to correlations among entries of
the FC descriptor and compromises the retrieval performance based on the Euclidean
distance. To alleviate the correlation problem, this paper proposes to
decorrelate the learned weight vectors using singular vector decomposition
(SVD). Specifically, we design a novel training strategy with the “restraint
and relaxation iteration” (RRI) scheme. We conduct experiment on the
Market-1501, CUHK03, and Duke datasets, and show that RRI effectively reduces
the correlation among the projection vectors and significantly improves the
re-ID accuracy. On the Market-1501 dataset, for instance, rank-1 accuracy is
improved from 55.2% to 80.5% for CaffeNet, and from 73.8% to 83.1% for
ResNet-50.
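A minimal numpy sketch of the decorrelation step itself, replacing W by US from its SVD so that the resulting weight vectors are mutually orthogonal (toy code, not the authors' RRI training procedure):

    import numpy as np

    def decorrelate(W):
        # W = U S V^T; the columns of U S are orthogonal, so substituting
        # U S for W removes the correlation among the weight vectors.
        U, s, _ = np.linalg.svd(W, full_matrices=False)
        return U * s                        # same as U @ np.diag(s)

    W = np.random.default_rng(0).normal(size=(128, 16))
    Wd = decorrelate(W)
    G = Wd.T @ Wd                           # Gram matrix of the new vectors
    print(np.allclose(G, np.diag(np.diag(G))))  # diagonal, i.e. decorrelated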
Andre Ebert, Sebastian Feld, Florian Dorfmeister
Comments: 4 Pages, 6 Figures, Accepted at the The 23rd International Conference on Systems, Signals and Image Processing (IWSSIP)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Mutual usage of vehicles as well as car sharing became more and more
attractive during the last years. Especially in urban environments with limited
parking possibilities and a higher risk for traffic jams, car rentals and
sharing services may save time and money. But when renting a vehicle it could
already be damaged (e.g., scratches or bumps inflicted by a previous user)
without the damage being perceived by the service provider. In order to address
such problems, we present an automated, motion-based system for impact
detection that uses a common smartphone as its sensor platform. The
system is capable of detecting the impact segment and the point of time of an
impact event on a vehicle’s surface, as well as its direction of origin. With
this additional specific knowledge, it may be possible to reconstruct the
circumstances of an impact event, e.g., to prove the possible innocence of a
service’s customer.
Nan Xue, Gui-Song Xia, Xiang Bai, Liangpei Zhang, Weiming Shen
Comments: 24 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Junctions play an important role in the characterization of local geometric
structures in images, the detection of which is a longstanding and challenging
task. Existing junction detectors usually focus on identifying the junction
locations and the orientations of the junction branches while ignoring their
scales; however, these scales also contain rich geometric information. This
paper presents a novel approach to junction detection and characterization that
exploits the locally anisotropic geometries of a junction and estimates the
scales of these geometries using an a contrario model. The output
junctions have anisotropic scales — i.e., each branch of a junction is
associated with an independent scale parameter — and are thus termed
anisotropic-scale junctions (ASJs). We then apply the newly detected ASJs for
the matching of indoor images, in which there may be dramatic changes in
viewpoint and the detected local visual features, e.g., key-points, are usually
insufficiently distinctive. We propose to use the anisotropic geometries of our
junctions to improve the matching precision for indoor images. Matching results
obtained on sets of indoor images demonstrate that our approach achieves
state-of-the-art performance in indoor image matching.
Li Liu, Fumin Shen, Yuming Shen, Xianglong Liu, Ling Shao
Comments: This paper will appear as a spotlight paper in CVPR2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Free-hand sketch-based image retrieval (SBIR) is a specific cross-view
retrieval task, in which queries are abstract and ambiguous sketches while the
retrieval database is formed with natural images. Work in this area mainly
focuses on extracting representative and shared features for sketches and
natural images. However, these can neither cope well with the geometric
distortion between sketches and images nor be feasible for large-scale SBIR due
to the heavy continuous-valued distance computation. In this paper, we speed up
SBIR by introducing a novel binary coding method, named Deep Sketch Hashing
(DSH), where a semi-heterogeneous deep architecture is proposed and
incorporated into an end-to-end binary coding framework. Specifically, three
convolutional neural networks are utilized to encode free-hand sketches,
natural images and, especially, the auxiliary sketch-tokens which are adopted
as bridges to mitigate the sketch-image geometric distortion. The learned DSH
codes can effectively capture the cross-view similarities as well as the
intrinsic semantic correlations between different categories. To the best of
our knowledge, DSH is the first hashing work specifically designed for
category-level SBIR with an end-to-end deep architecture. The proposed DSH is
comprehensively evaluated on two large-scale datasets of TU-Berlin Extension
and Sketchy, and the experiments consistently show DSH’s superior SBIR
accuracies over several state-of-the-art methods, while achieving significantly
reduced retrieval time and memory footprint.
Ignacio Rocco, Relja Arandjelović, Josef Sivic
Comments: To appear in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
We address the problem of determining correspondences between two images in
agreement with a geometric model such as an affine or thin-plate-spline
transformation, and estimating its parameters. The contributions of this work
are three-fold. First, we propose a convolutional neural network architecture
for geometric matching. The architecture is based on three main components that
mimic the standard steps of feature extraction, matching and simultaneous
inlier detection and model parameter estimation, while being trainable
end-to-end. Second, we demonstrate that the network parameters can be trained
from synthetically generated imagery without the need for manual annotation and
that our matching layer significantly increases generalization to previously
unseen images. Finally, we show that the same model can perform both
instance-level and category-level matching giving state-of-the-art results on
the challenging Proposal Flow dataset.
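A toy sketch of the kind of correlation (“matching”) layer at the heart of such an architecture: for L2-normalized dense feature maps of the two images, it computes all pairwise similarities (illustrative numpy, not the authors' implementation):

    import numpy as np

    def correlation_layer(fA, fB):
        # fA, fB: (h, w, d) dense feature maps. Returns (h, w, h*w): for
        # each location of A, its similarity to every location of B.
        h, w, d = fA.shape
        corr = fA.reshape(h * w, d) @ fB.reshape(h * w, d).T
        return corr.reshape(h, w, h * w)

    f = np.random.default_rng(0).normal(size=(15, 15, 64))
    f /= np.linalg.norm(f, axis=-1, keepdims=True)   # L2-normalize
    print(correlation_layer(f, f).shape)             # (15, 15, 225)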
Antonio Foncubierta-Rodríguez, Henning Müller, Adrien Depeursinge
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The Bag-of-Visual-Words (BoVW) is a visual description technique that aims
at shortening the semantic gap by partitioning a low-level feature space into
regions that potentially correspond to visual concepts and by giving more value
to this space. In this paper we present a conceptual analysis of three major
properties of language grammar and how they can be adapted to the computer
vision and image understanding domain based on the bag of visual words
paradigm. Evaluation of the visual grammar shows that a positive impact on
classification accuracy and/or descriptor size is obtained when the proposed
techniques are applied.
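For context, the basic BoVW pipeline that the visual grammar builds on quantizes each local descriptor to its nearest visual word and describes the image by the word histogram (a standard sketch, not the paper's code):

    import numpy as np

    def bovw_histogram(descriptors, vocabulary):
        # descriptors: (n, d) local features of one image; vocabulary:
        # (k, d) visual words, e.g. k-means centroids of training features.
        d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
        hist = np.bincount(d2.argmin(axis=1), minlength=len(vocabulary))
        return hist / hist.sum()            # normalized BoVW descriptor

    rng = np.random.default_rng(0)
    print(bovw_histogram(rng.normal(size=(200, 8)), rng.normal(size=(32, 8))))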
Vincent Andrearczyk, Paul F. Whelan
Comments: 19 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Dynamic Textures (DTs) are sequences of images of moving scenes, such as
smoke, vegetation and fire, that exhibit certain stationarity properties in time. The
analysis of DT is important for recognition, segmentation, synthesis or
retrieval for a range of applications including surveillance, medical imaging
and remote sensing. Deep learning methods have shown impressive results and are
now the new state of the art for a wide range of computer vision tasks
including image and video recognition and segmentation. In particular,
Convolutional Neural Networks (CNNs) have recently proven to be well suited for
texture analysis with a design similar to a filter bank approach. In this
paper, we develop a new approach to DT analysis based on a CNN method applied
on three orthogonal planes xy, xt and yt. We train CNNs on spatial frames
and temporal slices extracted from the DT sequences and combine their outputs
to obtain a competitive DT classifier. Our results on a wide range of commonly
used DT classification benchmark datasets prove the robustness of our approach.
Significant improvement of the state of the art is shown on the larger
datasets.
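The three orthogonal planes are simply slices of the (t, y, x) video volume, as this trivial numpy illustration shows:

    import numpy as np

    video = np.random.default_rng(0).random((40, 64, 64))  # (t, y, x) clip
    t, y, x = video.shape
    xy_frame = video[t // 2]        # spatial appearance, shape (y, x)
    xt_slice = video[:, y // 2, :]  # temporal slice,     shape (t, x)
    yt_slice = video[:, :, x // 2]  # temporal slice,     shape (t, y)
    # One CNN is trained per plane type and their outputs are combined.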
Jin Qi, Miao Le, Chunming Li, Ping Zhou
Comments: 4 pages, 3 figures. ISIC2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
With a large influx of dermoscopy images and a growing shortage of
dermatologists, automatic dermoscopic image analysis plays an essential role in
skin cancer diagnosis. In this paper, a new deep fully convolutional neural
network (FCNN) is proposed to automatically segment melanoma out of skin images
by end-to-end learning with only pixels and labels as inputs. Our proposed FCNN
is capable of using both local and global information to segment melanoma by
adopting skipping layers. The public benchmark database consisting of 150
validation images, 600 test images and 2000 training images in the melanoma
detection challenge 2017 at International Symposium Biomedical Imaging 2017 is
used to test the performance of our algorithm. All large images (for
example, $4000 \times 6000$ pixels) are reduced to much smaller images of
$384 \times 384$ pixels (more than 10 times smaller). We obtained and submitted
preliminary results to the challenge without any pre- or post-processing. The
performance of our proposed method could be further improved by data
augmentation and by avoiding image size reduction.
Ruth Fong, Walter Scheirer, David Cox
Comments: Supplemental material can be downloaded here: this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Machine learning is a field of computer science that builds algorithms that
learn. In many cases, machine learning algorithms are used to recreate a human
ability like adding a caption to a photo, driving a car, or playing a game.
While the human brain has long served as a source of inspiration for machine
learning, little effort has been made to directly use data collected from
working brains as a guide for machine learning algorithms. Here we demonstrate
a new paradigm of “neurally-weighted” machine learning, which takes fMRI
measurements of human brain activity from subjects viewing images, and infuses
these data into the training process of an object recognition learning
algorithm to make it more consistent with the human brain. After training,
these neurally-weighted classifiers are able to classify images without
requiring any additional neural data. We show that our neural-weighting
approach can lead to large performance gains when used with traditional machine
vision features, as well as to significant improvements with already
high-performing convolutional neural network features. The effectiveness of
this approach points to a path forward for a new class of hybrid machine
learning algorithms which take both inspiration and direct constraints from
neuronal data.
Zhe Jin, Yen-Lung Lai, Jung-Yeon Hwang, Soohyung Kim, Andrew Beng Jin Teoh
Comments: 13 pages, 7 figures, 6 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Although a variety of theoretically sound techniques have been proposed to
generate cancellable biometric templates, there is rarely a practical solution
that satisfies non-invertibility, revocability, non-linkability and performance
simultaneously. In this paper, we propose a locality sensitive hashing inspired
cancellable biometrics framework, namely “Index-of-Max” (IoM) hashing. Briefly,
IoM hashing non-linearly transforms a real-valued feature vector into a discrete
index hash code. We demonstrate two realizations of the IoM hashing framework,
namely Gaussian Random Projection based and Uniformly Random Permutation based
hashing schemes. The discrete index representation of IoM hashed codes
enjoys several merits: for instance, it strongly conceals the underlying
biometric information, which provides solid ground for the non-invertibility
guarantee. IoM hashing is insensitive to the magnitude of features and hence is
more robust against biometric feature variations; its magnitude-independence
trait makes the resultant hash codes scale-invariant, which is critical for
matching and feature alignment. The experimental results demonstrate reasonable
accuracy performance on benchmark FVC2002 and FVC2004 fingerprint databases.
Moreover, our analyses justify its resilience to existing and newly introduced
security and privacy attacks, and show that it satisfies the revocability and
unlinkability criteria of cancellable biometrics. Besides, the implementation
of IoM hashing is remarkably simple for practical applications.
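A minimal numpy sketch of the Gaussian Random Projection realization (hypothetical parameter names; the random seed plays the role of the user-specific token): each output symbol is the index of the maximum among a few random projections, so only the ranking of responses, never the feature magnitudes, is retained:

    import numpy as np

    def iom_grp_hash(x, n_codes=16, n_proj=8, seed=42):
        # seed acts as the user-specific token; revoking = re-seeding.
        rng = np.random.default_rng(seed)
        R = rng.normal(size=(n_codes, n_proj, x.size))
        return np.einsum('kpd,d->kp', R, x).argmax(axis=1)

    x = np.random.default_rng(0).normal(size=64)
    print(iom_grp_hash(x))                                   # discrete indices
    print((iom_grp_hash(x) == iom_grp_hash(3.0 * x)).all())  # True: scale-invariant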
Ke Gong, Xiaodan Liang, Xiaohui Shen, Liang Lin
Comments: Accepted to appear in CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Human parsing has recently attracted a lot of research interest due to its
huge application potential. However, existing datasets have a limited number of
images and annotations, and lack variety in human appearance and coverage of
challenging cases in unconstrained environments. In this paper, we
introduce a new benchmark “Look into Person (LIP)” that makes a significant
advance in terms of scalability, diversity and difficulty, a contribution that
we feel is crucial for future developments in human-centric analysis. This
comprehensive dataset contains over 50,000 elaborately annotated images with 19
semantic part labels, captured from a wider range of viewpoints and with
heavier occlusions and more complex backgrounds. Given these rich annotations we perform
detailed analyses of the leading human parsing approaches, gaining insights
into the successes and failures of these methods. Furthermore, in contrast to the
existing efforts on improving the feature discriminative capability, we solve
human parsing by exploring a novel self-supervised structure-sensitive learning
approach, which imposes human pose structures into parsing results without
resorting to extra supervision (i.e., no need for specifically labeling human
joints in model training). Our self-supervised learning framework can be
injected into any advanced neural networks to help incorporate rich high-level
knowledge regarding human joints from a global perspective and improve the
parsing results. Extensive evaluations on our LIP and the public
PASCAL-Person-Part dataset demonstrate the superiority of our method.
Dingding Cai, Ke Chen, Yanlin Qian, Joni-Kristian Kämäräinen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Successful fine-grained image classification methods learn subtle details
between visually similar (sub-)classes, but the problem becomes significantly
more challenging if the details are missing due to low resolution. Encouraged
by the recent success of Convolutional Neural Network (CNN) architectures in
image classification, we propose a novel resolution-aware deep model which
combines convolutional image super-resolution and convolutional fine-grained
classification into a single model in an end-to-end manner. Extensive
experiments on the Stanford Cars and Caltech-UCSD Birds 200-2011 benchmarks
demonstrate that the proposed model consistently performs better than a
conventional convolutional network at classifying fine-grained object classes in
low-resolution images.
Peter van Beek, R. Wayne Oldford
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
White balancing is a fundamental step in the image processing pipeline. The
process involves estimating the chromaticity of the illuminant or light source
and using the estimate to correct the image to remove any color cast. Given the
importance of the problem, there has been much previous work on illuminant
estimation. Recently, an approach based on ensembles of univariate regression
trees that are fit using the squared-error loss function has been proposed and
shown to give excellent performance. In this paper, we show that a simpler and
more accurate ensemble model can be learned by (i) using multivariate
regression trees to take into account that the chromaticity components of the
illuminant are correlated and constrained, and (ii) fitting each tree by
directly minimizing a loss function of interest—such as recovery angular
error or reproduction angular error—rather than indirectly using the
squared-error loss function as a surrogate. We show empirically that overall
our method leads to improved performance on diverse image sets.
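For reference, the recovery angular error mentioned above is just the angle between the estimated and ground-truth illuminant vectors, which a tree can be fit to minimize directly (a standard sketch):

    import numpy as np

    def recovery_angular_error(est, truth):
        # Angle in degrees between estimated and true illuminant chromaticity.
        cos = est @ truth / (np.linalg.norm(est) * np.linalg.norm(truth))
        return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

    print(recovery_angular_error(np.array([0.9, 1.0, 1.1]),
                                 np.array([1.0, 1.0, 1.0])))  # ~4.7 degrees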
Soumyabrata Dev, Shilpa Manandhar, Feng Yuan, Yee Hui Lee, Stefan Winkler
Comments: Accepted in Proc. IEEE AP-S Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, 2017
Subjects: Atmospheric and Oceanic Physics (physics.ao-ph); Computer Vision and Pattern Recognition (cs.CV)
The analysis of clouds in the earth’s atmosphere is important for a variety
of applications, viz. weather reporting, climate forecasting, and solar energy
generation. In this paper, we focus our attention on the impact of cloud on the
total solar irradiance reaching the earth’s surface. We use a weather station
to record the total solar irradiance. Moreover, we employ a collocated
ground-based sky camera to automatically compute the instantaneous cloud
coverage. We
analyze the relationship between measured solar irradiance and computed cloud
coverage value, and conclude that higher cloud coverage greatly impacts the
total solar irradiance. Such studies will immensely help in solar energy
generation and forecasting.
Leonie Zeune, Stephan A. van Gils, Leon W.M.M. Terstappen, Christoph Brune
Comments: 13 pages, 7 figures, conference SSVM 2017
Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Spectral Theory (math.SP)
This paper focuses on multi-scale approaches for variational methods and
corresponding gradient flows. Recently, for convex regularization functionals
such as total variation, new theory and algorithms for nonlinear eigenvalue
problems via nonlinear spectral decompositions have been developed. Those
methods open new directions for advanced image filtering. However, for an
effective use in image segmentation and shape decomposition, a clear
interpretation of the spectral response regarding size and intensity scales is
needed but lacking in current approaches. In this context, $L^1$ data
fidelities are particularly helpful due to their interesting multi-scale
properties such as contrast invariance. Hence, the novelty of this work is the
combination of $L^1$-based multi-scale methods with nonlinear spectral
decompositions. We compare $L^1$ with $L^2$ scale-space methods in view of
spectral image representation and decomposition. We show that the contrast
invariant multi-scale behavior of $L^1$-TV promotes sparsity in the spectral
response, providing more informative decompositions. We provide a numerical
method and analyze synthetic and biomedical images for which the decomposition
leads to improved segmentation.
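In standard notation (a sketch, not the paper's exact formulation), the two scale-space models being compared are

$$u_t \in \arg\min_u \tfrac{1}{2}\|u - f\|_{L^2}^2 + t\,\mathrm{TV}(u) \qquad \text{vs.} \qquad u_t \in \arg\min_u \|u - f\|_{L^1} + t\,\mathrm{TV}(u),$$

where $f$ is the input image and $t$ the scale parameter. Contrast invariance of the $L^1$ model means that the scale at which a structure disappears depends on its size rather than its contrast, which is what yields the sparse spectral response described above.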
Denis Volkhonskiy, Ivan Nazarov, Boris Borisenko, Evgeny Burnaev
Comments: 8 pages, 2 figures, Workshop on Adversarial Training (NIPS 2016, Barcelona, Spain)
Subjects: Multimedia (cs.MM); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
Steganography is a collection of methods for hiding secret information (“payload”)
within non-secret information (“container”). Its counterpart, Steganalysis, is
the practice of determining if a message contains a hidden payload, and
recovering it if possible. Presence of hidden payloads is typically detected by
a binary classifier. In the present study, we propose a new model for
generating image-like containers based on Deep Convolutional Generative
Adversarial Networks (DCGAN). This approach allows generating more
steganalysis-secure message embeddings using standard steganography algorithms.
Experiment results demonstrate that the new model successfully deceives the
steganography analyzer, and for this reason, can be used in steganographic
applications.
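For concreteness, the embedding stage that a generated container would feed into can be as simple as least-significant-bit replacement (a naive illustration only; the paper's setting assumes standard, more sophisticated steganographic algorithms):

    import numpy as np

    def lsb_embed(container, payload_bits):
        # Returns a stego copy with payload bits written into the least
        # significant bits of the first len(payload_bits) pixels.
        flat = container.flatten()              # flatten() copies
        n = len(payload_bits)
        flat[:n] = (flat[:n] & 0xFE) | payload_bits
        return flat.reshape(container.shape)

    img = np.random.default_rng(0).integers(0, 256, (8, 8), dtype=np.uint8)
    bits = np.array([1, 0, 1, 1, 0, 1, 0, 0], dtype=np.uint8)
    print((lsb_embed(img, bits).flatten()[:8] & 1) == bits)  # all True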
Yazhou Yao, Jian Zhang, Fumin Shen, Xiansheng Hua, Wankou Yang, Zhenmin Tang
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
Studies show that refining real-world categories into semantic subcategories
contributes to better image modeling and classification. Previous image
sub-categorization work relying on labeled images and WordNet’s hierarchy is
not only labor-intensive, but also restricted to classify images into NOUN
subcategories. To tackle these problems, in this work, we exploit general
corpus information to automatically select and subsequently classify web images
into semantically rich (sub-)categories. The following two major challenges are
well studied: 1) noise in the labels of subcategories derived from the general
corpus; 2) noise in the labels of images retrieved from the web. Specifically,
we first obtain semantically refined subcategories from the text perspective
and remove the noise with a relevance-based approach. To suppress noisy images
induced by search error, we then formulate image selection and classifier
learning as a multi-class multi-instance learning problem and propose to solve
the resulting problem with the cutting-plane algorithm. The experiments show
significant performance gains from using the data generated by our approach on both
image categorization and sub-categorization tasks. The proposed approach also
consistently outperforms existing weakly supervised and web-supervised
approaches.
Justinas Miseikis, Matthias Ruther, Bernhard Walzel, Mario Hirz, Helmut Brunner
Comments: 6 pages, 8 figures, OAGM and ARW Joint Workshop 2017 on Vision, Automation and Robotics
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Electric vehicles (EVs) and plug-in hybrid vehicles (PHEVs) are rapidly
gaining popularity on our roads. Besides a comparatively high purchasing price,
the main two problems limiting their use are the short driving range and
inconvenient charging process. In this paper we address the latter by
presenting an automatic robot-based charging station with 3D vision guidance
for plugging and unplugging the charger. First of all, the whole system concept,
consisting of a 3D vision system, a UR10 robot and a charging station, is
presented. Then we show the shape-based matching methods used to successfully
identify and get the exact pose of the charging port. The same approach is used
to calibrate the camera-robot system by using just the known structure of the
connector plug and no additional markers. Finally, a three-step robot motion
planning procedure for plug-in is presented and functionality is demonstrated
in a series of successful experiments.
A. Asensio Ramos (1,2), I. S. Requerey (1,2), N. Vitas (1,2) ((1) Instituto de Astrofisica de Canarias, (2) Universidad de La Laguna)
Comments: 9 pages, 5 figures, submitted to A&A
Subjects: Solar and Stellar Astrophysics (astro-ph.SR); Computer Vision and Pattern Recognition (cs.CV)
Many phenomena taking place in the solar photosphere are controlled by plasma
motions. Although the line-of-sight component of the velocity can be estimated
using the Doppler effect, we do not have direct spectroscopic access to the
components that are perpendicular to the line-of-sight. These components are
typically estimated using methods based on local correlation tracking. We have
designed DeepVel, an end-to-end deep neural network that produces an estimation
of the velocity at every single pixel and at every time step and at three
different heights in the atmosphere from just two consecutive continuum images.
We confront DeepVel with local correlation tracking, pointing out that they
give very similar results in the time- and spatially-averaged cases. We use the
network to study the evolution in height of the horizontal velocity field in
fragmenting granules, supporting the buoyancy-braking mechanism for the
formation of intergranular lanes in these granules. We also show that DeepVel
can capture very small vortices, so that we can potentially expand the scaling
cascade of vortices to very small sizes and durations.
Xiao-Fan Niu, Wu-Jun Li
Subjects: Artificial Intelligence (cs.AI)
Knowledge graph embedding aims at translating the knowledge graph into
numerical representations by transforming the entities and relations into
continuous low-dimensional vectors. Recently, many methods [1, 5, 3, 2, 6] have
been proposed to deal with this problem, but existing single-thread
implementations of them are time-consuming for large-scale knowledge graphs.
Here, we design a unified parallel framework to parallelize these methods, which
achieves a significant time reduction without influencing the accuracy. We name
our framework ParaGraphE, which provides a library for parallel knowledge
graph embedding. The source code can be downloaded from
https://github.com/LIBBLE/LIBBLE-MultiThread/tree/master/ParaGraphE.
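For example, TransE-style translation embeddings, typical of the methods such a framework parallelizes, score a triple (h, r, t) by how well h + r approximates t (illustrative numpy, not the ParaGraphE API):

    import numpy as np

    def transe_score(h, r, t):
        # Lower distance = more plausible triple under the model h + r ~ t.
        return np.linalg.norm(h + r - t, ord=1)

    rng = np.random.default_rng(0)
    entities = rng.normal(size=(1000, 50))   # one vector per entity
    relations = rng.normal(size=(20, 50))    # one vector per relation
    print(transe_score(entities[0], relations[3], entities[42]))

Parallelization then amounts to letting worker threads sample triples and apply gradient updates to these shared embedding tables concurrently.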
Yuxin Chen, Jean-Michel Renders, Morteza Haghir Chehreghani, Andreas Krause
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
We consider the optimal value of information (VoI) problem, where the goal is
to sequentially select a set of tests with a minimal cost, so that one can
efficiently make the best decision based on the observed outcomes. Existing
algorithms are either heuristics with no guarantees, or scale poorly (with
exponential run time in terms of the number of available tests). Moreover,
these methods assume a known distribution over the test outcomes, which is
often not the case in practice. We propose an efficient sampling-based online
learning framework to address the above issues. First, assuming the
distribution over hypotheses is known, we propose a dynamic hypothesis
enumeration strategy, which allows efficient information gathering with strong
theoretical guarantees. We show that with a sufficient number of samples, one can
identify a near-optimal decision with high probability. Second, when the
parameters of the hypotheses distribution are unknown, we propose an algorithm
which learns the parameters progressively via posterior sampling in an online
fashion. We further establish a rigorous bound on the expected regret. We
demonstrate the effectiveness of our approach on a real-world interactive
troubleshooting application and show that one can efficiently make high-quality
decisions with low cost.
Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor
Subjects: Artificial Intelligence (cs.AI)
Two-timescale Stochastic Approximation (SA) algorithms are widely used in
Reinforcement Learning (RL). In such methods, the iterates consist of two parts
that are updated using different stepsizes. We develop the first convergence
rate result for these algorithms; in particular, we provide a general
methodology for analyzing two-timescale linear SA. We apply our methodology to
two-timescale RL algorithms such as GTD(0), GTD2, and TDC.
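In generic form (standard notation, not the paper's exact statement), a two-timescale SA scheme couples two iterates updated with stepsizes of different orders:

$$\theta_{n+1} = \theta_n + \alpha_n\, f(\theta_n, w_n), \qquad w_{n+1} = w_n + \beta_n\, g(\theta_n, w_n), \qquad \alpha_n / \beta_n \to 0,$$

so that $w_n$ equilibrates on the fast timescale while $\theta_n$ drifts on the slow one; GTD(0), GTD2 and TDC all fit this template with linear $f$ and $g$.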
Yongjoo Park, Ahmad Shahab Tajik, Michael Cafarella, Barzan Mozafari
Comments: This manuscript is an extended report of the work published in ACM SIGMOD conference 2017
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
In today’s databases, previous query answers rarely benefit answering future
queries. For the first time, to the best of our knowledge, we change this
paradigm in an approximate query processing (AQP) context. We make the
following observation: the answer to each query reveals some degree of
knowledge about the answer to another query because their answers stem from the
same underlying distribution that has produced the entire dataset. Exploiting
and refining this knowledge should allow us to answer queries more
analytically, rather than by reading enormous amounts of raw data. Also,
processing more queries should continuously enhance our knowledge of the
underlying distribution, and hence lead to increasingly faster response times
for future queries.
We call this novel idea—learning from past query answers—Database
Learning. We exploit the principle of maximum entropy to produce answers, which
are in expectation guaranteed to be more accurate than existing sample-based
approximations. Empowered by this idea, we build a query engine on top of Spark
SQL, called Verdict. We conduct extensive experiments on real-world query
traces from a large customer of a major database vendor. Our results
demonstrate that database learning supports 73.7% of these queries, speeding
them up by up to 23.0x for the same accuracy level compared to existing AQP
systems.
Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)
We consider the problem of efficient exploration in finite horizon MDPs. We
show that an optimistic modification to model-based value iteration can
achieve a regret bound $\tilde{O}(\sqrt{HSAT} + H^2S^2A + H\sqrt{T})$ where $H$
is the time horizon, $S$ the number of states, $A$ the number of actions and
$T$ the time elapsed. This result improves over the best previously known bound
$\tilde{O}(HS\sqrt{AT})$ achieved by the UCRL2 algorithm. The key significance
of our new results is that when $T \geq H^3S^3A$ and $SA \geq H$, it leads to a
regret of $\tilde{O}(\sqrt{HSAT})$ that matches the established lower bound of
$\Omega(\sqrt{HSAT})$ up to a logarithmic factor. Our analysis contains two key
insights. We use a careful application of concentration inequalities to the
optimal value function as a whole, rather than to the transition probabilities
(to improve scaling in $S$), and we use “exploration bonuses” based on
Bernstein’s inequality, together with a recursive, Bellman-type, law of total
variance (to improve scaling in $H$).
Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
Comments: Submitted to Interspeech 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)
Keyword spotting (KWS) constitutes a major component of human-technology
interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate,
while minimizing the footprint size, latency and complexity are the goals for
KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks
(CRNNs). Inspired by large-scale state-of-the-art speech recognition systems,
we combine the strengths of convolutional layers and recurrent layers to
exploit local structure and long-range context. We analyze the effect of
architecture parameters, and propose training strategies to improve
performance. With only ~230k parameters, our CRNN model yields acceptably low
latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise
ratio.
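A toy PyTorch sketch of the convolutional-recurrent shape of such a model (illustrative layer sizes, not the paper's architecture):

    import torch
    import torch.nn as nn

    class CRNN(nn.Module):
        # Conv front-end for local spectral structure, GRU for long-range
        # context, dense output for the keyword score.
        def __init__(self, n_mels=40, hidden=64):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU())
            freq_out = (n_mels - 5) // 2 + 1
            self.rnn = nn.GRU(16 * freq_out, hidden, batch_first=True,
                              bidirectional=True)
            self.out = nn.Linear(2 * hidden, 1)

        def forward(self, x):               # x: (batch, 1, time, n_mels)
            z = self.conv(x)
            b, c, t, f = z.shape
            z = z.permute(0, 2, 1, 3).reshape(b, t, c * f)
            h, _ = self.rnn(z)
            return torch.sigmoid(self.out(h[:, -1]))  # keyword probability

    print(CRNN()(torch.randn(2, 1, 101, 40)).shape)   # torch.Size([2, 1])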
Phong-Khac Do, Huy-Tien Nguyen, Chien-Xuan Tran, Minh-Tien Nguyen, Minh-Le Nguyen
Comments: 15 pages, 2 figures, Tenth International Workshop on Juris-informatics (JURISIN 2016) associated with JSAI International Symposia on AI 2016 (IsAI-2016)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
This paper presents a study of employing Ranking SVM and Convolutional Neural
Network for two tasks: legal information retrieval and question answering in
the Competition on Legal Information Extraction/Entailment. For the first task,
our proposed model used a triple of features (LSI, Manhattan, Jaccard), and is
based on paragraph level instead of article level as in previous studies. In
fact, each single-paragraph article corresponds to a particular paragraph in a
huge multiple-paragraph article. For the legal question answering task,
additional statistical features from information retrieval task integrated into
Convolutional Neural Network contribute to higher accuracy.
Myunga Jang, Jinho D. Choi, James Allan
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Technical documents contain a fair amount of unnatural language, such as
tables, formulas, pseudo-code, etc. Unnatural language can be a significant
source of confusion for existing NLP tools. This paper presents an effective method
of distinguishing unnatural language from natural language, and evaluates the
impact of unnatural language detection on NLP tasks such as document
clustering. We view this problem as an information extraction task and build a
multiclass classification model that classifies unnatural language components into
four categories. First, we create a new annotated corpus by collecting slides
and papers in various formats, PPT, PDF, and HTML, where unnatural language
components are annotated into four categories. We then explore features
available from plain text to build a statistical model that can handle any
format as long as it is converted into plain text. Our experiments show that
removing unnatural language components gives an absolute improvement of up to
15% in document clustering. Our corpus and tool are publicly available.
Wenli Zhuang, Ernie Chang
Subjects: Computation and Language (cs.CL)
This paper describes a neural-network model which performed competitively
(top 6) at the SemEval 2017 cross-lingual Semantic Textual Similarity (STS)
task. Our system employs an attention-based recurrent neural network model that
optimizes the sentence similarity. In this paper, we describe our participation
in the multilingual STS task which measures similarity across English, Spanish,
and Arabic.
Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin
Subjects: Computation and Language (cs.CL)
End-to-end design of dialogue systems has recently become a popular research
topic thanks to powerful tools such as encoder-decoder architectures for
sequence-to-sequence learning. Yet, most current approaches cast human-machine
dialogue management as a supervised learning problem, aiming at predicting the
next utterance of a participant given the full history of the dialogue. This
vision is too simplistic to render the intrinsic planning problem inherent to
dialogue as well as its grounded nature, making the context of a dialogue
larger than the sole history. This is why only chit-chat and question answering
tasks have been addressed so far using end-to-end architectures. In this paper,
we introduce a Deep Reinforcement Learning method to optimize visually grounded
task-oriented dialogues, based on the policy gradient algorithm. This approach
is tested on a dataset of 120k dialogues collected through Mechanical Turk and
provides encouraging results at solving both the problem of generating natural
dialogues and the task of discovering a specific object in a complex picture.
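The underlying estimator is the standard policy gradient (REINFORCE with a baseline $b$), sketched here in generic form:

$$\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\Big[\sum_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,\big(R(\tau) - b\big)\Big],$$

where the reward $R$ can be as sparse as task success (e.g., whether the correct object was found), so no per-utterance supervision is required.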
Bas van IJzendoorn
Comments: responsible teacher: Johan Pouwelse
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR); Computers and Society (cs.CY)
Online trust systems play an important role in today’s world, and building
them poses various challenges. Billions of dollars of products and
services are traded through electronic commerce, files are shared among large
peer-to-peer networks and smart contracts can potentially replace paper
contracts with digital contracts. These systems rely on trust mechanisms in
peer-to-peer networks like reputation systems or a trustless public ledger. In
most cases, reputation systems are built to determine the trustworthiness of
users and to provide incentives for users to make a fair contribution to the
peer-to-peer network. The main challenges are how to set up a good trust
system, how to deal with security issues and how to deal with strategic users
trying to cheat the system. The Sybil attack, the most important attack on
reputation systems, is discussed. Finally, matchmaking in two-sided markets and
the strategy-proofness of these markets are discussed.
Blair Archibald, Ciaran McCreesh, Patrick Maier, Rob Stewart, Phil Trinder
Comments: 38 pages, 12 figures, submitted to the Journal of Parallel and Distributed Computing
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Branch and bound searches are a common technique for solving global
optimisation and decision problems, yet their irregularity, search order
dependence, and the need to share bound information globally makes it
challenging to implement them in parallel, and to reason about their parallel
performance.
We identify three key parallel search properties for replicable branch and
bound implementations: Sequential Lower Bound, Non-increasing Runtimes, and
Repeatability. We define a formal model for parallel branch and bound search
problems and show its generality by using it to define three benchmarks:
finding a Maximum Clique in a graph, 0/1 Knapsack and Travelling Salesperson
(TSP). We present a Generic Branch and Bound search API that conforms to the
model. For reusability we encapsulate the search behaviours as a pair of
algorithmic skeletons in a distributed-memory parallel Haskell. Crucially,
the Ordered skeleton is designed to guarantee the parallel search properties,
potentially at a performance cost compared with the Unordered skeleton.
We compare the sequential performance of the skeletons with a class leading
C++ search implementation. We then use relative speedups to evaluate the
skeletons for 40 benchmark instances on a cluster using 200 workers. The
Ordered skeleton preserves the Sequential Lower Bound for all benchmark
instances while the Unordered skeleton violates the property for 5 TSP
instances. The Ordered skeleton preserves Non-increasing Runtimes for all
benchmark instances while the Unordered skeleton violates the property for many
instances of all three benchmarks. The Ordered skeleton delivers far more
repeatable performance than the Unordered skeleton (Repeatability property)
with a median relative standard deviation (RSD) of 1.78% vs 5.56%, 1.83% vs
87.56% and 1.96% vs 8.61% for all Maximum Clique, Knapsack and TSP instances
respectively.
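As a reference point for the model, a sequential 0/1 knapsack branch and bound fits in a few lines (a textbook sketch, not the paper's Haskell skeletons); the globally shared incumbent best is exactly the bound information that is hard to manage in parallel:

    def knapsack_bb(values, weights, capacity):
        # Branch on taking/skipping each item; prune subtrees whose
        # fractional (LP) upper bound cannot beat the incumbent.
        order = sorted(range(len(values)),
                       key=lambda i: values[i] / weights[i], reverse=True)
        best = 0

        def bound(i, value, room):          # fractional relaxation bound
            for j in order[i:]:
                if weights[j] <= room:
                    room -= weights[j]; value += values[j]
                else:
                    return value + values[j] * room / weights[j]
            return value

        def search(i, value, room):
            nonlocal best
            best = max(best, value)
            if i == len(order) or bound(i, value, room) <= best:
                return                      # prune
            j = order[i]
            if weights[j] <= room:
                search(i + 1, value + values[j], room - weights[j])
            search(i + 1, value, room)

        search(0, 0, capacity)
        return best

    print(knapsack_bb([60, 100, 120], [10, 20, 30], 50))  # 220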
Christian Schulz, Jesper Larsson Träff
Comments: arXiv admin note: text overlap with arXiv:1311.1714
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)
This paper serves as a user guide to the mapping framework VieM (Vienna
Mapping and Sparse Quadratic Assignment). We give a rough overview of the
techniques used within the framework and describe the user interface as well as
the file formats used.
Zhuolun Xiang, Nitin H. Vaidya
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Distributed shared memory systems maintain multiple replicas of the shared
memory locations. Maintaining causal consistency in such systems has received
significant attention in the past. However, much of the previous literature
focuses on full replication wherein each replica stores a copy of all the
locations in the shared memory. In this paper, we investigate causal
consistency in partially replicated systems, wherein each replica may store
only a subset of the shared data. To achieve causal consistency, it is
necessary to ensure that, before an update is performed at any given replica,
all causally preceding updates must also be performed. Achieving this goal
requires some mechanism to track causal dependencies. In the context of full
replication, this goal is often achieved using vector timestamps, with the
number of vector elements being equal to the number of replicas. Building on
the past work, this paper makes three key contributions: 1. We develop lower
bounds on the size of the timestamps that must be maintained in order to
achieve causal consistency in partially replicated systems. The size of the
timestamps is a function of the manner in which the replicas share data, and
the set of replicas accessed by each client. 2. We present an algorithm to
achieve causal consistency in partially replicated systems using simple vector
timestamps. 3. We present some optimizations to improve the overhead of the
timestamps required with partial replication.
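A minimal sketch of the full-replication baseline, vector timestamps with one entry per replica (hypothetical class, for illustration): a remote update is applied only after every causally preceding update has been applied locally:

    class Replica:
        def __init__(self, rid, n):
            self.rid, self.vt = rid, [0] * n    # one entry per replica

        def local_update(self):
            self.vt[self.rid] += 1
            return self.rid, list(self.vt)      # timestamp piggybacked

        def can_apply(self, sender, vt):
            # Next update from sender, and no other dependency missing.
            return (vt[sender] == self.vt[sender] + 1 and
                    all(vt[k] <= self.vt[k]
                        for k in range(len(vt)) if k != sender))

        def apply(self, sender, vt):
            assert self.can_apply(sender, vt)
            self.vt[sender] += 1

    a, b = Replica(0, 2), Replica(1, 2)
    print(b.can_apply(*a.local_update()))       # True: no missing updates

The paper's question is how much smaller such timestamps can be made when each replica stores, and each client accesses, only part of the shared data.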
Mitar Milutinovic, Warren He, Howard Wu, Maxinder Kanwal
Comments: SysTEX ’16, December 12-16, 2016, Trento, Italy
Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)
In this paper, we present designs for multiple blockchain consensus primitives
and a novel blockchain system, all based on the use of trusted execution
environments (TEEs), such as Intel SGX-enabled CPUs. First, we show how using
TEEs for existing proof of work schemes can make mining equitably distributed
by preventing the use of ASICs. Next, we extend the design with proof of time
and proof of ownership consensus primitives to make mining energy- and
time-efficient. Further improving on these designs, we present a blockchain
using a proof of luck consensus protocol. Our proof of luck blockchain uses a
TEE platform’s random number generation to choose a consensus leader, which
offers low-latency transaction validation, deterministic confirmation time,
negligible energy consumption, and equitably distributed mining. Lastly, we
discuss a potential protection against up to a constant number of compromised
TEEs.
Francesco Orsini, Daniele Baracchi, Paolo Frasconi
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
We introduce an architecture based on deep hierarchical decompositions to
learn effective representations of large graphs. Our framework extends classic
R-decompositions used in kernel methods, enabling nested “part-of-part”
relations. Unlike recursive neural networks, which unroll a template on input
graphs directly, we unroll a neural network template over the decomposition
hierarchy, allowing us to deal with the high degree variability that typically
characterizes social network graphs. Deep hierarchical decompositions are also
amenable to domain compression, a technique that reduces both space and time
complexity by exploiting symmetries. We show empirically that our approach is
competitive with current state-of-the-art graph classification methods,
particularly when dealing with social network datasets.
Tien Thanh Nguyen, Xuan Cuong Pham, Alan Wee-Chung Liew, Witold Pedrycz
Comments: 33 pages, 3 figures
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
In this study, we introduce a new approach to combine multi-classifiers in an
ensemble system. Instead of using numeric membership values encountered in
fixed combining rules, we construct interval membership values associated with
each class prediction at the level of meta-data of observation by using
concepts of information granules. In the proposed method, uncertainty
(diversity) of findings produced by the base classifiers is quantified by
interval-based information granules. The discriminative decision model is
generated by considering both the bounds and the length of the obtained
intervals. We select ten and then fifteen learning algorithms to build a
heterogeneous ensemble system, and conduct experiments on a number of
UCI datasets. The experimental results demonstrate that the proposed approach
performs better than the benchmark algorithms including six fixed combining
methods, one trainable combining method, AdaBoost, Bagging, and Random
Subspace.
Sainbayar Sukhbaatar, Ilya Kostrikov, Arthur Szlam, Rob Fergus
Subjects: Learning (cs.LG)
We describe a simple scheme that allows an agent to explore its environment
in an unsupervised manner. Our scheme pits two versions of the same agent,
Alice and Bob, against one another. Alice proposes a task for Bob to complete;
and then Bob attempts to complete the task. In this work we will focus on
(nearly) reversible environments, or environments that can be reset, and Alice
will “propose” the task by running a set of actions, and then Bob must
partially undo or repeat them, respectively. Via an appropriate reward structure, Alice
and Bob automatically generate a curriculum of exploration, enabling
unsupervised training of the agent. When deployed on an RL task within the
environment, this unsupervised training reduces the number of episodes needed
to learn.
Vijayaraghavan Murali, Swarat Chaudhuri, Chris Jermaine
Subjects: Programming Languages (cs.PL); Learning (cs.LG)
We present a data-driven approach to the problem of inductive computer
program synthesis. Our method learns a probabilistic model for real-world
programs from a corpus of existing code. It uses this model during synthesis to
automatically infer a posterior distribution over sketches, or syntactic models
of the problem to be synthesized. Sketches sampled from this posterior are then
used to drive combinatorial synthesis of a program in a high-level programming
language.
The key technical innovation of our approach — embodied in a system called
Bayou — is utilizing user-supplied evidence as to the program’s desired
behavior, along with a Bayesian update, to obtain a posterior distribution over
the program’s true, latent specification (indicating user intent), which in
turn produces a posterior over possible sketches. As we show experimentally,
explicitly modeling uncertainty in specification significantly increases the
accuracy of the synthesis algorithm. We evaluate Bayou’s ability to synthesize
Java and Android methods. We find that using just a few example API sequences
to communicate user intent, Bayou can synthesize complex method bodies, some
implementing tasks never encountered during training.
David Belanger, Bishan Yang, Andrew McCallum
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Structured Prediction Energy Networks (Belanger and McCallum, 2016) (SPENs)
are a simple, yet expressive family of structured prediction models. An energy
function over candidate structured outputs is given by a deep network, and
predictions are formed by gradient-based optimization. Unfortunately, we have
struggled to apply the structured SVM (SSVM) learning method of Belanger and
McCallum, 2016 to applications with more complex structure than multi-label
classification. In general, SSVMs are unreliable whenever exact energy
minimization is intractable. In response, we present end-to-end learning for
SPENs, where the energy function is discriminatively trained by
back-propagating through gradient-based prediction. This paper presents a
collection of methods necessary to apply the technique to problems with complex
structure. For example, we avoid vanishing gradients when learning SPENs for
convex relaxations of discrete prediction problems and explicitly train models
such that energy minimization converges quickly in practice. Using end-to-end
learning, we demonstrate the power of SPENs on 7-Scenes depth image denoising
and CoNLL-2005 semantic role labeling tasks. In both, we outperform competitive
baselines that employ more simplistic energy functions, but perform exact
energy minimization. In particular, for denoising we achieve 40 PSNR,
outperforming the previous state-of-the-art of 36.
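In generic form (standard notation, not the paper's exact statement), prediction unrolls $T$ gradient steps on the energy, and learning backpropagates the task loss through that unrolled computation:

$$y_{t+1} = y_t - \eta\,\nabla_y E_\Theta(x, y_t), \quad t = 0, \dots, T-1, \qquad \min_\Theta\; L\big(y_T(x; \Theta),\, y^*\big),$$

so the energy is shaped not merely to rank the correct output lowest, but so that a few gradient steps actually reach it.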
Ignacio Rocco, Relja Arandjelović, Josef Sivic
Comments: To appear in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
We address the problem of determining correspondences between two images in
agreement with a geometric model such as an affine or thin-plate-spline
transformation, and estimating its parameters. The contributions of this work
are three-fold. First, we propose a convolutional neural network architecture
for geometric matching. The architecture is based on three main components that
mimic the standard steps of feature extraction, matching and simultaneous
inlier detection and model parameter estimation, while being trainable
end-to-end. Second, we demonstrate that the network parameters can be trained
from synthetically generated imagery without the need for manual annotation and
that our matching layer significantly improves generalization to previously
unseen images. Finally, we show that the same model can perform both
instance-level and category-level matching giving state-of-the-art results on
the challenging Proposal Flow dataset.
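The matching component can be illustrated with a dense correlation layer: every feature position of one image is compared with every position of the other. The PyTorch sketch below reflects the general idea (our own simplification, not the authors' released code).

    import torch
    import torch.nn.functional as F

    def correlation_matching(feat_a, feat_b):
        # feat_a, feat_b: (B, C, H, W) feature maps from a shared CNN.
        b, c, h, w = feat_a.shape
        # L2-normalize descriptors so correlation equals cosine similarity.
        fa = F.normalize(feat_a.view(b, c, h * w), dim=1)   # (B, C, HW)
        fb = F.normalize(feat_b.view(b, c, h * w), dim=1)   # (B, C, HW)
        corr = torch.bmm(fa.transpose(1, 2), fb)            # (B, HW, HW)
        corr = F.relu(corr)                                 # keep positive matches
        # One (H, W) similarity map per position of image A; a small
        # regressor on this volume can then output the transformation
        # parameters.
        return corr.view(b, h * w, h, w)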
Erwin Quiring, Daniel Arp, Konrad Rieck
Subjects: Cryptography and Security (cs.CR); Learning (cs.LG)
Machine learning is increasingly used in security-critical applications, such
as autonomous driving, face recognition and malware detection. Most learning
methods, however, have not been designed with security in mind and thus are
vulnerable to different types of attacks. This problem has motivated the
research field of adversarial machine learning that is concerned with attacking
and defending learning methods. Concurrently, a different line of research has
tackled a very similar problem: in digital watermarking, information is
embedded in a signal in the presence of an adversary. As a consequence, this
research field has also extensively studied techniques for attacking and
defending watermarking methods.
The two research communities have worked in parallel so far, independently
developing similar attack and defense strategies. This paper is a first effort
to bring these communities together. To this end, we present a unified notation
of black-box attacks against machine learning and watermarking that reveals the
similarity of both settings. To demonstrate the efficacy of this unified view,
we apply concepts from watermarking to machine learning and vice versa. We show
that countermeasures from watermarking can mitigate recent model-extraction
attacks and, similarly, that techniques for hardening machine learning can fend
off oracle attacks against watermarks. Our work provides a conceptual link
between two research fields and thereby opens novel directions for improving
the security of both machine learning and digital watermarking.
Oscar De Somer, Ana Soares, Tristan Kuijpers, Koen Vossen, Koen Vanthournout, Fred Spiessens
Comments: Submitted to IEEE ISGT Europe 2017
Subjects: Systems and Control (cs.SY); Learning (cs.LG)
This paper demonstrates a data-driven control approach for demand response in
real-life residential buildings. The objective is to optimally schedule the
heating cycles of the Domestic Hot Water (DHW) buffer to maximize the
self-consumption of the local photovoltaic (PV) production. A model-based
reinforcement learning technique is used to tackle the underlying sequential
decision-making problem. The proposed algorithm learns the stochastic occupant
behavior, predicts the PV production and takes into account the dynamics of the
system. A real-life experiment with six residential buildings is performed
using this algorithm. The results show that the self-consumption of the PV
production is significantly increased, compared to the default thermostat
control.
Yuxin Chen, Jean-Michel Renders, Morteza Haghir Chehreghani, Andreas Krause
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
We consider the optimal value of information (VoI) problem, where the goal is
to sequentially select a set of tests with a minimal cost, so that one can
efficiently make the best decision based on the observed outcomes. Existing
algorithms are either heuristics with no guarantees, or scale poorly (with
exponential run time in terms of the number of available tests). Moreover,
these methods assume a known distribution over the test outcomes, which is
often not the case in practice. We propose an efficient sampling-based online
learning framework to address the above issues. First, assuming the
distribution over hypotheses is known, we propose a dynamic hypothesis
enumeration strategy, which allows efficient information gathering with strong
theoretical guarantees. We show that with a sufficient number of samples, one can
identify a near-optimal decision with high probability. Second, when the
parameters of the hypotheses distribution are unknown, we propose an algorithm
which learns the parameters progressively via posterior sampling in an online
fashion. We further establish a rigorous bound on the expected regret. We
demonstrate the effectiveness of our approach on a real-world interactive
troubleshooting application and show that one can efficiently make high-quality
decisions with low cost.
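The core computation behind sequential test selection is the expected reduction in uncertainty per unit cost. The NumPy sketch below shows a greedy expected-information-gain step over a discrete hypothesis set (an illustrative baseline; the paper's dynamic hypothesis enumeration and posterior sampling go beyond this).

    import numpy as np

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    def select_test(post, outcome_lik, costs):
        # post: (n_hyp,) posterior over hypotheses.
        # outcome_lik: (n_tests, n_hyp, n_out), P(outcome | hypothesis, test).
        # costs: (n_tests,) test costs.
        h0 = entropy(post)
        best, best_score = None, -np.inf
        for t in range(outcome_lik.shape[0]):
            joint = outcome_lik[t] * post[:, None]   # (n_hyp, n_out)
            p_out = joint.sum(axis=0)
            # Expected posterior entropy after observing the outcome of t.
            exp_h = sum(p_out[o] * entropy(joint[:, o] / p_out[o])
                        for o in range(len(p_out)) if p_out[o] > 0)
            score = (h0 - exp_h) / costs[t]          # info gain per unit cost
            if score > best_score:
                best, best_score = t, score
        return best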
Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)
We consider the problem of efficient exploration in finite-horizon MDPs. We
show that an optimistic modification to model-based value iteration can
achieve a regret bound \(\tilde{O}(\sqrt{HSAT} + H^2S^2A + H\sqrt{T})\) where \(H\)
is the time horizon, \(S\) the number of states, \(A\) the number of actions and
\(T\) the time elapsed. This result improves over the best previously known bound
\(\tilde{O}(HS\sqrt{AT})\) achieved by the UCRL2 algorithm. The key significance
of our new results is that when \(T \geq H^3S^3A\) and \(SA \geq H\), they lead to a
regret of \(\tilde{O}(\sqrt{HSAT})\) that matches the established lower bound of
\(\Omega(\sqrt{HSAT})\) up to a logarithmic factor. Our analysis contains two key
insights. We use careful application of concentration inequalities to the
optimal value function as a whole, rather than to the transition probabilities
(to improve scaling in \(S\)), and we use “exploration bonuses” based on
Bernstein’s inequality, together with a recursive Bellman-type Law of
Total Variance (to improve scaling in \(H\)).
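The optimistic modification is easy to state in code: run finite-horizon value iteration on the empirical model with a bonus that shrinks with the visit count. The NumPy sketch below uses a simple Hoeffding-style bonus as a stand-in for the paper's Bernstein-based bonuses (the constant c is a hypothetical tuning parameter).

    import numpy as np

    def optimistic_vi(P_hat, R_hat, N, H, c=1.0):
        # P_hat: (S, A, S) empirical transitions; R_hat: (S, A) empirical
        # rewards; N: (S, A) visit counts; H: horizon.
        S, A = R_hat.shape
        bonus = c * H / np.sqrt(np.maximum(N, 1))      # optimism bonus
        V = np.zeros((H + 1, S))
        pi = np.zeros((H, S), dtype=int)
        for h in range(H - 1, -1, -1):
            Q = R_hat + bonus + P_hat @ V[h + 1]       # (S, A)
            V[h] = np.minimum(Q.max(axis=1), H - h)    # clip to value range
            pi[h] = Q.argmax(axis=1)
        return V, pi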
Ke Gong, Xiaodan Liang, Xiaohui Shen, Liang Lin
Comments: Accepted to appear in CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)
Human parsing has recently attracted substantial research interest due to its
broad application potential. However, existing datasets contain a limited
number of images and annotations, and lack variety in human appearance and
coverage of challenging cases in unconstrained environments. In this paper, we
introduce a new benchmark “Look into Person (LIP)” that makes a significant
advance in terms of scalability, diversity and difficulty, a contribution that
we feel is crucial for future developments in human-centric analysis. This
comprehensive dataset contains over 50,000 elaborately annotated images with 19
semantic part labels, which are captured from a wider range of viewpoints,
occlusions and background complexity. Given these rich annotations we perform
detailed analyses of the leading human parsing approaches, gaining insights
into the successes and failures of these methods. Furthermore, in contrast to
existing efforts on improving feature discriminative capability, we address
human parsing with a novel self-supervised structure-sensitive learning
approach, which imposes human pose structure onto parsing results without
resorting to extra supervision (i.e., no need for specifically labeling human
joints in model training). Our self-supervised learning framework can be
injected into any advanced neural networks to help incorporate rich high-level
knowledge regarding human joints from a global perspective and improve the
parsing results. Extensive evaluations on our LIP and the public
PASCAL-Person-Part dataset demonstrate the superiority of our method.
Kiran Bangalore Ravi, Jean Serra
Comments: Accepted at ISMM 2017
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Random forests perform bootstrap aggregation by sampling the training data
with replacement. This enables the evaluation of the out-of-bag error, which
serves as an internal cross-validation mechanism. Our motivation lies in using
the unsampled training samples to improve each decision tree in the ensemble.
We study the effect of using the out-of-bag samples to reduce the
generalization error, first of the individual decision trees and then of the
random forest, by post-pruning. A preliminary empirical study on four UCI
repository datasets shows a consistent decrease in the size of the forests
without considerable loss in accuracy.
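The mechanism can be reproduced with plain scikit-learn parts. The sketch below (our own illustration, not the paper's code) fits bagged trees that remember their out-of-bag rows, so each tree can later be scored, or post-pruned, on data it never saw.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def fit_forest_with_oob(X, y, n_trees=100, seed=0):
        rng = np.random.default_rng(seed)
        n = len(X)
        forest = []
        for _ in range(n_trees):
            boot = rng.integers(0, n, size=n)         # sample with replacement
            oob = np.setdiff1d(np.arange(n), boot)    # the unsampled rows
            tree = DecisionTreeClassifier(random_state=0).fit(X[boot], y[boot])
            forest.append((tree, oob))
        return forest

    def oob_errors(forest, X, y):
        # Per-tree out-of-bag error: an internal cross-validation signal
        # that can drive post-pruning, e.g. selecting each tree's
        # cost-complexity parameter by its own OOB score.
        return [1.0 - tree.score(X[oob], y[oob])
                for tree, oob in forest if len(oob) > 0]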
Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
Comments: Submitted to Interspeech 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)
Keyword spotting (KWS) constitutes a major component of human-technology
interfaces. The goals for KWS are to maximize the detection accuracy at a low
false alarm (FA) rate while minimizing the footprint size, latency and
complexity. Towards achieving them, we study Convolutional Recurrent Neural Networks
(CRNNs). Inspired by large-scale state-of-the-art speech recognition systems,
we combine the strengths of convolutional layers and recurrent layers to
exploit local structure and long-range context. We analyze the effect of
architecture parameters, and propose training strategies to improve
performance. With only ~230k parameters, our CRNN model yields acceptably low
latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise
ratio.
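The convolutional-plus-recurrent structure can be sketched compactly in PyTorch. Layer sizes below are illustrative only, not the paper's ~230k-parameter configuration.

    import torch.nn as nn

    class CRNN(nn.Module):
        # Input: (B, 1, T, F) log-mel spectrograms with F = n_mels bins.
        def __init__(self, n_mels=40, hidden=64):
            super().__init__()
            # Convolution captures local time-frequency structure.
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=(5, 3), stride=(2, 2)),
                nn.ReLU(),
            )
            freq_out = (n_mels - 3) // 2 + 1
            # Recurrence captures long-range temporal context.
            self.rnn = nn.GRU(input_size=16 * freq_out, hidden_size=hidden,
                              batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden, 2)   # keyword vs. background

        def forward(self, x):
            z = self.conv(x)                      # (B, C, T', F')
            b, c, t, f = z.shape
            z = z.permute(0, 2, 1, 3).reshape(b, t, c * f)
            h, _ = self.rnn(z)
            return self.out(h[:, -1])             # classify from the last frame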
Alexander A. Okandeji, Muhammad R. A. Khandaker, Kai-Kit Wong, Gan Zheng, Yangyang Zhang, Zhongbin Zheng
Subjects: Information Theory (cs.IT)
This paper investigates a multiuser multiple-input single-output (MISO)
full-duplex (FD) system for simultaneous wireless information and power
transfer (SWIPT), in which a multi-antenna base station (BS) simultaneously
sends information and power wirelessly to a set of single-antenna mobile
stations (MSs) using power splitters (PSs) in the downlink and receives
information in the uplink in FD mode. In particular, we address the joint
design of the receive PS ratio and the transmit power at the MSs, and the
beamforming matrix at the BS under signal-to-interference-plus-noise ratio
(SINR) and the harvested power constraints. Using semidefinite relaxation
(SDR), we obtain the solution to the problem with imperfect channel state
information (CSI) of the self-interfering channels. Furthermore, we propose
another suboptimal zero-forcing (ZF) based solution by separating the
optimization of the transmit beamforming vector and the PS ratio. Simulation
results are provided to evaluate the performance of the proposed beamforming
designs.
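The generic step behind the ZF-based design is standard zero-forcing: choose beamformers from the channel pseudo-inverse so no user leaks interference onto another, then optimize powers and PS ratios separately. A NumPy sketch of that step follows (our own illustration, not the paper's full design):

    import numpy as np

    def zf_beamformers(H):
        # H: (K, M) matrix whose i-th row is user i's channel (conjugate
        # transposed), with K users and M >= K BS antennas.
        W = np.linalg.pinv(H)           # H @ W = I_K: no inter-user leakage
        W /= np.linalg.norm(W, axis=0, keepdims=True)   # unit-norm beams
        return W                        # (M, K); column k serves user k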
Giovanni Geraci, Adrian Garcia-Rodriguez, David López-Pérez, Andrea Bonfante, Lorenzo Galati Giordano, Holger Claussen
Comments: To appear in Proc. IEEE ICC 2017
Subjects: Information Theory (cs.IT)
We consider cellular base stations (BSs) equipped with a large number of
antennas and operating in the unlicensed band. We denote such a system as massive
MIMO unlicensed (mMIMO-U). We design the key procedures required to guarantee
coexistence between a cellular BS and nearby Wi-Fi devices. These include:
neighboring Wi-Fi channel covariance estimation, allocation of spatial degrees
of freedom for interference suppression, and enhanced channel sensing and data
transmission phases. We evaluate the performance of the so-designed mMIMO-U,
showing that it allows simultaneous cellular and Wi-Fi transmissions by keeping
their mutual interference below the regulatory threshold. The same is not true
for conventional listen-before-talk (LBT) operations. As a result, mMIMO-U
boosts the aggregate cellular-plus-Wi-Fi data rate in the unlicensed band with
respect to conventional LBT, exhibiting increasing gains as the number of BS
antennas grows.
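The interference-suppression step can be pictured as a null-space projection: estimate the covariance of the neighboring Wi-Fi channels, then force the cellular precoder to be orthogonal to its dominant eigen-directions. A NumPy sketch under these assumptions:

    import numpy as np

    def nullspace_precoder(R_wifi, n_null, W):
        # R_wifi: (M, M) estimated Wi-Fi channel covariance at the array.
        # n_null: spatial degrees of freedom reserved for suppression.
        # W: (M, K) cellular precoder to be projected.
        _, vecs = np.linalg.eigh(R_wifi)           # ascending eigenvalues
        U = vecs[:, ::-1][:, :n_null]              # dominant Wi-Fi directions
        P = np.eye(R_wifi.shape[0]) - U @ U.conj().T
        return P @ W      # radiates (almost) no energy toward Wi-Fi devices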
Felix Fellhauer, Nabil Loghin, Dana Ciochina, Thomas Handte, Stephan ten Brink
Comments: Submitted to 2017 International Workshop on Signal Processing Advances in Wireless Communications (SPAWC)
Subjects: Information Theory (cs.IT)
This paper introduces a low complexity method for antenna sector selection in
mmWave Hybrid MIMO communication systems like the IEEE 802.11ay amendment for
Wireless LANs. The method is backward compatible with the methods already
defined for the released mmWave standard IEEE 802.11ad. We introduce an
extension of the 802.11ad channel model to support common Hybrid MIMO
configurations. The proposed method is evaluated and compared to the
theoretical limit of transmission rates found by exhaustive search. In contrast
to state-of-the-art solutions, the presented method requires sparse channel
information only. Numerical results show a significant complexity reduction in
terms of the number of necessary trainings, while approaching the maximum
achievable rate.
Zhengquan Zhang, Zheng Ma, Ming Xiao, Pingzhi Fan
Comments: 30 pages, 8 figures
Subjects: Information Theory (cs.IT)
Broadcasting/multicasting is an efficient mechanism for multimedia
communications due to its high spectrum efficiency, which achieves
point-to-multipoint transmission on the same radio resources. To satisfy the
increasing demands for multimedia broadcast multicast service (MBMS), we
present a power domain non-orthogonal MBMS transmission scheme in a K-tier
heterogeneous network (HetNet). Firstly, the system model, usage scenarios, and
fundamentals of the presented scheme are discussed. Next, a tractable framework
is developed to analyse the performance of non-orthogonal MBMS transmission, by
using stochastic geometry. Based on this framework, the analytical expressions
for the signal-to-interference-plus-noise ratio (SINR) coverage probability,
average number of served users, and sum rate are derived. Furthermore,
synchronous non-orthogonal MBMS transmission is studied to further improve the
system performance. The results demonstrate that non-orthogonal MBMS
transmission achieves better performance than the conventional scheme: the
non-orthogonal multi-rate variant fully utilizes channel conditions to achieve
a significant rate gain, while the non-orthogonal multi-service variant
efficiently uses power resources to guarantee the quality of service (QoS) of
high-priority users while simultaneously serving low-priority users.
Syed Mohsin Abbas, YouZhe Fan, Ji Chen, Chi-Ying Tsui
Comments: Accepted for publication in IEEE International Symposium on Circuits & Systems 2017 (ISCAS 2017)
Subjects: Information Theory (cs.IT)
Owing to their capacity-achieving performance and low encoding and decoding
complexity, polar codes have drawn much research interest recently. Successive
cancellation decoding (SCD) and belief propagation decoding (BPD) are two
common approaches for decoding polar codes. SCD is sequential in nature while
BPD can run in parallel; thus BPD is more attractive for low-latency
applications. However, BPD suffers some performance degradation at higher SNR
when compared with SCD. Concatenating an LDPC code with a polar code is one
popular approach to enhance the performance of BPD, where a short LDPC code is
used as an outer code and the polar code is used as an inner code. In this work
we propose a new way to construct a concatenated LDPC-polar code, which not
only outperforms conventional BPD and existing concatenated LDPC-polar codes
but also shows a performance improvement of 0.5 dB in the higher SNR regime
when compared with SCD.
Mohammad Eslami, Seyed Hamid Safavi, Farah Torkamani-Azar, Esfandiar Mehrshahi
Comments: Submitted to the 25th European Signal Processing Conference (EUSIPCO 2017). arXiv admin note: text overlap with arXiv:1612.02892
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
Managing spectrum resources under growing demand is a challenging problem, and
Cognitive Radio (CR) is known to be capable of improving spectrum utilization.
Recently, the Power Spectral Density (PSD) map has been defined to enable CRs
to reuse frequency resources across an area. To this end, sensed PSDs are
collected by distributed sensors in the area and fused by a Fusion Center
(FC). For a given zone, however, the PSDs sensed by neighboring CR sensors may
contain a shared common component for a while. This component can be exploited
via the theory of Distributed Source Coding (DSC) to compress the sensors'
transmitted data. However, uncertain channel fading and random shadowing lead
to varying signal strength at different CRs, even those placed close to each
other. Hence, perturbations in the transmission procedure introduce
imperfections in the reporting channel and, as a result, degrade performance
remarkably. The main focus of this paper is to reconstruct the PSDs of the
sensors robustly, based on Distributed Compressive Sensing (DCS), when the
data transmission is slightly imperfect. Simulation results verify the
robustness of the proposed scheme.
Manoj A, Arun Pachai Kannu
Comments: 5 pages, 3 figures
Subjects: Information Theory (cs.IT)
In a multi-user millimeter (mm) wave communication system, we consider the
problem of estimating the channel response between the central node (base
station) and each of the user equipments (UE). We propose three different
strategies: 1) each UE estimates its channel separately, 2) the base station
estimates all the UEs' channels jointly, and 3) a two-stage process with
estimation done at both UE and base station. Exploiting the low rank nature of
the mm wave channels, we propose a generalized block orthogonal matching
pursuit (G.BOMP) framework for channel estimation in all the three strategies.
Our simulation results show that the average beamforming gain of the G.BOMP
algorithm is higher than that of the conventional OMP algorithm and other
existing works on the multi-user mm wave system.
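Since G.BOMP builds on block orthogonal matching pursuit, the baseline recursion is worth spelling out. The NumPy sketch below implements plain block-OMP over a column dictionary grouped into consecutive blocks (for example, angular bins of a mm-wave array response dictionary); the generalized selection rule of G.BOMP is not reproduced here.

    import numpy as np

    def block_omp(A, y, block_size, n_select):
        m, n = A.shape
        blocks = [np.arange(i, min(i + block_size, n))
                  for i in range(0, n, block_size)]
        residual, support = y.copy(), []
        for _ in range(n_select):
            # Pick the unselected block with the largest correlation energy.
            scores = [np.linalg.norm(A[:, b].conj().T @ residual)
                      if j not in support else -np.inf
                      for j, b in enumerate(blocks)]
            support.append(int(np.argmax(scores)))
            idx = np.concatenate([blocks[j] for j in support])
            # Least-squares fit on the chosen blocks, then update residual.
            x_ls, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
            residual = y - A[:, idx] @ x_ls
        x = np.zeros(n, dtype=A.dtype)
        x[idx] = x_ls
        return x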
Mohammad Mozaffari, Walid Saad, Mehdi Bennis, Merouane Debbah
Subjects: Information Theory (cs.IT)
In this paper, the efficient deployment and mobility of multiple unmanned
aerial vehicles (UAVs), used as aerial base stations to collect data from
ground Internet of Things (IoT) devices, is investigated. In particular, to
enable reliable uplink communications for IoT devices with a minimum total
transmit power, a novel framework is proposed for jointly optimizing the
three-dimensional (3D) placement and mobility of the UAVs, device-UAV
association, and uplink power control. First, given the locations of active IoT
devices at each time instant, the optimal UAVs’ locations and associations are
determined. Next, to dynamically serve the IoT devices in a time-varying
network, the optimal mobility patterns of the UAVs are analyzed. To this end,
based on the activation process of the IoT devices, the time instances at which
the UAVs must update their locations are derived. Moreover, the optimal 3D
trajectory of each UAV is obtained in a way that the total energy used for the
mobility of the UAVs is minimized while serving the IoT devices. Simulation
results show that, using the proposed approach, the total transmit power of the
IoT devices is reduced by 45% compared to a case in which stationary aerial
base stations are deployed. In addition, the proposed approach can improve
system reliability by up to 28% compared to the stationary case. The
results also reveal an inherent tradeoff between the number of update times,
the mobility of the UAVs, and the transmit power of the IoT devices. In
essence, a higher number of updates can lead to lower transmit powers for the
IoT devices at the cost of an increased mobility for the UAVs.
Mukul Agarwal, Sanjoy Mitter, Anant Sahai
Subjects: Information Theory (cs.IT)
Theorems from Part 1 of this paper are generalized to \(\psi\)-mixing sources.
Applications to Markoff chains and order-\(m\) Markoff chains are presented.
The main result is the generalization of Theorem 1 in Part 1.
Mukul Agarwal, Sanjoy Mitter, Anant Sahai
Subjects: Information Theory (cs.IT)
In this paper, the problem of communication over an essentially unknown
channel, which is known to be able to communicate a source to a destination to
within a certain distortion level, is considered from a behavioral,
interconnection view-point. Rates of reliable communication are derived and
source-channel separation for communication with fidelity criteria is proved.
The results are then generalized to the multi-user setting under certain
assumptions. Other applications of this problem which follow from this
perspective are discussed.
Ilya Dumer
Comments: 4 pages, 2 figures
Subjects: Information Theory (cs.IT)
We consider the known list decoding algorithms for polar codes and compare
their complexity.
Index terms: Polar codes; Reed-Muller codes; successive cancellation
decoding.
Ilya Dumer
Journal-ref: IEEE Trans. Info. Theory, vol. 50, pp. 811-823, 2004
Subjects: Information Theory (cs.IT)
Recursive decoding techniques are considered for Reed-Muller (RM) codes of
growing length \(n\) and fixed order \(r\). An algorithm is designed that has
complexity of order \(n\log n\) and corrects most error patterns of weight up to
\(n(1/2-\varepsilon)\) given that \(\varepsilon\) exceeds \(n^{-1/2^{r}}\). This
improves the asymptotic bounds known for decoding RM codes with nonexponential
complexity.
Ilya Dumer, Kirill Shabunov
Journal-ref: IEEE Trans. Info. Theory, vol. 52, no. 3, pp. 1260-1266, 2006
Subjects: Information Theory (cs.IT)
Recursive list decoding is considered for Reed-Muller (RM) codes. The
algorithm repeatedly relegates itself to the shorter RM codes by recalculating
the posterior probabilities of their symbols. Intermediate decodings are only
performed when these recalculations reach the trivial RM codes. In turn, the
updated lists of most plausible codewords are used in subsequent decodings. The
algorithm is further improved by using permutation techniques on code positions
and by eliminating the most error-prone information bits. Simulation results
show that for all RM codes of length 256 and many subcodes of length 512, these
algorithms approach maximum-likelihood (ML) performance within a margin of 0.1
dB. As a result, we present tight experimental bounds on ML performance for
these codes.
Ilya Dumer, Kirill Shabunov
Journal-ref: “Information, Coding and Mathematics”, ed. M. Blaum, P. Farrell,
and H.C.A. van Tilborg, Kluwer, Boston, 2002, pp. 279-298
Subjects: Information Theory (cs.IT)
We consider recursive decoding for Reed-Muller (RM) codes and their subcodes.
Two new recursive techniques are described. We analyze asymptotic properties of
these algorithms and show that they substantially outperform other decoding
algorithms with nonexponential complexity known for RM codes. Decoding
performance is further enhanced by using intermediate code lists and
permutation procedures. For moderate lengths up to 512, near-optimum decoding
with feasible complexity is obtained.
Ilya Dumer
Journal-ref: Proc. 37th Allerton Conf. on Commun., Control, and Computing,
Monticello, IL, USA, 1999, pp. 61-69
Subjects: Information Theory (cs.IT)
New soft- and hard-decision decoding algorithms are presented for general
Reed-Muller codes \(\left\{\genfrac{}{}{0pt}{}{m}{r}\right\}\) of length \(2^{m}\)
and distance \(2^{m-r}\). We use the Plotkin \((u,u+v)\) construction and
decompose the code \(\left\{\genfrac{}{}{0pt}{}{m}{r}\right\}\) into subblocks
\(u \in \left\{\genfrac{}{}{0pt}{}{m-1}{r}\right\}\) and
\(v \in \left\{\genfrac{}{}{0pt}{}{m-1}{r-1}\right\}\). In decoding, we first try
to find a subblock \(v\) from the better protected code and then proceed with the
block \(u\). The likelihoods of the received symbols are recalculated in a way
similar to belief propagation. Thus, decoding is relegated to the two
constituent codes. We repeat this recursion and execute decoding only at the
end nodes \(\left\{\genfrac{}{}{0pt}{}{j}{1}\right\}\) and
\(\left\{\genfrac{}{}{0pt}{}{j}{j-1}\right\}\). The overall complexity has low
order \(n\log n\). It is shown that this decoding substantially outperforms
other algorithms of polynomial complexity known for RM codes. In particular,
for medium and high code rates, the algorithm corrects most error patterns of
weight \(d\ln d/2\).
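The recursion is short enough to sketch in full. The Python code below follows the Plotkin split on log-likelihood ratios, with a min-sum combination for the v-branch; for brevity it terminates at repetition and rate-1 codes rather than at the first-order and parity end nodes used above.

    import numpy as np

    def rm_decode(llr, r, m):
        # llr: length-2^m log-likelihood ratios, positive = bit 0.
        n = 1 << m
        if r == 0:                            # repetition code: one decision
            return np.full(n, int(np.sum(llr) < 0), dtype=int)
        if r == m:                            # rate-1 code: bitwise decision
            return (llr < 0).astype(int)
        left, right = llr[: n // 2], llr[n // 2:]
        # Decode v first from the better-protected combination (min-sum).
        llr_v = (np.sign(left) * np.sign(right)
                 * np.minimum(np.abs(left), np.abs(right)))
        v = rm_decode(llr_v, r - 1, m - 1)
        # With v fixed, both halves give independent looks at u.
        llr_u = left + right * (1 - 2 * v)
        u = rm_decode(llr_u, r, m - 1)
        return np.concatenate([u, (u + v) % 2])   # codeword (u, u+v)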
Ilya Dumer, Kirill Shabunov
Journal-ref: Proc. 38th Allerton Conf. Communication, Control, and Computing,
Monticello, IL, USA, 2000, pp. 71-80
Subjects: Information Theory (cs.IT)
We consider recursive decoding techniques for RM codes, their subcodes, and
newly designed codes. For moderate lengths up to 512, we obtain near-optimum
decoding with feasible complexity.
M.E. Shirokov
Comments: 25 pages, any comments are welcome
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Mathematical Physics (math-ph)
We present a family of easily computable upper bounds for the Holevo quantity
of ensemble of quantum states depending on a reference state as a free
parameter. These upper bounds are obtained by combining probabilistic and
metric characteristics of the ensemble. We show that appropriate choice of the
reference state gives tight upper bounds for the Holevo quantity which in many
cases improve existing estimates in the literature.
We also present an upper bound for the Holevo quantity of a generalized
ensemble of quantum states with finite average energy, depending on the metric
divergence of the ensemble. The specialization of this upper bound to the
multi-mode quantum oscillator is tight for large energy.
The above results are used to obtain tight upper bounds for the Holevo
capacity of finite-dimensional and infinite-dimensional energy-constrained
quantum channels depending on metric characteristics of the channel output.
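For reference, the quantity being bounded is the standard Holevo quantity of an ensemble \(\{p_i,\rho_i\}\),

\[
\chi\bigl(\{p_i,\rho_i\}\bigr)
  = S\Bigl(\sum_i p_i\,\rho_i\Bigr) - \sum_i p_i\,S(\rho_i),
\qquad S(\rho) = -\operatorname{Tr}\,\rho\log\rho,
\]

where \(S\) is the von Neumann entropy; the bounds above express estimates of this quantity through probabilistic and metric characteristics of the ensemble.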
Christian Kümmerle, Juliane Sigl
Comments: 46 pages, 4 figures
Subjects: Numerical Analysis (math.NA); Information Theory (cs.IT)
We propose a new iteratively reweighted least squares (IRLS) algorithm for
the recovery of a matrix \(X \in \mathbb{C}^{d_1 \times d_2}\) of rank
\(r \ll \min(d_1,d_2)\) from incomplete linear observations, solving a sequence of
low complexity linear problems. The easily implementable algorithm, which we
call harmonic mean iteratively reweighted least squares (HM-IRLS), optimizes a
non-convex Schatten-\(p\) quasi-norm penalization to promote low-rankness and
carries three major strengths, in particular for the matrix completion setting.
First, the algorithm converges globally to the low-rank matrix for relevant,
interesting cases, for which any other (non-)convex state-of-the-art
optimization approach fails to recover the matrix. Secondly, HM-IRLS exhibits an
empirical recovery probability close to \(100\%\) even for a number of
measurements very close to the theoretical lower bound \(r(d_1+d_2-r)\), i.e.,
already for significantly fewer linear observations than any other tractable
approach in the literature. Thirdly, HM-IRLS exhibits a locally superlinear
rate of convergence (of order \(2-p\)) if the linear observations fulfill a
suitable null space property. While for the first two properties we have so far
only strong empirical evidence, we prove the third property as our main
theoretical result.
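To make the reweighting idea concrete, the NumPy sketch below implements a simplified one-sided Schatten-p IRLS for matrix completion; HM-IRLS replaces the single weight operator with a harmonic mean of left- and right-sided weights, which the paper credits for the stronger convergence behavior. The rank parameter and the smoothing schedule are illustrative choices, not the paper's.

    import numpy as np

    def irls_completion(Y, mask, rank=5, p=0.5, lam=1e-3, iters=30):
        # Y: observed entries (zeros elsewhere); mask: boolean pattern.
        X = np.where(mask, Y, 0.0).astype(float)
        eps = np.linalg.norm(X)
        for _ in range(iters):
            U, s, _ = np.linalg.svd(X, full_matrices=False)
            # Shrink the smoothing parameter as the iterate improves.
            eps = max(min(eps, s[min(rank, len(s) - 1)]), 1e-12)
            # Weights (sigma^2 + eps^2)^(p/2 - 1): large singular values
            # are penalized less, pushing X toward low rank.
            W = (U * (s**2 + eps**2) ** (p / 2 - 1)) @ U.T
            # Exact column-wise solve of the weighted quadratic surrogate
            #   min_X ||mask*(X - Y)||_F^2 + lam * tr(X^T W X).
            for j in range(X.shape[1]):
                A = np.diag(mask[:, j].astype(float)) + lam * W
                X[:, j] = np.linalg.solve(A, mask[:, j] * Y[:, j])
        return X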