Peter Sanders, Christian Schulz, Darren Strash, Robert Williger
Comments: arXiv admin note: text overlap with arXiv:1509.01190, arXiv:1110.0477
Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
Computing high quality node separators in large graphs is necessary for a
variety of applications, ranging from divide-and-conquer algorithms to VLSI
design. In this work, we present a novel distributed evolutionary algorithm
tackling the k-way node separator problem. A key component of our contribution
includes new k-way local search algorithms based on maximum flows. We combine
our local search with a multilevel approach to compute an initial population
for our evolutionary algorithm, and further show how to modify the coarsening
stage of our multilevel algorithm to create effective combine and mutation
operations. Lastly, we combine these techniques with a scalable communication
protocol, producing a system that is able to compute high quality solutions in
a short amount of time. Our experiments against competing algorithms show that
our advanced evolutionary algorithm computes the best result on 94% of the
chosen benchmark instances.
Mohammadreza Babaee, Duc Tung Dinh, Gerhard Rigoll
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this work, we present a novel background subtraction system that uses a
deep Convolutional Neural Network (CNN) to perform the segmentation. With this
approach, feature engineering and parameter tuning become unnecessary since the
network parameters can be learned from data by training a single CNN that can
handle various video scenes. Additionally, we propose a new approach to
estimate background model from video. For the training of the CNN, we employed
randomly 5 percent video frames and their ground truth segmentations taken from
the Change Detection challenge 2014(CDnet 2014). We also utilized
spatial-median filtering as the post-processing of the network outputs. Our
method is evaluated with different data-sets, and the network outperforms the
existing algorithms with respect to the average ranking over different
evaluation metrics. Furthermore, due to the network architecture, our CNN is
capable of real time processing.
Afshin Dehghan, Syed Zain Masood, Guang Shu, Enrique. G. Ortiz
Comments: 7 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
This paper describes the details of Sighthound’s fully automated vehicle
make, model and color recognition system. The backbone of our system is a deep
convolutional neural network that is not only computationally inexpensive, but
also provides state-of-the-art results on several competitive benchmarks.
Additionally, our deep network is trained on a large dataset of several million
images which are labeled through a semi-automated process. Finally we test our
system on several public datasets as well as our own internal test dataset. Our
results show that we outperform other methods on all benchmarks by significant
margins. Our model is available to developers through the Sighthound Cloud API
at this https URL
Xinyu Li, Yanyi Zhang, Jianyu Zhang, Shuhong Chen, Ivan Marsic, Richard A. Farneth, Randall S. Burd
Comments: 14 pages, 12 figures, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We introduce a system that recognizes concurrent activities from real-world
data captured by multiple sensors of different types. The recognition is
achieved in two steps. First, we extract spatial and temporal features from the
multimodal data. We feed each datatype into a convolutional neural network that
extracts spatial features, followed by a long-short term memory network that
extracts temporal information in the sensory data. The extracted features are
then fused for decision making in the second step. Second, we achieve
concurrent activity recognition with a single classifier that encodes a binary
output vector in which elements indicate whether the corresponding activity
types are currently in progress. We tested our system with three datasets from
different domains recorded using different sensors and achieved performance
comparable to existing systems designed specifically for those domains. Our
system is the first to address the concurrent activity recognition with
multisensory data using a single model, which is scalable, simple to train and
easy to deploy.
Enzo Ferrante, Nikos Paragios
Comments: Under evaluation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
During the last decades, the research community of medical imaging has
witnessed continuous advances in image registration methods, which pushed the
limits of the state-of-the-art and enabled the development of novel medical
procedures. A particular type of image registration problem, known as
slice-to-volume registration, played a fundamental role in areas like image
guided surgeries and volumetric image reconstruction. However, to date, and
despite the extensive literature available on this topic, no survey has been
written to discuss this challenging problem. This paper introduces the first
comprehensive survey of the literature about slice-to-volume registration,
presenting a categorical study of the algorithms according to an ad-hoc
taxonomy and analyzing advantages and disadvantages of every category. We draw
some general conclusions from this analysis and present our perspectives on the
future of the field.
Jinsoo Choi, Tae-Hyun Oh, In So Kweon
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
The best summary of a long video differs among different people due to its
highly subjective nature. Even for the same person, the best summary may change
with time or mood. In this paper, we introduce the task of generating
customized video summaries through simple text. First, we train a deep
architecture to effectively learn semantic embeddings of video frames by
leveraging the abundance of image-caption data via a progressive and residual
manner. Given a user-specific text description, our algorithm is able to select
semantically relevant video segments and produce a temporally aligned video
summary. In order to evaluate our textually customized video summaries, we
conduct experimental comparison with baseline methods that utilize ground-truth
information. Despite the challenging baselines, our method still manages to
show comparable or even exceeding performance. We also show that our method is
able to generate semantically diverse video summaries by only utilizing the
learned visual embeddings.
Yong Wang, Ke Lu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Multi-camera tracking is quite different from single camera tracking, and it
faces new technology and system architecture challenges. By analyzing the
corresponding characteristics and disadvantages of the existing algorithms,
problems in multi-camera tracking are summarized and some new directions for
future work are also generalized.
Kota Hara, Raviteja Vemulapalli, Rama Chellappa
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Deep Convolutional Neural Networks (DCNN) have been proven to be effective
for various computer vision problems. In this work, we demonstrate its
effectiveness on a continuous object orientation estimation task, which
requires prediction of 0 to 360 degrees orientation of the objects. We do so by
proposing and comparing three continuous orientation prediction approaches
designed for the DCNNs. The first two approaches work by representing an
orientation as a point on a unit circle and minimizing either L2 loss or
angular difference loss. The third method works by first converting the
continuous orientation estimation task into a set of discrete orientation
estimation tasks and then converting the discrete orientation outputs back to
the continuous orientation using a mean-shift algorithm. By evaluating on a
vehicle orientation estimation task and a pedestrian orientation estimation
task, we demonstrate that the discretization-based approach not only works
better than the other two approaches but also achieves state-of-the-art
performance. We also demonstrate that finding an appropriate feature
representation is critical to achieve a good performance when adapting a DCNN
trained for an image recognition task.
Xinxin Zuo, Sen Wang, Jiangbin Zheng, Ruigang Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper we present a novel approach for depth map enhancement from an
RGB-D video sequence. The basic idea is to exploit the shading information in
the color image. Instead of making assumption about surface albedo or
controlled object motion and lighting, we use the lighting variations
introduced by casual object movement. We are effectively calculating
photometric stereo from a moving object under natural illuminations. The key
technical challenge is to establish correspondences over the entire image set.
We therefore develop a lighting insensitive robust pixel matching technique
that out-performs optical flow method in presence of lighting variations. In
addition we present an expectation-maximization framework to recover the
surface normal and albedo simultaneously, without any regularization term. We
have validated our method on both synthetic and real datasets to show its
superior performance on both surface details recovery and intrinsic
decomposition.
Kota Hara, Ming-Yu Liu, Oncel Tuzel, Amir-massoud Farahmand
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose augmenting deep neural networks with an attention mechanism for
the visual object detection task. As perceiving a scene, humans have the
capability of multiple fixation points, each attended to scene content at
different locations and scales. However, such a mechanism is missing in the
current state-of-the-art visual object detection methods. Inspired by the human
vision system, we propose a novel deep network architecture that imitates this
attention mechanism. As detecting objects in an image, the network adaptively
places a sequence of glimpses of different shapes at different locations in the
image. Evidences of the presence of an object and its location are extracted
from these glimpses, which are then fused for estimating the object class and
bounding box coordinates. Due to lacks of ground truth annotations of the
visual attention mechanism, we train our network using a reinforcement learning
algorithm with policy gradients. Experiment results on standard object
detection benchmarks show that the proposed network consistently outperforms
the baseline networks that does not model the attention mechanism.
Ashraf A. Shahin
Comments: this http URL
Journal-ref: International Journal of Advanced Computer Science and
Applications(IJACSA), 8(1), 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arabic language is one of the most popular languages in the world. Hundreds
of millions of people in many countries around the world speak Arabic as their
native speaking. However, due to complexity of Arabic language, recognition of
printed and handwritten Arabic text remained untouched for a very long time
compared with English and Chinese. Although, in the last few years, significant
number of researches has been done in recognizing printed and handwritten
Arabic text, it stills an open research field due to cursive nature of Arabic
script. This paper proposes automatic printed Arabic text recognition technique
based on linear and ellipse regression techniques. After collecting all
possible forms of each character, unique code is generated to represent each
character form. Each code contains a sequence of lines and ellipses. To
recognize fonts, a unique list of codes is identified to be used as a
fingerprint of font. The proposed technique has been evaluated using over 14000
different Arabic words with different fonts and experimental results show that
average recognition rate of the proposed technique is 86%.
Nadav Israel, Lior Wolf, Ran Barzilay, Gal Shoval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Automatic recognition of facial gestures is becoming increasingly important
as real world AI agents become a reality. In this paper, we present an
automated system that recognizes facial gestures by capturing local changes and
encoding the motion into a histogram of frequencies. We evaluate the proposed
method by demonstrating its effectiveness on spontaneous face action
benchmarks: the FEEDTUM dataset, the Pain dataset and the HMDB51 dataset. The
results show that, compared to known methods, the new encoding methods
significantly improve the recognition accuracy and the robustness of analysis
for a variety of applications.
Iaroslav Melekhov (1), Juho Kannala (1), Esa Rahtu (2) ((1) Aalto University, (2) University of Oulu)
Comments: Submitted to Scandinavian Conference on Image Analysis (SCIA) 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a method for estimating relative camera pose between a pair of
images. The goal is to propose accurate estimations the relative orientation
vector representing by rotation matrix and translation vector of two cameras
capturing the same scene. Our approach is based on convolutional neural
networks and directly estimates camera motion between two RGB images by solving
regression problem. The proposed network is trained in an end-to-end manner
utilizing transfer learning from large scale classification data. The method is
compared to a classical local feature based pipeline (SURF, ORB) of relative
pose estimation and we demonstrate the cases where our deep model outperforms
the traditional approach significantly. Finally, we evaluated experiments with
applying Spatial Pyramid Pooling (SPP) layer which can produce a fixed-size
representation regardless the size of the input image. The results confirm that
SPP further improves the performance of the proposed approach.
U. A. Nnolim
Comments: 31 pages, 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This report describes the experimental results obtained using a proposed
variational Retinex algorithm for controlled illumination correction. Two
colour restoration and enhancement schemes of the algorithm are presented for
drastically improved results. The algorithm modifies the reflectance image
using global and local contrast enhancement approaches and gradually removes
the residual illumination to yield highly pleasing results. The proposed
algorithms are optimized by way of simultaneous perceptual quality metric (PQM)
stabilization and entropy maximization for fully automated processing solving
the problem of determination of stopping time. The usage of the HSI or HSV
colour space ensures a unique solution to the optimization problem unlike in
the RGB space where there is none (forcing manual selection of number of
iteration. The proposed approach preserves and enhances details in both bright
and dark regions of underexposed images in addition to eliminating the colour
distortion, over-exposure in bright image regions, halo effect and grey-world
violations observed in Retinex-based approaches. Extensive experiments indicate
consistent performance as the proposed approach exploits and augments the
advantages of PDE-based formulation, performing illumination correction, colour
enhancement correction and restoration, contrast enhancement and noise
suppression. Comparisons shows that the proposed approach surpasses most of the
other conventional algorithms found in the literature.
Shervin Minaee, Amirali Abdolrashidi, Yao Wang
Comments: IEEE Signal Processing in Medicine and Biology Symposium, 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Iris is one of the popular biometrics that is widely used for identity
authentication. Different features have been used to perform iris recognition
in the past. Most of them are based on hand-crafted features designed by
biometrics experts. Due to tremendous success of deep learning in computer
vision problems, there has been a lot of interest in applying features learned
by convolutional neural networks on general image recognition to other tasks
such as segmentation, face recognition, and object detection. In this paper, we
have investigated the application of deep features extracted from VGG-Net for
iris recognition. The proposed scheme has been tested on two well-known iris
databases, and has shown promising results with the best accuracy rate of
99.4\%, which outperforms the previous best result.
Naftali Zon, Rana Hanocka, Nahum Kiryati
Subjects: Computer Vision and Pattern Recognition (cs.CV)
PROBE (Progressive Removal of Blur Residual) is a recursive framework for
blind deblurring. Using the elementary modified inverse filter at its core,
PROBE’s experimental performance meets or exceeds the state of the art, both
visually and quantitatively. Remarkably, PROBE lends itself to analysis that
reveals its convergence properties. PROBE is motivated by recent ideas on
progressive blind deblurring, but breaks away from previous research by its
simplicity, speed, performance and potential for analysis. PROBE is neither a
functional minimization approach, nor an open-loop sequential method (blur
kernel estimation followed by non-blind deblurring). PROBE is a feedback
scheme, deriving its unique strength from the closed-loop architecture rather
than from the accuracy of its algorithmic components.
Andrey Kuehlkamp, Benedict Becker, Kevin Bowyer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Predicting a person’s gender based on the iris texture has been explored by
several researchers. This paper considers several dimensions of experimental
work on this problem, including person-disjoint train and test, and the effect
of cosmetics on eyelash occlusion and imperfect segmentation. We also consider
the use of multi-layer perceptron and convolutional neural networks as
classifiers, comparing the use of data-driven and hand-crafted features. Our
results suggest that the gender-from-iris problem is more difficult than has so
far been appreciated. Estimating accuracy using a mean of N person-disjoint
train and test partitions, and considering the effect of makeup – a combination
of experimental conditions not present in any previous work – we find a much
weaker ability to predict gender-from-iris texture than has been suggested in
previous work.
Seyede Mahya Hazavei, Hamid Reza Shahdoosti
Comments: 6 pages, 4 figures, conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The bilateral filter is a useful nonlinear filter which without smoothing
edges, it does spatial averaging. In the literature, the effectiveness of this
method for image denoising is shown. In this paper, an extension of this method
is proposed which is based on complex wavelet transform. In fact, the bilateral
filtering is applied to the low-frequency (approximation) subbands of the
decomposed image using complex wavelet transform, while the thresholding
approach is applied to the high frequency subbands. Using the bilateral filter
in the complex wavelet domain forms a new image denoising framework.
Experimental results for real data are provided, by which one can see the
effectiveness of the proposed method in eliminating noise.
David Hall, Feras Dayoub, Jason Kulk, Chris McCool
Comments: to appear in the proceedings of the IEEE International Conference on Robotics and Automation ICRA2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Weed scouting is an important part of modern integrated weed management but
can be time consuming and sparse when performed manually. Automated weed
scouting and weed destruction has typically been performed using classification
systems able to classify a set group of species known a priori. This greatly
limits deployability as classification systems must be retrained for any field
with a different set of weed species present within them. In order to overcome
this limitation, this paper works towards developing a clustering approach to
weed scouting which can be utilized in any field without the need for prior
species knowledge. We demonstrate our system using challenging data collected
in the field from an agricultural robotics platform. We show that considerable
improvements can be made by (i) learning low-dimensional (bottleneck) features
using a deep convolutional neural network to represent plants in general and
(ii) tying views of the same area (plant) together. Deploying this algorithm on
in-field data collected by AgBotII, we are able to successfully cluster cotton
plants from grasses without prior knowledge or training for the specific plants
in the field.
Youngwan Lee, Huien Kim, Eunsoo Park, Xuenan Cui, Hakil Kim
Comments: 8 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Since convolutional neural network(CNN)models emerged,several tasks in
computer vision have actively deployed CNN models for feature extraction.
However,the conventional CNN models have a high computational cost and require
high memory capacity, which is impractical and unaffordable for commercial
applications such as real-time on-road object detection on embedded boards or
mobile platforms. To tackle this limitation of CNN models, this paper proposes
a wide-residual-inception (WR-Inception) network, which constructs the
architecture based on a residual inception unit that captures objects of
various sizes on the same feature map, as well as shallower and wider layers,
compared to state-of-the-art networks like ResNet. To verify the proposed
networks, this paper conducted two experiments; one is a classification task on
CIFAR-10/100 and the other is an on-road object detection task using a
Single-Shot Multi-box Detector(SSD) on the KITTI dataset.
Eyasu Zemene, Yonatan Tariku, Haroon Idrees, Andrea Prati, Marcello Pelillo, Mubarak Shah
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper presents a new approach for the challenging problem of
geo-locating an image using image matching in a structured database of
city-wide reference images with known GPS coordinates. We cast the
geo-localization as a clustering problem on local image features. Akin to
existing approaches on the problem, our framework builds on low-level features
which allow partial matching between images. For each local feature in the
query image, we find its approximate nearest neighbors in the reference set.
Next, we cluster the features from reference images using Dominant Set
clustering, which affords several advantages over existing approaches. First,
it permits variable number of nodes in the cluster which we use to dynamically
select the number of nearest neighbors (typically coming from multiple
reference images) for each query feature based on its discrimination value.
Second, as we also quantify in our experiments, this approach is several orders
of magnitude faster than existing approaches. Thus, we obtain multiple clusters
(different local maximizers) and obtain a robust final solution to the problem
using multiple weak solutions through constrained Dominant Set clustering on
global image features, where we enforce the constraint that the query image
must be included in the cluster. This second level of clustering also bypasses
heuristic approaches to voting and selecting the reference image that matches
to the query. We evaluated the proposed framework on an existing dataset of
102k street view images as well as a new dataset of 300k images, and show that
it outperforms the state-of-the-art by 20% and 7%, respectively, on the two
datasets.
Andrey Kuehlkamp, Kevin W. Bowyer
Comments: 2016 IEEE Winter Conference on Applications of Computer Vision (WACV)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Iris recognition systems are a mature technology that is widely used
throughout the world. In identification (as opposed to verification) mode, an
iris to be recognized is typically matched against all N enrolled irises. This
is the classic “1-to-N search”. In order to improve the speed of large-scale
identification, a modified “1-to-First” search has been used in some
operational systems. A 1-to-First search terminates with the first
below-threshold match that is found, whereas a 1-to-N search always finds the
best match across all enrollments. We know of no previous studies that evaluate
how the accuracy of 1-to-First search differs from that of 1-to-N search. Using
a dataset of over 50,000 iris images from 2,800 different irises, we perform
experiments to evaluate the relative accuracy of 1-to-First and 1-to-N search.
We evaluate how the accuracy difference changes with larger numbers of enrolled
irises, and with larger ranges of rotational difference allowed between iris
images. We find that False Match error rate for 1-to-First is higher than for
1-to-N, and the the difference grows with larger number of enrolled irises and
with larger range of rotation.
Dolev Raviv, Margarita Osadchy
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Deep Learning (DL) methods show very good performance when trained on large,
balanced data sets. However, many practical problems involve imbalanced data
sets, or/and classes with a small number of training samples. The performance
of DL methods as well as more traditional classifiers drops significantly in
such settings. Most of the existing solutions for imbalanced problems focus on
customizing the data for training. A more principled solution is to use mixed
Hinge-Minimax risk [19] specifically designed to solve binary problems with
imbalanced training sets. Here we propose a Latent Hinge Minimax (LHM) risk and
a training algorithm that generalizes this paradigm to an ensemble of
hyperplanes that can form arbitrary complex, piecewise linear boundaries. To
extract good features, we combine LHM model with CNN via transfer learning. To
solve multi-class problem we map pre-trained category-specific LHM classifiers
to a multi-class neural network and adjust the weights with very fast tuning.
LHM classifier enables the use of unlabeled data in its training and the
mapping allows for multi-class inference, resulting in a classifier that
performs better than alternatives when trained on a small number of training
samples.
Philippe Balbiani, David Fernández-Duque, Emiliano Lorini
Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
We present a family of logics for reasoning about agents’ positions and
motion in the plane which have several potential applications in the area of
multi-agent systems (MAS), such as multi-agent planning and robotics. The most
general logic includes (i) atomic formulas for representing the truth of a
given fact or the presence of a given agent at a certain position of the plane,
(ii) atomic programs corresponding to the four basic orientations in the plane
(up, down, left, right) as well as the four program constructs of propositional
dynamic logic (sequential composition, nondeterministic composition, iteration
and test). As this logic is not computably enumerable, we study some
interesting decidable and axiomatizable fragments of it. We also present a
decidable extension of the iteration-free fragment of the logic by special
programs representing motion of agents in the plane.
Zi Jian Yang, Yong Wang
Subjects: Artificial Intelligence (cs.AI)
With the advent of modern computer networks, fault diagnosis has been a focus
of research activity. This paper reviews the history of fault diagnosis in
networks and discusses the main methods in information gathering section,
information analyzing section and diagnosing and revolving section of fault
diagnosis in networks. Emphasis will be placed upon knowledge-based methods
with discussing the advantages and shortcomings of the different methods. The
survey is concluded with a description of some open problems.
Andrea Callia D'Iddio, Michael Huth
Comments: 17 pages, 3 figures, link to open research data and code available
Subjects: Artificial Intelligence (cs.AI); Mathematical Software (cs.MS)
Optimization of Mixed-Integer Non-Linear Programming (MINLP) supports
important decisions in applications such as Chemical Process Engineering. But
current solvers have limited ability for deductive reasoning or the use of
domain-specific theories, and the management of integrality constraints does
not yet exploit automated reasoning tools such as SMT solvers. This seems to
limit both scalability and reach of such tools in practice. We therefore
present a tool, ManyOpt, for MINLP optimization that enables experimentation
with reduction techniques which transform a MINLP problem to feasibility
checking realized by an SMT solver. ManyOpt is similar to the SAT solver
ManySAT in that it runs a specified number of such reduction techniques in
parallel to get the strongest result on a given MINLP problem. The tool is
implemented in layers, which we may see as features and where reduction
techniques are feature vectors. Some of these features are inspired by known
MINLP techniques whereas others are novel and specific to SMT. Our experimental
results on standard benchmarks demonstrate the benefits of this approach. The
tool supports a variety of SMT solvers and is easily extensible with new
features, courtesy of its layered structure. For example, logical formulas for
deductive reasoning are easily added to constrain further the optimization of a
MINLP problem of interest.
Shumeet Baluja, Michele Covell, Rahul Sukthankar
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Systems and Control (cs.SY)
Real-time optimization of traffic flow addresses important practical
problems: reducing a driver’s wasted time, improving city-wide efficiency,
reducing gas emissions and improving air quality. Much of the current research
in traffic-light optimization relies on extending the capabilities of traffic
lights to either communicate with each other or communicate with vehicles.
However, before such capabilities become ubiquitous, opportunities exist to
improve traffic lights by being more responsive to current traffic situations
within the current, already deployed, infrastructure. In this paper, we
introduce a traffic light controller that employs bidding within micro-auctions
to efficiently incorporate traffic sensor information; no other outside sources
of information are assumed. We train and test traffic light controllers on
large-scale data collected from opted-in Android cell-phone users over a period
of several months in Mountain View, California and the River North neighborhood
of Chicago, Illinois. The learned auction-based controllers surpass (in both
the relevant metrics of road-capacity and mean travel time) the currently
deployed lights, optimized static-program lights, and longer-term planning
approaches, in both cities, measured using real user driving data.
Guy Katz, Clark Barrett, David Dill, Kyle Julian, Mykel Kochenderfer
Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
Deep neural networks have emerged as a widely used and effective means for
tackling complex, real-world problems. However, a major obstacle in applying
them to safety-critical systems is the great difficulty in providing formal
guarantees about their behavior. We present a novel, scalable, and efficient
technique for verifying properties of deep neural networks (or providing
counter-examples). The technique is based on the simplex method, extended to
handle the non-convex Rectified Linear Unit (ReLU) activation function, which
is a crucial ingredient in many modern neural networks. The verification
procedure tackles neural networks as a whole, without making any simplifying
assumptions. We evaluated our technique on a prototype deep neural network
implementation of the next-generation Airborne Collision Avoidance System for
unmanned aircraft (ACAS Xu). Results show that our technique can successfully
prove properties of networks that are an order of magnitude larger than the
largest networks verified using existing methods.
Afshin Dehghan, Syed Zain Masood, Guang Shu, Enrique. G. Ortiz
Comments: 7 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
This paper describes the details of Sighthound’s fully automated vehicle
make, model and color recognition system. The backbone of our system is a deep
convolutional neural network that is not only computationally inexpensive, but
also provides state-of-the-art results on several competitive benchmarks.
Additionally, our deep network is trained on a large dataset of several million
images which are labeled through a semi-automated process. Finally we test our
system on several public datasets as well as our own internal test dataset. Our
results show that we outperform other methods on all benchmarks by significant
margins. Our model is available to developers through the Sighthound Cloud API
at this https URL
Bas van Stein, Hao Wang, Wojtek Kowalczyk, Michael Emmerich, Thomas Bäck
Comments: Submitted to IEEE Computational Intelligence Magazine for review
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Kriging or Gaussian Process Regression is applied in many fields as a
non-linear regression model as well as a surrogate model in the field of
evolutionary computation. However, the computational and space complexity of
Kriging, that is cubic and quadratic in the number of data points respectively,
becomes a major bottleneck with more and more data available nowadays. In this
paper, we propose a general methodology for the complexity reduction, called
cluster Kriging, where the whole data set is partitioned into smaller clusters
and multiple Kriging models are built on top of them. In addition, four Kriging
approximation algorithms are proposed as candidate algorithms within the new
framework. Each of these algorithms can be applied to much larger data sets
while maintaining the advantages and power of Kriging. The proposed algorithms
are explained in detail and compared empirically against a broad set of
existing state-of-the-art Kriging approximation methods on a well-defined
testing framework. According to the empirical study, the proposed algorithms
consistently outperform the existing algorithms. Moreover, some practical
suggestions are provided for using the proposed algorithms.
Arya Mazumdar, Barna Saha
Comments: Appears in AAAI-17
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Learning (cs.LG)
Entity resolution (ER) is the task of identifying all records in a database
that refer to the same underlying entity, and are therefore duplicates of each
other. Due to inherent ambiguity of data representation and poor data quality,
ER is a challenging task for any automated process. As a remedy, human-powered
ER via crowdsourcing has become popular in recent years. Using crowd to answer
queries is costly and time consuming. Furthermore, crowd-answers can often be
faulty. Therefore, crowd-based ER methods aim to minimize human participation
without sacrificing the quality and use a computer generated similarity matrix
actively. While, some of these methods perform well in practice, no theoretical
analysis exists for them, and further their worst case performances do not
reflect the experimental findings. This creates a disparity in the
understanding of the popular heuristics for this problem. In this paper, we
make the first attempt to close this gap. We provide a thorough analysis of the
prominent heuristic algorithms for crowd-based ER. We justify experimental
observations with our analysis and information theoretic lower bounds.
Zeeshan Khawar Malik, Mo Kobrosli, Peter Maas
Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)
Deep Neural Networks, and specifically fully-connected convolutional neural
networks are achieving remarkable results across a wide variety of domains.
They have been trained to achieve state-of-the-art performance when applied to
problems such as speech recognition, image classification, natural language
processing and bioinformatics. Most of these deep learning models when applied
to classification employ the softmax activation function for prediction and aim
to minimize cross-entropy loss. In this paper, we have proposed a supervised
model for dominant category prediction to improve search recall across all eBay
classifieds platforms. The dominant category label for each query in the last
90 days is first calculated by summing the total number of collaborative clicks
among all categories. The category having the highest number of collaborative
clicks for the given query will be considered its dominant category. Second,
each query is transformed to a numeric vector by mapping each unique word in
the query document to a unique integer value; all padded to equal length based
on the maximum document length within the pre-defined vocabulary size. A
fully-connected deep convolutional neural network (CNN) is then applied for
classification. The proposed model achieves very high classification accuracy
compared to other state-of-the-art machine learning techniques.
Nikolaos Polatidis, Christos K. Georgiadis
Journal-ref: Computer Standards & Interfaces, 51, 14-21 (2017)
Subjects: Information Retrieval (cs.IR)
One of the most used approaches for providing recommendations in various
online environments such as e-commerce is collaborative filtering. Although,
this is a simple method for recommending items or services, accuracy and
quality problems still exist. Thus, we propose a dynamic multi-level
collaborative filtering method that improves the quality of the
recommendations. The proposed method is based on positive and negative
adjustments and can be used in different domains that utilize collaborative
filtering to increase the quality of the user experience. Furthermore, the
effectiveness of the proposed method is shown by providing an extensive
experimental evaluation based on three real datasets and by comparisons to
alternative methods.
Shaohua Li, Tat-Seng Chua
Comments: 5 pages, 6 figures
Subjects: Information Retrieval (cs.IR)
Traditionally a document is visualized by a word cloud. Recently, distributed
representation methods for documents have been developed, which map a document
to a set of topic embeddings. Visualizing such a representation is useful to
present the semantics of a document in higher granularity; it is also
challenging, as there are multiple topics, each containing multiple words. We
propose to visualize a set of topics using Topic Cloud, which is a pie chart
consisting of topic slices, where each slice contains important words in this
topic. To make important topics/words visually prominent, the sizes of topic
slices and word fonts are proportional to their importance in the document. A
topic cloud can help the user quickly evaluate the quality of derived document
representations. For NLP practitioners, It can be used to qualitatively compare
the topic quality of different document representation algorithms, or to
inspect how model parameters impact the derived representations.
Yifan Chen, Xiang Zhao
Subjects: Information Retrieval (cs.IR)
Top-(N) recommender systems typically utilize side information to address the
problem of data sparsity. As nowadays side information is growing towards high
dimensionality, the performances of existing methods deteriorate in terms of
both effectiveness and efficiency, which imposes a severe technical challenge.
In order to take advantage of high-dimensional side information, we propose in
this paper an embedded feature selection method to facilitate top-(N)
recommendation. In particular, we propose to learn feature weights of side
information, where zero-valued features are naturally filtered out. We also
introduce non-negativity and sparsity to the feature weights, to facilitate
feature selection and encourage low-rank structure. Two optimization problems
are accordingly put forward, respectively, where the feature selection is
tightly or loosely coupled with the learning procedure. Augmented Lagrange
Multiplier and Alternating Direction Method are applied to efficiently solve
the problems. Experiment results demonstrate the superior recommendation
quality of the proposed algorithm to that of the state-of-the-art alternatives.
Helge Holzmann, Wolfgang Nejdl, Avishek Anand
Comments: SIGIR 2016, Pisa, Italy
Subjects: Information Retrieval (cs.IR)
Web archives are large longitudinal collections that store webpages from the
past, which might be missing on the current live Web. Consequently, temporal
search over such collections is essential for finding prominent missing
webpages and tasks like historical analysis. However, this has been challenging
due to the lack of popularity information and proper ground truth to evaluate
temporal retrieval models. In this paper we investigate the applicability of
external longitudinal resources to identify important and popular websites in
the past and analyze the social bookmarking service Delicious for this purpose.
The timestamped bookmarks on Delicious provide explicit cues about popular
time periods in the past along with relevant descriptors. These are valuable to
identify important documents in the past for a given temporal query. Focusing
purely on recall, we analyzed more than 12,000 queries and find that using
Delicious yields average recall values from 46% up to 100%, when limiting
ourselves to the best represented queries in the considered dataset. This
constitutes an attractive and low-overhead approach for quick access into Web
archives by not dealing with the actual contents.
Karim Banawan, Sennur Ulukus
Comments: Submitted to IEEE Transactions on Information Theory, February 2017
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Information Retrieval (cs.IR)
We consider the problem of multi-message private information retrieval (MPIR)
from (N) non-communicating replicated databases. In MPIR, the user is
interested in retrieving (P) messages out of (M) stored messages without
leaking the identity of the retrieved messages. The information-theoretic sum
capacity of MPIR (C_s^P) is the maximum number of desired message symbols that
can be retrieved privately per downloaded symbol. For the case (P geq
frac{M}{2}), we determine the exact sum capacity of MPIR as
(C_s^P=frac{1}{1+frac{M-P}{PN}}). The achievable scheme in this case is based
on downloading MDS-coded mixtures of all messages. For (P leq frac{M}{2}), we
develop lower and upper bounds for all (M,P,N). These bounds match if the total
number of messages (M) is an integer multiple of the number of desired messages
(P), i.e., (frac{M}{P} in mathbb{N}). In this case,
(C_s^P=frac{1-frac{1}{N}}{1-(frac{1}{N})^{M/P}}). The achievable scheme in
this case generalizes the single-message capacity achieving scheme to have
unbalanced number of stages per round of download. For all the remaining cases,
the difference between the lower and upper bound is at most (0.0082), which
occurs for (M=5), (P=2), (N=2). Our results indicate that joint retrieval of
desired messages is more efficient than successive use of single-message
retrieval schemes.
Daniele Falavigna, Marco Matassoni, Shahab Jalalvand, Matteo Negri, Marco Turchi
Comments: Computer Speech & Language December 2016
Subjects: Computation and Language (cs.CL)
In this paper we propose to exploit the automatic Quality Estimation (QE) of
ASR hypotheses to perform the unsupervised adaptation of a deep neural network
modeling acoustic probabilities. Our hypothesis is that significant
improvements can be achieved by: i)automatically transcribing the evaluation
data we are currently trying to recognise, and ii) selecting from it a subset
of “good quality” instances based on the word error rate (WER) scores predicted
by a QE component. To validate this hypothesis, we run several experiments on
the evaluation data sets released for the CHiME-3 challenge. First, we operate
in oracle conditions in which manual transcriptions of the evaluation data are
available, thus allowing us to compute the “true” sentence WER. In this
scenario, we perform the adaptation with variable amounts of data, which are
characterised by different levels of quality. Then, we move to realistic
conditions in which the manual transcriptions of the evaluation data are not
available. In this case, the adaptation is performed on data selected according
to the WER scores “predicted” by a QE component. Our results indicate that: i)
QE predictions allow us to closely approximate the adaptation results obtained
in oracle conditions, and ii) the overall ASR performance based on the proposed
QE-driven adaptation method is significantly better than the strong, most
recent, CHiME-3 baseline.
Iñaki San Vicente, Rodrigo Agerri, German Rigau
Comments: 8 pages plus 2 pages of references
Journal-ref: Proceedings of the 14th Conference of the European Chapter of the
Association for Computational Linguistics (EACL 2014), pages 88-97,
Gothenburg, Sweden, April 26-30 2014
Subjects: Computation and Language (cs.CL)
This paper presents a simple, robust and (almost) unsupervised
dictionary-based method, qwn-ppv (Q-WordNet as Personalized PageRanking Vector)
to automatically generate polarity lexicons. We show that qwn-ppv outperforms
other automatically generated lexicons for the four extrinsic evaluations
presented here. It also shows very competitive and robust results with respect
to manually annotated ones. Results suggest that no single lexicon is best for
every task and dataset and that the intrinsic evaluation of polarity lexicons
is not a good performance indicator on a Sentiment Analysis task. The qwn-ppv
method allows to easily create quality polarity lexicons whenever no
domain-based annotated corpora are available for a given language.
Omkar Dhariya, Shrikant Malviya, Uma Shanker Tiwary
Comments: 31st International Conference on Information Networking (ICOIN-2017)
Subjects: Computation and Language (cs.CL)
In this paper, an extended combined approach of phrase based statistical
machine translation (SMT), example based MT (EBMT) and rule based MT (RBMT) is
proposed to develop a novel hybrid data driven MT system capable of
outperforming the baseline SMT, EBMT and RBMT systems from which it is derived.
In short, the proposed hybrid MT process is guided by the rule based MT after
getting a set of partial candidate translations provided by EBMT and SMT
subsystems. Previous works have shown that EBMT systems are capable of
outperforming the phrase-based SMT systems and RBMT approach has the strength
of generating structurally and morphologically more accurate results. This
hybrid approach increases the fluency, accuracy and grammatical precision which
improve the quality of a machine translation system. A comparison of the
proposed hybrid machine translation (HTM) model with renowned translators i.e.
Google, BING and Babylonian is also presented which shows that the proposed
model works better on sentences with ambiguity as well as comprised of idioms
than others.
Jonathan Herzig, Jonathan Berant
Subjects: Computation and Language (cs.CL)
A fundamental challenge in developing semantic parsers is the paucity of
strong supervision in the form of language utterances annotated with logical
form. In this paper, we propose to exploit structural regularities in language
in different domains, and train semantic parsers over multiple knowledge-bases
(KBs), while sharing information across datasets. We find that we can
substantially improve parsing accuracy by training a single
sequence-to-sequence model over multiple KBs, when providing an encoding of the
domain at decoding time. Our model achieves state-of-the-art performance on the
Overnight dataset (containing eight domains), improves performance over a
single KB baseline from 75.6% to 79.6%, while obtaining a 7x reduction in the
number of model parameters.
Zhongqing Wang, Yue Zhang
Subjects: Computation and Language (cs.CL)
We present opinion recommendation, a novel task of jointly predicting a
custom review with a rating score that a certain user would give to a certain
product or service, given existing reviews and rating scores to the product or
service by other users, and the reviews that the user has given to other
products and services. A characteristic of opinion recommendation is the
reliance of multiple data sources for multi-task joint learning, which is the
strength of neural models. We use a single neural network to model users and
products, capturing their correlation and generating customised product
representations using a deep memory network, from which customised ratings and
reviews are constructed jointly. Results show that our opinion recommendation
system gives ratings that are closer to real user ratings on Yelp.com data
compared with Yelp’s own ratings, and our methods give better results compared
to several pipelines baselines using state-of-the-art sentiment rating and
summarization systems.
Hongyu Gong, Jiaqi Mu, Suma Bhat, Pramod Viswanath
Subjects: Computation and Language (cs.CL)
Prepositions are highly polysemous, and their variegated senses encode
significant semantic information. In this paper we match each preposition’s
complement and attachment and their interplay crucially to the geometry of the
word vectors to the left and right of the preposition. Extracting such features
from the vast number of instances of each preposition and clustering them makes
for an efficient preposition sense disambigution (PSD) algorithm, which is
comparable to and better than state-of-the-art on two benchmark datasets. Our
reliance on no external linguistic resource allows us to scale the PSD
algorithm to a large WikiCorpus and learn sense-specific preposition
representations — which we show to encode semantic relations and paraphrasing
of verb particle compounds, via simple vector operations.
Jiaqi Mu, Suma Bhat, Pramod Viswanath
Subjects: Computation and Language (cs.CL); Machine Learning (stat.ML)
Real-valued word representations have transformed NLP applications, popular
examples are word2vec and GloVe, recognized for their ability to capture
linguistic regularities. In this paper, we demonstrate a very simple, and yet
counter-intuitive, postprocessing technique — eliminate the common mean vector
and a few top dominating directions from the word vectors — that renders
off-the-shelf representations even stronger. The postprocessing is empirically
validated on a variety of lexical-level intrinsic tasks (word similarity,
concept categorization, word analogy) and sentence-level extrinsic tasks
(semantic textual similarity) on multiple datasets and with a variety of
representation methods and hyperparameter choices in multiple languages, in
each case, the processed representations are consistently better than the
original ones. Furthermore, we demonstrate quantitatively in downstream
applications that neural network architectures “automatically learn” the
postprocessing operation.
Chunxi Liu, Jinyi Yang, Ming Sun, Santosh Kesiraju, Alena Rott, Lucas Ondel, Pegah Ghahremani, Najim Dehak, Lukas Burget, Sanjeev Khudanpur
Comments: 5 pages, 1 figure; Accepted for publication at ICASSP 2017
Subjects: Computation and Language (cs.CL)
Acoustic unit discovery (AUD) is a process of automatically identifying a
categorical acoustic unit inventory from speech and producing corresponding
acoustic unit tokenizations. AUD provides an important avenue for unsupervised
acoustic model training in a zero resource setting where expert-provided
linguistic knowledge and transcribed speech are unavailable. Therefore, to
further facilitate zero-resource AUD process, in this paper, we demonstrate
acoustic feature representations can be significantly improved by (i)
performing linear discriminant analysis (LDA) in an unsupervised self-trained
fashion, and (ii) leveraging resources of other languages through building a
multilingual bottleneck (BN) feature extractor to give effective cross-lingual
generalization. Moreover, we perform comprehensive evaluations of AUD efficacy
on multiple downstream speech applications, and their correlated performance
suggests that AUD evaluations are feasible using different alternative language
resources when only a subset of these evaluation resources can be available in
typical zero resource applications.
Iacer Calixto, Qun Liu, Nick Campbell
Comments: 8 pages (11 including references), 2 figures
Subjects: Computation and Language (cs.CL)
We introduce a Multi-modal Neural Machine Translation model in which a
doubly-attentive decoder naturally incorporates spatial visual features
obtained using pre-trained convolutional neural networks, bridging the gap
between image description and translation. Our decoder learns to attend to
source-language words and parts of an image independently by means of two
separate attention mechanisms as it generates words in the target language. We
find that our model can efficiently exploit not just back-translated in-domain
multi-modal data but also large general-domain text-only MT corpora. We also
report state-of-the-art results on the Multi30k data set.
Helge Holzmann, Nina Tahmasebi, Thomas Risse
Comments: IJDL 2015
Journal-ref: International Journal on Digital Libraries 2015, Volume 15, Issue
2, pp 209-235
Subjects: Computation and Language (cs.CL); Digital Libraries (cs.DL)
Advancements in technology and culture lead to changes in our language. These
changes create a gap between the language known by users and the language
stored in digital archives. It affects user’s possibility to firstly find
content and secondly interpret that content. In previous work we introduced our
approach for Named Entity Evolution Recognition~(NEER) in newspaper
collections. Lately, increasing efforts in Web preservation lead to increased
availability of Web archives covering longer time spans. However, language on
the Web is more dynamic than in traditional media and many of the basic
assumptions from the newspaper domain do not hold for Web data. In this paper
we discuss the limitations of existing methodology for NEER. We approach these
by adapting an existing NEER method to work on noisy data like the Web and the
Blogosphere in particular. We develop novel filters that reduce the noise and
make use of Semantic Web resources to obtain more information about terms. Our
evaluation shows the potentials of the proposed approach.
Helge Holzmann, Thomas Risse
Comments: Digital Libraries (JCDL) 2014, London, UK
Subjects: Computation and Language (cs.CL); Digital Libraries (cs.DL)
The evolution of named entities affects exploration and retrieval tasks in
digital libraries. An information retrieval system that is aware of name
changes can actively support users in finding former occurrences of evolved
entities. However, current structured knowledge bases, such as DBpedia or
Freebase, do not provide enough information about evolutions, even though the
data is available on their resources, like Wikipedia. Our emph{Evolution Base}
prototype will demonstrate how excerpts describing name evolutions can be
identified on these websites with a promising precision. The descriptions are
classified by means of models that we trained based on a recent analysis of
named entity evolutions on Wikipedia.
Helge Holzmann, Thomas Risse
Comments: WebSci 2014, Bloomington, IN, USA
Subjects: Computation and Language (cs.CL); Digital Libraries (cs.DL)
Accessing Web archives raises a number of issues caused by their temporal
characteristics. Additional knowledge is needed to find and understand older
texts. Especially entities mentioned in texts are subject to change. Most
severe in terms of information retrieval are name changes. In order to find
entities that have changed their name over time, search engines need to be
aware of this evolution. We tackle this problem by analyzing Wikipedia in terms
of entity evolutions mentioned in articles. We present statistical data on
excerpts covering name changes, which will be used to discover similar text
passages and extract evolution knowledge in future work.
Helge Holzmann, Thomas Risse
Comments: WISE 2014, Thessaloniki, Greece
Subjects: Computation and Language (cs.CL); Digital Libraries (cs.DL)
Working with Web archives raises a number of issues caused by their temporal
characteristics. Depending on the age of the content, additional knowledge
might be needed to find and understand older texts. Especially facts about
entities are subject to change. Most severe in terms of information retrieval
are name changes. In order to find entities that have changed their name over
time, search engines need to be aware of this evolution. We tackle this problem
by analyzing Wikipedia in terms of entity evolutions mentioned in articles
regardless the structural elements. We gathered statistics and automatically
extracted minimum excerpts covering name changes by incorporating lists
dedicated to that subject. In future work, these excerpts are going to be used
to discover patterns and detect changes in other sources. In this work we
investigate whether or not Wikipedia is a suitable source for extracting the
required knowledge.
Maria Nadejde, Siva Reddy, Rico Sennrich, Tomasz Dwojak, Marcin Junczys-Dowmunt, Philipp Koehn, Alexandra Birch
Subjects: Computation and Language (cs.CL)
Neural machine translation (NMT) models are able to partially learn syntactic
information from sequential lexical information. Still, some complex syntactic
phenomena such as prepositional phrase attachment are poorly modeled. This work
aims to answer two questions: 1) Does explicitly modeling source or target
language syntax help NMT? 2) Is tight integration of words and syntax better
than multitask training? We introduce syntactic information in the form of CCG
supertags either in the source as an extra feature in the embedding, or in the
target, by interleaving the target supertags with the word sequence. Our
results on WMT data show that explicitly modeling syntax improves machine
translation quality for English-German, a high-resource pair, and for
English-Romanian, a low-resource pair and also several syntactic phenomena
including prepositional phrase attachment. Furthermore, a tight coupling of
words and syntax improves translation quality more than multitask training.
Yong Wang, Ya Wei Zhao
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
This paper made a short review of Cloud Computing and Big Data, and discussed
the portability of general data mining algorithms to Cloud Computing platform.
It revealed the Cloud Computing platform based on Map-Reduce cannot solve all
the Big Data and data mining problems. Transplanting the general data mining
algorithms to the real-time Cloud Computing platform will be one of the
research focuses in Cloud Computing and Big Data.
Ashraf A. Shahin
Comments: this http URL
Journal-ref: International Journal of Advanced Computer Science and
Applications(IJACSA), 8(1), 2017
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Elasticity is one of key features of cloud computing. Elasticity allows
Software as a Service (SaaS) applications’ provider to reduce cost of running
applications. In large SaaS applications that are developed using
service-oriented architecture model, each service is deployed in a separated
virtual machine and may use one or more services to complete its task.
Although, scaling service independently from its required services propagates
scaling problem to other services, most of current elasticity approaches do not
consider functional dependencies between services, which increases the
probability of violating service level agreement. In this paper, architecture
of SaaS application is modeled as multi-class M/M/m processor sharing queuing
model with deadline to take into account functional dependencies between
services during estimating required scaling resources. Experimental results
show effectiveness of the proposed model in estimating required resources
during scaling virtual resources.
Yinhao Lu, Buyang Cao, Fred Glover
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Social and Information Networks (cs.SI)
The well-known K-means clustering algorithm has been employed widely in
different application domains ranging from data analytics to logistics
applications. However, the K-means algorithm can be affected by factors such as
the initial choice of centroids and can readily become trapped in a local
optimum. In this paper, we propose an improved K-means clustering algorithm
that is augmented by a Tabu Search strategy, and which is better adapted to
meet the needs of big data applications. Our design is further enhanced to take
advantage of parallel processing based on the Spark framework. Computational
experiments demonstrate the superiority of our Tabu Search based clustering
algorithm over a widely used version of the K-means approach embodied in Spark
MLlib, comparing the algorithms in terms of scalability, accuracy, and
effectiveness.
Peter Sanders, Christian Schulz, Darren Strash, Robert Williger
Comments: arXiv admin note: text overlap with arXiv:1509.01190, arXiv:1110.0477
Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
Computing high quality node separators in large graphs is necessary for a
variety of applications, ranging from divide-and-conquer algorithms to VLSI
design. In this work, we present a novel distributed evolutionary algorithm
tackling the k-way node separator problem. A key component of our contribution
includes new k-way local search algorithms based on maximum flows. We combine
our local search with a multilevel approach to compute an initial population
for our evolutionary algorithm, and further show how to modify the coarsening
stage of our multilevel algorithm to create effective combine and mutation
operations. Lastly, we combine these techniques with a scalable communication
protocol, producing a system that is able to compute high quality solutions in
a short amount of time. Our experiments against competing algorithms show that
our advanced evolutionary algorithm computes the best result on 94% of the
chosen benchmark instances.
Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville
Comments: Accepted into ICLR 2017
Subjects: Learning (cs.LG)
In this paper, we propose to equip Generative Adversarial Networks with the
ability to produce direct energy estimates for samples.Specifically, we propose
a flexible adversarial training framework, and prove this framework not only
ensures the generator converges to the true data distribution, but also enables
the discriminator to retain the density information at the global optimal. We
derive the analytic form of the induced solution, and analyze the properties.
In order to make the proposed framework trainable in practice, we introduce two
effective approximation techniques. Empirically, the experiment results closely
match our theoretical analysis, verifying the discriminator is able to recover
the energy of data distribution.
Qiuyan Yan, Shixiong Xia, Fanrong Meng
Subjects: Learning (cs.LG)
Class imbalance is one of the challenging problems for machine learning in
many real-world applications, such as coal and gas burst accident monitoring:
the burst premonition data is extreme smaller than the normal data, however,
which is the highlight we truly focus on. Cost-sensitive adjustment approach is
a typical algorithm-level method resisting the data set imbalance. For SVMs
classifier, which is modified to incorporate varying penalty parameter(C) for
each of considered groups of examples. However, the C value is determined
empirically, or is calculated according to the evaluation metric, which need to
be computed iteratively and time consuming. This paper presents a novel
cost-sensitive SVM method whose penalty parameter C optimized on the basis of
cluster probability density function(PDF) and the cluster PDF is estimated only
according to similarity matrix and some predefined hyper-parameters.
Experimental results on various standard benchmark data sets and real-world
data with different ratios of imbalance show that the proposed method is
effective in comparison with commonly used cost-sensitive techniques.
Piotr Szymański
Subjects: Learning (cs.LG); Mathematical Software (cs.MS)
Scikit-multilearn is a Python library for performing multi-label
classification. The library is compatible with the scikit/scipy ecosystem and
uses sparse matrices for all internal operations. It provides native Python
implementations of popular multi-label classification methods alongside novel
framework for label space partitioning and division. It includes graph-based
community detection methods that utilize the powerful igraph library for
extracting label dependency information. In addition its code is well test
covered and follows PEP8. Source code and documentation can be downloaded from
this http URL and also via pip. The library follows scikit’s BSD licencing
scheme.
Bas van Stein, Hao Wang, Wojtek Kowalczyk, Michael Emmerich, Thomas Bäck
Comments: Submitted to IEEE Computational Intelligence Magazine for review
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Kriging or Gaussian Process Regression is applied in many fields as a
non-linear regression model as well as a surrogate model in the field of
evolutionary computation. However, the computational and space complexity of
Kriging, that is cubic and quadratic in the number of data points respectively,
becomes a major bottleneck with more and more data available nowadays. In this
paper, we propose a general methodology for the complexity reduction, called
cluster Kriging, where the whole data set is partitioned into smaller clusters
and multiple Kriging models are built on top of them. In addition, four Kriging
approximation algorithms are proposed as candidate algorithms within the new
framework. Each of these algorithms can be applied to much larger data sets
while maintaining the advantages and power of Kriging. The proposed algorithms
are explained in detail and compared empirically against a broad set of
existing state-of-the-art Kriging approximation methods on a well-defined
testing framework. According to the empirical study, the proposed algorithms
consistently outperform the existing algorithms. Moreover, some practical
suggestions are provided for using the proposed algorithms.
Dolev Raviv, Margarita Osadchy
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Deep Learning (DL) methods show very good performance when trained on large,
balanced data sets. However, many practical problems involve imbalanced data
sets, or/and classes with a small number of training samples. The performance
of DL methods as well as more traditional classifiers drops significantly in
such settings. Most of the existing solutions for imbalanced problems focus on
customizing the data for training. A more principled solution is to use mixed
Hinge-Minimax risk [19] specifically designed to solve binary problems with
imbalanced training sets. Here we propose a Latent Hinge Minimax (LHM) risk and
a training algorithm that generalizes this paradigm to an ensemble of
hyperplanes that can form arbitrary complex, piecewise linear boundaries. To
extract good features, we combine LHM model with CNN via transfer learning. To
solve multi-class problem we map pre-trained category-specific LHM classifiers
to a multi-class neural network and adjust the weights with very fast tuning.
LHM classifier enables the use of unlabeled data in its training and the
mapping allows for multi-class inference, resulting in a classifier that
performs better than alternatives when trained on a small number of training
samples.
Jessica Gliozzo
Comments: MSc Thesis, Advisor: G. Valentini, Co-Advisors: A. Paccanaro and M. Re, 92 pages, 36 figures, 10 tables
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
In this thesis we present the novel semi-supervised network-based algorithm
P-Net, which is able to rank and classify patients with respect to a specific
phenotype or clinical outcome under study. The peculiar and innovative
characteristic of this method is that it builds a network of samples/patients,
where the nodes represent the samples and the edges are functional or genetic
relationships between individuals (e.g. similarity of expression profiles), to
predict the phenotype under study. In other words, it constructs the network in
the “sample space” and not in the “biomarker space” (where nodes represent
biomolecules (e.g. genes, proteins) and edges represent functional or genetic
relationships between nodes), as usual in state-of-the-art methods. To assess
the performances of P-Net, we apply it on three different publicly available
datasets from patients afflicted with a specific type of tumor: pancreatic
cancer, melanoma and ovarian cancer dataset, by using the data and following
the experimental set-up proposed in two recently published papers [Barter et
al., 2014, Winter et al., 2012]. We show that network-based methods in the
“sample space” can achieve results competitive with classical supervised
inductive systems. Moreover, the graph representation of the samples can be
easily visualized through networks and can be used to gain visual clues about
the relationships between samples, taking into account the phenotype associated
or predicted for each sample. To our knowledge this is one of the first works
that proposes graph-based algorithms working in the “sample space” of the
biomolecular profiles of the patients to predict their phenotype or outcome,
thus contributing to a novel research line in the framework of the Network
Medicine.
Minnan Luo, Xiaojun Chang, Yi Yang, Liqiang Nie, Alexander G. Hauptmann, Qinghua Zheng
Comments: 36 pages
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
The heterogeneity-gap between different modalities brings a significant
challenge to multimedia information retrieval. Some studies formalize the
cross-modal retrieval tasks as a ranking problem and learn a shared multi-modal
embedding space to measure the cross-modality similarity. However, previous
methods often establish the shared embedding space based on linear mapping
functions which might not be sophisticated enough to reveal more complicated
inter-modal correspondences. Additionally, current studies assume that the
rankings are of equal importance, and thus all rankings are used
simultaneously, or a small number of rankings are selected randomly to train
the embedding space at each iteration. Such strategies, however, always suffer
from outliers as well as reduced generalization capability due to their lack of
insightful understanding of procedure of human cognition. In this paper, we
involve the self-paced learning theory with diversity into the cross-modal
learning to rank and learn an optimal multi-modal embedding space based on
non-linear mapping functions. This strategy enhances the model’s robustness to
outliers and achieves better generalization via training the model gradually
from easy rankings by diverse queries to more complex ones. An efficient
alternative algorithm is exploited to solve the proposed challenging problem
with fast convergence in practice. Extensive experimental results on several
benchmark datasets indicate that the proposed method achieves significant
improvements over the state-of-the-arts in this literature.
Wenshuo Wang, Ding Zhao, Junqiang Xi, Wei Han
Comments: 12 pages, 13 figures, Journal
Subjects: Learning (cs.LG); Systems and Control (cs.SY)
Misunderstanding of driver correction behaviors (DCB) is the primary reason
for false warnings of lane-departure-prediction systems. We propose a
learning-based approach to predicting unintended lane-departure behaviors (LDB)
and the chance for drivers to bring the vehicle back to the lane. First, in
this approach, a personalized driver model for lane-departure and lane-keeping
behavior is established by combining the Gaussian mixture model and the hidden
Markov model. Second, based on this model, we develop an online model-based
prediction algorithm to predict the forthcoming vehicle trajectory and judge
whether the driver will demonstrate an LDB or a DCB. We also develop a warning
strategy based on the model-based prediction algorithm that allows the
lane-departure warning system to be acceptable for drivers according to the
predicted trajectory. In addition, the naturalistic driving data of 10 drivers
is collected through the University of Michigan Safety Pilot Model Deployment
program to train the personalized driver model and validate this approach. We
compare the proposed method with a basic time-to-lane-crossing (TLC) method and
a TLC-directional sequence of piecewise lateral slopes (TLC-DSPLS) method. The
results show that the proposed approach can reduce the false-warning rate to
3.07\%.
Shixia Liu, Xiting Wang, Mengchen Liu, Jun Zhu
Comments: This article will be published in Visual Infomatics
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Interactive model analysis, the process of understanding, diagnosing, and
refining a machine learning model with the help of interactive visualization,
is very important for users to efficiently solve real-world artificial
intelligence and data mining problems. Dramatic advances in big data analytics
has led to a wide variety of interactive model analysis tasks. In this paper,
we present a comprehensive analysis and interpretation of this rapidly
developing area. Specifically, we classify the relevant work into three
categories: understanding, diagnosis, and refinement. Each category is
exemplified by recent influential work. Possible future research opportunities
are also explored and discussed.
Zhengbing Hu, Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Viktoriia O. Samitova
Comments: International Journal of Intelligent Systems and Applications(IJISA), Vol. 9, No. 2, February 2017
Subjects: Learning (cs.LG)
A task of clustering data given in the ordinal scale under conditions of
overlapping clusters has been considered. It’s proposed to use an approach
based on memberhsip and likelihood functions sharing. A number of performed
experiments proved effectiveness of the proposed method. The proposed method is
characterized by robustness to outliers due to a way of ordering values while
constructing membership functions.
Gregory Kahn, Adam Villaflor, Vitchyr Pong, Pieter Abbeel, Sergey Levine
Subjects: Learning (cs.LG); Robotics (cs.RO)
Reinforcement learning can enable complex, adaptive behavior to be learned
automatically for autonomous robotic platforms. However, practical deployment
of reinforcement learning methods must contend with the fact that the training
process itself can be unsafe for the robot. In this paper, we consider the
specific case of a mobile robot learning to navigate an a priori unknown
environment while avoiding collisions. In order to learn collision avoidance,
the robot must experience collisions at training time. However, high-speed
collisions, even at training time, could damage the robot. A successful
learning method must therefore proceed cautiously, experiencing only low-speed
collisions until it gains confidence. To this end, we present an
uncertainty-aware model-based learning algorithm that estimates the probability
of collision together with a statistical estimate of uncertainty. By
formulating an uncertainty-dependent cost function, we show that the algorithm
naturally chooses to proceed cautiously in unfamiliar environments, and
increases the velocity of the robot in settings where it has high confidence.
Our predictive model is based on bootstrapped neural networks using dropout,
allowing it to process raw sensory inputs from high-bandwidth sensors such as
cameras. Our experimental evaluation demonstrates that our method effectively
minimizes dangerous collisions at training time in an obstacle avoidance task
for a simulated and real-world quadrotor, and a real-world RC car. Videos of
the experiments can be found at this https URL
Zeeshan Khawar Malik, Mo Kobrosli, Peter Maas
Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)
Deep Neural Networks, and specifically fully-connected convolutional neural
networks are achieving remarkable results across a wide variety of domains.
They have been trained to achieve state-of-the-art performance when applied to
problems such as speech recognition, image classification, natural language
processing and bioinformatics. Most of these deep learning models when applied
to classification employ the softmax activation function for prediction and aim
to minimize cross-entropy loss. In this paper, we have proposed a supervised
model for dominant category prediction to improve search recall across all eBay
classifieds platforms. The dominant category label for each query in the last
90 days is first calculated by summing the total number of collaborative clicks
among all categories. The category having the highest number of collaborative
clicks for the given query will be considered its dominant category. Second,
each query is transformed to a numeric vector by mapping each unique word in
the query document to a unique integer value; all padded to equal length based
on the maximum document length within the pre-defined vocabulary size. A
fully-connected deep convolutional neural network (CNN) is then applied for
classification. The proposed model achieves very high classification accuracy
compared to other state-of-the-art machine learning techniques.
K. Mills, M. Spanner, I. Tamblyn
Subjects: Materials Science (cond-mat.mtrl-sci); Learning (cs.LG); Chemical Physics (physics.chem-ph)
We have trained a deep (convolutional) neural network to predict the
ground-state energy of an electron in four classes of confining two-dimensional
electrostatic potentials. On randomly generated potentials, for which there is
no analytic form for either the potential or the ground-state energy, the
neural network model was able to predict the ground-state energy to within
chemical accuracy, with a median absolute error of 1.49 MHa. We also
investigate the performance of the model in predicting other quantities such as
the kinetic energy and the first excited-state energy of random potentials.
While we demonstrated this approach on a simple, tractable problem, the
transferability and excellent performance of the resulting model suggests
further applications of deep neural networks to problems of electronic
structure.
Arya Mazumdar, Barna Saha
Comments: Appears in AAAI-17
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Learning (cs.LG)
Entity resolution (ER) is the task of identifying all records in a database
that refer to the same underlying entity, and are therefore duplicates of each
other. Due to inherent ambiguity of data representation and poor data quality,
ER is a challenging task for any automated process. As a remedy, human-powered
ER via crowdsourcing has become popular in recent years. Using crowd to answer
queries is costly and time consuming. Furthermore, crowd-answers can often be
faulty. Therefore, crowd-based ER methods aim to minimize human participation
without sacrificing the quality and use a computer generated similarity matrix
actively. While, some of these methods perform well in practice, no theoretical
analysis exists for them, and further their worst case performances do not
reflect the experimental findings. This creates a disparity in the
understanding of the popular heuristics for this problem. In this paper, we
make the first attempt to close this gap. We provide a thorough analysis of the
prominent heuristic algorithms for crowd-based ER. We justify experimental
observations with our analysis and information theoretic lower bounds.
Shumeet Baluja, Michele Covell, Rahul Sukthankar
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Systems and Control (cs.SY)
Real-time optimization of traffic flow addresses important practical
problems: reducing a driver’s wasted time, improving city-wide efficiency,
reducing gas emissions and improving air quality. Much of the current research
in traffic-light optimization relies on extending the capabilities of traffic
lights to either communicate with each other or communicate with vehicles.
However, before such capabilities become ubiquitous, opportunities exist to
improve traffic lights by being more responsive to current traffic situations
within the current, already deployed, infrastructure. In this paper, we
introduce a traffic light controller that employs bidding within micro-auctions
to efficiently incorporate traffic sensor information; no other outside sources
of information are assumed. We train and test traffic light controllers on
large-scale data collected from opted-in Android cell-phone users over a period
of several months in Mountain View, California and the River North neighborhood
of Chicago, Illinois. The learned auction-based controllers surpass (in both
the relevant metrics of road-capacity and mean travel time) the currently
deployed lights, optimized static-program lights, and longer-term planning
approaches, in both cities, measured using real user driving data.
Lennart Gulikers, Marc Lelarge, Laurent Massoulié
Subjects: Probability (math.PR); Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
Motivated by community detection, we characterise the spectrum of the
non-backtracking matrix (B) in the Degree-Corrected Stochastic Block Model.
Specifically, we consider a random graph on (n) vertices partitioned into two
equal-sized clusters. The vertices have i.i.d. weights ({ phi_u }_{u=1}^n)
with second moment (Phi^{(2)}). The intra-cluster connection probability for
vertices (u) and (v) is (frac{phi_u phi_v}{n}a) and the inter-cluster
connection probability is (frac{phi_u phi_v}{n}b).
We show that with high probability, the following holds: The leading
eigenvalue of the non-backtracking matrix (B) is asymptotic to (
ho =
frac{a+b}{2} Phi^{(2)}). The second eigenvalue is asymptotic to (mu_2 =
frac{a-b}{2} Phi^{(2)}) when (mu_2^2 >
ho), but asymptotically bounded by
(sqrt{
ho}) when (mu_2^2 leq
ho). All the remaining eigenvalues are
asymptotically bounded by (sqrt{
ho}). As a result, a clustering
positively-correlated with the true communities can be obtained based on the
second eigenvector of (B) in the regime where (mu_2^2 >
ho.)
In a previous work we obtained that detection is impossible when (mu_2^2 <
ho,) meaning that there occurs a phase-transition in the sparse regime of the
Degree-Corrected Stochastic Block Model.
As a corollary, we obtain that Degree-Corrected ErdH{o}s-R’enyi graphs
asymptotically satisfy the graph Riemann hypothesis, a quasi-Ramanujan
property.
A by-product of our proof is a weak law of large numbers for
local-functionals on Degree-Corrected Stochastic Block Models, which could be
of independent interest.
Karim Banawan, Sennur Ulukus
Comments: Submitted to IEEE Transactions on Information Theory, February 2017
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Information Retrieval (cs.IR)
We consider the problem of multi-message private information retrieval (MPIR)
from (N) non-communicating replicated databases. In MPIR, the user is
interested in retrieving (P) messages out of (M) stored messages without
leaking the identity of the retrieved messages. The information-theoretic sum
capacity of MPIR (C_s^P) is the maximum number of desired message symbols that
can be retrieved privately per downloaded symbol. For the case (P geq
frac{M}{2}), we determine the exact sum capacity of MPIR as
(C_s^P=frac{1}{1+frac{M-P}{PN}}). The achievable scheme in this case is based
on downloading MDS-coded mixtures of all messages. For (P leq frac{M}{2}), we
develop lower and upper bounds for all (M,P,N). These bounds match if the total
number of messages (M) is an integer multiple of the number of desired messages
(P), i.e., (frac{M}{P} in mathbb{N}). In this case,
(C_s^P=frac{1-frac{1}{N}}{1-(frac{1}{N})^{M/P}}). The achievable scheme in
this case generalizes the single-message capacity achieving scheme to have
unbalanced number of stages per round of download. For all the remaining cases,
the difference between the lower and upper bound is at most (0.0082), which
occurs for (M=5), (P=2), (N=2). Our results indicate that joint retrieval of
desired messages is more efficient than successive use of single-message
retrieval schemes.
Anoosheh Heidarzadeh, Alex Sprintson
Subjects: Information Theory (cs.IT)
This paper considers the problem of designing maximum distance separable
(MDS) codes over small fields with constraints on the support of their
generator matrices. For any given (m imes n) binary matrix (M), the GM-MDS
conjecture, proposed by Dau et al., states that if (M) satisfies the so-called
MDS condition, then for any field (mathbb{F}) of size (qgeq n+m-1), there
exists an ([n,m]_q) MDS code whose generator matrix (G), with entries in
(mathbb{F}), fits the matrix (M) (i.e., (M) is the support matrix of (G)).
Despite all the attempts by the coding theory community, this conjecture
remains still open in general. It was shown, independently by Yan et al. and
Dau et al., that the GM-MDS conjecture holds if the following conjecture,
referred to as the TM-MDS conjecture, holds: if (M) satisfies the MDS
condition, then the determinant of a transform matrix (T), such that (TV) fits
(M), is not identically zero, where (V) is a Vandermonde matrix with distinct
parameters. In this work, we first reformulate the TM-MDS conjecture in terms
of the Wronskian determinant, and then present an algebraic-combinatorial
approach based on polynomial-degree reduction for proving this conjecture. Our
proof technique’s strength is based primarily on reducing inherent
combinatorics in the proof. We demonstrate the strength of our technique by
proving the TM-MDS conjecture for the cases where the number of rows ((m)) of
(M) is upper bounded by (5). For this class of special cases of (M) where the
only additional constraint is on (m), only cases with (mleq 4) were previously
proven theoretically, and the previously used proof techniques are not
applicable to cases with (m > 4).
Mudasar Bacha, Yueping Wu, Bruno Clerckx
Comments: Accepted for publication in IEEE Transactions on Wireless Communications
Subjects: Information Theory (cs.IT)
In order to improve the uplink performance of future cellular networks, the
idea to decouple the downlink (DL) and uplink (UL) association has recently
been shown to provide significant gain in terms of both coverage and rate
performance. However, all the work is limited to SISO network. Therefore, to
study the gain provided by the DL and UL decoupling in multi-antenna base
stations (BSs) setup, we study a two tier heterogeneous network consisting of
multi-antenna BSs, and single antenna user equipments (UEs). We use maximal
ratio combining (MRC) as a linear receiver at the BSs and using tools from
stochastic geometry, we derive tractable expressions for both signal to
interference ratio (SIR) coverage probability and rate coverage probability. We
observe that as the disparity in the beamforming gain of both tiers increases,
the gain in term of SIR coverage probability provided by the decoupled
association over non-decoupled association decreases. We further observe that
when there is asymmetry in the number of antennas of both tier, then we need
further biasing towards femto-tier on the top of decoupled association to
balance the load and get optimal rate coverage probability.
Kilian Roth, Josef A. Nossek
Subjects: Information Theory (cs.IT)
For future mmWave mobile communication systems the use of analog/hybrid
beamforming is envisioned be a key as- pect. The synthesis of beams is a key
technology of enable the best possible operation during beamsearch, data
transmission and MU MIMO operation. The developed method for synthesizing beams
is based on previous work in radar technology considering only phase array
antennas. With this technique it is possible to generate a desired beam of any
shape with the constraints of the desired target transceiver antenna frontend.
It is not constraint to a certain antenna array geometry, but can handle 1d, 2d
and even 3d antenna array geometries like cylindric arrays. The numerical
examples show that the method can synthesize beams by considering a user
defined tradeoff between gain, transition width and passband ripples.
Enrico Piovano, Hamdi Joudeh, Bruno Clerckx
Comments: Submitted to ISIT 2017
Subjects: Information Theory (cs.IT)
This work investigates the interplay of coded caching and spatial
multiplexing in an overloaded Multiple-Input-Single-Output (MISO) Broadcast
Channel (BC), i.e. a system where the number of users is greater than the
number of transmitting antennas. On one hand, coded caching uses the aggregate
global cache memory of the users to create multicasting opportunities. On the
other hand, the multiple antennas at the transmitter leverages the available
CSIT to transmit multiple streams simultaneously. In this paper, we introduce a
novel scheme which combines both the gain derived from coded-caching and
spatial multiplexing and outperforms existing schemes in terms of delivery time
and CSIT requirement.
Anna Guerra, Francesco Guidi, Davide Dardari
Subjects: Information Theory (cs.IT)
Next generation cellular networks will experience the combination of
femtocells, millimeter-wave (mm-wave) communications and massive antenna
arrays. Thanks to the beamforming capability as well as the high angular
resolution provided by massive arrays, only one single access point (AP) acting
as an anchor node could be used for localization estimation, thus avoiding
over-sized infrastructures dedicated to positioning. In this context, our paper
aims at investigating the localization and orientation performance limits
employing massive arrays both at the AP and mobile side. In particular, we
propose a comparison between MIMO and beamforming in terms of array structure,
synchronization error and multipath components. Among different array
configurations, we consider also random beamforming as a trade-off between the
high diversity gain of MIMO and the high directivity guaranteed by phased
arrays. By evaluating the Cramer-Rao bound (CRB) for the different array
configurations, results show the interplay between diversity and beamforming
gain as well as the benefits achievable by varying the number of array elements
in terms of localization accuracy.
Maciej Skorski
Subjects: Information Theory (cs.IT); Computational Complexity (cs.CC)
This paper studies the complexity of estimating Renyi divergences of a
distribution (p) observed from samples, with respect to a baseline distribution
(q) known emph{a priori}. Extending the results of Acharya et al. (SODA’15) on
estimating Renyi entropy, we present improved estimation techniques together
with upper and lower bounds on the sample complexity.
We show that, contrarily to estimating Renyi entropy where a sublinear (in
the alphabet size) number of samples suffices, the sample complexity is heavily
dependent on emph{events occurring unlikely} in (q), and is unbounded in
general (no matter what an estimation technique is used). For any divergence of
order bigger than (1), we provide upper and lower bounds on the number of
samples dependent on probabilities of (p) and (q). We conclude that the
worst-case sample complexity is polynomial in the alphabet size if and only if
the probabilities of (q) are non-negligible.
This gives theoretical insights into heuristics used in applied papers to
handle numerical instability, which occurs for small probabilities of (q). Our
result explains that small probabilities should be handled with care not only
because of numerical issues, but also because of a blow up in sample
complexity.
Sudarshan Guruacharya, Ekram Hossain
Subjects: Information Theory (cs.IT)
Ambient energy harvesting is touted as a low cost solution to prolong the
life of low-powered devices, reduce the carbon footprint, and make the system
self-sustainable. Most research to date have focused either on the physical
aspects of energy conversion process or on optimal consumption policy of the
harvested energy at the system level. However, although intuitively understood,
to the best of our knowledge, the idea of self-sustainability is yet to be made
precise and studied as a performance metric. In this paper, we provide a
mathematical definition of the concept of self-sustainability of an energy
harvesting system, based on the complementary idea of eventual outage. In
particular, we analyze the harvest-store-consume system with infinite battery
capacity, stochastic energy arrivals, and fixed energy consumption rate. Using
the random walk theory, we identify the necessary condition for the system to
be self-sustainable. General formulas are given for the self-sustainability
probability in the form of integral equations. Since these integral equations
are difficult to solve analytically, an exponential upper bound for eventual
outage probability is given using martingales. This bound guarantees that the
eventual outage probability can be made arbitrarily small simply by increasing
the initial battery energy. We also give an asymptotic formula for eventual
outage. For the special case when the energy arrival follows a Poisson process,
we are able to find the exact formulas for the eventual outage probability. We
also show that the harvest-store-consume system is mathematically equivalent to
a (GI/G/1) queueing system, which allows us to easily find the outage
probability, in case the necessary condition for self-sustainability is
violated. Monte-Carlo simulations verify our analysis.
Arash Shahmansoori, Gabriel E. Garcia, Giuseppe Destino, Gonzalo Seco-Granados, Henk Wymeersch
Comments: 27 pages, 8 figures, journal paper
Subjects: Information Theory (cs.IT)
Millimeter wave signals and large antenna arrays are considered enabling
technologies for future 5G networks. While their benefits for achieving
high-data rate communications are well-known, their potential advantages for
accurate positioning are largely undiscovered. We derive the Cram'{e}r-Rao
bound (CRB) on position and rotation angle estimation uncertainty from
millimeter wave signals from a single transmitter, in the presence of
scatterers. We also present a novel two-stage algorithm for position and
rotation angle estimation that attains the CRB for average to high
signal-to-noise ratio. The algorithm is based on multiple measurement vectors
matching pursuit for coarse estimation, followed by a refinement stage based on
the space-alternating generalized expectation maximization algorithm. We find
that accurate position and rotation angle estimation is possible using signals
from a single transmitter, in either line-of- sight, non-line-of-sight, or
obstructed-line-of-sight conditions.
Robin A. A. Ince
Subjects: Information Theory (cs.IT); Statistics Theory (math.ST); Neurons and Cognition (q-bio.NC); Quantitative Methods (q-bio.QM); Methodology (stat.ME)
Obtaining meaningful quantitative descriptions of the statistical dependence
within multivariate systems is a difficult open problem. Recently, the Partial
Information Decomposition (PID) framework was proposed to decompose mutual
information about a target variable within a set of predictor variables into
components which are redundant, unique and synergistic within different subsets
of predictors. However, the details of how to implement this framework in
practice are still debated. Here, we propose to apply the elegant formalism of
the PID to multivariate entropy, resulting in a Partial Entropy Decomposition
(PED). We implement the PED with an entropy redundancy measure based on
pointwise common surprisal; a natural definition which is closely related to
the definition of mutual information. We show how this approach can reveal the
dyadic vs triadic generative structure of multivariate systems that are
indistinguishable with classical Shannon measures. The entropy perspective also
shows that misinformation is synergistic entropy and hence that mutual
information itself includes both redundant and synergistic effects. We show the
relationships between the PED and mutual information in two predictors, and
derive two alternative information decompositions, which we illustrate on
several example systems. The new perspective provided by these developments
helps to clarify some of the difficulties encountered with the PID approach and
the resulting decompositions provide useful tools for practical data analysis
across a range of application areas.
Abubakr O. Al-Abbasi, Ridha Hamila, Waheed U. Bajwa, Naofal Al-Dhahir
Comments: 11 pages, 8 figures, IEEE Trans. On Wireless Communications. arXiv admin note: substantial text overlap with arXiv:1603.00160
Subjects: Information Theory (cs.IT)
In this paper, we propose a general framework that transforms the problems of
designing sparse finite-impulseresponse linear equalizers and non-linear
decision-feedback equalizers, for multiple antenna systems, into the problem of
sparsestapproximation of a vector in different dictionaries. In addition, we
investigate several choices of the sparsifying dictionaries under this
framework. Furthermore, the worst-case coherences of these dictionaries, which
determine their sparsifying effectiveness, are analytically and/or numerically
evaluated. Moreover, we show how to reduce the computational complexity of the
designed sparse equalizer filters by exploiting the asymptotic equivalence of
Toeplitz and circulant matrices. Finally, the superiority of our proposed
framework over conventional methods is demonstrated through numerical
experiments.
Shanxiang Lyu, Antonio Campello, Cong Ling, Jean-Claude Belfiore
Comments: submitted to ISIT2017
Subjects: Information Theory (cs.IT)
Previous approaches of compute and forward (C&F) are mainly based on static
channel model, where the channel coefficients stay fixed during the whole
transmission time span. In this work, we investigate the C&F strategy under
block fading channels. Our technique is to design codes using construction A
over rings, so as to allow better quantization for the channels. Advantages in
terms of decoding error probabilities and computation rates are presented, and
the construction is shown to strictly outperform, in this scenario, the
compute-and-forward strategy over the integers (mathbb{Z}).
Mohammad. Moltafet, Nader. Mokari, Mohammad R. Javan, Paiez. Azmi
Comments: 5 pages, 2 figures
Subjects: Information Theory (cs.IT)
In this paper, the performance and system complexity of the candidate
multiple access (MA) techniques for the next generation of cellular systems,
namely, non-orthogonal multiple access (NOMA) (in this paper, we consider power
domain MA as NOMA) and sparse code multiple access (SCMA), are investigated. To
this end, for each MA technique, a resource allocation problem considering
heterogeneous cellular networks (HetNet) is formulated. We apply successive
convex approximation (SCA) method to each problem and obtain their solutions.
The simulation results show that SCMA-based system achieves better performance
than NOMA-based one at the cost of more complexity.
Liang Liu, Ping Wei
Subjects: Information Theory (cs.IT)
In this letter, we apply previous array receiver architecture which employs
time-domain sub-Nyquist sampling techniques to jointly estimate frequency and
direction-of-arrival(DOA) of narrowband far-field signals. Herein, a more
general situation is taken into consideration, where there may be more than one
signal in a subband. We build time-space union model, analyze the
identification of the model, and give the maximum signal number which can be
classified. We also proof that the Cramer-Rao Bound (CRB) is lower than that of
which employs Nyquist sampling. Simulation results verify the capacity to
estimate the number of sources. Meanwhile, simulations show that our estimation
performance closely matches the CRB and is superior for more sources than
sensors, especially when the minimum redundancy array (MRA) is employed.
Farhad Shirani, S. Sandeep Pradhan
Subjects: Information Theory (cs.IT)
We investigate binary block-codes (BBC). A BBC is defined as a vector of
Boolean functions. We consider BBCs which are generated randomly, and using
single-letter distributions. We characterize the vector of dependency spectrums
of these BBCs. We use this vector to upper-bound the correlation between the
outputs of two distributed BBCs. Finally, the upper-bound is used to show that
the large blocklength single-letter coding schemes in the literature are
sub-optimal in some multiterminal communication settings.
Farhad Shirani, S. Sandeep Pradhan
Subjects: Information Theory (cs.IT)
In this paper, we establish a new inequality tying together the effective
length and the maximum correlation between the outputs of an arbitrary pair of
Boolean functions which operate on two sequences of correlated random
variables. We derive a new upper-bound on the correlation between the outputs
of these functions. The upper-bound is useful in various disciplines which deal
with common-information. We build upon Witsenhausen’s bound on
maximum-correlation. The previous upper-bound did not take the effective length
of the Boolean functions into account.
Morgane Austern, Arian Maleki
Subjects: Information Theory (cs.IT)
Let ( K(X_1, ldots, X_n)) and (H(X_n | X_{n-1}, ldots, X_1)) denote the
Kolmogorov complexity and Shannon’s entropy rate of a stationary and ergodic
process ({X_i}_{i=-infty}^infty). It has been proved that [ frac{K(X_1,
ldots, X_n)}{n} – H(X_n | X_{n-1}, ldots, X_1)
ightarrow 0, ] almost
surely. This paper studies the convergence rate of this asymptotic result. In
particular, we show that if the process satisfies certain mixing conditions,
then there exists (sigma<infty) such that
()sqrt{n}left(frac{K(X_{1:n})}{n}- H(X_0|X_1,dots,X_{-infty})
ight)
ightarrow_d N(0,sigma^2).() Furthermore, we show that under slightly
stronger mixing conditions one may obtain non-asymptotic concentration bounds
for the Kolmogorov complexity.
Stanislav Kruglik, Alexey Frolov
Comments: ISIT 2017 submission, 5 pages, 3 figures
Subjects: Information Theory (cs.IT)
We investigate the distance properties of linear locally recoverable codes
(LRC codes) with all-symbol locality and availability. New upper and lower
bounds on the minimum distance of such codes are derived. The upper bound is
based on the shortening method and improves existing shortening bounds. To
reduce the gap in between upper and lower bounds we do not restrict the
alphabet size and propose explicit constructions of codes with locality and
availability via rank-metric codes. The first construction relies on expander
graphs and is better in low rate region, the second construction utilizes LRC
codes developed by Wang et al. as inner codes and better in high rate region.
Shakeel Salamat Ullah, Gianluigi Liva, Soung Chang Liew
Comments: Submitted to IEEE International Symposium on Information Theory (ISIT), 2017
Subjects: Information Theory (cs.IT)
In this work, we derive the random coding error exponent for the uplink phase
of a two-way relay system where physical layer network coding (PNC) is
employed. The error exponent is derived for the practical (yet sub-optimum) XOR
channel decoding setting. We show that the random coding error exponent under
optimum (i.e., maximum likelihood) PNC channel decoding can be achieved even
under the sub-optimal XOR channel decoding. The derived achievability bounds
provide us with valuable insight and can be used as a benchmark for the
performance of practical channel-coded PNC systems employing low complexity
decoders when finite-length codewords are used.
Shuxing Li
Comments: 16 pages, Finite Fields and Their Applications, Accepted
Subjects: Information Theory (cs.IT)
The generalized Hamming weights (GHWs) are fundamental parameters of linear
codes. GHWs are of great interest in many applications since they convey
detailed information of linear codes. In this paper, we continue the work of
[10] to study the GHWs of a family of cyclic codes with arbitrary number of
nonzeroes. The weight hierarchy is determined by employing a number-theoretic
approach.
Nikolaos Pappas, Zheng Chen
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
In this letter, we consider a simple scenario of a wireless caching system
that consists of a caching helper in proximity to the user to be served and a
data center with large content storage. Assuming bursty packet arrivals at the
caching helper, its availability is affected by the packet arrival rate, which
also affects the system throughput. We characterize the optimal transmission
and helping probabilities of the nodes in the caching system, aiming at
achieving the maximum weighted throughput of the system.
Yasutada Oohama
Subjects: Information Theory (cs.IT)
Let (X), (Y) be two correlated discrete random variables. We consider an
estimation of (X) from encoded data (varphi(Y)) of (Y) by some encoder
function (varphi(Y)). We derive an inequality describing a relation of the
correct probability of estimation and the mutual information between (X) and
(varphi(Y)). This inequality may be useful for the secure analysis of crypto
system when we use the success probability of estimating secret data as a
security criterion. It also provides an intuitive meaning of the secrecy
exponent in the strong secrecy criterion.
Shuai Yuan, Qin Huang, Zulin Wang
Subjects: Information Theory (cs.IT)
This paper generalizes the piggybacking constructions for distributed storage
systems by considering various protected instances and piggybacked instances.
Analysis demonstrates that the proportion of protected instances determines the
average repair bandwidth for a systematic node. By optimizing the proportion of
protected instances, the repair ratio of generalized piggybacking codes
approaches zero instead of 50% as the number of parity check nodes tends to
infinity. Furthermore, the computational complexity for repairing a single
systematic node cost by generalized piggybacking codes is less than that of the
existing piggybacking designs.
Van-Dinh Nguyen, Hieu V. Nguyen, Chuyen T. Nguyen, Oh-Soon Shin
Comments: 12 pages, 9 figures. To appear in IEEE ACCESS, 2017. The MATLAB codes for this paper can be downloaded from: this https URL
Subjects: Information Theory (cs.IT)
Full-duplex (FD) systems have emerged as an es- sential enabling technology
to further increase the data rate of wireless communication systems. The key
idea of FD is to serve multiple users over the same bandwidth with a base
station (BS) that can simultaneously transmit and receive the signals. The most
challenging issue in designing an FD system is to address both the harmful
effects of residual self-interference caused by the transmit-to-receive
antennas at the BS as well as the co- channel interference from an uplink user
(ULU) to a downlink user (DLU). An efficient solution to these problems is to
assign the ULUs/DLUs in different groups/slots, with each user served in
multiple groups. Hence, this paper studies the joint design of transmit
beamformers, ULUs/DLUs group assignment, and time allocation for each group.
The specific aim is to maximize the sum rate under the ULU/DLU minimum
throughput constraints. The utility function of interest is a difficult
nonconcave problem, and the involved constraints are also nonconvex, and so
this is a computationally troublesome problem. To solve this optimization
problem, we propose a new path-following algorithm for compu- tational
solutions to arrive at least the local optima. Each iteration involves only a
simple convex quadratic program. We prove that the proposed algorithm
iteratively improves the objective while guaranteeing convergence. Simulation
results confirm the fast convergence of the proposed algorithm with substantial
performance improvements over existing approaches.
Sami Akin, Marwan Hammouda, Jürgen Peissig
Subjects: Information Theory (cs.IT)
Recently, the demand for faster and more reliable data transmission has
brought up complex communications systems. As a result, it has become more
difficult to carry out closed-form solutions that can provide insight about
performance levels. In this paper, different from the existing research, we
study a cognitive radio system that employs hybrid-automatic-repeat-request
(HARQ) protocols under quality-of-service (QoS) constraints. We assume that the
secondary users access the spectrum by utilizing a strategy that is a
combination of underlay and interweave access techniques. Considering that the
secondary users imperfectly perform channel sensing in order to detect the
active primary users and that there is a transmission deadline for each data
packet at the secondary transmitter buffer, we formulate the state-transition
model of the system. Then, we obtain the state-transition probabilities when
HARQ-chase combining is adopted. Subsequently, we provide the packet-loss rate
in the channel and achieve the effective capacity. Finally, we substantiate our
analytical derivations with numerical results.
Varun Jog, Venkat Anantharam
Comments: 33 pages. A shorter version of this paper appeared in the proceedings of ISIT 2015
Subjects: Information Theory (cs.IT)
The entropy of a random variable is well-known to equal the exponential
growth rate of the volumes of its typical sets. In this paper, we show that for
any log-concave random variable (X), the sequence of the (lfloor n heta
floor^{ ext{th}}) intrinsic volumes of the typical sets of (X) in dimensions
(n geq 1) grows exponentially with a well-defined rate. We denote this rate by
(h_X( heta)), and call it the ( heta^{ ext{th}}) intrinsic entropy of (X).
We show that (h_X( heta)) is a continuous function of ( heta) over the range
([0,1]), thereby providing a smooth interpolation between the values 0 and
(h(X)) at the endpoints 0 and 1, respectively.
Thuan Nguyen, Thinh Nguyen
Comments: journal
Subjects: Information Theory (cs.IT)
The recent increase in number of wireless devices has been driven by the
growing markets of smart homes and the Internet of Things (IoT). As a result,
expanding and/or efficient utilization of the radio frequency (RF) spectrum is
critical to accommodate such an increase in wireless bandwidth. Alternatively,
recent free-space optical (FSO) communication technologies have demonstrated
the feasibility of building WiFO, a high capacity indoor wireless network using
the femtocell architecture. Since FSO transmission does not interfere with the
RF signals, such a system can be integrated with the current WiFi systems to
provide orders of magnitude improvement in bandwidth. A novel component of WiFO
is its ability to jointly encode bits from different flows for optimal
transmissions. In this paper, we introduce the WiFO architecture and a novel
cooperative transmission framework using location assisted coding (LAC)
technique to increase the overall wireless capacity. Specifically, achievable
rate regions for WiFO using LAC will be characterized. Both numerical and
theoretical analyses are given to validate the proposed coding schemes.
Anastasios Papazafeiropoulos, Bruno Clerckx, Tharm Ratnarajah
Comments: accepted in IEEE ICC 2017 Wireless Communications Symposium
Subjects: Information Theory (cs.IT)
This work encompasses Rate-Splitting (RS), providing significant benefits in
multi-user settings in the context of huge degrees of freedom promised by
massive Multiple-Input Multiple-Output (MIMO). However, the requirement of
massive MIMO for cost-efficient implementation makes them more prone to
hardware imperfections such as phase noise (PN). As a result, we focus on a
realistic broadcast channel with a large number of antennas and hampered by the
unavoidable PN. Moreover, we employ the RS transmission strategy, and we show
its robustness against PN, since the sum-rate does not saturate at high
signal-to-noise ratio (SNR). Although, the analytical results are obtained by
means of the deterministic equivalent analysis, they coincide with simulation
results even for finite system dimensions.
Javad Heydari, Ali Tajer
Comments: 31 pages, 7 figures
Subjects: Applications (stat.AP); Information Theory (cs.IT)
Agile localization of anomalous events plays a pivotal role in enhancing the
overall reliability of the grid and avoiding cascading failures. This is
especially of paramount significance in the large-scale grids due to their
geographical expansions and the large volume of data generated. This paper
proposes a stochastic graphical framework, by leveraging which it aims to
localize the anomalies with the minimum amount of data. This framework
capitalizes on the strong correlation structures observed among the
measurements collected from different buses. The proposed approach, at its
core, collects the measurements sequentially and progressively updates its
decision about the location of the anomaly. The process resumes until the
location of the anomaly can be identified with desired reliability. We provide
a general theory for the quickest anomaly localization and also investigate its
application for quickest line outage localization. Simulations in the IEEE
118-bus model are provided to establish the gains of the proposed approach.
Alberto Enciso, Piergiulio Tempesta
Comments: 15 pages, no figures
Subjects: Mathematical Physics (math-ph); Information Theory (cs.IT); High Energy Physics – Theory (hep-th); Quantum Physics (quant-ph)
The requirement that an entropy function be composable is key: it means that
the entropy of a compound system can be calculated in terms of the entropy of
its independent components. We prove that, under mild regularity assumptions,
the only composable generalized entropy in trace form is the Tsallis
one-parameter family (which contains Boltzmann-Gibbs as a particular case).
This result leads to the use of generalized entropies that are not of trace
form, such as R’enyi’s entropy, in the study of complex systems. In this
direction, we also present a characterization theorem for a large class of
composable non-trace-form entropy functions with features akin to those of
R’enyi’s entropy.
Taposh Banerjee, Alfred O. Hero III
Comments: Asilomar 2016
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT)
A sequential test is proposed for detection and isolation of hubs in a
correlation graph. Hubs in a correlation graph of a random vector are variables
(nodes) that have a strong correlation edge. It is assumed that the random
vectors are high-dimensional and are multivariate Gaussian distributed. The
test employs a family of novel local and global summary statistics generated
from small samples of the random vectors. Delay and false alarm analysis of the
test is obtained and numerical results are provided to show that the test is
consistent in identifying hubs, as the false alarm rate goes to zero.