A. Emin Orhan
Comments: 16 pages, 12 figures, 1 supplementary figure
Subjects: Neural and Evolutionary Computing (cs.NE)
Skip connections made the training of very deep neural networks possible and
have become an indispensable component in a variety of neural architectures. A
satisfactory explanation for their success remains elusive. Here, we present an
explanation for the benefits of skip connections in training very deep neural
networks. We argue that skip connections help break symmetries inherent in the
loss landscapes of deep networks, leading to drastically simplified landscapes.
In particular, skip connections between adjacent layers in a multilayer network
break the permutation symmetry of nodes in a given layer, and the recently
proposed DenseNet architecture, where each layer projects skip connections to
every layer above it, also breaks the rescaling symmetry of connectivity
matrices between different layers. This hypothesis is supported by evidence
from a toy model with binary weights and from experiments with fully-connected
networks suggesting (i) that skip connections do not necessarily improve
training unless they help break symmetries and (ii) that alternative ways of
breaking the symmetries also lead to significant performance improvements in
training deep networks, hence there is nothing special about skip connections
in this respect. We find, however, that skip connections confer additional
benefits over and above symmetry-breaking, such as the ability to deal
effectively with the vanishing gradients problem.
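The permutation symmetry the abstract describes can be seen in a toy two-layer network: relabeling the hidden units (and permuting the weight matrices accordingly) leaves a plain network's output unchanged, while an identity skip connection into the hidden layer breaks that invariance. A minimal pure-Python sketch with arbitrary weights, not the paper's model:

```python
# Toy illustration: permuting hidden units of a plain two-layer network
# leaves its output unchanged, while an identity skip connection from the
# input into the hidden layer breaks this symmetry.

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def plain_net(W1, W2, x):
    return matvec(W2, relu(matvec(W1, x)))

def skip_net(W1, W2, x):
    # identity skip connection: hidden activation is relu(W1 x) + x
    h = [a + b for a, b in zip(relu(matvec(W1, x)), x)]
    return matvec(W2, h)

def permute_hidden(W1, W2, perm):
    # permute rows of W1 and the corresponding columns of W2
    W1p = [W1[p] for p in perm]
    W2p = [[row[p] for p in perm] for row in W2]
    return W1p, W2p

W1 = [[1.0, -2.0], [0.5, 3.0]]
W2 = [[2.0, -1.0]]
x = [1.0, 2.0]
W1p, W2p = permute_hidden(W1, W2, [1, 0])

print(plain_net(W1, W2, x) == plain_net(W1p, W2p, x))  # True: symmetric
print(skip_net(W1, W2, x) == skip_net(W1p, W2p, x))    # False: symmetry broken
```

The symmetry is exact in the plain net because the row permutation of W1 and the column permutation of W2 compose to the identity; the additive skip ties hidden unit i to input coordinate i, removing that freedom.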
Sheila Mahapatra, Nitin Malik
Comments: Google Scholar Indexed Australian journal, ISSN: 2005-4238, 8 pages, 2 figures, 1 table
Journal-ref: International Journal of Advanced Science and Technology, vol.89,
pp.1-8, April 2016
Subjects: Neural and Evolutionary Computing (cs.NE); Systems and Control (cs.SY)
This paper proposes a hybrid technique for secure optimal power flow coupled
with enhanced voltage stability through FACTS device installation. The
performance of the hybrid of the Improved Gravitational Search Algorithm (IGSA)
and the Firefly Algorithm (FA) is analyzed by optimally placing a TCSC
controller.
The algorithm is implemented in MATLAB, and power flow security and voltage
stability are evaluated on the IEEE 30-bus transmission system. The optimal
results are compared with those available in the literature; the superior
performance of the algorithm is reflected in minimum generation cost and
reduced real power losses while sustaining voltage stability.
André R. Goncalves, Celso Cavelucci, Christiano Lyra Filho, Fernando J. Von Zuben
Comments: Paper published in the 6th IEEE/PES Transmission and Distribution: Latin America, 2012, Montevideo, Uruguay
Subjects: Optimization and Control (math.OC); Neural and Evolutionary Computing (cs.NE)
Installation of capacitors in distribution networks is one of the most widely
used procedures to compensate for the reactive power generated by loads and,
consequently, to reduce technical losses. The problem thus consists in
identifying the optimal placement and sizing of capacitors, known in the
literature as the optimal capacitor placement problem. Nevertheless, depending
on its location and size, a capacitor may become a harmonic source, allowing it to
enter into resonance with the distribution network, causing several undesired
side effects. In this work we propose a parsimonious method to deal with the
capacitor placement problem that incorporates resonance constraints, ensuring
that no allocated capacitor will act as a harmonic source. The proposed
algorithm is based upon a physics-inspired metaheuristic known as Extremal
Optimization. The results show that this proposal achieves significant gains
over approaches that attempt to repair, in a post-optimization stage,
already-obtained solutions that violate the resonance constraints.
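Extremal Optimization itself is easy to sketch: each solution component gets its own fitness, and the worst component is unconditionally replaced at every step while the best solution seen is recorded. The toy problem, step count, and value range below are illustrative only, not the paper's capacitor-placement formulation:

```python
import random

def cost(sol, targets):
    return sum((v - t) ** 2 for v, t in zip(sol, targets))

def extremal_optimization(targets, steps=2000, seed=0):
    rng = random.Random(seed)
    sol = [rng.randrange(10) for _ in targets]  # each component in {0,..,9}
    best = list(sol)
    for _ in range(steps):
        # component-wise fitness: how bad each coordinate is on its own
        fits = [-(v - t) ** 2 for v, t in zip(sol, targets)]
        worst = min(range(len(sol)), key=lambda i: fits[i])
        sol[worst] = rng.randrange(10)  # unconditional replacement: the EO hallmark
        if cost(sol, targets) < cost(best, targets):
            best = list(sol)
    return best

targets = [3, 7, 1, 5]
print(extremal_optimization(targets))
```

Unlike genetic algorithms, EO keeps a single solution and never accepts or rejects moves: it always mutates the worst component, relying on the best-so-far record to retain good configurations.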
Naveen Mellempudi, Abhisek Kundu, Dipankar Das, Dheevatsa Mudigere, Bharat Kaul
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
We propose a cluster-based quantization method to convert pre-trained full
precision weights into ternary weights with minimal impact on the accuracy. In
addition, we also constrain the activations to 8 bits, thus enabling a
sub-8-bit, fully integer inference pipeline. Our method uses smaller clusters of N filters
with a common scaling factor to minimize the quantization loss, while also
maximizing the number of ternary operations. We show that with a cluster size
of N=4 on ResNet-101, we can achieve 71.8% top-1 accuracy, within 6% of the
best full-precision results, while replacing approximately 85% of all
multiplications with 8-bit accumulations. Using the same method with 4-bit
weights achieves 76.3% top-1 accuracy, which is within 2% of the full-precision
result. We also study the impact of cluster size on both performance and
accuracy: larger cluster sizes (N=64) can replace approximately 98% of the
multiplications with ternary operations but introduce a significant drop in
accuracy, which necessitates fine-tuning the parameters by retraining the
network at lower precision. To address this, we have also trained a
low-precision ResNet-50 with 8-bit activations and ternary weights by
pre-initializing the network with full-precision weights, and achieve 68.9%
top-1 accuracy within 4 additional epochs.
Our final quantized model can run on a full 8-bit compute pipeline, with a
potential 16x improvement in performance compared to baseline full-precision
models.
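The cluster-wise ternarization step can be sketched as follows. The threshold rule and the 0.7 ratio are common choices from the ternary-weight literature and are assumptions here, not necessarily the paper's exact procedure:

```python
# Sketch: quantize a cluster of filters to {-alpha, 0, +alpha} with one
# shared scaling factor alpha per cluster (threshold rule is illustrative).

def ternarize_cluster(filters, delta_ratio=0.7):
    weights = [w for f in filters for w in f]
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_ratio * mean_abs                    # ternarization threshold
    kept = [abs(w) for w in weights if abs(w) > delta]
    alpha = sum(kept) / len(kept) if kept else 0.0    # common scaling factor
    def q(w):
        if w > delta:
            return alpha
        if w < -delta:
            return -alpha
        return 0.0
    return [[q(w) for w in f] for f in filters], alpha

filters = [[0.9, -0.1, 0.7], [-0.8, 0.05, 1.1]]
quantized, alpha = ternarize_cluster(filters)
print(alpha)      # shared scale for the whole cluster
print(quantized)  # every entry is -alpha, 0, or +alpha
```

Because every weight in the cluster maps to one of three values scaled by a single alpha, multiplications against these weights reduce to sign-selected accumulations, which is the source of the paper's reported speedups.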
Samarth Brahmbhatt, James Hays
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present DeepNav, a Convolutional Neural Network (CNN) based algorithm for
navigating large cities using locally visible street-view images. The DeepNav
agent learns to reach its destination quickly by making the correct navigation
decisions at intersections. We collect a large-scale dataset of street-view
images organized in a graph where nodes are connected by roads. This dataset
contains 10 city graphs and more than 1 million street-view images. We propose
3 supervised learning approaches for the navigation task and show how A* search
in the city graph can be used to generate supervision for the learning. Our
annotation process is fully automated using publicly available mapping services
and requires no human input. We evaluate the proposed DeepNav models on 4
held-out cities for navigating to 5 different types of destinations. Our
algorithms outperform previous work that uses hand-crafted features and Support
Vector Regression (SVR)[19].
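The idea of using shortest paths in the city graph to supervise per-intersection decisions can be sketched on a toy graph. The paper uses A*; plain BFS is equivalent for unweighted edges and keeps the example short:

```python
# Sketch: the supervision label at each node is the neighbor lying on a
# shortest path to the destination (toy graph, not the paper's dataset).
from collections import deque

def shortest_distances(graph, dest):
    """BFS distances from every node to dest (unweighted edges)."""
    dist = {dest: 0}
    q = deque([dest])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def supervision_labels(graph, dest):
    """For each node, the best next-hop neighbor toward dest."""
    dist = shortest_distances(graph, dest)
    return {u: min(graph[u], key=lambda v: dist.get(v, float('inf')))
            for u in graph if u != dest and u in dist}

# toy undirected graph of intersections
graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
print(supervision_labels(graph, 'D'))
```

This is what makes the annotation fully automatic: once the graph and destination are fixed, every node's correct turn falls out of the search with no human input.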
Seyede Mahya Hazavei, Hamid Reza Shahdoosti
Comments: 6 pages, 12 figures, conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Over the last decades, denoising methods have attracted much attention from
researchers. The conventional method for removing the Moiré pattern from
images is to use notch filters in the frequency domain. In this paper, a new
method is proposed that achieves better performance than the traditional
method: a median filter is applied to part of the spectrum of noisy images to
reduce the noise. In the second part of this paper, to demonstrate the
robustness of the proposed method, it is applied to several noisy images that
contain a Moiré pattern. Experiments on noisy images with different
characteristics show that the proposed method increases PSNR values compared
with previous methods.
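The core operation is a sliding-window median. A minimal 1D version shows how it suppresses an isolated spike, analogous to a Moiré peak in the spectrum; the paper applies a 2D median filter to part of the image spectrum:

```python
# Sketch: 1D sliding-window median filter (window truncated at the edges).

def median_filter_1d(signal, window=3):
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        neighborhood = sorted(signal[lo:hi])
        out.append(neighborhood[len(neighborhood) // 2])
    return out

# an impulsive spike, akin to a Moiré peak in the spectrum, is removed
print(median_filter_1d([1, 1, 9, 1, 1]))  # -> [1, 1, 1, 1, 1]
```

Unlike a linear (e.g. averaging) filter, the median discards the outlier entirely instead of smearing it into its neighbors, which is why it suits isolated spectral peaks.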
Aleksei Tiulpin, Jérôme Thevenot, Esa Rahtu, Simo Saarakkala
Comments: Submitted to Scandinavian Conference on Image Analysis (SCIA) 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Osteoarthritis (OA) is a common musculoskeletal condition typically diagnosed
from radiographic assessment after clinical examination. However, visual
evaluation by a practitioner suffers from subjectivity and is highly
dependent on experience. Computer-aided diagnostics (CAD) could improve the
objectivity of knee radiographic examination. The first essential step of knee
OA CAD is to automatically localize the joint area. However, according to the
literature this task itself remains challenging. The aim of this study was to
develop a novel and computationally efficient method to tackle this issue. Here,
three different datasets of knee radiographs were used (n = 473/93/77) to
validate the overall performance of the method. Our pipeline consists of two
parts: anatomically based joint area proposals and their evaluation using
Histogram of Oriented Gradients features and pre-trained Support Vector Machine
classifier scores. On the three datasets, the method attains mean
intersection-over-union scores of 0.84, 0.79 and 0.78. On a high-end computer,
it can automatically annotate conventional knee radiographs within 14-16 ms and
high-resolution ones within 170 ms. Our results demonstrate that the developed
method is suitable for large-scale analyses.
Alin-Ionut Popa, Mihai Zanfir, Cristian Sminchisescu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a deep multitask architecture for fully automatic 2d and 3d
human sensing (DMHS), including recognition and reconstruction, in
monocular images. The system computes the figure-ground segmentation,
semantically identifies the human body parts at pixel level, and estimates the
2d and 3d pose of the person. The model supports the joint training of all
components by means of multi-task losses where early processing stages
recursively feed into advanced ones for increasingly complex calculations,
accuracy and robustness. The design allows us to tie a complete training
protocol, by taking advantage of multiple datasets that would otherwise
restrictively cover only some of the model components: complex 2d image data
with no body part labeling and without associated 3d ground truth, or complex
3d data with limited 2d background variability. In detailed experiments based
on several challenging 2d and 3d datasets (LSP, HumanEva, Human3.6M), we
evaluate the sub-structures of the model, the effect of various types of
training data in the multitask loss, and demonstrate that state-of-the-art
results can be achieved at all processing levels. We also show that in the wild
our monocular RGB architecture is perceptually competitive with a
state-of-the-art (commercial) Kinect system based on RGB-D data.
Pedro Costa, Adrian Galdran, Maria Inês Meyer, Michael David Abràmoff, Meindert Niemeijer, Ana Maria Mendonça, Aurélio Campilho
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)
Synthesizing images of the eye fundus is a challenging task that has been
previously approached by formulating complex models of the anatomy of the eye.
New images can then be generated by sampling a suitable parameter space. In
this work, we propose a method that learns to synthesize eye fundus images
directly from data. For that, we pair true eye fundus images with their
respective vessel trees, by means of a vessel segmentation technique. These
pairs are then used to learn a mapping from a binary vessel tree to a new
retinal image. For this purpose, we use a recent image-to-image translation
technique, based on the idea of adversarial learning. Experimental results show
that the original and the generated images are visually different in terms of
their global appearance, in spite of sharing the same vessel tree.
Additionally, a quantitative quality analysis of the synthetic retinal images
confirms that the produced images retain a high proportion of the true image
set quality.
Nhan Truong, Levin Kuhlmann, Mohammad Reza Bonyadi, Jiawei Yang, Andrew Faulks, Omid Kavehei
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Detecting seizures using brain neuroactivations recorded by intracranial
electroencephalography (iEEG) has been widely used for monitoring, diagnosing,
and closed-loop therapy of epileptic patients; however, computational
efficiency gains are needed if state-of-the-art methods are to be implemented
in implanted devices. We present a novel method for automatic seizure detection
based on iEEG data that outperforms current state-of-the-art seizure detection
methods in terms of computational efficiency while maintaining the accuracy.
The proposed algorithm incorporates an automatic channel selection (ACS) engine
as a pre-processing stage to the seizure detection procedure. The ACS engine
consists of supervised classifiers which aim to find the iEEG channels that
contribute the most to a seizure. The seizure detection stage involves feature
extraction and classification. Feature extraction is performed in both
frequency and time domains where spectral power and correlation between channel
pairs are calculated. Random Forest is used in classification of interictal,
ictal and early ictal periods of iEEG signals. Seizure detection in this paper
is retrospective and patient-specific. The iEEG data are accessed via Kaggle,
provided by the International Epilepsy Electrophysiology Portal. The dataset
includes a training set of 6.5 hours of interictal data and 41 min of ictal
data
and a test set of 9.14 hours. Compared to the state-of-the-art on the same
dataset, we achieve a 49.4% increase in computational efficiency and a 400 min
improvement in average detection delay. The proposed model is able to detect a
seizure onset at 91.95% sensitivity and 94.05% specificity with a mean
detection delay of 2.77 s. The area under the curve (AUC) is 96.44%, which is
comparable to the current state-of-the-art AUC of 96.29%.
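One of the feature types described, correlation between channel pairs, can be sketched in a few lines on toy data; the actual pipeline also computes per-channel spectral power:

```python
# Sketch: Pearson correlation between every pair of channels in a window.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def pairwise_correlations(channels):
    feats = {}
    for i in range(len(channels)):
        for j in range(i + 1, len(channels)):
            feats[(i, j)] = pearson(channels[i], channels[j])
    return feats

# three toy channels: the first two move together, the third opposes them
channels = [[0, 1, 2, 3], [0, 2, 4, 6], [3, 2, 1, 0]]
print(pairwise_correlations(channels))
```

Each window thus yields one correlation value per channel pair, which is concatenated with the spectral-power features before classification.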
Da Zhang, Hamid Maei, Xin Wang, Yuan-Fang Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Convolutional neural network (CNN) models have achieved tremendous success in
many visual detection and recognition tasks. Unfortunately, visual tracking, a
fundamental computer vision problem, is not handled well using the existing CNN
models, because most object trackers implemented with CNN do not effectively
leverage temporal and contextual information among consecutive frames.
Recurrent neural network (RNN) models, on the other hand, are often used to
process text and voice data due to their ability to learn intrinsic
representations of sequential and temporal data. Here, we propose a novel
neural network tracking model that is capable of integrating information over
time and tracking a selected target in video. It comprises three components: a
CNN extracting the best tracking features in each video frame, an RNN constructing
video memory state, and a reinforcement learning (RL) agent making target
location decisions. The tracking problem is formulated as a decision-making
process, and our model can be trained with RL algorithms to learn good tracking
policies that pay attention to continuous, inter-frame correlation and maximize
tracking performance in the long run. We compare our model with an existing
neural-network based tracking method and show that the proposed tracking
approach works well in various scenarios by performing rigorous validation
experiments on artificial video sequences with ground truth. To the best of our
knowledge, our tracker is the first neural-network tracker that combines
convolutional and recurrent networks with RL algorithms.
Hadar Averbuch-Elor, Johannes Kopf, Tamir Hazan, Daniel Cohen-Or
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a co-segmentation technique for space-time co-located image
collections. These prevalent collections capture various dynamic events,
usually by multiple photographers, and may contain multiple co-occurring
objects which are not necessarily part of the intended foreground object,
resulting in ambiguities for traditional co-segmentation techniques. Thus, to
disambiguate what the common foreground object is, we introduce a
weakly-supervised technique, where we assume only a small seed, given in the
form of a single segmented image. We take a distributed approach, where local
belief models are propagated and reinforced with similar images. Our technique
progressively expands the foreground and background belief models across the
entire collection. The technique exploits the power of the entire image set
without building a global model, and thus successfully overcomes large
variability in the appearance of the common foreground object. We demonstrate that
our method outperforms previous co-segmentation techniques on challenging
space-time co-located collections, including dense benchmark datasets which
were adapted for our novel problem setting.
Padmavathi K, Mahima Bhat, Maya V Karki
Comments: 8 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Multimodal medical image fusion helps to increase efficiency in medical
diagnosis. This paper presents multimodal medical image fusion by selecting
relevant features using Principal Component Analysis (PCA) and Particle Swarm
Optimization (PSO) techniques. The dual-tree complex wavelet transform (DTCWT)
is used for decomposition of the images into low- and high-frequency
coefficients. Fusion rules such as combinations of minimum, maximum and simple
averaging are applied to the approximation and detail coefficients. The fused
image is reconstructed by the inverse DTCWT. Performance metrics are evaluated
and show that DTCWT-PCA performs better than DTCWT-PSO in terms of Structural
Similarity Index Measure (SSIM) and Cross Correlation (CC). Computation time
and feature vector size are reduced with DTCWT-PCA compared to DTCWT-PSO for
feature selection, demonstrating its robustness and lower storage requirements.
Xiaqing Pan, Yueru Chen, C.-C. Jay Kuo
Comments: arXiv admin note: text overlap with arXiv:1603.01942
Subjects: Computer Vision and Pattern Recognition (cs.CV)
A novel solution for the content-based 3D shape retrieval problem using an
unsupervised clustering approach, which does not need any label information of
3D shapes, is presented in this work. The proposed shape retrieval system
consists of two modules in cascade: the irrelevance filtering (IF) module and
the similarity ranking (SR) module. The IF module attempts to cluster gallery
shapes that are similar to each other by examining global and local features
simultaneously. However, shapes that are close in the local feature space can
be distant in the global feature space, and vice versa. To resolve this issue,
we propose a joint cost function that strikes a balance between two distances.
Irrelevant samples that are close in the local feature space but distant in the
global feature space can be removed in this stage. The remaining gallery
samples are ranked in the SR module using the local feature. The superior
performance of the proposed IF/SR method is demonstrated by extensive
experiments conducted on the popular SHREC12 dataset.
Ram Krishna Pandey, A G Ramakrishnan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recognition of document images has important applications in restoring old
and classical texts. The problem involves quality improvement before passing
the image to a properly trained OCR engine to get accurate recognition of the
text. The image
enhancement and quality improvement constitute important steps as subsequent
recognition depends upon the quality of the input image. There are scenarios
when high resolution images are not available and our experiments show that the
OCR accuracy reduces significantly with decrease in the spatial resolution of
document images. Thus the only option is to improve the resolution of such
document images. The goal is to construct a high resolution image, given a
single low resolution binary image, which constitutes the problem of single
image super-resolution. Most previous work on super-resolution deals with
natural images, which have more information content than document images.
Here, we use a Convolutional Neural Network to learn the mapping between low-
and the corresponding high-resolution images. We experiment with different
numbers of layers, parameter settings and non-linear functions to build a fast
end-to-end framework for document image super-resolution. Our proposed model
shows a very good PSNR improvement of about 4 dB on 75 dpi Tamil images,
resulting in a 3% improvement in word-level accuracy by the OCR. It takes less
time than a recent sparsity-based natural-image super-resolution technique,
making it useful for real-time document recognition applications.
Alexey A. Novikov, David Major, Dimitrios Lenis, Jiri Hladůvka, Maria Wimmer, Katja Bühler
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The recent success of Deep Convolutional Neural Networks on image
classification and recognition tasks has led to new applications in
increasingly diverse contexts. One of these is medical imaging, where the
scarcity and imbalance of training data have hindered the rapid development of
neural network related applications.
This paper investigates and proposes neural network architectures within the
context of automated segmentation of anatomical organs in chest radiographs,
namely the lungs, clavicles and heart. By relating prior class data
distributions to the objective function, sparsely represented structures are
methodically emphasized. Scarce training sets are countered with aggressive
data augmentation and regularization. The problem of highly imbalanced target
object appearance in the input data is addressed by modifying the objective
function.
The models are trained and tested on the publicly available JSRT database
consisting of 247 X-ray images, the ground-truth masks for which are available in
the SCR database. The networks have been trained in a multi-class setup with
three target classes.
Our best-performing model, trained with the negative Dice loss function, was
able to reach mean Jaccard overlap scores of 94.1% for the lungs, 86.6% for the
heart and 88.7% for the clavicles in the multi-label setup, thereby
outperforming the best state-of-the-art methods on heart and clavicle
segmentation and the human observer on lung and heart segmentation tasks.
Moustafa Alzantot, Supriyo Chakraborty, Mani B. Srivastava
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Our ability to synthesize sensory data that preserves specific statistical
properties of the real data has tremendous implications for data privacy and
big data analytics. The synthetic data can be used as a substitute for
selected real data segments that are sensitive to the user, thus protecting
privacy and resulting in improved analytics. However, increasingly adversarial
roles taken by data recipients such as mobile apps, or other cloud-based
analytics services, mandate that the synthetic data, in addition to preserving
statistical properties, should also be difficult to distinguish from the real
data. Typically, visual inspection has been used as a test to distinguish
between datasets. But more recently, sophisticated classifier models
(discriminators), corresponding to a set of events, have also been employed to
distinguish between synthesized and real data. The model operates on both
datasets and the respective event outputs are compared for consistency. In this
paper, we take a step towards generating sensory data that can pass a deep
learning based discriminator model test, and make two specific contributions:
First, we present a deep learning based architecture for synthesizing sensory
data. This architecture comprises a generator model, which is a stack of
multiple Long Short-Term Memory (LSTM) networks and a Mixture Density Network.
Second, we use another LSTM-based discriminator model for
distinguishing between the true and the synthesized data. Using a dataset of
accelerometer traces, collected using smartphones of users doing their daily
activities, we show that the deep learning based discriminator model can only
distinguish between the real and synthesized traces with an accuracy in the
neighborhood of 50%.
Fabio Gagliardi Cozman, Denis Deratani Mauá
Subjects: Artificial Intelligence (cs.AI)
We examine the meaning and the complexity of probabilistic logic programs
that consist of a set of rules and a set of independent probabilistic facts
(that is, programs based on Sato’s distribution semantics). We focus on two
semantics, respectively based on stable and on well-founded models. We show
that the semantics based on stable models (referred to as the “credal
semantics”) produces sets of probability models that dominate infinitely
monotone Choquet capacities; we describe several useful consequences of this
result. We then examine the complexity of inference with probabilistic logic
programs. We distinguish between the complexity of inference when a
probabilistic program and a query are given (the inferential complexity), and
the complexity of inference when the probabilistic program is fixed and the
query is given (the query complexity, akin to data complexity as used in
database theory). We obtain results on the inferential and query complexity for
acyclic, stratified, and cyclic propositional and relational programs;
complexity reaches various levels of the counting hierarchy and even
exponential levels.
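For an acyclic program, inference under the distribution semantics can be sketched by brute-force enumeration of the total choices of the independent probabilistic facts. The example below is illustrative and deliberately sidesteps the stable/well-founded subtleties the paper analyzes:

```python
# Sketch: enumerate all truth assignments ("total choices") to the
# independent probabilistic facts, derive consequences by forward
# chaining, and sum the probabilities of worlds entailing the query.
from itertools import product

def query_probability(prob_facts, rules, query):
    """prob_facts: {fact: prob}; rules: list of (head, [body atoms])."""
    facts = list(prob_facts)
    total = 0.0
    for choice in product([False, True], repeat=len(facts)):
        world_prob = 1.0
        true_atoms = set()
        for f, chosen in zip(facts, choice):
            world_prob *= prob_facts[f] if chosen else 1 - prob_facts[f]
            if chosen:
                true_atoms.add(f)
        # forward chaining to a fixpoint (no negation, so this is safe)
        changed = True
        while changed:
            changed = False
            for head, body in rules:
                if head not in true_atoms and all(b in true_atoms for b in body):
                    true_atoms.add(head)
                    changed = True
        if query in true_atoms:
            total += world_prob
    return total

# toy program: alarm holds if either probabilistic fact holds
prob_facts = {'burglary': 0.1, 'earthquake': 0.2}
rules = [('alarm', ['burglary']), ('alarm', ['earthquake'])]
print(query_probability(prob_facts, rules, 'alarm'))  # ≈ 1 - 0.9*0.8 = 0.28
```

The enumeration is exponential in the number of probabilistic facts, which is one reason the inferential and query complexity results in the abstract reach counting-hierarchy and exponential levels.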
AmirEmad Ghassami, Negar Kiyavash
Subjects: Artificial Intelligence (cs.AI)
Interaction information is one of the multivariate generalizations of mutual
information; it expresses the amount of information shared among a set of
variables beyond the information shared within any proper subset of
those variables. Unlike (conditional) mutual information, which is always
non-negative, interaction information can be negative. We utilize this property
to find the direction of causal influences among variables in a triangle
topology under some mild assumptions.
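A small worked example, assuming the convention I(X;Y;Z) = I(X;Y) - I(X;Y|Z) (sign conventions differ across authors): for the uniform XOR triple the interaction information is negative, since conditioning on Z creates dependence between otherwise independent X and Y.

```python
# Sketch: interaction information of three discrete variables from an
# empirical joint sample, via I(X;Y;Z) = I(X;Y) - I(X;Y|Z).
import math
from collections import Counter

def mutual_information(pairs):
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def conditional_mi(triples):
    n = len(triples)
    pz = Counter(z for _, _, z in triples)
    total = 0.0
    for z in pz:
        sub = [(x, y) for x, y, zz in triples if zz == z]
        total += (pz[z] / n) * mutual_information(sub)
    return total

def interaction_information(triples):
    return mutual_information([(x, y) for x, y, _ in triples]) - conditional_mi(triples)

# XOR triple: X, Y independent and uniform, Z = X xor Y
samples = [(x, y, x ^ y) for x in (0, 1) for y in (0, 1)]
print(interaction_information(samples))  # -1.0 bit
```

Here I(X;Y) = 0 while I(X;Y|Z) = 1 bit, so the interaction information is -1 bit; it is this sign information that the abstract's causal-direction test exploits.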
Francois Belletti, Daniel Haziza, Gabriel Gomes, Alexandre M. Bayen
Subjects: Artificial Intelligence (cs.AI)
This article shows how the recent breakthroughs in Reinforcement Learning
(RL) that have enabled robots to learn to play arcade video games, walk or
assemble colored bricks, can be used to perform other tasks that are currently
at the core of engineering cyberphysical systems. We present the first use of
RL for the control of systems modeled by discretized non-linear Partial
Differential Equations (PDEs) and devise a novel algorithm to use
non-parametric control techniques for large multi-agent systems. We show how
neural network based RL enables the control of discretized PDEs whose
parameters are unknown, random, and time-varying. We introduce an algorithm of
Mutual Weight Regularization (MWR) which alleviates the curse of dimensionality
of multi-agent control schemes by sharing experience between agents while
giving each agent the opportunity to specialize its action policy so as to
tailor it to the local parameters of the part of the system it is located in.
Rodrigo Agerri, German Rigau
Comments: 26 pages, 19 tables (submitted for publication on September 2015), Artificial Intelligence (2016)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
We present a multilingual Named Entity Recognition approach based on a robust
and general set of features across languages and datasets. Our system combines
shallow local information with clustering semi-supervised features induced on
large amounts of unlabeled text. Understanding via empirical experimentation
how to effectively combine various types of clustering features allows us to
seamlessly export our system to other datasets and languages. The result is a
simple but highly competitive system which obtains state-of-the-art results
across five languages and twelve datasets. The results are reported on standard
shared task evaluation data such as CoNLL for English, Spanish and Dutch.
Furthermore, and despite the lack of linguistically motivated features, we also
report best results for languages such as Basque and German. In addition, we
demonstrate that our method also obtains very competitive results even when the
amount of supervised data is cut by half, alleviating the dependency on
manually annotated data. Finally, the results show that our emphasis on
clustering features is crucial to develop robust out-of-domain models. The
system and models are freely available to facilitate its use and guarantee the
reproducibility of results.
Pan Li, Arya Mazumdar, Olgica Milenkovic
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
We propose a novel rank aggregation method based on converting permutations
into their corresponding Lehmer codes or other subdiagonal images. Lehmer
codes, also known as inversion vectors, are vector representations of
permutations in which each coordinate can take values not restricted by the
values of other coordinates. This transformation allows for decoupling of the
coordinates and for performing aggregation via simple scalar median or mode
computations. We present simulation results illustrating the performance of
this completely parallelizable approach and analytically prove that both the
mode and median aggregation procedure recover the correct centroid aggregate
with small sample complexity when the permutations are drawn according to the
well-known Mallows models. The proposed Lehmer code approach may also be used
on partial rankings, with similar performance guarantees.
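The encode/aggregate/decode pipeline can be sketched directly: compute each permutation's Lehmer code, take coordinate-wise medians (valid because the coordinates are decoupled), and decode the median vector back into a permutation. The toy votes below are illustrative:

```python
# Sketch: rank aggregation via Lehmer codes and coordinate-wise medians.
import statistics

def to_lehmer(perm):
    # L[i] = number of later entries smaller than perm[i] (inversion vector)
    return [sum(1 for later in perm[i + 1:] if later < v)
            for i, v in enumerate(perm)]

def from_lehmer(code):
    # pick the code[i]-th smallest remaining element at each position
    remaining = sorted(range(len(code)))
    return [remaining.pop(c) for c in code]

def aggregate(perms):
    codes = [to_lehmer(p) for p in perms]
    median_code = [int(statistics.median(col)) for col in zip(*codes)]
    return from_lehmer(median_code)

votes = [[0, 1, 2, 3], [0, 2, 1, 3], [1, 0, 2, 3]]
print(aggregate(votes))  # -> [0, 1, 2, 3]
```

Because coordinate i of a Lehmer code is bounded by n-1-i independently of the other coordinates, the coordinate-wise median of valid codes is always itself a valid code, so decoding never fails.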
Jeff Heaton
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
Frequent itemset mining is a popular data mining technique. Apriori, Eclat,
and FP-Growth are among the most common algorithms for frequent itemset mining.
Considerable research has been performed to compare the relative performance
between these three algorithms, by evaluating the scalability of each algorithm
as the dataset size increases. While scalability as data size increases is
important, previous papers have not examined the performance impact of
similarly sized datasets that contain different itemset characteristics. This
paper explores the effects that two dataset characteristics can have on the
performance of these three frequent itemset algorithms. To perform this
empirical analysis, a dataset generator is created to measure the effects of
frequent item density and the maximum transaction size on performance. The
generated datasets contain the same number of rows. This provides some insight
into dataset characteristics that are conducive to each algorithm. The results
of this paper’s research demonstrate that Eclat and FP-Growth both handle increases
in maximum transaction size and frequent itemset density considerably better
than the Apriori algorithm.
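To make the comparison concrete, here is a minimal Apriori sketch (support counting plus level-wise candidate generation); Eclat and FP-Growth find the same frequent itemsets using different data layouts:

```python
# Sketch: level-wise Apriori. Count support of size-k candidates, keep the
# frequent ones, and build size-(k+1) candidates from their unions.

def apriori(transactions, min_support):
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    frequent = {}
    k_sets = [frozenset([i]) for i in items]
    while k_sets:
        survivors = [s for s in k_sets if support(s) >= min_support]
        frequent.update({s: support(s) for s in survivors})
        # candidate generation: unions of survivors exactly one item larger
        next_size = len(k_sets[0]) + 1
        k_sets = list({a | b for a in survivors for b in survivors
                       if len(a | b) == next_size})
    return frequent

data = [['bread', 'milk'], ['bread', 'eggs'], ['bread', 'milk', 'eggs'], ['milk']]
freq = apriori(data, min_support=2)
print(sorted((tuple(sorted(s)), c) for s, c in freq.items()))
```

The repeated full scans over the transaction list at every level are exactly what makes Apriori sensitive to dense itemsets and long transactions, the two characteristics this paper varies.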
Marco Baroni, Armand Joulin, Allan Jabri, Germàn Kruszewski, Angeliki Lazaridou, Klemen Simonic, Tomas Mikolov
Comments: Submitted to ICLR 2017 Workshop Track
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
With machine learning successfully applied to new daunting problems almost
every day, general AI starts looking like an attainable goal. However, most
current research focuses instead on important but narrow applications, such as
image classification or machine translation. We believe this to be largely due
to the lack of objective ways to measure progress towards broad machine
intelligence. In order to fill this gap, we propose here a set of concrete
desiderata for general AI, together with a platform to test machines on how
well they satisfy such desiderata, while keeping all further complexities to a
minimum.
Smruti Amarjyoti
Comments: 18 pages
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
The focus of this work is to enumerate the various approaches and algorithms that center around the application of reinforcement learning in robotic manipulation tasks. Earlier methods utilized specialized policy representations and human demonstrations to constrain the policy search. Such methods worked well with the continuous state and policy spaces of robots but failed to produce generalized policies. Subsequently, high-dimensional non-linear function approximators like neural networks have been used to learn policies from scratch. Several novel and recent approaches have also embedded the control policy with efficient perceptual representations using deep learning. This has led to the emergence of a new branch of dynamic robot control systems called deep reinforcement learning (DRL). This work embodies a survey of the most recent algorithms, architectures and their implementations in simulations and real-world robotic platforms. The gamut of DRL architectures is partitioned into two branches, namely discrete action space algorithms (DAS) and continuous action space algorithms (CAS). Further, the CAS algorithms are divided into stochastic continuous action space (SCAS) and deterministic continuous action space (DCAS) algorithms. Along with elucidating an organisation of the DRL algorithms, this work also presents some of the state-of-the-art applications of these approaches in robotic manipulation tasks.
Laroche Romain, Feraud Raphael
Comments: 27 pages
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Optimization and Control (math.OC)
Dialogue systems rely on a careful reinforcement learning design: the learning algorithm and its state space representation. Lacking more rigorous knowledge, the designer resorts to practical experience to choose the best option. To automate and improve this process, this article formalises the problem of online off-policy reinforcement learning algorithm selection. A meta-algorithm takes as input a portfolio of several off-policy reinforcement learning algorithms. At the beginning of each new trajectory, it determines which algorithm in the portfolio controls the behaviour during the full next trajectory, in order to maximise the return. The article presents a novel meta-algorithm, called Epochal Stochastic Bandit Algorithm Selection (ESBAS). Its principle is to freeze the policy updates at each epoch and to leave a rebooted stochastic bandit in charge of the algorithm selection. Under some assumptions, a thorough theoretical analysis demonstrates its near-optimality with respect to the structural sampling budget limitations. ESBAS is then put to the test in a set of experiments with various portfolios on a negotiation dialogue game. The results show the practical benefits of algorithm selection for dialogue systems, in most cases even outperforming the best algorithm in the portfolio, even when the aforementioned assumptions are violated.
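The epoch structure can be sketched minimally: within each epoch the candidate policies are frozen, and a freshly reset stochastic bandit picks which algorithm controls each trajectory. The sketch below assumes a standard UCB1 bandit as the selector; the actual ESBAS selector and assumptions may differ.

```python
import math
import random

def esbas_style_selection(portfolio_returns, n_epochs, trajs_per_epoch, seed=0):
    """Epoch-wise algorithm selection in the spirit of ESBAS (sketch).
    `portfolio_returns[k]` samples a return for algorithm k, whose policy
    is treated as frozen within an epoch. Each epoch restarts a UCB1
    bandit over the portfolio. Illustrative only."""
    rng = random.Random(seed)
    K = len(portfolio_returns)
    history = []
    for _ in range(n_epochs):
        counts = [0] * K          # bandit is "rebooted" every epoch
        sums = [0.0] * K
        for t in range(trajs_per_epoch):
            if t < K:             # play each algorithm once first
                k = t
            else:                 # UCB1 index: mean + exploration bonus
                k = max(range(K), key=lambda a: sums[a] / counts[a]
                        + math.sqrt(2 * math.log(t) / counts[a]))
            r = portfolio_returns[k](rng)
            counts[k] += 1
            sums[k] += r
            history.append((k, r))
    return history
```

With two algorithms of unequal mean return, the selector concentrates trajectories on the better one within each epoch.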
Rupam Bhattacharyya, Adity Saikia, Shyamanta M. Hazarika
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
Retention of residual skills for persons who partially lose their cognitive
or physical ability is of utmost importance. Research is focused on developing
systems that provide need-based assistance for retention of such residual
skills. This paper describes a novel cognitive collaborative control architecture, C3A, designed to address the challenges of developing need-based assistance for wheelchair navigation. The organization of C3A is detailed and results from simulation of the proposed architecture are presented. For simulation of our proposed architecture, we have used ROS (Robot Operating System) as a control framework and a 3D robotic simulator called USARSim (Unified System for Automation and Robot Simulation).
Maged M. Eljazzar, Afnan Hassan, Amira A. AlSharkawy
Subjects: Information Retrieval (cs.IR); Computers and Society (cs.CY)
The number of Muslim Internet users is increasing remarkably across countries all over the world. There are many structured and well-documented text resources for Quran interpretation, Tafsir, on the Internet in several languages. Nevertheless, when searching for the meaning of specific words, many users prefer watching short videos to reading a script or a book. This paper introduces a solution to the challenge of partitioning common Tafsir videos into short videos according to the search query and sharing the resulting videos on social networks. Furthermore, we provide the ability to comment on a specific time-based frame of a video or on a specific verse in a particular subject. It would be very valuable to apply current technologies to the Holy Quran and Tafsir to ease the querying of verses, the understanding of their meaning, and their sharing on different social media.
Guang-Neng Hu
Comments: 12 pages, PAKDD 2017
Subjects: Information Retrieval (cs.IR)
The item recommendation task predicts a personalized ranking over a set of items for an individual user. One paradigm is rating-based methods, which concentrate on explicit feedback and hence face difficulties in collecting it. Meanwhile, ranking-based methods are presented with rated items and then rank the rated above the unrated. This paradigm uses widely available implicit feedback, but it usually ignores an important source of information: item reviews. Item reviews not only justify the preferences of users, but also help alleviate the cold-start problem that hampers collaborative filtering. In this paper, we propose two novel and simple models to integrate item reviews into matrix factorization based Bayesian personalized ranking (BPR-MF). In each model, we make use of text features extracted from item reviews via word embeddings. On top of these text features, we uncover the review dimensions that explain the variation in users’ feedback; these review factors represent a prior preference of a user. Experiments on real-world data sets show the benefits of leveraging item reviews for ranking prediction. We also conduct analyses to understand the proposed models.
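For context, plain BPR-MF (without the review-based text features the paper adds) trains on triples of a user u, an observed item i, and an unobserved item j, ascending the gradient of log sigmoid of their score difference. A minimal sketch of one update, under these standard assumptions:

```python
import numpy as np

def bpr_mf_step(U, V, u, i, j, lr=0.05, reg=0.01):
    """One SGD step of plain BPR-MF: push user u's score for observed
    item i above unobserved item j by ascending log sigmoid(x_uij).
    (The paper's review-based text features are omitted here.)"""
    du = V[i] - V[j]
    x_uij = U[u] @ du                    # score difference
    g = 1.0 / (1.0 + np.exp(x_uij))     # sigmoid(-x) = d/dx log sigmoid(x)
    Uu = U[u].copy()                    # use pre-update user factors
    U[u] += lr * (g * du - reg * Uu)
    V[i] += lr * (g * Uu - reg * V[i])
    V[j] += lr * (-g * Uu - reg * V[j])
```

Repeating this step on observed/unobserved pairs drives the observed item's score above the unobserved item's.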
Aria Rezaei, Bryan Perozzi, Leman Akoglu
Comments: WWW’17 Web Science, 9 pages
Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR); Physics and Society (physics.soc-ph)
Given a set of attributed subgraphs known to be from different classes, how
can we discover their differences? There are many cases where collections of
subgraphs may be contrasted against each other. For example, they may be
assigned ground truth labels (spam/not-spam), or it may be desired to directly
compare the biological networks of different species or compound networks of
different chemicals.
In this work we introduce the problem of characterizing the differences
between attributed subgraphs that belong to different classes. We define this
characterization problem as one of partitioning the attributes into as many
groups as the number of classes, while maximizing the total attributed quality
score of all the given subgraphs.
We show that our attribute-to-class assignment problem is NP-hard and that an optimal (1 - 1/e)-approximation algorithm exists. We also propose two faster heuristics that are linear-time in the number of attributes and subgraphs. Unlike previous work, where only attributes were taken into account for characterization, here we exploit both attributes and social ties (i.e., graph structure).
Through extensive experiments, we compare our proposed algorithms and present findings that agree with human intuition on datasets from Amazon co-purchases, Congressional bill sponsorships, and DBLP co-authorships. We also show that our approach to characterizing subgraphs is better suited for sense-making than discriminating classification approaches.
Rodrigo Agerri, German Rigau
Comments: 26 pages, 19 tables (submitted for publication on September 2015), Artificial Intelligence (2016)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
We present a multilingual Named Entity Recognition approach based on a robust
and general set of features across languages and datasets. Our system combines
shallow local information with clustering semi-supervised features induced on
large amounts of unlabeled text. Understanding via empirical experimentation
how to effectively combine various types of clustering features allows us to
seamlessly export our system to other datasets and languages. The result is a simple but highly competitive system which obtains state-of-the-art results across five languages and twelve datasets. The results are reported on standard shared-task evaluation data such as CoNLL for English, Spanish and Dutch.
Furthermore, and despite the lack of linguistically motivated features, we also
report best results for languages such as Basque and German. In addition, we
demonstrate that our method also obtains very competitive results even when the
amount of supervised data is cut by half, alleviating the dependency on
manually annotated data. Finally, the results show that our emphasis on
clustering features is crucial to developing robust out-of-domain models. The system and models are freely available to facilitate their use and guarantee the reproducibility of results.
Aravind Vasudevan, David Gregg
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
The critical path of a group of tasks is an important measure that is
commonly used to guide task allocation and scheduling on parallel computers.
The critical path is the longest chain of dependencies in an acyclic task
dependence graph. A problem arises on heterogeneous parallel machines where
computation and communication costs can vary between different types of
processor. Existing solutions for heterogeneous machines attempt to estimate
the critical path using average values of computation and communication costs.
However, this ignores opportunities to match specific tasks to specific classes
of processor and communication links, and can result in quite misleading paths
being identified as critical. We argue that an accurate critical path must
consider the mapping of tasks to classes of processor and communication links.
We formulate a polynomial time algorithm to find such a critical path. Our
Critical Earliest Finish Time (CEFT) algorithm finds both the length of the
critical path and an allocation of tasks to processors on that path. We
compared CEFT experimentally to existing approaches such as averaging execution
times across processors. The latter approach fails to accurately model the
execution cost of tasks, and as a result fails to identify a correct critical
path in 83.99% of cases in our experiments. We also adapted a critical
path-oriented scheduling algorithm (CPOP) to use our critical path algorithm
and found that the resulting schedules are faster.
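The core idea can be illustrated in a simplified form: compute the longest chain of dependencies when each task is free to use the processor class on which it is cheapest. This sketch deliberately omits what CEFT additionally handles, namely class-dependent communication costs and reporting the task-to-processor allocation along the path.

```python
import functools

def critical_path(tasks, deps, cost):
    """Longest chain of dependencies when each task may use its cheapest
    processor class (a simplification of the CEFT idea).
    `deps[t]` is a tuple of t's predecessors; `cost[t]` maps processor
    class -> runtime. The real algorithm also models class-dependent
    communication costs and returns the allocation on the path."""
    @functools.lru_cache(maxsize=None)
    def finish(t):
        # earliest finish = latest predecessor finish + cheapest runtime
        ready = max((finish(p) for p in deps.get(t, ())), default=0.0)
        return ready + min(cost[t].values())
    return max(finish(t) for t in tasks)
```

Averaging `cost[t].values()` instead of taking the minimum reproduces the baseline the paper argues against, and can report a different, misleading path.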
William Pettersson, Melih Ozlen
Comments: 7 pages
Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
To obtain a better understanding of the trade-offs between various
objectives, Bi-Objective Integer Programming (BOIP) algorithms calculate the
set of all non-dominated vectors and present these as the solution to a BOIP
problem. Historically, these algorithms have been compared in terms of the
number of single-objective IPs solved and total CPU time taken to produce the
solution to a problem. This is equitable, as researchers often have access to widely differing amounts of computing power. However, the real world has
recently seen a large uptake of multi-core processors in computers, laptops,
tablets and even mobile phones. With this in mind, we look at how to best
utilise parallel processing to improve the elapsed time of optimisation
algorithms. We present two methods of parallelising the recursive algorithm
presented by Ozlen, Burton and MacRae. Both new methods utilise two threads and
improve running times. One of the new methods, the Meeting algorithm, halves
running time to achieve near-perfect parallelisation. The results are compared
with the efficiency of parallelisation within the commercial IP solver IBM ILOG
CPLEX, and the new methods are both shown to perform better.
Hongteng Xu, Hongyuan Zha
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
We propose an effective method to solve event sequence clustering problems, based on a novel Dirichlet mixture model of a special but significant type of point process, the Hawkes process. In this model, each event sequence
belonging to a cluster is generated via the same Hawkes process with specific
parameters, and different clusters correspond to different Hawkes processes.
The prior distribution of the Hawkes processes is controlled via a Dirichlet
process. We learn the model via a maximum likelihood estimator (MLE) and
propose an effective variational Bayesian inference algorithm. We specifically
analyze the resulting EM-type algorithm in the context of inner-outer
iterations and discuss several inner iteration allocation strategies. The
identifiability of our model, the convergence of our learning method, and its sample complexity are analyzed both theoretically and empirically, demonstrating the superiority of our method over competing approaches. The proposed
method learns the number of clusters automatically and is robust to model
misspecification. Experiments on both synthetic and real-world data show that
our method can learn diverse triggering patterns hidden in asynchronous event
sequences and achieve encouraging performance on clustering purity and
consistency.
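For reference, a univariate Hawkes process is characterized by a conditional intensity in which past events excite future ones; in the mixture model above, each cluster has its own parameters. The exponential kernel below is an illustrative choice, not necessarily the paper's.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a univariate Hawkes process with an
    exponential kernel (illustrative kernel choice):
        lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)).
    mu is the background rate; alpha scales the self-excitation and
    beta its decay. In the clustering model, each cluster keeps its
    own (mu, alpha, beta)."""
    return mu + sum(alpha * math.exp(-beta * (t - ti))
                    for ti in events if ti < t)
```

Sequences from the same cluster share one such intensity; clustering amounts to inferring which parameter set generated each sequence.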
Pan Li, Arya Mazumdar, Olgica Milenkovic
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
We propose a novel rank aggregation method based on converting permutations
into their corresponding Lehmer codes or other subdiagonal images. Lehmer
codes, also known as inversion vectors, are vector representations of
permutations in which each coordinate can take values not restricted by the
values of other coordinates. This transformation allows for decoupling of the
coordinates and for performing aggregation via simple scalar median or mode
computations. We present simulation results illustrating the performance of
this completely parallelizable approach and analytically prove that both the mode and median aggregation procedures recover the correct centroid aggregate with small sample complexity when the permutations are drawn according to the
well-known Mallows models. The proposed Lehmer code approach may also be used
on partial rankings, with similar performance guarantees.
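The transformation is easy to state concretely: coordinate i of the Lehmer code counts the inversions that position i participates in, each coordinate ranges independently over 0..n-1-i, and aggregation reduces to a coordinate-wise median (or mode) followed by decoding. A small sketch:

```python
import statistics

def lehmer_code(perm):
    """Inversion vector: c[i] = #{j > i : perm[j] < perm[i]}; each c[i]
    ranges over 0..n-1-i independently of the other coordinates."""
    n = len(perm)
    return [sum(perm[j] < perm[i] for j in range(i + 1, n))
            for i in range(n)]

def lehmer_decode(code):
    """Invert a Lehmer code back to a permutation of 0..n-1:
    coordinate i selects the c[i]-th smallest remaining value."""
    remaining = list(range(len(code)))
    return [remaining.pop(c) for c in code]

def aggregate_rankings(perms):
    """Coordinate-wise median of Lehmer codes, decoded to a permutation
    (clamping keeps the median code decodable)."""
    codes = [lehmer_code(p) for p in perms]
    med = [int(statistics.median(col)) for col in zip(*codes)]
    n = len(med)
    return lehmer_decode([min(c, n - 1 - i) for i, c in enumerate(med)])
```

Because the coordinates decouple, the median can be taken per coordinate in parallel, which is what makes the approach completely parallelizable.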
Naveen Mellempudi, Abhisek Kundu, Dipankar Das, Dheevatsa Mudigere, Bharat Kaul
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
We propose a cluster-based quantization method to convert pre-trained full-precision weights into ternary weights with minimal impact on accuracy. In addition, we constrain the activations to 8 bits, thus enabling a sub-8-bit full integer inference pipeline. Our method uses smaller clusters of N filters with a common scaling factor to minimize the quantization loss, while also maximizing the number of ternary operations. We show that with a cluster size of N=4 on Resnet-101, we can achieve 71.8% top-1 accuracy, within 6% of the best full-precision results, while replacing approximately 85% of all multiplications with 8-bit accumulations. Using the same method with 4-bit weights achieves 76.3% top-1 accuracy, which is within 2% of the full-precision result. We also study the impact of cluster size on both performance and accuracy: larger cluster sizes (N=64) can replace approximately 98% of the multiplications with ternary operations but introduce a significant drop in accuracy, which necessitates fine-tuning the parameters by retraining the network at lower precision. To address this, we have also trained a low-precision Resnet-50 with 8-bit activations and ternary weights by pre-initializing the network with full-precision weights, achieving 68.9% top-1 accuracy within 4 additional epochs. Our final quantized model can run on a full 8-bit compute pipeline, with a potential 16x improvement in performance compared to baseline full-precision models.
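As a rough sketch of the building block, ternary quantization maps each weight in a cluster to {-a, 0, +a} with one shared scale a. The 0.7·mean|w| threshold below is a common ternary-weight heuristic and stands in for the paper's exact cluster-wise procedure, which may differ.

```python
import numpy as np

def ternarize(W, threshold_ratio=0.7):
    """Map full-precision weights W to {-a, 0, +a} with one shared
    scale a, as would be done per cluster of N filters. The
    0.7 * mean|w| threshold is a common heuristic from the
    ternary-weight literature, not necessarily the paper's rule."""
    t = threshold_ratio * np.mean(np.abs(W))
    keep = np.abs(W) > t                  # weights that stay nonzero
    a = np.abs(W[keep]).mean() if keep.any() else 0.0
    return a * np.sign(W) * keep
```

At inference time the {-1, 0, +1} pattern turns multiplications into sign flips and additions, with one scalar multiply by a per cluster.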
Jeffrey Bilmes, Wenruo Bai
Subjects: Learning (cs.LG)
We start with an overview of a class of submodular functions called SCMMs
(sums of concave composed with non-negative modular functions plus a final
arbitrary modular). We then define a new class of submodular functions we call deep submodular functions, or DSFs. We show that DSFs are a flexible parametric family of submodular functions that share many of the properties and
advantages of deep neural networks (DNNs). DSFs can be motivated by considering
a hierarchy of descriptive concepts over ground elements and where one wishes
to allow submodular interaction throughout this hierarchy. Results in this
paper show that DSFs constitute a strictly larger class of submodular functions
than SCMMs. We show that, for any integer (k>0), there are (k)-layer DSFs that cannot be represented by a (k')-layer DSF for any (k'<k). This implies that, like DNNs, there is a utility to depth, but unlike DNNs, the family of DSFs strictly increases with depth. Despite this, we show (using a “backpropagation”-like method) that DSFs, even with arbitrarily large (k), do not comprise all submodular functions. In offering the above results, we also define the notion
of an antitone superdifferential of a concave function and show how this
relates to submodular functions (in general), DSFs (in particular), negative
second-order partial derivatives, continuous submodularity, and concave
extensions. To further motivate our analysis, we provide various special case
results from matroid theory, comparing DSFs with forms of matroid rank, in
particular the laminar matroid. Lastly, we discuss strategies to learn DSFs,
and define the classes of deep supermodular functions, deep difference of
submodular functions, and deep multivariate submodular functions, and discuss
where these can be useful in applications.
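The SCMM building block mentioned at the start has a compact concrete form: f(S) = Σ_k φ_k(Σ_{e∈S} w_k[e]) with concave φ_k and non-negative weights. The sketch below evaluates this one-layer case, which DSFs generalize by nesting such concave-of-modular layers; the dict-based weight representation is an illustrative choice.

```python
import math

def scmm(weights, concaves):
    """Sum of Concaves composed with non-negative Modular functions
    (the optional final modular term is omitted):
        f(S) = sum_k phi_k( sum_{e in S} w_k[e] ).
    `weights[k]` maps ground element -> non-negative weight and
    `concaves[k]` is a concave function with phi_k(0) = 0. SCMMs are
    the one-layer case of the paper's deep submodular functions."""
    def f(S):
        return sum(phi(sum(w[e] for e in S))
                   for w, phi in zip(weights, concaves))
    return f
```

Concavity of each φ_k over a non-negative modular sum is what yields the diminishing-returns (submodularity) property of f.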
Moustafa Alzantot, Supriyo Chakraborty, Mani B. Srivastava
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Our ability to synthesize sensory data that preserves specific statistical properties of the real data has tremendous implications for data privacy and big data analytics. Synthetic data can be used as a substitute for selected real data segments that are sensitive to the user, thus protecting privacy and resulting in improved analytics. However, the increasingly adversarial roles taken by data recipients, such as mobile apps or other cloud-based analytics services, mandate that the synthetic data, in addition to preserving statistical properties, should also be difficult to distinguish from the real data. Typically, visual inspection has been used as a test to distinguish between datasets. More recently, sophisticated classifier models (discriminators), corresponding to a set of events, have also been employed to distinguish between synthesized and real data. The model operates on both datasets and the respective event outputs are compared for consistency. In this paper, we take a step towards generating sensory data that can pass a deep-learning-based discriminator model test, and make two specific contributions: first, we present a deep-learning-based architecture for synthesizing sensory data, comprising a generator model that is a stack of multiple Long Short-Term Memory (LSTM) networks and a Mixture Density Network; second, we use another LSTM-network-based discriminator model for distinguishing between the true and the synthesized data. Using a dataset of
accelerometer traces, collected using smartphones of users doing their daily
activities, we show that the deep learning based discriminator model can only
distinguish between the real and synthesized traces with an accuracy in the
neighborhood of 50%.
André R. Gonçalves, Arindam Banerjee, Fernando J. Von Zuben
Comments: Accepted for the 31st AAAI Conference on Artificial Intelligence (AAAI-17)
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Future projection of climate is typically obtained by combining outputs from
multiple Earth System Models (ESMs) for several climate variables such as
temperature and precipitation. While the IPCC has traditionally used a simple model output average, recent work has illustrated potential advantages of using a
multitask learning (MTL) framework for projections of individual climate
variables. In this paper we introduce a framework for hierarchical multitask
learning (HMTL) with two levels of tasks such that each super-task, i.e., task
at the top level, is itself a multitask learning problem over sub-tasks. For
climate projections, each super-task focuses on projections of specific climate
variables spatially using an MTL formulation. For the proposed HMTL approach, a
group lasso regularization is added to couple parameters across the
super-tasks, which in the climate context helps exploit relationships among the
behavior of different climate variables at a given spatial location. We show
that some recent works on MTL based on learning task dependency structures can
be viewed as special cases of HMTL. Experiments on synthetic and real climate data show that HMTL produces better results than decoupled MTL methods applied separately on the super-tasks, and that HMTL significantly outperforms baselines for climate projection.
Dipan K. Pal, Vishnu Boddeti, Marios Savvides
Subjects: Learning (cs.LG)
Many theories have emerged which investigate how invariance is generated in hierarchical networks through simple schemes such as max and mean pooling. The restriction to max/mean pooling in theoretical and empirical studies has diverted attention away from a more general way of generating invariance to nuisance transformations. We conjecture that hierarchically building selective invariance (i.e. carefully choosing the range of the transformation to be invariant to at each layer of a hierarchical network) is important for pattern recognition. We utilize a novel pooling layer called adaptive pooling to find linear pooling weights within networks. These networks with the learnt pooling weights have performances on object categorization tasks that are comparable to max/mean pooling networks. Interestingly, adaptive pooling can converge to mean pooling (when initialized with random pooling weights), find more general linear pooling schemes or even decide not to pool at all. We illustrate the general notion of selective invariance through object categorization experiments on large-scale datasets such as SVHN and ILSVRC 2012.
Tong Liu, Qijin Cheng, Christopher M. Homan, Vincent M.B. Silenzio
Comments: 8 pages, 4 figures, 7 tables
Subjects: Learning (cs.LG); Computers and Society (cs.CY); Social and Information Networks (cs.SI)
Suicide is an important but often misunderstood problem, one that researchers
are now seeking to better understand through social media. Due in large part to
the fuzzy nature of what constitutes suicidal risks, most supervised approaches
for learning to automatically detect suicide-related activity in social media
require a great deal of human labor to train. However, humans themselves have
diverse or conflicting views on what constitutes suicidal thoughts. So how to
obtain reliable gold standard labels is fundamentally challenging and, we
hypothesize, depends largely on what is asked of the annotators and what slice
of the data they label. We conducted multiple rounds of data labeling and
collected annotations from crowdsourcing workers and domain experts. We
aggregated the resulting labels in various ways to train a series of supervised
models. Our preliminary evaluations show that using unanimously agreed labels from multiple annotators helps to achieve robust machine learning models.
Irineo Cabreros, Karan Singh, Angela Zhou
Comments: Presented at the 2016 Data Efficient Machine Learning Workshop at International Conference on Machine Learning
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Aggregating crowdsourced labels to generate reliable estimates of the true
answer is an important challenge for crowdsourcing systems. We introduce the
mixture of experts model, an extension of the classical Dawid-Skene [2] model,
where each question comprises a mixture of k topics upon which each user
has varying expertise. Under this new model, we (1) provide a spectral
algorithm for estimating true labels with provable asymptotic guarantees and
(2) introduce a task assignment algorithm in a quasi-online setting. We test
these algorithms on simulated data, finding that both methods perform well for
large budget allocations and that the assignment algorithm boosts performance
in the low-sampling regime. As a consequence of this work, we also develop a
simple method for efficiently fitting the mixture of experts model that may be
more widely applicable.
Mikhail V. Goubko, Sergey O. Kuznetsov, Alexey A. Neznanov, Dmitry I. Ignatov
Journal-ref: IFAC-PapersOnLine, 49(32), 2016, p. 24-29, ISSN 2405-8963
Subjects: Learning (cs.LG); Systems and Control (cs.SY); Machine Learning (stat.ML)
In coming years residential consumers will face real-time electricity tariffs
with energy prices varying day to day, and effective energy saving will require
automation – a recommender system, which learns consumer’s preferences from her
actions. A consumer chooses a scenario of home appliance use to balance her
comfort level and the energy bill. We propose a Bayesian learning algorithm to
estimate the comfort level function from the history of appliance use. In numerical experiments with datasets generated from a simulation model of a consumer interacting with small home appliances, the algorithm outperforms popular regression analysis tools. Our approach can be extended to control an
air heating and conditioning system, which is responsible for up to half of a
household’s energy bill.
Pedro Costa, Adrian Galdran, Maria Inês Meyer, Michael David Abràmoff, Meindert Niemeijer, Ana Maria Mendonça, Aurélio Campilho
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)
Synthesizing images of the eye fundus is a challenging task that has been
previously approached by formulating complex models of the anatomy of the eye.
New images can then be generated by sampling a suitable parameter space. In
this work, we propose a method that learns to synthesize eye fundus images
directly from data. For that, we pair true eye fundus images with their
respective vessel trees, by means of a vessel segmentation technique. These
pairs are then used to learn a mapping from a binary vessel tree to a new
retinal image. For this purpose, we use a recent image-to-image translation
technique, based on the idea of adversarial learning. Experimental results show
that the original and the generated images are visually different in terms of
their global appearance, in spite of sharing the same vessel tree.
Additionally, a quantitative quality analysis of the synthetic retinal images
confirms that the produced images retain a high proportion of the true image
set quality.
Abdelghafour Talibi, Boujemâa Achchab, Rafik Lasri
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Mixture models have become widely used in clustering, given the probabilistic framework on which they are based. However, for modern databases characterized by their large size, these models behave disappointingly when specifying the model, making the selection of relevant variables essential for this type of clustering. After recalling the basics of model-based clustering, this article examines variable selection methods for model-based clustering and presents opportunities for improving these methods.
Da Zhang, Hamid Maei, Xin Wang, Yuan-Fang Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Convolutional neural network (CNN) models have achieved tremendous success in
many visual detection and recognition tasks. Unfortunately, visual tracking, a
fundamental computer vision problem, is not handled well using the existing CNN
models, because most object trackers implemented with CNN do not effectively
leverage temporal and contextual information among consecutive frames.
Recurrent neural network (RNN) models, on the other hand, are often used to
process text and voice data due to their ability to learn intrinsic
representations of sequential and temporal data. Here, we propose a novel
neural network tracking model that is capable of integrating information over
time and tracking a selected target in video. It comprises three components: a
CNN extracting best tracking features in each video frame, an RNN constructing
video memory state, and a reinforcement learning (RL) agent making target
location decisions. The tracking problem is formulated as a decision-making
process, and our model can be trained with RL algorithms to learn good tracking
policies that pay attention to continuous, inter-frame correlation and maximize
tracking performance in the long run. We compare our model with an existing
neural-network based tracking method and show that the proposed tracking
approach works well in various scenarios by performing rigorous validation
experiments on artificial video sequences with ground truth. To the best of our
knowledge, our tracker is the first neural-network tracker that combines
convolutional and recurrent networks with RL algorithms.
Laroche Romain, Feraud Raphael
Comments: 27 pages
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Optimization and Control (math.OC)
Dialogue systems rely on a careful reinforcement learning design: the
learning algorithm and its state space representation. In the absence of more
rigorous knowledge, the designer resorts to practical experience to choose the
best option. In order to automate and improve this process, this article
formalises the problem of online off-policy reinforcement learning algorithm
selection. A meta-algorithm takes as input a portfolio of several off-policy
reinforcement learning algorithms. At the beginning of each new trajectory, it
determines which algorithm in the portfolio is in control of the behaviour
during the full next trajectory, in order to maximise the return. The article
presents a novel meta-algorithm, called Epochal Stochastic Bandit Algorithm
Selection (ESBAS). Its principle is to freeze the policy updates at each epoch,
and to leave a rebooted stochastic bandit in charge of the algorithm selection.
Under some assumptions, a thorough theoretical analysis demonstrates its
near-optimality considering the structural sampling budget limitations. Then,
ESBAS is put to the test in a set of experiments with various portfolios, on a
negotiation dialogue game. The results show the practical benefits of the
algorithm selection for dialogue systems, in most cases even outperforming the
best algorithm in the portfolio, even when the aforementioned assumptions are
violated.
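The per-epoch mechanism described above can be sketched in code. This is a minimal illustration, not the paper's exact construction: the UCB1 bandit rule, the `run_trajectory` callback, and the portfolio interface are assumptions made for the sketch.

```python
import math

def ucb1_select(counts, means, t):
    """UCB1 index: play each arm once, then pick the arm maximizing
    empirical mean + sqrt(2 ln t / n_arm)."""
    for k, n in enumerate(counts):
        if n == 0:
            return k
    return max(range(len(counts)),
               key=lambda k: means[k] + math.sqrt(2 * math.log(t) / counts[k]))

def esbas_epoch(policies, run_trajectory, n_trajectories):
    """One ESBAS-style epoch: the policies are frozen, and a fresh
    ("rebooted") stochastic bandit decides which policy controls each
    full trajectory, using the trajectory return as the bandit reward."""
    K = len(policies)
    counts, means = [0] * K, [0.0] * K
    for t in range(1, n_trajectories + 1):
        k = ucb1_select(counts, means, t)
        ret = run_trajectory(policies[k])  # return of the whole trajectory
        counts[k] += 1
        means[k] += (ret - means[k]) / counts[k]  # incremental mean update
    return counts, means
```

With deterministic toy "policies", the bandit concentrates its plays on the policy with the higher return, which is the behaviour the selection mechanism is designed to achieve.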
Swayambhoo Jain, Urvashi Oswal, Kevin S. Xu, Brian Eriksson, Jarvis Haupt
Comments: To appear in IEEE Transactions on Biomedical Engineering
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Applications (stat.AP)
The measurement and analysis of Electrodermal Activity (EDA) offers
applications in diverse areas ranging from market research, to seizure
detection, to human stress analysis. Unfortunately, the analysis of EDA signals
is made difficult by the superposition of numerous components which can obscure
the signal information related to a user’s response to a stimulus. We show how
simple pre-processing followed by a novel compressed sensing based
decomposition can mitigate the effects of the undesired noise components and
help reveal the underlying physiological signal. The proposed framework allows
for decomposition of EDA signals with provable bounds on the recovery of user
responses. We test our procedure on both synthetic and real-world EDA signals
from wearable sensors and demonstrate that our approach allows for more
accurate recovery of user responses as compared to the existing techniques.
Alexander Shen
Subjects: Information Theory (cs.IT); Formal Languages and Automata Theory (cs.FL)
It is well known that normality (all factors of given length appear in an
infinite sequence with the same frequency) can be described as
incompressibility via finite automata. Still, the statement and proof of this
result as given by Becher and Heiber in terms of “lossless finite-state
compressors” do not follow the standard scheme of the Kolmogorov complexity
definition (the automaton is used for compression, not decompression). We
modify this approach to make it more similar to the traditional Kolmogorov
complexity theory (and simpler) by explicitly defining the notion of automatic
Kolmogorov complexity and using its simple properties. Other known notions
(Shallit–Wang, Calude–Salomaa–Roblot) of description complexity related to
finite automata are discussed (see the last section).
As a byproduct, we obtain simple proofs of classical results about normality
(the equivalence of the definitions based on aligned occurrences and on all
occurrences, Wall’s theorem saying that a normal number remains normal when
multiplied by a rational number, and Agafonov’s result saying that normality is
preserved by automatic selection rules).
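As a rough empirical companion to the definition of normality used above (every length-k factor appearing with limiting frequency 2^-k over a binary alphabet), one can count sliding-window factor frequencies in a finite prefix. This only illustrates the definition itself, not the automatic-complexity machinery of the paper.

```python
from collections import Counter
from fractions import Fraction

def factor_frequencies(seq, k):
    """Empirical frequency of each length-k factor of the string `seq`,
    counting all (overlapping, sliding-window) occurrences."""
    n = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n))
    return {w: Fraction(c, n) for w, c in counts.items()}
```

For a prefix of a normal binary sequence, each of the four length-2 factors should have frequency close to 1/4; exact `Fraction` counts make the frequencies sum to exactly 1.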
Sepehr Rezvani, Nader Mokari, Mohammad R. Javan
Comments: 19 pages, 10 figures
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
The unprecedented growth of internet content, especially social media, poses a
challenge to the load of cellular networks. Moreover, quality of experience
(QoE), of which downloading delay is an important factor, has nowadays become a
more practical norm than quality of service (QoS). To satisfy this demand, we
propose two resource allocation (RA) algorithms that jointly optimize the
placement of social media in the caches of the base stations (BSs) and the
radio resources (i.e., transmit powers and subcarriers) in a multi-cell
Orthogonal Frequency Division Multiple Access (OFDMA) based system. The first
scheme minimizes the total downloading delay of the network, while the second
is a fairness-aware scheme that minimizes the maximum downloading delay. We
propose iterative algorithms to solve each problem, in which the content
placement problem and the joint subcarrier and transmit power allocation
problem are optimized alternately. We also prove that the proposed approaches
converge to a near-optimal solution as the number of iterations increases.
Gerardo Febres
Comments: 25 pages, 7 figures, 5 tables
Subjects: Information Theory (cs.IT)
When considering perceptions, the observation scale and resolution are
closely related properties. There is consensus in regarding resolution as the
density of elementary pieces of information in a specified information space.
For the concept of scale, in contrast, several conceptions compete for a
consistent meaning. Scale is typically regarded as a way to indicate the degree
of detail at which an observation is performed, but surprisingly, there is no
unified definition of scale as a property of a description. This paper offers a
precise definition of scale, and a method to quantify it as a property
associated with the interpretation of a description. To complete the parameters
needed to describe the perception of a description, the concepts of scope and
resolution are also given exact meanings. A model describing the recursive
process of interpretation, based on evolving steps of scale, scope and
resolution, is introduced. The model relies on the conception of observation
scale and its association with the selection of symbols. Five experiments
illustrate the application of these concepts, showing that resolution, scale
and scope together form the set of properties that define any point of view
from which an observation is performed and interpreted.
Akira Yamawaki, Hiroshi Kamabe, Shan Lu
Comments: 8 pages, 3 figures, submitted to ISIT2017
Subjects: Information Theory (cs.IT)
Index-less Indexed Flash Code (ILIFC) is a coding scheme for flash memories
in which one bit of a data sequence is stored in a slice consisting of several
cells, while the index of the bit is stored implicitly. Although several
modified ILIFC schemes have been proposed, in this research we consider an
ILIFC with inversion cells (I-ILIFC). The I-ILIFC reduces the total number of
cell-level changes per writing request. Computer simulation shows that the
I-ILIFC improves the average performance of the ILIFC in many cases. This paper
presents our derivation of lower bounds on the number of writing operations of
the I-ILIFC and shows that the worst-case performance of the I-ILIFC is better
than that of the ILIFC if the code length is sufficiently large. Additionally,
we consider tighter lower bounds. The results show that the threshold on the
code length that determines whether the I-ILIFC improves the worst-case
performance of the ILIFC is smaller than that implied by the first lower bounds.
Sichen Zhong, Yue Zhao
Subjects: Information Theory (cs.IT); Computational Complexity (cs.CC)
The cospark of a matrix is the cardinality of the sparsest vector in the
column space of the matrix. Computing the cospark of a matrix is well known to
be an NP hard problem. Given the sparsity pattern (i.e., the locations of the
non-zero entries) of a matrix, if the non-zero entries are drawn from
independently distributed continuous probability distributions, we prove that
the cospark of the matrix equals, with probability one, a particular number
termed the generic cospark of the matrix. The generic cospark also equals the
maximum cospark over all matrices consistent with the given sparsity pattern. We
prove that the generic cospark of a matrix can be computed in polynomial time,
and offer an algorithm that achieves this.
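The definition can be made concrete with a brute-force check for small matrices. A vector Ax in the column space has zeros on a row set Z iff x lies in the null space of the corresponding row submatrix, so the cospark is m minus the largest such Z admitting a nonzero x. This exponential-time sketch only illustrates the definition; it is not the paper's polynomial-time algorithm for the generic cospark.

```python
import itertools
import numpy as np

def cospark_bruteforce(A):
    """Cospark = min ||Ax||_0 over x with Ax != 0, found by searching for
    the largest row set Z with rank(A[Z]) < #columns (then a nonzero x
    with A[Z] x = 0 exists).  Assumes A has full column rank, so Ax != 0
    for every nonzero x.  Exponential time; small matrices only."""
    m, n = A.shape
    assert np.linalg.matrix_rank(A) == n, "needs full column rank"
    for size in range(m, -1, -1):          # largest zero set first
        for Z in itertools.combinations(range(m), size):
            if size == 0 or np.linalg.matrix_rank(A[list(Z), :]) < n:
                return m - size
```

For a fully dense m x n pattern with generic entries the answer is m - n + 1; structured patterns (such as the identity) can have a much smaller cospark.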
Shengli Zhang, Chongtao Guo, Taotao Wang, Wei Zhang
Comments: 10 pages, 9 figures
Subjects: Information Theory (cs.IT)
This paper investigates a new analog beamforming architecture for massive
multiple-input multiple-output (MIMO) systems, in which each of the multiple
transmit antennas is switched on or off to form a beam according to the
channel state information at the transmitter. This on-off analog beamforming
(OABF) scheme offers advantages in terms of both hardware and algorithmic
complexity. OABF completely removes the high-cost, power-consuming and bulky
analog phase shifters that are extensively employed by traditional analog
beamforming schemes; it only requires the deployment of low-cost analog
switches that are easy to implement. Moreover, we show that the beams formed by
such simple antenna on-off switching can achieve rather good performance with
low-complexity beamforming algorithms. Specifically, we first propose two
optimal signal-to-noise ratio maximization algorithms that determine the on-off
state of each switch under the per-antenna power constraint and the total power
constraint, respectively. We then theoretically prove that OABF achieves the
full diversity gain and the full array gain with at most polynomial complexity.
Numerical results are consistent with our theoretical analysis. We believe that
the simple structure of OABF makes massive MIMO much easier to implement in
real systems.
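The two power constraints mentioned above can be illustrated with a toy exhaustive search over switch states for a single-user MISO channel. The model (per-antenna power fixed per active antenna, or a total budget split equally over active antennas) and the brute-force search are illustrative assumptions for the sketch; the paper's optimal algorithms have much lower complexity.

```python
import itertools

def best_switches(h, per_antenna=True):
    """Exhaustive on-off antenna selection maximizing receive SNR for a
    MISO channel vector h.
    Per-antenna power constraint:  SNR proportional to |sum_{i in S} h_i|^2.
    Total power constraint (power split equally over the active set S):
    SNR proportional to |sum_{i in S} h_i|^2 / |S|."""
    n = len(h)
    best_S, best_snr = (), 0.0
    for r in range(1, n + 1):
        for S in itertools.combinations(range(n), r):
            gain = abs(sum(h[i] for i in S)) ** 2
            snr = gain if per_antenna else gain / len(S)
            if snr > best_snr:
                best_S, best_snr = S, snr
    return set(best_S), best_snr
```

Note how the search naturally switches off antennas whose channel coefficients would add destructively, which is exactly the degree of freedom the OABF architecture exploits.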
Shihao Yan, Biao He, Yirui Cong, Xiangyun Zhou
Comments: Accepted by IEEE ICC 2017
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)
Covert communication aims to achieve reliable transmission from a transmitter
to a receiver while guaranteeing an arbitrarily small probability of this
transmission being detected by a warden. In this work, we study covert
communication in AWGN channels with finite blocklength, in which the number of
channel uses is finite. Specifically, we analytically prove that the entire
block (all available channel uses) should be utilized to maximize the effective
throughput of the transmission subject to a predetermined covertness
requirement. This is a nontrivial result because more channel uses give the
warden more observations for detecting the transmission. We also determine
the maximum allowable transmit power per channel use, which is shown to
decrease as the blocklength increases. Despite the decrease in the maximum
allowable transmit power per channel use, the maximum allowable total power
over the entire block is proved to increase with the blocklength, which leads
to the fact that the effective throughput increases with the blocklength.
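The scaling behaviour described above, per-use power shrinking with the blocklength while total power over the block grows, can be reproduced in a standard Gaussian-warden model where covertness is enforced by bounding the warden's total KL divergence. The specific constraint, budget, and constants here are illustrative assumptions, not the paper's exact formulation.

```python
import math

def kl_per_use(P, sigma2=1.0):
    """KL divergence per channel use between the warden's observations
    with transmission, N(0, sigma2 + P), and without, N(0, sigma2)."""
    r = 1.0 + P / sigma2
    return 0.5 * (r - 1.0 - math.log(r))

def max_covert_power(n, budget=0.01, sigma2=1.0):
    """Largest per-use power P with n * kl_per_use(P) <= budget, found by
    bisection (kl_per_use is increasing in P)."""
    lo, hi = 0.0, 1.0
    while n * kl_per_use(hi, sigma2) < budget:
        hi *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if n * kl_per_use(mid, sigma2) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

For small P the per-use KL behaves like P^2/(4 sigma^4), so the maximum per-use power scales like 1/sqrt(n) while the total power n*P grows like sqrt(n), matching the trend stated in the abstract.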
AmirEmad Ghassami, Ali Yekkehkhany, Negar Kiyavash, Yi Lu
Subjects: Information Theory (cs.IT)
We study a covert queueing channel between two users sharing a round robin
scheduler. Such a covert channel can arise when users share a resource such as
a computer processor or a router arbitrated by a round robin policy. We present
an information-theoretic framework to model and derive the maximum reliable
data transmission rate, i.e., the capacity of this channel for both noiseless
and noisy scenarios. Our results show that seemingly isolated users can
communicate with high rate over the covert channel. Furthermore, we propose a
practical finite-length code construction, which achieves the capacity limit.
Rémi Bonnefoi, Amor Nafkha
Comments: 4 pages, 4 figures, accepted at IEEE ICC 2017 Optical Networks and Systems Symposium
Subjects: Information Theory (cs.IT)
In this paper, we present an analytical lower bound on the ergodic capacity
of optical multiple-input multiple-output (MIMO) channels. It turns out that
the optical MIMO channel matrix, which couples the mt inputs (modes/cores) into
mr outputs (modes/cores), can be modeled as a sub-matrix of an m x m
Haar-distributed unitary matrix, where m > mt, mr. Using the fact that the
probability density of the eigenvalues of a random matrix from the unitary
ensemble can be expressed in terms of the Christoffel-Darboux kernel, we
provide a new analytical expression of the ergodic capacity as a function of
the signal-to-noise ratio (SNR). Moreover, we derive a closed-form lower-bound
expression for the ergodic capacity. In addition, we also derive an
approximation to the ergodic capacity in the low-SNR regime. Finally, we
present numerical results supporting the derived expressions.
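The channel model above lends itself to a quick Monte Carlo check: sample an m x m Haar unitary (QR of a complex Gaussian matrix with the standard phase correction), take an mr x mt sub-block H, and average log2 det(I + (SNR/mt) H H*). The equal-power allocation across inputs and the parameter values are illustrative assumptions, not the paper's analytical setup.

```python
import numpy as np

def haar_unitary(m, rng):
    """Sample an m x m Haar-distributed unitary as the Q factor of the QR
    decomposition of a complex Gaussian matrix, with the standard phase
    correction from the diagonal of R."""
    Z = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))  # multiply each column by a unit phase

def ergodic_capacity(m, mt, mr, snr, trials=2000, seed=0):
    """Monte Carlo estimate of E[log2 det(I + (snr/mt) H H*)], where H is
    the top-left mr x mt sub-block of an m x m Haar unitary."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        H = haar_unitary(m, rng)[:mr, :mt]
        G = np.eye(mr) + (snr / mt) * H @ H.conj().T
        total += np.log2(np.linalg.det(G).real)
    return total / trials
```

Since H is a sub-block of a unitary, the eigenvalues of H H* lie in [0, 1], so the estimate is automatically capped at mr * log2(1 + snr/mt).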
Paul Harris, Steffen Malkowsky, Joao Vieira, Fredrik Tufvesson, Wael Boukley Hassan, Liang Liu, Mark Beach, Simon Armour, Ove Edfors
Comments: Submitted to the 2017 IEEE JSAC Special Issue on Deployment Issues and Performance Challenges for 5G
Subjects: Information Theory (cs.IT)
The first measured results for massive MIMO performance in a line-of-sight
(LOS) scenario with moderate mobility are presented, with 8 users served in
real-time using a 100-antenna base station (BS) at 3.7 GHz. When such a large
number of channels change dynamically, the inherent propagation and processing
delay has a critical relationship with the rate of change, as the use of
outdated channel information can result in severe detection and precoding
inaccuracies. For the downlink (DL) in particular, the time division duplex
(TDD) configuration synonymous with massive multiple-input, multiple-output
(MIMO) deployments could mean that only the uplink (UL) is usable in extreme
cases. Therefore, it is of great interest to investigate the impact of mobility
on massive MIMO performance and to consider ways to combat the potential
limitations. In a mobile scenario with moving cars and pedestrians, the massive
MIMO channel is sampled at many points in space to build a picture of the
overall user orthogonality, and the impact of both azimuth and elevation array
configurations is considered. Temporal analysis is also conducted for vehicles
moving at up to 29 km/h, and real-time bit error rates (BERs) for both the UL
and DL without power control are presented.
Sara Shahi, Daniela Tuninetti, Natasha Devroye
Comments: Under submission
Subjects: Information Theory (cs.IT)
This paper investigates the capacity of a communications channel that, in
addition to additive white Gaussian noise, also suffers from interference
caused by a co-existing radar transmission. The radar interference (of short
duty-cycle and of much wider bandwidth than the intended communication signal)
is modeled as an additive term whose amplitude is known and constant, but whose
phase is independent and identically uniformly distributed at each channel use.
The capacity-achieving input distribution, under the standard average power
constraint, is shown to have independent modulus and phase. The phase is
uniformly distributed on [0, 2pi]. The modulus is discrete with countably
infinitely many mass points, but only finitely many in any bounded interval.
From numerical evaluations, a proper-complex Gaussian input is seen to perform
quite well for weak radar interference. We also show that for very large radar
interference, the capacity equals (1/2)log(1 + S), and a proper-complex
Gaussian input achieves it. It is concluded that the presence of the radar
interference results in a loss of half of the degrees of freedom compared to an
AWGN channel without radar interference.
Danny Hucke, Markus Lohrey
Subjects: Information Theory (cs.IT)
We apply so-called tree straight-line programs to the problem of lossless
compression of binary trees. We derive upper bounds on the maximal pointwise
redundancy (or worst-case redundancy) that improve previous bounds obtained by
Zhang, Yang, and Kieffer using directed acyclic graphs. Using this, we obtain
universal codes for new classes of structured tree sources.
Sara Shahi, Daniela Tuninetti, Natasha Devroye
Comments: Under submission
Subjects: Information Theory (cs.IT)
The slotted strongly asynchronous channel with a bursty user consists of
(A_n = e^{n\alpha}) blocks of length (n) channel uses. A user transmits one
randomly selected message among (M_n = e^{nR}) different ones in exactly
(K_n = e^{n\nu}) randomly selected but distinct blocks of the available (A_n)
blocks. This paper studies the trade-off between ((R, \alpha, \nu)). For the
pure synchronization problem, i.e., (R = 0), the receiver must locate, with
vanishing error probability in (n), each of the (K_n) blocks in which the user
is transmitting; the trade-off between (\alpha) and (\nu) is exactly
characterized. For the case with (R > 0), the receiver is also required to
reliably decode all the (K_n) transmitted messages; outer and inner regions for
((R, \alpha, \nu)) are proposed.
Robert Beinert, Gerlind Plonka
Subjects: Numerical Analysis (math.NA); Information Theory (cs.IT)
In this paper, we show that sparse signals f representable as a linear
combination of a finite number N of spikes at arbitrary real locations or as a
finite linear combination of B-splines of order m with arbitrary real knots can
be almost surely recovered from O(N^2) Fourier intensity measurements up to
trivial ambiguities. The constructive proof consists of two steps, where in the
first step the Prony method is applied to recover all parameters of the
autocorrelation function and in the second step the parameters of f are
derived. Moreover, we present an algorithm to evaluate f from its Fourier
intensities and illustrate it with different numerical examples.
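The first step mentioned above relies on the classical Prony method. As a minimal sketch under the assumption of exact, noise-free samples f_k = sum_j c_j z_j^k for k = 0, ..., 2N-1 (the paper applies Prony to the autocorrelation parameters, not raw samples):

```python
import numpy as np

def prony(samples, N):
    """Classical Prony method: from 2N exact samples f_k = sum_j c_j z_j^k,
    recover the N nodes z_j and coefficients c_j."""
    f = np.asarray(samples, dtype=complex)
    # Solve the Hankel system H p = -f[N:2N] for the Prony polynomial
    # z^N + p_{N-1} z^{N-1} + ... + p_0, whose roots are the nodes.
    H = np.array([[f[i + j] for j in range(N)] for i in range(N)])
    p = np.linalg.solve(H, -f[N:2 * N])
    roots = np.roots(np.concatenate(([1.0], p[::-1])))  # monic, degree N
    # Coefficients from a Vandermonde least-squares fit V c = f.
    V = np.vander(roots, len(f), increasing=True).T     # V[k, j] = z_j^k
    c = np.linalg.lstsq(V, f, rcond=None)[0]
    return roots, c
```

The 2N-sample requirement matches the classical count of unknowns (N nodes plus N coefficients); with noisy data one would replace the exact Hankel solve by a least-squares or ESPRIT-type variant.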
James G. Dowty
Comments: 14 pages
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Differential Geometry (math.DG); Probability (math.PR)
Chentsov’s theorem characterizes the Fisher information metrics on
statistical models as essentially the only Riemannian metrics that are
invariant under sufficient statistics. This implies that statistical models are
naturally equipped with geometric structures, so Chentsov’s theorem explains
why many statistical properties can be described in geometric terms. However,
despite being one of the foundational theorems of statistics, Chentsov’s
theorem has only been proved previously in very restricted settings or under
relatively strong regularity and invariance assumptions. We therefore prove a
version of this theorem for the important case of exponential families. In
particular, we characterise the Fisher information metric as the only
Riemannian metric (up to rescaling) on an exponential family and its derived
families that is invariant under independent and identically distributed
extensions and canonical sufficient statistics. Our approach is based on the
central limit theorem, so it gives a unified proof for both discrete and
continuous exponential families, and it is less technical than previous
approaches.