Naresh Kumari, Nitin Malik, A. N. Jha, Gaddam Mallesham
Comments: SCOPUS indexed Singapore journal, ISSN (Print): 2319-8613, ISSN (Online): 0975-4024, 8 pages, 11 Figures, 5 tables
Journal-ref: Int. J. Engineering & Tech.,8(6),2779-2786,2016
Subjects: Systems and Control (cs.SY); Neural and Evolutionary Computing (cs.NE)
The system comprises of three interconnected power system networks based on
thermal, wind and hydro power generation. The load variation in any one of the
network results in frequency deviation in all the connected systems.The PI
controllers have been connected separately with each system for the frequency
control and the gains (Kp and Ki) of all the controllers have been optimized
along with frequency bias (Bi) and speed regulation parameter (Ri). The
computationally intelligent techniques like bacterial foraging optimization
(BFO) and particle swarm optimization (PSO) have been applied for the tuning of
controller gains along with variable parameters Bi and Ri. The gradient descent
(GD) based conventional method has also been applied for optimizing the
parameters Kp, Ki,Bi and Ri.The frequency responses are obtained with all the
methods. The performance index chosen is the integral square error (ISE). The
settling time, peak overshoot and peak undershoot of all the frequency
responses on applying three optimization techniques have been compared. It has
been observed that the peak overshoot and peak undershoot significantly reduce
with BFO technique followed by the PSO and GD techniques. While obtaining such
optimum response the settling time is increased marginally with bacterial
foraging technique due to large number of mathematical equations used for the
computation in BFO. The comparison of frequency response using three techniques
show the superiority of BFO over the PSO and GD techniques. The designing of
the system and tuning of the parameters with three techniques has been done in
MATLAB/SIMULINK environment.
Haiping Huang, Taro Toyoizumi
Comments: 7 pages and 5 figures
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Standard error backpropagation is used in almost all modern deep network
training. However, it typically suffers from proliferation of saddle points in
high-dimensional parameter space. Therefore, it is highly desirable to design
an efficient algorithm to escape from these saddle points and reach a good
parameter region of better generalization capabilities, especially based on
rough insights about the landscape of the error surface. Here, we propose a
simple extension of the backpropagation, namely reinforced backpropagation,
which simply adds previous first-order gradients in a stochastic manner with a
probability that increases with learning time. Extensive numerical simulations
on a toy deep learning model verify its excellent performance. The reinforced
backpropagation can significantly improve test performance of the deep network
training, especially when the data are scarce. The performance is even better
than that of state-of-the-art stochastic optimization algorithm called Adam,
with an extra advantage of less computer memory required.
Jianfeng Chen, Vivek Nair, Tim Menzies
Comments: 17 pages, 10 figures
Subjects: Software Engineering (cs.SE); Neural and Evolutionary Computing (cs.NE)
Context:Evolutionary algorithms typically require large number of evaluations
(of solutions) to reach their conclusions – which can be very slow and
expensive to evaluate. Objective:To solve search-based SE problems, using fewer
evaluations than evolutionary methods. Method:Instead of mutating a small
population, we build a very large initial population which is then culled using
a recursive bi-clustering binary chop approach. We evaluate this approach on
multiple software engineering(SE) models, unconstrained as well as constrained,
and compare its performance with standard evolutionary algorithms.
Results:Using just a few evaluations (under 100), we can obtain the comparable
results to standard evolutionary algorithms. Conclusion:Just because something
works, and is widespread use, does not necessarily mean that there is no value
in seeking methods to improve that method. Before undertaking SBSE optimization
task using traditional EAs, it is recommended to try other techniques, like
those explored here, to obtain the same results with fewer evaluations.
Gerard Rinkus
Comments: 30 pages, 13 figures
Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
It is widely acknowledged that the brain i) implements probabilistic
reasoning, and ii) represents information via population/distributed coding.
Previous population-based probabilistic (PPC) theories share some basic
properties: 1) continuous neurons; 2) all neurons formally participate in every
code; 3) decoding requires either graded synapses or rate coding; 4) generally
assume rate-coded signaling; individual neurons are generally assumed to 5)
have unimodal, e.g., bell-shaped, tuning functions (TFs) and 6) be
fundamentally noisy; and 7) noise/correlation are generally viewed as degrading
computation. In contrast, our theory assumes: 1) binary neurons; 2) only a
small subset of neurons, i.e., a sparse distributed code (SDC), comprises any
individual code; 3) binary synapses; 4) signaling via waves of
contemporaneously arriving first-spikes; individual neurons 5) have completely
flat TFs (all weights initially zero) and 6) are not noisy; and 7) noise is a
resource used to preserve similarity from inputs to codes. Thus, noise controls
a tradeoff between storage capacity and embedding the input space statistics
(in the pattern of code intersections), indirectly yielding particular
correlation patterns. The theory, Sparsey, was introduced 20 years ago as a
canonical cortical circuit/algorithm model, but not elaborated as a
probabilistic model. Assuming input similarity correlates with likelihood, the
active SDC code simultaneously represents both the most probable hypothesis and
the full probability distribution. We show this for spatial and spatiotemporal
(sequential) cases. Finally, consistent with moving beyond the Neuron Doctrine
to the view that the SDC (cell assembly, ensemble) is the atomic unit of neural
representation, Sparsey explains how classical unimodal TFs emerge as an
artifact of a single/few-trial learning process in which SDC codes are laid
down in superposition.
Ahmed Karam Eldaly, Yoann Altmann, Antonios Perperidis, Nikola Krstajic, Tushar Choudhary, Kevin Dhaliwal, Stephen McLaughlin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
Optical endomicroscopy (OEM) is an emerging technology platform with
preclinical and clinical imaging utility. Pulmonary OEM via multicore fibres
has the potential to provide in vivo in situ molecular signatures of disease
such as infection and inflammation. However, enhancing the quality of data
acquired by this technique for better visualization and subsequent analysis
remains a challenging problem. Cross coupling between fiber cores is one of the
main reasons of poor detection performance (i.e., inflammation, bacteria,
etc.). In this work, we address the problem of deconvolution and restoration of
OEM data. We propose and compare four methods, three are based on the
alternating direction method of multipliers (ADMM) and one is based on Markov
chain Monte Carlo (MCMC) methods. Results on both synthetic and real datasets
illustrate the effectiveness of the proposed methods.
Guillaume Noyel (IPRI), Michel Jourlin (LHC, IPRI)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
We establish the link between Mathematical Morphology and the map of
Aspl”und’s distances between a probe and a grey scale function, using the
Logarithmic Image Processing scalar multiplication. We demonstrate that the map
is the logarithm of the ratio between a dilation and an erosion of the function
by a structuring function: the probe. The dilations and erosions are mappings
from the lattice of the images into the lattice of the positive functions.
Using a flat structuring element, the expression of the map of Aspl”und’s
distances can be simplified with a dilation and an erosion of the image, these
mappings stays in the lattice of the images. We illustrate our approach by an
example of pattern matching with a non-flat structuring function.
Hanqing Zhang, Tim Stangner, Krister Wiklund, Alvaro Rodriguez, Magnus Andersson
Comments: Manuscript including supplementary materials
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a versatile and fast MATLAB program (UmUTracker) that
automatically detects and tracks particles by analyzing long video sequences
acquired by either light microscopy or digital holography microscopy. Our
program finds the 2D particle center position using an isosceles triangle
transform and the axial position by a fast implementation of
Rayleigh-Sommerfeld numerical reconstruction algorithm using a one dimensional
radial intensity profile. To validate the accuracy of our program we test each
module individually in controlled experiments. First, we determine the lateral
position of polystyrene particles recorded under bright field microscopy and
digital holography conditions. Second, we reconstruct the axial position of the
same particles by analyzing synthetic and experimentally acquired holograms.
Thereafter, as a proof of concept, we profile the fluid flow in a 100
micrometer high flow chamber. This result agrees with computational fluid
dynamic simulations. On a regular desktop computer UmUTracker can detect,
analyze, and track a single particle at 5 frames per second for a template size
of 201 x 201 in a 1024 x 1024 image. To enhance usability and to make it easy
to implement new functions we used object-oriented programming. UmUTracker is
suitable for studies related to: particle dynamics, cell localization, colloids
and microfluidic flow measurement.
Nan Li, Yifang Xu, Chao Wang
Comments: 9 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Naturalness of warping is gaining extensive attention in image stitching.
Recent warps such as SPHP, AANAP and GSP, use a global similarity to
effectively mitigate projective distortion (which enlarges regions), however,
they necessarily bring in perspective distortion (which generates
inconsistency). In this paper, we propose a quasi-homography warp, which
balances perspective distortion against projective distortion in the
non-overlapping region, to create natural-looking mosaics. Our approach
formulates the warp as a solution of a system of bivariate equations, where
perspective distortion and projective distortion are characterized as slope
preservation and scale linearization respectively. Our proposed warp only
relies on a global homography thus is totally parameter-free. A comprehensive
experiment shows that quasi-homography outperforms some state-of-the-art warps
in urban scenes, including homography, AutoStitch and SPHP. A user study
demonstrates that quasi-homography wins most users’ favor as well, comparing to
homography and SPHP.
Jingkuan Song, Tao He, Lianli Gao, Xing Xu, Heng Tao Shen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Instance Search (INS) is a fundamental problem for many applications, while
it is more challenging comparing to traditional image search since the
relevancy is defined at the instance level.
Existing works have demonstrated the success of many complex ensemble systems
that are typically conducted by firstly generating object proposals, and then
extracting handcrafted and/or CNN features of each proposal for matching.
However, object bounding box proposals and feature extraction are often
conducted in two separated steps, thus the effectiveness of these methods
collapses. Also, due to the large amount of generated proposals, matching speed
becomes the bottleneck that limits its application to large-scale datasets. To
tackle these issues, in this paper we propose an effective and efficient Deep
Region Hashing (DRH) approach for large-scale INS using an image patch as the
query. Specifically, DRH is an end-to-end deep neural network which consists of
object proposal, feature extraction, and hash code generation. DRH shares
full-image convolutional feature map with the region proposal network, thus
enabling nearly cost-free region proposals. Also, each high-dimensional,
real-valued region features are mapped onto a low-dimensional, compact binary
codes for the efficient object region level matching on large-scale dataset.
Experimental results on four datasets show that our DRH can achieve even better
performance than the state-of-the-arts in terms of MAP, while the efficiency is
improved by nearly 100 times.
Gerard Rinkus
Comments: 30 pages, 13 figures
Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
It is widely acknowledged that the brain i) implements probabilistic
reasoning, and ii) represents information via population/distributed coding.
Previous population-based probabilistic (PPC) theories share some basic
properties: 1) continuous neurons; 2) all neurons formally participate in every
code; 3) decoding requires either graded synapses or rate coding; 4) generally
assume rate-coded signaling; individual neurons are generally assumed to 5)
have unimodal, e.g., bell-shaped, tuning functions (TFs) and 6) be
fundamentally noisy; and 7) noise/correlation are generally viewed as degrading
computation. In contrast, our theory assumes: 1) binary neurons; 2) only a
small subset of neurons, i.e., a sparse distributed code (SDC), comprises any
individual code; 3) binary synapses; 4) signaling via waves of
contemporaneously arriving first-spikes; individual neurons 5) have completely
flat TFs (all weights initially zero) and 6) are not noisy; and 7) noise is a
resource used to preserve similarity from inputs to codes. Thus, noise controls
a tradeoff between storage capacity and embedding the input space statistics
(in the pattern of code intersections), indirectly yielding particular
correlation patterns. The theory, Sparsey, was introduced 20 years ago as a
canonical cortical circuit/algorithm model, but not elaborated as a
probabilistic model. Assuming input similarity correlates with likelihood, the
active SDC code simultaneously represents both the most probable hypothesis and
the full probability distribution. We show this for spatial and spatiotemporal
(sequential) cases. Finally, consistent with moving beyond the Neuron Doctrine
to the view that the SDC (cell assembly, ensemble) is the atomic unit of neural
representation, Sparsey explains how classical unimodal TFs emerge as an
artifact of a single/few-trial learning process in which SDC codes are laid
down in superposition.
Dmitry Petrov, Boris Gutman, Alexander Ivanov, Joshua Faskowitz, Mikhail Belyaev, Paul Thompson
Comments: Accepted for IEEE International Symposium on Biomedical Imaging 2017
Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV)
In this work, we study the extent to which structural connectomes and
topological derivative measures are unique to individual changes within human
brains. To do so, we classify structural connectome pairs from two large
longitudinal datasets as either belonging to the same individual or not. Our
data is comprised of 227 individuals from the Alzheimer’s Disease Neuroimaging
Initiative (ADNI) and 226 from the Parkinson’s Progression Markers Initiative
(PPMI). We achieve 0.99 area under the ROC curve score for features which
represent either weights or network structure of the connectomes (node degrees,
PageRank and local efficiency). Our approach may be useful for eliminating
noisy features as a preprocessing step in brain aging studies and early
diagnosis classification problems.
Ardavan Salehi Nobandegani, Ioannis N. Psaromiligkos
Subjects: Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
The Frame Problem (FP) is a puzzle in philosophy of mind and epistemology,
articulated by the Stanford Encyclopedia of Philosophy as follows: “How do we
account for our apparent ability to make decisions on the basis only of what is
relevant to an ongoing situation without having explicitly to consider all that
is not relevant?” In this work, we focus on the causal variant of the FP, the
Causal Frame Problem (CFP). Assuming that a reasoner’s mental causal model can
be (implicitly) represented by a causal Bayes net, we first introduce a notion
called Potential Level (PL). PL, in essence, encodes the relative position of a
node with respect to its neighbors in a causal Bayes net. Drawing on the
psychological literature on causal judgment, we substantiate the claim that PL
may bear on how time is encoded in the mind. Using PL, we propose an inference
framework, called the PL-based Inference Framework (PLIF), which permits a
boundedly-rational approach to the CFP to be formally articulated at Marr’s
algorithmic level of analysis. We show that our proposed framework, PLIF, is
consistent with a wide range of findings in causal judgment literature, and
that PL and PLIF make a number of predictions, some of which are already
supported by existing findings.
Apratim Bhattacharyya, Jilles Vreeken
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)
Discovering the key structure of a database is one of the main goals of data
mining. In pattern set mining we do so by discovering a small set of patterns
that together describe the data well. The richer the class of patterns we
consider, and the more powerful our description language, the better we will be
able to summarise the data. In this paper we propose ourmethod, a novel greedy
MDL-based method for summarising sequential data using rich patterns that are
allowed to interleave. Experiments show ourmethod is orders of magnitude
faster than the state of the art, results in better models, as well as
discovers meaningful semantics in the form patterns that identify multiple
choices of values.
Sven Tomforde, Bernhard Sick, Christian Müller-Schloer
Comments: 10 pages, one figure, article
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)
Organic Computing is an initiative in the field of systems engineering that
proposed to make use of concepts such as self-adaptation and self-organisation
to increase the robustness of technical systems. Based on the observation that
traditional design and operation concepts reach their limits, transferring more
autonomy to the systems themselves should result in a reduction of complexity
for users, administrators, and developers. However, there seems to be a need
for an updated definition of the term “Organic Computing”, of desired
properties of technical, organic systems, and the objectives of the Organic
Computing initiative. With this article, we will address these points.
Lucas Lamata
Subjects: Quantum Physics (quant-ph); Mesoscale and Nanoscale Physics (cond-mat.mes-hall); Superconductivity (cond-mat.supr-con); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Superconducting circuit technologies have recently achieved quantum protocols
involving closed feedback loops. Quantum artificial intelligence and quantum
machine learning are emerging fields inside quantum technologies which may
enable quantum devices to acquire information from the outer world and improve
themselves via a learning process. Here we propose the implementation of basic
protocols in quantum reinforcement learning, with superconducting circuits
employing feedback-loop control. We introduce diverse scenarios for
proof-of-principle experiments with state-of-the-art superconducting circuit
technologies and analyze their feasibility in presence of imperfections. The
field of quantum artificial intelligence implemented with superconducting
circuits paves the way for enhanced quantum control and quantum computation
protocols.
Syed Mehedi Hasan Nirob, Md. Kazi Nayeem, Md. Saiful Islam
Comments: 8 pages
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Trending topic of newspapers is an indicator to understand the situation of a
country and also a way to evaluate the particular newspaper. This paper
represents a model describing few techniques to select trending topics from
Bangla Newspaper. Topics that are discussed more frequently than other in
Bangla newspaper will be marked and how a very famous topic loses its
importance with the change of time and another topic takes its place will be
demonstrated. Data from two popular Bangla Newspaper with date and time were
collected. Statistical analysis was performed after on these data after
preprocessing. Popular and most used keywords were extracted from the stream of
Bangla keyword with this analysis. This model can also cluster category wise
news trend or a list of news trend in daily or weekly basis with enough data. A
pattern can be found on their news trend too. Comparison among past news trend
of Bangla newspapers will give a visualization of the situation of Bangladesh.
This visualization will be helpful to predict future trending topics of Bangla
Newspaper.
Björn Ross, Michael Rist, Guillermo Carbonell, Benjamin Cabrera, Nils Kurowsky, Michael Wojatzki
Journal-ref: Proceedings of NLP4CMC III: 3rd Workshop on Natural Language
Processing for Computer-Mediated Communication (Bochum), Bochumer
Linguistische Arbeitsberichte, vol. 17, sep 2016, pp. 6-9
Subjects: Computation and Language (cs.CL)
Some users of social media are spreading racist, sexist, and otherwise
hateful content. For the purpose of training a hate speech detection system,
the reliability of the annotations is crucial, but there is no universally
agreed-upon definition. We collected potentially hateful messages and asked two
groups of internet users to determine whether they were hate speech or not,
whether they should be banned or not and to rate their degree of offensiveness.
One of the groups was shown a definition prior to completing the survey. We
aimed to assess whether hate speech can be annotated reliably, and the extent
to which existing definitions are in accordance with subjective ratings. Our
results indicate that showing users a definition caused them to partially align
their own opinion with the definition but did not improve reliability, which
was very low overall. We conclude that the presence of hate speech should
perhaps not be considered a binary yes-or-no decision, and raters need more
detailed instructions for the annotation.
Vladimir Chernykh, Grigoriy Sterling, Pavel Prihodko
Subjects: Computation and Language (cs.CL)
In this paper the task of emotion recognition from speech is considered.
Proposed approach uses deep recurrent neural network trained on a sequence of
acoustic features calculated over small speech intervals. At the same time
special probabilistic-nature CTC loss function allows to consider long
utterances containing both emotional and unemotional parts. The effectiveness
of such an approach is shown in two ways. First one is the comparison with
recent advances in this field. While second way implies measuring human
performance on the same task, which also was done by authors.
Dávid Márk Nemeskey
Comments: Additional resources: – the emLam repository: this https URL – the emLam corpus: this http URL
Journal-ref: In Proceedings of the 13th Conference on Hungarian Computational
Linguistics (MSZNY), pp. 91-102. Szeged, 2017
Subjects: Computation and Language (cs.CL)
This paper aims to make up for the lack of documented baselines for Hungarian
language modeling. Various approaches are evaluated on three publicly available
Hungarian corpora. Perplexity values comparable to models of similar-sized
English corpora are reported. A new, freely downloadable Hungar- ian benchmark
corpus is introduced.
Syed Mehedi Hasan Nirob, Md. Kazi Nayeem, Md. Saiful Islam
Comments: 8 pages
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Trending topic of newspapers is an indicator to understand the situation of a
country and also a way to evaluate the particular newspaper. This paper
represents a model describing few techniques to select trending topics from
Bangla Newspaper. Topics that are discussed more frequently than other in
Bangla newspaper will be marked and how a very famous topic loses its
importance with the change of time and another topic takes its place will be
demonstrated. Data from two popular Bangla Newspaper with date and time were
collected. Statistical analysis was performed after on these data after
preprocessing. Popular and most used keywords were extracted from the stream of
Bangla keyword with this analysis. This model can also cluster category wise
news trend or a list of news trend in daily or weekly basis with enough data. A
pattern can be found on their news trend too. Comparison among past news trend
of Bangla newspapers will give a visualization of the situation of Bangladesh.
This visualization will be helpful to predict future trending topics of Bangla
Newspaper.
Matt M. T. Yiu, Helen H. W. Chan, Patrick P. C. Lee
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
Erasure coding has been widely adopted in distributed storage systems for
fault-tolerant storage with low redundancy. We present MemEC, an
erasure-coding-based in-memory key-value (KV) store. MemEC is specifically
designed for workloads dominated by small objects. It encodes objects in
entirety, and incurs 60% less storage redundancy for small objects than
existing replication- and erasure-coding-based approaches. It also supports
graceful transitions between decentralized requests in normal mode (i.e., no
failures) and coordinated requests in degraded mode (i.e., with failures), so
as to maintain availability and consistency. We evaluate our MemEC prototype
via extensive testbed experiments under read-heavy and update-heavy YCSB
workloads. MemEC achieves high throughput and low latency in both normal and
degraded modes, and also achieves fast transitions between the two modes.
Marek Kokot, Maciej Długosz, Sebastian Deorowicz
Subjects: Genomics (q-bio.GN); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
Summary: Counting all k-mers in a given dataset is a standard procedure in
many bioinformatics applications. We introduce KMC3, a significant improvement
of the former KMC2 algorithm together with KMC tools for manipulating k-mer
databases. Usefulness of the tools is shown on a few real problems.
Availability: Program is freely available at
this http URL Contact: sebastian.deorowicz@polsl.pl
Haiping Huang, Taro Toyoizumi
Comments: 7 pages and 5 figures
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Standard error backpropagation is used in almost all modern deep network
training. However, it typically suffers from proliferation of saddle points in
high-dimensional parameter space. Therefore, it is highly desirable to design
an efficient algorithm to escape from these saddle points and reach a good
parameter region of better generalization capabilities, especially based on
rough insights about the landscape of the error surface. Here, we propose a
simple extension of the backpropagation, namely reinforced backpropagation,
which simply adds previous first-order gradients in a stochastic manner with a
probability that increases with learning time. Extensive numerical simulations
on a toy deep learning model verify its excellent performance. The reinforced
backpropagation can significantly improve test performance of the deep network
training, especially when the data are scarce. The performance is even better
than that of state-of-the-art stochastic optimization algorithm called Adam,
with an extra advantage of less computer memory required.
Naman Agarwal, Karan Singh
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
We design differentially private algorithms for the problem of online linear
optimization in the full information and bandit settings with optimal
( ilde{O}(sqrt{T})) regret bounds. In the full-information setting, our
results demonstrate that ((epsilon, delta))-differential privacy may be
ensured for free – in particular, the regret bounds scale as
(O(sqrt{T})+ ilde{O}ig(frac{1}{epsilon}log frac{1}{delta}ig)). For
bandit linear optimization, and as a special case, for non-stochastic
multi-armed bandits, the proposed algorithm achieves a regret of
(OBig(frac{sqrt{Tlog T}}{epsilon}log frac{1}{delta}Big)), while the
previously best known bound was
( ilde{O}Big(frac{T^{frac{3}{4}}}{epsilon}Big)).
Adarsh Barik, Jean Honorio, Mohit Tawarmalani
Subjects: Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
We analyze the necessary number of samples for sparse vector recovery in a
noisy linear prediction setup. This model includes problems such as linear
regression and classification. We focus on structured graph models. In
particular, we prove that sufficient number of samples for the weighted graph
model proposed by Hegde and others is also necessary. We use the Fano’s
inequality on well constructed ensembles as our main tool in establishing
information theoretic lower bounds.
Jeff Heaton
Subjects: Learning (cs.LG)
Machine learning models, such as neural networks, decision trees, random
forests and gradient boosting machines accept a feature vector and provide a
prediction. These models learn in a supervised fashion where a set of feature
vectors with expected output is provided. It is very common practice to
engineer new features from the provided feature set. Such engineered features
will either augment, or replace portions of the existing feature vector. These
engineered features are essentially calculated fields, based on the values of
the other features.
Engineering such features is primarily a manual, time-consuming task.
Additionally, each type of model will respond differently to different types of
engineered features. This paper reports on empirical research to demonstrate
what types of engineered features are best suited to which machine learning
model type. This is accomplished by generating several datasets that are
designed to benefit from a particular type of engineered feature. The
experiment demonstrates to what degree the machine learning model is capable of
synthesizing the needed feature on its own. If a model is capable of
synthesizing an engineered feature, it is not necessary to provide that
feature. The research demonstrated that the studied models do indeed perform
differently with various types of engineered features.
Vivek Nair, Tim Menzies, Norbert Siegmund, Sven Apel
Comments: 26 pages, 6 figures
Subjects: Software Engineering (cs.SE); Learning (cs.LG)
Despite the huge spread and economical importance of configurable software
systems, there is unsatisfactory support in utilizing the full potential of
these systems with respect to finding performance-optimal configurations. Prior
work on predicting the performance of software configurations suffered from
either (a) requiring far too many sample configurations or (b) large variances
in their predictions. Both these problems can be avoided using the WHAT
spectral learner. WHAT’s innovation is the use of the spectrum (eigenvalues) of
the distance matrix between the configurations of a configurable software
system, to perform dimensionality reduction. Within that reduced configuration
space, many closely associated configurations can be studied by executing only
a few sample configurations. For the subject systems studied here, a few dozen
samples yield accurate and stable predictors – less than 10% prediction error,
with a standard deviation of less than 2%. When compared to the state of the
art, WHAT (a) requires 2 to 10 times fewer samples to achieve similar
prediction accuracies, and (b) its predictions are more stable (i.e., have
lower standard deviation). Furthermore, we demonstrate that predictive models
generated by WHAT can be used by optimizers to discover system configurations
that closely approach the optimal performance.
Bert J. Claessens, Dirk Vanhoudt, Johan Desmedt, Frederik Ruelens
Comments: Under review at Elsevier: Energy and buildings 2017
Subjects: Systems and Control (cs.SY); Learning (cs.LG)
Optimal control of thermostatically controlled loads connected to a district
heating network is considered a sequential decision- making problem under
uncertainty. The practicality of a direct model-based approach is compromised
by two challenges, namely scalability due to the large dimensionality of the
problem and the system identification required to identify an accurate model.
To help in mitigating these problems, this paper leverages on recent
developments in reinforcement learning in combination with a market-based
multi-agent system to obtain a scalable solution that obtains a significant
performance improvement in a practical learning time. The control approach is
applied on a scenario comprising 100 thermostatically controlled loads
connected to a radial district heating network supplied by a central combined
heat and power plant. Both for an energy arbitrage and a peak shaving
objective, the control approach requires 60 days to obtain a performance within
65% of a theoretical lower bound on the cost.
Franz J. Király, Zhaozhi Qian
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Applications (stat.AP); Methodology (stat.ME)
Prediction and modelling of competitive sports outcomes has received much
recent attention, especially from the Bayesian statistics and machine learning
communities. In the real world setting of outcome prediction, the seminal
‘{E}lH{o} update still remains, after more than 50 years, a valuable baseline
which is difficult to improve upon, though in its original form it is a
heuristic and not a proper statistical “model”. Mathematically, the ‘{E}lH{o}
rating system is very closely related to the Bradley-Terry models, which are
usually used in an explanatory fashion rather than in a predictive supervised
or on-line learning setting.
Exploiting this close link between these two model classes and some newly
observed similarities, we propose a new supervised learning framework with
close similarities to logistic regression, low-rank matrix completion and
neural networks. Building on it, we formulate a class of structured log-odds
models, unifying the desirable properties found in the above: supervised
probabilistic prediction of scores and wins/draws/losses, batch/epoch and
on-line learning, as well as the possibility to incorporate features in the
prediction, without having to sacrifice simplicity, parsimony of the
Bradley-Terry models, or computational efficiency of ‘{E}lH{o}’s original
approach.
We validate the structured log-odds modelling approach in synthetic
experiments and English Premier League outcomes, where the added expressivity
yields the best predictions reported in the state-of-art, close to the quality
of contemporary betting odds.
Martin Arjovsky, Soumith Chintala, Léon Bottou
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
We introduce a new algorithm named WGAN, an alternative to traditional GAN
training. In this new model, we show that we can improve the stability of
learning, get rid of problems like mode collapse, and provide meaningful
learning curves useful for debugging and hyperparameter searches. Furthermore,
we show that the corresponding optimization problem is sound, and provide
extensive theoretical work highlighting the deep connections to other distances
between distributions.
Arjun Radhakrishna, Nicholas Lewchenko, Shawn Meier, Sergio Mover, Krishna Chaitanya Sripada, Damien Zufferey, Bor-Yuh Evan Chang, Pavol Černý
Comments: Submitted to CAV 2017
Subjects: Logic in Computer Science (cs.LO); Learning (cs.LG); Programming Languages (cs.PL)
In event-driven programming frameworks, such as Android, the client and the
framework interact using callins (framework methods that the client invokes)
and callbacks (client methods that the framework invokes). The protocols for
interacting with these frameworks can often be described by finite-state
machines we dub *asynchronous typestates*. Asynchronous typestates are akin to
classical typestates, with the key difference that their outputs (callbacks)
are produced asynchronously.
We present an algorithm to infer asynchronous typestates for Android
framework classes. It is based on the L* algorithm that uses membership and
equivalence queries. We show how to implement these queries for Android
classes. Membership queries are implemented using testing. Under realistic
assumptions, equivalence queries can be implemented using membership queries.
We provide an improved algorithm for equivalence queries that is better suited
for our application than the algorithms from literature. Instead of using a
bound on the size of the typestate to be learned, our algorithm uses a
*distinguisher bound*. The distinguisher bound quantifies how two states in the
typestate are locally different.
We implement our approach and evaluate it empirically. We use our tool,
Starling, to learn asynchronous typestates for Android classes both for cases
where one is already provided by the documentation, and for cases where the
documentation is unclear. The results show that Starling learns asynchronous
typestates accurately and efficiently. Additionally, in several cases, the
synthesized asynchronous typestates uncovered surprising and undocumented
behaviors.
Seyyed Ali Hashemi, Carlo Condo, Warren J. Gross
Comments: WCNC 2017 Polar Coding Workshop
Subjects: Information Theory (cs.IT)
Polar codes are capacity achieving error correcting codes that can be decoded
through the successive-cancellation algorithm. To improve its error-correction
performance, a list-based version called successive-cancellation list (SCL) has
been proposed in the past, that however substantially increases the number of
time-steps in the decoding process. The simplified SCL (SSCL) decoding
algorithm exploits constituent codes within the polar code structure to greatly
reduce the required number of time-steps without introducing any
error-correction performance loss. In this paper, we propose a faster decoding
approach to decode one of these constituent codes, the Rate-1 node. We use this
Rate-1 node decoder to develop Fast-SSCL. We demonstrate that only a
list-size-bound number of bits needs to be estimated in Rate-1 nodes and
Fast-SSCL exactly matches the error-correction performance of SCL and SSCL.
This technique can potentially greatly reduce the total number of time-steps
needed for polar codes decoding: analysis on a set of case studies show that
Fast-SSCL has a number of time-steps requirement that is up to 66.6% lower than
SSCL and 88.1% lower than SCL.
Ryan Gabrys, Olgica Milenkovic
Subjects: Information Theory (cs.IT)
We introduce a new variant of the (k)-deck problem, which in its traditional
formulation asks for determining the smallest (k) that allows one to
reconstruct any binary sequence of length (n) from the multiset of its
(k)-length subsequences. In our version of the problem, termed the hybrid
k-deck problem, one is given a certain number of special subsequences of the
sequence of length (n – t), (t > 0), and the question of interest is to
determine the smallest value of (k) such that the (k)-deck, along with the
subsequences, allows for reconstructing the original sequence in an error-free
manner. We first consider the case that one is given a single subsequence of
the sequence of length (n – t), obtained by deleting zeros only, and seek the
value of (k) that allows for hybrid reconstruction. We prove that in this case,
(k in [log t+2, min{ t+1, O(sqrt{n cdot (1+log t)}) } ]). We then
proceed to extend the single-subsequence setup to the case where one is given
(M) subsequences of length (n – t) obtained by deleting zeroes only. In this
case, we first aggregate the asymmetric traces and then invoke the single-trace
results. The analysis and problem at hand are motivated by nanopore sequencing
problems for DNA-based data storage.
Kevin R. Moon, Kumar Sricharan, Alfred O. Hero III
Comments: 21 pages, 1 figure
Subjects: Information Theory (cs.IT); Statistics Theory (math.ST)
We derive the mean squared error convergence rates of kernel density-based
plug-in estimators of mutual information measures between two multidimensional
random variables (mathbf{X}) and (mathbf{Y}) for two cases: 1) (mathbf{X})
and (mathbf{Y}) are both continuous; 2) (mathbf{X}) is continuous and
(mathbf{Y}) is discrete. Using the derived rates, we propose an ensemble
estimator of these information measures for the second case by taking a
weighted sum of the plug-in estimators with varied bandwidths. The resulting
ensemble estimator achieves the (1/N) parametric convergence rate when the
conditional densities of the continuous variables are sufficiently smooth. To
the best of our knowledge, this is the first nonparametric mutual information
estimator known to achieve the parametric convergence rate for this case, which
frequently arises in applications (e.g. variable selection in classification).
The estimator is simple to implement as it uses the solution to an offline
convex optimization problem and simple plug-in estimators. A central limit
theorem is also derived for the ensemble estimator. Ensemble estimators that
achieve the parametric rate are also derived for the first case ((mathbf{X})
and (mathbf{Y}) are both continuous) and another case 3) (mathbf{X}) and
(mathbf{Y}) may have any mixture of discrete and continuous components.
Mohamad Khas Mohamadi, Hamid Saeedi, Reza Asvadi
Comments: 9 pages, 6 figures
Subjects: Information Theory (cs.IT)
This paper is concerned with the design of capacity approaching ensembles of
Low-Densiy Parity-Check (LDPC) codes for correlated sources. We consider
correlated binary sources where the data is encoded independently at each
source through a systematic LDPC encoder and sent over two independent
channels. At the receiver, a iterative joint decoder consisting of two
component LDPC decoders is considered where the encoded bits at the output of
each component decoder are used at the other decoder as the a priori
information. We first provide asymptotic performance analysis using the concept
of extrinsic information transfer (EXIT) charts. Compared to the conventional
EXIT charts devised to analyze LDPC codes for point to point communication, the
proposed EXIT charts have been completely modified to able to accommodate the
systematic nature of the codes as well as the iterative behavior between the
two component decoders. Then the developed modified EXIT charts are deployed to
design ensembles for different levels of correlation. Our results show that as
the average degree of the designed ensembles grow, the thresholds corresponding
to the designed ensembles approach the capacity. In particular, for ensembles
with average degree of around 9, the gap to capacity is reduced to about 0.2dB.
Finite block length performance evaluation is also provided for the designed
ensembles to verify the asymptotic results.
Anas Chaaban, Aydin Sezgin, Mohamed-Slim Alouini
Subjects: Information Theory (cs.IT)
The degrees-of-freedom (DoF) of the multi-antenna three-way channel (3WC)
with an intermittent node is studied. Special attention is given to the impact
of adaptation. A nonadaptive transmission scheme based on interference
alignment, zero-forcing, and erasure-channel treatment is proposed, and its
corresponding DoF region is derived. Then, it is shown that this scheme
achieves the sum-DoF of the intermittent channel, in addition to the DoF region
of the nonintermittent one. Thus, adaptation is not necessary from those
perspectives. To the contrary, it is shown that adaptation is necessary for
achieving the DoF region of the intermittent case. This is shown by deriving an
outer bound for the intermittent channel with nonadaptive encoding, and giving
a counterexample of an adaptive scheme which achieves DoF tuples outside this
bound. This highlights the importance of cooperation in this intermittent
network.
Vahid Aref, Zhenhua Dong, Henning Buelow
Comments: The invited paper presented in IEEE Photonics Conference (IPC) 2016, Oct. 2016
Subjects: Information Theory (cs.IT)
We explain how to optimize the nonlinear spectrum of multi-soliton pulses by
considering the practical constraints of transmitter, receiver, and
lumped-amplified link. The optimization is applied for the experimental
transmission of 2ns soliton pulses with independent on-off keying of 10
eigenvalues over 2000 km of NZ-DSF fiber spans.
Joseph J. Boutros, Fanny Jardel, Cyril Méasson
Comments: Submitted to IEEE International Symposium on Information Theory (ISIT) 2017
Subjects: Information Theory (cs.IT)
We generalize probabilistic amplitude shaping (PAS) with binary codes to the
case of non-binary codes defined over prime finite fields. Firstly, we
introduce probabilistic shaping via time sharing where shaping applies to
information symbols only. Then, we design circular quadrature amplitude
modulations (CQAM) that allow to directly generalize PAS to prime finite fields
with full shaping.
Yahia Alghorani, Mehdi Sayfi
Comments: Accepted for publication in IEEE Wireless Communications Letters, Jan. 2017
Subjects: Information Theory (cs.IT)
In this letter, we investigate the performance of multiple-input
multiple-output techniques in a vehicle-to-vehicle communication system. We
consider both transmit antenna selection with maximal-ratio combining and
transmit antenna selection with selection combining. The channel propagation
model between two vehicles is represented as n*Rayleigh distribution, which has
been shown to be a realistic model for vehicle-to-vehicle communication
scenarios. We derive tight analytical expressions for the outage probability
and amount of fading of the post-processing signal-to-noise ratio.
Yeow Meng Chee, Tuvi Etzion, Han Mao Kiah, Alexander Vardy
Subjects: Information Theory (cs.IT)
High temperatures have dramatic negative effects~on interconnect performance
and, hence, numerous techniques have been proposed to reduce the power
consumption of on-chip buses. However, existing methods fall short of fully
addressing the thermal challenges posed by high-performance interconnects. In
this paper, we introduce new efficient coding schemes that make it possible to
directly control the peak temperature of a bus by effectively cooling its
hottest wires. This is achieved by avoiding state transitions on the hottest
wires for as long as necessary until their temperature drops off. At the same
time, we reduce the average power consumption by making sure that the total
number of state transitions on all the wires is below a prescribed threshold.
These two features are obtained separately or simultaneously. In addition,
error-correction for the transmitted information can also be provided with each
one of the two features and when they both obtained simultaneously.
Our solutions call for some redundancy: we use (n > k) wires to encode a
given (k)-bit bus. Therefore, it is important to determine theoretically the
minimum possible number of wires (n) needed to encode (k) bits while satisfying
the desired properties. We provide full theoretical analysis in each case, and
show that the number of additional wires required to cool the (t) hottest wires
becomes negligible when (k) is large. Moreover, although the proposed
thermal-management techniques make use of sophisticated tools from
combinatorics, discrete geometry, linear algebra, and coding theory, the
resulting encoders and decoders are fully practical. They do not require
significant computational overhead and can be implemented without sacrificing a
large circuit area.
Elena Boshkovska, Nikola Zlatanov, Linglong Dai, Derrick Wing Kwan Ng, Robert Schober
Comments: Accepted for publication, WCNC 2017
Subjects: Information Theory (cs.IT)
We optimize resource allocation to enable communication security in
simultaneous wireless information and power transfer (SWIPT) for
internet-of-things (IoT) networks. The resource allocation algorithm design is
formulated as a non-convex optimization problem. We aim at maximizing the total
harvested power at energy harvesting (EH) receivers via the joint optimization
of transmit beamforming vectors and the covariance matrix of the artificial
noise injected to facilitate secrecy provisioning. The proposed problem
formulation takes into account the non-linearity of energy harvesting circuits
and the quality of service requirements for secure communication.
To obtain a globally optimal solution of the resource allocation problem, we
first transform the resulting non-convex sum-of-ratios objective function into
an equivalent objective function in parametric subtractive form, which
facilitates the design of a novel iterative resource allocation algorithm. In
each iteration, the semidefinite programming (SDP) relaxation approach is
adopted to solve a rank-constrained optimization problem optimally. Numerical
results reveal that the proposed algorithm can guarantee communication security
and provide a significant performance gain in terms of the harvested energy
compared to existing designs which are based on the traditional linear EH
model.
Emrah Akyol, Kenneth Rose, Tamer Basar, Cedric Langbort
Comments: submitted to IEEE Transactions on Signal and Information Processing over Networks, Special Issue on Distributed Signal Processing for Security and Privacy in Networked Cyber-Physical Systems
Subjects: Computer Science and Game Theory (cs.GT); Cryptography and Security (cs.CR); Information Theory (cs.IT); Multiagent Systems (cs.MA)
This paper studies optimal communication and coordination strategies in
cyber-physical systems for both defender and attacker within a game-theoretic
framework. We model the communication network of a cyber-physical system as a
sensor network which involves one single Gaussian source observed by many
sensors, subject to additive independent Gaussian observation noises. The
sensors communicate with the estimator over a coherent Gaussian multiple access
channel. The aim of the receiver is to reconstruct the underlying source with
minimum mean squared error. The scenario of interest here is one where some of
the sensors are captured by the attacker and they act as the adversary
(jammer): they strive to maximize distortion. The receiver (estimator) knows
the captured sensors but still cannot simply ignore them due to the multiple
access channel, i.e., the outputs of all sensors are summed to generate the
estimator input. We show that the ability of transmitter sensors to secretly
agree on a random event, that is “coordination”, plays a key role in the
analysis…
Thibault Lesieur, Léo Miolane, Marc Lelarge, Florent Krzakala, Lenka Zdeborová
Comments: 8 pages, 3 figures, 1 table
Subjects: Statistics Theory (math.ST); Disordered Systems and Neural Networks (cond-mat.dis-nn); Information Theory (cs.IT)
We consider tensor factorizations using a generative model and a Bayesian
approach. We compute rigorously the mutual information, the Minimal Mean Square
Error (MMSE), and unveil information-theoretic phase transitions. In addition,
we study the performance of Approximate Message Passing (AMP) and show that it
achieves the MMSE for a large set of parameters, and that factorization is
algorithmically “easy” in a much wider region than previously believed. It
exists, however, a “hard” region where AMP fails to reach the MMSE and we
conjecture that no polynomial algorithm will improve on AMP.
Alexey Balitskiy, Roman Karasev, Alexander Tsigler
Subjects: Metric Geometry (math.MG); Information Theory (cs.IT)
We consider geometrical optimization problems related to optimizing the error
probability in the presence of a Gaussian noise. One famous questions in the
field is the “weak simplex conjecture”. We discuss possible approaches to it,
and state related conjectures about the Gaussian measure, in particular, the
conjecture about minimizing of the Gaussian measure of a simplex. We also
consider antipodal codes, apply the v{S}id’ak inequality and establish some
theoretical and some numerical results about their optimality.
Ming Ding, David Lopez-Perez
Comments: Invited paper for the Workshop on Spatial Stochastic Models for Wireless Networks (SpaSWiN), Paris, France, 15th – 19th May, 2017. arXiv admin note: text overlap with arXiv:1609.07710
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
In this paper, we conduct performance evaluation for ultra-dense networks
(UDNs) and identify the major and minor factors in the performance analysis of
UDNs using stochastic geometry. From our study, we draw the following
conclusions. First, there are 3 major factors that define the fundamental
behaviors of UDNs: (i) a multi-piece path loss model with line-of-sight
(LoS)/non-line-of-sight (NLoS) transmissions, which leads to a performance
degradation caused by a faster growth of the interference power compared with
the signal power; (ii) a non-zero antenna height difference between BSs and
UEs, which gives rise to another performance degradation due to a cap on the
signal power; (iii) a finite BS/UE density, which promises a performance
improvement due to the BS idle mode operation that mitigates unnecessary
inter-cell interference. Second, there are 4 minor factors that do not change
the fundamental behaviors of UDNs: (i) a general multi-path fading model based
on Rician fading; (ii) a correlated shadow fading model; (iii) a base station
(BS) density dependent transmission power; (iv) a deterministic BS/user
density. Finally, there are 3 factors for future study: (i) a BS vertical
antenna pattern; (ii) multi-antenna and/or multi-BS joint transmissions; (iii)
a constrained uniform distribution of BSs. Our conclusions can guide
researchers to down-select the assumptions in their stochastic geometry
analyses, so as to avoid unnecessarily complicated results, while still
capturing the fundamentals of UDNs in a meaningful way.
Ryota Iwamoto, Takeshi Koshiba
Comments: 5 pages
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)
Non-malleable code is a relaxed version of error-correction codes and the
decoding of modified codewords results in the original message or a completely
unrelated value. Thus, if an adversary corrupts a codeword then he cannot get
any information from the codeword. This means that non-malleable codes are
useful to provide a security guarantee in such situations that the adversary
can overwrite the encoded message. In 2010, Dziembowski et al. showed a
construction for non-malleable codes against the adversary who can falsify
codewords bitwise independently. In this paper, we consider an extended
adversarial model (affine error model) where the adversary can falsify
codewords bitwise independently or replace some bit with the value obtained by
applying an affine map over a limited number of bits. We prove that the
non-malleable codes (for the bitwise error model) provided by Dziembowski et
al. are still non-malleable against the adversary in the affine error model.
Adarsh Barik, Jean Honorio, Mohit Tawarmalani
Subjects: Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
We analyze the necessary number of samples for sparse vector recovery in a
noisy linear prediction setup. This model includes problems such as linear
regression and classification. We focus on structured graph models. In
particular, we prove that sufficient number of samples for the weighted graph
model proposed by Hegde and others is also necessary. We use the Fano’s
inequality on well constructed ensembles as our main tool in establishing
information theoretic lower bounds.