Marwin H.S. Segler, Thierry Kogej, Christian Tyrchan, Mark P. Waller
Comments: 17 pages, 17 figures
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG); Chemical Physics (physics.chem-ph); Machine Learning (stat.ML)
In de novo drug design, computational strategies are used to generate novel
molecules with good affinity to the desired biological target. In this work, we
show that recurrent neural networks can be trained as generative models for
molecular structures, similar to statistical language models in natural
language processing. We demonstrate that the properties of the generated
molecules correlate very well with the properties of the molecules used to
train the model. In order to enrich libraries with molecules active towards a
given biological target, we propose to fine-tune the model with small sets of
molecules known to be active against that target.
Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test
molecules that medicinal chemists designed, whereas against Plasmodium
falciparum (Malaria) it reproduced 28% of 1240 test molecules. When coupled
with a scoring function, our model can perform the complete de novo drug design
cycle to generate large sets of novel molecules for drug discovery.
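To make the language-model analogy concrete, here is a minimal sketch that treats molecule strings as text. It uses a character-level n-gram model as a deliberately simplified stand-in for the paper's recurrent network, and the tiny corpus is illustrative only, not real training data; fine-tuning would correspond to re-estimating the model on a small target-focused set.

```python
import random
from collections import defaultdict

def train_char_model(smiles_list, order=2):
    """Count character n-gram successors (a toy stand-in for an RNN)."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in smiles_list:
        padded = "^" * order + s + "$"  # ^ = start padding, $ = end marker
        for i in range(order, len(padded)):
            counts[padded[i - order:i]][padded[i]] += 1
    return counts

def sample(counts, order=2, max_len=40, rng=random.Random(0)):
    """Sample one string character by character from the learned model."""
    ctx, out = "^" * order, []
    while len(out) < max_len:
        succ = counts.get(ctx)
        if not succ:
            break
        chars, weights = zip(*succ.items())
        c = rng.choices(chars, weights=weights)[0]
        if c == "$":
            break
        out.append(c)
        ctx = ctx[1:] + c
    return "".join(out)

# Tiny illustrative corpus (not real actives).
corpus = ["CCO", "CCN", "CCC", "CCCl"]
model = train_char_model(corpus)
print(sample(model))
```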
Chengjun Li, Jia Wu
Comments: 17 pages, 9 figures
Subjects: Neural and Evolutionary Computing (cs.NE)
In distributed evolutionary algorithms, migration interval is used to decide
migration moments. Nevertheless, migration moments predetermined by intervals
cannot match the dynamic situation of evolution. In this paper, a scheme of
setting the success rate of migration based on subpopulation diversity at each
interval is proposed. With the scheme, migration still occurs at intervals, but
the probability of immigrants entering the target subpopulation will be
determined by the diversity of this subpopulation according to a proposed
formula. An analysis shows that the time consumption of our scheme is
acceptable. In our experiments, the basis of the parallel implementation is an
evolutionary algorithm for the traveling salesman problem. Under different
value combinations of the formula's parameters, outcomes of the distributed
evolutionary algorithm with the proposed scheme on eight benchmark instances
are compared with those of a traditional one. Results show that the
distributed evolutionary algorithm based on our scheme yields significantly
better solutions, especially on high-difficulty instances. Moreover, it
performs best under three of the tested parameter combinations.
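The paper's concrete formula is not given in the abstract, so the sketch below uses a hypothetical acceptance rule in the same spirit: migration still happens at intervals, but an immigrant enters the target subpopulation with a probability driven by that subpopulation's diversity. The diversity measure (a crude distinct-individual ratio) and the parameters `alpha` and `beta` are assumptions for illustration.

```python
import random

def diversity(subpop):
    """Fraction of distinct individuals -- a crude diversity proxy
    (the paper uses its own subpopulation-diversity measure)."""
    return len(set(map(tuple, subpop))) / len(subpop)

def accept_probability(div, alpha=2.0, beta=0.5):
    """Hypothetical acceptance formula: low diversity -> accept immigrants
    more often, to inject fresh genetic material."""
    return min(1.0, beta + (1.0 - div) ** alpha)

def maybe_migrate(immigrant, target_subpop, rng):
    """One migration attempt, gated by the target subpopulation's diversity."""
    if rng.random() < accept_probability(diversity(target_subpop)):
        target_subpop[rng.randrange(len(target_subpop))] = immigrant
        return True
    return False

rng = random.Random(1)
converged = [[0, 1, 2]] * 10   # identical tours: diversity 0.1
diverse = [[i, (i + 1) % 10, (i + 2) % 10] for i in range(10)]
print(accept_probability(diversity(converged)))  # high acceptance
print(accept_probability(diversity(diverse)))    # lower acceptance
```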
Reza Yousefian, Sukumar Kamalasadan
Subjects: Systems and Control (cs.SY); Neural and Evolutionary Computing (cs.NE)
This paper reviews the current status and challenges of Neural Networks (NNs)
based machine learning approaches for modern power grid stability control
including their design and implementation methodologies. NNs are widely
accepted as Artificial Intelligence (AI) approaches offering an alternative way
to control complex and ill-defined problems. In this paper, various
applications of NNs to the power system rotor angle stabilization and control
problem are discussed. The main focus of this paper is on the use of Reinforcement Learning
(RL) and Supervised Learning (SL) algorithms in power system wide-area control
(WAC). Owing to their ability to model nonlinearities and uncertainties,
these algorithms are generally used for transient classification,
neuro-control, wide-area monitoring and control, renewable energy management
and control, and so on. The works of researchers in the fields of conventional
and renewable energy systems are reported and categorized. The paper concludes
by presenting, comparing, and evaluating various learning techniques and
infrastructure configurations in terms of efficiency.
Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael Black, Ivan Laptev, Cordelia Schmid
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Estimating human pose, shape, and motion from images and videos is a
fundamental challenge with many applications. Recent advances in 2D human pose
estimation use large amounts of manually-labeled training data for learning
convolutional neural networks (CNNs). Such data is time consuming to acquire
and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion
is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL
tasks): a new large-scale dataset with synthetically-generated but realistic
images of people rendered from 3D sequences of human motion capture data. We
generate more than 6 million frames together with ground truth pose, depth
maps, and segmentation masks. We show that CNNs trained on our synthetic
dataset allow for accurate human depth estimation and human part segmentation
in real RGB images. Our results and the new dataset open up new possibilities
for advancing person analysis using cheap and large-scale synthetic data.
Weishan Dong, Ting Yuan, Kai Yang, Changsheng Li, Shilei Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we study learning generalized driving style representations
from automobile GPS trip data. We propose a novel Autoencoder Regularized deep
neural Network (ARNet) and a trip encoding framework trip2vec to learn drivers’
driving styles directly from GPS records, by combining supervised and
unsupervised feature learning in a unified architecture. Experiments on a
challenging driver number estimation problem and the driver identification
problem show that ARNet can learn a good generalized driving style
representation: it significantly outperforms existing methods and alternative
architectures, achieving the lowest average estimation error (0.68, less than
one driver) and the highest identification accuracy (at least a 3%
improvement over traditional supervised learning methods).
Anastasios Karakostas, Alexia Briassouli, Konstantinos Avgerinakis, Ioannis Kompatsiaris, Magda Tsolaki
Comments: 4 pages, 2 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The objective of Dem@Care is the development of a complete system providing
personal health services to people with dementia, as well as medical
professionals and caregivers, by using a multitude of sensors, for
context-aware, multi-parametric monitoring of lifestyle, ambient environment,
and health parameters. Multi-sensor data analysis, combined with intelligent
decision making mechanisms, will allow an accurate representation of the
person’s current status and will provide the appropriate feedback, both to the
person and the associated caregivers, enhancing the standard clinical workflow.
Within the project framework, several data collection activities have taken
place to assist technical development and evaluation tasks. In all these
activities, particular attention has been paid to adhere to ethical guidelines
and preserve the participants’ privacy. This technical report briefly
describes (a) the main objectives of the project, (b) the main ethical
principles, and (c) the datasets that have already been created.
Mohamed Elhoseiny, Ahmed Elgammal
Comments: Long Article with more experiments and analysis of conference paper “Overlapping Domain Cover for Scalable and Accurate Regression Kernel Machines”, presented orally 2015 at the British Machine Vision Conference 2015 (BMVC)
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
We present the Overlapping Domain Cover (ODC) notion for kernel machines, as
a set of overlapping subsets of the data that covers the entire training set
and is optimized to be as spatially cohesive as possible. We show how this
notion benefits local kernel machines for regression, improving speed while
minimizing the prediction error. We propose an efficient ODC framework, which
is applicable to various regression models and in particular reduces the
complexity of Twin Gaussian Processes (TGP) regression from cubic to
quadratic. Our notion is also applicable to several kernel methods (e.g.,
Gaussian Process Regression (GPR) and IWTGP regression, as shown in our
experiments). We also theoretically justify the idea behind our method of
improving local prediction via the overlapping cover. We validated and
analyzed our method on three benchmark human pose estimation datasets, and
interesting findings are discussed.
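A minimal sketch of the cover idea, under a simplifying assumption: instead of the paper's optimized cover, each point is simply assigned to its few nearest anchor points, which already yields overlapping, spatially cohesive subsets that cover the training set.

```python
import random

def overlapping_cover(points, anchors, overlap=2):
    """Assign each point to its `overlap` nearest anchors, producing
    overlapping, spatially cohesive subsets that cover the data.
    (A simplified construction, not the paper's optimization.)"""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    cover = {i: [] for i in range(len(anchors))}
    for p in points:
        nearest = sorted(range(len(anchors)), key=lambda i: d2(p, anchors[i]))
        for i in nearest[:overlap]:
            cover[i].append(p)
    return cover

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(20)]
cov = overlapping_cover(pts, anchors=[(0.25, 0.5), (0.75, 0.5), (0.5, 0.9)])
print(sum(len(v) for v in cov.values()))  # 40: every point lands in 2 subsets
```

A local kernel machine would then be fit on each subset, with predictions taken from the most cohesive subset containing the query.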
Hongjun Lu, Rudy Setiono, Huan Liu
Comments: VLDB1995
Subjects: Artificial Intelligence (cs.AI)
Classification, which involves finding rules that partition a given data set
into disjoint groups, is one class of data mining problems. Approaches proposed
so far for mining classification rules for large databases are mainly decision
tree based symbolic learning methods. The connectionist approach based on
neural networks has been thought not well suited for data mining. One of the
major reasons cited is that knowledge generated by neural networks is not
explicitly represented in the form of rules suitable for verification or
interpretation by humans. This paper examines this issue. With our newly
developed algorithms, rules which are similar to, or more concise than those
generated by the symbolic methods can be extracted from the neural networks.
The data mining process using neural networks with the emphasis on rule
extraction is described. Experimental results and comparison with previously
published works are presented.
Andrew Critch
Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Learning (cs.LG)
Existing multi-objective reinforcement learning (MORL) algorithms do not
account for objectives that arise from players with differing beliefs.
Concretely, consider two players with different beliefs and utility functions
who may cooperate to build a machine that takes actions on their behalf. A
representation is needed for how much the machine’s policy will prioritize each
player’s interests over time. Assuming the players have reached common
knowledge of their situation, this paper derives a recursion that any Pareto
optimal policy must satisfy. Two qualitative observations can be made from the
recursion: the machine must (1) use each player’s own beliefs in evaluating how
well an action will serve that player’s utility function, and (2) shift the
relative priority it assigns to each player’s expected utilities over time, by
a factor proportional to how well that player’s beliefs predict the machine’s
inputs. Observation (2) represents a substantial divergence from naïve
linear utility aggregation (as in Harsanyi’s utilitarian theorem, and existing
MORL algorithms), which is shown here to be inadequate for Pareto optimal
sequential decision-making on behalf of players with different beliefs.
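Observation (2) can be sketched as a multiplicative reweighting, under the simplifying assumption that each player's priority is multiplied by the likelihood its beliefs assigned to the latest input and then renormalized; the paper's full recursion is richer than this stand-in.

```python
def update_priorities(weights, likelihoods):
    """Shift each player's priority by a factor proportional to how well
    that player's beliefs predicted the latest observation.
    likelihoods[i] = probability player i's belief assigned to the input."""
    new = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(new)
    return [w / total for w in new]

w = [0.5, 0.5]
# Player 0's beliefs predicted the observed input better (0.9 vs 0.3),
# so player 0's priority grows.
w = update_priorities(w, [0.9, 0.3])
print(w)  # [0.75, 0.25]
```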
Pranav Agrawal
Comments: 7 pages
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Ranking functions are used primarily in search engines, and they are often
adopted for various language-processing applications. However, the features
used in constructing a ranking function should be analyzed before applying it
to a data set. This paper gives guidelines on the construction of generalized
ranking functions with application-dependent features. The paper describes a
specific case of a generalized function for a recommendation system, applying
feature-engineering guidelines to the given data set. The behavior of both
the generalized and the specific function is studied on unstructured textual
data. The proximity-feature-based ranking function outperforms regular BM25
by 52%.
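For reference, the plain BM25 baseline mentioned above can be computed as follows; the toy corpus is illustrative, and the paper's proximity feature would be an additional term on top of this score.

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Plain BM25: IDF-weighted, length-normalized term-frequency score."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    dl = len(doc_terms)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)             # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
        f = doc_terms.count(t)                            # term frequency
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * dl / avgdl))
    return score

docs = [["ranking", "function", "search"],
        ["feature", "engineering", "guidelines"],
        ["ranking", "feature", "proximity"]]
print(bm25_score(["ranking", "proximity"], docs[2], docs))
```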
Ramakrishnan Kannan, Hyenkyun Woo, Charu C. Aggarwal, Haesun Park
Comments: Accepted at 2017 SIAM Data Mining Conference
Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)
The problem of outlier detection is extremely challenging in many domains
such as text, in which the attribute values are typically non-negative, and
most values are zero. In such cases, it often becomes difficult to separate the
outliers from the natural variations in the patterns in the underlying data. In
this paper, we present a matrix factorization method, which is naturally able
to distinguish the anomalies with the use of low rank approximations of the
underlying data. Our iterative algorithm TONMF is based on the block
coordinate descent (BCD) framework. We define blocks over the term-document
matrix such that the objective function becomes solvable blockwise. Given the
most recently updated values of the other matrix blocks, we update one block
at a time to its optimum. Our
approach has significant advantages over traditional methods for text outlier
detection. Finally, we present experimental results illustrating the
effectiveness of our method over competing methods.
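A heavily simplified, rank-1 sketch of the "low rank plus column outliers" decomposition with blockwise updates; the actual TONMF uses nonnegative factors and general rank, so this stand-in only illustrates the block coordinate descent pattern, where each block (the factors, then the outlier matrix) is updated in turn given the others.

```python
def norm(v):
    return sum(x * x for x in v) ** 0.5

def tonmf_sketch(A, lam=1.0, iters=20):
    """Rank-1 block-coordinate sketch: A ~ w h^T + Z, where the columns of Z
    are soft-thresholded residuals (an l2,1-style penalty singles out
    outlier columns)."""
    m, n = len(A), len(A[0])
    w, h = [1.0] * m, [1.0] * n
    Z = [[0.0] * n for _ in range(m)]
    for _ in range(iters):
        ww = sum(x * x for x in w)   # update h given w, Z
        h = [sum((A[i][j] - Z[i][j]) * w[i] for i in range(m)) / ww
             for j in range(n)]
        hh = sum(x * x for x in h)   # update w given h, Z
        w = [sum((A[i][j] - Z[i][j]) * h[j] for j in range(n)) / hh
             for i in range(m)]
        for j in range(n):           # update Z column-wise (soft-threshold)
            r = [A[i][j] - w[i] * h[j] for i in range(m)]
            scale = max(0.0, 1.0 - lam / (norm(r) + 1e-12))
            for i in range(m):
                Z[i][j] = scale * r[i]
    return w, h, Z

# Two "normal" columns on a rank-1 pattern plus one outlier column.
A = [[1.0, 2.0, 9.0],
     [2.0, 4.0, -7.0]]
w, h, Z = tonmf_sketch(A)
col_norms = [norm([Z[i][j] for i in range(2)]) for j in range(3)]
print(col_norms)  # the outlier column carries the largest Z energy
```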
Dominik Kowald, Subhash Pujari, Elisabeth Lex
Comments: Accepted at WWW 2017
Subjects: Information Retrieval (cs.IR)
Hashtags have become a powerful tool in social platforms such as Twitter to
categorize and search for content, and to spread short messages across members
of the social network. In this paper, we study temporal hashtag usage practices
in Twitter with the aim of designing a cognitive-inspired hashtag
recommendation algorithm we call BLLi,s. Our main idea is to incorporate the
effect of time on (i) individual hashtag reuse (i.e., reusing own hashtags),
and (ii) social hashtag reuse (i.e., reusing hashtags that have previously
been used by a followee) into a predictive model. For this, we turn to
the Base-Level Learning (BLL) equation from the cognitive architecture ACT-R,
which accounts for the time-dependent decay of item exposure in human memory.
We validate BLLi,s using two crawled Twitter datasets in two evaluation
scenarios: firstly, only temporal usage patterns of past hashtag assignments
are utilized and secondly, these patterns are combined with a content-based
analysis of the current tweet. In both scenarios, we find not only that
temporal effects play an important role for both individual and social hashtag
reuse but also that BLLi,s provides significantly better prediction accuracy
and ranking results than current state-of-the-art hashtag recommendation
methods.
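The BLL equation itself is small enough to show directly: activation is the log of a sum of power-law-decayed past uses, so both frequency and recency of a hashtag contribute (d = 0.5 is the common ACT-R default).

```python
import math

def bll_activation(use_times, now, d=0.5):
    """Base-Level Learning equation from ACT-R: recency- and
    frequency-weighted activation with power-law decay exponent d."""
    return math.log(sum((now - t) ** (-d) for t in use_times))

# A hashtag used often but long ago vs. one used recently.
old_tag = bll_activation([1, 2, 3, 4], now=100)
fresh_tag = bll_activation([98, 99], now=100)
print(old_tag, fresh_tag)  # the recently used tag scores higher
```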
Jun Wang, Qiang Tang
Comments: accepted by: ICDM 2016 – IEEE International Conference on Data Mining series (ICDM) workshop CLOUDMINE, 7 pages
Subjects: Information Retrieval (cs.IR)
Probabilistic graphical models are an elegant framework for compactly
representing complex real-world observations by modeling uncertainty and
logical flow (conditionally independent factors). In this paper, we present a
probabilistic framework of neighborhood-based recommendation methods (PNBM)
in which similarity is regarded as an unobserved factor. Thus, PNBM reduces
the estimation of user preferences to maximizing a posterior over similarity.
We further
introduce a novel multi-layer similarity descriptor which models and learns the
joint influence of various features under PNBM, and name the new framework
MPNBM. Empirical results on real-world datasets show that MPNBM allows very
accurate estimation of user preferences.
Max Yi Ren, Clayton Scott
Comments: submitted to Journal of Mechanical Design
Subjects: Machine Learning (stat.ML); Information Retrieval (cs.IR)
We consider the problem of identifying the most profitable product design
from a finite set of candidates under unknown consumer preference. A standard
approach to this problem follows a two-step strategy: First, estimate the
preference of the consumer population, represented as a point in part-worth
space, using an adaptive discrete-choice questionnaire. Second, integrate the
estimated part-worth vector with engineering feasibility and cost models to
determine the optimal design. In this work, we (1) demonstrate that accurate
preference estimation is neither necessary nor sufficient for identifying the
optimal design, (2) introduce a novel adaptive questionnaire that leverages
knowledge about engineering feasibility and manufacturing costs to directly
determine the optimal design, and (3) interpret product design in terms of a
nonlinear segmentation of part-worth space, and use this interpretation to
illuminate the intrinsic difficulty of optimal design in the presence of noisy
questionnaire responses. We establish the superiority of the proposed approach
using a well-documented optimal product design task. This study demonstrates
how the identification of optimal product design can be accelerated by
integrating marketing and manufacturing knowledge into the adaptive
questionnaire.
Kai Zhao, Liang Huang, Mingbo Ma
Subjects: Computation and Language (cs.CL)
Deep learning techniques are increasingly popular in the textual entailment
task, overcoming the fragility of traditional discrete models with hard
alignments and logic. In particular, the recently proposed attention models
(Rocktäschel et al., 2015; Wang and Jiang, 2015) achieve state-of-the-art
accuracy by computing soft word alignments between the premise and hypothesis
sentences. However, there remains a major limitation: this line of work
completely ignores syntax and recursion, which is helpful in many traditional
efforts. We show that it is beneficial to extend the attention model to tree
nodes between premise and hypothesis. More importantly, this subtree-level
attention reveals information about entailment relation. We study the recursive
composition of this subtree-level entailment relation, which can be viewed as a
soft version of the Natural Logic framework (MacCartney and Manning, 2009).
Experiments show that our structured attention and entailment composition model
can correctly identify and infer entailment relations from the bottom up, and
bring significant improvements in accuracy.
Saman Ashkiani, Andrew Davidson, Ulrich Meyer, John D. Owens
Comments: 46 pages, invited paper to ACM Transactions on Parallel Computing (TOPC): “Special Issue: invited papers from PPoPP 2016”. This is an extended version of PPoPP’16 paper “GPU Multisplit”
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Multisplit is a broadly useful parallel primitive that permutes its input
data into contiguous buckets or bins, where the function that categorizes an
element into a bucket is provided by the programmer. Due to the lack of an
efficient multisplit on GPUs, programmers often choose to implement multisplit
with a sort. One way is to first generate an auxiliary array of bucket IDs and
then sort the input data based on it. If smaller-indexed buckets hold
smaller-valued keys, another option is to sort the input data directly.
Both methods are inefficient and require more work than necessary: the former
requires more expensive data movements while the latter spends unnecessary
effort in sorting elements within each bucket. In this work, we provide a
parallel model and multiple implementations for the multisplit problem. Our
principal focus is multisplit for a small (up to 256) number of buckets. We use
warp-synchronous programming models and emphasize warp-wide communications to
avoid branch divergence and reduce memory usage. We also hierarchically reorder
input elements to achieve better coalescing of global memory accesses. On a
GeForce GTX 1080 GPU, we can reach a peak throughput of 18.93 Gkeys/s (or 11.68
Gpairs/s) for a key-only (or key-value) multisplit. Finally, we demonstrate how
multisplit can be used as a building block for radix sort. In our
multisplit-based sort implementation, we achieve comparable performance to the
fastest GPU sort routines, sorting 32-bit keys (and key-value pairs) with a
throughput of 3.0 Gkeys/s (and 2.1 Gpairs/s).
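A straightforward CPU reference implementation clarifies what multisplit computes: bucket counts, an exclusive prefix sum of bucket offsets, then a stable scatter. The GPU versions in the paper optimize exactly this pattern with warp-wide communication.

```python
def multisplit(data, num_buckets, bucket_of):
    """Stable CPU reference for multisplit: permute `data` so that all
    elements of bucket 0 come first, then bucket 1, and so on, preserving
    input order within each bucket."""
    counts = [0] * num_buckets
    for x in data:
        counts[bucket_of(x)] += 1
    offsets, total = [], 0        # exclusive prefix sum: bucket start offsets
    for c in counts:
        offsets.append(total)
        total += c
    out = [None] * len(data)
    for x in data:                # stable scatter
        b = bucket_of(x)
        out[offsets[b]] = x
        offsets[b] += 1
    return out

print(multisplit([7, 2, 9, 4, 1, 8], 2, lambda x: x % 2))
# even bucket (0) first, then odds, each in input order: [2, 4, 8, 7, 9, 1]
```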
Yangzihao Wang, Yuechao Pan, Andrew Davidson, Yuduo Wu, Carl Yang, Leyuan Wang, Muhammad Osama, Chenshan Yuan, Weitang Liu, Andy T. Riffel, John D. Owens
Comments: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP’16 paper “Gunrock: A High-Performance Graph Processing Library on the GPU”
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. “Gunrock”, our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock’s overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.
Andrew V. Knyazev
Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)
Classical spectral clustering is based on a spectral decomposition of a graph
Laplacian, obtained from a graph adjacency matrix representing positive graph
edge weights describing similarities of graph vertices. In signed graphs, the
graph edge weights can be negative to describe disparities of graph vertices,
for example, negative correlations in the data. Negative weights lead to
possible negative spectrum of the standard graph Laplacian, which is cured by
defining a signed Laplacian. We revisit comparing the standard and signed
Laplacians and argue that the former is more natural than the latter, also
showing that the negative spectrum is actually beneficial, for spectral
clustering of signed graphs.
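The two Laplacians can be compared on a minimal signed graph: the sketch below builds both (degrees from signed row sums for the standard Laplacian, from absolute values for the signed one) and shows the standard Laplacian acquiring a negative eigenvalue when an edge weight is negative.

```python
def laplacians(W):
    """Standard Laplacian L = D - W (D from signed row sums) and the
    signed Laplacian using absolute-value degrees."""
    n = len(W)
    L = [[(sum(W[i]) if i == j else 0) - W[i][j] for j in range(n)]
         for i in range(n)]
    Ls = [[(sum(abs(w) for w in W[i]) if i == j else 0) - W[i][j]
           for j in range(n)] for i in range(n)]
    return L, Ls

def eig2(M):
    """Eigenvalues of a symmetric 2x2 matrix via trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = (tr * tr - 4 * det) ** 0.5
    return sorted([(tr - disc) / 2, (tr + disc) / 2])

# One negative edge (a disparity) between two vertices.
W = [[0, -1], [-1, 0]]
L, Ls = laplacians(W)
print(eig2(L))   # [-2.0, 0.0]: standard Laplacian has a negative eigenvalue
print(eig2(Ls))  # [0.0, 2.0]: signed Laplacian spectrum is nonnegative
```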
Giuseppe Casalicchio, Jakob Bossek, Michel Lang, Dominik Kirchhoff, Pascal Kerschke, Benjamin Hofner, Heidi Seibold, Joaquin Vanschoren, Bernd Bischl
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
OpenML is an online machine learning platform where researchers can easily
share data, machine learning tasks and experiments as well as organize them
online to work and collaborate more efficiently. In this paper, we present an R
package to interface with the OpenML platform and illustrate its usage in
combination with the machine learning R package mlr. We show how the OpenML
package allows R users to easily search, download and upload data sets and
machine learning tasks. Furthermore, we also show how to upload results of
experiments, share them with others and download results from other users.
Beyond ensuring reproducibility of results, the OpenML platform automates much
of the drudge work, speeds up research, facilitates collaboration and increases
the users’ visibility online.
Michal Vašinek, Jan Platoš
Subjects: Information Theory (cs.IT)
The context transformation and generalized context transformation methods,
which we introduced recently, reduce zero-order entropy by exchanging digrams
and, as a consequence, remove mutual information between consecutive symbols
of the input message. These transformations are intended as a preprocessing
step for zero-order entropy coders such as arithmetic or Huffman coding,
since arithmetic coding in particular can achieve a compression rate close to
the Shannon entropy.
This paper introduces a novel algorithm based on the concept of generalized
context transformation, that allows transformation of words longer than simple
digrams. Higher-order contexts are exploited using a recursive form of the
generalized context transformation. It is shown that the zero-order entropy
of the transformed data drops significantly; on the other hand, the overhead
of describing the individual transformations increases and becomes a limiting
factor for the successful transformation of smaller files.
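The quantity being reduced is ordinary zero-order entropy. The sketch below computes it and applies a simplified digram substitution (replacing a frequent digram by an unused symbol) as an illustration only; the paper's transformations exchange digrams within the existing alphabet rather than introduce new symbols, and the description of each transformation is the overhead the abstract mentions.

```python
import math
from collections import Counter

def zero_order_entropy(msg):
    """H0 = -sum p(c) log2 p(c): the bound a zero-order coder targets."""
    counts = Counter(msg)
    n = len(msg)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def digram_substitute(msg, digram, symbol):
    """Simplified illustration: replace a frequent digram by one unused
    symbol. (The paper's transformations exchange digrams instead.)"""
    return msg.replace(digram, symbol)

msg = "abababababcd"
t = digram_substitute(msg, "ab", "X")
print(zero_order_entropy(msg), zero_order_entropy(t))  # entropy drops
```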
Peter Beelen, Sven Puchinger, Johan Rosenkilde né Nielsen
Comments: 5 pages, submitted to IEEE International Symposium on Information Theory 2017
Subjects: Information Theory (cs.IT)
We present a new general construction of MDS codes over a finite field
(mathbb{F}_q). We describe two explicit subclasses which contain new MDS codes
of length at least (q/2) for all values of (q ge 11). Moreover, we show that
most of the new codes are not equivalent to a Reed–Solomon code.
Shiwen He, Chenhao Qi, Yongpen Wu, Yongming Huang
Comments: 8
Subjects: Information Theory (cs.IT)
Millimeter-wave (mmWave) communication operated in frequency bands between 30
and 300 GHz has attracted extensive attention due to the potential ability of
offering orders of magnitude greater bandwidths combined with further gains via
beamforming and spatial multiplexing from multi-element antenna arrays. MmWave
systems may exploit hybrid analog and digital precoding to simultaneously
achieve diversity, array and multiplexing gains at a lower implementation
cost. Motivated by this, in this paper we investigate the design of
hybrid precoder and combiner with sub-connected architecture, where each radio
frequency chain is connected to only a subset of base station (BS) antennas
from the perspective of energy efficient transmission. The problem of interest
is a non-convex and NP-hard problem that is difficult to solve directly. In
order to address it, we design a two-layer optimization method that solves the
problem by jointly exploiting interference alignment and fractional
programming. First, the analog precoder and combiner are
optimized via the alternating-direction optimization method (ADOM), where the
phase shifters can be easily adjusted using an analytical structure. Then, we
optimize the digital precoder and combiner based on an effective multiple-input
multiple-output (MIMO) channel coefficient. The convergence of the proposed
algorithms is proved using the monotonic boundary theorem and fractional
programming theory. Extensive simulation results are given to validate the
effectiveness of the presented method and to evaluate the energy efficiency
performance under various system configurations.
Lanqiang Li, Li Liu, Shixin Zhu
Subjects: Information Theory (cs.IT)
Cyclic codes have efficient encoding and decoding algorithms over finite
fields, so that they have practical applications in communication systems,
consumer electronics and data storage systems. The objective of this paper is
to give eight new classes of optimal ternary cyclic codes with parameters
([3^m-1,3^m-1-2m,4]), based on a result on the non-existence of solutions to a
certain equation over (F_{3^m}). It is worth noticing that several recent
results on such optimal ternary cyclic codes are special cases of our work.
More importantly, three of the nine open problems proposed by Ding and
Helleseth in [8] are solved completely. In addition, progress is also made on
another of these nine open problems.
Vishnu Vardhan Chetlur, Harpreet S. Dhillon
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
In this paper, we consider a finite network of unmanned aerial vehicles
(UAVs) serving a given region. Modeling this network as a uniform binomial
point process (BPP), we derive the downlink coverage probability of a reference
receiver located at an arbitrary position on the ground assuming Nakagami-(m)
fading for all wireless links. The reference receiver is assumed to connect to
its closest transmitting node as is usually the case in cellular systems. After
deriving the distribution of distances from the reference receiver to the
serving and interfering nodes, we derive an exact expression for downlink
coverage probability in terms of the derivative of Laplace transform of
interference power distribution. In the downlink of this system, it is not
unusual to encounter scenarios in which the line-of-sight (LOS) component is
significantly stronger than the reflected multipath components. To emulate such
scenarios, we also derive the coverage probability in the absence of fading
from the results of Nakagami-(m) fading by taking the limit (m to infty).
Using asymptotic expansion of incomplete gamma function, we concretely show
that this limit reduces to a redundant condition. Consequently, we derive an
accurate coverage probability approximation for this case using a
dominant-interferer-based approach in which the effect of the dominant
interferer is exactly captured and the residual interference from the other
interferers is carefully
approximated. We then derive bounds on the approximate coverage probability
using the Berry-Esseen theorem. Our analyses reveal several useful trends in
coverage probability as a function of the height of the transmitting nodes and
the location of the reference receiver on the ground.
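The Monte Carlo counterpart of this setup is straightforward to sketch. The snippet below estimates downlink coverage for a BPP of UAVs under Nakagami-(m) fading with nearest-node association; the parameter values, unit transmit power, and placing the receiver at the disk centre are our simplifying assumptions (the paper treats an arbitrary ground location analytically):

```python
import numpy as np

rng = np.random.default_rng(0)

def coverage_prob(n_uav=10, radius=500.0, height=100.0, m=3, alpha=4.0,
                  theta_db=0.0, trials=4000):
    # Monte Carlo estimate of downlink coverage probability for UAVs
    # uniformly distributed in a disk (a binomial point process).
    theta = 10.0 ** (theta_db / 10.0)
    hits = 0
    for _ in range(trials):
        r = radius * np.sqrt(rng.random(n_uav))   # uniform-in-disk radii
        d = np.sqrt(r ** 2 + height ** 2)         # 3D link distances
        g = rng.gamma(m, 1.0 / m, n_uav)          # Nakagami-m power fading
        p = g * d ** (-alpha)                     # received powers, unit Tx power
        s = np.argmin(d)                          # nearest-node association
        hits += p[s] / (p.sum() - p[s]) > theta   # SIR threshold test
    return hits / trials

print(coverage_prob(theta_db=-5.0))
```

Lowering the SIR threshold or the UAV height changes the estimate in the directions the analysis predicts.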
Marina Haikin, Ram Zamir, Matan Gavish
Subjects: Information Theory (cs.IT)
We draw a random subset of (k) rows from a frame with (n) rows (vectors) and
(m) columns (dimensions), where (k) and (m) are proportional to (n). For a
variety of important deterministic equiangular tight frames (ETFs) and tight
non-ETF frames, we consider the distribution of singular values of the
(k)-subset matrix. We observe that for large (n) they can be precisely
described by a known probability distribution — Wachter’s MANOVA spectral
distribution, a phenomenon that was previously known only for two types of
random frames. In terms of convergence to this limit, the (k)-subset matrix
from all these frames is shown to behave exactly like the classical MANOVA
(Jacobi) random matrix ensemble. Thus empirically the MANOVA ensemble offers a
universal description of spectra of randomly selected (k)-subframes, even those
taken from deterministic frames. The same universality phenomenon is shown to
hold for notable random frames as well. This description enables exact
calculations of properties of solutions for systems of linear equations based
on a random choice of (k) frame vectors out of (n) possible vectors, and has a
variety of implications for erasure coding, compressed sensing, and sparse
recovery. Our results are empirical, but they are exhaustive, precise and fully
reproducible.
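The sampling experiment is easy to reproduce in outline. The sketch below draws a random (k)-subset of rows from a random tight frame rather than from one of the paper's deterministic ETFs, and the dimensions are illustrative; a histogram of the squared singular values of `sub` would then be compared against Wachter's MANOVA density:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 400, 200, 100   # frame vectors, ambient dimension, subset size

# Random tight frame: the rows formed by m orthonormal columns of a
# Haar-random orthogonal matrix (the paper's deterministic ETFs are
# structured instead of random).
q, _ = np.linalg.qr(rng.standard_normal((n, n)))
frame = q[:, :m]
assert np.allclose(frame.T @ frame, np.eye(m))   # tightness check

rows = rng.choice(n, size=k, replace=False)      # random k-subset of rows
sub = frame[rows]
sv = np.linalg.svd(sub, compute_uv=False)
print(sv.min(), sv.max())   # all singular values lie in [0, 1]
```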
Xiangming Meng, Yiqun Wu, Yan Chen, Meng Cheng
Comments: 5 pages, 5 figures, accepted by IEEE WCNC 2017
Subjects: Information Theory (cs.IT)
The sparse code multiple access (SCMA) scheme is considered a promising
non-orthogonal multiple access technology for future fifth generation (5G)
communications. Due to its sparse nature, the message passing algorithm (MPA)
has
been used as the receiver to achieve close to maximum likelihood (ML) detection
performance with much lower complexity. However, the complexity order of MPA
is still exponential in the codebook size and the degree of signal
superposition on a given resource element. In this paper, we propose a novel
low complexity iterative receiver based on expectation propagation algorithm
(EPA), which reduces the complexity order from exponential to linear.
Simulation results demonstrate that the proposed EPA receiver achieves nearly
the same block error rate (BLER) performance as the conventional MPA receiver
with orders-of-magnitude lower complexity.
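The stated exponential-to-linear reduction can be made concrete with the commonly cited per-resource-element operation counts, roughly M^{d_f} codeword combinations for MPA versus M·d_f moment-matching updates for EPA; these are illustrative orders only, not the paper's exact constants:

```python
def mpa_ops(M: int, df: int) -> int:
    # MPA marginalization enumerates all M**df codeword combinations
    # superposed on one resource element.
    return M ** df

def epa_ops(M: int, df: int) -> int:
    # EPA touches each of the M codewords of each of the df superposed
    # users once per iteration.
    return M * df

# A typical SCMA setting: M = 4 codewords, df = 3 users per resource element.
print(mpa_ops(4, 3), epa_ops(4, 3))   # 64 vs 12
```

The gap widens quickly: with M = 16 and df = 6 the counts are 16,777,216 versus 96.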
Hassan M. Navazi, Md. Jahangir Hossain
Subjects: Information Theory (cs.IT)
Bit-interleaved coded modulation with iterative decoding (BICM-ID) offers
very good error performance over additive white Gaussian noise (AWGN) and
fading channels if it uses a carefully designed signal mapping. Further, the
error performance of BICM-ID can be significantly improved by employing
multi-dimensional (MD) modulation. However, suitable MD mappings are obtained
by computer search techniques, except for MD modulations that use smaller
constellations, e.g., binary phase shift keying (BPSK), quadrature phase shift
keying (QPSK) and (8)-ary phase shift keying ((8)-PSK), as the basic
modulation. The
alphabet size of MD modulations increases exponentially as the order of the
basic modulation increases and computer search becomes intractable. In this
paper, we propose a systematic mapping method for MD modulations. The novelty
of our proposed method is that it generates MD mappings using (16)- and
(64)-ary quadrature amplitude modulation (QAM) very efficiently. The presented
numerical results show that the proposed method improves the bit error rate
(BER) performance of BICM-ID.
Mustafa A. Kishk, Harpreet S. Dhillon
Subjects: Information Theory (cs.IT); Probability (math.PR)
This paper characterizes the statistics of effective fading gain in
multi-tier cellular networks with strongest base station (BS) cell association
policy. First, we derive the probability of association with the (n)-th nearest
BS in the (k)-th tier. Next, we use this result to derive the probability
density function (PDF) of the channel fading gain (effective fading)
experienced by the user when associating with the strongest BS. Interestingly,
our results show that the effective channel gain distribution solely depends
upon the original channel fading and the path-loss exponent. Moreover, we show
that in the case of Nakagami-(m) fading channels (Gamma distribution), the
distribution of the effective fading is also Gamma but with a gain of
(frac{alpha}{2}) in the shape parameter, where (alpha) is the path-loss
exponent.
Steffen Malkowsky, Joao Vieira, Liang Liu, Paul Harris, Karl Nieman, Nikhil Kundargi, Ian Wong, Fredrik Tufvesson, Viktor Öwall, Ove Edfors
Comments: 14 pages, submitted to IEEE Access
Subjects: Information Theory (cs.IT)
This paper sets up a framework for designing a massive multiple-input
multiple-output (MIMO) testbed by investigating hardware and system-level
requirements such as processing complexity, duplexing mode and frame structure.
Taking these into account, a generic system and processing partitioning is
proposed which allows flexible scaling and processing distribution onto a
multitude of physically separated devices. Based on the given hardware
constraints such as maximum number of links and maximum throughput for
peer-to-peer interconnections combined with processing capabilities, the
framework allows evaluation of available off-the-shelf hardware components. To
verify our design approach, we present the Lund University Massive MIMO
(LuMaMi) testbed which constitutes the first reconfigurable real-time hardware
platform for prototyping massive MIMO. Utilizing up to 100 base station
antennas and more than 50 field-programmable gate arrays (FPGAs), up to 12 user
equipments are served on the same time/frequency resource using an LTE-like
OFDM TDD-based transmission scheme. Field trials with this system show that
massive MIMO can spatially separate a multitude of users in a static indoor and
static/dynamic outdoor environment.
Peng Deng, Mohsen Kavehrad
Comments: 6 pages, 7 figures, Accepted by WiSEE_2016. arXiv admin note: substantial text overlap with arXiv:1612.05402
Subjects: Information Theory (cs.IT)
In this paper, we experimentally demonstrate a real-time software defined
multiple input multiple output (MIMO) visible light communication (VLC) system
employing link adaptation of spatial multiplexing and spatial diversity.
Real-time MIMO signal processing is implemented by using the Field Programmable
Gate Array (FPGA) based Universal Software Radio Peripheral (USRP) devices.
Software-defined implementation of MIMO VLC can assist in enabling an adaptive
and reconfigurable communication system without hardware changes. We measured
the error vector magnitude (EVM), bit error rate (BER) and spectral efficiency
performance for single carrier M-QAM MIMO VLC using spatial diversity and
spatial multiplexing. Results show that spatial diversity MIMO VLC improves
error performance at the cost of the spectral efficiency that spatial
multiplexing would otherwise enhance. We propose an adaptive MIMO solution in
which both the modulation scheme and the MIMO scheme are dynamically adapted
to the changing channel conditions to enhance the error performance and
spectral efficiency. The average error-free spectral efficiency of the
adaptive 2×2 MIMO VLC reached 12 b/s/Hz over a 2-meter indoor dynamic
transmission link.
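A link-adaptation rule of this kind can be sketched as follows; the condition-number metric and the threshold value are our illustrative assumptions, not the measured EVM/BER criterion used in the experiments:

```python
import numpy as np

def select_mimo_mode(H: np.ndarray, cond_threshold: float = 10.0) -> str:
    # A well-conditioned 2x2 channel supports two parallel streams
    # (spatial multiplexing); a near-singular one is better served by
    # transmit diversity.
    s = np.linalg.svd(H, compute_uv=False)
    return "multiplexing" if s[0] / s[-1] < cond_threshold else "diversity"

well_conditioned = np.array([[1.0, 0.1], [0.1, 1.0]])
near_singular = np.array([[1.0, 0.99], [1.0, 1.01]])
print(select_mimo_mode(well_conditioned))   # multiplexing
print(select_mimo_mode(near_singular))      # diversity
```

In a VLC link the channel matrix entries would come from the measured optical channel gains between LEDs and photodiodes.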
Sam Blake
Subjects: Information Theory (cs.IT)
We introduce several new constructions for perfect periodic autocorrelation
sequences and arrays over the unit quaternions. This paper uses both
mathematical proofs and computer experiments to prove that the (bounded) array
constructions have perfect periodic autocorrelation. Furthermore, the first
sequence construction generates odd-perfect sequences of unbounded lengths
with a good zero correlation zone (ZCZ).
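Perfect periodic autocorrelation is easy to verify numerically. The sketch below checks the classical length-4 perfect binary sequence; for the paper's quaternion-valued sequences the elementwise product would use the quaternion conjugate instead of the complex one:

```python
import numpy as np

def periodic_autocorrelation(seq):
    # Periodic (cyclic) autocorrelation of the sequence at every shift.
    s = np.asarray(seq, dtype=complex)
    return np.array([np.sum(s * np.conj(np.roll(s, -t)))
                     for t in range(len(s))])

# The classical perfect binary sequence of length 4: every out-of-phase
# periodic autocorrelation is exactly zero.
r = periodic_autocorrelation([1, 1, 1, -1]).real
print(r)   # [4. 0. 0. 0.]
```

A sequence is "perfect" precisely when all out-of-phase values of this function vanish; a ZCZ only requires them to vanish for shifts inside a window around zero.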
Thakshila Wimalajeewa, Pramod K. Varshney
Subjects: Applications (stat.AP); Information Theory (cs.IT)
Detection with high dimensional multimodal data is a challenging problem when
there are complex inter- and intra-modal dependencies. While several
approaches have been proposed for dependent data fusion (e.g., based on copula
theory), their advantages come at a high price in terms of computational
complexity. To overcome computational difficulties, in this paper, we treat the
detection problem in a compressed domain where compression is achieved via low
dimensional random projections as considered in compressive sensing (CS). While
CS has recently been exploited to solve detection problems under various
assumptions on the signals of interest, its potential for dependent data fusion
has not been explored adequately. We propose two approaches for detection when
the uncompressed data is dependent. First, Gaussian approximation is employed
to perform likelihood ratio (LR) based detection in the compressed domain. We
show that, under certain conditions, LR based detection with a small number of
compressed measurements leads to enhanced performance compared to detection
with (uncompressed) high dimensional data using widely considered suboptimal
approaches in the literature. Second, we explore the potential of CS to capture
second order statistics of the uncompressed dependent data to construct a
decision statistic in the compressed domain. While the second approach is
computationally more complex than the first, it is promising when the
multimodal data is dependent and highly correlated. Further, in both
approaches, complete signal reconstruction is not
necessary to make a reliable detection decision.
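The first approach, Gaussian-approximated likelihood-ratio detection in the compressed domain, can be sketched for a known-signal model with correlated noise; the dimensions, covariance structure, and signal below are our illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 100, 20                      # ambient and compressed dimensions
s = np.ones(p)                      # known signal under H1

# Correlated ("dependent") noise: AR(1)-style covariance.
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
L = np.linalg.cholesky(Sigma)

A = rng.standard_normal((k, p)) / np.sqrt(k)   # random projection (CS)
Cy = A @ Sigma @ A.T                           # compressed noise covariance
w = np.linalg.solve(Cy, A @ s)                 # LR weights, compressed domain

def lr_statistic(x):
    # Log-likelihood-ratio statistic (up to constants) computed only from
    # the compressed measurement y = A x under the Gaussian model.
    return w @ (A @ x)

h0 = np.array([lr_statistic(L @ rng.standard_normal(p)) for _ in range(2000)])
h1 = np.array([lr_statistic(s + L @ rng.standard_normal(p)) for _ in range(2000)])
print(h0.mean(), h1.mean())   # H1 statistics concentrate at a positive mean
```

Note that the detector never reconstructs x from y, consistent with the abstract's point that full signal reconstruction is unnecessary for a reliable decision.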
Muhammad Junaid Farooq, Hakim Ghazzai, Abdullah Kadri, Hesham ElSawy, Mohamed-Slim Alouini
Comments: 16 pages, 8 figures, in IEEE Transactions on Communications 2017
Subjects: Networking and Internet Architecture (cs.NI); Computational Geometry (cs.CG); Information Theory (cs.IT)
Cellular operators are increasingly turning towards renewable energy (RE) as
an alternative to using traditional electricity in order to reduce operational
expenditure and carbon footprint. Due to the randomness in both RE generation
and mobile traffic at each base station (BS), a surplus or shortfall of energy
may occur at any given time. To increase energy self-reliance and minimize the
network’s energy cost, the operator needs to efficiently exploit the RE
generated across all BSs. In this paper, a hybrid energy sharing framework for
cellular networks is proposed, in which a combination of physical power lines
and
energy trading with other BSs using smart grid is used. Algorithms for physical
power lines deployment between BSs, based on average and complete statistics of
the net RE available, are developed. Afterwards, an energy management framework
is formulated to optimally determine the quantities of electricity and RE to be
procured and exchanged among BSs, respectively, while considering battery
capacities and real-time energy pricing. Three cases are investigated where RE
generation is unknown, perfectly known, and partially known ahead of time.
Results illustrate the time-varying energy management of BSs and demonstrate
considerable reduction in average energy cost thanks to the hybrid energy
sharing scheme.
Yong Sheng Soh, Venkat Chandrasekaran
Comments: 48 pages, 9 figures
Subjects: Optimization and Control (math.OC); Information Theory (cs.IT); Machine Learning (stat.ML)
Regularization techniques are widely employed in optimization-based
approaches for solving ill-posed inverse problems in data analysis and
scientific computing. These methods are based on augmenting the objective with
a penalty function, which is specified based on prior domain-specific expertise
to induce a desired structure in the solution. We consider the problem of
learning suitable regularization functions from data in settings in which
precise domain knowledge is not directly available. Previous work under the
title of ‘dictionary learning’ or ‘sparse coding’ may be viewed as learning a
regularization function that can be computed via linear programming. We
describe generalizations of these methods to learn regularizers that can be
computed and optimized via semidefinite programming. Our framework for learning
such semidefinite regularizers is based on obtaining structured factorizations
of data matrices, and our algorithmic approach for computing these
factorizations combines recent techniques for rank minimization problems along
with an operator analog of Sinkhorn scaling. Under suitable conditions on the
input data, our algorithm provides a locally linearly convergent method for
identifying the correct regularizer that promotes the type of structure
contained in the data. Our analysis is based on the stability properties of
Operator Sinkhorn scaling and their relation to geometric aspects of
determinantal varieties (in particular tangent spaces with respect to these
varieties). The regularizers obtained using our framework can be employed
effectively in semidefinite programming relaxations for solving inverse
problems.
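The algorithmic core combines rank-minimization techniques with an operator analog of Sinkhorn scaling. As a point of reference, here is a minimal Python sketch of classical (matrix) Sinkhorn scaling, which the operator version generalizes; the input matrix and iteration count are illustrative:

```python
import numpy as np

def sinkhorn(M, iters=200):
    # Alternately normalise rows and columns; for a strictly positive
    # matrix this converges to a doubly stochastic matrix.
    M = np.array(M, dtype=float)
    for _ in range(iters):
        M /= M.sum(axis=1, keepdims=True)   # row normalisation
        M /= M.sum(axis=0, keepdims=True)   # column normalisation
    return M

D = sinkhorn(np.random.default_rng(3).random((4, 4)) + 0.1)
print(D.sum(axis=0), D.sum(axis=1))   # both approach [1 1 1 1]
```

The operator version replaces the row/column sums with completely positive maps acting on blocks of a matrix, but the alternating-normalization structure is the same.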