IT博客汇

    arXiv Paper Daily: Mon, 13 Feb 2017

    Published by 我爱机器学习 (52ml.net) on 2017-02-13 00:00:00

    Neural and Evolutionary Computing

    Stochastic Configuration Networks: Fundamentals and Algorithms

    Dianhui Wang, Ming Li
    Comments: 19 pages, 4 figures and 28 references
    Subjects: Neural and Evolutionary Computing (cs.NE)

    This paper contributes to the development of randomized methods for neural
    networks. The proposed learner model is built incrementally by stochastic
    configuration (SC) algorithms and is termed a Stochastic Configuration Network
    (SCN). In contrast to existing randomized learning algorithms for single-layer
    feed-forward neural networks (SLFNNs), we randomly assign the input weights and
    biases of the hidden nodes under a supervisory mechanism, and the output
    weights are evaluated analytically in either a constructive or a selective
    manner. As a foundation for SCN-based data modelling techniques, we establish
    theoretical results on the universal approximation property. Three versions of
    the SC algorithm are presented for regression problems (and are applicable to
    classification problems as well). Simulation results on both function
    approximation and real-world data regression highlight several merits of the
    proposed SCNs: less human intervention in setting the network size, adaptation
    of the scope of the random parameters, fast learning, and sound generalization.
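    The incremental construction described above can be sketched as follows:
    random candidate hidden nodes are generated, the candidate most correlated with
    the current residual is kept (a simplified stand-in for the paper's supervisory
    mechanism, whose exact inequality condition is not reproduced here), and the
    output weights are re-solved by least squares.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def train_scn_sketch(X, y, max_nodes=25, candidates=20, tol=1e-3):
        """Incrementally add random tanh hidden nodes; keep the candidate whose
        activation correlates best with the current residual, then re-solve the
        output weights by least squares (a sketch, not the authors' algorithm)."""
        n, d = X.shape
        H = np.empty((n, 0))          # hidden-node activations, one column per node
        residual = y.copy()
        for _ in range(max_nodes):
            best, best_score = None, -np.inf
            for _ in range(candidates):
                w = rng.uniform(-1, 1, d)
                b = rng.uniform(-1, 1)
                h = np.tanh(X @ w + b)
                score = abs(h @ residual) / (np.linalg.norm(h) + 1e-12)
                if score > best_score:
                    best, best_score = h, score
            H = np.column_stack([H, best])
            beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # analytic output weights
            residual = y - H @ beta
            if np.linalg.norm(residual) < tol:
                break
        return H, beta, residual

    X = rng.uniform(-1, 1, (200, 1))
    y = np.sin(3 * X[:, 0])
    H, beta, r = train_scn_sketch(X, y)
    print(np.linalg.norm(r) / np.linalg.norm(y))   # relative error shrinks as nodes are added
    ```

    The key point the sketch shows is the division of labor: hidden parameters are
    randomized (subject to an acceptance test), while output weights stay a linear
    least-squares problem.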

    A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines

    Eric W. Tramel, Marylou Gabrié, Andre Manoel, Francesco Caltagirone, Florent Krzakala
    Subjects: Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Restricted Boltzmann machines (RBMs) are energy-based neural networks that
    are commonly used as building blocks for deep neural architectures. In this
    work, we derive a deterministic framework for the
    training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer
    (TAP) mean-field approximation of widely-connected systems with weak
    interactions coming from spin-glass theory. While the TAP approach has been
    extensively studied for fully-visible binary spin systems, our construction is
    generalized to latent-variable models, as well as to arbitrarily distributed
    real-valued spin systems with bounded support. In our numerical experiments, we
    demonstrate the effective deterministic training of our proposed models and are
    able to show interesting features of unsupervised learning which could not be
    directly observed with sampling. Additionally, we demonstrate how to utilize
    our TAP-based framework for leveraging trained RBMs as joint priors in
    denoising problems.
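    For a binary-binary RBM, the second-order TAP self-consistency equations add
    an Onsager correction to the naive mean-field update; a damped fixed-point
    iteration of that form can be sketched as below (the couplings, damping
    constant, and initialization are illustrative assumptions, not the authors'
    setup).

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def tap_magnetizations(W, a, b, iters=200, damping=0.5):
        """Damped fixed-point iteration for TAP mean-field magnetizations of a
        binary-binary RBM: naive mean field plus an Onsager correction term."""
        nv, nh = W.shape
        mv, mh = np.full(nv, 0.5), np.full(nh, 0.5)
        for _ in range(iters):
            var_h = mh - mh**2
            mv_new = sigmoid(a + W @ mh - (mv - 0.5) * (W**2 @ var_h))
            var_v = mv_new - mv_new**2
            mh_new = sigmoid(b + W.T @ mv_new - (mh - 0.5) * ((W**2).T @ var_v))
            mv = damping * mv + (1 - damping) * mv_new
            mh = damping * mh + (1 - damping) * mh_new
        return mv, mh

    rng = np.random.default_rng(1)
    W = rng.normal(0, 0.1, (6, 4))          # weak couplings, as TAP assumes
    a, b = rng.normal(0, 0.1, 6), rng.normal(0, 0.1, 4)
    mv, mh = tap_magnetizations(W, a, b)
    print(mv.round(3), mh.round(3))
    ```

    At a fixed point, the magnetizations can be plugged into the TAP free energy,
    which is what makes fully deterministic training and evaluation possible.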

    Arabic Language Sentiment Analysis on Health Services

    Abdulaziz M. Alayba, Vasile Palade, Matthew England, Rahat Iqbal
    Comments: Authors accepted version of submission for ASAR 2017
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Social and Information Networks (cs.SI)

    The social media network phenomenon has led to a massive amount of valuable
    data that is available online and easy to access. Many users share images,
    videos, comments, reviews, news and opinions on different social network
    sites, with Twitter being one of the most popular. Data collected from
    Twitter is highly unstructured, and extracting useful information from tweets
    is a challenging task. Twitter has a huge number of Arabic users, who mostly
    post and write their tweets in Arabic. While there has been a lot of research
    on sentiment analysis in English, the amount of research and the number of
    datasets available in Arabic are limited. This paper introduces an Arabic
    dataset of opinions on health services collected from Twitter. The paper
    first details the process of collecting the data from Twitter, and then the
    filtering, pre-processing and annotation of the Arabic text needed to build a
    large sentiment analysis dataset in Arabic. Several machine learning
    algorithms (Naive Bayes, Support Vector Machines and Logistic Regression),
    alongside deep and convolutional neural networks, were used in our sentiment
    analysis experiments on this health dataset.
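    As an illustration of one of the classifiers listed, here is a minimal
    multinomial Naive Bayes with add-one smoothing. The toy English snippets stand
    in for the tweets; the actual Arabic data and the paper's feature pipeline are
    not reproduced.

    ```python
    from collections import Counter
    import math

    def train_nb(docs, labels):
        """Count class priors and per-class word frequencies."""
        classes = set(labels)
        word_counts = {c: Counter() for c in classes}
        class_counts = Counter(labels)
        vocab = set()
        for doc, lab in zip(docs, labels):
            toks = doc.split()
            word_counts[lab].update(toks)
            vocab.update(toks)
        return classes, word_counts, class_counts, vocab, len(docs)

    def predict_nb(model, doc):
        """Pick the class maximizing log P(c) + sum log P(token|c), Laplace-smoothed."""
        classes, wc, cc, vocab, n = model
        best, best_lp = None, -math.inf
        for c in classes:
            lp = math.log(cc[c] / n)
            total = sum(wc[c].values())
            for tok in doc.split():
                lp += math.log((wc[c][tok] + 1) / (total + len(vocab)))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

    docs = ["great service fast staff", "very helpful clinic",
            "long wait bad service", "rude staff terrible wait"]
    labels = ["pos", "pos", "neg", "neg"]
    model = train_nb(docs, labels)
    print(predict_nb(model, "helpful fast service"))   # prints "pos"
    print(predict_nb(model, "terrible long wait"))     # prints "neg"
    ```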

    Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights

    Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen
    Comments: Accepted as a conference track paper by ICLR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    This paper presents incremental network quantization (INQ), a novel method
    for efficiently converting any pre-trained full-precision convolutional
    neural network (CNN) model into a low-precision version whose weights are
    constrained to be either powers of two or zero. Unlike existing methods,
    which struggle with noticeable accuracy loss, INQ has the potential to
    resolve this issue thanks to two innovations. On the one hand, we introduce
    three interdependent operations: weight partition, group-wise quantization
    and re-training. A well-proven measure is employed to divide the weights in
    each layer of a pre-trained CNN model into two disjoint groups. The weights
    in the first group form a low-precision base and are therefore quantized by a
    variable-length encoding method. The weights in the other group compensate
    for the accuracy loss from quantization and are therefore the ones to be
    re-trained. On the other hand, these three operations are repeated
    iteratively on the latest re-trained group until all the weights are
    converted to low precision, acting as an incremental network quantization and
    accuracy enhancement procedure. Extensive experiments on the ImageNet
    classification task with almost all well-known deep CNN architectures,
    including AlexNet, VGG-16, GoogleNet and ResNets, attest to the efficacy of
    the proposed method. Specifically, at 5-bit quantization, our models achieve
    better accuracy than their 32-bit floating-point references. Taking ResNet-18
    as an example, we further show that our quantized models with 4-bit, 3-bit
    and 2-bit ternary weights achieve better or very similar accuracy compared
    with the 32-bit floating-point baseline. Impressive results from combining
    network pruning with INQ are also reported. The code will be made publicly
    available.
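    A bare-bones sketch of one weight-partition-and-quantize step follows. The
    exponent range and the 50/50 magnitude split are illustrative choices; the
    paper derives the exponent bounds from the bit width and the weight
    distribution.

    ```python
    import numpy as np

    def quantize_pow2(w, exp_min=-6, exp_max=0):
        """Project each weight onto {0, +/-2^e : exp_min <= e <= exp_max}.
        A sketch of INQ's power-of-two quantizer with a hypothetical range."""
        sign = np.sign(w)
        mag = np.abs(w)
        exps = np.clip(np.round(np.log2(np.maximum(mag, 1e-12))), exp_min, exp_max)
        q = sign * 2.0 ** exps
        q[mag < 2.0 ** (exp_min - 1)] = 0.0   # weights too small become zero
        return q

    def inq_step(w, frac=0.5):
        """One INQ iteration: partition by magnitude, quantize the larger-magnitude
        group (then frozen), leave the rest full-precision for re-training."""
        thresh = np.quantile(np.abs(w), 1 - frac)
        mask = np.abs(w) >= thresh            # group quantized in this step
        out = w.copy()
        out[mask] = quantize_pow2(w[mask])
        return out, mask

    rng = np.random.default_rng(2)
    w = rng.normal(0, 0.1, 1000)
    w_q, mask = inq_step(w)
    vals = np.abs(w_q[mask])
    nz = vals[vals > 0]
    # every quantized nonzero magnitude is an exact power of two:
    print(np.allclose(np.log2(nz), np.round(np.log2(nz))))
    ```

    In the full procedure the unmasked weights would now be re-trained, and the
    partition/quantize/re-train cycle repeated until every weight is low-precision.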


    Computer Vision and Pattern Recognition

    Dual-Tree Wavelet Scattering Network with Parametric Log Transformation for Object Classification

    Amarjot Singh, Nick Kingsbury
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce a ScatterNet that uses a parametric log transformation with
    Dual-Tree complex wavelets to extract translation invariant representations
    from a multi-resolution image. The parametric transformation aids the OLS
    pruning algorithm by converting the skewed distributions into relatively
    mean-symmetric distributions while the Dual-Tree wavelets improve the
    computational efficiency of the network. The proposed network is shown to
    outperform Mallat’s ScatterNet on two image datasets, both for classification
    accuracy and computational efficiency. The advantages of the proposed network
    over other supervised and some unsupervised methods are also presented using
    experiments performed on different training dataset sizes.
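    The role of the parametric log transformation can be illustrated on synthetic
    skewed data: applying y = log(x + k), with k a parameter (learned per scale in
    the network; the value below is arbitrary), pulls a right-skewed magnitude
    distribution toward symmetry, which is what aids the OLS pruning step.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.lognormal(0, 1, 10000)       # heavily right-skewed magnitudes

    def skewness(v):
        """Standard third-moment skewness estimate."""
        v = v - v.mean()
        return (v**3).mean() / (v**2).mean()**1.5

    k = 1.0                              # illustrative parameter value
    y = np.log(x + k)                    # parametric log transformation
    print(abs(skewness(y)) < abs(skewness(x)))   # skew is reduced
    ```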

    A clustering approach to heterogeneous change detection

    Luigi Tommaso Luppino, Stian Normann Anfinsen, Gabriele Moser, Robert Jenssen, Filippo Maria Bianchi, Sebastiano Serpico, Gregoire Mercier
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Change detection in heterogeneous multitemporal satellite images is a
    challenging and still little-studied topic in remote sensing and earth
    observation. This paper focuses on the comparison of image pairs covering the
    same geographical area, acquired at two different times by two different
    sensors: an optical radiometer and a synthetic aperture radar. We propose a
    clustering-based technique that detects changes as clusters that split or
    merge across the two images. To evaluate the potential and limitations of our
    method, we perform experiments on real data. Preliminary results confirm the
    relationship between splits and merges of clusters and the occurrence of
    changes. However, it becomes evident that it is necessary to incorporate prior,
    ancillary, or application-specific information to improve the interpretation of
    clustering results and to identify unambiguously the areas of change.
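    The split-or-merge idea can be sketched on synthetic 1-D data: a change makes
    part of one time-1 cluster move, so that cluster's pixels land in more than one
    time-2 cluster. This is a toy k-means illustration, not the paper's actual
    procedure.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    def kmeans1d(x, k, iters=50):
        """Tiny 1-D k-means; centers initialized at interior quantiles."""
        centers = np.quantile(x, (np.arange(k) + 0.5) / k)
        for _ in range(iters):
            lab = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
            centers = np.array([x[lab == j].mean() if np.any(lab == j)
                                else centers[j] for j in range(k)])
        return lab

    # Two co-registered "images" of the same area (synthetic stand-ins for the
    # optical and SAR acquisitions at times t1 and t2).
    t1 = np.concatenate([rng.normal(0.2, 0.02, 60), rng.normal(0.8, 0.02, 60)])
    t2 = t1.copy()
    t2[:30] += 0.3            # a change: part of the low cluster moves to ~0.5

    lab1 = kmeans1d(t1, 2)
    lab2 = kmeans1d(t2, 3)
    changed = lab1[0]         # the t1 cluster containing the changed pixels
    # That cluster is split across two t2 clusters -> evidence of change:
    print(len(set(lab2[lab1 == changed])) > 1)
    ```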

    Texture Characterization by Using Shape Co-occurrence Patterns

    Gui-Song Xia, Gang Liu, Xiang Bai, Liangpei Zhang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Texture characterization is a key problem in image understanding and pattern
    recognition. In this paper, we present a flexible shape-based texture
    representation using shape co-occurrence patterns. More precisely, texture
    images are first represented by a tree of shapes, each of which is associated
    with several geometrical and radiometric attributes. Then, four typical kinds
    of shape co-occurrence patterns, based on the hierarchical relationships of
    the shapes in the tree, are learned as codewords. Three different coding
    methods are investigated to learn the codewords, with which any given texture
    image can be encoded into a descriptive vector. In contrast to existing
    works, the proposed method not only inherits from shape-based methods the
    strong ability to depict the geometrical aspects of textures and high
    robustness to variations in imaging conditions, but also provides a flexible
    way to consider shape relationships and to compute high-order statistics on
    the tree. To our knowledge, this is the first use of co-occurrence patterns
    of explicit shapes as a tool for texture analysis. Experiments on various
    texture and scene datasets demonstrate the efficiency of the proposed method.
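    The codeword idea can be illustrated with a toy tree of shapes whose nodes
    carry attribute labels; one natural pattern type counts parent-child attribute
    pairs along the hierarchy. The tree and labels below are invented for
    illustration and stand in for the paper's geometrical/radiometric attributes.

    ```python
    from collections import Counter

    # Toy tree of shapes: parent -> children, plus an attribute label per node.
    tree = {
        "root": ["A", "B"],
        "A": ["A1", "A2"],
        "B": ["B1"],
    }
    attr = {"root": "large", "A": "round", "B": "elongated",
            "A1": "small", "A2": "small", "B1": "small"}

    # Count (parent_attribute, child_attribute) co-occurrences; in the paper,
    # such patterns are clustered into codewords for encoding texture images.
    pairs = Counter()
    for parent, children in tree.items():
        for c in children:
            pairs[(attr[parent], attr[c])] += 1
    print(pairs[("round", "small")])   # prints 2
    ```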

    Graph Fourier Transform with Negative Edges for Depth Image Coding

    Weng-Tai Su, Gene Cheung, Chia-Wen Lin
    Comments: 5 pages, submitted to IEEE International Conference on Image Processing, Beijing, China, September, 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The recent advent of graph signal processing (GSP) has led to the development
    of new graph-based transforms and wavelets for image/video coding, where the
    underlying graph describes inter-pixel correlations. In this paper, we
    develop a new transform called the signed graph Fourier transform (SGFT),
    where the underlying graph G contains negative edges that describe
    anti-correlations between pixel pairs. Specifically, we first construct a
    one-state Markov process that models both inter-pixel correlations and
    anti-correlations. We then derive the corresponding precision matrix, and
    show that the loopy graph Laplacian matrix Q of a graph G with a negative
    edge and two self-loops at its end nodes is approximately equivalent to it.
    This proves that the eigenvectors of Q, which define the SGFT, approximate
    the optimal Karhunen-Loève Transform (KLT). We show the importance of the
    self-loops in G for ensuring that Q is positive semi-definite. We prove that
    the first eigenvector of Q is piecewise constant (PWC), and can thus well
    approximate a piecewise smooth (PWS) signal such as a depth image.
    Experimental results show that a block-based coding scheme based on SGFT
    outperforms a previous scheme using graph transforms with only positive edges
    for several depth images.
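    The construction can be sketched numerically: build a small path graph with
    one negative edge, add self-loops at that edge's end nodes, and check that the
    resulting loopy Laplacian Q is positive semi-definite. The edge and loop
    weights here are illustrative, not the paper's derived values.

    ```python
    import numpy as np

    # Path graph on 4 nodes with one negative edge (anti-correlation)
    # between nodes 1 and 2.
    n = 4
    W = np.zeros((n, n))
    W[0, 1] = W[1, 0] = 1.0
    W[1, 2] = W[2, 1] = -1.0    # the negative edge
    W[2, 3] = W[3, 2] = 1.0

    D = np.diag(np.abs(W).sum(axis=1))   # degrees from |weights|
    Q = D - W                            # signed graph Laplacian
    Q[1, 1] += 2.0                       # self-loops at the negative
    Q[2, 2] += 2.0                       # edge's end nodes

    eigvals, eigvecs = np.linalg.eigh(Q) # columns of eigvecs form the SGFT basis
    print(eigvals.min() >= -1e-8)        # Q is positive semi-definite
    ```

    Projecting a depth-image block onto the columns of `eigvecs` plays the role
    the DCT or KLT plays in a conventional block-transform coder.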

    Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights

    Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen
    Comments: Accepted as a conference track paper by ICLR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    (Abstract listed above under Neural and Evolutionary Computing.)

    Reconstruction for Feature Disentanglement in Pose-invariant Face Recognition

    Xi Peng, Xiang Yu, Kihyuk Sohn, Dimitris Metaxas, Manmohan Chandraker
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep neural networks (DNNs) trained on large-scale datasets have recently
    achieved impressive improvements in face recognition. But a persistent
    challenge remains to develop methods capable of handling large pose variations
    that are relatively under-represented in training data. This paper presents a
    method for learning a feature representation that is invariant to pose, without
    requiring extensive pose coverage in training data. We first propose to use a
    synthesis network for generating non-frontal views from a single frontal image,
    in order to increase the diversity of training data while preserving accurate
    facial details that are critical for identity discrimination. Our next
    contribution is a multi-source multi-task DNN that seeks a rich embedding
    representing identity information, as well as information such as pose and
    landmark locations. Finally, we propose a Siamese network to explicitly
    disentangle identity and pose, by demanding alignment between the feature
    reconstructions through various combinations of identity and pose features
    obtained from two images of the same subject. Experiments on face datasets in
    both controlled and wild scenarios, such as MultiPIE, LFW and 300WLP, show that
    our method consistently outperforms the state-of-the-art, especially on images
    with large head pose variations.

    A New Rank Constraint on Multi-view Fundamental Matrices, and its Application to Camera Location Recovery

    Soumyadip Sengupta, Tal Amir, Meirav Galun, Tom Goldstein, David W. Jacobs, Amit Singer, Ronen Basri
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Accurate estimation of camera matrices is an important step in structure from
    motion algorithms. In this paper we introduce a novel rank constraint on
    collections of fundamental matrices in multi-view settings. We show that in
    general, with the selection of proper scale factors, a matrix formed by
    stacking fundamental matrices between pairs of images has rank 6. Moreover,
    this matrix forms the symmetric part of a rank 3 matrix whose factors relate
    directly to the corresponding camera matrices. We use this new characterization
    to produce better estimates of fundamental matrices by optimizing an L1 cost
    function using Iteratively Re-weighted Least Squares and the Alternating
    Direction Method of Multipliers (ADMM). We further show that this procedure can improve the
    recovery of camera locations, particularly in multi-view settings in which
    fewer images are available.

    A large comparison of feature-based approaches for buried target classification in forward-looking ground-penetrating radar

    Joseph A. Camilo, Leslie M. Collins, Jordan M. Malof
    Comments: 11 pages, 14 figures, for submission to IEEE TGARS
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Forward-looking ground-penetrating radar (FLGPR) has recently been
    investigated as a remote sensing modality for buried target detection (e.g.,
    landmines). In this context, raw FLGPR data is beamformed into images and then
    computerized algorithms are applied to automatically detect subsurface buried
    targets. Most existing algorithms are supervised, meaning they are trained to
    discriminate between labeled target and non-target imagery, usually based on
    features extracted from the imagery. A large number of features have been
    proposed for this purpose; however, it is thus far unclear which are the most
    effective. The first goal of this work is to provide a comprehensive
    comparison of detection performance using existing features on a large
    collection of FLGPR data. Fusion of the decisions resulting from processing
    each feature is also considered. The second goal of this work is to
    investigate two modern feature learning approaches from the object
    recognition literature, the bag-of-visual-words and the Fisher vector, for
    FLGPR processing. The results indicate that the new feature learning
    approaches outperform existing methods. Results also show that fusing
    existing features with the new features yields little additional performance
    improvement.


    Artificial Intelligence

    Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning

    Jason D. Williams, Kavosh Asadi, Geoffrey Zweig
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    End-to-end learning of recurrent neural networks (RNNs) is an attractive
    solution for dialog systems; however, current techniques are data-intensive and
    require thousands of dialogs to learn simple behaviors. We introduce Hybrid
    Code Networks (HCNs), which combine an RNN with domain-specific knowledge
    encoded as software and system action templates. Compared to existing
    end-to-end approaches, HCNs considerably reduce the amount of training data
    required, while retaining the key benefit of inferring a latent representation
    of dialog state. In addition, HCNs can be optimized with supervised learning,
    reinforcement learning, or a mixture of both. HCNs attain state-of-the-art
    performance on the bAbI dialog dataset, and outperform two commercially
    deployed customer-facing dialog systems.
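    The interplay of learned scores and hand-coded domain knowledge can be
    sketched as an action mask applied to a distribution over action templates.
    The template names, the mask, and the random stand-in for trained RNN weights
    below are all hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    # Hypothetical action templates; the developer's code supplies a mask that
    # forbids actions that make no sense in the current dialog state (e.g. no
    # order confirmation before an order exists).
    templates = ["greet", "ask_item", "confirm_order", "end_call"]
    W = rng.normal(0, 0.5, (4, 8))       # stand-in for a trained output layer
    state = rng.normal(0, 1, 8)          # stand-in for the RNN hidden state

    mask = np.array([1, 1, 0, 1], float) # "confirm_order" not yet allowed
    probs = softmax(W @ state) * mask    # learned scores gated by domain code
    probs /= probs.sum()
    print(probs[templates.index("confirm_order")] == 0.0)   # True
    ```

    Masking shrinks the space the learner must explore, which is one way
    domain-specific software can reduce the training data an end-to-end model needs.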

    Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction

    Ashutosh Modi, Ivan Titov, Vera Demberg, Asad Sayeed, Manfred Pinkal
    Comments: 14 pages, published at TACL, 2017, Volume-5, Pg 31-44, 2017
    Journal-ref: Transactions of ACL, Volume-5, Pg 31-44 (2017)
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Recent research in psycholinguistics has provided increasing evidence that
    humans predict upcoming content. Prediction also affects perception and might
    be a key to robustness in human language processing. In this paper, we
    investigate the factors that affect human prediction by building a
    computational model that can predict upcoming discourse referents based on
    linguistic knowledge alone vs. linguistic knowledge jointly with common-sense
    knowledge in the form of scripts. We find that script knowledge significantly
    improves model estimates of human predictions. In a second study, we test the
    highly controversial hypothesis that predictability influences referring
    expression type but do not find evidence for such an effect.

    Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights

    Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen
    Comments: Accepted as a conference track paper by ICLR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    (Abstract listed above under Neural and Evolutionary Computing.)

    Multi-agent Reinforcement Learning in Sequential Social Dilemmas

    Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel
    Comments: 10 pages, 7 figures
    Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Learning (cs.LG)

    Matrix games like Prisoner’s Dilemma have guided research on social dilemmas
    for decades. However, they necessarily treat the choice to cooperate or defect
    as an atomic action. In real-world social dilemmas these choices are temporally
    extended. Cooperativeness is a property that applies to policies, not
    elementary actions. We introduce sequential social dilemmas that share the
    mixed incentive structure of matrix game social dilemmas but also require
    agents to learn policies that implement their strategic intentions. We analyze
    the dynamics of policies learned by multiple self-interested independent
    learning agents, each using its own deep Q-network, on two Markov games we
    introduce here: 1. a fruit Gathering game and 2. a Wolfpack hunting game. We
    characterize how learned behavior in each domain changes as a function of
    environmental factors including resource abundance. Our experiments show how
    conflict can emerge from competition over shared resources and shed light on
    how the sequential nature of real world social dilemmas affects cooperation.
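    For reference, the atomic-action structure being generalized is the classic
    matrix game: with standard Prisoner's Dilemma payoffs (T > R > P > S),
    defection dominates the one-shot game, whereas in a sequential social dilemma
    "cooperativeness" becomes a property of a whole learned policy.

    ```python
    # Row player's payoffs with textbook values R=3 (mutual cooperation),
    # S=0 (sucker), T=5 (temptation), P=1 (mutual defection).
    payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

    # Defection strictly dominates cooperation in the one-shot matrix game:
    dominates = (payoff[("D", "C")] > payoff[("C", "C")] and
                 payoff[("D", "D")] > payoff[("C", "D")])
    print(dominates)   # True
    ```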

    Learning Semantic Script Knowledge with Event Embeddings

    Ashutosh Modi, Ivan Titov
    Comments: 4 Pages, 1 figure, ICLR Workshop
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)

    Induction of common sense knowledge about prototypical sequences of events
    has recently received much attention. Instead of inducing this knowledge in the
    form of graphs, as in much of the previous work, in our method, distributed
    representations of event realizations are computed based on distributed
    representations of predicates and their arguments, and then these
    representations are used to predict prototypical event orderings. The
    parameters of the compositional process for computing the event representations
    and the ranking component of the model are jointly estimated from texts. We
    show that this approach results in a substantial boost in ordering performance
    with respect to previous methods.
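    The compositional step can be sketched as follows: an event vector is computed
    from predicate and argument embeddings via composition matrices and a
    nonlinearity, and a scoring direction ranks events for their prototypical
    order. All vectors and matrices below are random placeholders for parameters
    that the paper estimates jointly from text.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    d = 8
    # Placeholder word vectors for predicates and arguments.
    vocab = {w: rng.normal(0, 1, d) for w in
             ["enter", "order", "eat", "pay", "customer", "food", "bill"]}

    A = rng.normal(0, 0.3, (d, d))   # predicate composition matrix (random here;
    B = rng.normal(0, 0.3, (d, d))   # learned jointly with the ranker in the paper)
    u = rng.normal(0, 1, d)          # ranking direction: score = u . event

    def event_vec(pred, arg):
        """Compose predicate and argument embeddings into one event vector."""
        return np.tanh(A @ vocab[pred] + B @ vocab[arg])

    script = [("enter", "customer"), ("order", "food"),
              ("eat", "food"), ("pay", "bill")]
    scores = [u @ event_vec(p, a) for p, a in script]
    # Training would push these scores to increase along the prototypical order.
    print(len(scores) == 4)
    ```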


    Information Retrieval

    Mining Electronic Health Records: A Survey

    Pranjul Yadav, Michael Steinbach, Vipin Kumar, Gyorgy Simon
    Subjects: Information Retrieval (cs.IR)

    The continuously increasing cost of the US healthcare system has received
    significant attention. Central to the ideas aimed at curbing this trend is the
    use of technology, in the form of the mandate to implement electronic health
    records (EHRs). EHRs consist of patient information such as demographics,
    medications, laboratory test results, diagnosis codes and procedures. Mining
    EHRs could lead to improvement in patient health management as EHRs contain
    detailed information related to disease prognosis for large patient
    populations. In this manuscript, we provide a structured and comprehensive
    overview of data mining techniques for modeling EHR data. We first provide a
    detailed understanding of the major application areas to which EHR mining has
    been applied and then discuss the nature of EHR data and its accompanying
    challenges. Next, we describe major approaches used for EHR mining, the metrics
    associated with EHRs, and the various study designs. With this foundation, we
    then provide a systematic and methodological organization of existing data
    mining techniques used to model EHRs and discuss ideas for future research. We
    conclude this survey with a comprehensive summary of clinical data mining
    applications of EHR data, as illustrated in the online supplement.


    Computation and Language

    Arabic Language Sentiment Analysis on Health Services

    Abdulaziz M. Alayba, Vasile Palade, Matthew England, Rahat Iqbal
    Comments: Authors accepted version of submission for ASAR 2017
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Social and Information Networks (cs.SI)

    (Abstract listed above under Neural and Evolutionary Computing.)

    Universal Semantic Parsing

    Siva Reddy, Oscar Täckström, Slav Petrov, Mark Steedman, Mirella Lapata
    Comments: 15 pages with supplementary
    Subjects: Computation and Language (cs.CL)

    Universal Dependencies (UD) provides a cross-linguistically uniform syntactic
    representation, with the aim of advancing multilingual applications of parsing
    and natural language understanding. Reddy et al. (2016) recently developed a
    semantic interface for (English) Stanford Dependencies, based on the lambda
    calculus. In this work, we introduce UDepLambda, a similar semantic interface
    for UD, which allows mapping natural language to logical forms in an almost
    language-independent framework. We evaluate our approach on semantic parsing
    for the task of question answering against Freebase. To facilitate multilingual
    evaluation, we provide German and Spanish translations of the WebQuestions and
    GraphQuestions datasets. Results show that UDepLambda outperforms strong
    baselines across languages and datasets. For English, it achieves the strongest
    result to date on GraphQuestions, with competitive results on WebQuestions.

    Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction

    Ashutosh Modi, Ivan Titov, Vera Demberg, Asad Sayeed, Manfred Pinkal
    Comments: 14 pages, published at TACL, 2017, Volume-5, Pg 31-44, 2017
    Journal-ref: Transactions of ACL, Volume-5, Pg 31-44 (2017)
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    (Abstract listed above under Artificial Intelligence.)

    Using Word Embedding for Cross-Language Plagiarism Detection

    J. Ferrero, F. Agnes, L. Besacier, D. Schwab
    Comments: Accepted to EACL 2017 (short)
    Subjects: Computation and Language (cs.CL)

    This paper proposes to use distributed representation of words (word
    embeddings) in cross-language textual similarity detection. The main
    contributions of this paper are the following: (a) we introduce new
    cross-language similarity detection methods based on distributed representation
    of words; (b) we combine the different methods proposed to verify their
    complementarity and finally obtain an overall F1 score of 89.15% for
    English-French similarity detection at chunk level (88.5% at sentence level) on
    a very challenging corpus.
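
    As a rough illustration of the first contribution, cross-language similarity
    from averaged word embeddings reduces to a cosine score. The tiny bilingual
    vectors below are invented for the example and stand in for a trained
    cross-lingual embedding model:

```python
import numpy as np

# Toy cross-lingual "embeddings" (invented for illustration): in a shared
# space, a word and its translation get nearby vectors.
EMB = {
    "cat": np.array([0.9, 0.1, 0.0]),  "chat":  np.array([0.88, 0.12, 0.0]),
    "dog": np.array([0.1, 0.9, 0.0]),  "chien": np.array([0.12, 0.88, 0.0]),
}

def chunk_vector(words):
    """Average the embeddings of the words in a chunk."""
    return np.mean([EMB[w] for w in words], axis=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def similarity(chunk_a, chunk_b):
    """Cross-language similarity of two chunks via averaged embeddings."""
    return cosine(chunk_vector(chunk_a), chunk_vector(chunk_b))
```

    With shared-space embeddings, translated chunks ("cat"/"chat") score near 1
    while unrelated pairs score much lower.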

    Local System Voting Feature for Machine Translation System Combination

    Markus Freitag, Jan-Thorsten Peter, Stephan Peitz, Minwei Feng, Hermann Ney
    Comments: published WMT 2015
    Subjects: Computation and Language (cs.CL)

    In this paper, we enhance the traditional confusion network system
    combination approach with an additional model trained by a neural network. This
    work is motivated by the fact that the commonly used binary system voting
    models only assign each input system a global weight which is responsible for
    the global impact of each input system on all translations. This prevents
    individual systems with low system weights from having influence on the system
    combination output, although in some situations this could be helpful. Further,
    words which have only been seen by one or a few systems rarely have a chance of
    being present in the combined output. We train a local system voting model by a
    neural network which is based on the words themselves and the combinatorial
    occurrences of the different system outputs. This gives system combination the
    option to prefer other systems at different word positions even for the same
    sentence.

    Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning

    Jason D. Williams, Kavosh Asadi, Geoffrey Zweig
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    End-to-end learning of recurrent neural networks (RNNs) is an attractive
    solution for dialog systems; however, current techniques are data-intensive and
    require thousands of dialogs to learn simple behaviors. We introduce Hybrid
    Code Networks (HCNs), which combine an RNN with domain-specific knowledge
    encoded as software and system action templates. Compared to existing
    end-to-end approaches, HCNs considerably reduce the amount of training data
    required, while retaining the key benefit of inferring a latent representation
    of dialog state. In addition, HCNs can be optimized with supervised learning,
    reinforcement learning, or a mixture of both. HCNs attain state-of-the-art
    performance on the bAbI dialog dataset, and outperform two commercially
    deployed customer-facing dialog systems.


    Distributed, Parallel, and Cluster Computing

    D4M 3.0

    Lauren Milechin, Alexander Chen, Vijay Gadepally, Dylan Hutchison, Siddharth Samsi, Jeremy Kepner
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Databases (cs.DB)

    The D4M tool is used by hundreds of researchers to perform complex analytics
    on unstructured data. Over the past few years, the D4M toolbox has evolved to
    support connectivity with a variety of database engines, graph analytics in the
    Apache Accumulo database, and an implementation using the Julia programming
    language. In this article, we describe some of our latest additions to the D4M
    toolbox and our upcoming D4M 3.0 release.

    Improving the Performance of Fully Connected Neural Networks by Out-of-Place Matrix Transpose

    Shaohuai Shi, Pengfei Xu, Xiaowen Chu
    Comments: 5 pages
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Fully connected networks are widely used in deep learning, and their
    computation efficiency benefits greatly from the cuBLAS matrix multiplication
    routines on GPU. However, we found that cuBLAS has some drawbacks in
    calculating matrix A multiplied by the transpose of matrix B (i.e., the NT
    operation). To reduce the impact of the NT operation in cuBLAS, we exploit
    the out-of-place transpose of matrix B to avoid the NT operation, and then
    we apply our method to Caffe, a popular deep learning tool. Our contribution
    is two-fold. First, we propose a naive method (TNN) and a model-based method
    (MTNN) to increase the performance of calculating A×B^T, and it
    achieves about 4.7 times performance enhancement in our tested cases on GTX1080
    card. Second, we integrate MTNN method into Caffe to enhance the efficiency in
    training fully connected networks, which achieves about 70% speedup compared to
    the original Caffe in our configured fully connected networks on GTX1080 card.
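
    The core trick can be sketched in NumPy as a stand-in for the cuBLAS calls
    (illustrative only; on GPUs the plain NN kernel can be faster than NT, which
    is what the paper exploits, while both paths give the same result):

```python
import numpy as np

def matmul_nt(A, B):
    """Direct A times B-transpose (the "NT" operation)."""
    return A @ B.T

def matmul_tnn(A, B):
    """Out-of-place transpose of B first, then a plain "NN" multiply."""
    Bt = np.ascontiguousarray(B.T)  # materialize B^T in new memory
    return A @ Bt
```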

    (Leader/Randomization/Signature)-free Byzantine Consensus for Consortium Blockchains

    Tyler Crain, Vincent Gramoli, Mikel Larrea, Michel Raynal
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)

    This paper presents a new Byzantine consensus algorithm targeting consortium
    blockchains. To this end, it first revisits the consensus validity property by
    requiring that the decided value satisfies a predefined predicate, which does
    not systematically exclude a value proposed only by Byzantine processes,
    thereby generalizing the validity properties found in the literature. Then, the
    paper presents a simple and modular Byzantine consensus algorithm that relies
    neither on a leader, nor on signatures, nor on randomization. It features the
    fastest multivalued reduction to binary consensus we know of and a
    time-optimal binary Byzantine consensus algorithm. The multivalued reduction
    runs multiple instances of binary consensus concurrently, which results in a
    bitmask that is
    then applied to a vector of multivalued proposals to filter out a valid
    proposed value that is decided. To ensure eventual decision deterministically,
    the underlying binary consensus algorithm assumes eventual synchrony.
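
    The bitmask filtering step can be sketched as follows (our minimal
    illustration of the reduction's final step, not the authors' full protocol;
    the smallest-index rule is one deterministic choice):

```python
# One binary consensus instance per proposer yields a bitmask; the mask
# filters the vector of multivalued proposals, and a deterministic rule
# (here: smallest index) picks the decided value.
def decide(proposals, bitmask):
    """proposals: list of values; bitmask: 0/1 binary-consensus outputs
    of the same length, with at least one 1. Returns the decided value."""
    assert len(proposals) == len(bitmask) and any(bitmask)
    for value, bit in zip(proposals, bitmask):
        if bit == 1:
            return value
```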

    Finer-grained Locking in Concurrent Dynamic Planar Convex Hulls

    K. Alex Mills, James Smith
    Comments: 4 pages; 2 figures; brief announcement submitted to SPAA 2017
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The convex hull of a planar point set is the smallest convex polygon
    containing each point in the set. The dynamic convex hull problem concerns
    efficiently maintaining the convex hull of a set of points subject to additions
    and removals. One algorithm for this problem uses two external balanced binary
    search trees (BSTs) (M. H. Overmars, J. van Leeuwen 1981). We present the first
    concurrent solution for this problem, which uses a single BST that stores
    references to intermediate convex hull solutions at each node. We implement and
    evaluate two lock-based approaches: a) fine-grained locking, where each node of
    the tree is protected by a lock, and b) “finer-grained locking”, where each
    node contains a separate lock for each of the left and right chains. In our
    throughput experiments, we observe that finer-grained locking yields an 8-60%
    improvement over fine-grained locking, and a 38-61x improvement over
    coarse-grained locking and software transactional memory (STM). When applied to
    find the convex hull of static point sets, our approach outperforms a parallel
    divide-and-conquer implementation by 2-4x using an equivalent number of
    threads.
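
    For reference, the sequential object being maintained, the planar convex
    hull with its lower and upper chains, can be computed for a static point set
    with Andrew's monotone chain (a standard algorithm, shown here as context,
    not the paper's concurrent data structure):

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0]-o[0]) * (b[1]-o[1]) - (a[1]-o[1]) * (b[0]-o[0])

    lower, upper = [], []
    for p in pts:                      # build the lower chain
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build the upper chain
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # drop duplicated endpoints
```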

    Elastic Resource Management with Adaptive State Space Partitioning of Markov Decision Processes

    Konstantinos Lolos, Ioannis Konstantinou, Verena Kantere, Nectarios Koziris
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Modern large-scale computing deployments consist of complex applications
    running over machine clusters. An important issue in these is the offering of
    elasticity, i.e., the dynamic allocation of resources to applications to meet
    fluctuating workload demands. Threshold based approaches are typically
    employed, yet they are difficult to configure and optimize. Approaches based on
    reinforcement learning have been proposed, but they require a large number of
    states in order to model complex application behavior. Methods that adaptively
    partition the state space have been proposed, but their partitioning criteria
    and strategies are sub-optimal. In this work we present MDP_DT, a novel
    full-model based reinforcement learning algorithm for elastic resource
    management that employs adaptive state space partitioning. We propose two novel
    statistical criteria and three strategies and we experimentally prove that they
    correctly decide both where and when to partition, outperforming existing
    approaches. We experimentally evaluate MDP_DT in a real large-scale cluster
    over variable, previously unseen workloads and we show that it takes more informed
    decisions compared to static and model-free approaches, while requiring a
    minimal amount of training data.

    Comparative benchmarking of cloud computing vendors with High Performance Linpack

    Mohammad Mohammadi, Timur Bazhirov
    Subjects: Performance (cs.PF); Distributed, Parallel, and Cluster Computing (cs.DC)

    We present a comparative analysis of the maximum performance achieved by the
    Linpack benchmark on compute intensive hardware publicly available from
    multiple cloud providers. We study both performance within a single compute
    node, and speedup for distributed memory calculations with up to 32 nodes or at
    least 512 computing cores. We distinguish between hyper-threaded and
    non-hyper-threaded scenarios and estimate the performance per single computing
    core. We also compare results with a traditional supercomputing system for
    reference. Our findings provide a way to rank the cloud providers and
    demonstrate the viability of the cloud for high performance computing
    applications.


    Learning

    Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models

    Sergey Ioffe
    Subjects: Learning (cs.LG)

    Batch Normalization is quite effective at accelerating and improving the
    training of deep models. However, its effectiveness diminishes when the
    training minibatches are small, or do not consist of independent samples. We
    hypothesize that this is due to the dependence of model layer inputs on all the
    examples in the minibatch, and different activations being produced between
    training and inference. We propose Batch Renormalization, a simple and
    effective extension to ensure that the training and inference models generate
    the same outputs that depend on individual examples rather than the entire
    minibatch. Models trained with Batch Renormalization perform substantially
    better than batchnorm when training with small or non-i.i.d. minibatches. At
    the same time, Batch Renormalization retains the benefits of batchnorm such as
    insensitivity to initialization and training efficiency.
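
    A minimal sketch of the per-feature correction, following the standard batch
    renormalization recipe: clipped factors r and d computed from moving
    averages mu and sigma and treated as constants during backpropagation. The
    clipping bounds below are illustrative defaults:

```python
import numpy as np

def batch_renorm(x, mu, sigma, r_max=3.0, d_max=5.0, eps=1e-5):
    """Batch renormalization forward pass (rows of x are a minibatch).
    mu, sigma are moving averages of the per-feature mean and std."""
    mu_b = x.mean(axis=0)
    sigma_b = x.std(axis=0) + eps
    # r and d correct the minibatch statistics toward the moving averages;
    # in training they are treated as constants (no gradient flows through).
    r = np.clip(sigma_b / sigma, 1.0 / r_max, r_max)
    d = np.clip((mu_b - mu) / sigma, -d_max, d_max)
    return (x - mu_b) / sigma_b * r + d
```

    When the clipping is inactive, the output equals (x - mu) / sigma, which is
    exactly the inference-time transform, so training and inference agree.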

    A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines

    Eric W. Tramel, Marylou Gabrié, Andre Manoel, Francesco Caltagirone, Florent Krzakala
    Subjects: Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Restricted Boltzmann machines (RBMs) are energy-based neural networks which
    are commonly used as building blocks for deep neural architectures. In this
    work, we derive a deterministic framework for the
    training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer
    (TAP) mean-field approximation of widely-connected systems with weak
    interactions coming from spin-glass theory. While the TAP approach has been
    extensively studied for fully-visible binary spin systems, our construction is
    generalized to latent-variable models, as well as to arbitrarily distributed
    real-valued spin systems with bounded support. In our numerical experiments, we
    demonstrate the effective deterministic training of our proposed models and are
    able to show interesting features of unsupervised learning which could not be
    directly observed with sampling. Additionally, we demonstrate how to utilize
    our TAP-based framework for leveraging trained RBMs as joint priors in
    denoising problems.

    Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

    Stefan Elfwing, Eiji Uchibe, Kenji Doya
    Comments: 18 pages, 21 figures
    Subjects: Learning (cs.LG)

    In recent years, neural networks have enjoyed a renaissance as function
    approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon
    achieved near top-level human performance in backgammon, the deep reinforcement
    learning algorithm DQN (combining Q-learning with a deep neural network,
    experience replay, and a separate target network) achieved human-level
    performance in many Atari 2600 games. The purpose of this study is twofold.
    First, based on the expected energy restricted Boltzmann machine (EE-RBM), we
    propose two activation functions for neural network function approximation in
    reinforcement learning: the sigmoid-weighted linear (SiL) unit and its
    derivative function (SiLd1). The activation of the SiL unit is computed by the
    sigmoid function multiplied by its input, which is equal to the contribution to
    the output from one hidden unit in an EE-RBM. Second, we suggest that the more
    traditional approach of using on-policy learning with eligibility traces,
    instead of experience replay, and softmax action selection can be competitive
    with DQN, without the need for a separate target network. We validate our
    proposed approach by, first, achieving new state-of-the-art results in both
    stochastic SZ-Tetris and Tetris with a small 10×10 board, using TD(λ)
    learning and shallow SiLd1 network agents, and then outperforming DQN in the
    Atari 2600 domain by using a deep Sarsa(λ) agent with SiL and SiLd1
    hidden units.
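
    The two proposed activations are simple to state: SiL multiplies the input
    by its sigmoid, and SiLd1 is the derivative of SiL used as an activation in
    its own right:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sil(x):
    """Sigmoid-weighted linear unit: the input times its sigmoid."""
    return x * sigmoid(x)

def sild1(x):
    """Derivative of SiL, itself used as an activation (SiLd1)."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))
```

    SiL behaves like a smoothed ReLU: near-linear for large positive inputs and
    close to zero for large negative inputs.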

    Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities

    Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvári
    Subjects: Learning (cs.LG)

    The follow the leader (FTL) algorithm, perhaps the simplest of all online
    learning algorithms, is known to perform well when the loss functions it is
    used on are convex and positively curved. In this paper we ask whether there
    are other “lucky” settings when FTL achieves sublinear, “small” regret. In
    particular, we study the fundamental problem of linear prediction over a
    non-empty convex, compact domain. Amongst other results, we prove that the
    curvature of the boundary of the domain can act as if the losses were curved:
    In this case, we prove that as long as the mean of the loss vectors has
    positive length bounded away from zero, FTL enjoys a logarithmic growth rate
    of regret, while, e.g., for polytope domains and stochastic data it enjoys
    finite expected regret. Building on a previously known meta-algorithm, we also
    get an algorithm that simultaneously enjoys the worst-case guarantees and the
    bound available for FTL.

    Multi-step Off-policy Learning Without Importance Sampling Ratios

    Ashique Rupam Mahmood, Huizhen Yu, Richard S. Sutton
    Comments: 24 pages, 4 figures
    Subjects: Learning (cs.LG)

    To estimate the value functions of policies from exploratory data, most
    model-free off-policy algorithms rely on importance sampling, where the use of
    importance sampling ratios often leads to estimates with severe variance. It is
    thus desirable to learn off-policy without using the ratios. However, such an
    algorithm does not exist for multi-step learning with function approximation.
    In this paper, we introduce the first such algorithm based on
    temporal-difference (TD) learning updates. We show that an explicit use of
    importance sampling ratios can be eliminated by varying the amount of
    bootstrapping in TD updates in an action-dependent manner. Our new algorithm
    achieves stability using a two-timescale gradient-based TD update. A prior
    algorithm based on lookup table representation called Tree Backup can also be
    retrieved using action-dependent bootstrapping, becoming a special case of our
    algorithm. In two challenging off-policy tasks, we demonstrate that our
    algorithm is stable, effectively avoids the large variance issue, and can
    perform substantially better than its state-of-the-art counterpart.
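
    A toy illustration (not the paper's algorithm) of the variance problem being
    avoided: the cumulative importance sampling ratio is a product of per-step
    ratios of target to behavior probabilities, which explodes or vanishes with
    trajectory length. The two policies below are hypothetical:

```python
# Target policy pi and behavior policy mu over a single binary action.
pi = {"left": 0.9, "right": 0.1}   # hypothetical target policy
mu = {"left": 0.5, "right": 0.5}   # hypothetical behavior policy

def trajectory_ratio(actions, pi, mu):
    """Cumulative importance sampling ratio for a sequence of actions:
    the product of pi(a)/mu(a) over the trajectory."""
    rho = 1.0
    for a in actions:
        rho *= pi[a] / mu[a]
    return rho
```

    A ten-step trajectory of "left" actions already gets weight 1.8^10 ≈ 357,
    while ten "right" actions get 0.2^10 ≈ 1e-7; estimates built from such
    weights have severe variance, which motivates eliminating the ratios.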

    Soft tensegrity robots

    John Rieffel, Jean-Baptiste Mouret
    Subjects: Robotics (cs.RO); Learning (cs.LG); Systems and Control (cs.SY)

    Living organisms intertwine soft (e.g., muscle) and hard (e.g., bones)
    materials, giving them an intrinsic flexibility and resiliency often lacking in
    conventional rigid robots. The emerging field of soft robotics seeks to harness
    these same properties in order to create resilient machines. The nature of soft
    materials, however, presents considerable challenges to aspects of design,
    construction, and control — and up until now, the vast majority of gaits for
    soft robots have been hand-designed through empirical trial-and-error. This
    manuscript describes an easy-to-assemble tensegrity-based soft robot capable of
    highly dynamic locomotive gaits and demonstrating structural and behavioral
    resilience in the face of physical damage. Enabling this is the use of a
    machine learning algorithm able to discover novel gaits with a minimal number
    of physical trials. These results lend further credence to soft-robotic
    approaches that seek to harness the interaction of complex material dynamics in
    order to generate a wealth of dynamical behaviors.

    Multi-agent Reinforcement Learning in Sequential Social Dilemmas

    Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel
    Comments: 10 pages, 7 figures
    Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Learning (cs.LG)

    Matrix games like Prisoner’s Dilemma have guided research on social dilemmas
    for decades. However, they necessarily treat the choice to cooperate or defect
    as an atomic action. In real-world social dilemmas these choices are temporally
    extended. Cooperativeness is a property that applies to policies, not
    elementary actions. We introduce sequential social dilemmas that share the
    mixed incentive structure of matrix game social dilemmas but also require
    agents to learn policies that implement their strategic intentions. We analyze
    the dynamics of policies learned by multiple self-interested independent
    learning agents, each using its own deep Q-network, on two Markov games we
    introduce here: 1. a fruit Gathering game and 2. a Wolfpack hunting game. We
    characterize how learned behavior in each domain changes as a function of
    environmental factors including resource abundance. Our experiments show how
    conflict can emerge from competition over shared resources and shed light on
    how the sequential nature of real world social dilemmas affects cooperation.
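
    For context, the "atomic" matrix-game baseline mentioned above is the
    textbook Prisoner's Dilemma payoff table (values follow the common 3/5/1/0
    convention, with temptation > reward > punishment > sucker), in which
    defection dominates:

```python
# (row payoff, column payoff) for each joint action: C = cooperate, D = defect.
PAYOFF = {
    ("C", "C"): (3, 3),  # mutual cooperation: reward R
    ("C", "D"): (0, 5),  # sucker's payoff S vs temptation T
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),  # mutual defection: punishment P
}

def best_response(opponent_action):
    """Defection dominates: it is the best response to either action."""
    return max("CD", key=lambda a: PAYOFF[(a, opponent_action)][0])
```

    Mutual defection is the unique equilibrium even though mutual cooperation
    pays both players more; the paper's sequential dilemmas keep this mixed
    incentive structure but require temporally extended policies.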

    Fixing an error in Caponnetto and de Vito (2007)

    Dougal J. Sutherland
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Statistics Theory (math.ST)

    The seminal paper of Caponnetto and de Vito (2007) provides minimax-optimal
    rates for kernel ridge regression in a very general setting. Its proof,
    however, contains an error in its bound on the effective dimensionality. In
    this note, we explain the mistake, provide a correct bound, and show that the
    main theorem remains true.


    Information Theory

    Multidimensional Index Modulation in Wireless Communications

    Bharath Shamasundar, Swaroop Jacob, Sandeep Bhat, A. Chockalingam
    Subjects: Information Theory (cs.IT)

    In index modulation schemes, information bits are conveyed through indexing
    of transmission entities such as antennas, subcarriers, time slots, precoders,
    subarrays, and radio frequency (RF) mirrors. Index modulation schemes are
    attractive for their advantages such as good performance, high rates, and
    hardware simplicity. This paper focuses on index modulation schemes in which
    multiple transmission entities, namely, antennas, time slots, and RF mirrors,
    are indexed simultaneously. Recognizing that such
    multidimensional index modulation schemes encourage sparsity in their transmit
    signal vectors, we propose efficient signal detection schemes that use
    compressive sensing based reconstruction algorithms. Results show that, for a
    given rate, improved performance is achieved when the number of indexed
    transmission entities is increased. We also explore indexing opportunities in
    load modulation, which is a modulation scheme that offers power
    efficiency and reduced RF hardware complexity advantages in multiantenna
    systems. Results show that indexing space and time in load modulated
    multiantenna systems can achieve improved performance.
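
    A toy mapper for a single indexed entity (antennas) shows where the sparsity
    exploited by the compressive-sensing detectors comes from; the 4-antenna /
    QPSK numbers are illustrative, not the paper's configuration:

```python
# Gray-style QPSK constellation keyed by two bits (illustrative mapping).
QPSK = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}

def modulate(bits, n_antennas=4):
    """Spatial-modulation mapper: with 4 antennas, 2 bits select the active
    antenna index and 2 bits select a QPSK symbol, so the transmit vector
    has exactly one nonzero entry."""
    assert len(bits) == 4
    antenna = bits[0] * 2 + bits[1]      # index bits pick the antenna
    symbol = QPSK[(bits[2], bits[3])]    # symbol bits pick the constellation point
    x = [0j] * n_antennas
    x[antenna] = symbol
    return x
```

    Indexing more entities simultaneously (time slots, RF mirrors) multiplies
    the index bits while keeping the transmit vector sparse.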

    Spatial Oversampling in LOS MIMO Systems with 1-Bit Quantization at the Receiver

    Tim Hälsig, Berthold Lankl
    Comments: Accepted at International ITG SCC 2017
    Subjects: Information Theory (cs.IT)

    In this paper we investigate the achievable rate of LOS MIMO systems that use
    1-bit quantization and spatial oversampling at the receiver. We propose that by
    using additional antennas at the receiver, the loss incurred due to the strong
    limitation of the quantization can be decreased. Mutual information results
    show that considerable rate gains can be achieved depending on the number and
    arrangement of the antennas. In some of the cases, even the full available rate
    from the transmitter can be attained. Furthermore, the results also reveal that
    two-dimensional antenna arrays can benefit more from spatial oversampling than
    one-dimensional arrays, when using 1-bit quantization in the LOS MIMO scenario.

    Performance of Cell-Free Massive MIMO Systems with MMSE and LSFD Receivers

    Elina Nayebi, Alexei Ashikhmin, Thomas L. Marzetta, Bhaskar D. Rao
    Subjects: Information Theory (cs.IT)

    Cell-Free Massive MIMO comprises a large number of distributed single-antenna
    access points (APs) serving a much smaller number of users. There is no
    partitioning into cells and each user is served by all APs.

    In this paper, the uplink performance of cell-free systems with minimum mean
    squared error (MMSE) and large scale fading decoding (LSFD) receivers is
    investigated. The main idea of the LSFD receiver is to maximize achievable
    throughput using only large scale fading coefficients between APs and users.
    Capacity lower bounds for MMSE and LSFD receivers are derived. An asymptotic
    approximation for signal-to-interference-plus-noise ratio (SINR) of MMSE
    receiver is derived as a function of large scale fading coefficients only. The
    obtained approximation is accurate even for a small number of antennas. MMSE
    and LSFD receivers demonstrate five-fold and two-fold gains, respectively,
    over the matched filter (MF) receiver in terms of 5%-outage rate.

    Cramér-Rao Lower Bounds for Positioning with Large Intelligent Surfaces

    Sha Hu, Fredrik Rusek, Ove Edfors
    Comments: 5 pages, 7 figures, conference
    Subjects: Information Theory (cs.IT)

    We consider the potential for positioning with a system where antenna arrays
    are deployed as a large intelligent surface (LIS). We derive
    Fisher information and Cramér-Rao lower bounds (CRLB) in closed-form for
    terminals along the central perpendicular line (CPL) of the LIS for all three
    Cartesian dimensions. For terminals at positions other than the CPL,
    closed-form expressions for the Fisher information and CRLBs seem out of
    reach, and we alternatively provide approximations (in closed-form) which are
    shown to be very accurate. We also show that under mild conditions, the CRLBs
    in general decrease quadratically in the surface-area for both the x and y
    dimensions. For the z-dimension (distance from the LIS), the CRLB decreases
    linearly in the surface-area when terminals are along the CPL. However, when
    terminals move away from the CPL, the CRLB is dramatically increased and then
    also decreases quadratically in the surface-area. We also extensively discuss
    the impact of different deployments (centralized and distributed) of the LIS.

    The Potential of Using Large Antenna Arrays on Intelligent Surfaces

    Sha Hu, Fredrik Rusek, Ove Edfors
    Comments: 6 pages, 10 figures,conference
    Subjects: Information Theory (cs.IT)

    In this paper, we consider capacities of single-antenna terminals
    communicating to large antenna arrays that are deployed on surfaces. That is,
    the entire surface is used as an intelligent receiving antenna array. Under the
    condition that the surface area is sufficiently large, the received signal
    after matched-filtering (MF) can be well approximated by an intersymbol
    interference (ISI) channel where channel taps are closely related to a sinc
    function. Based on such an approximation, we have derived the capacities for
    both one-dimensional (terminals on a line) and high dimensional (terminals on a
    plane or in a cube) terminal-deployments. In particular, we analyze the
    normalized capacity C̄, measured in nats/s/Hz/m², under the constraint that
    the transmit power per m², P̄, is fixed. We show that when the user-density
    increases, the limit of C̄, achieved as the wavelength λ approaches 0, is
    P̄/(2N_0) nats/s/Hz/m², where N_0 is the spatial power spectral density (PSD)
    of noise. In addition, we also show that the number of signal dimensions is
    2/λ per meter of deployed surface for the one-dimensional case, and π/λ² per
    m² of deployed surface for two- and three-dimensional terminal-deployments.

    PCA in Data-Dependent Noise (Correlated-PCA): Nearly Optimal Finite Sample Guarantees

    Namrata Vaswani, Praneeth Narayanamurthy
    Comments: 7 pages, 2 figures
    Subjects: Information Theory (cs.IT); Machine Learning (stat.ML)

    We study Principal Component Analysis (PCA) in a setting where a part of the
    corrupting noise is data-dependent and, hence, the noise and the true data are
    correlated. Under a bounded-ness assumption on both the true data and noise,
    and a few assumptions on the data-noise correlation, we obtain a sample
    complexity bound for the most common PCA solution, singular value decomposition
    (SVD). This bound, which is within a logarithmic factor of the best achievable,
    significantly improves upon our bound from recent work (NIPS 2016) where we
    first studied this “correlated-PCA” problem.
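
    The "most common PCA solution" analyzed here, the SVD of the centered data
    matrix, takes only a few lines:

```python
import numpy as np

def pca_svd(X, r):
    """PCA via SVD. Rows of X are samples; returns a (d, r) orthonormal
    basis for the top-r principal subspace."""
    Xc = X - X.mean(axis=0)                        # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:r].T                                # top-r right singular vectors
```

    When the noise is data-dependent, as in the paper's setting, it is the
    accuracy of this subspace estimate that the sample complexity bound
    controls.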

    Sparsity/Undersampling Tradeoffs in Anisotropic Undersampling, with Applications in MR Imaging/Spectroscopy

    Hatef Monajemi, David L. Donoho
    Subjects: Information Theory (cs.IT)

    We study anisotropic undersampling schemes like those used in
    multi-dimensional NMR spectroscopy and MR imaging, which sample exhaustively in
    certain time dimensions and randomly in others.

    Our analysis shows that anisotropic undersampling schemes are equivalent to
    certain block-diagonal measurement systems. We develop novel exact formulas for
    the sparsity/undersampling tradeoffs in such measurement systems. Our formulas
    predict finite-N phase transition behavior differing substantially from the
    well-known asymptotic phase transitions for classical Gaussian undersampling.
    Extensive empirical work shows that our formulas accurately describe observed
    finite-N behavior, while the usual formulas based on universality are
    substantially inaccurate.

    We also vary the anisotropy, keeping the total number of samples fixed, and
    for each variation we determine the precise sparsity/undersampling tradeoff
    (phase transition). We show that, other things being equal, the ability to
    recover a sparse object decreases with an increasing number of
    exhaustively-sampled dimensions.

    The ARMA(k) Gaussian Feedback Capacity

    Tao Liu, Guangyue Han
    Subjects: Information Theory (cs.IT)

    Using Kim’s variational formulation (with a slight yet important
    modification), we derive the ARMA(k) Gaussian feedback capacity, i.e., the
    feedback capacity of an additive channel where the noise is a k-th order
    autoregressive moving average Gaussian process. More specifically, the ARMA(k)
    Gaussian feedback capacity is expressed as a simple function evaluated at a
    solution to a system of polynomial equations, which has only finitely many
    solutions for the cases k=1, 2 and possibly beyond.

    Density Functional Estimators with k-Nearest Neighbor Bandwidths

    Weihao Gao, Sewoong Oh, Pramod Viswanath
    Comments: 17 pages, 6 figures
    Subjects: Information Theory (cs.IT)

    Estimating expected polynomials of density functions from samples is a basic
    problem with numerous applications in statistics and information theory.
    Although kernel density estimators are widely used in practice for such
    functional estimation problems, practitioners are left on their own to choose
    an appropriate bandwidth for each application in hand. Further, kernel density
    estimators suffer from boundary biases, which are prevalent in real world data
    with lower dimensional structures. We propose using the fixed-k nearest
    neighbor distances for the bandwidth, which adaptively adjusts to local
    geometry. Further, we propose a novel estimator based on local likelihood
    density estimators, that mitigates the boundary biases. Although such a choice
    of fixed-k nearest neighbor distances to bandwidths results in inconsistent
    estimators, we provide a simple debiasing scheme that precomputes the
    asymptotic bias and divides off this term. With this novel correction, we show
    consistency of this debiased estimator. We provide numerical experiments
    suggesting that it improves upon competing state-of-the-art methods.
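
    The basic fixed-k nearest-neighbor bandwidth, shown here for a
    one-dimensional density estimate without the paper's debiasing step (a
    classical construction, included for orientation):

```python
import numpy as np

def knn_density_1d(x, samples, k=5):
    """Classical kNN density estimate in 1-D: f_hat(x) = k / (n * 2 * r_k),
    where r_k is the distance from x to its k-th nearest sample and 2 * r_k
    is the length of the ball containing the k nearest neighbors."""
    r_k = np.sort(np.abs(samples - x))[k - 1]
    return k / (len(samples) * 2.0 * r_k)
```

    Because r_k shrinks where samples are dense and grows where they are
    sparse, the bandwidth adapts to local geometry automatically, which is the
    property the proposed estimators build on.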

    An Overview of Multi-Processor Approximate Message Passing

    Junan Zhu, Ryan Pilgrim, Dror Baron
    Subjects: Information Theory (cs.IT)

    Approximate message passing (AMP) is an algorithmic framework for solving
    linear inverse problems from noisy measurements, with exciting applications
    such as reconstructing images, audio, hyperspectral images, and various other
    signals, including those acquired in compressive signal acquisition systems. The
    growing prevalence of big data systems has increased interest in large-scale
    problems, which may involve huge measurement matrices that are unsuitable for
    conventional computing systems. To address the challenge of large-scale
    processing, multiprocessor (MP) versions of AMP have been developed. We provide
    an overview of two such MP-AMP variants. In row-MP-AMP, each computing node
    stores a subset of the rows of the matrix and processes corresponding
    measurements. In column-MP-AMP, each node stores a subset of columns, and is
    solely responsible for reconstructing a portion of the signal. We will discuss
    pros and cons of both approaches, summarize recent research results for each,
    and explain when each one may be a viable approach. Aspects that are
    highlighted include some recent results on state evolution for both MP-AMP
    algorithms, and the use of data compression to reduce communication in the MP
    network.
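To make the two partitioning schemes concrete, here is a toy NumPy sketch (a matrix-vector product distributed across three hypothetical nodes, not an AMP implementation) of the row and column splits the abstract describes:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))   # measurement matrix
x = rng.standard_normal(4)        # signal

# row-MP-AMP style: each node holds a block of rows and the corresponding
# measurements, so it computes its own slice of A @ x locally
row_blocks = np.array_split(A, 3, axis=0)
y_row = np.concatenate([blk @ x for blk in row_blocks])

# column-MP-AMP style: each node holds a block of columns and reconstructs
# only its slice of the signal; local partial products must be summed,
# which is where inter-node communication enters
col_blocks = np.array_split(A, 3, axis=1)
x_blocks = np.array_split(x, 3)
y_col = sum(blk @ xb for blk, xb in zip(col_blocks, x_blocks))
```

Both splits recover the same product `A @ x`; the difference is what each node must store and how much it must communicate, which is exactly the trade-off (and the motivation for the data-compression results) highlighted above.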

    Elementary (L^infty) error estimates for super-resolution de-noising

    Weilin Li
    Subjects: Information Theory (cs.IT)

    This paper studies the problem of recovering a discrete complex measure on
    the torus from a finite number of corrupted Fourier samples. We assume the
    support of the unknown discrete measure satisfies a minimum separation
    condition and we use convex regularization methods to recover approximations of
    the original measure. We focus on two well-known convex regularization methods,
    and for both, we establish an error estimate that bounds the smoothed-out error
    in terms of the target resolution and noise level. Our (L^infty) approximation
    rate is entirely new for one of the methods, and improves upon a previously
    established (L^1) estimate for the other. We provide a unified analysis and an
    elementary proof of the theorem.
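For orientation, the standard setup in this line of work (our hedged summary of the usual model, not the paper's exact statement) observes a discrete measure on the torus through noisy low-frequency Fourier samples:

```latex
\mu = \sum_{j=1}^{s} a_j \, \delta_{t_j}, \qquad t_j \in \mathbb{T} = [0,1), \quad a_j \in \mathbb{C},
\qquad
y_k = \hat{\mu}(k) + \eta_k = \sum_{j=1}^{s} a_j e^{-2\pi i k t_j} + \eta_k, \quad |k| \le M,
```

with a minimum separation condition of the form (Delta(mu) = min_{j 
eq l} |t_j - t_l|_{mathbb{T}} ge c/M) and bounded noise (|eta_k| le delta); the error estimates then control the difference between the recovered and true measures after smoothing at the target resolution.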

    Secure Multi-Source Multicast

    Alejandro Cohen, Asaf Cohen, Muriel Medard, Omer Gurewitz
    Subjects: Information Theory (cs.IT)

    The principal mission of Multi-Source Multicast (MSM) is to disseminate all
    messages from all sources in a network to all destinations. MSM is utilized in
    numerous applications. In many of them, securing the messages disseminated is
critical. A common security model considers a network with an eavesdropper who
is able to observe a subset of the network links, and seeks a code that keeps
the eavesdropper ignorant of all the messages. While
    this is solved when all messages are located at a single source, Secure MSM
    (SMSM) is an open problem, and the rates required are hard to characterize in
    general. In this paper, we consider Individual Security, which promises that
    the eavesdropper has zero mutual information with each message individually. We
    completely characterize the rate region for SMSM under individual security, and
    show that such a security level is achievable at the full capacity of the
    network, that is, the cut-set bound is the matching converse, similar to
    non-secure MSM. Moreover, we show that the field size is similar to non-secure
    MSM and does not have to be larger due to the security constraint.

    Conversion of Coherence Number and Concurrence into Entanglement (k)-concurrence

    Seungbeom Chin
    Comments: 17 pages
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)

We investigate the (k)-concurrence entanglement convertibility theorem of
coherence. By introducing the coherence number, which is the generalization
of coherence rank to mixed states, we present a necessary and sufficient
    condition for a coherent mixed state to be converted to an entangled state of
    nonzero (k)-concurrence. We also quantitatively compare the amount of
    (k)-concurrence entanglement with coherence concurrence (C_c), a recently
    introduced convex roof monotone of coherence.



