IT博客汇 | arXiv Paper Daily: Wed, 15 Feb 2017

arXiv Paper Daily: Wed, 15 Feb 2017

我爱机器学习(52ml.net)发表于 2017-02-15 00:00:00

Neural and Evolutionary Computing

Exploring loss function topology with cyclical learning rates

Leslie N. Smith, Nicholay Topin
Comments: Submitted as an ICLR 2017 Workshop paper
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

We present observations and discussion of previously unreported phenomena
discovered while training residual networks. The goal of this work is to better
understand the nature of neural networks through the examination of these new
empirical results. These behaviors were identified through the application of
Cyclical Learning Rates (CLR) and linear network interpolation. Among these
behaviors are counterintuitive increases and decreases in training loss and
instances of rapid training. For example, we demonstrate how CLR can produce
greater testing accuracy than traditional training despite using large learning
rates. Files to replicate these results are available at
this https URL

The Causal Role of Astrocytes in Slow-Wave Rhythmogenesis: A Computational Modelling Study

Leo Kozachkov, Konstantinos P. Michmizos
Comments: 20 pages, 6 figures
Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE); Cell Behavior (q-bio.CB)

Finding the origin of slow and infra-slow oscillations could reveal or
explain brain mechanisms in health and disease. Here, we present a
biophysically constrained computational model of a neural network where the
inclusion of astrocytes introduced slow and infra-slow-oscillations, through
two distinct mechanisms. Specifically, we show how astrocytes can modulate the
fast network activity through their slow inter-cellular calcium wave speed and
amplitude and possibly cause the oscillatory imbalances observed in diseases
commonly known for such abnormalities, namely Alzheimer’s disease, Parkinson’s
disease, epilepsy, depression and ischemic stroke. This work aims to increase
our knowledge on how astrocytes and neurons synergize to affect brain function
and dysfunction.

Differential Evolution for Quantum Robust Control: Algorithm, Applications and Experiments

Daoyi Dong, Xi Xing, Hailan Ma, Chunlin Chen, Zhixin Liu, Herschel Rabitz
Comments: 13 pages, 10 figures and 1 table
Subjects: Quantum Physics (quant-ph); Neural and Evolutionary Computing (cs.NE); Systems and Control (cs.SY)

Robust control design for quantum systems has been recognized as a key task
in quantum information technology, molecular chemistry and atomic physics. In
this paper, an improved differential evolution algorithm of msMS_DE is proposed
to search robust fields for various quantum control problems. In msMS_DE,
multiple samples are used for fitness evaluation and a mixed strategy is
employed for mutation operation. In particular, the msMS_DE algorithm is
applied to the control problem of open inhomogeneous quantum ensembles and the
consensus problem of a quantum network with uncertainties. Numerical results
are presented to demonstrate the excellent performance of the improved DE
algorithm for these two classes of quantum robust control problems.
Furthermore, msMS_DE is experimentally implemented on femtosecond laser control
systems to generate good signals of two photon absorption and control
fragmentation of halomethane molecules CH2BrI. Experimental results demonstrate
excellent performance of msMS_DE in searching effective femtosecond laser
pulses for various tasks.

Computer Vision and Pattern Recognition

Integrating Three Mechanisms of Visual Attention for Active Visual Search

Amir Rasouli, John K. Tsotsos
Comments: Presented at International Symposium On Attention in Cognitive Systems (ISACS) in Association with IROS, 2015
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Algorithms for robotic visual search can benefit from the use of visual
attention methods in order to reduce computational costs. Here, we describe how
three distinct mechanisms of visual attention can be integrated and
productively used to improve search performance. The first is viewpoint
selection as has been proposed earlier using a greedy search over a
probabilistic occupancy grid representation. The second is top-down
object-based attention using a histogram backprojection method, also previously
described. The third is visual saliency. This is novel in the sense that it is
not used as a region-of-interest method for the current image but rather as a
noncombinatorial form of look-ahead in search for future viewpoint selection.
Additionally, the integration of these three attentional schemes within a
single framework is unique and not previously studied. We examine our proposed
method in scenarios where little or no information regarding the environment is
available. Through extensive experiments on a mobile robot, we show that our
method improves visual search performance by reducing the time and number of
actions required.

DAGER: Deep Age, Gender and Emotion Recognition Using Convolutional Neural Network

Afshin Dehghan, Enrique G. Ortiz, Guang Shu, Syed Zain Masood
Comments: 10 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

This paper describes the details of Sighthound’s fully automated age, gender
and emotion recognition system. The backbone of our system consists of several
deep convolutional neural networks that are not only computationally
inexpensive, but also provide state-of-the-art results on several competitive
benchmarks. To power our novel deep networks, we collected large labeled
datasets through a semi-supervised pipeline to reduce the annotation
effort/time. We tested our system on several public benchmarks and report
outstanding results. Our age, gender and emotion recognition models are
available to developers through the Sighthound Cloud API at
this https URL

Structured Deep Hashing with Convolutional Neural Networks for Fast Person Re-identification

Lin Wu, Yang Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Given a pedestrian image as a query, the purpose of person re-identification
is to identify the correct match from a large collection of gallery images
depicting the same person captured by disjoint camera views. The critical
challenge is how to construct a robust yet discriminative feature
representation to capture the compounded variations in pedestrian appearance.
To this end, deep learning methods have been proposed to extract hierarchical
features against extreme variability of appearance. However, existing methods
in this category generally neglect the efficiency in the matching stage whereas
the searching speed of a re-identification system is crucial in real-world
applications. In this paper, we present a novel deep hashing framework with
Convolutional Neural Networks (CNNs) for fast person re-identification.
Technically, we simultaneously learn both CNN features and hash functions/codes
to get robust yet discriminative features and similarity-preserving hash codes.
Thereby, person re-identification can be resolved by efficiently computing and
ranking the Hamming distances between images. A structured loss function
defined over positive pairs and hard negatives is proposed to formulate a novel
optimization problem so that fast convergence and more stable optimized
solution can be obtained. Extensive experiments on two benchmarks CUHK03
cite{FPNN} and Market-1501 cite{Market1501} show that the proposed deep
architecture is efficacy over state-of-the-arts.

FERA 2017 – Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge

Michel F. Valstar, Enrique Sánchez-Lozano, Jeffrey F. Cohn, László A. Jeni, Jeffrey M. Girard, Zheng Zhang, Lijun Yin, Maja Pantic
Comments: FERA 2017 Baseline Paper
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The field of Automatic Facial Expression Analysis has grown rapidly in recent
years. However, despite progress in new approaches as well as benchmarking
efforts, most evaluations still focus on either posed expressions, near-frontal
recordings, or both. This makes it hard to tell how existing expression
recognition approaches perform under conditions where faces appear in a wide
range of poses (or camera views), displaying ecologically valid expressions.
The main obstacle for assessing this is the availability of suitable data, and
the challenge proposed here addresses this limitation. The FG 2017 Facial
Expression Recognition and Analysis challenge (FERA 2017) extends FERA 2015 to
the estimation of Action Units occurrence and intensity under different camera
views. In this paper we present the third challenge in automatic recognition of
facial expressions, to be held in conjunction with the 12th IEEE conference on
Face and Gesture Recognition, May 2017, in Washington, United States. Two
sub-challenges are defined: the detection of AU occurrence, and the estimation
of AU intensity. In this work we outline the evaluation protocol, the data
used, and the results of a baseline method for both sub-challenges.

One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network

Vedran Vukotić, Silvia-Laura Pintea, Christian Raymond, Guillaume Gravier, Jan Van Gemert
Comments: 8 pages, 6 figures, published in the Netherlands Conference on Computer Vision (NCCV) 2016
Journal-ref: Proceedings of the 3rd Netherlands Conference on Computer Vision
(2016)
Subjects: Computer Vision and Pattern Recognition (cs.CV)

There is an inherent need for machines to have a notion of how entities
within their environment behave and to anticipate changes in the near future.
In this work, we focus on anticipating future appearance given the current
frame of a video. Typical methods are used either to predict the next frame of
a video or to predict future optical flow or trajectories based on a single
video frame. This work presents an experiment on stretching the ability of CNNs
to predict %not the next frame, but an anticipation of appearance at an
arbitrarily given future time. We condition our predicted video frames on a
continuous time variable that allows us to anticipate future frames at a given
temporal distance, directly from the current input video frame. We show that
CNNs can learn an intrinsic representation of typical appearance changes over
time and successfully generate realistic predictions in one step – at a
deliberate time difference in the near future. The method is evaluated on the
KTH human actions dataset and compared to a baseline consisting of an analogous
CNN architecture that is not time-aware.

Graph Based Over-Segmentation Methods for 3D Point Clouds

Yizhak Ben-Shabat, Tamar Avraham, Michael Lindenbaum, Anath Fischer
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Over-segmentation, or super-pixel generation, is a common preliminary stage
for many computer vision applications. New acquisition technologies enable the
capturing of 3D point clouds that contain color and geometrical information.
This 3D information introduces a new conceptual change that can be utilized to
improve the results of over-segmentation, which uses mainly color information,
and to generate clusters of points we call super-points. We consider a variety
of possible 3D extensions of the Local Variation (LV) graph based
over-segmentation algorithms, and compare them thoroughly. We consider
different alternatives for constructing the connectivity graph, for assigning
the edge weights, and for defining the merge criterion, which must now account
for the geometric information and not only color. Following this evaluation, we
derive a new generic algorithm for over-segmentation of 3D point clouds. We
call this new algorithm Point Cloud Local Variation (PCLV). The advantages of
the new over-segmentation algorithm are demonstrated on both outdoor and
cluttered indoor scenes. Performance analysis of the proposed approach compared
to state-of-the-art 2D and 3D over-segmentation algorithms shows significant
improvement according to the common performance measures.

Efficient Algorithms for Moral Lineage Tracing

Markus Rempfler, Jan-Hendrik Lange, Florian Jug, Corinna Blasse, Eugene W. Myers, Bjoern H. Menze, Bjoern Andres
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Lineage tracing, the joint segmentation and tracking of living cells as they
move and divide in a sequence of light microscopy images, is a challenging
task. Jug et al. have proposed a mathematical abstraction of this task, the
moral lineage tracing problem (MLTP) whose feasible solutions define a
segmentation of every image and a lineage forest of cells. Their branch-and-cut
algorithm, however, is prone to many cuts and slow convergences for large
instances. To address this problem, we make three contributions: Firstly, we
improve the branch-and-cut algorithm by separating tighter cutting planes.
Secondly, we define two primal feasible local search algorithms for the MLTP.
Thirdly, we show in experiments that our algorithms decrease the runtime on the
problem instances of Jug et al. considerably and find solutions on larger
instances in reasonable time.

SSPP-DAN: Deep Domain Adaptation Network for Face Recognition with Single Sample Per Person

Sungeun Hong, Woobin Im, Jongbin Ryu, Hyun S. Yang
Comments: 5 pages, 4 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Real-world face recognition using single sample per person (SSPP) is a
challenging task. The problem is exacerbated if the conditions under which the
gallery image and the probe set are captured are completely different. To
address these issues from the perspective of domain adaptation, we introduce a
SSPP domain adaptation network (SSPP-DAN). In the proposed approach, domain
adaptation, feature extraction, and classification are performed jointly using
a deep architecture with domain-adversarial training. However, the SSPP
characteristic of one training sample per class is insufficient to train the
deep architecture. To overcome this shortage, we generate synthetic images with
varying poses using a 3D face model. Experimental evaluations using a realistic
SSPP dataset show that deep domain adaptation and image synthesis complement
each other and dramatically improve accuracy. Experiments on a benchmark
dataset using the proposed approach show state-of-the-art performance.

A Graphical Social Topology Model for Multi-Object Tracking

Shan Gao, Xiaogang Chen, Qixiang Ye, Junliang Xing, Arjan Kuijper, Xiangyang Ji
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Tracking multiple objects is a challenging task when objects move in groups
and occlude each other. Existing methods have investigated the problems of
group division and group energy-minimization; however, lacking overall
object-group topology modeling limits their ability in handling complex object
and group dynamics. Inspired with the social affinity property of moving
objects, we propose a Graphical Social Topology (GST) model, which estimates
the group dynamics by jointly modeling the group structure and the states of
objects using a topological representation. With such topology representation,
moving objects are not only assigned to groups, but also dynamically connected
with each other, which enables in-group individuals to be correctly associated
and the cohesion of each group to be precisely modeled. Using well-designed
topology learning modules and topology training, we infer the birth/death and
merging/splitting of dynamic groups. With the GST model, the proposed
multi-object tracker can naturally facilitate the occlusion problem by treating
the occluded object and other in-group members as a whole while leveraging
overall state transition. Experiments on both RGB and RGB-D datasets confirm
that the proposed multi-object tracker improves the state-of-the-arts
especially in crowded scenes.

Evolution-Preserving Dense Trajectory Descriptors

Yang Wang, Vinh Tran, Minh Hoai
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Recently Trajectory-pooled Deep-learning Descriptors were shown to achieve
state-of-the-art human action recognition results on a number of datasets. This
paper improves their performance by applying rank pooling to each trajectory,
encoding the temporal evolution of deep learning features computed along the
trajectory. This leads to Evolution-Preserving Trajectory (EPT) descriptors, a
novel type of video descriptor that significantly outperforms Trajectory-pooled
Deep-learning Descriptors. EPT descriptors are defined based on dense
trajectories, and they provide complimentary benefits to video descriptors that
are not based on trajectories. In particular, we show that the combination of
EPT descriptors and VideoDarwin leads to state-of-the-art performance on
Hollywood2 and UCF101 datasets.

End-to-End Interpretation of the French Street Name Signs Dataset

Raymond Smith, Chunhui Gu, Dar-Shyang Lee, Huiyi Hu, Ranjith Unnikrishnan, Julian Ibarz, Sacha Arnoud, Sophia Lin
Comments: Presented at the IWRR workshop at ECCV 2016
Journal-ref: Computer Vision – ECCV 2016 Workshops Volume 9913 of the series
Lecture Notes in Computer Science pp 411-426
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We introduce the French Street Name Signs (FSNS) Dataset consisting of more
than a million images of street name signs cropped from Google Street View
images of France. Each image contains several views of the same street name
sign. Every image has normalized, title case folded ground-truth text as it
would appear on a map. We believe that the FSNS dataset is large and complex
enough to train a deep network of significant complexity to solve the street
name extraction problem “end-to-end” or to explore the design trade-offs
between a single complex engineered network and multiple sub-networks designed
and trained to solve sub-problems. We present such an “end-to-end”
network/graph for Tensor Flow and its results on the FSNS dataset.

On Detecting Adversarial Perturbations

Jan Hendrik Metzen, Tim Genewein, Volker Fischer, Bastian Bischoff
Comments: Final version for ICLR2017 (see this https URL&noteId=SJzCSf9xg)
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

Machine learning and deep learning in particular has advanced tremendously on
perceptual tasks in recent years. However, it remains vulnerable against
adversarial perturbations of the input that have been crafted specifically to
fool the system while being quasi-imperceptible to a human. In this work, we
propose to augment deep neural networks with a small “detector” subnetwork
which is trained on the binary classification task of distinguishing genuine
data from data containing adversarial perturbations. Our method is orthogonal
to prior work on addressing adversarial perturbations, which has mostly focused
on making the classification network itself more robust. We show empirically
that adversarial perturbations can be detected surprisingly well even though
they are quasi-imperceptible to humans. Moreover, while the detectors have been
trained to detect only a specific adversary, they generalize to similar and
weaker adversaries. In addition, we propose an adversarial attack that fools
both the classifier and the detector and a novel training procedure for the
detector that counteracts this attack.

Artificial Intelligence

T-SKIRT: Online Estimation of Student Proficiency in an Adaptive Learning System

Chaitanya Ekanadham, Yan Karklin
Subjects: Artificial Intelligence (cs.AI)

We develop T-SKIRT: a temporal, structured-knowledge, IRT-based method for
predicting student responses online. By explicitly accounting for student
learning and employing a structured, multidimensional representation of student
proficiencies, the model outperforms standard IRT-based methods on an online
response prediction task when applied to real responses collected from students
interacting with diverse pools of educational content.

Constraint Answer Set Solver EZCSP and Why Integration Schemas Matter

Marcello Balduccini, Yuliya Lierler
Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)
Subjects: Artificial Intelligence (cs.AI)

Researchers in answer set programming and constraint programming have spent
significant efforts in the development of hybrid languages and solving
algorithms combining the strengths of these traditionally separate fields.
These efforts resulted in a new research area: constraint answer set
programming. Constraint answer set programming languages and systems proved to
be successful at providing declarative, yet efficient solutions to problems
involving hybrid reasoning tasks. One of the main contributions of this paper
is the first comprehensive account of the constraint answer set language and
solver EZCSP, a mainstream representative of this research area that has been
used in various successful applications. We also develop an extension of the
transition systems proposed by Nieuwenhuis et al. in 2006 to capture Boolean
satisfiability solvers. We use this extension to describe the EZCSP algorithm
and prove formal claims about it. The design and algorithmic details behind
EZCSP clearly demonstrate that the development of the hybrid systems of this
kind is challenging. Many questions arise when one faces various design choices
in an attempt to maximize system’s benefits. One of the key decisions that a
developer of a hybrid solver makes is settling on a particular integration
schema within its implementation. Thus, another important contribution of this
paper is a thorough case study based on EZCSP, focused on the various
integration schemas that it provides.

Under consideration in Theory and Practice of Logic Programming (TPLP).

DAGER: Deep Age, Gender and Emotion Recognition Using Convolutional Neural Network

Afshin Dehghan, Enrique G. Ortiz, Guang Shu, Syed Zain Masood
Comments: 10 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

This paper describes the details of Sighthound’s fully automated age, gender
and emotion recognition system. The backbone of our system consists of several
deep convolutional neural networks that are not only computationally
inexpensive, but also provide state-of-the-art results on several competitive
benchmarks. To power our novel deep networks, we collected large labeled
datasets through a semi-supervised pipeline to reduce the annotation
effort/time. We tested our system on several public benchmarks and report
outstanding results. Our age, gender and emotion recognition models are
available to developers through the Sighthound Cloud API at
this https URL

On Detecting Adversarial Perturbations

Machine learning and deep learning in particular has advanced tremendously on
perceptual tasks in recent years. However, it remains vulnerable against
adversarial perturbations of the input that have been crafted specifically to
fool the system while being quasi-imperceptible to a human. In this work, we
propose to augment deep neural networks with a small “detector” subnetwork
which is trained on the binary classification task of distinguishing genuine
data from data containing adversarial perturbations. Our method is orthogonal
to prior work on addressing adversarial perturbations, which has mostly focused
on making the classification network itself more robust. We show empirically
that adversarial perturbations can be detected surprisingly well even though
they are quasi-imperceptible to humans. Moreover, while the detectors have been
trained to detect only a specific adversary, they generalize to similar and
weaker adversaries. In addition, we propose an adversarial attack that fools
both the classifier and the detector and a novel training procedure for the
detector that counteracts this attack.

Crossmatching variable objects with the Gaia data

Lorenzo Rimoldini, Krzysztof Nienartowicz, Maria Süveges, Jonathan Charnas, Leanne P. Guy, Grégory Jevardat de Fombelle, Berry Holl, Isabelle Lecoeur-Taïbi, Nami Mowlavi, Diego Ordóñez-Blanco, Laurent Eyer
Comments: 4 pages, 1 figure, in Astronomical Data Analysis Software and Systems XXVI, Astronomical Society of the Pacific Conference Series
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI)

Tens of millions of new variable objects are expected to be identified in
over a billion time series from the Gaia mission. Crossmatching known variable
sources with those from Gaia is crucial to incorporate current knowledge,
understand how these objects appear in the Gaia data, train supervised
classifiers to recognise known classes, and validate the results of the
Variability Processing and Analysis Coordination Unit (CU7) within the Gaia
Data Analysis and Processing Consortium (DPAC). The method employed by CU7 to
crossmatch variables for the first Gaia data release includes a binary
classifier to take into account positional uncertainties, proper motion,
targeted variability signals, and artefacts present in the early calibration of
the Gaia data. Crossmatching with a classifier makes it possible to automate
all those decisions which are typically made during visual inspection. The
classifier can be trained with objects characterized by a variety of attributes
to ensure similarity in multiple dimensions (astrometry, photometry,
time-series features), with no need for a-priori transformations to compare
different photometric bands, or of predictive models of the motion of objects
to compare positions. Other advantages as well as some disadvantages of the
method are discussed. Implementation steps from the training to the assessment
of the crossmatch classifier and selection of results are described.

Data-Intensive Supercomputing in the Cloud: Global Analytics for Satellite Imagery

Michael S. Warren, Samuel W. Skillman, Rick Chartrand, Tim Kelton, Ryan Keisler, David Raleigh, Matthew Turk
Comments: 8 pages, 9 figures. Copyright 2016 IEEE. DataCloud 2016: The Seventh International Workshop on Data-Intensive Computing in the Clouds. In conjunction with SC16. Salt Lake City, Utah
Journal-ref: Proceedings of the 7th International Workshop on Data-Intensive
Computing in the Cloud (DataCloud ’16). IEEE Press, Piscataway, NJ, USA,
24-31, 2016
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)

We present our experiences using cloud computing to support data-intensive
analytics on satellite imagery for commercial applications. Drawing from our
background in high-performance computing, we draw parallels between the early
days of clustered computing systems and the current state of cloud computing
and its potential to disrupt the HPC market. Using our own virtual file system
layer on top of cloud remote object storage, we demonstrate aggregate read
bandwidth of 230 gigabytes per second using 512 Google Compute Engine (GCE)
nodes accessing a USA multi-region standard storage bucket. This figure is
comparable to the best HPC storage systems in existence. We also present
several of our application results, including the identification of field
boundaries in Ukraine, and the generation of a global cloud-free base layer
from Landsat imagery.

Click Through Rate Prediction for Contextual Advertisment Using Linear Regression

Muhammad Junaid Effendi, Syed Abbas Ali
Comments: 8 pages, 13 Figures, 11 Tables
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Learning (cs.LG)

This research presents an innovative and unique way of solving the
advertisement prediction problem which is considered as a learning problem over
the past several years. Online advertising is a multi-billion-dollar industry
and is growing every year with a rapid pace. The goal of this research is to
enhance click through rate of the contextual advertisements using Linear
Regression. In order to address this problem, a new technique propose in this
paper to predict the CTR which will increase the overall revenue of the system
by serving the advertisements more suitable to the viewers with the help of
feature extraction and displaying the advertisements based on context of the
publishers. The important steps include the data collection, feature
extraction, CTR prediction and advertisement serving. The statistical results
obtained from the dynamically used technique show an efficient outcome by
fitting the data close to perfection for the LR technique using optimized
feature selection.

Computation and Language

On the Relevance of Auditory-Based Gabor Features for Deep Learning in Automatic Speech Recognition

Angel Mario Castro Martinez, Sri Harish Mallidi, Bernd T. Meyer
Comments: accepted to Computer Speech & Language
Subjects: Computation and Language (cs.CL)

Previous studies support the idea of merging auditory-based Gabor features
with deep learning architectures to achieve robust automatic speech
recognition, however, the cause behind the gain of such combination is still
unknown. We believe these representations provide the deep learning decoder
with more discriminable cues. Our aim with this paper is to validate this
hypothesis by performing experiments with three different recognition tasks
(Aurora 4, CHiME 2 and CHiME 3) and assess the discriminability of the
information encoded by Gabor filterbank features. Additionally, to identify the
contribution of low, medium and high temporal modulation frequencies subsets of
the Gabor filterbank were used as features (dubbed LTM, MTM and HTM
respectively). With temporal modulation frequencies between 16 and 25 Hz, HTM
consistently outperformed the remaining ones in every condition, highlighting
the robustness of these representations against channel distortions, low
signal-to-noise ratios and acoustically challenging real-life scenarios with
relative improvements from 11 to 56% against a Mel-filterbank-DNN baseline. To
explain the results, a measure of similarity between phoneme classes from DNN
activations is proposed and linked to their acoustic properties. We find this
measure to be consistent with the observed error rates and highlight specific
differences on phoneme level to pinpoint the benefit of the proposed features.

Detection of Slang Words in e-Data using semi-Supervised Learning

Alok Ranjan Pal, Diganta Saha
Comments: 13 pages in International Journal of Artificial Intelligence & Applications (IJAIA), Vol. 4, No. 5, September 2013
Subjects: Computation and Language (cs.CL)

The proposed algorithmic approach deals with finding the sense of a word in
an electronic data. Now a day,in different communication mediums like internet,
mobile services etc. people use few words, which are slang in nature. This
approach detects those abusive words using supervised learning procedure. But
in the real life scenario, the slang words are not used in complete word forms
always. Most of the times, those words are used in different abbreviated forms
like sounds alike forms, taboo morphemes etc. This proposed approach can detect
those abbreviated forms also using semi supervised learning procedure. Using
the synset and concept analysis of the text, the probability of a suspicious
word to be a slang word is also evaluated.

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction

Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault
Comments: To appear in EACL 2017 (short papers)
Subjects: Computation and Language (cs.CL)

We present a new parallel corpus, JHU FLuency-Extended GUG corpus (JFLEG) for
developing and evaluating grammatical error correction (GEC). Unlike other
corpora, it represents a broad range of language proficiency levels and uses
holistic fluency edits to not only correct grammatical errors but also make the
original text more native sounding. We describe the types of corrections made
and benchmark four leading GEC systems on this corpus, identifying specific
areas in which they do well and how they can improve. JFLEG fulfills the need
for a new gold standard to properly assess the current state of GEC.

The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations

Lasha Abzianidze, Johannes Bjerva, Kilian Evang, Hessel Haagsma, Rik van Noord, Pierre Ludmann, Duc-Duy Nguyen, Johan Bos
Comments: To appear at EACL 2017
Subjects: Computation and Language (cs.CL)

The Parallel Meaning Bank is a corpus of translations annotated with shared,
formal meaning representations comprising over 11 million words divided over
four languages (English, German, Italian, and Dutch). Our approach is based on
cross-lingual projection: automatically produced (and manually corrected)
semantic annotations for English sentences are mapped onto their word-aligned
translations, assuming that the translations are meaning-preserving. The
semantic annotation consists of five main steps: (i) segmentation of the text
in sentences and lexical items; (ii) syntactic parsing with Combinatory
Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and
(v) compositional semantic analysis based on Discourse Representation Theory.
These steps are performed using statistical models trained in a semi-supervised
manner. The employed annotation models are all language-neutral. Our first
results are promising.

Distributed, Parallel, and Cluster Computing

Don't cry over spilled records: Memory elasticity of data-parallel applications and its application to cluster scheduling

Calin Iorgulescu, Florin Dinu, Aunn Raza, Wajih Ul Hassan, Willy Zwaenepoel
Comments: 13 pages (11 without references)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Understanding the performance of data-parallel workloads when
resource-constrained has significant practical importance but unfortunately has
received only limited attention. This paper identifies, quantifies and
demonstrates memory elasticity, an intrinsic property of data-parallel tasks.
Memory elasticity allows tasks to run with significantly less memory that they
would ideally want while only paying a moderate performance penalty. For
example, we find that given as little as 10% of ideal memory, PageRank and
NutchIndexing Hadoop reducers become only 1.2x/1.75x and 1.08x slower. We show
that memory elasticity is prevalent in the Hadoop, Spark, Tez and Flink
frameworks. We also show that memory elasticity is predictable in nature by
building simple models for Hadoop and extending them to Tez and Spark.

To demonstrate the potential benefits of leveraging memory elasticity, this
paper further explores its application to cluster scheduling. In this setting,
we observe that the resource vs. time trade-off enabled by memory elasticity
becomes a task queuing time vs task runtime trade-off. Tasks may complete
faster when scheduled with less memory because their waiting time is reduced.
We show that a scheduler can turn this task-level trade-off into improved job
completion time and cluster-wide memory utilization. We have integrated memory
elasticity into Apache YARN. We show gains of up to 60% in average job
completion time on a 50-node Hadoop cluster. Extensive simulations show similar
improvements over a large number of scenarios.

Okapi: Causally Consistent Geo-Replication Made Faster, Cheaper and More Available

Diego Didona, Kristina Spirovska, Willy Zwaenepoel
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Okapi is a new causally consistent geo-replicated key- value store. Okapi
leverages two key design choices to achieve high performance. First, it relies
on hybrid logical/physical clocks to achieve low latency even in the presence
of clock skew. Second, Okapi achieves higher resource efficiency and better
availability, at the expense of a slight increase in update visibility latency.
To this end, Okapi implements a new stabilization protocol that uses a
combination of vector and scalar clocks and makes a remote update visible when
its delivery has been acknowledged by every data center. We evaluate Okapi with
different workloads on Amazon AWS, using three geographically distributed
regions and 96 nodes. We compare Okapi with two recent approaches to causal
consistency, Cure and GentleRain. We show that Okapi delivers up to two orders
of magnitude better performance than GentleRain and that Okapi achieves up to
3.5x lower latency and a 60% reduction of the meta-data overhead with respect
to Cure.

Bizur: A Key-value Consensus Algorithm for Scalable File-systems

Ezra N. Hoch, Yaniv Ben-Yehuda, Noam Lewis, Avi Vigder
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Databases (cs.DB)

Bizur is a consensus algorithm exposing a key-value interface. It is used by
a distributed file-system that scales to 100s of servers, delivering millions
of IOPS, both data and metadata, with consistent low-latency.

Bizur is aimed for services that require strongly consistent state, but do
not require a distributed log; for example, a distributed lock manager or a
distributed service locator. By avoiding a distributed log scheme, Bizur
outperforms distributed log based consensus algorithms, producing more IOPS and
guaranteeing lower latencies during normal operation and especially during
failures.

Paxos-like algorithms (e.g., Zab and Raft) which are used by existing
distributed file-systems, can have artificial contention points due to their
dependence on a distributed log. The distributed log is needed when replicating
a general service, but when the desired service is key-value based, the
contention points created by the distributed log can be avoided.

Bizur does exactly that, by reaching consensus independently on independent
keys. This independence allows Bizur to handle failures more efficiently and to
scale much better than other consensus algorithms, allowing the file-system
that utilizes Bizur to scale with it.

Better Process Mapping and Sparse Quadratic Assignment

Christian Schulz, Jesper Larsson Träff
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Mathematical Software (cs.MS)

Communication and topology aware process mapping is a powerful approach to
reduce communication time in parallel applications with known communication
patterns on large, distributed memory systems. We address the problem as a
quadratic assignment problem (QAP), and present algorithms to construct initial
mappings of processes to processors as well as fast local search algorithms to
further improve the mappings. By exploiting assumptions that typically hold for
applications and modern supercomputer systems such as sparse communication
patterns and hierarchically organized communication systems, we arrive at
significantly more powerful algorithms for these special QAPs. Our multilevel
construction algorithms employ recently developed, perfectly balanced graph
partitioning techniques and excessively exploit the given communication system
hierarchy. We present improvements to a local search algorithm of Brandfass et
al., and decrease the running time by reducing the time needed to perform swaps
in the assignment as well as by carefully constraining local search
neighborhoods. Experiments indicate that our algorithms not only dramatically
speed up local search, but due to the multilevel approach also find much better
solutions~in~practice.

Scheduling Algorithms for Asymmetric Multi-core Processors

Alan David
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Growing power dissipation due to high performance requirement of processor
suggests multicore processor technology, which has become the technology for
present and next decade. Research advocates asymmetric multi-core processor
system for better utilization of chip real state. However, asymmetric multi
core architecture poses a new challenge to operating system scheduler, which
traditionally assumes homogeneous hardware. So, scheduling threads to core has
become a major issue to operating system kernel. In this paper, proposed
scheduling algorithms for asymmetric multicore processors have been
categorized. This paper explores some representative algorithms of these
classes to get an overview of scheduling algorithms for asymmetric multicore
system.

Occupy the Cloud: Distributed Computing for the 99%

Eric Jonas, Shivaram Venkataraman, Ion Stoica, Benjamin Recht
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Distributed computing remains inaccessible to a large number of users, in
spite of many open source platforms and extensive commercial offerings. While
distributed computation frameworks have moved beyond a simple map-reduce model,
many users are still left to struggle with complex cluster management and
configuration tools, even for running simple embarrassingly parallel jobs. We
argue that stateless functions represent a viable platform for these users,
eliminating cluster management overhead, fulfilling the promise of elasticity.
Furthermore, using our prototype implementation, PyWren, we show that this
model is general enough to implement a number of distributed computing models,
such as BSP, efficiently. Extrapolating from recent trends in network bandwidth
and the advent of disaggregated storage, we suggest that stateless functions
are a natural fit for data processing in future computing environments.

Data-Intensive Supercomputing in the Cloud: Global Analytics for Satellite Imagery

We present our experiences using cloud computing to support data-intensive
analytics on satellite imagery for commercial applications. Drawing from our
background in high-performance computing, we draw parallels between the early
days of clustered computing systems and the current state of cloud computing
and its potential to disrupt the HPC market. Using our own virtual file system
layer on top of cloud remote object storage, we demonstrate aggregate read
bandwidth of 230 gigabytes per second using 512 Google Compute Engine (GCE)
nodes accessing a USA multi-region standard storage bucket. This figure is
comparable to the best HPC storage systems in existence. We also present
several of our application results, including the identification of field
boundaries in Ukraine, and the generation of a global cloud-free base layer
from Landsat imagery.

LAMMPS' PPPM Long-Range Solver for the Second Generation Xeon Phi

William McDoniel (1), Markus Höhnerbach (1), Rodrigo Canales (1), Ahmed E. Ismail (2), Paolo Bientinesi (2) ((1) RWTH Aachen University, (2) West Virginia University)
Comments: 18 pages, 8 figures, submitted to ISC High Performance 2017
Subjects: Computational Engineering, Finance, and Science (cs.CE); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)

Molecular Dynamics is an important tool for computational biologists,
chemists, and materials scientists, consuming a sizable amount of
supercomputing resources. Many of the investigated systems contain charged
particles, which can only be simulated accurately using a long-range solver,
such as PPPM. We extend the popular LAMMPS molecular dynamics code with an
implementation of PPPM particularly suitable for the second generation Intel
Xeon Phi. Our main target is the optimization of computational kernels by means
of vectorization, and we observe speedups in these kernels of up to 12x. These
improvements carry over to LAMMPS users, with overall speedups ranging between
2-3x, without requiring users to retune input parameters. Furthermore, our
optimizations make it easier for users to determine optimal input parameters
for attaining top performance.

Learning

Exploring loss function topology with cyclical learning rates

Leslie N. Smith, Nicholay Topin
Comments: Submitted as an ICLR 2017 Workshop paper
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

We present observations and discussion of previously unreported phenomena
discovered while training residual networks. The goal of this work is to better
understand the nature of neural networks through the examination of these new
empirical results. These behaviors were identified through the application of
Cyclical Learning Rates (CLR) and linear network interpolation. Among these
behaviors are counterintuitive increases and decreases in training loss and
instances of rapid training. For example, we demonstrate how CLR can produce
greater testing accuracy than traditional training despite using large learning
rates. Files to replicate these results are available at
this https URL

Mutual Kernel Matrix Completion

Tsuyoshi Kato, Rachelle Rivero
Comments: 10 pages, 4 figures
Subjects: Learning (cs.LG); Numerical Analysis (cs.NA); Machine Learning (stat.ML)

With the huge influx of various data nowadays, extracting knowledge from them
has become an interesting but tedious task among data scientists, particularly
when the data come in heterogeneous form and have missing information. Many
data completion techniques had been introduced, especially in the advent of
kernel methods. However, among the many data completion techniques available in
the literature, studies about mutually completing several incomplete kernel
matrices have not been given much attention yet. In this paper, we present a
new method, called Mutual Kernel Matrix Completion (MKMC) algorithm, that
tackles this problem of mutually inferring the missing entries of multiple
kernel matrices by combining the notions of data fusion and kernel matrix
completion, applied on biological data sets to be used for classification task.
We first introduced an objective function that will be minimized by exploiting
the EM algorithm, which in turn results to an estimate of the missing entries
of the kernel matrices involved. The completed kernel matrices are then
combined to produce a model matrix that can be used to further improve the
obtained estimates. An interesting result of our study is that the E-step and
the M-step are given in closed form, which makes our algorithm efficient in
terms of time and memory. After completion, the (completed) kernel matrices are
then used to train an SVM classifier to test how well the relationships among
the entries are preserved. Our empirical results show that the proposed
algorithm bested the traditional completion techniques in preserving the
relationships among the data points, and in accurately recovering the missing
kernel matrix entries. By far, MKMC offers a promising solution to the problem
of mutual estimation of a number of relevant incomplete kernel matrices.

Is a Data-Driven Approach still Better than Random Choice with Naive Bayes classifiers?

Piotr Szymański, Tomasz Kajdanowicz
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

We study the performance of data-driven, a priori and random approaches to
label space partitioning for multi-label classification with a Gaussian Naive
Bayes classifier. Experiments were performed on 12 benchmark data sets and
evaluated on 5 established measures of classification quality: micro and macro
averaged F1 score, Subset Accuracy and Hamming loss. Data-driven methods are
significantly better than an average run of the random baseline. In case of F1
scores and Subset Accuracy – data driven approaches were more likely to perform
better than random approaches than otherwise in the worst case. There always
exists a method that performs better than a priori methods in the worst case.
The advantage of data-driven methods against a priori methods with a weak
classifier is lesser than when tree classifiers are used.

On Detecting Adversarial Perturbations

Machine learning and deep learning in particular has advanced tremendously on
perceptual tasks in recent years. However, it remains vulnerable against
adversarial perturbations of the input that have been crafted specifically to
fool the system while being quasi-imperceptible to a human. In this work, we
propose to augment deep neural networks with a small “detector” subnetwork
which is trained on the binary classification task of distinguishing genuine
data from data containing adversarial perturbations. Our method is orthogonal
to prior work on addressing adversarial perturbations, which has mostly focused
on making the classification network itself more robust. We show empirically
that adversarial perturbations can be detected surprisingly well even though
they are quasi-imperceptible to humans. Moreover, while the detectors have been
trained to detect only a specific adversary, they generalize to similar and
weaker adversaries. In addition, we propose an adversarial attack that fools
both the classifier and the detector and a novel training procedure for the
detector that counteracts this attack.

Gaussian-Dirichlet Posterior Dominance in Sequential Learning

Ian Osband, Benjamin Van Roy
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Probability (math.PR)

We consider the problem of sequential learning from categorical observations
bounded in [0,1]. We establish an ordering between the Dirichlet posterior over
categorical outcomes and a Gaussian posterior under observations with N(0,1)
noise. We establish that, conditioned upon identical data with at least two
observations, the posterior mean of the categorical distribution will always
second-order stochastically dominate the posterior mean of the Gaussian
distribution. These results provide a useful tool for the analysis of
sequential learning under categorical outcomes.

Practical Learning of Predictive State Representations

Carlton Downey, Ahmed Hefny, Geoffrey Gordon
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

Over the past decade there has been considerable interest in spectral
algorithms for learning Predictive State Representations (PSRs). Spectral
algorithms have appealing theoretical guarantees; however, the resulting models
do not always perform well on inference tasks in practice. One reason for this
behavior is the mismatch between the intended task (accurate filtering or
prediction) and the loss function being optimized by the algorithm (estimation
error in model parameters).

A natural idea is to improve performance by refining PSRs using an algorithm
such as EM. Unfortunately it is not obvious how to apply apply an EM style
algorithm in the context of PSRs as the Log Likelihood is not well defined for
all PSRs. We show that it is possible to overcome this problem using ideas from
Predictive State Inference Machines.

We combine spectral algorithms for PSRs as a consistent and efficient
initialization with PSIM-style updates to refine the resulting model
parameters. By combining these two ideas we develop Inference Gradients, a
simple, fast, and robust method for practical learning of PSRs. Inference
Gradients performs gradient descent in the PSR parameter space to optimize an
inference-based loss function like PSIM. Because Inference Gradients uses a
spectral initialization we get the same consistency benefits as PSRs. We show
that Inference Gradients outperforms both PSRs and PSIMs on real and synthetic
data sets.

Soft Weight-Sharing for Neural Network Compression

Karen Ullrich, Edward Meeds, Max Welling
Comments: ICLR2017
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

The success of deep learning in numerous application domains created the de-
sire to run and train them on mobile devices. This however, conflicts with
their computationally, memory and energy intense nature, leading to a growing
interest in compression. Recent work by Han et al. (2015a) propose a pipeline
that involves retraining, pruning and quantization of neural network weights,
obtaining state-of-the-art compression rates. In this paper, we show that
competitive compression rates can be achieved by using a version of soft
weight-sharing (Nowlan & Hinton, 1992). Our method achieves both quantization
and pruning in one simple (re-)training procedure. This point of view also
exposes the relation between compression and the minimum description length
(MDL) principle.

Information Theory

BranchHull: Convex bilinear inversion from the entrywise product of signals with known signs

Alireza Aghasi, Ali Ahmed, Paul Hand
Subjects: Information Theory (cs.IT); Optimization and Control (math.OC)

We consider the bilinear inverse problem of recovering two vectors, (x) and
(w), in (mathbb{R}^L) from their entrywise product. For the case where the
vectors have known signs and belong to known subspaces, we introduce the convex
program BranchHull, which is posed in the natural parameter space and does not
require an approximate solution or initialization in order to be stated or
solved. Under the structural assumptions that (x) and (w) are the members of
known (K) and (N) dimensional random subspaces, we prove that BranchHull
recovers (x) and (w) up to the inherent scaling ambiguity with high probability
whenever (L gtrsim K+N). This program is motivated by applications in blind
deconvolution and self-calibration.

Generalized Degrees-of-Freedom of the 2-User Case MISO Broadcast Channel with Distributed CSIT

Antonio Bazco, Paul de Kerret, David Gesbert, Nicolas Gresset
Subjects: Information Theory (cs.IT)

This work analyses the Generalized Degrees-of-Freedom (GDoF) of the 2-User
Multiple-Input Single-Output (MISO) Broadcast Channel (BC) in the so-called
Distributed CSIT regime, with application to decentralized wireless networks.
This regime differs from the classical limited CSIT one in that the CSIT is not
just noisy but also imperfectly shared across the transmitters (TXs). Hence,
each TX precodes data on the basis of local CSIT and statistical quality
information at other TXs. We derive the GDoF result and obtain the surprising
outcome that by specific accounting of the pathloss information, it becomes
possible for the decentralized precoded network to reach the same performance
as a genie-aided centralized network where the central node has obtained the
estimates of both TXs. The key ingredient in the scheme is the so-called
Active-Passive Zero-Forcing (AP-ZF) precoding, which lets the precoder design
adapt optimally with respect to different local CSIT qualities available at
different TXs.

Layered Coding for Energy Harvesting Communication Without CSIT

Rajshekhar Vishweshwar Bhat, Mehul Motani, Teng Joon Lim
Subjects: Information Theory (cs.IT)

Due to stringent constraints on resources, it may be infeasible to acquire
the current channel state information at the transmitter in energy harvesting
communication systems. In this paper, we optimize an energy harvesting
transmitter, communicating over a slow fading channel, using layered coding.
The transmitter has access to the channel statistics, but does not know the
exact channel state. In layered coding, the codewords are first designed for
each of the channel states at different rates, and then the codewords are
either time-multiplexed or superimposed before the transmission, leading to two
transmission strategies. The receiver then decodes the information adaptively
based on the realized channel state. The transmitter is equipped with a
finite-capacity battery having non-zero internal resistance. In each of the
transmission strategies, we first formulate and study an average rate
maximization problem with non-causal knowledge of the harvested power
variations. Further, assuming statistical knowledge and causal information of
the harvested power variations, we propose a sub-optimal algorithm, and compare
with the stochastic dynamic programming based solution and a greedy policy.

Improper Signaling for Virtual Full-Duplex Relay Systems

Mohamed Gaafar, Osama Amin, Rafael F. Schaefer, Mohamed-Slim Alouini
Comments: accepted in 21st International ITG Workshop on Smart Antennas, Berlin 03/2017
Subjects: Information Theory (cs.IT)

Virtual full-duplex (VFD) is a powerful solution to compensate the rate loss
of half-duplex relaying without the need to full-duplex capable nodes.
Inter-relay interference (IRI) challenges the operation of VFD relaying
systems. Recently, improper signaling is employed at both relays of the VFD to
mitigate the IRI by imposing the same signal characteristics for both relays.
To further boost the achievable rate performance, asymmetric time sharing VFD
relaying system is adopted with different improper signals at the half-duplex
relays. The joint tuning of the three design parameters improves the achievable
rate performance at different ranges of IRI and different relays locations.
Extensive simulation results are presented and analyzed to show the achievable
rate gain of the proposed system and understand the system behavior.

EGC Reception for FSO Systems Under Mixture-Gamma Fading Channels and Pointing Errors

Nikolaos I. Miridakis, Theodoros A. Tsiftsis
Subjects: Information Theory (cs.IT)

The performance of equal gain combining reception at free-space optical
communication systems is analytically studied and evaluated. We consider the
case when the total received signal undergoes independent and not necessarily
identically distributed channel fading, modeled by the versatile mixture-Gamma
distribution. Also, the misalignment-induced fading due to the presence of
pointing errors is jointly considered in the enclosed analysis. New closed-form
expressions in terms of finite sum series of the Meijer’s (G)-function are
derived regarding some key performance metrics of the considered system;
namely, the scintillation index, outage probability, and average bit-error
rate. Based on these results, some useful outcomes are manifested, such as the
system diversity order, the crucial role of diversity branches, and the impact
of composite channel fading/pointing error effect onto the system performance.

Circular Buffer Rate-Matched Polar Codes

Mostafa El-Khamy, Hsien-Ping Lin, Jungwon Lee, Inyup Kang
Subjects: Information Theory (cs.IT)

A practical rate-matching system for constructing rate-compatible polar codes
is proposed. The proposed polar code circular buffer rate-matching is suitable
for transmissions on communication channels that support hybrid automatic
repeat request (HARQ) communications, as well as for flexible resource-element
rate-matching on single transmission channels. Our proposed circular buffer
rate matching scheme also incorporates a bit-mapping scheme for transmission on
bit-interleaved coded modulation (BICM) channels using higher order
modulations. An interleaver is derived from a puncturing order obtained with a
low complexity progressive puncturing search algorithm on a base code of short
length, and has the flexibility to achieve any desired rate at the desired code
length, through puncturing or repetition. The rate-matching scheme is implied
by a two-stage polarization, for transmission at any desired code length, code
rate, and modulation order, and is shown to achieve the symmetric capacity of
BICM channels. Numerical results on AWGN and fast fading channels show that the
rate-matched polar codes have a competitive performance when compared to the
spatially-coupled quasi-cyclic LDPC codes or LTE turbo codes, while having
similar rate-dematching storage and computational complexities.

Spectral Efficiency Analysis for Spatial Modulation in Massive MIMO Uplink over Dispersive Channels

Yue Sun, Jintao Wang, Longzhuang He, Jian Song
Comments: 6 pages, 8 figures, has been accepted by IEEE ICC 2017
Subjects: Information Theory (cs.IT)

Flat-fading channel models are usually invoked for analyzing the performance
of massive spatial modulation multiple-input multiple-output (SM-MIMO) systems.
However, in the context of broadband SM transmission, the severe
inter-symbol-interference (ISI) caused by the frequency-selective fading
channels can not be ignored, which leads to very detrimental effects on the
achievable system performance, especially for single-carrier SM (SC-SM)
transmission schemes. To the best of the author’s knowledge, none of the
previous researchers have been able to provide a thorough analysis on the
achievable spectral efficiency (SE) of the massive SC-SM MIMO uplink
transmission. In this context, the uplink SE of single-cell massive SC-SM MIMO
system is analyzed, and a tight closed-form lower bound is proposed to quantify
the SE when the base station (BS) uses maximum ratio (MR) combining for
multi-user detection. The impacts of imperfect channel estimation and transmit
correlation are all considered. Monte Carlo simulations are performed to verify
the tightness of our proposed SE lower bound. Both the theoretical analysis and
simulation results show that the SE of uplink single-cell massive SC-SM MIMO
system has the potential to outperform the uplink SE achieved by single-antenna
UEs.

Localization in the Internet of Things Network: A Low-Rank Matrix Completion Approach

Luong Trung Nguyen, Sangtae Kim, Byonghyo Shim
Subjects: Information Theory (cs.IT)

Location awareness, providing ability to identify the location of sensor,
machine, vehicle, and wearable device, is a rapidly growing trend of
hyper-connected society and one of key ingredients for internet of things
(IoT). In order to make a proper reaction to the collected information from
devices, location information of things should be available at the data center.
One challenge for the massive IoT networks is to identify the location map of
whole sensor nodes from partially observed distance information. This is
especially important for massive sensor networks, relay-based and hierarchical
networks, and vehicular to everything (V2X) networks. The primary goal of this
paper is to propose an algorithm to reconstruct the Euclidean distance matrix
(and eventually the location map) from partially observed distance information.
By casting the low-rank matrix completion problem into the unconstrained
minimization problem in Riemannian manifold in which a notion of
differentiability can be defined, we are able to solve the low-rank matrix
completion problem efficiently using a modified conjugate gradient algorithm.
From the analysis and numerical experiments, we show that the proposed method,
termed localization in Riemannian manifold using conjugate gradient (LRM-CG),
is effective in recovering the Euclidean distance matrix for both noiseless and
noisy environments.

On the Courtade-Kumar conjecture for certain classes of Boolean functions

Septimia Sarbu
Comments: submitted to 2017 IEEE International Symposium on Information Theory (ISIT 2017); this article is a summarized version of my previous arXiv manuscript arXiv:1701.05014 , which was rejected by the IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)

We prove the Courtade-Kumar conjecture, for certain classes of
(n)-dimensional Boolean functions, (forall ngeq 2) and for all values of the
error probability of the binary symmetric channel, (forall 0 leq p leq
frac{1}{2}). Let (mathbf{X}=[X_1…X_n]) be a vector of independent and
identically distributed Bernoulli((frac{1}{2})) random variables, which are
the input to a memoryless binary symmetric channel, with the error probability
in the interval (0 leq p leq frac{1}{2}), and (mathbf{Y}=[Y_1…Y_n]) the
corresponding output. Let (f:{0,1}^n
ightarrow {0,1}) be an
(n)-dimensional Boolean function. Then, the Courtade-Kumar conjecture states
that the mutual information (operatorname{MI}(f(mathbf{X}),mathbf{Y}) leq
1-operatorname{H}(p)), where (operatorname{H}(p)) is the binary entropy
function.

Bayesian Compressive Sensing Approaches for Direction of Arrival Estimation with Mutual Coupling Effects

Matthew Hawes, Lyudmila Mihaylova, François Septier, Simon Godsill
Comments: This paper is published in IEEE Transaction on Antenna and Propagation. If citing this work please use the information for the published version
Subjects: Information Theory (cs.IT)

The problem of estimating the dynamic direction of arrival of far field
signals impinging on a uniform linear array, with mutual coupling effects, is
addressed. This work proposes two novel approaches able to provide accurate
solutions, including at the endfire regions of the array. Firstly, a Bayesian
compressive sensing Kalman filter is developed, which accounts for the
predicted estimated signals rather than using the traditional sparse prior. The
posterior probability density function of the received source signals and the
expression for the related marginal likelihood function are derived
theoretically. Next, a Gibbs sampling based approach with indicator variables
in the sparsity prior is developed. This allows sparsity to be explicitly
enforced in different ways, including when an angle is too far from the
previous estimate. The proposed approaches are validated and evaluated over
different test scenarios and compared to the traditional relevance vector
machine based method. An improved accuracy in terms of average root mean square
error values is achieved (up to 73.39% for the modified relevance vector
machine based approach and 86.36% for the Gibbs sampling based approach). The
proposed approaches prove to be particularly useful for direction of arrival
estimation when the angle of arrival moves into the endfire region of the
array.

Prospect Theory for Enhanced Cyber-Physical Security of Drone Delivery Systems: A Network Interdiction Game

Anibal Sanjab, Walid Saad, Tamer Başar
Comments: 2017 IEEE International Conference on Communications – Communication and Information Systems Security Symposium
Subjects: Computer Science and Game Theory (cs.GT); Information Theory (cs.IT)

The use of unmanned aerial vehicles (UAVs) as delivery systems of online
goods is rapidly becoming a global norm, as corroborated by Amazon’s “Prime
Air” and Google’s “Project Wing” projects. However, the real-world deployment
of such drone delivery systems faces many cyber-physical security challenges.
In this paper, a novel mathematical framework for analyzing and enhancing the
security of drone delivery systems is introduced. In this regard, a zero-sum
network interdiction game is formulated between a vendor, operating a drone
delivery system, and a malicious attacker. In this game, the vendor seeks to
find the optimal path that its UAV should follow, to deliver a purchase from
the vendor’s warehouse to a customer location, to minimize the delivery time.
Meanwhile, an attacker seeks to choose an optimal location to interdict the
potential paths of the UAVs, so as to inflict cyber or physical damage to it,
thus, maximizing its delivery time. First, the Nash equilibrium point of this
game is characterized. Then, to capture the subjective behavior of both the
vendor and attacker, new notions from prospect theory are incorporated into the
game. These notions allow capturing the vendor’s and attacker’s i) subjective
perception of attack success probabilities, and ii) their disparate subjective
valuations of the achieved delivery times relative to a certain target delivery
time. Simulation results have shown that the subjective decision making of the
vendor and attacker leads to adopting risky path selection strategies which
inflict delays to the delivery, thus, yielding unexpected delivery times which
surpass the target delivery time set by the vendor.

Structure-Based Subspace Method for Multi-Channel Blind System Identification

Qadri Mayyala, Karim Abed-Meraim, Azzedine Zerguine
Comments: 5 pages, Submitted to IEEE Signal Processing Letters, January 2017
Subjects: Applications (stat.AP); Information Theory (cs.IT)

In this work, a novel subspace-based method for blind identification of
multichannel finite impulse response (FIR) systems is presented. Here, we
exploit directly the impeded Toeplitz channel structure in the signal linear
model to build a quadratic form whose minimization leads to the desired channel
estimation up to a scalar factor. This method can be extended to estimate any
predefined linear structure, e.g. Hankel, that is usually encountered in linear
systems. Simulation findings are provided to highlight the appealing advantages
of the new structure-based subspace (SSS) method over the standard subspace
(SS) method in certain adverse identification scenarii.