Xiangang Li, Xihong Wu
Comments: Published in INTERSPEECH 2015, September 6-10, 2015, Dresden, Germany
Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been
shown to give state-of-the-art performance on many speech recognition tasks, as
they are able to provide a learned, dynamically changing contextual window over
the entire sequence history. On the other hand, convolutional neural networks
(CNNs) have brought significant improvements to deep feed-forward neural
networks (FFNNs), as they are able to better reduce spectral variation in the
input signal. In this paper, a network architecture called the convolutional
recurrent neural network (CRNN) is proposed by combining the CNN and LSTM RNN.
In the proposed CRNNs, each speech frame, without adjacent context frames, is
organized as a number of local feature patches along the frequency axis, and
then an LSTM network is applied to each feature patch along the time axis. We
train and compare FFNNs, LSTM RNNs and the proposed LSTM CRNNs across a
number of configurations. Experimental results show that the LSTM CRNNs can
exceed state-of-the-art speech recognition performance.
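The patch organization described in this abstract can be illustrated with a small sketch. It only shows the data layout: one frame is cut into local patches along the frequency axis, and each patch position then forms its own sequence over time that a recurrent network could consume. The function names, the patch size and the stride are illustrative assumptions, not values from the paper, and no LSTM is implemented here.

```python
def frequency_patches(frame, patch_size, stride):
    """Split one frame (a list of frequency-bin values) into local patches."""
    if patch_size <= 0 or stride <= 0:
        raise ValueError("patch_size and stride must be positive")
    last_start = len(frame) - patch_size
    return [frame[s:s + patch_size] for s in range(0, last_start + 1, stride)]


def patch_sequences(frames, patch_size, stride):
    """Regroup an utterance so each patch position has its own time sequence."""
    per_frame = [frequency_patches(f, patch_size, stride) for f in frames]
    # zip(*...) transposes the layout from [time][patch] to [patch][time].
    return [list(seq) for seq in zip(*per_frame)]


# Toy utterance (assumed data): 3 frames with 6 frequency bins each.
utterance = [[t * 10 + k for k in range(6)] for t in range(3)]
sequences = patch_sequences(utterance, patch_size=4, stride=2)
```

Each entry of `sequences` is the input a per-patch recurrent model would see: the same frequency band followed across all frames, with no adjacent context frames stacked in.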
Manan Shah, Christopher Rubadue, David Suster, Dayong Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Current analysis of tumor proliferation, the most salient prognostic
biomarker for invasive breast cancer, is limited to subjective mitosis counting
by pathologists in localized regions of tissue images. This study presents the
first data-driven integrative approach to characterize the severity of tumor
growth and spread on a categorical and molecular level, utilizing multiple
biologically salient deep learning classifiers to develop a comprehensive
prognostic model. Our approach achieves pathologist-level performance on
three-class categorical tumor severity prediction. It additionally pioneers
prediction of molecular expression data from a tissue image, obtaining a
Spearman’s rank correlation coefficient of 0.60 with ex vivo mean calculated
RNA expression. Furthermore, our framework is applied to identify over two
hundred unprecedented biomarkers critical to the accurate assessment of tumor
proliferation, validating our proposed integrative pipeline as the first to
holistically and objectively analyze histopathological images.
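For readers unfamiliar with the metric quoted in this abstract, Spearman's rank correlation is the Pearson correlation computed on ranks rather than raw values. The sketch below is a minimal standard-library version with average ranks for ties; the sample values are invented for illustration and are unrelated to the study's data.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented toy values: predicted scores vs. measured expression levels.
predicted = [0.1, 0.4, 0.35, 0.8]
measured = [1.0, 3.0, 2.0, 9.0]
rho = spearman(predicted, measured)
```

Because only the ordering matters, any monotonic relationship between the two lists gives a coefficient of 1, which is why the metric suits comparisons between image-derived scores and expression measurements on different scales.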
Xianzhi Du, Mostafa El-Khamy, Jungwon Lee, Larry S. Davis
Comments: 11 pages and 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a deep neural network fusion architecture for fast and robust
pedestrian detection. The proposed network fusion architecture allows for
parallel processing of multiple networks for speed. A single shot deep
convolutional network is trained as an object detector to generate all possible
pedestrian candidates of different sizes and occlusions. This network outputs a
large variety of pedestrian candidates to cover the majority of ground-truth
pedestrians while also introducing a large number of false positives. Next,
multiple deep neural networks are used in parallel for further refinement of
these pedestrian candidates. We introduce a soft-rejection based network fusion
method to fuse the soft metrics from all networks together to generate the
final confidence scores. Our method performs better than existing
state-of-the-art methods, especially when detecting small and occluded
pedestrians. Furthermore, we propose a method for integrating a pixel-wise
semantic segmentation network into the network fusion architecture as a
reinforcement to the pedestrian detector. The approach outperforms
state-of-the-art methods on most protocols on the Caltech Pedestrian dataset, with
significant boosts on several protocols. It is also faster than all other
methods.
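The idea of soft rejection can be sketched as follows: instead of discarding a candidate when a refinement network is doubtful, each network only scales the detector's confidence up or down. The abstract does not give the fusion formula, so the scaling rule, the threshold of 0.7 and the floor of 0.1 below are illustrative assumptions, and `soft_rejection_fuse` is a hypothetical name.

```python
def soft_rejection_fuse(detector_score, classifier_probs,
                        accept_threshold=0.7, floor=0.1):
    """Fuse a candidate's detector score with several classifier probabilities.

    Each probability above `accept_threshold` boosts the score and each one
    below it shrinks the score. The factor never drops under `floor`, so no
    single network can veto a candidate outright (assumed rule).
    """
    fused = detector_score
    for prob in classifier_probs:
        fused *= max(prob / accept_threshold, floor)
    return fused


# One candidate scored by the detector and refined by two networks.
boosted = soft_rejection_fuse(0.5, [0.9, 0.95])
damped = soft_rejection_fuse(0.5, [0.9, 0.05])
```

With these assumed constants, a candidate that every network endorses ends with a higher score than the detector gave it, while one doubtful network lowers the score without removing the candidate from the final ranking.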
João P. Oliveira, Ana Bragança, José Bioucas-Dias, Mário Figueiredo, Luís Alcácer, Jorge Morgado, Quirina Ferreira
Comments: 14 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this article, we present a denoising algorithm to improve the
interpretation and quality of scanning tunneling microscopy (STM) images. Given
the high level of self-similarity of STM images, we propose a denoising
algorithm by reformulating the true estimation problem as a sparse regression,
often termed sparse coding. We introduce modifications to the algorithm to cope
with the existence of artifacts, mainly dropouts, which appear in a structured
way as consecutive line segments along the scanning direction. The resulting
algorithm treats the artifacts as missing data, and the estimated values
outperform those of algorithms that replace the outliers by local filtering.
We provide code implementations for both Matlab and Gwyddion.
Adi Perry, Dor Verbin, Nahum Kiryati
Comments: Planned submission to “Pattern Recognition Letters”
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In the absence of pedestrian crossing lights, finding a safe moment to cross
the road is often hazardous and challenging, especially for people with visual
impairments. We present a reliable low-cost solution, an Android device
attached to a traffic sign or lighting pole near the crossing, indicating
whether it is safe to cross the road. The indication can be by sound, display,
vibration, and various communication modalities provided by the Android device.
The integral system camera is aimed at approaching traffic. Optical flow is
computed from the incoming video stream, and projected onto an influx map,
automatically acquired during a brief training period. The crossing safety is
determined based on a 1-dimensional temporal signal derived from the
projection. We implemented the complete system on a Samsung Galaxy K-Zoom
Android smartphone, and obtained real-time operation. The system achieves
promising experimental results, providing pedestrians with sufficiently early
warning of approaching vehicles. The system can serve as a stand-alone safety
device that can be installed where pedestrian crossing lights are ruled out.
Requiring no dedicated infrastructure, it can be powered by a solar panel and
remotely maintained via the cellular network.
Hamed Saghaei
Comments: 5 pages, 3 figures, 2016 1st International Conference on New Research Achievements in Electrical and Computer Engineering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we propose an automatic and mechanized license and number
plate recognition (LNPR) system which can extract the license plate number of
the vehicles passing through a given location using image processing
algorithms. No additional devices such as GPS or radio frequency identification
(RFID) need to be installed for implementing the proposed system. Using special
cameras, the system takes pictures of each passing vehicle and forwards the
image to the computer to be processed by the LPR software. Plate
recognition software uses different algorithms such as localization,
orientation, normalization, segmentation and finally optical character
recognition (OCR). The resulting data is compared with the records in
a database. Experimental results reveal that the presented system successfully
detects and recognizes the vehicle number plate on real images. This system can
also be used for security and traffic control.
Miao Sun, Tony X. Han, Ming-Chang Liu, Ahmad Khodayari-Rostamabad
Comments: International Conference on Pattern Recognition (ICPR) 2016, Oral paper
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Convolutional Neural Networks (CNNs) have demonstrated successful
applications in computer vision, speech recognition, and natural language
processing. For object recognition, CNNs might be limited by their strict label
requirement and an implicit assumption that images are supposed to be
target-object-dominated for optimal solutions. However, the labeling procedure,
necessitating laying out the locations of target objects, is very tedious,
making high-quality large-scale datasets prohibitively expensive. Data
augmentation schemes are widely used when deep networks suffer from insufficient
training data. All the images produced through data augmentation share
the same label, which may be problematic since not all data augmentation
methods are label-preserving. In this paper, we propose a weakly supervised CNN
framework named Multiple Instance Learning Convolutional Neural Networks
(MILCNN) to solve this problem. We apply MILCNN framework to object recognition
and report state-of-the-art performance on three benchmark datasets: CIFAR10,
CIFAR100 and ILSVRC2015 classification dataset.
Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We introduce FaceVR, a novel method for gaze-aware facial reenactment in the
Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm
to perform real-time facial motion capture of an actor who is wearing a
head-mounted display (HMD), as well as a new data-driven approach for eye
tracking from monocular videos. In addition to these face reconstruction
components, FaceVR incorporates photo-realistic re-rendering in real time, thus
allowing artificial modifications of face and eye appearances. For instance, we
can alter facial expressions, change gaze directions, or remove the VR goggles
in realistic re-renderings. In a live setup with a source and a target actor,
we apply these newly-introduced algorithmic components. We assume that the
source actor is wearing a VR device, and we capture his facial expressions and
eye movement in real-time. For the target video, we mimic a similar tracking
process; however, we use the source input to drive the animations of the target
video, thus enabling gaze-aware facial reenactment. To render the modified
target video on a stereo display, we augment our capture and reconstruction
process with stereo data. In the end, FaceVR produces compelling results for a
variety of applications, such as gaze-aware facial reenactment, reenactment in
virtual reality, removal of VR goggles, and re-targeting of somebody’s gaze
direction in a video conferencing call.
Aditya Tatu
Comments: 12 pages, To be sent to a Journal/Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG)
Extracting shape information from object boundaries is a well studied
problem in vision, and has found tremendous use in applications like object
recognition. Conversely, studying the space of shapes represented by curves
satisfying certain constraints is also intriguing. In this paper, we model and
analyze the space of shapes represented by a 3D curve (space curve) formed by
connecting n pieces, each a quarter of a unit circle. Such a space curve is what we
call a Tangle, the name coming from a toy built on the same principle. We
provide two models for the shape space of n-link open and closed tangles, and
we show that tangles are a subset of trigonometric splines of a certain order.
We give algorithms for curve approximation using open/closed tangles, computing
geodesics on these shape spaces, and finding the deformation that takes one
given tangle to another given tangle, i.e., the Log map. The algorithms
provided yield tangles up to a small and acceptable tolerance, as shown by the
results given in the paper.
Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Autonomous driving is a multi-agent setting where the host vehicle must apply
sophisticated negotiation skills with other road users when overtaking, giving
way, merging, taking left and right turns and while pushing ahead in
unstructured urban roadways. Since there are many possible scenarios, manually
tackling all possible cases will likely yield an overly simplistic policy.
Moreover, one must balance caution against unexpected behavior of other
drivers/pedestrians with not being too defensive, so that normal
traffic flow is maintained.
In this paper we apply deep reinforcement learning to the problem of forming
long term driving strategies. We note that there are two major challenges that
make autonomous driving different from other robotic tasks. First is the
necessity for ensuring functional safety – something that machine learning has
difficulty with given that performance is optimized at the level of an
expectation over many instances. Second, the Markov Decision Process model
often used in robotics is problematic in our case because of unpredictable
behavior of other agents in this multi-agent scenario. We make three
contributions in our work. First, we show how policy gradient iterations can be
used without Markovian assumptions. Second, we decompose the problem into a
composition of a Policy for Desires (which is to be learned) and trajectory
planning with hard constraints (which is not learned). The goal of Desires is
to enable comfort of driving, while hard constraints guarantee the safety of
driving. Third, we introduce a hierarchical temporal abstraction we call an
“Option Graph” with a gating mechanism that significantly reduces the effective
horizon and thereby reduces the variance of the gradient estimation even
further.
Patrick Blöbaum, Takashi Washio, Shohei Shimizu
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
It is generally difficult to make any statements about the expected
prediction error in a univariate setting without further knowledge about how
the data were generated. Recent work showed that knowledge about the real
underlying causal structure of a data generation process has implications for
various machine learning settings. Assuming additive noise and
independence between the data-generating mechanism and its input, we draw a novel
connection between the intrinsic causal relationship of two variables and the
expected prediction error. We formulate the theorem that the expected error of
the true data generating function as prediction model is generally smaller when
the effect is predicted from its cause and, on the contrary, greater when the
cause is predicted from its effect. The theorem implies an asymmetry in the
error depending on the prediction direction. This is further corroborated with
empirical evaluations in artificial and real-world data sets.
Michael Cook, Mirjam Eladhari, Andy Nealen, Mike Treanor, Eddy Boxerman, Alex Jaffe, Paul Sottosanti, Steve Swink
Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
People enjoy encounters with generative software, but rarely are they
encouraged to interact with, understand or engage with it. In this paper we
define the term ‘PCG-based game’, and explain how this concept follows on from
the idea of an AI-based game. We look at existing examples of games which
foreground their AI, put forward a methodology for designing PCG-based games,
describe some example case study designs for PCG-based games, and describe
lessons learned during this process of sketching and developing ideas.
Alessandro Fontana
Comments: 11 pages, 4 figures
Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI)
Both neurobiological and environmental factors are known to play a role in
the origin of schizophrenia, but no model has been proposed that accounts for
both. This work presents a functional model of schizophrenia that merges
psychodynamic elements with ingredients borrowed from the theory of
psychological traumas, and evidences the interplay of traumatic experiences and
defective mental functions in the pathogenesis of the disorder. Our model
foresees that dissociation is a standard tool used by the mind to protect
itself from emotional pain. In case of repeated traumas, the mind learns to
adopt selective forms of dissociation to avoid pain without losing touch with
external reality. We conjecture that this process is defective in
schizophrenia, where dissociation is either too weak, giving rise to positive
symptoms, or too strong, causing negative symptoms.
Andrea F. Daniele, Mohit Bansal, Matthew R. Walter
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)
Modern robotics applications that involve human-robot interaction require
robots to be able to communicate with humans seamlessly and effectively.
Natural language provides a flexible and efficient medium through which robots
can exchange information with their human partners. Significant advancements
have been made in developing robots capable of interpreting free-form
instructions, but less attention has been devoted to endowing robots with the
ability to generate natural language. We propose a navigational guide model
that enables robots to generate natural language instructions that allow humans
to navigate a priori unknown environments. We first decide which information to
share with the user according to their preferences, using a policy trained from
human demonstrations via inverse reinforcement learning. We then “translate”
this information into a natural language instruction using a neural
sequence-to-sequence model that learns to generate free-form instructions from
natural language corpora. We evaluate our method on a benchmark route
instruction dataset and achieve a BLEU score of 72.18% when compared to
human-generated reference instructions. We additionally conduct navigation
experiments with human participants that demonstrate that our method generates
instructions that people follow as accurately and easily as those produced by
humans.
Yifan Hou, Pan Zhou, Ting Wang, Yuchong Hu, Dapeng Wu
Subjects: Learning (cs.LG); Computers and Society (cs.CY); Information Retrieval (cs.IR)
The Massive Open Online Course (MOOC) has expanded significantly in recent
years. With the spread of MOOCs, the opportunity to study fascinating
courses for free has attracted numerous people of diverse educational
backgrounds all over the world. In the big data era, a key research topic for
MOOCs is how to mine the needed courses from massive course databases in the
cloud for each individual learner accurately and rapidly as the number of
courses increases quickly. In this respect, the key challenge is how to
realize personalized course recommendation as well as to reduce the computing
and storage costs for the tremendous course data. In this paper, we propose a
big data-supported, context-aware online learning-based course recommender
system that could handle the dynamic and infinitely massive datasets, which
recommends courses by using personalized context information and historical
statistics. The context-awareness takes the personal preferences into
consideration, making the recommendation suitable for people with different
backgrounds. Besides, the algorithm achieves sublinear regret,
which means it can gradually recommend the most preferred and best-matched courses
to learners. Unlike other existing algorithms, ours bounds the time complexity
and space complexity linearly. In addition, our devised storage module is
expanded to the distributed-connected clouds, which can handle massive course
storage problems from heterogeneous sources. Our experimental results verify the
superiority of our algorithms when compared with existing works in the big
data setting.
Hussam Hamdan
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
The classic supervised classification algorithms are efficient, but
time-consuming, complicated and not interpretable, which makes it difficult to
analyze their results and limits the possibility of improving them based on real
observations. In this paper, we propose a new and simple classifier to
predict a sentiment label of a short text. This model keeps the capacity of
human interpretability and can be extended to integrate NLP techniques in a
more interpretable way. Our model is based on a correlation metric which
measures the degree of association between a sentiment label and a word. Ten
correlation metrics are proposed and evaluated intrinsically. Then a
classifier based on each metric is proposed, evaluated and compared to the
classic classification algorithms which have proved their performance in many
studies. Our model outperforms these algorithms with several correlation
metrics.
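A correlation-based classifier of the kind this abstract describes can be sketched in a few lines. The abstract does not name its ten metrics, so pointwise mutual information (PMI) is used here purely as an assumed stand-in; the training sentences are invented, and treating unseen word-label pairs as contributing zero is a simplifying assumption.

```python
import math
from collections import Counter


def train_pmi(examples):
    """Score each (word, label) pair by PMI over document-level counts."""
    word_label = Counter()
    word_count = Counter()
    label_count = Counter()
    for text, label in examples:
        label_count[label] += 1
        for word in set(text.lower().split()):
            word_label[(word, label)] += 1
            word_count[word] += 1
    n = len(examples)
    scores = {}
    for (word, label), joint in word_label.items():
        # PMI = log( P(word, label) / (P(word) * P(label)) )
        scores[(word, label)] = math.log(
            joint * n / (word_count[word] * label_count[label]))
    return scores, sorted(label_count)


def predict(text, scores, labels):
    """Pick the label whose summed word associations are highest."""
    totals = {label: 0.0 for label in labels}
    for word in text.lower().split():
        for label in labels:
            # Assumption: pairs never seen in training contribute nothing.
            totals[label] += scores.get((word, label), 0.0)
    return max(labels, key=lambda label: totals[label])


# Invented toy training set.
examples = [
    ("great phone", "positive"),
    ("love this great screen", "positive"),
    ("terrible battery", "negative"),
    ("hate this terrible phone", "negative"),
]
scores, labels = train_pmi(examples)
prediction = predict("great screen", scores, labels)
```

The per-word scores stay inspectable after training, which is the interpretability property the abstract emphasizes: one can read off which words pushed a text toward a label.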
Hussam Hamdan, Patrice Bellot, Frederic Bechet
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)
Term weighting metrics assign weights to terms in order to discriminate the
important terms from the less crucial ones. Due to this characteristic, these
metrics have attracted growing attention in text classification and recently in
sentiment analysis. Using the weights given by such metrics could lead to more
accurate document representation which may improve the performance of the
classification. While previous studies have focused on proposing or comparing
different weighting metrics for two-class document-level sentiment analysis,
this study proposes to analyse the results given by each metric in order to find
out the characteristics of good and bad weighting metrics. Therefore we present
an empirical study of fifteen global supervised weighting metrics with four
local weighting metrics adopted from information retrieval. We also give an
analysis to understand the behavior of each metric by observing and analysing
how each metric distributes the terms, and deduce some characteristics which may
distinguish the good and bad metrics. The evaluation has been done using
Support Vector Machine on three different datasets: Twitter, restaurant and
laptop reviews.
Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Anna Korhonen
Journal-ref: COLING 2016
Subjects: Computation and Language (cs.CL)
In recent years linguistic typology, which classifies the world’s languages
according to their functional and structural properties, has been widely used
to support multilingual NLP. While the growing importance of typological
information in supporting multilingual tasks has been recognised, no systematic
survey of existing typological resources and their use in NLP has been
published. This paper provides such a survey as well as discussion which we
hope will both inform and inspire future work in the area.
Lieke Gelderloos, Grzegorz Chrupała
Comments: Accepted at COLING 2016
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
We present a model of visually-grounded language learning based on stacked
gated recurrent neural networks which learns to predict visual features given
an image description in the form of a sequence of phonemes. The learning task
resembles that faced by human language learners who need to discover both
structure and meaning from noisy and ambiguous data across modalities. We show
that our model indeed learns to predict features of the visual context given
phonetically transcribed image descriptions, and show that it represents
linguistic information in a hierarchy of levels: lower layers in the stack are
comparatively more sensitive to form, whereas higher layers are more sensitive
to meaning.
Barbara Plank
Comments: In COLING 2016
Subjects: Computation and Language (cs.CL)
Keystroke dynamics have been extensively used in psycholinguistic and writing
research to gain insights into cognitive processing. But do keystroke logs
contain actual signal that can be used to learn better natural language
processing models?
We postulate that keystroke dynamics contain information about syntactic
structure that can inform shallow syntactic parsing. To test this hypothesis,
we explore labels derived from keystroke logs as an auxiliary task in a multi-task
bidirectional Long Short-Term Memory (bi-LSTM). Our experiments show promising
results on two shallow syntactic parsing tasks, chunking and CCG supertagging.
Our model is simple, has the advantage that data can come from distinct
sources, and produces models that are significantly better than models trained
on the text annotations alone.
Gábor Gosztolya, Tamás Grósz, László Tóth
Subjects: Computation and Language (cs.CL)
Recently, attempts have been made to remove Gaussian mixture models (GMM)
from the training process of deep neural network-based hidden Markov models
(HMM/DNN). For the GMM-free training of an HMM/DNN hybrid we have to solve two
problems, namely the initial alignment of the frame-level state labels and the
creation of context-dependent states. Although flat-start training via
iteratively realigning and retraining the DNN using a frame-level error
function is viable, it is quite cumbersome. Here, we propose to use a
sequence-discriminative training criterion for flat start. While
sequence-discriminative training is routinely applied only in the final phase
of model training, we show that with proper caution it is also suitable for
getting an alignment of context-independent DNN models. For the construction of
tied states we apply a recently proposed KL-divergence-based state clustering
method, hence our whole training process is GMM-free. In the experimental
evaluation we found that the sequence-discriminative flat start training method
is not only significantly faster than the straightforward approach of iterative
retraining and realignment, but the word error rates attained are slightly
better as well.
Maisa C. Duarte, Pierre Maret
Comments: 6 pages, 1 figure and 2 tables
Subjects: Computation and Language (cs.CL)
We are developing a method to start new instances of NELL in various
languages and thereby develop NELL's multilingualism. We base our method on our
experience with NELL Portuguese and NELL French. This report explains our method
and develops some research perspectives.
Huijia Wu, Jiajun Zhang, Chengqing Zong
Comments: Accepted at COLING 2016
Subjects: Computation and Language (cs.CL)
In this paper, we empirically explore the effects of various kinds of skip
connections in stacked bidirectional LSTMs for sequential tagging. We
investigate three kinds of skip connections connecting to LSTM cells: (a) skip
connections to the gates, (b) skip connections to the internal states and (c)
skip connections to the cell outputs. We present comprehensive experiments
showing that skip connections to cell outputs outperform the remaining two.
Furthermore, we observe that using gated identity functions as skip mappings
works well. Based on these novel skip connections, we successfully train
deep stacked bidirectional LSTM models and obtain state-of-the-art results on
CCG supertagging and comparable results on POS tagging.
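A skip connection to the cell output with a gated identity mapping can be sketched element-wise as below. This is one plausible reading of the abstract, not the authors' exact formulation: the output of the layer's LSTM cell is taken as given, and the lower layer's activation is added to it through a sigmoid gate. All names are hypothetical, and the gate logits would normally be computed by learned parameters that are omitted here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_skip_output(cell_output, lower_input, gate_logits):
    """Add a gated copy of the lower layer's activation to the cell output.

    Computes h[i] = cell_output[i] + sigmoid(gate_logits[i]) * lower_input[i].
    A gate near 0 leaves the cell output untouched; a gate near 1 passes the
    lower-layer activation through unchanged (identity mapping).
    """
    if not (len(cell_output) == len(lower_input) == len(gate_logits)):
        raise ValueError("all three vectors must have the same length")
    return [h + sigmoid(g) * x
            for h, x, g in zip(cell_output, lower_input, gate_logits)]


# Toy vectors (assumed values) for a 2-unit layer.
h = gated_skip_output([0.2, -0.1], [1.0, 1.0], [0.0, -50.0])
```

In the toy call, the first unit's gate is half open and the second is effectively closed, so only the first unit receives a contribution from the layer below.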
Tiancheng Zhao, Ran Zhao, Zhao Meng, Justine Cassell
Comments: Submitted to NIPS Workshop. arXiv admin note: text overlap with arXiv:1608.02977 by other authors
Subjects: Computation and Language (cs.CL)
Social norms are shared rules that govern and facilitate social interaction.
Violating such social norms via teasing and insults may serve to upend power
imbalances or, on the contrary, reinforce solidarity and rapport in
conversation, rapport which is highly situated and context-dependent. In this
work, we investigate the task of automatically identifying the phenomenon of
social norm violation in discourse. Towards this goal, we leverage the power of
recurrent neural networks and multimodal information present in the
interaction, and propose a predictive model to recognize social norm violation.
Using long-term temporal and contextual information, our model achieves an F1
score of 0.705. Implications of our work for developing a socially aware
agent are discussed.
Aaditya Prakash, Sadid A. Hasan, Kathy Lee, Vivek Datla, Ashequl Qadir, Joey Liu, Oladimeji Farri
Comments: COLING 2016
Subjects: Computation and Language (cs.CL)
In this paper, we propose a novel neural approach for paraphrase generation.
Conventional paraphrase generation methods either leverage hand-written rules
and thesauri-based alignments, or use statistical machine learning principles.
To the best of our knowledge, this work is the first to explore deep learning
models for paraphrase generation. Our primary contribution is a stacked
residual LSTM network, where we add residual connections between LSTM layers.
This allows for efficient training of deep LSTMs. We evaluate our model and
other state-of-the-art deep learning models on three different datasets: PPDB,
WikiAnswers and MSCOCO. Evaluation results demonstrate that our model
outperforms sequence-to-sequence, attention-based and bidirectional LSTM
models on BLEU, METEOR, TER and an embedding-based sentence similarity metric.
Andrea F. Daniele, Mohit Bansal, Matthew R. Walter
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)
Modern robotics applications that involve human-robot interaction require
robots to be able to communicate with humans seamlessly and effectively.
Natural language provides a flexible and efficient medium through which robots
can exchange information with their human partners. Significant advancements
have been made in developing robots capable of interpreting free-form
instructions, but less attention has been devoted to endowing robots with the
ability to generate natural language. We propose a navigational guide model
that enables robots to generate natural language instructions that allow humans
to navigate a priori unknown environments. We first decide which information to
share with the user according to their preferences, using a policy trained from
human demonstrations via inverse reinforcement learning. We then “translate”
this information into a natural language instruction using a neural
sequence-to-sequence model that learns to generate free-form instructions from
natural language corpora. We evaluate our method on a benchmark route
instruction dataset and achieve a BLEU score of 72.18% when compared to
human-generated reference instructions. We additionally conduct navigation
experiments with human participants that demonstrate that our method generates
instructions that people follow as accurately and easily as those produced by
humans.
Chairi Kiourt, Dimitris Kalles
Comments: 12 pages, 4 figures, Conference: Workshop Parallel and Distributed Computing for Knowledge Discovery in Data Bases, a workshop of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery, At Porto, Portugal, 2015
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)
This work introduces a novel, modular, layered web based platform for
managing machine learning experiments on grid-based High Performance Computing
infrastructures. The coupling of the communication services offered by the
grid, with an administration layer and conventional web server programming, via
a data synchronization utility, leads to the straightforward development of a
web-based user interface that allows the monitoring and managing of diverse
online distributed computing applications. It also introduces an experiment
generation and monitoring tool particularly suitable for investigating machine
learning in game playing. The platform is demonstrated with experiments for two
different games.
Philipp Födisch, Artsiom Bryksa, Bert Lange, Wolfgang Enghardt, Peter Kaever
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR)
Contemporary field-programmable gate arrays (FPGAs) are predestined for the
application of finite impulse response (FIR) filters. Their embedded digital
signal processing (DSP) blocks for multiply-accumulate operations enable
efficient fixed-point computations, in cases where the filter structure is
accurately mapped to the dedicated hardware architecture. This brief presents a
generic systolic structure for high-order FIR filters, efficiently exploiting
the hardware resources of an FPGA in terms of routability and timing. Although
this seems to be an easily implementable task, the synthesizing tools require
an adaptation of the straightforward digital filter implementation for an
optimal mapping. Using the example of a symmetric FIR filter with 90 taps, we
demonstrate the performance of the proposed structure with FPGAs from Xilinx
and Altera. The implementation utilizes less than 1% of slice logic and runs at
clock frequencies up to 526 MHz. Moreover, an enhancement of the structure
ultimately provides an extended dynamic range for the quantized coefficients
without the costs of additional slice logic.
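The abstract above refers to a symmetric FIR filter. As a rough software illustration only (not the paper's systolic hardware structure), the sketch below shows the standard textbook observation that a symmetric coefficient set lets two input samples share one multiplication: the samples are added first and then multiplied once. The function names `fir_direct` and `fir_symmetric` are hypothetical, and zero initial state is assumed.

```python
def fir_direct(x, h):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], assuming zero initial state."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]
        y.append(acc)
    return y


def fir_symmetric(x, h):
    """Same filter for symmetric h (h[k] == h[taps-1-k]).

    Samples that share a coefficient are added before the multiplication,
    so roughly half as many multiplications are needed per output sample.
    """
    taps = len(h)
    if any(h[k] != h[taps - 1 - k] for k in range(taps)):
        raise ValueError("coefficients must be symmetric")

    def sample(i):
        # Inputs before the start of the signal are treated as zero.
        return x[i] if 0 <= i < len(x) else 0

    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(taps // 2):
            # Pre-add the two samples that use coefficient h[k].
            acc += h[k] * (sample(n - k) + sample(n - (taps - 1 - k)))
        if taps % 2:
            # An odd-length filter has one unpaired middle tap.
            mid = taps // 2
            acc += h[mid] * sample(n - mid)
        y.append(acc)
    return y
```

With integer inputs both functions produce identical outputs, which makes the equivalence easy to check; how the pre-adder maps onto a particular FPGA DSP block is outside the scope of this sketch.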
Yadu N. Babuji, Kyle Chard, Aaron Gerow, Eamon Duede
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Distributed communities of researchers rely increasingly on valuable,
proprietary, or sensitive datasets. Given the growth of such data, especially
in fields new to data-driven, computationally intensive research like the
social sciences and humanities, coupled with what are often strict and complex
data-use agreements, many research communities now require methods that allow
secure, scalable and cost-effective storage and analysis. Here we present CLOUD
KOTTA: a cloud-based data management and analytics framework. CLOUD KOTTA
delivers an end-to-end solution for coordinating secure access to large
datasets, and an execution model that provides both automated infrastructure
scaling and support for executing analytics near to the data. CLOUD KOTTA
implements a fine-grained security model ensuring that only authorized users
may access, analyze, and download protected data. It also implements automated
methods for acquiring and configuring low-cost storage and compute resources as
they are needed. We present the architecture and implementation of CLOUD KOTTA
and demonstrate the advantages it provides in terms of increased performance
and flexibility. We show that CLOUD KOTTA’s elastic provisioning model can
reduce costs by up to 16x when compared with statically provisioned models.
Yadu N. Babuji, Kyle Chard, Aaron Gerow, Eamon Duede
Comments: Forthcoming eScience 2016
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Data-driven research is increasingly ubiquitous and data itself is a defining
asset for researchers, particularly in the computational social sciences and
humanities. Entire careers and research communities are built around valuable,
proprietary or sensitive datasets. However, many existing computation resources
fail to support secure and cost-effective storage of data while also enabling
secure and flexible analysis of the data. To address these needs we present
CLOUD KOTTA, a cloud-based architecture for the secure management and analysis
of social science data. CLOUD KOTTA leverages reliable, secure, and scalable
cloud resources to deliver capabilities to users, and removes the need for
users to manage complicated infrastructure. CLOUD KOTTA implements automated,
cost-aware models for efficiently provisioning tiered storage and automatically
scaled compute resources. CLOUD KOTTA has been used in production for several
months and currently manages approximately 10TB of data and has been used to
process more than 5TB of data with over 75,000 CPU hours. It has been used for
a broad variety of text analysis workflows, matrix factorization, and various
machine learning algorithms, and more broadly, it supports fast, secure and
cost-effective research.
David G. Harris
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC); Probability (math.PR)
Many randomized algorithms can be derandomized efficiently using either the
method of conditional expectations or probability spaces with low (almost-)
independence. A series of papers, beginning with work by Luby (1988) and
continuing with Berger & Rompel (1991) and Chari et al. (1994), showed that
these techniques can be combined to give deterministic parallel algorithms for
combinatorial optimization problems involving sums of $w$-juntas. We improve
these algorithms through derandomized variable partitioning. This reduces the
processor complexity to essentially independent of $w$ while the running time
is reduced from exponential in $w$ to approximately $O(w)$. For example, we
improve the time complexity of an algorithm of Berger & Rompel (1991) for
rainbow hypergraph coloring by a factor of approximately $\log^2 n$ and the
processor complexity by a factor of approximately $m^{\ln 2}$.
As a major application of this, we give an NC algorithm for the Lovász
Local Lemma. Previous NC algorithms, including Moser & Tardos (2010) and
Chandrasekaran et al. (2013), required that (essentially) the bad-events could
span only $O(\log n)$ variables; we relax this to allowing $\text{polylog}(n)$
variables. As two applications of our new algorithm, we give algorithms for
defective vertex coloring and domatic graph partition.
One main sub-problem encountered in these algorithms is to generate a
probability space which can “fool” a given list of $GF(2)$ Fourier characters.
Schulman (1992) gave an NC algorithm for this; we dramatically improve its
efficiency to near-optimal time and processor complexity and code dimension.
This leads to a new algorithm to solve the heavy-codeword problem, introduced
by Naor & Naor (1993), with a near-linear processor complexity $(mn)^{1+o(1)}$;
this improves on the algorithm of Chari et al. (1994) requiring $O(m n^2)$
processors.
Weiran Wang, Honglak Lee, Karen Livescu
Subjects: Learning (cs.LG)
We present deep variational canonical correlation analysis (VCCA), a deep
multi-view learning model that extends the latent variable model interpretation
of linear CCA (Bach and Jordan, 2005) to nonlinear observation models
parameterized by deep neural networks (DNNs). Marginal data likelihood as well
as inference are intractable under this model. We derive a variational lower
bound of the data likelihood by parameterizing the posterior density of the
latent variables with another DNN, and approximate the lower bound via Monte
Carlo sampling. Interestingly, the resulting model resembles that of multi-view
autoencoders (Ngiam et al., 2011), with the key distinction of an additional
sampling procedure at the bottleneck layer. We also propose a variant of VCCA
called VCCA-private which can, in addition to the “common variables” underlying
both views, extract the “private variables” within each view. We demonstrate
that VCCA-private is able to disentangle the shared and private information for
multi-view data without hard supervision.
Yifan Hou, Pan Zhou, Ting Wang, Yuchong Hu, Dapeng Wu
Subjects: Learning (cs.LG); Computers and Society (cs.CY); Information Retrieval (cs.IR)
The Massive Open Online Course (MOOC) has expanded significantly in recent
years. With the spread of MOOCs, the opportunity to study fascinating
courses for free has attracted numerous people of diverse educational
backgrounds all over the world. In the big data era, a key research topic for
MOOCs is how to mine the needed courses in the massive course databases in the cloud
for each individual (course) learner accurately and rapidly as the number of
courses increases quickly. In this respect, the key challenge is how to
realize personalized course recommendation as well as to reduce the computing
and storage costs for the tremendous course data. In this paper, we propose a
big data-supported, context-aware online learning-based course recommender
system that could handle the dynamic and infinitely massive datasets, which
recommends courses by using personalized context information and historical
statistics. The context-awareness takes the personal preferences into
consideration, making the recommendation suitable for people with different
backgrounds. Besides, the algorithm achieves sublinear regret,
which means it can gradually recommend the most preferred and best-matched courses
to learners. Unlike other existing algorithms, ours bounds the time complexity
and space complexity linearly. In addition, our devised storage module is
expanded to the distributed-connected clouds, which can handle massive course
storage problems from heterogeneous sources. Our experimental results verify the
superiority of our algorithms when compared with existing works in the big
data setting.
Kristjan Greenewald, Stephen Kelley, Alfred Hero III
Comments: to appear Allerton 2016. arXiv admin note: substantial text overlap with arXiv:1603.03678
Subjects: Learning (cs.LG)
Recent work in distance metric learning has focused on learning
transformations of data that best align with specified pairwise similarity and
dissimilarity constraints, often supplied by a human observer. The learned
transformations lead to improved retrieval, classification, and clustering
algorithms due to the better adapted distance or similarity measures. Here, we
address the problem of learning these transformations when the underlying
constraint generation process is nonstationary. This nonstationarity can be due
to changes in either the ground-truth clustering used to generate constraints
or changes in the feature subspaces in which the class structure is apparent.
We propose Online Convex Ensemble StrongLy Adaptive Dynamic Learning (OCELAD),
a general adaptive, online approach for learning and tracking optimal metrics
as they change over time that is highly robust to a variety of nonstationary
behaviors in the changing metric. We apply the OCELAD framework to an ensemble
of online learners. Specifically, we create a retro-initialized composite
objective mirror descent (COMID) ensemble (RICE) consisting of a set of
parallel COMID learners with different learning rates, demonstrate RICE-OCELAD
on both real and synthetic data sets and show significant performance
improvements relative to previously proposed batch and online distance metric
learning algorithms.
Shakir Mohamed, Balaji Lakshminarayanan
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Computation (stat.CO)
Generative adversarial networks (GANs) provide an algorithmic framework for
constructing generative models with several appealing properties: they do not
require a likelihood function to be specified, only a generating procedure;
they provide samples that are sharp and compelling; and they allow us to
harness our knowledge of building highly accurate neural network classifiers.
Here, we develop our understanding of GANs with the aim of forming a rich view
of this growing area of machine learning—to build connections to the diverse
set of statistical thinking on this topic, of which much can be gained by a
mutual exchange of ideas. We frame GANs within the wider landscape of
algorithms for learning in implicit generative models–models that only specify
a stochastic procedure with which to generate data–and relate these ideas to
modelling problems in related fields, such as econometrics and approximate
Bayesian computation. We develop likelihood-free inference methods and
highlight hypothesis testing as a principle for learning in implicit generative
models, using which we are able to derive the objective function used by GANs,
and many other related objectives. The testing viewpoint directs our focus to
the general problem of density ratio estimation. There are four approaches for
density ratio estimation, one of which is a solution using classifiers to
distinguish real from generated data. Other approaches such as divergence
minimisation and moment matching have also been explored in the GAN literature,
and we synthesise these views to form an understanding in terms of the
relationships between them and the wider literature, highlighting avenues for
future exploration and cross-pollination.
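The classifier-based route to density ratio estimation mentioned above can be made concrete with a short, standard identity. This is a sketch under stated assumptions rather than the paper's own derivation: the symbols $p^*$ (real data density), $q$ (generated data density), $D$ (the classifier's probability that a sample is real) and $\pi$ (the prior probability of the real class) are notation chosen here for illustration.

```latex
% Label real samples y = 1 and generated samples y = 0, with prior \pi = p(y = 1).
% By Bayes' rule the ideal classifier output is
D(x) = p(y = 1 \mid x) = \frac{\pi\, p^*(x)}{\pi\, p^*(x) + (1 - \pi)\, q(x)} .
% Solving for the density ratio gives
\frac{p^*(x)}{q(x)} = \frac{D(x)}{1 - D(x)} \cdot \frac{1 - \pi}{\pi} ,
% which reduces to D(x) / (1 - D(x)) when the two classes are balanced (\pi = 1/2).
```

In practice $D$ is only an approximation learned from finite samples, so the resulting ratio is an estimate rather than an exact quantity.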
Jason Sakellariou, Francesca Tria, Vittorio Loreto, François Pachet
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
We introduce a Maximum Entropy model able to capture the statistics of
melodies in music. The model can be used to generate new melodies that emulate
the style of the musical corpus which was used to train it. Instead of using
the $n$-body interactions of $(n-1)$-order Markov models, traditionally used in
automatic music generation, we use a $k$-nearest neighbour model with pairwise
interactions only. In that way, we keep the number of parameters low and avoid
over-fitting problems typical of Markov models. We show that long-range musical
phrases don’t need to be explicitly enforced using high-order Markov
interactions, but can instead emerge from multiple, competing, pairwise
interactions. We validate our Maximum Entropy model by contrasting how much the
generated sequences capture the style of the original corpus without
plagiarizing it. To this end we use a data-compression approach to discriminate
the levels of borrowing and innovation featured by the artificial sequences.
The results show that our modelling scheme outperforms both fixed-order and
variable-order Markov models. This shows that, despite being based only on
pairwise interactions, this Maximum Entropy scheme opens the possibility to
generate musically sensible alterations of the original phrases, providing a
way to generate innovation.
Lieke Gelderloos, Grzegorz Chrupała
Comments: Accepted at COLING 2016
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
We present a model of visually-grounded language learning based on stacked
gated recurrent neural networks which learns to predict visual features given
an image description in the form of a sequence of phonemes. The learning task
resembles that faced by human language learners who need to discover both
structure and meaning from noisy and ambiguous data across modalities. We show
that our model indeed learns to predict features of the visual context given
phonetically transcribed image descriptions, and show that it represents
linguistic information in a hierarchy of levels: lower layers in the stack are
comparatively more sensitive to form, whereas higher layers are more sensitive
to meaning.
Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon
Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)
Maximum Inner Product Search (MIPS) is an important task in many machine
learning applications such as the prediction phase of a low-rank matrix
factorization model for a recommender system. There have been some works on how
to perform MIPS in sub-linear time recently. However, most of them do not have
the flexibility to control the trade-off between search efficiency and search
quality. In this paper, we study the MIPS problem with a computational budget.
By carefully studying the problem structure of MIPS, we develop a novel
Greedy-MIPS algorithm, which can handle budgeted MIPS by design. While simple
and intuitive, Greedy-MIPS yields surprisingly superior performance compared to
state-of-the-art approaches. As a specific example, on a candidate set
containing half a million vectors of dimension 200, Greedy-MIPS runs 200x
faster than the naive approach while yielding search results with the top-5
precision greater than 75%.
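For orientation, the "naive approach" that the abstract uses as its baseline can be sketched as an exhaustive scan over all candidates. This is only the brute-force reference, not Greedy-MIPS itself. The function names are hypothetical, and `precision_at_k` reflects one plausible reading of "top-5 precision" (the fraction of the exact top-k that an approximate method recovers), which is an assumption rather than a definition taken from the paper.

```python
import heapq


def naive_mips(query, candidates, k=5):
    """Exhaustive Maximum Inner Product Search.

    Scores every candidate against the query and returns the indices of the
    k largest inner products, best first. Cost is O(n * d) per query.
    """
    scored = (
        (sum(q * c for q, c in zip(query, cand)), idx)
        for idx, cand in enumerate(candidates)
    )
    return [idx for _, idx in heapq.nlargest(k, scored)]


def precision_at_k(approx, exact):
    """Fraction of the exact top-k indices that also appear in the approximate result."""
    return len(set(approx) & set(exact)) / len(exact)


# Usage: four 2-dimensional candidates, query (1, 1).
candidates = [[1, 0], [0, 1], [2, 2], [-1, -1]]
best = naive_mips([1, 1], candidates, k=1)
```

A budgeted method would inspect only part of the candidate set and could then be compared against this exhaustive result with `precision_at_k`.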
Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Autonomous driving is a multi-agent setting where the host vehicle must apply
sophisticated negotiation skills with other road users when overtaking, giving
way, merging, taking left and right turns and while pushing ahead in
unstructured urban roadways. Since there are many possible scenarios, manually
tackling all possible cases will likely yield an overly simplistic policy.
Moreover, one must balance the unexpected behavior of other
drivers/pedestrians against the need not to be too defensive, so that normal
traffic flow is maintained.
In this paper we apply deep reinforcement learning to the problem of forming
long term driving strategies. We note that there are two major challenges that
make autonomous driving different from other robotic tasks. The first is the
necessity for ensuring functional safety – something that machine learning has
difficulty with given that performance is optimized at the level of an
expectation over many instances. Second, the Markov Decision Process model
often used in robotics is problematic in our case because of unpredictable
behavior of other agents in this multi-agent scenario. We make three
contributions in our work. First, we show how policy gradient iterations can be
used without Markovian assumptions. Second, we decompose the problem into a
composition of a Policy for Desires (which is to be learned) and trajectory
planning with hard constraints (which is not learned). The goal of Desires is
to enable comfort of driving, while hard constraints guarantee the safety of
driving. Third, we introduce a hierarchical temporal abstraction we call an
“Option Graph” with a gating mechanism that significantly reduces the effective
horizon and thereby reduces the variance of the gradient estimation even
further.
Patrick Blöbaum, Takashi Washio, Shohei Shimizu
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
It is generally difficult to make any statements about the expected
prediction error in a univariate setting without further knowledge about how
the data were generated. Recent work showed that knowledge about the real
underlying causal structure of a data generation process has implications for
various machine learning settings. Assuming an additive noise and an
independence between data generating mechanism and its input, we draw a novel
connection between the intrinsic causal relationship of two variables and the
expected prediction error. We formulate the theorem that the expected error of
the true data generating function as prediction model is generally smaller when
the effect is predicted from its cause and, on the contrary, greater when the
cause is predicted from its effect. The theorem implies an asymmetry in the
error depending on the prediction direction. This is further corroborated with
empirical evaluations in artificial and real-world data sets.
Gábor Braun, Sebastian Pokutta
Comments: 17 pages
Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)
For the linear bandit problem, we extend the analysis of algorithm CombEXP
from [R. Combes, M. S. Talebi Mazraeh Shahi, A. Proutiere, and M. Lelarge.
Combinatorial bandits revisited. In C. Cortes, N. D. Lawrence, D. D. Lee, M.
Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing
Systems 28, pages 2116–2124. Curran Associates, Inc., 2015] to the
high-probability case against adaptive adversaries, allowing actions to come
from an arbitrary polytope. We prove a high-probability regret of
$O(T^{2/3})$ for time horizon $T$. While this bound is weaker than the
optimal $O(\sqrt{T})$ bound achieved by GeometricHedge in [P. L. Bartlett, V.
Dani, T. Hayes, S. Kakade, A. Rakhlin, and A. Tewari. High-probability regret
bounds for bandit online linear optimization. In 21st Annual Conference on
Learning Theory (COLT 2008), July 2008], CombEXP is computationally
efficient, requiring only an efficient linear optimization oracle over the
convex hull of the actions.
Pritam Mukherjee, Sennur Ulukus
Comments: Submitted to IEEE Transactions on Communications, October 2016
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)
We consider two fundamental multi-user channel models: the multiple-input
multiple-output (MIMO) wiretap channel with one helper (WTH) and the MIMO
multiple access wiretap channel (MAC-WT). In each case, the eavesdropper has
$K$ antennas while the remaining terminals have $N$ antennas each. We consider
a fast fading channel where the channel state information (CSI) of the
legitimate receiver is available at the transmitters but no channel state
information at the transmitters (CSIT) is available for the eavesdropper’s
channel. We determine the optimal sum secure degrees of freedom (s.d.o.f.) for
each channel model for the regime $K \leq N$, and show that in this regime, the
MAC-WT channel reduces to the WTH in the absence of eavesdropper CSIT. For the
regime $N \leq K \leq 2N$, we obtain the optimal linear s.d.o.f., and show that
the MAC-WT channel and the WTH have the same optimal s.d.o.f. when restricted
to linear encoding strategies. In the absence of any such restrictions, we
provide an upper bound for the sum s.d.o.f. of the MAC-WT channel in the regime
$N \leq K \leq 2N$. Our results show that unlike in the single-input
single-output (SISO) case, there is loss of s.d.o.f. for even the WTH due to
lack of eavesdropper CSIT when $K \geq N$.
Hadi Sarieddeen, Mohammad M. Mansour, Ali Chehab
Subjects: Information Theory (cs.IT)
The problem of efficient modulation classification (MC) in multiple-input
multiple-output (MIMO) systems is considered. Per-layer likelihood-based MC is
proposed by employing subspace decomposition to partially decouple the
transmitted streams. When detecting the modulation type of the stream of
interest, a dense constellation is assumed on all remaining streams. The
proposed classifier outperforms existing MC schemes at a lower complexity cost,
and can be efficiently implemented in the context of joint MC and subspace data
detection.
Stefan Wesemann, Thomas L. Marzetta
Comments: Submitted to IEEE Transactions on Signal Processing, 9 pages, 6 figures
Subjects: Information Theory (cs.IT)
A network of analog repeaters, each fed by a wireless fronthaul link and
powered by, e.g., solar energy, is a promising candidate for a flexible small
cell deployment. A key challenge is the acquisition of accurate channel state
information by the fronthaul hub (FH), which is needed for the spatial
multiplexing of multiple fronthaul links over the same time/frequency resource.
For frequency division duplex channels, a simple pilot loop-back procedure has
been proposed that allows the estimation of the UL & DL channels at the FH
without relying on any digital signal processing at the repeater side. For this
scheme, we derive the maximum likelihood (ML) estimators for the UL & DL
channel subspaces, formulate the corresponding Cramér-Rao bounds and show the
asymptotic efficiency of both (SVD-based) estimators by means of Monte Carlo
simulations. In addition, we illustrate how to compute the underlying (rank-1)
SVD with quadratic time complexity by employing the power iteration method. To
enable power control for the fronthaul links, knowledge of the channel gains is
needed. Assuming that the UL & DL channels have on average the same gain, we
formulate the ML estimator for the UL channel gain, and illustrate its
robustness against strong noise by means of simulations.
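The power iteration mentioned above for the rank-1 SVD can be sketched in a few lines. This is a generic, real-valued illustration under stated assumptions, not the estimator from the paper: wireless channel matrices are normally complex-valued, and the function name `rank1_svd`, the fixed iteration count and the uniform starting vector are choices made here for simplicity. Each iteration costs two matrix-vector products, i.e. a number of operations proportional to the number of matrix entries.

```python
import math


def rank1_svd(A, iters=100):
    """Estimate the dominant singular triplet (sigma, u, v) of a real matrix A.

    Power iteration on A^T A: repeatedly apply v <- A^T (A v) and normalise.
    A is a list of rows. The uniform start vector is assumed not to be
    orthogonal to the dominant right singular vector.
    """
    m, n = len(A), len(A[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        # u = A v
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        # w = A^T u
        w = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # A v vanished; nothing more to refine
        v = [x / norm for x in w]
    # Recover sigma and the left singular vector from the converged v.
    u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v


# Usage: a rank-1 matrix built as the outer product of (1, 2) and (3, 4).
sigma, u, v = rank1_svd([[3.0, 4.0], [6.0, 8.0]])
```

The convergence rate depends on the gap between the two largest singular values, so a fixed iteration count is a simplification; a tolerance-based stopping rule would be more usual in practice.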
Ning Wei, Xingqin Lin, Wanwan Li, Youzhi Xiong, Zhongpei Zhang
Comments: 5 pages, 3 figures, submitted to IEEE ICC 2017
Subjects: Information Theory (cs.IT)
The effort to extend cellular technologies to unlicensed spectrum has been
gaining high momentum. Listen-before-talk (LBT) is enforced in regions such
as the European Union and Japan to harmonize coexistence of cellular and incumbent
systems in unlicensed spectrum. In this paper, we study the throughput-optimal LBT
transmission strategy for load based equipment (LBE). We find that the optimal
rule is a pure threshold policy: The LBE should stop listening and transmit
once the channel quality exceeds an optimized threshold. We also reveal the
optimal set of LBT parameters that are compliant with regulatory requirements.
Our results shed light on how the regulatory LBT requirements can affect the
transmission strategies of radio equipment in unlicensed spectrum.
Ahmed Raafat Hosny, Ramy Abdallah Tannious, Amr El-Keyi
Subjects: Information Theory (cs.IT)
In this paper, we study the performance of the downlink of a cellular network
with automatic repeat-request (ARQ) and a half duplex decode-and-forward shared
relay. In this system, two multiple-input-multiple-output (MIMO) base stations
serve two single antenna users. A MIMO shared relay retransmits the lost
packets to the target users. First, we study the system with direct
retransmission from the base station and derive a closed form expression for
the outage probability of the system. We show that direct retransmission can
overcome the fading; however, it cannot overcome the interference. After that,
we invoke the shared relay and design the relay beamforming matrices such that
the signal-to-interference-and-noise ratio (SINR) is improved at the users
subject to power constraints on the relay. In the case when the transmission of
only one user fails, we derive a closed form solution for the relay
beamformers. On the other hand when both transmissions fail, we pose the
beamforming problem as a sequence of non-convex feasibility problems. We use
semidefinite relaxation (SDR) to convert each feasibility problem into a convex
optimization problem. We ensure a rank one solution, and hence, there is no
loss of optimality in SDR. Simulation results are presented showing the
superior performance of the proposed relay beamforming strategy compared to
a direct ARQ system in terms of the outage probability.
Sundeep Rangan, Philip Schniter, Alyson Fletcher
Subjects: Information Theory (cs.IT)
The standard linear regression (SLR) problem is to recover a vector
$mathbf{x}^0$ from noisy linear observations
$mathbf{y}=mathbf{Ax}^0+mathbf{w}$. The approximate message passing (AMP)
algorithm recently proposed by Donoho, Maleki, and Montanari is a
computationally efficient iterative approach to SLR that has a remarkable
property: for large i.i.d. sub-Gaussian matrices $mathbf{A}$, its
per-iteration behavior is rigorously characterized by a scalar state-evolution
whose fixed points, when unique, are Bayes optimal. AMP, however, is fragile in
that even small deviations from the i.i.d. sub-Gaussian model can cause the
algorithm to diverge. This paper considers a “vector AMP” (VAMP) algorithm and
shows that VAMP has a rigorous scalar state-evolution that holds under a much
broader class of large random matrices $\mathbf{A}$: those that are
right-rotationally invariant. After performing an initial singular value
decomposition (SVD) of $\mathbf{A}$, the per-iteration complexity of VAMP can
be made similar to that of AMP. In addition, the fixed points of VAMP’s state
evolution are consistent with the replica prediction of the minimum
mean-squared error recently derived by Tulino, Caire, Verdú, and Shamai. The
effectiveness and state evolution predictions of VAMP are confirmed in
numerical experiments.
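For reference, the observation model and the matrix class named in this abstract can be written out as follows. The dimensions $M$, $N$ and the symbol $\mathbf{V}_0$ are notation assumed here, not taken from the abstract; the invariance condition is the usual definition of right-rotational invariance.

```latex
\mathbf{y} = \mathbf{A}\mathbf{x}^0 + \mathbf{w},
\qquad \mathbf{A} \in \mathbb{R}^{M \times N},
\qquad
\mathbf{A}\mathbf{V}_0 \overset{d}{=} \mathbf{A}
\quad \text{for every orthogonal } \mathbf{V}_0 \in \mathbb{R}^{N \times N}.
```

An i.i.d. Gaussian $\mathbf{A}$ satisfies this condition, so the class contains the Gaussian case of the i.i.d. model while placing no restriction on the singular values.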
Sridhar Rajagopal, Md. Saifur Rahman
Comments: 6 pages, 10 figures
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
Flexible numerologies are being considered as part of designs for 5G systems
to support vertical services with diverse requirements such as enhanced mobile
broadband, ultra-reliable low-latency communications, and massive machine type
communication. Different vertical services can be multiplexed in either
frequency domain, time domain, or both. In this paper, we investigate the use
of spatial multiplexing of services using MU-MIMO where the numerologies for
different users may be different. The users are grouped according to the chosen
numerology and a separate pre-coder and FFT size are used per numerology at the
transmitter. The pre-coded signals for the multiple numerologies are added in
the time domain before transmission. We analyze the performance gains of this
approach through capacity analysis and link-level simulations with conjugate
beamforming and signal-to-leakage-and-noise ratio maximization techniques. We show
that the MU interference between users with different numerologies can be
suppressed efficiently with a reasonable number of antennas at the base station.
This feature enables MU-MIMO techniques to be applied for 5G across different
numerologies.
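The effect of antenna count on multi-user interference under conjugate beamforming can be illustrated with a minimal sketch. It assumes a narrowband two-user downlink with i.i.d. Rayleigh channels and ignores the mixed-numerology (different FFT size) aspect that the abstract studies; the function names and parameters are hypothetical.

```python
import random


def complex_gaussian_vector(rng, n):
    """n i.i.d. circularly symmetric complex Gaussian entries with unit variance."""
    s = 0.5 ** 0.5
    return [complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(n)]


def conjugate_beamformer(h):
    """Unit-norm conjugate (matched-filter) precoder for channel vector h."""
    norm = sum(abs(x) ** 2 for x in h) ** 0.5
    return [x.conjugate() / norm for x in h]


def average_sir(num_antennas, trials=200, seed=0):
    """Ratio of average signal power to average interference power at user 1."""
    rng = random.Random(seed)
    signal = 0.0
    interference = 0.0
    for _ in range(trials):
        h1 = complex_gaussian_vector(rng, num_antennas)  # channel to user 1
        h2 = complex_gaussian_vector(rng, num_antennas)  # channel to user 2
        w1 = conjugate_beamformer(h1)
        w2 = conjugate_beamformer(h2)
        # Power user 1 receives from its own beam and from user 2's beam
        signal += abs(sum(a * b for a, b in zip(h1, w1))) ** 2
        interference += abs(sum(a * b for a, b in zip(h1, w2))) ** 2
    return signal / interference
```

Under these assumptions the average signal power grows with the number of antennas while the average leakage from the other user's beam stays roughly constant, so `average_sir(64)` comes out well above `average_sir(4)`.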
Christopher Portmann
Comments: 34+13 pages, 12 figures, comments welcome
Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR); Information Theory (cs.IT)
We show that a family of quantum authentication protocols introduced in
[Barnum et al., FOCS 2002] can be used to construct a secure quantum channel
and additionally recycle all of the secret key if the message is successfully
authenticated, and recycle part of the key if tampering is detected. We give a
full security proof that constructs the secure channel given only insecure
noisy channels and a shared secret key. We also prove that the number of
recycled key bits is optimal for this family of protocols, i.e., there exists
an adversarial strategy to obtain all non-recycled bits. Previous works
recycled less key and only gave partial security proofs, since they did not
consider all possible distinguishers (environments) that may be used to
distinguish the real setting from the ideal secure quantum channel.