Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
Comments: Report for this http URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We describe an open-source toolkit for neural machine translation (NMT). The
toolkit prioritizes efficiency, modularity, and extensibility with the goal of
supporting NMT research into model architectures, feature representations, and
source modalities, while maintaining competitive performance and reasonable
training requirements. The toolkit consists of modeling and translation
support, as well as detailed pedagogical documentation about the underlying
techniques.
Jonathan T. Barron
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)
We present a loss function which can be viewed as a generalization of many
popular loss functions used in robust statistics: the Cauchy/Lorentzian,
Welsch, and generalized Charbonnier loss functions (and by transitivity the L2,
L1, L1-L2, and pseudo-Huber/Charbonnier loss functions). We describe and
visualize this loss, and document several of its useful properties.
Baris Kayalibay, Grady Jensen, Patrick van der Smagt
Comments: 24 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Convolutional neural networks have been applied to a wide variety of computer
vision tasks. Recent advances in semantic segmentation have enabled their
application to medical image segmentation. While most CNNs use two-dimensional
kernels, recent CNN-based publications on medical image segmentation featured
three-dimensional kernels, allowing full access to the three-dimensional
structure of medical images. Though closely related to semantic segmentation,
medical image segmentation includes specific challenges that need to be
addressed, such as the scarcity of labelled data, the high class imbalance
found in the ground truth and the high memory demand of three-dimensional
images. In this work, a CNN-based method with three-dimensional filters is
demonstrated and applied to hand and brain MRI. Two modifications to an
existing CNN architecture are discussed, along with methods on addressing the
aforementioned challenges. While most of the existing literature on medical
image segmentation focuses on soft tissue and the major organs, this work is
validated on data both from the central nervous system as well as the bones of
the hand.
Qingnan Fan, David Wipf, Gang Hua, Baoquan Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose an image smoothing approximation and intrinsic image decomposition
method based on a modified convolutional neural network architecture applied
directly to the original color image. Our network has a very large receptive
field equipped with at least 20 convolutional layers and 8 residual units. When
training such a deep model however, it is quite difficult to generate
edge-preserving images without undesirable color differences. To overcome this
obstacle, we apply both image gradient supervision and a channel-wise rescaling
layer that computes a minimum mean-squared error color correction.
Additionally, to enhance piece-wise constant effects for image smoothing, we
append a domain transform filter with a predicted refined edge map. The
resulting deep model, which can be trained end-to-end, directly learns
edge-preserving smooth images and intrinsic decompositions without any special
design or input scaling/size requirements. Moreover, our method shows much
better numerical and visual results on both tasks and runs in comparable test
time to existing deep methods.
Matteo Zanotto, Riccardo Volpi, Alessandro Maccione, Luca Berdondini, Diego Sona, Vittorio Murino
Subjects: Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
The retina is a complex nervous system which encodes visual stimuli before
higher order processing occurs in the visual cortex. In this study we evaluated
whether information about the stimuli received by the retina can be retrieved
from the firing rate distribution of Retinal Ganglion Cells (RGCs), exploiting
High-Density 64×64 MEA technology. To this end, we modeled the RGC population
activity using mean-covariance Restricted Boltzmann Machines, latent variable
models capable of learning the joint distribution of a set of continuous
observed random variables and a set of binary unobserved random units. The idea
was to figure out if binary latent states encode the regularities associated to
different visual stimuli, as modes in the joint distribution. We measured the
goodness of mcRBM encoding by calculating the Mutual Information between the
latent states and the stimuli shown to the retina. Results show that binary
states can encode the regularities associated to different stimuli, using both
gratings and natural scenes as stimuli. We also discovered that hidden
variables encode interesting properties of retinal activity, interpreted as
population receptive fields. We further investigated the ability of the model
to learn different modes in population activity by comparing results associated
to a retina in normal conditions and after pharmacologically blocking GABA
receptors (GABAC at first, and then also GABAA and GABAB). As expected, Mutual
Information tends to decrease if we pharmacologically block receptors. We
finally stress that the computational method described in this work could
potentially be applied to any kind of neural data obtained through MEA
technology, though different techniques should be applied to interpret the
results.
Ramakrishna Vedantam, Samy Bengio, Kevin Murphy, Devi Parikh, Gal Chechik
Comments: 16 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
We introduce a technique to produce discriminative context-aware image
captions (captions that describe differences between images or visual concepts)
using only generic context-agnostic training data (captions that describe a
concept or an image in isolation). For example, given images and captions of
“siamese cat” and “tiger cat”, our system generates language that describes the
“siamese cat” in a way that distinguishes it from “tiger cat”. We start with a
generic language model that is context-agnostic and add a listener to
discriminate between closely-related concepts. Our approach offers two key
advantages over previous work: 1) our listener does not need separate training,
and 2) allows joint inference to decode sentences that satisfy both the speaker
and listener — yielding an introspective speaker. We first apply our
introspective speaker to a justification task, i.e. to describe why an image
contains a particular fine-grained category as opposed to another closely
related category in the CUB-200-2011 dataset. We then study discriminative
image captioning to generate language that uniquely refers to one out of two
semantically similar images in the COCO dataset. Evaluations with
discriminative ground truth for justification and human studies for
discriminative image captioning reveal that our approach outperforms baseline
generative and speaker-listener approaches for discrimination.
Chenglong Li, Guizhao Wang, Yunpeng Ma, Aihua Zheng, Bin Luo, Jin Tang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Despite significant progress, image saliency detection still remains a
challenging task in complex scenes and environments. Integrating multiple
different but complementary cues, like RGB and Thermal (RGB-T), may be an
effective way for boosting saliency detection performance. The current research
in this direction, however, is limited by the lack of a comprehensive
benchmark. This work contributes such a RGB-T image dataset, which includes 821
spatially aligned RGB-T image pairs and their ground truth annotations for
saliency detection purpose. The image pairs are with high diversity recorded
under different scenes and environmental conditions, and we annotate 11
challenges on these image pairs for performing the challenge-sensitive analysis
for different saliency detection algorithms. We also implement 3 kinds of
baseline methods with different modality inputs to provide a comprehensive
comparison platform.
With this benchmark, we propose a novel approach, multi-task manifold ranking
with cross-modality consistency, for RGB-T saliency detection. In particular,
we introduce a weight for each modality to describe the reliability, and
integrate them into the graph-based manifold ranking algorithm to achieve
adaptive fusion of different source data. Moreover, we incorporate the
cross-modality consistent constraints to integrate different modalities
collaboratively. For the optimization, we design an efficient algorithm to
iteratively solve several subproblems with closed-form solutions. Extensive
experiments against other baseline methods on the newly created benchmark
demonstrate the effectiveness of the proposed approach, and we also provide
basic insights and potential future research directions for RGB-T saliency
detection.
Kele Xu, Xi Liu, Chang Liu, Hengxing Cai, Zhifeng Gao
Comments: 7 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
During the last decades, the number of new full-reference image quality
assessment algorithms has been increasing drastically. Yet, despite of the
remarkable progress that has been made, the medical ultrasound image similarity
measurement remains largely unsolved due to a high level of speckle noise
contamination. Potential applications of the ultrasound image similarity
measurement seem evident in several aspects. To name a few, ultrasound imaging
quality assessment, abnormal function region detection, etc. In this paper, a
comparative study was made on full-reference image quality assessment methods
for ultrasound image visual structural similarity measure. Moreover, based on
the image similarity index, a generic ultrasound motion tracking
re-initialization framework is given in this work. The experiments are
conducted on synthetic data and real-ultrasound liver data and the results
demonstrate that, with proposed similarity-based tracking re-initialization,
the mean error of landmarks tracking can be decreased from 2 mm to about 1.5 mm
in the ultrasound liver sequence.
Xiaowei Zhang, Chi Xu, Yu Zhang, Tingshao Zhu, Li Cheng
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
This paper studies the problem of multivariate linear regression where a
portion of the observations is grossly corrupted or is missing, and the
magnitudes and locations of such occurrences are unknown in priori. To deal
with this problem, we propose a new approach by explicitly consider the error
source as well as its sparseness nature. An interesting property of our
approach lies in its ability of allowing individual regression output elements
or tasks to possess their unique noise levels. Moreover, despite working with a
non-smooth optimization problem, our approach still guarantees to converge to
its optimal solution. Experiments on synthetic data demonstrate the
competitiveness of our approach compared with existing multivariate regression
models. In addition, empirically our approach has been validated with very
promising results on two exemplar real-world applications: The first concerns
the prediction of extit{Big-Five} personality based on user behaviors at
social network sites (SNSs), while the second is 3D human hand pose estimation
from depth images. The implementation of our approach and comparison methods as
well as the involved datasets are made publicly available in support of the
open-source and reproducible research initiatives.
Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, Le Song
Comments: 19 pages, 22 figures
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Learning to hash plays a fundamentally important role in the efficient image
and video retrieval and many other computer vision problems. However, due to
the binary outputs of the hash functions, the learning of hash functions is
very challenging. In this paper, we propose a novel approach to learn
stochastic hash functions such that the learned hashing codes can be used to
regenerate the inputs. We develop an efficient stochastic gradient learning
algorithm which can avoid the notorious difficulty caused by binary output
constraint, and directly optimize the parameters of the hash functions and the
associated generative model jointly. The proposed method can be applied to both
(L2) approximate nearest neighbor search (L2NNS) and maximum inner product
search (MIPS). Extensive experiments on a variety of large-scale datasets show
that the proposed method achieves significantly better retrieval results than
previous state-of-the-arts.
Yutaka Nagashima
Comments: Accepted at AITP2017
Subjects: Artificial Intelligence (cs.AI)
Despite the recent progress in automatic theorem provers, proof engineers are
still suffering from the lack of powerful proof automation. In this position
paper we first report our proof strategy language based on a meta-tool
approach. Then, we propose an AI-based approach to drastically improve proof
automation for Isabelle, while identifying three major challenges we plan to
address for this objective.
Aneta Vulgarakis Feljan, Athanasios Karapantelakis, Leonid Mokrushin, Hongxin Liang, Rafia Inam, Elena Fersman, Carlos R.B. Azevedo, Klaus Raizer, Ricardo S. Souza
Subjects: Artificial Intelligence (cs.AI)
Cyber-Physical Systems in general, and Intelligent Transport Systems (ITS) in
particular use heterogeneous data sources combined with problem solving
expertise in order to make critical decisions that may lead to some form of
actions e.g., driver notifications, change of traffic light signals and braking
to prevent an accident. Currently, a major part of the decision process is done
by human domain experts, which is time-consuming, tedious and error-prone.
Additionally, due to the intrinsic nature of knowledge possession this decision
process cannot be easily replicated or reused. Therefore, there is a need for
automating the reasoning processes by providing computational systems a formal
representation of the domain knowledge and a set of methods to process that
knowledge. In this paper, we propose a knowledge model that can be used to
express both declarative knowledge about the systems’ components, their
relations and their current state, as well as procedural knowledge representing
possible system behavior. In addition, we introduce a framework for knowledge
management and automated reasoning (KMARF). The idea behind KMARF is to
automatically select an appropriate problem solver based on formalized
reasoning expertise in the knowledge base, and convert a problem definition to
the corresponding format. This approach automates reasoning, thus reducing
operational costs, and enables reusability of knowledge and methods across
different domains. We illustrate the approach on a transportation planning use
case.
Ramakrishna Vedantam, Samy Bengio, Kevin Murphy, Devi Parikh, Gal Chechik
Comments: 16 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
We introduce a technique to produce discriminative context-aware image
captions (captions that describe differences between images or visual concepts)
using only generic context-agnostic training data (captions that describe a
concept or an image in isolation). For example, given images and captions of
“siamese cat” and “tiger cat”, our system generates language that describes the
“siamese cat” in a way that distinguishes it from “tiger cat”. We start with a
generic language model that is context-agnostic and add a listener to
discriminate between closely-related concepts. Our approach offers two key
advantages over previous work: 1) our listener does not need separate training,
and 2) allows joint inference to decode sentences that satisfy both the speaker
and listener — yielding an introspective speaker. We first apply our
introspective speaker to a justification task, i.e. to describe why an image
contains a particular fine-grained category as opposed to another closely
related category in the CUB-200-2011 dataset. We then study discriminative
image captioning to generate language that uniquely refers to one out of two
semantically similar images in the COCO dataset. Evaluations with
discriminative ground truth for justification and human studies for
discriminative image captioning reveal that our approach outperforms baseline
generative and speaker-listener approaches for discrimination.
Cong Duy Vu Hoang (University of Melbourne), Gholamreza Haffari (Monash University), Trevor Cohn (University of Melbourne)
Comments: v1 with preliminary results
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
In this work, we propose a novel decoding approach for neural machine
translation (NMT) based on continuous optimisation. The resulting optimisation
problem can then be tackled using a whole range of continuous optimisation
algorithms which have been developed and used in the literature mainly for
training. Our approach is general and can be applied to other
sequence-to-sequence neural models as well. We make use of this powerful
decoding approach to intersect an underlying NMT with a language model, to
intersect left-to-right and right-to-left NMT models, and to decode with soft
constraints involving coverage and fertility of the source sentence words. The
experimental results show the promise of the proposed framework.
Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
Comments: Report for this http URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We describe an open-source toolkit for neural machine translation (NMT). The
toolkit prioritizes efficiency, modularity, and extensibility with the goal of
supporting NMT research into model architectures, feature representations, and
source modalities, while maintaining competitive performance and reasonable
training requirements. The toolkit consists of modeling and translation
support, as well as detailed pedagogical documentation about the underlying
techniques.
Xiang Xiang, Trac D. Tran
Comments: The tutorial and program associated with this paper are available at this https URL yet for non-commercial use
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
In this paper, we deal with two challenges for measuring the similarity of
the subject identities in practical video-based face recognition – the
variation of the head pose in uncontrolled environments and the computational
expense of processing videos. Since the frame-wise feature mean is unable to
characterize the pose diversity among frames, we define and preserve the
overall pose diversity and closeness in a video. Then, identity will be the
only source of variation across videos since the pose varies even within a
single video. Instead of simply using all the frames, we select those faces
whose pose point is closest to the centroid of the K-means cluster containing
that pose point. Then, we represent a video as a bag of frame-wise deep face
features while the number of features has been reduced from hundreds to K.
Since the video representation can well represent the identity, now we measure
the subject similarity between two videos as the max correlation among all
possible pairs in the two bags of features. On the official 5,000 video-pairs
of the YouTube Face dataset for face verification, our algorithm achieves a
comparable performance with VGG-face that averages over deep features of all
frames. Other vision tasks can also benefit from the generic idea of employing
geometric cues to improve the descriptiveness of deep features.
Chongyang Tao, Lili Mou, Dongyan Zhao, Rui Yan
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)
Open-domain human-computer conversation has been attracting increasing
attention over the past few years. However, there does not exist a standard
automatic evaluation metric for open-domain dialog systems; researchers usually
resort to human annotation for model evaluation, which is time- and
labor-intensive. In this paper, we propose RUBER, a Referenced metric and
Unreferenced metric Blended Evaluation Routine, which evaluates a reply by
taking into consideration both a groundtruth reply and a query (previous user
utterance). Our metric is learnable, but its training does not require labels
of human satisfaction. Hence, RUBER is flexible and extensible to different
datasets and languages. Experiments on both retrieval and generative dialog
systems show that RUBER has high correlation with human annotation.
Tapan Sahni, Chinmay Chandak, Naveen Reddy Chedeti, Manish Singh
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)
As microblogging services like Twitter are becoming more and more influential
in today’s globalised world, its facets like sentiment analysis are being
extensively studied. We are no longer constrained by our own opinion. Others
opinions and sentiments play a huge role in shaping our perspective. In this
paper, we build on previous works on Twitter sentiment analysis using Distant
Supervision. The existing approach requires huge computation resource for
analysing large number of tweets. In this paper, we propose techniques to speed
up the computation process for sentiment analysis. We use tweet subjectivity to
select the right training samples. We also introduce the concept of EFWS
(Effective Word Score) of a tweet that is derived from polarity scores of
frequently used words, which is an additional heuristic that can be used to
speed up the sentiment classification with standard machine learning
algorithms. We performed our experiments using 1.6 million tweets. Experimental
evaluations show that our proposed technique is more efficient and has higher
accuracy compared to previously proposed methods. We achieve overall accuracies
of around 80% (EFWS heuristic gives an accuracy around 85%) on a training
dataset of 100K tweets, which is half the size of the dataset used for the
baseline model. The accuracy of our proposed model is 2-3% higher than the
baseline model, and the model effectively trains at twice the speed of the
baseline model.
Besat Kassaie
Subjects: Computation and Language (cs.CL)
In this report, we propose a new application for twitter data called
extit{job detection}. We identify people’s job category based on their
tweets. As a preliminary work, we limited our task to identify only IT workers
from other job holders. We have used and compared both simple bag of words
model and a document representation based on Skip-gram model. Our results show
that the model based on Skip-gram, achieves a 76\% precision and 82\% recall.
Chongyang Tao, Lili Mou, Dongyan Zhao, Rui Yan
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)
Open-domain human-computer conversation has been attracting increasing
attention over the past few years. However, there does not exist a standard
automatic evaluation metric for open-domain dialog systems; researchers usually
resort to human annotation for model evaluation, which is time- and
labor-intensive. In this paper, we propose RUBER, a Referenced metric and
Unreferenced metric Blended Evaluation Routine, which evaluates a reply by
taking into consideration both a groundtruth reply and a query (previous user
utterance). Our metric is learnable, but its training does not require labels
of human satisfaction. Hence, RUBER is flexible and extensible to different
datasets and languages. Experiments on both retrieval and generative dialog
systems show that RUBER has high correlation with human annotation.
Arturo Argueta, David Chiang
Comments: accepted at EACL 2017
Subjects: Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC)
Weighted finite automata and transducers (including hidden Markov models and
conditional random fields) are widely used in natural language processing (NLP)
to perform tasks such as morphological analysis, part-of-speech tagging,
chunking, named entity recognition, speech recognition, and others.
Parallelizing finite state algorithms on graphics processing units (GPUs) would
benefit many areas of NLP. Although researchers have implemented GPU versions
of basic graph algorithms, limited previous work, to our knowledge, has been
done on GPU algorithms for weighted finite automata. We introduce a GPU
implementation of the Viterbi and forward-backward algorithm, achieving
decoding speedups of up to 5.2x over our serial implementation running on
different computer architectures and 6093x over OpenFST.
Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
Comments: EACL 2017, 10 pages
Journal-ref: EACL2017
Subjects: Computation and Language (cs.CL)
Distinguishing between antonyms and synonyms is a key task to achieve high
performance in NLP systems. While they are notoriously difficult to distinguish
by distributional co-occurrence models, pattern-based methods have proven
effective to differentiate between the relations. In this paper, we present a
novel neural network model AntSynNET that exploits lexico-syntactic patterns
from syntactic parse trees. In addition to the lexical and syntactic
information, we successfully integrate the distance between the related words
along the syntactic path as a new pattern feature. The results from
classification experiments show that AntSynNET improves the performance over
prior pattern-based methods.
Chloé Braud, Maximin Coavoux, Anders Søgaard
Comments: To be published in EACL 2017, 13 pages
Subjects: Computation and Language (cs.CL)
Discourse parsing is an integral part of understanding information flow and
argumentative structure in documents. Most previous research has focused on
inducing and evaluating models from the English RST Discourse Treebank.
However, discourse treebanks for other languages exist, including Spanish,
German, Basque, Dutch and Brazilian Portuguese. The treebanks share the same
underlying linguistic theory, but differ slightly in the way documents are
annotated. In this paper, we present (a) a new discourse parser which is
simpler, yet competitive (significantly better on 2/3 metrics) to state of the
art for English, (b) a harmonization of discourse treebanks across languages,
enabling us to present (c) what to the best of our knowledge are the first
experiments on cross-lingual discourse parsing.
Waheeb Ahmed, Dr. Anto P Babu
Comments: 10 pages, 3 figures, published article in IJNLC
Subjects: Computation and Language (cs.CL)
The first step of processing a question in Question Answering(QA) Systems is
to carry out a detailed analysis of the question for the purpose of determining
what it is asking for and how to perfectly approach answering it. Our Question
analysis uses several techniques to analyze any question given in natural
language: a Stanford POS Tagger & parser for Arabic language, a named entity
recognizer, tokenizer,Stop-word removal, Question expansion, Question
classification and Question focus extraction components. We employ numerous
detection rules and trained classifier using features from this analysis to
detect important elements of the question, including: 1) the portion of the
question that is a referring to the answer (the focus); 2) different terms in
the question that identify what type of entity is being asked for (the lexical
answer types); 3) Question expansion ; 4) a process of classifying the question
into one or more of several and different types; and We describe how these
elements are identified and evaluate the effect of accurate detection on our
question-answering system using the Mean Reciprocal Rank(MRR) accuracy measure.
Antonio Toral, Víctor M. Sánchez-Cartagena
Comments: Conference of the European Chapter of the Association for Computational Linguistics (EACL). April 2017, Val`encia, Spain
Subjects: Computation and Language (cs.CL)
We aim to shed light on the strengths and weaknesses of the newly introduced
neural machine translation paradigm. To that end, we conduct a multifaceted
evaluation in which we compare outputs produced by state-of-the-art neural
machine translation and phrase-based machine translation systems for 9 language
directions across a number of dimensions. Specifically, we measure the
similarity of the outputs, their fluency and amount of reordering, the effect
of sentence length and performance across different error categories. We find
out that translations produced by neural machine translation systems are
considerably different, more fluent and more accurate in terms of word order
compared to those produced by phrase-based systems. Neural machine translation
systems are also more accurate at producing inflected forms, but they perform
poorly when translating very long sentences.
Isabelle Augenstein, Leon Derczynski, Kalina Bontcheva
Comments: 51 pages; preprint accepted to Computer Speech and Language
Subjects: Computation and Language (cs.CL)
Named Entity Recognition (NER) is a key NLP task, which is all the more
challenging on Web and user-generated content with their diverse and
continuously changing language. This paper aims to quantify how this diversity
impacts state-of-the-art NER methods, by measuring named entity (NE) and
context variability, feature sparsity, and their effects on precision and
recall. In particular, our findings indicate that NER approaches struggle to
generalise in diverse genres with limited training data. Unseen NEs, in
particular, play an important role, which have a higher incidence in diverse
genres such as social media than in more regular genres such as newswire.
Coupled with a higher incidence of unseen features more generally and the lack
of large training corpora, this leads to significantly lower F1 scores for
diverse genres as compared to more regular ones. We also find that leading
systems rely heavily on surface forms found in training data, having problems
generalising beyond these, and offer explanations for this observation.
Cong Duy Vu Hoang (University of Melbourne), Gholamreza Haffari (Monash University), Trevor Cohn (University of Melbourne)
Comments: v1 with preliminary results
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
In this work, we propose a novel decoding approach for neural machine
translation (NMT) based on continuous optimisation. The resulting optimisation
problem can then be tackled using a whole range of continuous optimisation
algorithms which have been developed and used in the literature mainly for
training. Our approach is general and can be applied to other
sequence-to-sequence neural models as well. We make use of this powerful
decoding approach to intersect an underlying NMT with a language model, to
intersect left-to-right and right-to-left NMT models, and to decode with soft
constraints involving coverage and fertility of the source sentence words. The
experimental results show the promise of the proposed framework.
Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
Comments: Report for this http URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We describe an open-source toolkit for neural machine translation (NMT). The
toolkit prioritizes efficiency, modularity, and extensibility with the goal of
supporting NMT research into model architectures, feature representations, and
source modalities, while maintaining competitive performance and reasonable
training requirements. The toolkit consists of modeling and translation
support, as well as detailed pedagogical documentation about the underlying
techniques.
Hardie Cate, Zeshan Hussain
Comments: 7 pages
Subjects: Computation and Language (cs.CL)
We outline a bidirectional translation system that converts sentences from
American Sign Language (ASL) to English, and vice versa. To perform machine
translation between ASL and English, we utilize a generative approach.
Specifically, we employ an adjustment to the IBM word-alignment model 1 (IBM
WAM1), where we define language models for English and ASL, as well as a
translation model, and attempt to generate a translation that maximizes the
posterior distribution defined by these models. Then, using these models, we
are able to quantify the concepts of fluency and faithfulness of a translation
between languages.
Tapan Sahni, Chinmay Chandak, Naveen Reddy Chedeti, Manish Singh
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)
As microblogging services like Twitter are becoming more and more influential
in today’s globalised world, its facets like sentiment analysis are being
extensively studied. We are no longer constrained by our own opinion. Others
opinions and sentiments play a huge role in shaping our perspective. In this
paper, we build on previous works on Twitter sentiment analysis using Distant
Supervision. The existing approach requires huge computation resource for
analysing large number of tweets. In this paper, we propose techniques to speed
up the computation process for sentiment analysis. We use tweet subjectivity to
select the right training samples. We also introduce the concept of EFWS
(Effective Word Score) of a tweet that is derived from polarity scores of
frequently used words, which is an additional heuristic that can be used to
speed up the sentiment classification with standard machine learning
algorithms. We performed our experiments using 1.6 million tweets. Experimental
evaluations show that our proposed technique is more efficient and has higher
accuracy compared to previously proposed methods. We achieve overall accuracies
of around 80% (EFWS heuristic gives an accuracy around 85%) on a training
dataset of 100K tweets, which is half the size of the dataset used for the
baseline model. The accuracy of our proposed model is 2-3% higher than the
baseline model, and the model effectively trains at twice the speed of the
baseline model.
David Castells-Rufas, Cédric Bastoul
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Proceedings of the Workshop on High Performance Energy Efficient Embedded
Systems (HIP3ES) 2017. Stockholm, Sweden, January 25th. Collocated with HIPEAC
2017 Conference.
Manxi Wang, Yongcheng Li, Xiaohan Wei, Qing Ling
Comments: IEEE GlobalSIP 2016
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
This paper considers the recovery of group sparse signals over a multi-agent
network, where the measurements are subject to sparse errors. We first
investigate the robust group LASSO model and its centralized algorithm based on
the alternating direction method of multipliers (ADMM), which requires a
central fusion center to compute a global row-support detector. To implement it
in a decentralized network environment, we then adopt dynamic average consensus
strategies that enable dynamic tracking of the global row-support detector.
Numerical experiments demonstrate the effectiveness of the proposed algorithms.
Zaid Hussain, Bader AlBdaiwi, Anton Cerny
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Message broadcasting in networks could be carried over spanning trees. A set
of spanning trees in the same network is node independent if two conditions are
satisfied. First, all trees are rooted at node (r). Second, for every node (u)
in the network, all trees’ paths from (r) to (u) are node-disjoint, excluding
the end nodes (r) and (u). Independent spanning trees have applications in
fault-tolerant communications and secure message distributions. Gaussian
networks and two-dimensional toroidal networks share similar topological
characteristics. They are regular of degree four, symmetric, and
node-transitive. Gaussian networks, however, have relatively lesser network
diameter that could result in a better performance. This promotes Gaussian
networks to be a potential alternative for two-dimensional toroidal networks.
In this paper, we present constructions for node independent spanning trees in
dense Gaussian networks. Based on these constructions, we design routing
algorithms that can be used in fault-tolerant routing and secure message
distribution. We also design fault-tolerant algorithms to construct these trees
in parallel.
Arturo Argueta, David Chiang
Comments: accepted at EACL 2017
Subjects: Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC)
Weighted finite automata and transducers (including hidden Markov models and
conditional random fields) are widely used in natural language processing (NLP)
to perform tasks such as morphological analysis, part-of-speech tagging,
chunking, named entity recognition, speech recognition, and others.
Parallelizing finite state algorithms on graphics processing units (GPUs) would
benefit many areas of NLP. Although researchers have implemented GPU versions
of basic graph algorithms, limited previous work, to our knowledge, has been
done on GPU algorithms for weighted finite automata. We introduce a GPU
implementation of the Viterbi and forward-backward algorithm, achieving
decoding speedups of up to 5.2x over our serial implementation running on
different computer architectures and 6093x over OpenFST.
Massimo Cafaro, Marco Pulimeno, Italo Epicoco
Comments: arXiv admin note: text overlap with arXiv:1601.03892
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
We present PFDCMSS, a novel message-passing based parallel algorithm for
mining time-faded heavy hitters. The algorithm is a parallel version of the
recently published FDCMSS sequential algorithm. We formally prove its
correctness by showing that the underlying data structure, a sketch augmented
with a Space Saving stream summary holding exactly two counters, is mergeable.
Whilst mergeability of traditional sketches derives immediately from theory, we
show that merging our augmented sketch is non trivial. Nonetheless, the
resulting parallel algorithm is fast and simple to implement. To the best of
our knowledge, PFDCMSS is the first parallel algorithm solving the problem of
mining time-faded heavy hitters on message-passing parallel architectures.
Extensive experimental results confirm that PFDCMSS retains the extreme
accuracy and error bound provided by FDCMSS whilst providing excellent parallel
scalability.
Johan Jonasson
Comments: 9 pages
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Markov chain Monte Carlo (MCMC) algorithms are ubiquitous in probability
theory in general and in machine learning in particular. A Markov chain is
devised so that its stationary distribution is some probability distribution of
interest. Then one samples from the given distribution by running the Markov
chain for a “long time” until it appears to be stationary and then collects the
sample. However these chains are often very complex and there are no
theoretical guarantees that stationarity is actually reached. In this paper we
study the Gibbs sampler of the posterior distribution of a very simple case of
Latent Dirichlet Allocation, an attractive Bayesian unsupervised learning model
for text generation and text classification. It turns out that in some
situations, the mixing time of the Gibbs sampler is exponential in the length
of documents and so it is practically impossible to properly sample from the
posterior when documents are sufficiently long.
Jean-Bernard Lasserre, Edouard Pauwels
Subjects: Learning (cs.LG)
We illustrate the potential in statistics and machine learning of the
Christoffel function, or more precisely, its empirical counterpart associated
with a counting measure uniformly supported on a finite set of points. Firstly,
we provide a thresholding scheme which allows to approximate the support of a
measure from a finite subset of its moments with strong asymptotic guaranties.
Secondly, we provide a consistency result which relates the empirical
Christoffel function and its population counterpart in the limit of large
samples. Finally, we illustrate the relevance of our results on simulated and
real world datasets for several applications in statistics and machine
learning: (a) density and support estimation from finite samples, (b) outlier
and novelty detection and (c) affine matching.
Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, Le Song
Comments: 19 pages, 22 figures
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Learning to hash plays a fundamentally important role in the efficient image
and video retrieval and many other computer vision problems. However, due to
the binary outputs of the hash functions, the learning of hash functions is
very challenging. In this paper, we propose a novel approach to learn
stochastic hash functions such that the learned hashing codes can be used to
regenerate the inputs. We develop an efficient stochastic gradient learning
algorithm which can avoid the notorious difficulty caused by binary output
constraint, and directly optimize the parameters of the hash functions and the
associated generative model jointly. The proposed method can be applied to both
(L2) approximate nearest neighbor search (L2NNS) and maximum inner product
search (MIPS). Extensive experiments on a variety of large-scale datasets show
that the proposed method achieves significantly better retrieval results than
previous state-of-the-arts.
Jonathan T. Barron
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)
We present a loss function which can be viewed as a generalization of many
popular loss functions used in robust statistics: the Cauchy/Lorentzian,
Welsch, and generalized Charbonnier loss functions (and by transitivity the L2,
L1, L1-L2, and pseudo-Huber/Charbonnier loss functions). We describe and
visualize this loss, and document several of its useful properties.
Xin Yuan, Yunchen Pu, Lawrence Carin
Comments: 17 pages, 6 figures
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
We solve the compressive sensing problem via convolutional factor analysis,
where the convolutional dictionaries are learned {em in situ} from the
compressed measurements. An alternating direction method of multipliers (ADMM)
paradigm for compressive sensing inversion based on convolutional factor
analysis is developed. The proposed algorithm provides reconstructed images as
well as features, which can be directly used for recognition ((e.g.),
classification) tasks. When a deep (multilayer) model is constructed, a
stochastic unpooling process is employed to build a generative model. During
reconstruction and testing, we project the upper layer dictionary to the data
level and only a single layer deconvolution is required. We demonstrate that
using (sim30\%) (relative to pixel numbers) compressed measurements, the
proposed model achieves the classification accuracy comparable to the original
data on MNIST. We also observe that when the compressed measurements are very
limited ((e.g.), (<10\%)), the upper layer dictionary can provide better
reconstruction results than the bottom layer.
Xiaowei Zhang, Chi Xu, Yu Zhang, Tingshao Zhu, Li Cheng
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
This paper studies the problem of multivariate linear regression where a
portion of the observations is grossly corrupted or is missing, and the
magnitudes and locations of such occurrences are unknown in priori. To deal
with this problem, we propose a new approach by explicitly consider the error
source as well as its sparseness nature. An interesting property of our
approach lies in its ability of allowing individual regression output elements
or tasks to possess their unique noise levels. Moreover, despite working with a
non-smooth optimization problem, our approach still guarantees to converge to
its optimal solution. Experiments on synthetic data demonstrate the
competitiveness of our approach compared with existing multivariate regression
models. In addition, empirically our approach has been validated with very
promising results on two exemplar real-world applications: The first concerns
the prediction of extit{Big-Five} personality based on user behaviors at
social network sites (SNSs), while the second is 3D human hand pose estimation
from depth images. The implementation of our approach and comparison methods as
well as the involved datasets are made publicly available in support of the
open-source and reproducible research initiatives.
Kristjan Greenewald, Stephen Kelley, Brandon Oselio, Alfred O. Hero III
Comments: submitted to IEEE transactions on signal processing. arXiv admin note: substantial text overlap with arXiv:1610.03090, arXiv:1603.03678
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Recent work in distance metric learning has focused on learning
transformations of data that best align with specified pairwise similarity and
dissimilarity constraints, often supplied by a human observer. The learned
transformations lead to improved retrieval, classification, and clustering
algorithms due to the better adapted distance or similarity measures. Here, we
address the problem of learning these transformations when the underlying
constraint generation process is nonstationary. This nonstationarity can be due
to changes in either the ground-truth clustering used to generate constraints
or changes in the feature subspaces in which the class structure is apparent.
We propose Online Convex Ensemble StrongLy Adaptive Dynamic Learning (OCELAD),
a general adaptive, online approach for learning and tracking optimal metrics
as they change over time that is highly robust to a variety of nonstationary
behaviors in the changing metric. We apply the OCELAD framework to an ensemble
of online learners. Specifically, we create a retro-initialized composite
objective mirror descent (COMID) ensemble (RICE) consisting of a set of
parallel COMID learners with different learning rates, and demonstrate
parameter-free RICE-OCELAD metric learning on both synthetic data and a highly
nonstationary Twitter dataset. We show significant performance improvements and
increased robustness to nonstationary effects relative to previously proposed
batch and online distance metric learning algorithms.
Rajat Sen, Karthikeyan Shanmugam, Alexandros G. Dimakis, Sanjay Shakkottai
Comments: 33 pages, 7 figures
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Motivated by applications in computational advertising and systems biology,
we consider the problem of identifying the best out of several possible soft
interventions at a source node (V) in a causal DAG, to maximize the expected
value of a target node (Y) (downstream of (V)). There is a fixed total budget
for sampling under various interventions. Also, there are cost constraints on
different types of interventions. We pose this as a best arm identification
problem with (K) arms, where each arm is a soft intervention at (V). The key
difference from the classical setting is that there is information leakage
among the arms. Each soft intervention is a distinct known conditional
probability distribution between (V) and its parents (pa(V)).
We propose an efficient algorithm that uses importance sampling to adaptively
sample using different interventions and leverage information leakage to choose
the best. We provide the first gap dependent simple regret and best arm
mis-identification error bounds for this problem. This generalizes the prior
bounds available for the simpler case of no information leakage. In the case of
no leakage, the number of samples required for identification is (upto polylog
factors) ( ilde{O} (max_i frac{i}{Delta_i^2})) where (Delta_i) is the gap
between the optimal and the (i)-th best arm. We generalize the previous result
for the causal setting and show that ( ilde{O}(max_i
frac{sigma_i^2}{Delta_i^2})) is sufficient where (sigma_i^2) is the
effective variance of an importance sampling estimator that eliminates the
(i)-th best arm out of a set of arms with gaps roughly at most twice
(Delta_i). We also show that (sigma_i^2 << i) in many cases. Our result also
recovers (up to constants) prior gap independent bounds for this setting. We
demonstrate that our algorithm empirically outperforms the state of the art,
through synthetic experiments.
Shuo Shao, Tie Liu, Chao Tian, Cong Shen
Comments: 12 pages, 3 figures
Subjects: Information Theory (cs.IT)
We consider the ((n,k,d,ell)) secure exact-repair regenerating code problem,
which generalizes the ((n,k,d)) exact-repair regenerating code problem with the
additional constraint that the stored file needs to be kept
information-theoretically secure against an eavesdropper, who can access the
data transmitted to regenerate a total of (ell) different failed nodes. For
all known results on this problem, the achievable tradeoff regions between the
normalized storage capacity and repair bandwidth have a single corner point,
achieved by a scheme proposed by Shah, Rashmi and Kumar (the SRK point). Since
the achievable tradeoff regions of the exact-repair regenerating code problem
without any secrecy constraints are known to have multiple corner points in
general, these existing results suggest a phase-change-like behavior, i.e.,
enforcing a secrecy constraint ((ellgeq 1)) immediately reduces the tradeoff
region to one with a single corner point. In this work, we first show that when
the secrecy parameter (ell) is sufficiently large, the SRK point is indeed the
only corner point of the tradeoff region. However, when (ell) is small, we
show that the tradeoff region can in fact have multiple corner points. In
particular, we establish a precise characterization of the tradeoff region for
the ((7,6,6,1)) problem, which has exactly two corner points. Thus, a smooth
transition, instead of a phase-change-type of transition, should be expected as
the secrecy constraint is gradually strengthened.
J.Lopez-Fernandez, J.F.Paris, E. Martos-Naya
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after wich this version may no longer be accessible
Subjects: Information Theory (cs.IT)
In this paper we present a bivariate Rician shadowed fading model where the
shadowing is assumed to follow a Nakagami-(m) distribution. We derive exact
expressions involving a single integral for both the joint probability density
function (PDF) and the joint cumulative distribution function (CDF) and we also
derive an exact closed-form expression for the moment generating function
(MGF). As a direct consequence we obtain a closed-form expression for the power
correlation coefficient between Rician shadowed variables as a function of the
correlation coefficient between the underlying variables of the model.
Additionally, in the particular case of integer (m) we show that the PDF can be
expressed in closed-form in terms of a sum of m Meijer G-functions of two
variables. Results are applied to analyze the outage probability (OT) of a
dual-branch selection combiner (SC) in correlated Rician shadowed fading and
the evaluation of the level crossing rate (LCR) and average fade duration (AFD)
of a sampled Rician shadowed fading envelope.
Seyed Pooya Shariatpanahi, Giuseppe Caire, Babak Hossein Khalaj
Comments: 7 pages, 2 figures
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
In this paper we consider a single-cell downlink scenario where a
multiple-antenna base station delivers contents to multiple cache-enabled user
terminals. Based on the multicasting opportunities provided by the so-called
Coded Caching technique, we investigate three delivery approaches. Our baseline
scheme employs the coded caching technique on top of max-min fair multicasting.
The second one consists of a joint design of Zero-Forcing (ZF) and coded
caching, where the coded chunks are formed in the signal domain (complex
field). The third scheme is similar to the second one with the difference that
the coded chunks are formed in the data domain (finite field). We derive
closed-form rate expressions where our results suggest that the latter two
schemes surpass the first one in terms of Degrees of Freedom (DoF). However, at
the intermediate SNR regime forming coded chunks in the signal domain results
in power loss, and will deteriorate throughput of the second scheme. The main
message of our paper is that the schemes performing well in terms of DoF may
not be directly appropriate for intermediate SNR regimes, and modified schemes
should be employed.
Werner Haselmayr, Syed Muhammad Haider Aejaz, A. Taufiq Asyhari, Andreas Springer, Weisi Guo
Subjects: Information Theory (cs.IT); Emerging Technologies (cs.ET)
Transposition errors fundamentally undermine reliability and capacity in
molecular communication, when individual particles are used for information
encoding. Recently, several channel coding techniques have been proposed for
mitigating the transposition effect. The presented techniques show promising
results for diffusion-based channels with drift. However, so far no performance
evaluation has been carried out for diffusion-based channels without drift,
although in this case, transpositions are more prominent. In this work, we
first derive an analytical expression for the uncoded bit error probability due
to transpositions. Then, we compare the uncoded and coded error performance
over diffusion-based channels with and without drift by means of computer
simulations. The results reveal that for diffusion-based channels without drift
a transmission with reasonable reliability can only be achieved by introducing
a guard time between the codewords. This research lays the foundation for
future development of strategies to mitigate transpositions in pure
diffusion-based channels.
Nhan Thanh Nguyen, Kyungchun Lee
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
In this paper, we investigate a coverage extension scheme based on orthogonal
random precoding (ORP) for the downlink of massive multiple-input
multiple-output (MIMO) systems. In this scheme, a precoding matrix consisting
of orthogonal vectors is employed at the transmitter to enhance the maximum
signal-to-interference-plus-noise ratio (SINR) of the user. To analyze and
optimize the ORP scheme in terms of cell coverage, we derive the analytical
expressions of the downlink coverage probability for two receiver structures,
namely, the single-antenna (SA) receiver and multiple-antenna receiver with
antenna selection (AS). The simulation results show that the analytical
expressions accurately capture the coverage behaviors of the systems employing
the ORP scheme. It is also shown that the optimal coverage performance is
achieved when a single precoding vector is used under the condition that the
threshold of the signal-to-noise ratio of the coverage is greater than one. The
performance of the ORP scheme is further analyzed when different random
precoder groups are utilized over multiple time slots to exploit precoding
diversity. The numerical results show that the proposed ORP scheme over
multiple time slots provides a substantial coverage gain over the space-time
coding scheme despite its low feedback overhead.
Zhetao Li, Hongqing Zeng, Chengqing Li, Jun Fang
Comments: 5 pages, 3 figures
Subjects: Information Theory (cs.IT)
The reconstruction of sparse signals requires the solution of an
(ell_0)-norm minimization problem in Compressed Sensing. Previous research has
focused on the investigation of a single candidate to identify the support
(index of nonzero elements) of a sparse signal. To ensure that the optimal
candidate can be obtained in each iteration, we propose here an iterative
greedy reconstruction algorithm (GSRA). First, the intersection of the support
sets estimated by the Orthogonal Matching Pursuit (OMP) and Subspace Pursuit
(SP) is set as the initial support set. Then, a hope-tree is built to expand
the set. Finally, a developed decreasing subspace pursuit method is used to
rectify the candidate set. Detailed simulation results demonstrate that GSRA is
more accurate than other typical methods in recovering Gaussian signals, 0–1
sparse signals, and synthetic signals.
Ori Rottenstreich, Yuval Cassuto
Subjects: Information Theory (cs.IT)
Data compression is a well-studied (and well-solved) problem in the setup of
long coding blocks. But important emerging applications need to compress data
to memory words of small fixed widths. This new setup is the subject of this
paper. In the problem we consider we have two sources with known discrete
distributions, and we wish to find codes that maximize the success probability
that the two source outputs are represented in (L) bits or less. A good
practical use for this problem is a table with two-field entries that is stored
in a memory of a fixed width (L). Such tables of very large sizes drive the
core functionalities of network switches and routers. After defining the
problem formally, we solve it optimally with an efficient code-design
algorithm. We also solve the problem in the more constrained case where a
single code is used in both fields (to save space for storing code
dictionaries). For both code-design problems we find decompositions that yield
efficient dynamic-programming algorithms. With an empirical study we show the
success probabilities of the optimal codes for different distributions and
memory widths. In particular, the study demonstrates the superiority of the new
codes over existing compression algorithms.
Kapil Dev Tyagi, Arun Kumar, R. Bahl
Comments: 6 pages, 3 figures
Subjects: Information Theory (cs.IT)
The range resolution in conventional continuous time frequency modulation
(CTFM) is inversely proportional to the signal bandwidth. The dual-demodulator
continuous time frequency modulation (DD-CTFM) processing technique was
proposed by Gough et al [1] as a method to increase the range resolution by
making the output of DD-CTFM truly continuous. However, it has been found that
in practice the range resolution is still limited by the signal bandwidth. The
limitation of DD-CTFM has been explained using simulations and mathematically
in this paper.
Nir Shlezinger, Daniel Zahavi, Yonathan Murin, Ron Dabora
Comments: 1 figures. This paper was presented in part at the 2015 IEEE International Symposium on Information Theory
Subjects: Information Theory (cs.IT)
In this work we study the secrecy capacity of Gaussian multiple-input
multiple-output (MIMO) wiretap channels (WTCs) with a finite memory, subject to
a per-symbol average power constraint on the MIMO channel input. MIMO channels
with finite memory are very common in wireless communications as well as in
wireline communications (e.g., in communications over power lines). To derive
the secrecy capacity of the Gaussian MIMO WTC with finite memory we first
construct an asymptotically-equivalent block-memoryless MIMO WTC, which is then
transformed into a set of parallel, independent, memoryless MIMO WTCs in the
frequency domain. The secrecy capacity of the Gaussian MIMO WTC with finite
memory is obtained as the secrecy capacity of the set of parallel independent
memoryless MIMO WTCs, and is expressed as a maximization over the input
covariance matrices in the frequency domain. Lastly, we detail two applications
of our result: First, we show that the secrecy capacity of the Gaussian scalar
WTC with finite memory can be achieved by waterfilling, and obtain a
closed-form expression for this secrecy capacity. Then, we use our result to
characterize the secrecy capacity of narrowband powerline channels, thereby
resolving one of the major open issues for this channel model.
Tamir Bendory, Pavel Sidorenko, Yonina C. Eldar
Subjects: Information Theory (cs.IT)
The problem of recovering a signal from its power spectrum, called phase
retrieval, arises in many scientific fields. One of many examples is
ultra-short laser pulse characterization in which the electromagnetic field is
oscillating with ~10^15 Hz and phase information cannot be measured directly
due to limitations of the electronic sensors. Phase retrieval is ill-posed in
most cases as there are many different signals with the same Fourier transform
magnitude. To overcome this fundamental ill-posedness, several measurement
techniques are used in practice. One of the most popular methods for complete
characterization of ultra-short laser pulses is the Frequency-Resolved Optical
Gating (FROG). In FROG, the acquired data is the power spectrum of the product
of the unknown pulse with its delayed replica. Therefore the measured signal is
a quartic function of the unknown pulse. A generalized version of FROG, where
the delayed replica is replaced by a second unknown pulse, is called blind
FROG. In this case, the measured signal is quadratic with respect to both
pulses. In this letter we introduce and formulate FROG-type techniques. We then
show that almost all signals are determined uniquely, up to trivial
ambiguities, by blind FROG measurements (and thus also by FROG), if in addition
we have access to the signals power spectrum.
Cheuk Ting Li, Abbas El Gamal
Comments: 13 pages
Subjects: Information Theory (cs.IT)
This paper shows that for any random variables (X) and (Y), it is possible to
represent (Y) as a function of ((X,Z)) such that (Z) is independent of (X) and
(I(X;Z|Y)lelog(I(X;Y)+1)+4). We use this strong functional representation
lemma (SFRL) to establish a tighter bound on the rate needed for one-shot exact
channel simulation than was previously established by Harsha et. al., and to
establish achievability results for one-shot variable-length lossy source
coding and multiple description coding. We also show that the SFRL can be used
to reduce the channel with state noncausally known at the encoder to a
point-to-point channel, which provides a simple achievability proof of the
Gelfand-Pinsker theorem. Finally we present an example in which the SFRL
inequality is tight to within 5 bits.
Ravi Kiran Raman, Lav R. Varshney
Comments: 20 pages, 4 figures
Subjects: Information Theory (cs.IT); Machine Learning (stat.ML)
The problem of joint clustering and registration of images is studied in a
universal setting. We define universal joint clustering and registration
algorithms using multivariate information functionals. We first study the
problem of registering two images using maximum mutual information and prove
its asymptotic optimality. We then show the shortcomings of pairwise
registration in multi-image registration, and design an asymptotically optimal
algorithm based on multiinformation. Finally, we define a novel multivariate
information functional to perform joint clustering and registration of images,
and prove consistency of the algorithm.
Hao-Chung Cheng, Min-Hsiu Hsieh, Marco Tomamichel
Comments: submitted to ISIT 2017
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
We provide a sphere-packing lower bound for the optimal error probability in
finite blocklengths when coding over a symmetric classical-quantum channel. Our
result shows that the pre-factor can be significantly improved from the order
of the subexponential to the polynomial. The established pre-factor is
essentially optimal because it matches the best known random coding upper bound
in the classical case. Our approaches rely on a sharp concentration inequality
in strong large deviation theory and crucial properties of the error-exponent
function.
Masahito Hayashi, Ning Cai
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
In this paper we obtain a lower bound of exponent of average probability of
error for classical quantum multiple access channel, which implies that for all
rate pairs in the capacity region is achievable by a code with exponential
probability of error. Thus we re-obtain the direct coding theorem.
Ryutaroh Matsumoto
Comments: LaTeX2e, 5 pages, no figure. Comments from readers are welcome
Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR); Information Theory (cs.IT)
We show a simple example of a secret sharing scheme encoding classical secret
to quantum shares that can realize an access structure impossible by classical
information processing with limitation on the size of each share. The example
is based on quantum stabilizer codes.