Nikolay Bazenkov, Dmitry Vorontsov, Varvara Dyakonova, Liudmila Zhilyakova, Oleg Kuznetsov, Dmitri Sakharov
Comments: 13 pages, 4 figures
Subjects: Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)
We propose a novel discrete model of central pattern generators (CPG),
neuronal ensembles generating rhythmic activity. The model emphasizes the role
of nonsynaptic interactions and the diversity of electrical properties in
nervous systems. Neurons in the model release different neurotransmitters into
the shared extracellular space (ECS) so each neuron with the appropriate set of
receptors can receive signals from other neurons. We consider neurons that
differ in their electrical activity, represented as finite-state machines
functioning in discrete time steps. Discrete modeling aims to provide a
computationally tractable and compact explanation of rhythmic pattern
generation in nervous systems. An important feature of the model is the
introduced mechanism of neuronal competition which is shown to be responsible
for the generation of proper rhythms. The model is illustrated with two
examples: a half-center oscillator considered to be a basic mechanism of
emerging rhythmic activity and the well-studied feeding network of a pond
snail. Future research will focus on the neuromodulatory effects ubiquitous in
CPG networks and the whole nervous systems.
James B. Aimone
Comments: Draft version of a perspective article to be submitted
Subjects: Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)
Although the brain has long been considered a potential inspiration for
future computing, Moore’s Law – the scaling property that has seen revolutions
in technologies ranging from supercomputers to smart phones – has largely been
driven by advances in materials science. As the ability to miniaturize
transistors is coming to an end, there is increasing attention on new
approaches to computation, including renewed enthusiasm around the potential of
neural computation. Recent advances in neurotechnologies, many of which have
been aided by computing’s rapid progression over recent decades, are now
reigniting this opportunity to bring neural computation insights into broader
computing applications. As we understand more about the brain, our ability to
motivate new computing paradigms will continue to progress. These new
approaches to computing, which we are already seeing in techniques such as deep
learning, will themselves improve our ability to learn about the brain and
accordingly can be projected to give rise to even further insights. Such a
positive feedback has the potential to change the complexion of how computing
sciences and neurosciences interact, and suggests that the next form of
exponential scaling in computing may emerge from our progressive understanding
of the brain.
Nadav Cohen, Or Sharir, Ronen Tamari, David Yakira, Yoav Levine, Amnon Shashua
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
The driving force behind convolutional networks, the most successful deep
learning architecture to date, is their expressive power. Despite its wide
acceptance and vast empirical evidence, formal analyses supporting this belief
are scarce. The primary notions for formally reasoning about expressiveness are
efficiency and inductive bias. Efficiency refers to the ability of a network
architecture to realize functions that require an alternative architecture to
be much larger. Inductive bias refers to the prioritization of some functions
over others given prior knowledge regarding a task at hand. In this paper we
provide a high-level overview of a series of works written by the authors
that, through an equivalence to hierarchical tensor decompositions, analyze the
expressive efficiency and inductive bias of various architectural features in
convolutional networks (depth, width, convolution strides and more). The
results presented shed light on the demonstrated effectiveness of convolutional
networks, and in addition, provide new tools for network design.
Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald M. Summers
Comments: CVPR 2017 spotlight
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
The chest X-ray is one of the most commonly accessible radiological
examinations for screening and diagnosis of many lung diseases. A tremendous
number of X-ray imaging studies accompanied by radiological reports are
accumulated and stored in many modern hospitals’ Picture Archiving and
Communication Systems (PACS). On the other hand, it remains an open question
how this type of hospital-scale knowledge database, containing invaluable but
loosely labeled imaging informatics, can be used to facilitate the data-hungry
deep learning paradigms in building truly large-scale high precision
computer-aided diagnosis (CAD) systems.
In this paper, we present a new chest X-ray database, namely “ChestX-ray8”,
which comprises 108,948 frontal-view X-ray images of 32,717 unique patients
with eight disease labels text-mined from the associated radiological reports
using natural language processing (each image can have multiple labels).
Importantly, we demonstrate that these commonly occurring thoracic
diseases can be detected and even spatially-located via a unified
weakly-supervised multi-label image classification and disease localization
framework, which is validated using our proposed dataset. Although the initial
quantitative results are promising as reported, deep convolutional neural
network based “reading chest X-rays” (i.e., recognizing and locating the common
disease patterns trained with only image-level labels) remains a strenuous task
for fully-automated high precision CAD systems.
Minne Li, Zhaoning Zhang, Hao Yu, Xinyuan Chen, Dongsheng Li
Comments: 9 pages, 4 figures, submitted to BMVC 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
One of the major challenges in object detection is to propose detectors with
highly accurate localization of objects. The online sampling of high-loss
region proposals (hard examples) has made training of region-based deep
convolutional network detectors effective and efficient. In this paper, we
present the Stratified Online Hard Example Mining (S-OHEM) algorithm for
training detectors with higher efficiency and accuracy. S-OHEM exploits OHEM
with stratified sampling, a widely adopted sampling technique. OHEM uses the
multitask loss with equal weights across all loss types (e.g., classification
and localization, rigid and non-rigid categories) and ignores the influence of
the different loss distributions throughout the training process, which we
found essential to the training efficacy. By maintaining a sampling
distribution according to this influence during hard example mining, we can
enhance the performance of object detectors. Based on this, S-OHEM samples the
training examples according to this distribution before feeding them to the
backpropagation process. We show through systematic experiments that S-OHEM
yields an average precision (AP) improvement of 0.5% on rigid categories of
PASCAL VOC 2007 for both IoU thresholds of 0.6 and 0.7. For KITTI 2012, both
results on the same metric are 1.6%. Regarding the mean average precision
(mAP), a relative increase of 0.3% and 0.5% (1% and 0.5%) is observed for VOC07
(KITTI12) using the same set of IoU thresholds. Also, S-OHEM is easy to
integrate with existing region-based detectors and works with
post-recognition-level regressors.
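To make the stratified sampling idea concrete, the following is a minimal
Python/NumPy sketch, not the authors' implementation: proposals are kept in
per-loss-type strata and each stratum contributes a fixed share of the mined
batch. The 50/50 split between classification and localization losses is an
illustrative assumption.

    import numpy as np

    def s_ohem_sample(cls_losses, loc_losses, batch_size, cls_fraction=0.5):
        # Hedged sketch of stratified online hard example mining: instead of
        # ranking proposals by one combined loss (plain OHEM), draw a fixed
        # fraction of the batch from the hardest proposals of each loss type.
        n_cls = int(batch_size * cls_fraction)      # classification stratum
        n_loc = batch_size - n_cls                  # localization stratum
        hard_cls = np.argsort(cls_losses)[::-1][:n_cls]
        hard_loc = np.argsort(loc_losses)[::-1][:n_loc]
        # Overlapping picks are merged, so the mined set may be slightly
        # smaller than batch_size; a real implementation would refill it.
        return np.unique(np.concatenate([hard_cls, hard_loc]))

The mined indices would then be fed to backpropagation in place of a uniformly
sampled batch.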
James Thewlis, Hakan Bilen, Andrea Vedaldi
Comments: 17 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Automatically learning the structure of object categories remains an
important open problem in computer vision. We propose a novel unsupervised
approach that can discover and learn to detect landmarks in object categories,
thus characterizing their structure. Our approach is based on factorizing image
deformations, as induced by a viewpoint change or an object articulation, by
learning a deep neural network that detects landmarks compatible with such
visual effects. We show that, by requiring the same neural network to be
applicable to different object instances, our method naturally induces
meaningful correspondences between different object instances in a category. We
assess the method qualitatively on a variety of object types, natural and
man-made. We also show that our unsupervised landmarks are highly predictive of
manually annotated landmarks on face benchmark datasets, and can be used to
regress them with a high degree of accuracy.
Noureldien Hussein, Efstratios Gavves, Arnold W.M. Smeulders
Comments: IEEE CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Event detection in unconstrained videos is conceived as a content-based video
retrieval with two modalities: textual and visual. Given a text describing a
novel event, the goal is to rank related videos accordingly. This task is
zero-exemplar: no video examples of the novel event are given.
Related works train a bank of concept detectors on external data sources.
These detectors predict confidence scores for test videos, which are ranked and
retrieved accordingly. In contrast, we learn a joint space in which the visual
and textual representations are embedded. The space casts a novel event as a
probability of pre-defined events. Also, it learns to measure the distance
between an event and its related videos.
Our model is trained end-to-end on the publicly available EventNet dataset.
When applied to the TRECVID Multimedia Event Detection dataset, it outperforms the
state-of-the-art by a considerable margin.
Fuqing Zhu, Xiangwei Kong, Liang Zheng, Haiyan Fu, Qi Tian
Comments: 12 pages, 4 figures. IEEE Transactions on Image Processing, 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Large scale is a trend in person re-identification (re-id), so it is important
that real-time search be performed in a large gallery. While previous methods
mostly focus on discriminative learning, this paper attempts to integrate deep
learning and hashing into one framework to evaluate the efficiency and accuracy
of large-scale person re-id. We integrate spatial
information for discriminative visual representation by partitioning the
pedestrian image into horizontal parts. Specifically, Part-based Deep Hashing
(PDH) is proposed, in which batches of triplet samples are employed as the
input of the deep hashing architecture. Each triplet sample contains two
pedestrian images (or parts) with the same identity and one pedestrian image
(or part) of a different identity. A triplet loss function is employed with the
constraint that the Hamming distance between pedestrian images (or parts) with
the same identity be smaller than that between images with different
identities. In the
experiment, we show that the proposed Part-based Deep Hashing method yields
very competitive re-id accuracy on the large-scale Market-1501 and
Market-1501+500K datasets.
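As an illustration of the triplet constraint described above, here is a minimal
Python/NumPy sketch, under the common assumption that binary codes are relaxed
to real values during training so that a squared Euclidean distance stands in
for the Hamming distance; the margin value is illustrative.

    import numpy as np

    def pdh_style_triplet_loss(anchor, positive, negative, margin=1.0):
        # anchor/positive share an identity; negative does not. Same-identity
        # codes must be closer than different-identity codes by a margin.
        d_pos = np.sum((anchor - positive) ** 2, axis=-1)
        d_neg = np.sum((anchor - negative) ** 2, axis=-1)
        return np.maximum(0.0, d_pos - d_neg + margin).mean()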
Antonio D'Innocente, Fabio Maria Carlucci, Mirco Colosi, Barbara Caputo
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Despite the impressive progress brought by deep networks in visual object
recognition, robot vision is still far from being a solved problem. The most
successful convolutional architectures are developed starting from ImageNet, a
large scale collection of images of object categories downloaded from the Web.
These images are very different from the situated and embodied visual
experience of robots deployed in unconstrained settings. To reduce the gap
between these two visual experiences, this paper proposes a simple yet
effective data augmentation layer that zooms on the object of interest and
simulates the object detection outcome of a robot vision system. The layer,
which can be used with any convolutional deep architecture, leads to an
increase in object recognition performance of up to 7%, in experiments
performed over three different benchmark databases. Upon acceptance of the
paper, our robot data augmentation layer will be made publicly available.
Jiyang Gao, Chen Sun, Zhenheng Yang, Ram Nevatia
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper focuses on temporal localization of actions in untrimmed videos.
Existing methods typically train classifiers for a pre-defined list of actions
and apply them in a sliding window fashion. However, activities in the wild
consist of a wide combination of actors, actions and objects; it is difficult
to design a proper activity list that meets users’ needs. We propose to
localize activities by natural language queries. Temporal Activity Localization
via Language (TALL) is challenging as it requires: (1) suitable design of text
and video representations to allow cross-modal matching of actions and language
queries; (2) ability to locate actions accurately given features from sliding
windows of limited granularity. We propose a novel Cross-modal Temporal
Regression Localizer (CTRL) to jointly model text query and video clips, output
alignment scores and action boundary regression results for candidate clips.
For evaluation, we adopt the TaCoS dataset, and build a new dataset for this
task on top of Charades by adding sentence temporal annotations, called
Charades-STA. We also build complex sentence queries in Charades-STA for
testing.
Experimental results show that CTRL outperforms previous methods significantly
on both datasets.
Agrim Gupta, Justin Johnson, Alexandre Alahi, Li Fei-Fei
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recent progress in style transfer on images has focused on improving the
quality of stylized images and speed of methods. However, real-time methods are
highly unstable resulting in visible flickering when applied to videos. In this
work we characterize the instability of these methods by examining the solution
set of the style transfer objective. We show that the trace of the Gram matrix
representing style is inversely related to the stability of the method. Then,
we present a recurrent convolutional network for real-time video style transfer
which incorporates a temporal consistency loss and overcomes the instability of
prior methods. Our networks can be applied at any resolution, do not require
optical flow at test time, and produce high quality, temporally consistent
stylized videos in real-time.
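To illustrate the role of a temporal consistency loss, here is a heavily
simplified Python/NumPy sketch: it penalizes change between consecutive
stylized frames directly, whereas a full formulation would first warp the
previous frame with optical flow and mask occluded pixels.

    import numpy as np

    def temporal_consistency_loss(stylized_t, stylized_prev, weight=1.0):
        # Penalize frame-to-frame change in the stylized output; in practice
        # stylized_prev would be flow-warped and occlusions masked out.
        return weight * np.mean((stylized_t - stylized_prev) ** 2)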
Katerina Fragkiadaki, Jonathan Huang, Alex Alemi, Sudheendra Vijayanarasimhan, Susanna Ricco, Rahul Sukthankar
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Given a visual history, multiple future outcomes for a video scene are
equally probable; in other words, the distribution of future outcomes has
multiple modes. Multimodality is notoriously hard to handle by standard
regressors or classifiers: the former regress to the mean and the latter
discretize a continuous high dimensional output space. In this work, we present
stochastic neural network architectures that handle such multimodality through
stochasticity: future trajectories of objects, body joints or frames are
represented as deep, non-linear transformations of random (as opposed to
deterministic) variables. Such random variables are sampled from simple
Gaussian distributions whose means and variances are parametrized by the output
of convolutional encoders over the visual history. We introduce novel
convolutional architectures for predicting future body joint trajectories that
outperform fully connected alternatives \cite{DBLP:journals/corr/WalkerDGH16}.
We introduce stochastic spatial transformers through optical flow warping for
predicting future frames, which outperform their deterministic equivalents
\cite{DBLP:journals/corr/PatrauceanHC15}. Training stochastic networks involves
an intractable marginalization over stochastic variables. We compare various
training schemes that handle such marginalization through (a) straightforward
sampling from the prior, (b) conditional variational autoencoders
\cite{NIPS2015_5775,DBLP:journals/corr/WalkerDGH16}, and (c) a proposed
K-best-sample loss that penalizes the best prediction under a fixed “prediction
budget”. We show experimental results on object trajectory prediction, human
body joint trajectory prediction and video prediction under varying future
uncertainty, validating quantitatively and qualitatively our architectural
choices and training schemes.
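The K-best-sample loss lends itself to a very short sketch. The following
Python/NumPy version is a hedged illustration, not the paper's exact objective:
K stochastic forward passes are scored against the ground truth and only the
best sample is penalized, leaving the remaining samples free to cover other
modes of the future.

    import numpy as np

    def k_best_sample_loss(predictions, target):
        # predictions: (K, ...) stochastic samples; target: ground truth.
        reduce_axes = tuple(range(1, predictions.ndim))
        per_sample = np.mean((predictions - target) ** 2, axis=reduce_axes)
        return per_sample.min()  # gradient flows only through the best sample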
Teresa Heiss, Hubert Wagner
Subjects: Computer Vision and Pattern Recognition (cs.CV); Algebraic Topology (math.AT)
We present an efficient algorithm to compute Euler characteristic curves of
grayscale images of arbitrary dimension. In various applications the Euler
characteristic curve is used as a descriptor of an image.
Our algorithm is the first streaming algorithm for Euler characteristic
curves. The usage of streaming removes the necessity to store the entire image
in RAM. Experiments show that our implementation handles terabyte scale images
on commodity hardware. Due to lock-free parallelism, it scales well with the
number of processor cores.
Additionally, we put the concept of the Euler characteristic curve in the
wider context of computational topology. In particular, we explain the
connection with persistence diagrams.
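For readers unfamiliar with the descriptor, the following non-streaming
Python/NumPy reference sketch shows the quantity being computed in the 2D case:
for each threshold, the sublevel set of the image is treated as a cubical
complex and its Euler characteristic is V - E + F. The paper's contribution, a
streaming, lock-free parallel algorithm for arbitrary dimension, is not
reproduced here.

    import numpy as np

    def euler_characteristic_curve(img, thresholds):
        # img: 2D grayscale array. Pixels are vertices, adjacent active
        # pixels span edges, and 2x2 active blocks span square faces.
        curve = []
        for t in thresholds:
            b = img <= t                      # binary sublevel set
            v = int(b.sum())
            e = int((b[:, :-1] & b[:, 1:]).sum()) \
                + int((b[:-1, :] & b[1:, :]).sum())
            f = int((b[:-1, :-1] & b[:-1, 1:] & b[1:, :-1] & b[1:, 1:]).sum())
            curve.append(v - e + f)
        return np.array(curve)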
Lovedeep Gondara
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Machine learning models, especially those based on deep learning, are used in
everyday applications ranging from self-driving cars to medical diagnostics.
However, it is easy to trick such models using adversarial samples that are
indistinguishable from real samples to the human eye; such samples can bias
models towards incorrect classification. The impact of adversarial samples is
far-reaching, and their efficient detection remains an open problem. In this
paper we propose to use direct density ratio estimation to detect adversarial
samples, and we empirically show that adversarial samples have different
underlying probability densities compared to real samples. Our proposed method works well
with colored and grayscale images, and with different adversarial sample
generation methods.
Seyed Mohammad Mahdi Alavi, Yunyan Zhang
Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV)
Following the presentation and proof of the hypothesis that image features
are particularly perceived at points where the Fourier components are maximally
in phase, the concept of phase congruency (PC) is introduced. Subsequently, a
two-dimensional multi-scale phase congruency (2D-MSPC) is developed, which has
been an important tool for the detection and evaluation of image features.
However, the 2D-MSPC requires many parameters to be appropriately tuned for
optimal image feature detection. In this paper, we defined a criterion for parameter
optimization of the 2D-MSPC, which is a function of its maximum and minimum
moments. We formulated the problem in various optimal and suboptimal
frameworks, and discussed the conditions and features of the suboptimal
solutions. The effectiveness of the proposed method was verified through
several examples, ranging from natural objects to medical images from patients
with a neurological disease, multiple sclerosis.
Jun Li, Kai Xu, Siddhartha Chaudhuri, Ersin Yumer, Hao Zhang, Leonidas Guibas
Comments: Corresponding author: Kai Xu (kevin.kai.xu@gmail.com)
Journal-ref: ACM Transactions on Graphics (SIGGRAPH 2017) 36, 4, Article 52
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
We introduce a novel neural network architecture for encoding and synthesis
of 3D shapes, particularly their structures. Our key insight is that 3D shapes
are effectively characterized by their hierarchical organization of parts,
which reflects fundamental intra-shape relationships such as adjacency and
symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a
flat, unlabeled, arbitrary part layout to a compact code. The code effectively
captures hierarchical structures of man-made 3D objects of varying structural
complexities despite being fixed-dimensional: an associated decoder maps a code
back to a full hierarchy. The learned bidirectional mapping is further tuned
using an adversarial setup to yield a generative model of plausible structures,
from which novel structures can be sampled. Finally, our structure synthesis
framework is augmented by a second trained module that produces fine-grained
part geometry, conditioned on global and local structural context, leading to a
full generative pipeline for 3D shapes. We demonstrate that without
supervision, our network learns meaningful structural hierarchies adhering to
perceptual grouping principles, produces compact codes which enable
applications such as shape classification and partial matching, and supports
shape synthesis and interpolation with significant variations in topology and
geometry.
Cheng-Hao Cai
Comments: 12 pages, 5 figures
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Logic in Computer Science (cs.LO)
This paper introduces an SLD-resolution technique based on deep learning.
This technique enables neural networks to learn from old and successful
resolution processes and to use learnt experiences to guide new resolution
processes. An implementation of this technique is named SLDR-DL. It includes a
Prolog library of deep feedforward neural networks and some essential functions
of resolution. In the SLDR-DL framework, users can define logical rules in the
form of definite clauses and teach neural networks to use the rules in
reasoning processes.
Nikos Katzouris, Alexander Artikis, Georgios Paliouras
Subjects: Artificial Intelligence (cs.AI)
Logic-based event recognition systems infer occurrences of events in time
using a set of event definitions in the form of first-order rules. The Event
Calculus is a temporal logic that has been used as a basis in event recognition
applications, providing, among others, direct connections to machine learning,
via Inductive Logic Programming (ILP). OLED is a recently proposed ILP system
that learns event definitions in the form of Event Calculus theories, in a
single pass over a data stream. In this work we present a version of OLED that
allows for distributed, online learning. We evaluate our approach on a
benchmark activity recognition dataset and show that we can significantly
reduce training times, exchanging minimal information between processing nodes.
Neil D. Lawrence
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Learning (cs.LG)
Application of models to data is fraught. Data-generating collaborators often
only have a very basic understanding of the complications of collating,
processing and curating data. Challenges include: poor data collection
practices, missing values, inconvenient storage mechanisms, intellectual
property, security and privacy. All these aspects obstruct the sharing and
interconnection of data, and the eventual interpretation of data through
machine learning or other approaches. In project reporting, a major challenge
is in encapsulating these problems and enabling goals to be built around the
processing of data. Project overruns can occur due to failure to account for
the amount of time required to curate and collate. But to understand these
failures we need to have a common language for assessing the readiness of a
particular data set. This position paper proposes the use of data readiness
levels: it gives a rough outline of three stages of data preparedness and
speculates on how formalisation of these levels into a common language for data
readiness could facilitate project management.
Michel Besserve, Naji Shajarisales, Bernhard Schölkopf, Dominik Janzing
Comments: 16 pages, 6 figures
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Statistics Theory (math.ST)
The postulate of independence of cause and mechanism (ICM) has recently led
to several new causal discovery algorithms. The interpretation of independence
and the way it is utilized, however, varies across these methods. Our aim in
this paper is to propose a group theoretic framework for ICM to unify and
generalize these approaches. In our setting, the cause-mechanism relationship
is assessed by comparing it against a null hypothesis through the application
of random generic group transformations. We show that the group theoretic view
provides a very general tool to study the structure of data generating
mechanisms with direct applications to machine learning.
Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanaphongs, Enrico Bertini
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI)
Human-in-the-loop data analysis applications necessitate greater transparency
in machine learning models for experts to understand and trust their decisions.
To this end, we propose a visual analytics workflow to help data scientists and
domain experts explore, diagnose, and understand the decisions made by a binary
classifier. The approach leverages “instance-level explanations”, measures of
local feature relevance that explain single instances, and uses them to build a
set of visual representations that guide the users in their investigation. The
workflow is based on three main visual representations and steps: one based on
aggregate statistics to see how data distributes across correct / incorrect
decisions; one based on explanations to understand which features are used to
make these decisions; and one based on raw data, to derive insights on
potential root causes for the observed patterns. The work is validated through
a long-term collaboration with a group of machine learning and healthcare
professionals who used our method to make sense of machine learning models they
developed. The case study from this collaboration demonstrates that the
proposed method helps experts derive useful knowledge about the model and the
phenomena it describes and also generate useful hypotheses on how a model can
be improved.
Avikalp Srivastava, Madhav Datt, Jaikrishna Chaparala, Shubham Mangla, Priyadarshi Patnaik
Comments: Accepted to SIGIR 2017
Subjects: Information Retrieval (cs.IR)
Corporations spend millions of dollars on developing creative image-based
promotional content to advertise to their user-base on platforms like Twitter.
Our paper is an initial study, where we propose a novel method to evaluate and
improve outreach of promotional images from corporations on Twitter, based
purely on their describable aesthetic attributes. Existing works in aesthetic
based image analysis exclusively focus on the attributes of digital
photographs, and are not applicable to advertisements due to the influences of
inherent content and context based biases on outreach.
Our paper identifies broad categories of biases affecting such images,
describes a method for normalization to eliminate effects of those biases and
score images based on their outreach, and examines the effects of certain
handcrafted describable aesthetic features on image outreach. Optimizing on the
describable aesthetic features resulting from this research is a simple method
for corporations to complement their existing marketing strategy to gain
significant improvement in user engagement on social media for promotional
images.
ThaiBinh Nguyen, Kenro Aihara, Atsuhiro Takasu
Subjects: Information Retrieval (cs.IR)
Collaborative filtering (CF) is one of the most efficient approaches for
recommender systems. Typically, CF-based algorithms analyze users' preferences
and items' attributes using one of two types of feedback: explicit feedback
(e.g., ratings given to items by users, like/dislike) or implicit feedback
(e.g., clicks, views, purchases). Explicit feedback is reliable but extremely
sparse, whereas implicit feedback is abundant but not reliable. To overcome the
sparsity of explicit feedback, in this paper we propose a model that
efficiently combines explicit and implicit feedback in a unified model for
rating prediction. This model is a combination of matrix factorization and
item embedding, a concept similar to word embedding in natural language
processing. The experiments on three real datasets (Movielens 1M, Movielens
20M, and Bookcrossing) demonstrate that our method can efficiently predict
ratings for items even if the ratings data is not available for them. The
experimental results also show that our method outperforms competing methods on
the rating prediction task, in general as well as for users and items which
have few ratings.
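As a hedged sketch of how such a combination can be set up, the objective below
(Python/NumPy, illustrative only) ties a standard matrix factorization term on
observed explicit ratings to an item embedding term driven by an item
co-occurrence matrix built from implicit feedback; the shared item factors V
couple the two. The weighting alpha, the regularizer, and the co-occurrence
statistic are assumptions, not the paper's exact formulation.

    import numpy as np

    def joint_mf_embedding_loss(U, V, R, mask, cooc, alpha=0.1, lam=0.01):
        # U: (n_users, k) user factors; V: (n_items, k) item factors.
        # R, mask: explicit ratings and their observation indicator.
        # cooc: (n_items, n_items) co-occurrence statistic from implicit data.
        mf_term = np.sum(mask * (R - U @ V.T) ** 2)   # explicit feedback fit
        emb_term = np.sum((cooc - V @ V.T) ** 2)      # implicit feedback fit
        reg = lam * (np.sum(U ** 2) + np.sum(V ** 2))
        return mf_term + alpha * emb_term + reg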
Hien To, Sumeet Agrawal, Seon Ho Kim, Cyrus Shahabi
Subjects: Information Retrieval (cs.IR); Learning (cs.LG)
Social media such as tweets are emerging as platforms contributing to
situational awareness during disasters. Information shared on Twitter by both
affected population (e.g., requesting assistance, warning) and those outside
the impact zone (e.g., providing assistance) would help first responders,
decision makers, and the public to understand the situation first-hand.
Effective use of such information requires timely selection and analysis of
tweets that are relevant to a particular disaster. Even though abundant tweets
are promising as a data source, it is challenging to automatically identify
relevant messages since tweets are short and unstructured, resulting in
unsatisfactory classification performance of conventional learning-based
approaches. Thus, we propose a simple yet effective algorithm to identify
relevant messages based on matching keywords and hashtags, and provide a
comparison between matching-based and learning-based approaches. To evaluate
the two approaches, we put them into a framework specifically proposed for
analyzing disaster-related tweets. Analysis results on eleven datasets with
various disaster types show that our technique provides relevant tweets of
higher quality and more interpretable sentiment analysis results than the
learning-based approach.
Tesfamariam M. Abuhay, Sergey V. Kovalchuk, Klavdiya O. Bochenina, George Kampis, Valeria V. Krzhizhanovskaya, Michael H. Lees
Comments: Accepted by International Conference on Computational Science (ICCS) 2017 which will be held in Zurich, Switzerland from June 11-June 14
Subjects: Digital Libraries (cs.DL); Computation and Language (cs.CL); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)
This paper presents results of topic modeling and network models of topics
using the International Conference on Computational Science corpus, which
contains domain-specific (computational science) papers over sixteen years (a
total of 5695 papers). We discuss topical structures of International
Conference on Computational Science, how these topics evolve over time in
response to the topicality of various problems, technologies and methods, and
how all these topics relate to one another. This analysis illustrates
multidisciplinary research and collaborations among scientific communities, by
constructing static and dynamic networks from the topic modeling results and
the keywords of authors. The results of this study give insights about the past
and future trends of core discussion topics in computational science. We used
the Non-negative Matrix Factorization topic modeling algorithm to discover
topics, and we labeled and grouped the results hierarchically.
Serkan Ozen, Burcu Can
Comments: 10 pages, accepted and presented at the CICLing 2017 (18th International Conference on Intelligent Text Processing and Computational Linguistics)
Subjects: Computation and Language (cs.CL)
In this paper, we build morphological chains for agglutinative languages by
using a log-linear model for the morphological segmentation task. The model is
based on the unsupervised morphological segmentation system called
MorphoChains. We extend the MorphoChains log-linear model by expanding the
candidate space recursively to cover more split points for agglutinative
languages such as Turkish, whereas in the original model candidates are
generated by considering only binary segmentation of each word. The results
show that we improve the state-of-the-art Turkish scores by 12%, achieving an
F-measure of 72%, and improve the English scores by 3%, achieving an F-measure
of 74%.
Eventually, the system outperforms both MorphoChains and other well-known
unsupervised morphological segmentation systems. The results indicate that
candidate generation plays an important role in such an unsupervised log-linear
model that is learned using contrastive estimation with negative samples.
Chao Li, Xiaokong Ma, Bing Jiang, Xiangang Li, Xuewei Zhang, Xiao Liu, Ying Cao, Ajay Kannan, Zhenyao Zhu
Subjects: Computation and Language (cs.CL)
We present Deep Speaker, a neural speaker embedding system that maps
utterances to a hypersphere where speaker similarity is measured by cosine
similarity. The embeddings generated by Deep Speaker can be used for many
tasks, including speaker identification, verification, and clustering. We
experiment with ResCNN and GRU architectures to extract the acoustic features,
then mean pool to produce utterance-level speaker embeddings, and train using
triplet loss based on cosine similarity. Experiments on three distinct datasets
suggest that Deep Speaker outperforms a DNN-based i-vector baseline. For
example, Deep Speaker reduces the verification equal error rate by 50%
(relatively) and improves the identification accuracy by 60% (relatively) on a
text-independent dataset. We also present results that suggest adapting from a
model trained with Mandarin can improve accuracy for English speaker
recognition.
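The training signal can be illustrated with a short Python/NumPy sketch of a
cosine-similarity triplet loss; the margin is an assumed value and the code is
a simplification of the system described above.

    import numpy as np

    def cosine_triplet_loss(anchor, positive, negative, margin=0.1):
        # Embeddings are L2-normalized onto the unit hypersphere, so the
        # dot product equals cosine similarity. The anchor must be more
        # similar to a same-speaker utterance than to a different-speaker
        # utterance, by a margin.
        def unit(x):
            return x / np.linalg.norm(x, axis=-1, keepdims=True)
        a, p, n = unit(anchor), unit(positive), unit(negative)
        sim_pos = np.sum(a * p, axis=-1)
        sim_neg = np.sum(a * n, axis=-1)
        return np.maximum(0.0, sim_neg - sim_pos + margin).mean()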
Sebastian Brarda, Philip Yeres, Samuel R. Bowman
Comments: 4 pages
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
In this paper we propose a neural network model with a novel Sequential
Attention layer that extends soft attention by assigning weights to words in an
input sequence in a way that takes into account not just how well that word
matches a query, but how well surrounding words match. We evaluate this
approach on the task of reading comprehension (Who did What and CNN datasets)
and show that it dramatically improves a strong baseline like the Stanford
Reader. The resulting model is competitive with the state of the art.
Minglan Li, Yang Gao, Hui Wen, Yang Du, Haijing Liu, Hao Wang
Comments: 6 pages, 3 figures, submitted to IEEE SMC 2017
Subjects: Computation and Language (cs.CL)
Argument Component Boundary Detection (ACBD) is an important sub-task in
argumentation mining; it aims at identifying the word sequences that constitute
argument components, and is usually considered as the first sub-task in the
argumentation mining pipeline. Existing ACBD methods heavily depend on
task-specific knowledge, and require considerable human efforts on
feature-engineering. To tackle these problems, in this work, we formulate ACBD
as a sequence labeling problem and propose a variety of Recurrent Neural
Network (RNN) based methods, which do not use domain specific or handcrafted
features beyond the relative position of the sentence in the document. In
particular, we propose a novel joint RNN model that can predict whether
sentences are argumentative or not, and use the predicted results to more
precisely detect the argument component boundaries. We evaluate our techniques
on two corpora from two different genres; results suggest that our joint RNN
model obtains state-of-the-art performance on both datasets.
Mengxue Li, Shiqiang Geng, Yang Gao, Haijing Liu, Hao Wang
Comments: 6 pages, 3 figures. This article has been submitted to “The 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC2017)”
Subjects: Computation and Language (cs.CL)
Argumentation mining aims at automatically extracting the premises-claim
discourse structures in natural language texts. There is a great demand for
argumentation corpora for customer reviews. However, due to the controversial
nature of the argumentation annotation task, there exist very few large-scale
argumentation corpora for customer reviews. In this work, we use crowdsourcing
to collect argumentation annotations in Chinese hotel reviews. As the first
Chinese argumentation dataset, our corpus includes 4814 argument component
annotations and 411 argument relation annotations, and its annotation quality
is comparable to that of widely used argumentation corpora in other languages.
Ruochen Xu, Yiming Yang
Comments: Accepted at ACL 2017
Subjects: Computation and Language (cs.CL)
Cross-lingual text classification (CLTC) is the task of classifying documents
written in different languages into the same taxonomy of categories. This paper
presents a novel approach to CLTC that builds on model distillation, which
adapts and extends a framework originally proposed for model compression. Using
soft probabilistic predictions for the documents in a label-rich language as
the (induced) supervisory labels in a parallel corpus of documents, we train
classifiers successfully for new languages in which labeled training data are
not available. An adversarial feature adaptation technique is also applied
during the model training to reduce distribution mismatch. We conducted
experiments on two benchmark CLTC datasets, treating English as the source
language and German, French, Japanese and Chinese as the unlabeled target
languages. The proposed approach showed advantageous or comparable performance
relative to other state-of-the-art methods.
Hussam Hamdan
Subjects: Computation and Language (cs.CL)
This paper presents Senti17 system which uses ten convolutional neural
networks (ConvNet) to assign a sentiment label to a tweet. The network consists
of a convolutional layer followed by a fully-connected layer and a Softmax on
top. Ten instances of this network are initialized with the same word
embeddings as inputs but with different initializations for the network
weights. We combine the results of all instances by selecting the sentiment
label given by the majority of the ten voters. This system was ranked fourth in
SemEval-2017 Task 4 out of 38 systems, with a score of 67.4%.
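The ensembling step reduces to a majority vote, sketched below in Python; tie
handling (first most common label wins) is an assumption of this sketch.

    from collections import Counter

    def majority_vote(labels):
        # labels: the sentiment labels emitted by the ten ConvNet voters
        # for one tweet, e.g. ["pos", "neg", "pos", "pos", "neu"] -> "pos".
        return Counter(labels).most_common(1)[0][0]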
Xingdi Yuan, Tong Wang, Caglar Gulcehre, Alessandro Sordoni, Philip Bachman, Sandeep Subramanian, Saizheng Zhang, Adam Trischler
Subjects: Computation and Language (cs.CL)
We propose a recurrent neural model that generates natural-language questions
from documents, conditioned on answers. We show how to train the model using a
combination of supervised and reinforcement learning. After teacher forcing for
standard maximum likelihood training, we fine-tune the model using policy
gradient techniques to maximize several rewards that measure question quality.
Most notably, one of these rewards is the performance of a question-answering
system. We motivate question generation as a means to improve the performance
of question answering systems. Our model is trained and evaluated on the recent
question-answering dataset SQuAD.
Jacob Devlin
Subjects: Computation and Language (cs.CL)
Attentional sequence-to-sequence models have become the new standard for
machine translation, but one challenge of such models is a significant increase
in training and decoding cost compared to phrase-based systems. Here, we focus
on efficient decoding, with a goal of achieving accuracy close to the
state-of-the-art in neural machine translation (NMT), while achieving CPU
decoding speed/throughput close to that of a phrasal decoder.
We approach this problem from two angles: First, we describe several
techniques for speeding up an NMT beam search decoder, which obtain a 4.4x
speedup over a very efficient baseline decoder without changing the decoder
output. Second, we propose a simple but powerful network architecture which
uses an RNN (GRU/LSTM) layer at the bottom, followed by a series of stacked
fully-connected layers applied at every timestep. This architecture achieves
similar accuracy to a deep recurrent model, at a small fraction of the training
and decoding cost. By combining these techniques, our best system achieves a
very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014,
while decoding at 100 words/sec on single-threaded CPU. We believe this is the
best published accuracy/speed trade-off of an NMT system.
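The proposed decoder layer stack can be sketched in a few lines of PyTorch.
This is an illustrative reconstruction from the description above, with assumed
layer sizes and depth, not the paper's configuration.

    import torch.nn as nn

    class RnnPlusStackedFC(nn.Module):
        # A single recurrent layer at the bottom, followed by a stack of
        # fully-connected layers applied independently at every timestep.
        def __init__(self, input_dim=512, hidden_dim=512, n_fc=4, vocab=32000):
            super().__init__()
            self.rnn = nn.GRU(input_dim, hidden_dim, batch_first=True)
            layers = []
            for _ in range(n_fc):
                layers += [nn.Linear(hidden_dim, hidden_dim), nn.ReLU()]
            self.fc_stack = nn.Sequential(*layers)
            self.out = nn.Linear(hidden_dim, vocab)

        def forward(self, x):                  # x: (batch, time, input_dim)
            h, _ = self.rnn(x)                 # (batch, time, hidden_dim)
            return self.out(self.fc_stack(h))  # FC stack acts on last dim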
Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald M. Summers
Comments: CVPR 2017 spotlight
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
The chest X-ray is one of the most commonly accessible radiological
examinations for screening and diagnosis of many lung diseases. A tremendous
number of X-ray imaging studies accompanied by radiological reports are
accumulated and stored in many modern hospitals’ Picture Archiving and
Communication Systems (PACS). On the other hand, it remains an open question
how this type of hospital-scale knowledge database, containing invaluable but
loosely labeled imaging informatics, can be used to facilitate the data-hungry
deep learning paradigms in building truly large-scale high precision
computer-aided diagnosis (CAD) systems.
In this paper, we present a new chest X-ray database, namely “ChestX-ray8”,
which comprises 108,948 frontal-view X-ray images of 32,717 unique patients
with eight disease labels text-mined from the associated radiological reports
using natural language processing (each image can have multiple labels).
Importantly, we demonstrate that these commonly occurring thoracic
diseases can be detected and even spatially-located via a unified
weakly-supervised multi-label image classification and disease localization
framework, which is validated using our proposed dataset. Although the initial
quantitative results are promising as reported, deep convolutional neural
network based “reading chest X-rays” (i.e., recognizing and locating the common
disease patterns trained with only image-level labels) remains a strenuous task
for fully-automated high precision CAD systems.
Tesfamariam M. Abuhay, Sergey V. Kovalchuk, Klavdiya O. Bochenina, George Kampis, Valeria V. Krzhizhanovskaya, Michael H. Lees
Comments: Accepted by International Conference on Computational Science (ICCS) 2017 which will be held in Zurich, Switzerland from June 11-June 14
Subjects: Digital Libraries (cs.DL); Computation and Language (cs.CL); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)
This paper presents results of topic modeling and network models of topics
using the International Conference on Computational Science corpus, which
contains domain-specific (computational science) papers over sixteen years (a
total of 5695 papers). We discuss topical structures of International
Conference on Computational Science, how these topics evolve over time in
response to the topicality of various problems, technologies and methods, and
how all these topics relate to one another. This analysis illustrates
multidisciplinary research and collaborations among scientific communities, by
constructing static and dynamic networks from the topic modeling results and
the keywords of authors. The results of this study give insights about the past
and future trends of core discussion topics in computational science. We used
the Non-negative Matrix Factorization topic modeling algorithm to discover
topics, and we labeled and grouped the results hierarchically.
Michael Axtmann, Sascha Witt, Daniel Ferizovic, Peter Sanders
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
We present a sorting algorithm that works in-place, executes in parallel, is
cache-optimal, avoids branch-mispredictions, and performs work O(n log n) for
arbitrary inputs with high probability. We ran extensive experiments and show
that our algorithm scales linearly in the number of cores on various
multi-socket machines with 32 cores. On large inputs, we outperform our closest
in-place competitor by a factor of 2.25 to 2.53 and our closest non-in-place
competitor by a factor of 1.26 to 1.89. Even sequentially executed, we
outperform our closest sequential competitor, BlockQuicksort, by a factor of
1.19 to 1.23 for large inputs.
Karl Bringmann, Sebastian Krinninger
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
We revisit the hardness of approximating the diameter of a network. In the
CONGEST model, \( \tilde{\Omega}(n) \) rounds are necessary to compute the
diameter [Frischknecht et al. SODA'12]. Abboud et al. [DISC 2016] extended this
result to sparse graphs and, at a more fine-grained level, showed that, for any
integer \( 1 \leq \ell \leq \operatorname{polylog}(n) \), distinguishing between
networks of diameter \( 4\ell + 2 \) and \( 6\ell + 1 \) requires
\( \tilde{\Omega}(n) \) rounds. We slightly tighten this result by showing that
even distinguishing between diameter \( 2\ell + 1 \) and \( 3\ell + 1 \)
requires \( \tilde{\Omega}(n) \) rounds. The reduction of Abboud et al. is inspired by
recent conditional lower bounds in the RAM model, where the orthogonal vectors
problem plays a pivotal role. In our new lower bound, we make the connection to
orthogonal vectors explicit, leading to a conceptually more streamlined
exposition. This is suited for teaching both the lower bound in the CONGEST
model and the conditional lower bound in the RAM model.
Francesco Grassi, Andreas Loukas, Nathanaël Perraudin, Benjamin Ricaud
Subjects: Learning (cs.LG)
An emerging way to deal with high-dimensional non-Euclidean data is to assume
that the underlying structure can be captured by a graph. Recently, ideas have
begun to emerge related to the analysis of time-varying graph signals. This
work aims to elevate the notion of joint harmonic analysis to a full-fledged
framework denoted as Time-Vertex Signal Processing, that links together the
time-domain signal processing techniques with the new tools of graph signal
processing. This entails three main contributions: (a) We provide a formal
motivation for harmonic time-vertex analysis as an analysis tool for the state
evolution of simple Partial Differential Equations on graphs. (b) We improve
the accuracy of joint filtering operators by up to two orders of magnitude. (c)
Using our joint filters, we construct time-vertex dictionaries analyzing the
different scales and the local time-frequency content of a signal. The utility
of our tools is illustrated in numerous applications and datasets, such as
dynamic mesh denoising and classification, still-video inpainting, and source
localization in seismic events. Our results suggest that joint analysis of
time-vertex signals can bring benefits to regression and learning.
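The core transform can be written compactly. The sketch below (Python/NumPy) is
a minimal illustration of a joint time-vertex Fourier transform, combining a
graph Fourier transform over vertices with an ordinary DFT over time; it
assumes a symmetric Laplacian and ignores normalization conventions.

    import numpy as np

    def joint_time_vertex_fourier(X, L):
        # X: (n_vertices, n_timesteps) time-varying graph signal.
        # L: (n_vertices, n_vertices) symmetric graph Laplacian.
        _, U = np.linalg.eigh(L)          # Laplacian eigenbasis (GFT matrix)
        X_gft = U.T @ X                   # graph Fourier transform over vertices
        return np.fft.fft(X_gft, axis=1)  # DFT over the time axis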
Nadav Cohen, Or Sharir, Ronen Tamari, David Yakira, Yoav Levine, Amnon Shashua
Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
The driving force behind convolutional networks, the most successful deep
learning architecture to date, is their expressive power. Despite its wide
acceptance and vast empirical evidence, formal analyses supporting this belief
are scarce. The primary notions for formally reasoning about expressiveness are
efficiency and inductive bias. Efficiency refers to the ability of a network
architecture to realize functions that require an alternative architecture to
be much larger. Inductive bias refers to the prioritization of some functions
over others given prior knowledge regarding a task at hand. In this paper we
provide a high-level overview of a series of works written by the authors
that, through an equivalence to hierarchical tensor decompositions, analyze the
expressive efficiency and inductive bias of various architectural features in
convolutional networks (depth, width, convolution strides and more). The
results presented shed light on the demonstrated effectiveness of convolutional
networks, and in addition, provide new tools for network design.
Marek Śmieja, Jacek Tabor
Subjects: Learning (cs.LG)
The Gaussian mixture model is very useful in many practical problems.
Nevertheless, it cannot be directly generalized to non-Euclidean spaces. To
overcome this problem we present a spherical Gaussian-based clustering approach
for partitioning data sets with respect to an arbitrary dissimilarity measure.
The proposed method is a combination of spherical Cross-Entropy Clustering with
a generalized Ward's approach. The algorithm finds the optimal number of
clusters by automatically removing groups which carry no information. Moreover,
it is scale invariant and allows for the formation of spherically shaped
clusters of arbitrary sizes. In order to graphically represent and interpret
the results, the notion of the Voronoi diagram was generalized to non-Euclidean
spaces and applied to the introduced clustering method.
Lovedeep Gondara
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Machine learning models, especially those based on deep learning, are used in
everyday applications ranging from self-driving cars to medical diagnostics.
However, it is easy to trick such models using adversarial samples that are
indistinguishable from real samples to the human eye; such samples can bias
models towards incorrect classification. The impact of adversarial samples is
far-reaching, and their efficient detection remains an open problem. In this
paper we propose to use direct density ratio estimation to detect adversarial
samples, and we empirically show that adversarial samples have different
underlying probability densities compared to real samples. Our proposed method works well
with colored and grayscale images, and with different adversarial sample
generation methods.
Sebastian Brarda, Philip Yeres, Samuel R. Bowman
Comments: 4 pages
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
In this paper we propose a neural network model with a novel Sequential
Attention layer that extends soft attention by assigning weights to words in an
input sequence in a way that takes into account not just how well that word
matches a query, but how well surrounding words match. We evaluate this
approach on the task of reading comprehension (Who did What and CNN datasets)
and show that it dramatically improves a strong baseline like the Stanford
Reader. The resulting model is competitive with the state of the art.
Neil D. Lawrence
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Learning (cs.LG)
Application of models to data is fraught. Data-generating collaborators often
only have a very basic understanding of the complications of collating,
processing and curating data. Challenges include: poor data collection
practices, missing values, inconvenient storage mechanisms, intellectual
property, security and privacy. All these aspects obstruct the sharing and
interconnection of data, and the eventual interpretation of data through
machine learning or other approaches. In project reporting, a major challenge
is in encapsulating these problems and enabling goals to be built around the
processing of data. Project overruns can occur due to failure to account for
the amount of time required to curate and collate. But to understand these
failures we need to have a common language for assessing the readiness of a
particular data set. This position paper proposes the use of data readiness
levels: it gives a rough outline of three stages of data preparedness and
speculates on how formalisation of these levels into a common language for data
readiness could facilitate project management.
Michel Besserve, Naji Shajarisales, Bernhard Schölkopf, Dominik Janzing
Comments: 16 pages, 6 figures
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Statistics Theory (math.ST)
The postulate of independence of cause and mechanism (ICM) has recently led
to several new causal discovery algorithms. The interpretation of independence
and the way it is utilized, however, varies across these methods. Our aim in
this paper is to propose a group theoretic framework for ICM to unify and
generalize these approaches. In our setting, the cause-mechanism relationship
is assessed by comparing it against a null hypothesis through the application
of random generic group transformations. We show that the group theoretic view
provides a very general tool to study the structure of data generating
mechanisms with direct applications to machine learning.
Cheng-Hao Cai
Comments: 12 pages, 5 figures
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Logic in Computer Science (cs.LO)
This paper introduces an SLD-resolution technique based on deep learning.
This technique enables neural networks to learn from old and successful
resolution processes and to use learnt experiences to guide new resolution
processes. An implementation of this technique is named SLDR-DL. It includes a
Prolog library of deep feedforward neural networks and some essential functions
of resolution. In the SLDR-DL framework, users can define logical rules in the
form of definite clauses and teach neural networks to use the rules in
reasoning processes.
Vatsal Shah, Nikhil Rao, Weicong Ding
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
The problem of predicting unobserved entries of a partially observed matrix
has found wide applicability in several areas, such as recommender systems,
computational biology, and computer vision. Many scalable methods with rigorous
theoretical guarantees have been developed for algorithms where the matrix is
factored into low-rank components, and embeddings are learned for the row and
column variables. While there has been recent research on incorporating
explicit side information in the low-rank matrix factorization setting, often
implicit information can be gleaned from the data, via higher order
interactions among variables. In this paper, we design a method to make use of
this implicit information, via random walks on graphs. We show that the problem
we intend to solve can be cast as factoring a nonlinear transform of the
(partially) observed matrix and develop an efficient coordinate descent based
algorithm for the same. Experiments on several datasets show that the method we
propose outperforms vanilla matrix factorization, and also those methods that
use explicitly available side information.
Yu Chen, Mohammed J. Zaki
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Autoencoders have been successful in learning meaningful representations from
image datasets. However, their performance on text datasets has not been widely
studied. Traditional autoencoders tend to learn possibly trivial
representations of text documents due to their confounding properties such as
high-dimensionality, sparsity and power-law word distributions. In this paper,
we propose a novel k-competitive autoencoder, called KATE, for text documents.
Due to the competition between the neurons in the hidden layer, each neuron
becomes specialized in recognizing specific data patterns, and overall the
model can learn meaningful representations of textual data. A comprehensive set
of experiments shows that KATE can learn better representations than traditional
autoencoders including denoising, contractive, variational, and k-sparse
autoencoders. Our model also outperforms deep generative models, probabilistic
topic models, and even word representation models (e.g., Word2Vec) in terms of
several downstream tasks such as document classification, regression, and
retrieval.
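The competition mechanism can be illustrated with a short Python/NumPy sketch
of a k-competitive layer: per example, only the k largest-magnitude activations
survive. KATE additionally reallocates the energy of the losing neurons to the
winners with an amplification factor; that step is omitted here, so this is a
simplification rather than the model itself.

    import numpy as np

    def k_competitive(h, k):
        # h: (batch, n_hidden) activations; keep the k strongest per row.
        out = np.zeros_like(h)
        winners = np.argsort(np.abs(h), axis=-1)[..., -k:]
        np.put_along_axis(out, winners,
                          np.take_along_axis(h, winners, axis=-1), axis=-1)
        return out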
Hien To, Sumeet Agrawal, Seon Ho Kim, Cyrus Shahabi
Subjects: Information Retrieval (cs.IR); Learning (cs.LG)
Social media such as tweets are emerging as platforms contributing to
situational awareness during disasters. Information shared on Twitter by both
affected population (e.g., requesting assistance, warning) and those outside
the impact zone (e.g., providing assistance) would help first responders,
decision makers, and the public to understand the situation first-hand.
Effective use of such information requires timely selection and analysis of
tweets that are relevant to a particular disaster. Even though abundant tweets
are promising as a data source, it is challenging to automatically identify
relevant messages since tweets are short and unstructured, resulting in
unsatisfactory classification performance of conventional learning-based
approaches. Thus, we propose a simple yet effective algorithm to identify
relevant messages based on matching keywords and hashtags, and provide a
comparison between matching-based and learning-based approaches. To evaluate
the two approaches, we put them into a framework specifically proposed for
analyzing disaster-related tweets. Analysis results on eleven datasets with
various disaster types show that our technique provides relevant tweets of
higher quality, and more interpretable results in sentiment analysis tasks,
than the learning-based approach.
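A minimal sketch of the matching-based approach is given below; the curated
keyword and hashtag lists are assumed inputs, and the exact matching rule in
the paper may differ.

```python
def is_relevant(tweet, keywords, hashtags):
    """Flag a tweet as disaster-relevant if it contains any curated
    keyword as a substring or any curated hashtag as a token."""
    text = tweet.lower()
    tokens = set(text.split())
    return (any(kw in text for kw in keywords)
            or any(tag in tokens for tag in hashtags))

# Hypothetical usage:
# is_relevant("Flooding on Main St, need rescue #reliefnow",
#             keywords={"flood", "rescue"}, hashtags={"#reliefnow"})
```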
Amr Abdelaziz, C. Emre Koksal
Comments: Submitted to IEEE CNS 2017
Subjects: Information Theory (cs.IT)
Fundamental limits of covert communication have been studied in the literature
for different models of scalar channels. It was shown that, over \(n\)
independent channel uses, \(\mathcal{O}(\sqrt{n})\) bits can be transmitted
reliably over a public channel while achieving an arbitrarily low probability of
detection (LPD) by other stations. This result is well known as the square-root
law (SRL), and even to achieve this diminishing rate of covert communication,
some form of shared secret is needed between the transmitter and the receiver.
In this paper, we establish the limits of LPD communication over the MIMO AWGN channel.
We define the notion of \(\epsilon\)-probability of detection (\(\epsilon\)-PD)
and provide a formulation to evaluate the maximum achievable rate under the
\(\epsilon\)-PD constraint. Then, we show that the optimal input distribution is
the zero-mean Gaussian distribution. Assuming channel state information (CSI)
about the main channel at the transmitter only, we derive the optimal input
covariance matrix, hence establishing the \(\epsilon\)-PD capacity. We evaluate
\(\epsilon\)-PD rates in the limiting regimes for the number of channel uses
(asymptotic block length) and the number of antennas (massive MIMO). We show
that, while the SRL still holds for the MIMO AWGN channel, the number of bits
that can be transmitted covertly scales exponentially with the number of
transmit antennas. We establish the condition on the number of transmit
antennas required to achieve a non-diminishing covert rate. The practical
implication of our result is that MIMO has the potential to provide a
substantial increase in the file sizes that can be covertly communicated
subject to a reasonably low delay.
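For orientation, the LPD constraint in this literature is commonly formalized
by keeping the detector's observations statistically close under the two
hypotheses; the following schematic formulation is our paraphrase in standard
notation, not necessarily the paper's exact definitions.

```latex
% Maximize throughput while the detector cannot reliably distinguish the
% "transmit" hypothesis (observation law $P_1$) from the "no transmission"
% hypothesis ($P_0$), e.g., via a relative-entropy budget:
\max_{P_{\mathbf{X}}} \; I(\mathbf{X};\mathbf{Y})
  \quad \text{subject to} \quad D(P_1 \,\|\, P_0) \le \epsilon .
% The square-root law then states that the number of bits transmittable
% covertly over $n$ channel uses scales as $\mathcal{O}(\sqrt{n})$, so the
% per-use rate $\mathcal{O}(\sqrt{n})/n$ vanishes as $n \to \infty$.
```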
Rocio Arroyo-Valles, Andrea Simonetto, Geert Leus
Comments: 27 pages, 17 figures
Subjects: Information Theory (cs.IT)
In wireless sensor networks, where energy is scarce, it is inefficient to
have all nodes active because they consume a non-negligible amount of battery.
In this paper we consider the problem of jointly selecting sensors, relays and
links in a wireless sensor network where the active sensors need to communicate
their measurements to one or multiple access points. Information messages are
routed stochastically in order to capture the inherent reliability of the
broadcast links via multiple hops, where the nodes may be acting as sensors or
as relays. We aim at finding optimal sparse solutions that guarantee both the
consistency of the selected subset of sensors, relays and links, and the
graph connectivity of the selected subnetwork. Furthermore, the active nodes
should guarantee a prescribed network performance in a parameter estimation
scenario. Two problems are studied: sensor and link selection; and sensor,
relay and link selection. To solve such problems, we present tractable
optimization formulations and propose two algorithms that satisfy the previous
network requirements. We also explore an extension scenario: only link
selection. Simulation results show the performance of the algorithms and
illustrate how they provide a sparse solution, which not only saves energy but
also guarantees the network requirements.
Alireza Sheikh, Alexandre Graell i Amat, Gianluigi Liva
Subjects: Information Theory (cs.IT)
We consider probabilistic shaping to maximize the achievable information rate
of coded modulation (CM) with hard decision decoding. The proposed scheme using
binary staircase codes outperforms its uniform CM counterpart by more than 1.3
dB for 64-QAM and 5 bits/symbol.
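Probabilistic shaping of QAM constellations is commonly realized with a
Maxwell-Boltzmann input distribution; the sketch below computes such a
distribution over the 64-QAM grid. The shaping parameter nu and the function
name are ours for illustration; the paper's contribution is pairing the shaping
with binary staircase codes under hard decision decoding.

```python
import numpy as np

def mb_shaped_64qam(nu):
    """Maxwell-Boltzmann shaping sketch: send constellation point x with
    probability proportional to exp(-nu * |x|^2). nu = 0 recovers the
    uniform input; larger nu favors low-energy points."""
    pam = np.arange(-7, 8, 2)                                  # 8-PAM levels
    points = np.array([a + 1j * b for a in pam for b in pam])  # 64-QAM grid
    p = np.exp(-nu * np.abs(points) ** 2)
    return points, p / p.sum()
```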
Annina Bracher, Amos Lapidoth, Christoph Pfister
Comments: 5 pages; accepted at ISIT 2017
Subjects: Information Theory (cs.IT)
The rate region of the task-encoding problem for two correlated sources is
characterized using a novel parametric family of dependence measures. The
converse uses a new expression for the \(\rho\)-th moment of the list size,
which is derived using the relative \(\alpha\)-entropy.
Saeed Vahidian, Maryam Najafi, Marzieh Najafi, Fawaz S. Al-Qahtani
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
In this paper, we investigate the performance of a dual-hop block fading
cognitive radio network with underlay spectrum sharing over independent but not
necessarily identically distributed (i.n.i.d.) Nakagami-\(m\) fading channels.
The primary network consists of a source and a destination. Two cases, referred
to as Scenario (a) and Scenario (b), are considered in this paper, depending on
whether the secondary network, which consists of two source nodes, has a single
relay for cooperation or multiple relays with opportunistic relay selection,
and on whether the two source nodes suffer from primary-user (PU) interference.
For the considered underlay spectrum sharing, the transmit power of the
proposed system is constrained by the interference limit of the primary network
and by the interference imposed by the PU. We derive new analytical results for
the outage capacity (OC) and the average symbol error probability (ASEP). In
particular, for Scenario (a), tight lower bounds
on the OC and ASEP of the secondary network are derived in closed form. In
addition, a closed-form expression for the end-to-end OC of Scenario (a) is
obtained. With regard to Scenario (b), a tight lower bound on the OC of the
secondary network is derived in closed form. All analytical results are
corroborated by Monte Carlo simulations.
Yan Sun, Derrick Wing Kwan Ng, Robert Schober
Comments: Invited presentation at SPAWC 2017, Japan
Subjects: Information Theory (cs.IT)
In this paper, we study the resource allocation for an orthogonal frequency
division multiple access (OFDMA) radio system employing a full-duplex base
station for serving multiple half-duplex downlink and uplink users
simultaneously. The resource allocation design objective is the maximization of
the weighted system throughput while limiting the information leakage to
guarantee secure simultaneous downlink and uplink transmission in the presence
of potential eavesdroppers. The algorithm design leads to a mixed combinatorial
non-convex optimization problem and obtaining the globally optimal solution
entails a prohibitively high computational complexity. Therefore, an efficient
successive convex approximation based suboptimal iterative algorithm is
proposed. Our simulation results confirm that the proposed suboptimal algorithm
achieves a significant performance gain compared to two baseline schemes.
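Schematically, successive convex approximation alternates between building a
convex surrogate of the non-convex problem around the current iterate and
solving it. The skeleton below, with hypothetical problem-specific hooks, is a
generic sketch rather than the authors' algorithm.

```python
import numpy as np

def successive_convex_approximation(x0, convexify, solve_convex,
                                    tol=1e-4, max_iters=100):
    """Generic SCA loop. `convexify(x)` returns a convex surrogate of the
    original problem built around x; `solve_convex` returns its minimizer
    (e.g., via an off-the-shelf convex solver). Both hooks are hypothetical."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        surrogate = convexify(x)         # convex approximation at x
        x_new = solve_convex(surrogate)  # solve the convexified problem
        if np.linalg.norm(x_new - x) < tol:
            return x_new                 # iterates have stabilized
        x = x_new
    return x
```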
Mansi Peer, Vivek Ashok Bohara, Neha Jain
Subjects: Information Theory (cs.IT)
This paper presents a device-to-device (D2D) user selection protocol wherein
multiple D2D pairs coexist with a cellular network. In the developed framework,
certain D2D users harvest energy and share the spectrum of the cellular users
by adopting a hybrid time switching and power splitting protocol. The D2D user
which harvests the maximum energy and achieves the desired target rate for the
cellular communication is selected to serve as a decode-and-forward (DF) relay
for the cellular user. The proposed work analyzes the impact of an increase in
the number of D2D users on the performance of the cellular user, and derives an
upper bound on the energy harvesting duration within which the best possible
rate for the cellular user can be obtained. The performance of the proposed
protocol is quantified by deriving closed-form expressions for the outage
probability.
Pascal Giard, Alexios Balatsoukas-Stimming, Andreas Burg
Comments: 6 pages, 8 figures, submitted to IEEE Int. Workshop on Signal Process. Syst. (SiPS) 2017
Subjects: Information Theory (cs.IT)
Polar codes were recently chosen to protect the control channel information
in the next-generation mobile communication standard (5G) defined by the 3GPP.
As a result, receivers will have to implement blind decoding of polar coded
frames in order to keep complexity, latency, and power consumption tractable.
As polar codes are a newly proposed class of block codes, their blind detection
has, to our knowledge, not been addressed yet. In this work, we propose a
low-complexity blind-detection algorithm for polar-encoded frames. We base this
algorithm on a novel detection metric with update rules that leverage the a
priori knowledge of the frozen-bit locations, exploiting the inherent
structures that these locations impose on a polar-encoded block of data. We
show that the proposed detection metric makes it possible to clearly distinguish
polar-encoded frames from other types of data by investigating the cumulative
distribution functions of the detection metric, and the receiver operating
characteristic. The results presented are tailored to the 5G standardization
effort discussions, i.e., for a short low-rate polar code concatenated with a
CRC.
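As a toy illustration of why the frozen-bit locations are informative (this is
not the paper's metric), one can hard-decode a received frame, invert the polar
transform, which is an involution over GF(2), and count violations at the
frozen positions; frames not polar-encoded with that frozen set should score
markedly higher.

```python
import numpy as np

def frozen_violations(llr, frozen_set):
    """Toy blind-detection heuristic: hard-decode the received LLRs,
    apply the polar transform (its own inverse over GF(2)) to recover
    candidate message bits, and count nonzeros at the frozen positions.
    Assumes len(llr) is a power of two."""
    u = (np.asarray(llr) < 0).astype(int)    # hard decisions on the codeword
    n, step = len(u), 1
    while step < n:                          # butterfly stages of the transform
        for i in range(0, n, 2 * step):
            u[i:i + step] ^= u[i + step:i + 2 * step]
        step *= 2
    return int(sum(u[i] for i in frozen_set))
```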
Adam Greig, Ramji Venkataramanan
Comments: 21 pages, 14 figures
Subjects: Information Theory (cs.IT)
Sparse superposition codes (SPARCs) are a recent class of codes introduced by
Barron and Joseph for efficient communication over the AWGN channel. With an
appropriate power allocation, these codes have been shown to be asymptotically
capacity-achieving with computationally feasible decoding. However, a direct
implementation of the capacity-achieving construction does not give good finite
length error performance, and has unnecessarily high computational complexity.
In this paper, we consider sparse superposition codes with approximate message
passing (AMP) decoding, and describe a variety of techniques to improve their
finite length performance and decrease computational complexity. These include
using implicit Walsh-Hadamard design matrices, optimizing SPARC power
allocations and codebook parameters, and estimating a critical decoding
parameter online instead of pre-computing it. Finally, we compare the error
performance of AMP-decoded sparse superposition codes with coded modulation
using LDPC codes from the WiMAX standard.
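The following schematic AMP decoder for a SPARC with flat power allocation
shows the structure of the iteration, including the online estimate
tau^2 = ||z||^2 / n of the effective noise variance in the spirit of the online
parameter estimation mentioned above. The flat allocation and all names are our
simplifications, not the authors' optimized implementation.

```python
import numpy as np

def sparc_amp(y, A, L, M, P, iters=30):
    """AMP decoding sketch for a SPARC with L sections of size M; each
    section has one nonzero entry of value sqrt(n*P/L) (flat allocation)."""
    n = len(y)
    c = np.sqrt(n * P / L)                   # nonzero value in each section
    beta, z = np.zeros(L * M), y.astype(float).copy()
    for _ in range(iters):
        tau2 = (z @ z) / n                   # online effective-noise estimate
        s = beta + A.T @ z                   # effective AWGN observation of beta
        for l in range(L):                   # section-wise posterior-mean denoiser
            sec = slice(l * M, (l + 1) * M)
            t = s[sec] * c / tau2
            w = np.exp(t - t.max())          # numerically stable softmax
            beta[sec] = c * w / w.sum()
        z = y - A @ beta + (z / tau2) * (P - (beta @ beta) / n)  # Onsager term
    est = np.zeros_like(beta)                # hard decision: max entry per section
    for l in range(L):
        est[l * M + np.argmax(beta[l * M:(l + 1) * M])] = c
    return est
```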
Ahmed Ewaisha, Cihan Tepedelenlioglu
Comments: Submitted to Asilomar 2017. arXiv admin note: text overlap with arXiv:1612.08326
Subjects: Information Theory (cs.IT)
We consider a joint scheduling-and-power-allocation problem of a downlink
cellular system. The system consists of two groups of users: real-time (RT) and
non-real-time (NRT) users. Given an average power constraint on the base
station, the problem is to find an algorithm that satisfies the RT hard
deadline constraint and NRT queue stability constraint. We propose a
sum-rate-maximizing algorithm that satisfies these constraints. We also show,
through simulations, that the proposed algorithm has an average complexity that
is close-to-linear in the number of RT users. The power allocation policy in
the proposed algorithm has a closed-form expression for the two groups of
users. However, interestingly, the power policy of the RT users differs in
structure from that of the NRT users. We also show the superiority of the
proposed algorithm over existing approaches using extensive simulations.
Venkatesan Guruswami, Ray Li
Comments: arXiv admin note: substantial text overlap with arXiv:1612.06335
Subjects: Information Theory (cs.IT); Data Structures and Algorithms (cs.DS)
In the random deletion channel, each bit is deleted independently with
probability \(p\). For the random deletion channel, the existence of codes of
rate \((1-p)/9\), and thus bounded away from \(0\) for any \(p < 1\), has been
known. We give an explicit construction with polynomial-time encoding and
deletion-correction algorithms with rate \(c_0 (1-p)\) for an absolute constant
\(c_0 > 0\).
Gilles Puy, Patrick Pérez
Subjects: Social and Information Networks (cs.SI); Information Theory (cs.IT)
This work concerns sampling of smooth signals on arbitrary graphs. We first
study a structured sampling strategy for such smooth graph signals that
consists of a random selection of a few pre-defined groups of nodes. The number
of groups to sample to stably embed the set of \(k\)-bandlimited signals is
driven by a quantity called the \emph{group} graph cumulative coherence. For
some optimised sampling distributions, we show that sampling \(O(k \log k)\)
groups is always sufficient to stably embed the set of \(k\)-bandlimited signals
but that this number can be smaller — down to \(O(\log k)\) — depending on the
structure of the groups of nodes. Fast methods to approximate these sampling
distributions are detailed. Second, we consider \(k\)-bandlimited signals that
are nearly piecewise constant over pre-defined groups of nodes. We show that it
is possible to speed up the reconstruction of such signals by drastically
reducing the dimension of the vectors to reconstruct. When combined with the
proposed structured sampling procedure, we prove that the method provides
stable and accurate reconstruction of the original signal. Finally, we present
numerical experiments that illustrate our theoretical results and, as an
example, show how to combine these methods for interactive object segmentation
in an image using superpixels.
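The sampling side of the first contribution can be sketched in a few lines:
draw whole groups of nodes at random and reconstruct the \(k\)-bandlimited
signal by least squares in the span of the first \(k\) Laplacian eigenvectors.
Sampling groups uniformly is a simplification (the paper optimizes the sampling
distribution), and all names are illustrative.

```python
import numpy as np

def group_sample_reconstruct(x, groups, Uk, num_groups, seed=0):
    """Sample `num_groups` whole groups of nodes uniformly at random and
    reconstruct a k-bandlimited signal x by least squares. `groups` is a
    list of node-index arrays; Uk holds the first k Laplacian eigenvectors."""
    rng = np.random.default_rng(seed)
    chosen = rng.choice(len(groups), size=num_groups, replace=False)
    idx = np.concatenate([groups[g] for g in chosen])   # sampled node set
    coeffs, *_ = np.linalg.lstsq(Uk[idx], x[idx], rcond=None)
    return Uk @ coeffs                                  # reconstructed signal
```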
Helmut Bölcskei, Philipp Grohs, Gitta Kutyniok, Philipp Petersen
Subjects: Learning (cs.LG); Information Theory (cs.IT); Functional Analysis (math.FA)
We derive fundamental lower bounds on the connectivity and the memory
requirements of deep neural networks guaranteeing uniform approximation rates
for arbitrary function classes in \(L^2(\mathbb{R}^d)\). In other words, we
establish a connection between the complexity of a function class and the
complexity of deep neural networks approximating functions from this class to
within a prescribed accuracy.
Additionally, we prove that our lower bounds are achievable for a broad
family of function classes. Specifically, all function classes that are
optimally approximated by a general class of representation systems—so-called
\emph{affine systems}—can be approximated by deep neural networks with
minimal connectivity and memory requirements. Affine systems encompass a wealth
of representation systems from applied harmonic analysis such as wavelets,
ridgelets, curvelets, shearlets, \(\alpha\)-shearlets, and more generally
\(\alpha\)-molecules. This result elucidates a remarkable universality property
of neural networks and shows that they achieve the optimum approximation
properties of all affine systems combined. As a specific example, we consider
the class of \(1/\alpha\)-cartoon-like functions, which is approximated
optimally by \(\alpha\)-shearlets.
We also explain how our results can be extended to the case of functions on
low-dimensional immersed manifolds.
Finally, we present numerical experiments demonstrating that the standard
stochastic gradient descent algorithm generates deep neural networks providing
close-to-optimal approximation rates at minimal connectivity. Moreover, these
results show that stochastic gradient descent actually learns approximations
that are sparse in the representation systems optimally sparsifying the
function class the network is trained on.
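A toy version of such an experiment (ours; the paper's setup is far more
elaborate) fits a target function with a one-hidden-layer ReLU network trained
by plain SGD and reports the fraction of weights exceeding a small threshold in
magnitude, a crude proxy for connectivity.

```python
import numpy as np

def train_relu_net(f, width=64, steps=5000, lr=1e-2, thresh=1e-3):
    """Fit f on [0, 1] with a one-hidden-layer ReLU network trained by
    plain SGD on squared loss, then return the fraction of parameters
    whose magnitude exceeds `thresh` (an active-weight proxy)."""
    rng = np.random.default_rng(0)
    W1 = rng.normal(scale=1.0, size=width); b1 = np.zeros(width)
    W2 = rng.normal(scale=1.0 / np.sqrt(width), size=width); b2 = 0.0
    for _ in range(steps):
        x = rng.uniform(size=32)                        # minibatch of inputs
        h = np.maximum(W1[None, :] * x[:, None] + b1, 0.0)
        y = h @ W2 + b2
        err = y - f(x)                                  # residuals
        gW2 = h.T @ err / len(x); gb2 = err.mean()      # output-layer grads
        gh = np.outer(err, W2) * (h > 0)                # backprop through ReLU
        gW1 = (gh * x[:, None]).mean(axis=0); gb1 = gh.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    weights = np.concatenate([W1, b1, W2, [b2]])
    return np.mean(np.abs(weights) > thresh)

# Hypothetical usage: fraction of active weights after fitting a bump function.
# train_relu_net(lambda x: np.exp(-((x - 0.5) ** 2) / 0.02))
```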