Hendrik Richter
Subjects: Chaotic Dynamics (nlin.CD); Neural and Evolutionary Computing (cs.NE); Populations and Evolution (q-bio.PE)
The paper deals with using chaos to direct trajectories to targets and
analyzes ruggedness and fractality of the resulting fitness landscapes. The
targeting problem is formulated as a dynamic fitness landscape and four
different chaotic maps generating such a landscape are studied. By using a
computational approach, we analyze properties of the landscapes and quantify
their fractal and rugged characteristics. In particular, it is shown that
ruggedness measures such as correlation length and information content are
scale-invariant and self-similar.
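The abstract names neither the four chaotic maps nor the exact measurement procedure, so the sketch below is only a hedged illustration: it builds a one-dimensional landscape from logistic-map iterates (a common chaotic map, assumed here for concreteness) and estimates the random-walk correlation length, one of the ruggedness measures mentioned above.

```python
import numpy as np

def logistic_landscape(n=512, r=4.0, x0=0.3):
    # Fitness values from logistic-map iterates; one plausible chaotic map,
    # not necessarily one of the four studied in the paper.
    f = np.empty(n)
    x = x0
    for i in range(n):
        x = r * x * (1.0 - x)
        f[i] = x
    return f

def correlation_length(f):
    # Weinberger-style estimate: lag-1 autocorrelation rho(1) of a walk
    # along the landscape; correlation length is -1 / ln|rho(1)|.
    f = f - f.mean()
    var = np.dot(f, f) / len(f)
    rho1 = np.dot(f[:-1], f[1:]) / ((len(f) - 1) * var)
    return -1.0 / np.log(abs(rho1))

print("correlation length:", correlation_length(logistic_landscape()))
```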
Zijun Wu, Rolf Moehring, Jianhui Lai
Comments: 38 pages, 7 figures
Subjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
This article analyzes the stochastic runtime of a Cross-Entropy Algorithm on
two classes of traveling salesman problems. The algorithm shares the main
features of the famous Max-Min Ant System with iteration-best reinforcement.
For simple instances that have a $\{1,n\}$-valued distance function and a
unique optimal solution, we prove a stochastic runtime of $O(n^{6+\epsilon})$
with the vertex-based random solution generation, and a stochastic runtime of
$O(n^{3+\epsilon}\ln n)$ with the edge-based random solution generation, for an
arbitrary $\epsilon \in (0,1)$. These runtimes are very close to the known
expected runtime for variants of Max-Min Ant System with best-so-far
reinforcement. They are obtained for the stronger notion of stochastic runtime,
which means that an optimal solution is obtained in that time with an
overwhelming probability, i.e., a probability tending exponentially fast to one
with growing problem size.
We also inspect more complex instances with $n$ vertices positioned on an
$m \times m$ grid. When the $n$ vertices span a convex polygon, we obtain a
stochastic runtime of $O(n^{3}m^{5+\epsilon})$ with the vertex-based random
solution generation, and a stochastic runtime of $O(n^{2}m^{5+\epsilon})$ for
the edge-based random solution generation. When there are $k = O(1)$ many
vertices inside a convex polygon spanned by the other $n-k$ vertices, we obtain
a stochastic runtime of $O(n^{4}m^{5+\epsilon}+n^{6k-1}m^{\epsilon})$ with the
vertex-based random solution generation, and a stochastic runtime of
$O(n^{3}m^{5+\epsilon}+n^{3k}m^{\epsilon})$ with the edge-based random solution
generation. These runtimes are better than the expected runtime of the
so-called $(\mu+\lambda)$ EA reported in a recent article, and are again
obtained for the stronger notion of stochastic runtime.
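The vertex- and edge-based solution generation schemes are not defined in this abstract; as a hedged illustration, the sketch below shows a generic edge-based random tour construction of the kind used in ant systems, where `P` is an assumed matrix of edge preferences (e.g., pheromone values) and `sample_tour` is an illustrative name.

```python
import numpy as np

def sample_tour(P, rng):
    """Edge-based random tour construction: starting from city 0, repeatedly
    pick the next city with probability proportional to the edge preference
    P[current, j] over the not-yet-visited cities j."""
    n = P.shape[0]
    tour = [0]
    unvisited = set(range(1, n))
    while unvisited:
        cur = tour[-1]
        cand = np.array(sorted(unvisited))
        w = P[cur, cand]
        nxt = rng.choice(cand, p=w / w.sum())
        tour.append(int(nxt))
        unvisited.remove(int(nxt))
    return tour

rng = np.random.default_rng(0)
P = rng.uniform(0.1, 1.0, size=(6, 6))   # illustrative preference matrix
print(sample_tour(P, rng))
```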
Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chikeung Tang, Leonidas Guibas
Comments: 9 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Important high-level vision tasks such as human-object interaction, image
captioning and robotic manipulation require rich semantic descriptions of
objects at part level. Based upon previous work on part localization, in this
paper, we address the problem of inferring rich semantics imparted by an object
part in still images. We propose to tokenize the semantic space as a discrete
set of part states. Our modeling of part state is spatially localized,
therefore, we formulate the part state inference problem as a pixel-wise
annotation problem. An iterative part-state inference neural network is
specifically designed for this task, which is efficient in time and accurate in
performance. Extensive experiments demonstrate that the proposed method can
effectively predict the semantic states of parts and simultaneously correct
localization errors, thus benefiting several visual understanding applications.
Another contribution of this paper is our part state dataset, which contains
rich part-level semantic annotations.
Pavel Tokmakov, Karteek Alahari, Cordelia Schmid
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The problem of determining whether an object is in motion, irrespective of
the camera motion, is far from being solved. We address this challenging task
by learning motion patterns in videos. The core of our approach is a fully
convolutional network, which is learnt entirely from synthetic video sequences
and their ground-truth optical flow and motion segmentation. This
encoder-decoder style architecture first learns a coarse representation of the
optical flow field features, and then refines it iteratively to produce motion
labels at the original high resolution. The output label of each pixel denotes
whether it has undergone independent motion, i.e., irrespective of the camera
motion. We demonstrate the benefits of this learning framework on the moving
object segmentation task, where the goal is to segment all the objects in
motion. To this end we integrate an objectness measure into the framework. Our
approach outperforms the top method on the recently released DAVIS benchmark
dataset, comprising real-world sequences, by 5.6%. We also evaluate on the
Berkeley motion segmentation database, achieving state-of-the-art results.
Kyunghyun Paeng, Sangheum Hwang, Sunggyun Park, Minsoo Kim, Seokhwi Kim
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Tumor proliferation scores are important biomarkers indicative of breast
cancer patients’ prognosis. In this paper, we present a unified framework to
predict tumor proliferation scores from whole slide images in breast
histopathology. The proposed system offers a fully automated solution to
predicting both a molecular-data-based and a mitosis-counting-based tumor
proliferation score. The framework integrates three modules, each fine-tuned to
maximize the overall performance: an image processing component for handling
whole slide images, a deep-learning-based mitosis detection network, and a
proliferation score prediction module. We achieved a quadratic weighted
Cohen’s kappa of 0.567 in mitosis-counting-based score prediction and an
F1-score of 0.652 in mitosis detection. On Spearman’s correlation coefficient,
which evaluates prediction of the molecular-data-based score, the system
obtained 0.6171. Our system won first place in all three tasks of the Tumor
Proliferation Assessment Challenge at MICCAI 2016, outperforming all other
approaches.
Kun Sun, Wenbing Tao
Comments: This manuscript has been submitted to CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Accuracy and efficiency are two key problems in large scale incremental
Structure from Motion (SfM). In this paper, we propose a unified framework to
divide the image set into clusters suitable for reconstruction as well as find
multiple reliable and stable starting points. Image partitioning proceeds in
two steps. First, some small image groups are selected at places with high
image density, and then all the images are clustered according to their optimal
reconstruction paths to these image groups. This ensures that the scene is
always reconstructed from dense places toward sparse areas, which reduces error
accumulation when images overlap only weakly. To speed up processing, images
outside the selected group in each cluster are further divided to achieve a
greater degree of parallelism. Experiments show that our method achieves
significant speedup, higher accuracy and better completeness.
Bin Bai, Jianbin Liu, Yu Zhou, Songlin Zhang, Yuchen He, Zhuo Xu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
We have designed a single-pixel camera that can image around corners based on
computational ghost imaging. It can obtain the image of an object even when the
camera cannot look at the object directly. Our imaging system exploits the fact
that a bucket detector in a ghost imaging setup has no spatial resolution
capability. A series of experiments have been designed to confirm our
predictions. This camera has potential applications for imaging around corners
or in other environments where the object cannot be observed directly.
Yaman Umuroglu, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers
Comments: To appear in the 25th International Symposium on Field-Programmable Gate Arrays, February 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Hardware Architecture (cs.AR); Learning (cs.LG)
Research has shown that convolutional neural networks contain significant
redundancy, and high classification accuracy can be obtained even when weights
and activations are reduced from floating point to binary values. In this
paper, we present FINN, a framework for building fast and flexible FPGA
accelerators using a flexible heterogeneous streaming architecture. By
utilizing a novel set of optimizations that enable efficient mapping of
binarized neural networks to hardware, we implement fully connected,
convolutional and pooling layers, with per-layer compute resources being
tailored to user-provided throughput requirements. On a ZC706 embedded FPGA
platform drawing less than 25 W total system power, we demonstrate up to 12.3
million image classifications per second with 0.31 µs latency on the MNIST
dataset with 95.8% accuracy, and 21906 image classifications per second with
283 µs latency on the CIFAR-10 and SVHN datasets with 80.1% and 94.9%
accuracy, respectively. To the best of our knowledge, ours are the fastest
classification rates reported to date on these benchmarks.
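The abstract does not detail FINN's hardware mapping; as a software analogue, the sketch below shows why binarized layers are cheap to implement: a dot product between two {-1, +1} vectors reduces to an XNOR followed by a popcount over packed bits. All names and the packing scheme are illustrative assumptions.

```python
import numpy as np

def binary_dot(x_bits, w_bits, n):
    """Dot product of two {-1, +1} vectors packed as n-bit integers
    (bit=1 encodes +1, bit=0 encodes -1): popcount(XNOR) rescaled."""
    xnor = ~(x_bits ^ w_bits) & ((1 << n) - 1)   # 1 wherever signs agree
    matches = bin(xnor).count("1")
    return 2 * matches - n                       # agreements minus disagreements

rng = np.random.default_rng(0)
n = 16
x = rng.choice([-1, 1], size=n)
w = rng.choice([-1, 1], size=n)
pack = lambda v: int("".join("1" if s > 0 else "0" for s in v), 2)
assert binary_dot(pack(x), pack(w), n) == int(np.dot(x, w))
print(binary_dot(pack(x), pack(w), n))
```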
Jiuxiang Gu, Gang Wang, Tsuhan Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
In this paper, we propose a Recurrent Highway Network with Language CNN for
image caption generation. Our network consists of three sub-networks: the deep
Convolutional Neural Network for image representation, the Convolutional Neural
Network for language modeling, and the Multimodal Recurrent Highway Network for
sequence prediction. Our proposed model can naturally exploit the hierarchical
and temporal structure of history words, which are critical for image caption
generation. The effectiveness of our model is validated on two datasets: MS
COCO and Flickr30K. Extensive experimental results show that our method is
competitive with state-of-the-art methods.
Alex Zwanenburg, Stefan Leger, Martin Vallières, Steffen Löck, for the Image Biomarker Standardisation Initiative
Comments: 59 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
While analysis of medical images has practically taken place since the first
image was recorded, high throughput analysis of medical images is a more recent
phenomenon. The aim of such a radiomics process is to provide decision support
based on medical imaging. Part of the radiomics process is the conversion of
image data into numerical features which capture different aspects of the
medical image and can subsequently be correlated, as biomarkers, with e.g.
expected oncological treatment outcome.
With the growth of the radiomics field, it has become clear that results are
often difficult to reproduce, that standards for image processing and feature
extraction are missing, and that reporting guidelines are absent. The image
biomarker standardisation initiative (IBSI) seeks to address these issues. The
current document provides definitions for a large number of image features.
Dotan Kaufman, Gil Levi, Tal Hassner, Lior Wolf
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a general approach to video understanding, inspired by semantic
transfer techniques successfully used for 2D image understanding. Our method
considers a video to be a 1D sequence of clips, each one associated with its
own semantics. The nature of these semantics (natural language captions or
other labels) depends on the task at hand. A test video is processed by
forming correspondences between its clips and the clips of reference videos
with known semantics, following which, reference semantics can be transferred
to the test video. We describe two matching methods, both designed to ensure
that (a) reference clips appear similar to test clips and (b), taken together,
the semantics of selected reference clips is consistent and maintains temporal
coherence. We use our method for video captioning on the LSMDC’16 benchmark and
video summarization on the SumMe benchmark. In both cases, our method not only
surpasses state-of-the-art results but, importantly, is the only method we
know of that has been successfully applied to both video understanding tasks.
Fei Xiaoxiao, Tanaka Kanji, Inamoto Kouya
Comments: Technical Report, 5 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this study, we explore the use of deep convolutional neural networks
(DCNNs) in visual place classification for robotic mapping and localization. An
open question is how to partition the robot’s workspace into places to maximize
the performance (e.g., accuracy, precision, recall) of potential DCNN
classifiers. This is a chicken-and-egg problem: if we had a well-trained DCNN
classifier, it would be rather easy to partition the robot’s workspace into
places, but the training of a DCNN classifier requires a set of pre-defined
place
classes. In this study, we address this problem and present several strategies
for unsupervised discovery of place classes (“time cue,” “location cue,”
“time-appearance cue,” and “location-appearance cue”). We also evaluate the
efficacy of the proposed methods using the publicly available University of
Michigan North Campus Long-Term (NCLT) Dataset.
Subarna Tripathi, Brian Guenter
Comments: Accepted for publication in WACV 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a novel, automatic eye gaze tracking scheme inspired by smooth
pursuit eye motion while playing mobile games or watching virtual reality
content. Our algorithm continuously calibrates an eye tracking system for a
head mounted display. This eliminates the need for an explicit calibration step
and automatically compensates for small movements of the headset with respect
to the head. The algorithm finds correspondences between corneal motion and
screen space motion, and uses these to generate Gaussian Process Regression
models. A combination of those models provides a continuous mapping from
corneal position to screen space position. Accuracy is nearly as good as
achieved with an explicit calibration step.
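The abstract states that Gaussian Process Regression models map corneal position to screen position but gives no further detail; a minimal scikit-learn sketch of such a mapping on synthetic calibration data (kernel choice, data, and all names are assumptions) could look like this.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic calibration pairs: 2D corneal positions -> 2D screen positions.
rng = np.random.default_rng(0)
corneal = rng.uniform(-1, 1, size=(200, 2))
screen = corneal @ np.array([[420.0, 15.0], [-12.0, 380.0]])
screen += rng.normal(0, 2.0, screen.shape)

# One GP per screen axis; the kernel choice is an assumption, not the paper's.
kernel = RBF(length_scale=0.5) + WhiteKernel(noise_level=1.0)
models = [GaussianProcessRegressor(kernel=kernel).fit(corneal, screen[:, k])
          for k in range(2)]

test = rng.uniform(-1, 1, size=(5, 2))
pred = np.column_stack([m.predict(test) for m in models])
print(pred)
```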
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)
When building artificial intelligence systems that can reason and answer
questions about visual data, we need diagnostic tests to analyze our progress
and discover shortcomings. Existing benchmarks for visual question answering
can help, but have strong biases that models can exploit to correctly answer
questions without reasoning. They also conflate multiple sources of error,
making it hard to pinpoint model weaknesses. We present a diagnostic dataset
that tests a range of visual reasoning abilities. It contains minimal biases
and has detailed annotations describing the kind of reasoning each question
requires. We use this dataset to analyze a variety of modern visual reasoning
systems, providing novel insights into their abilities and limitations.
Angeliki Lazaridou, Alexander Peysakhovich, Marco Baroni
Comments: Under submission at ICLR 2017
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computer Science and Game Theory (cs.GT); Learning (cs.LG); Multiagent Systems (cs.MA)
The current mainstream approach to training natural language systems is to
expose them to large amounts of text. This passive learning is problematic if
we are interested in developing interactive machines, such as conversational
agents. We propose a framework for language learning that relies on multi-agent
communication. We study this learning in the context of referential games. In
these games, a sender and a receiver see a pair of images. The sender is told
one of them is the target and is allowed to send a message from a fixed,
arbitrary vocabulary to the receiver. The receiver must rely on this message to
identify the target. Thus, the agents develop their own language interactively
out of the need to communicate. We show that two networks with simple
configurations are able to learn to coordinate in the referential game. We
further explore how to make changes to the game environment to cause the “word
meanings” induced in the game to better reflect intuitive semantic properties
of the images. In addition, we present a simple strategy for grounding the
agents’ code into natural language. Both of these are necessary steps towards
developing machines that are able to communicate with humans productively.
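No architecture details are given in the abstract; the toy below is a heavily simplified sketch of a referential game with two linear agents trained by REINFORCE, just to make the game's structure concrete. Everything here (dimensions, vocabulary size, learning-rule details) is an assumption, and the toy is not claimed to reproduce the paper's results.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, vocab, lr = 8, 4, 0.1
W_s = rng.normal(0, 0.1, (vocab, dim))   # sender: target image -> symbol logits
W_r = rng.normal(0, 0.1, (vocab, dim))   # receiver: symbol -> image-scoring vector

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

wins = []
for step in range(20000):
    imgs = rng.normal(size=(2, dim))             # the pair of "images"
    target = rng.integers(2)
    p_msg = softmax(W_s @ imgs[target])          # sender samples a symbol
    msg = rng.choice(vocab, p=p_msg)
    p_pick = softmax(imgs @ W_r[msg])            # receiver picks an image
    pick = rng.choice(2, p=p_pick)
    reward = 1.0 if pick == target else 0.0
    wins.append(reward)
    # REINFORCE updates for both agents (no baseline, for brevity).
    g_msg = -p_msg
    g_msg[msg] += 1.0
    W_s += lr * reward * np.outer(g_msg, imgs[target])
    g_pick = -p_pick
    g_pick[pick] += 1.0
    W_r[msg] += lr * reward * (g_pick @ imgs)
print("recent success rate:", np.mean(wins[-2000:]))
```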
Ketan Rajawat, Sandeep Kumar
Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV)
Multidimensional scaling (MDS) is a popular dimensionality reduction
technique that has been widely used for network visualization and cooperative
localization. However, the traditional stress minimization formulation of MDS
necessitates the use of batch optimization algorithms that are not scalable to
large-sized problems. This paper considers an alternative stochastic stress
minimization framework that is amenable to incremental and distributed
solutions. A novel linear-complexity stochastic optimization algorithm is
proposed that is provably convergent and simple to implement. The applicability
of the proposed algorithm to localization and visualization tasks is also
expounded. Extensive tests on synthetic and real datasets demonstrate the
efficacy of the proposed algorithm.
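The paper's provably convergent algorithm is not reproduced in this abstract; the sketch below illustrates only the generic idea of stochastic stress minimization: sample one pair of items per iteration and take a gradient step on that pair's stress term. All parameters are illustrative.

```python
import numpy as np

def stochastic_mds(D, dim=2, steps=20000, lr=0.01, seed=0):
    """Minimize sum_{i<j} (||x_i - x_j|| - D_ij)^2 by sampling one pair
    per iteration and descending that pair's stress term."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.normal(size=(n, dim))
    for _ in range(steps):
        i, j = rng.choice(n, size=2, replace=False)
        diff = X[i] - X[j]
        dist = np.linalg.norm(diff) + 1e-12
        g = 2.0 * (dist - D[i, j]) * diff / dist
        X[i] -= lr * g
        X[j] += lr * g
    return X

# Toy check: recover a 2D point cloud from its pairwise distances.
rng = np.random.default_rng(1)
P = rng.uniform(size=(30, 2))
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
X = stochastic_mds(D)
print(np.mean((np.linalg.norm(X[:, None] - X[None, :], axis=-1) - D) ** 2))
```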
Aleksander Lodwich
Comments: 60 pages, 55 figures, gray literature
Subjects: Artificial Intelligence (cs.AI)
The rise in complexity of technical systems also raises the knowledge required
to set them up and to maintain them. The cost to evolve such systems can be
prohibitive. In the field of Autonomic Computing, technical systems should
therefore have various self-healing capabilities allowing system owners to
provide only partial, potentially inconsistent updates of the system. The
self-healing or self-integrating system shall find out the remaining changes to
communications and functionalities in order to accommodate change and yet still
restore function. This issue becomes even more interesting in the context of
the Internet of Things and the Industrial Internet, where previously unexpected
device combinations can be assembled in order to provide a surprising new
function. In order to pursue higher levels of self-integration capability, I
propose to think of self-integration as sophisticated error-correcting
communication. Therefore, this paper discusses an extended scope of error
correction with the purpose of emphasizing error correction’s role as an
integrated element of bi-directional communication channels in
self-integrating, autonomic communication scenarios.
Neil Burch, Martin Schmid, Matej Moravčík, Michael Bowling
Subjects: Artificial Intelligence (cs.AI)
Evaluating agent performance when outcomes are stochastic and agents use
randomized strategies can be challenging when there is limited data available.
The variance of sampled outcomes may make the simple approach of Monte Carlo
sampling inadequate. This is the case for agents playing heads-up no-limit
Texas hold’em poker, where man-machine competitions have involved multiple days
of consistent play and still have not produced statistically significant
conclusions, even when the winner’s margin is substantial. In this paper, we
introduce AIVAT, a low variance, provably unbiased value assessment tool that
uses an arbitrary heuristic estimate of state value, as well as the explicit
strategy of a subset of the agents. Unlike existing techniques which reduce the
variance from chance events, or only consider game ending actions, AIVAT
reduces the variance both from choices by nature and by players with a known
strategy. The resulting estimator in no-limit poker can reduce the number of
hands needed to draw statistical conclusions by more than a factor of 10.
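AIVAT's actual correction terms are not given in this abstract; the toy below illustrates only the control-variate principle it builds on: subtracting a zero-mean quantity derived from a value estimate of chance events reduces variance without introducing bias. This is not the AIVAT estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100000
# Toy game: a chance card c ~ N(0,1) plus a small skill edge of 0.05.
c = rng.normal(size=n)
outcomes = 0.05 + c
# A heuristic value estimate that accounts for the chance card (E[c] = 0),
# used as a control variate: the corrected samples remain unbiased.
corrected = outcomes - c
print("naive mean  %.4f  std %.4f" % (outcomes.mean(), outcomes.std()))
print("corrected   %.4f  std %.4f" % (corrected.mean(), corrected.std()))
```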
Lei Tai, Ming Liu
Comments: 16 pages, 4 figures, submit to journal
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG); Systems and Control (cs.SY)
Deep learning has dramatically changed the world overnight. It has greatly
boosted the development of visual perception, object detection, speech
recognition, and more. This success is attributed to the multiple
convolutional processing layers that learn abstract representations from
massive data. The advantages of deep convolutional structures in data
processing motivated the application of artificial intelligence methods to
robotic problems, especially perception and control systems, two typical and
challenging problems in robotics. This paper presents a survey of the
deep-learning research landscape in mobile robotics. We start by introducing
the definition and development of deep learning in related fields, especially
the essential distinctions between image processing and robotic tasks. We then
describe and discuss several typical applications and related works in this
domain, followed by the benefits of deep learning and related existing
frameworks. Moreover, operation in complex, dynamic environments is regarded
as a critical bottleneck for mobile robots, for instance in autonomous
driving. We therefore further emphasize recent achievements in how deep
learning contributes to navigation and control systems for mobile robots.
Finally, we discuss the open challenges and research frontiers.
Mirko Polato, Fabio Aiolli
Comments: 21 pages, 25 figures, 2 tables
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
In many personalized recommendation problems, the available data consists
only of positive interactions (implicit feedback) between users and items. This
problem is also known as One-Class Collaborative Filtering (OC-CF). Linear
models usually achieve state-of-the-art performance on OC-CF problems, and many
efforts have been devoted to building more expressive and complex
representations able to improve the recommendations, but without much success.
Recent analysis shows that collaborative filtering (CF) datasets have peculiar
characteristics such as high sparsity and a long-tailed distribution of the
ratings. In this paper we propose a boolean kernel, called the Disjunctive
Kernel, which is less expressive than the linear one but is able to alleviate
the sparsity issue in CF contexts. The embedding of this kernel is composed of
all the combinations of a certain degree $d$ of the input variables, and these
combined features are semantically interpreted as disjunctions of the input
variables. Experiments on several CF datasets show the effectiveness and the
efficiency of the proposed kernel.
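The abstract describes the kernel's feature space but not how it is computed efficiently; the brute-force sketch below just makes the semantics concrete by materializing the embedding, one feature per degree-$d$ combination of input variables, evaluated as a disjunction. This is only feasible for tiny inputs and is not the paper's algorithm.

```python
import numpy as np
from itertools import combinations

def disjunctive_embedding(x, d):
    """Map a binary vector x to one feature per d-subset of variables,
    each feature being the disjunction (logical OR) of that subset."""
    return np.array([int(any(x[i] for i in idx))
                     for idx in combinations(range(len(x)), d)])

def disjunctive_kernel(x, y, d):
    return int(disjunctive_embedding(x, d) @ disjunctive_embedding(y, d))

x = np.array([1, 0, 1, 0, 0])
y = np.array([0, 0, 1, 1, 0])
print(disjunctive_kernel(x, y, 2))  # number of pairs active in both vectors
```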
Nam Khanh Tran
Comments: CIKM Cup 2016
Subjects: Information Retrieval (cs.IR); Learning (cs.LG)
In this paper, we propose two methods for tackling the problem of
cross-device matching for online advertising at the CIKM Cup 2016. The first
method treats the matching problem as a binary classification task and solves
it using ensemble learning techniques. The second method casts the matching
problem as a ranking task and solves it effectively with learning-to-rank
algorithms. The results show that the proposed methods obtain promising
results, with the ranking-based method outperforming the classification-based
method on this task.
Ze Hu, Zhan Zhang, Qing Chen, Haiqin Yang, Decheng Zuo
Comments: Submitted to Journal of Biomedical Informatics journal on Dec 10, 2016
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Currently, a growing number of health consumers are asking health-related
questions online, at any time and from anywhere, which effectively lowers the
cost of health care. The most common approach is using online health expert
question-answering (HQA) services, as health consumers are more willing to
trust answers from professional physicians. However, these answers can be of
varying quality depending on circumstance. In addition, as the available HQA
services grow, how to predict the answer quality of HQA services via machine
learning becomes increasingly important and challenging. In an HQA service,
answers are normally short texts, which are severely affected by the data
sparsity problem. Furthermore, HQA services lack community features such as
best answer and user votes. Therefore, the wisdom of the crowd is not available
to rate answer quality. To address these problems, in this paper, the
prediction of HQA answer quality is defined as a classification task. First,
based on the characteristics of HQA services and feedback from medical experts,
a standard for HQA service answer quality evaluation is defined. Next, based on
the characteristics of HQA services, several novel non-textual features are
proposed, including surface linguistic features and social features. Finally, a
deep belief network (DBN)-based HQA answer quality prediction framework is
proposed to predict the quality of answers by learning the high-level hidden
semantic representation from the physicians’ answers. Our results prove that
the proposed framework overcomes the problem of overly sparse textual features
in short text answers and effectively identifies high-quality answers.
Xingzhong Du, Hongzhi Yin, Ling Chen, Yang Wang, Yi Yang, Xiaofang Zhou
Subjects: Information Retrieval (cs.IR); Learning (cs.LG)
Video recommendation has become an essential way of helping people explore
the video world and discover the ones that may be of interest to them. However,
mainstream collaborative filtering techniques usually suffer from limited
performance due to the sparsity of user-video interactions, and hence are
ineffective for new video recommendation. Although some recent recommender
models, such as CTR and CDL, have integrated text information to boost
performance, user-generated videos typically include scarce or low-quality text
information, which seriously degenerates performance. In this paper, we
investigate how to leverage the non-textual content contained in videos to
improve the quality of recommendations. We propose to first extract and encode
the diverse audio, visual and action information that rich video content
provides, then effectively incorporate these features with collaborative
filtering using a collaborative embedding regression model (CER). We also study
how to fuse multiple types of content features to further improve video
recommendation using a novel fusion method that unifies both non-textual and
textual features. We conducted extensive experiments on a large video dataset
collected from multiple sources. The experimental results reveal that our
proposed recommender model and feature fusion method outperform the
state-of-the-art methods.
Tengfei Ma
Subjects: Computation and Language (cs.CL)
A good lexicon is an important resource for various cross-lingual tasks such
as information retrieval and text mining. In this paper, we focus on extracting
translation pairs from non-parallel cross-lingual corpora. Previous lexicon
extraction algorithms for non-parallel data generally rely on an accurate seed
dictionary and extract translation pairs by using context similarity. However,
there are two problems. First, a lot of semantic information is lost if we
just use seed dictionary words to construct context vectors and obtain the
context similarity. Second, in practice, we may not have a clean seed
dictionary. For example, if we use a generic dictionary as a seed dictionary in
a special domain, it might be very noisy. To solve these two problems, we
propose two new
bilingual topic models to better capture the semantic information of each word
while discriminating the multiple translations in a noisy seed dictionary. We
then use an effective measure to evaluate the similarity of words in different
languages and select the optimal translation pairs. Results of experiments
using real Japanese-English data demonstrate the effectiveness of our models.
Gábor Berend
Subjects: Computation and Language (cs.CL)
In this paper we propose and carefully evaluate a sequence labeling framework
which solely utilizes sparse indicator features derived from dense distributed
word representations. The proposed model obtains (near) state-of-the-art
performance for both part-of-speech tagging and named entity recognition for a
variety of languages. Our model relies only on a few thousand sparse
coding-derived features, without applying any modification of the word
representations employed for the different tasks. The proposed model has
favorable generalization properties, as it retains over 89.8% of its average
POS tagging accuracy when trained on 1.2% of the total available training
data, i.e., 150 sentences per language.
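The sparse-coding step is not specified further in this abstract; a generic way to derive such features (an assumption, using scikit-learn and random stand-in embeddings) is to learn a dictionary over the dense word vectors and use the signed nonzero coefficients as indicator features.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in for pre-trained dense word embeddings (vocab of 500, dim 50).
rng = np.random.default_rng(0)
E = rng.normal(size=(500, 50))

# Learn an overcomplete dictionary and sparse codes for each word vector.
dl = DictionaryLearning(n_components=128, alpha=1.0, max_iter=50,
                        transform_algorithm="lasso_lars", random_state=0)
codes = dl.fit_transform(E)

# Indicator features: which dictionary atoms fire (and their signs) per word.
indicators = [tuple((j, int(np.sign(c))) for j, c in enumerate(row) if c != 0)
              for row in codes]
print("word 0 fires atoms:", indicators[0])
```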
Markus Freitag, Yaser Al-Onaizan
Subjects: Computation and Language (cs.CL)
Neural Machine Translation (NMT) is a new approach for automatically
translating text from one human language into another. The basic concept in
NMT is to train a large neural network that maximizes the translation
performance on a given parallel corpus. NMT is gaining popularity in the
research community because it has outperformed traditional SMT approaches in
several translation tasks at WMT and other evaluation tasks/benchmarks, at
least for some language pairs. However, many of the enhancements made to SMT
over the years have not been incorporated into the NMT framework. In this
paper, we focus on one such enhancement, namely domain adaptation. We propose
an approach for adapting an NMT system to a new domain. The main idea behind
domain adaptation is to exploit the availability of large out-of-domain
training data together with a small amount of in-domain training data. We
report significant gains with our proposed method in both automatic metrics
and a human subjective evaluation metric on two language pairs. With our
adaptation method, we show a large improvement on the new domain while the
performance on our general domain degrades only slightly. In addition, our
approach is fast enough to adapt an already trained system to a new domain
within a few hours, without the need to retrain the NMT model on the combined
data, which usually takes several days/weeks depending on the volume of the
data.
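The adaptation procedure is not spelled out in this abstract; one common reading, assumed here, is continued training of the converged general-domain model on the small in-domain corpus with a reduced learning rate. The PyTorch skeleton below uses a toy module and random tensors in place of a real NMT model and corpus.

```python
import torch

# Stand-in for a trained NMT model; in practice, load the converged
# general-domain checkpoint instead of this toy module.
model = torch.nn.Linear(16, 16)
opt = torch.optim.SGD(model.parameters(), lr=1e-4)  # smaller LR than pretraining

# Tiny in-domain "corpus" of (source, target) tensor pairs.
in_domain = [(torch.randn(8, 16), torch.randn(8, 16)) for _ in range(10)]
for epoch in range(3):
    for src, tgt in in_domain:
        loss = torch.nn.functional.mse_loss(model(src), tgt)
        opt.zero_grad()
        loss.backward()
        opt.step()   # general-domain data is never revisited
```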
Abdurrachman Mappuji, Nazrul Effendy, Muhamad Mustaghfirin, Fandy Sondok, Rara Priska Yuniar, Sheptiani Putri Pangesti
Comments: Pre-print of conference paper on International Conference on Information Technology and Electrical Engineering
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
High performance computing (HPC) devices are no longer exclusive to academic,
R&D, or military purposes. The use of HPC devices such as supercomputers is
now growing rapidly as new areas such as big data and computer simulation
arise, which makes the use of supercomputers more inclusive. Today's
supercomputers have huge computing power but require an enormous amount of
energy to operate. In contrast, a single board computer (SBC) such as the
Raspberry Pi has minimal computing power but requires only a small amount of
energy to operate and, as a bonus, is small and cheap. This paper covers the
results of clustering many Raspberry Pi 2 SBCs (each a quad-core Cortex-A7 at
900 MHz) to compensate for their individual computing power. High performance
Linpack (HPL) is used to benchmark computing power, and a power meter with a
resolution of 10 mV/10 mA is used to measure power consumption. The experiment
shows that increasing the number of cores used in each SBC member of the
cluster does not yield a significant increase in computing power. Based on the
observed computing performance and power consumption characteristics, the
experiment leads to the recommendation that four nodes is the maximum
practical size for an SBC cluster.
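For experiments like this, the natural figure of merit is performance per watt, combining the HPL throughput with the metered power; a trivial helper is shown below (all numbers are placeholders, not the paper's measurements).

```python
def efficiency(gflops, watts):
    """Performance per watt: HPL throughput divided by metered system power."""
    return gflops / watts

# Hypothetical readings for 1-8 node configurations (placeholders only).
readings = {1: (1.4, 3.1), 2: (2.6, 6.0), 4: (4.9, 11.8), 8: (8.1, 23.5)}
for nodes, (gflops, watts) in readings.items():
    print(f"{nodes} node(s): {gflops:.1f} GFLOPS, "
          f"{efficiency(gflops, watts):.3f} GFLOPS/W")
```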
Evan Shelhamer, Parsa Mahmoudieh, Max Argus, Trevor Darrell
Subjects: Learning (cs.LG)
Reinforcement learning, driven by reward, addresses tasks by optimizing
policies for expected return. Need the supervision be so narrow? Reward is
delayed and sparse for many tasks, so we argue that reward alone is a difficult
and impoverished signal for end-to-end optimization. To augment reward, we
consider a range of self-supervised tasks that incorporate states, actions, and
successors to provide auxiliary losses. These losses offer ubiquitous and
instantaneous supervision for representation learning even in the absence of
reward. While current results show that learning from reward alone is feasible,
pure reinforcement learning methods are constrained by computational and data
efficiency issues that can be remedied by auxiliary losses. Self-supervised
pre-training improves the data efficiency and policy returns of end-to-end
reinforcement learning.
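The paper's exact auxiliary tasks are not listed in this abstract; the PyTorch sketch below shows the generic shape of the idea, adding a made-up inverse-dynamics loss over (state, next state, action) triples to a policy-gradient loss. Architecture, loss weighting, and all names are illustrative assumptions.

```python
import torch
import torch.nn as nn

enc = nn.Linear(10, 32)       # shared state encoder
policy = nn.Linear(32, 4)     # action logits
inv_dyn = nn.Linear(64, 4)    # auxiliary head: predict action from (s, s')

s, s_next = torch.randn(16, 10), torch.randn(16, 10)
actions = torch.randint(0, 4, (16,))
returns = torch.randn(16)     # stand-in for sampled returns

h, h_next = enc(s), enc(s_next)
logp = torch.log_softmax(policy(h), dim=-1)
rl_loss = -(returns * logp.gather(1, actions[:, None]).squeeze(1)).mean()

# Self-supervised auxiliary loss: inverse dynamics, no reward required.
aux_logits = inv_dyn(torch.cat([h, h_next], dim=-1))
aux_loss = nn.functional.cross_entropy(aux_logits, actions)

total = rl_loss + 0.1 * aux_loss   # the weighting is an assumption
total.backward()
```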
Chao Du, Chongxuan Li, Yin Zheng, Jun Zhu, Cailiang Liu, Hanning Zhou, Bo Zhang
Subjects: Learning (cs.LG)
Besides the success on object recognition, machine translation and system
control in games, (deep) neural networks have achieved state-of-the-art results
in collaborative filtering (CF) recently. Previous neural approaches for CF are
either user-based or item-based, which cannot leverage all relevant information
explicitly. We propose CF-UIcA, a neural co-autoregressive model for CF tasks,
which exploits the structural autoregressiveness in the domains of both users
and items. Furthermore, we separate the inherent dependence in this structure
under a natural assumption and develop an efficient stochastic learning
algorithm to handle large scale datasets. We evaluate CF-UIcA on two popular
benchmarks: MovieLens 1M and Netflix, and achieve state-of-the-art predictive
performance, which demonstrates the effectiveness of CF-UIcA.
Carlos M. Alaíz, Michaël Fanuel, Johan A. K. Suykens
Subjects: Learning (cs.LG)
A graph-based classification method is proposed both for semi-supervised
learning in the case of Euclidean data and for classification in the case of
graph data. Our manifold learning technique is based on a convex optimization
problem involving a convex regularization term and a concave loss function with
a trade-off parameter carefully chosen so that the objective function remains
convex. As shown experimentally, the advantage of considering a concave loss
function is that the learning problem becomes more robust in the presence of
noisy labels. Furthermore, the loss function considered is then more similar to
a classification loss while several other methods treat graph-based
classification problems as regression problems.
Haishuai Wang, Jia Wu, Peng Zhang, Chengqi Zhang
Subjects: Learning (cs.LG)
This paper formulates the problem of learning discriminative features
(i.e., segments) from networked time series data considering the
linked information among time series. For example, social network users are
considered to be social sensors that continuously generate social signals
(tweets) represented as a time series. The discriminative segments are often
referred to as shapelets in a time series. Extracting shapelets for time
series classification has been widely studied. However, existing works on
shapelet selection assume that the time series are independent and identically
distributed (i.i.d.). This assumption restricts their applications to social
networked time series analysis, since a user’s actions can be correlated to
his/her social affiliations. In this paper we propose a new Network Regularized
Least Squares (NetRLS) feature selection model that combines typical time
series data and user network data for analysis. Experiments on real-world
networked time series Twitter and DBLP data demonstrate the performance of the
proposed method. NetRLS performs better than LTS, the state-of-the-art time
series feature selection approach, on real-world data.
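NetRLS itself is not specified in this abstract; the sketch below shows the generic building block its name suggests, least squares with a graph-Laplacian regularizer that smooths per-user estimates over the social network. All symbols are illustrative.

```python
import numpy as np

def network_regularized_ls(y, A, lam=1.0):
    """Solve min_f ||f - y||^2 + lam * f^T L f, where L is the Laplacian of
    the user network with adjacency A; closed form: f = (I + lam*L)^{-1} y."""
    L = np.diag(A.sum(axis=1)) - A
    n = len(y)
    return np.linalg.solve(np.eye(n) + lam * L, y)

# Toy network of two cliques; noisy per-user scores get smoothed within each.
A = np.zeros((6, 6))
A[:3, :3] = 1
A[3:, 3:] = 1
np.fill_diagonal(A, 0)
y = np.array([1.0, 0.8, 1.2, -1.1, -0.9, -1.0])
y += np.random.default_rng(0).normal(0, 0.2, 6)
print(network_regularized_ls(y, A, lam=0.5))
```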
Xi Chen, Kevin Jiao, Qihang Lin
Journal-ref: Journal of Machine Learning Research 17 (2016) 1-40
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Methodology (stat.ME)
Rank aggregation based on pairwise comparisons over a set of items has a wide
range of applications. Although considerable research has been devoted to the
development of rank aggregation algorithms, one basic question is how to
efficiently collect a large amount of high-quality pairwise comparisons for the
ranking purpose. With the advent of many crowdsourcing services, a crowd
of workers is often hired to conduct pairwise comparisons with a small
monetary reward for each pair they compare. Since different workers have
different levels of reliability and different pairs have different levels of
ambiguity, it is desirable to wisely allocate the limited budget for
comparisons among the pairs of items and workers so that the global ranking can
be accurately inferred from the comparison results. To this end, we model the
active sampling problem in crowdsourced ranking as a Bayesian Markov decision
process, which dynamically selects item pairs and workers to improve the
ranking accuracy under a budget constraint. We further develop a
computationally efficient sampling policy based on knowledge gradient as well
as a moment matching technique for posterior approximation. Experimental
evaluations on both synthetic and real data show that the proposed policy
achieves high ranking accuracy with a lower labeling cost.
Badong Chen, Lei Xing, Xin Wang, Jing Qin, Nanning Zheng
Comments: 11 pages, 7 figures, 10 tables
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Correntropy is a second-order statistical measure in kernel space, which has
been successfully applied in robust learning and signal processing. In this
paper, we define a non-second-order statistical measure in kernel space, called
the kernel mean-p power error (KMPE), which includes the correntropic loss
(CLoss) as a special case. Some basic properties of KMPE are presented. In
particular,
we apply the KMPE to extreme learning machine (ELM) and principal component
analysis (PCA), and develop two robust learning algorithms, namely ELM-KMPE and
PCA-KMPE. Experimental results on synthetic and benchmark data show that the
developed algorithms can achieve consistently better performance when compared
with some existing methods.
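The abstract does not give the KMPE formula; under the common assumption of a normalized Gaussian kernel $k$ with $k(0) = 1$, the feature-space distance between $x$ and $y$ is $\sqrt{2(1 - k(x - y))}$, and the mean of its $p$-th power gives a loss of the kind described. The sketch below follows that reading and should be checked against the paper.

```python
import numpy as np

def kmpe(e, sigma=1.0, p=1.5):
    """Kernel mean-p power error of errors e, assuming a normalized Gaussian
    kernel k(e) = exp(-e^2 / (2*sigma^2)). The feature-space distance between
    x and y is sqrt(2*(1 - k(x - y))); KMPE averages its p-th power.
    For p = 2 this reduces, up to scale, to the correntropic loss."""
    k = np.exp(-e**2 / (2 * sigma**2))
    return np.mean((2.0 * (1.0 - k)) ** (p / 2.0))

e = np.array([0.1, -0.2, 3.0])   # includes a large outlier error
# The loss saturates for large errors, which is the source of robustness.
print(kmpe(e, p=2.0), kmpe(e, p=1.0))
```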
Faicel Chamroukhi
Comments: arXiv admin note: substantial text overlap with arXiv:1506.06707
Subjects: Methodology (stat.ME); Learning (cs.LG); Machine Learning (stat.ML)
Mixture of Experts (MoE) is a popular framework in the fields of statistics
and machine learning for modeling heterogeneity in data for regression,
classification and clustering. MoE for continuous data are usually based on the
normal distribution. However, it is known that for data with asymmetric
behavior, heavy tails and atypical observations, the use of the normal
distribution is unsuitable. We introduce a new robust non-normal mixture of
experts model based on the skew-$t$ distribution. The proposed skew-$t$ mixture
of experts, named STMoE, handles these issues of the normal mixture of experts
regarding possibly skewed, heavy-tailed and noisy data. We develop a dedicated
expectation conditional maximization (ECM) algorithm to estimate the model
parameters by monotonically maximizing the observed-data log-likelihood. We
describe how the presented model can be used for prediction and for model-based
clustering of regression data. Numerical experiments carried out on simulated
data show the effectiveness and the robustness of the proposed model in fitting
non-linear regression functions as well as in model-based clustering. The
proposed model is then applied to real-world tone perception data for musical
data analysis, and to temperature anomaly data for the analysis of climate
change. The obtained results confirm the usefulness of the model for practical
data analysis applications.
Italo Atzeni, Marios Kountouris
Comments: Submitted for possible publication
Subjects: Information Theory (cs.IT)
Full-duplex (FD) technology is envisaged as a key component for future mobile
broadband networks due to its ability to boost the spectral efficiency. FD
systems can transmit and receive simultaneously on the same frequency at the
expense of residual self-interference and additional interference to the
network compared with half-duplex (HD) transmission. This paper analyzes the
performance of wireless networks with FD multi-antenna base stations (BSs) and
HD user equipments (UEs) using stochastic geometry. Our analytical results
quantify the success probability and the achievable spectral efficiency and
indicate the amount of self-interference cancellation needed for beneficial FD
operation. The advantages of multi-antenna BSs/UEs are investigated and the
performance gains achieved by optimally balancing desired signal power increase
and interference cancellation are derived. The proposed framework provides
crisp insights on the system-level gains of FD mode with respect to HD mode in
terms of network throughput, as well as useful design guidelines for the
practical implementation of FD technology in large small-cell networks.
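The flavor of such a stochastic-geometry analysis can be reproduced
numerically. The toy Monte Carlo sketch below (my construction, not the
paper's analytical framework) estimates the SIR success probability of a
typical FD link under a PPP of interfering BSs, Rayleigh fading, a power-law
path loss, and a residual self-interference term; all parameter values are
assumptions.

# Toy Monte Carlo estimate of the SIR success probability for a full-duplex
# link in a Poisson field of interferers (illustrative assumptions only).
import numpy as np

rng = np.random.default_rng(0)
lam = 1e-4        # BS density [1/m^2]
alpha = 4.0       # path-loss exponent
r0 = 50.0         # serving-link distance [m]
theta = 1.0       # SIR threshold (0 dB)
rsi = 1e-7        # residual self-interference power (post-cancellation)
R = 2000.0        # simulation window radius [m]

def trial():
    n = rng.poisson(lam * np.pi * R**2)           # number of interferers
    r = R * np.sqrt(rng.random(n))                # radii of a uniform PPP on a disc
    h = rng.exponential(size=n)                   # Rayleigh fading power gains
    interference = np.sum(h * r**(-alpha)) + rsi  # aggregate interference + SI
    signal = rng.exponential() * r0**(-alpha)     # desired signal power
    return signal / interference > theta

succ = np.mean([trial() for _ in range(5000)])
print(f"estimated success probability: {succ:.3f}")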
Mehrnaz Afshang, Harpreet S. Dhillon
Subjects: Information Theory (cs.IT)
This paper develops a comprehensive new approach to the modeling and analysis
of HetNets that accurately incorporates correlation in the locations of users
and base stations, which exists due to the deployment of small cell base
stations (SBSs) at the places of high user density (termed user hotspots in
this paper). Modeling the locations of the geographical centers of user
hotspots as a homogeneous Poisson Point Process (PPP), we assume that the users
and SBSs are clustered around each user hotspot center independently with two
different distributions. The macrocell base station (BS) locations are modeled
by an independent PPP. This model naturally captures correlation that exists
between the locations of users and their serving SBSs. Using this model, we
study the performance of a typical user in terms of coverage probability and
throughput for two association policies: i) power-based association, where a
typical user is served by the open-access BS that provides maximum averaged
received power, and ii) distance-based association, where a typical user is
served by its nearest open-access SBS if it is located within a certain
distance threshold, and by the macro tier otherwise. After deriving all the results in
terms of general distributions describing the locations of users and SBSs
around the geographical center of user hotspots, we specialize the setup to the
Thomas cluster process. A key intermediate step in this analysis is the
derivation of distance distributions from a typical user to the open-access and
closed-access interfering SBSs. Consistent with intuition, our analysis
demonstrates that as the number of SBSs reusing the same resource block
increases (higher frequency reuse), coverage probability decreases whereas
throughput increases. Thus the same resource block can be aggressively reused
by more SBSs as long as the coverage probability remains acceptable.
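For readers unfamiliar with the Thomas cluster process used in the final
specialization step, the following sketch (with invented parameter values)
draws one realization: parent hotspot centers form a PPP, and users and SBSs
are scattered around the same centers with independent Gaussian displacements,
which is what creates the user-SBS correlation described above.

# Sample a Thomas cluster process: PPP parents (hotspot centers) with
# Gaussian-displaced daughters (users and SBSs). Parameters are illustrative;
# daughters may fall outside the window (edge effects ignored in this toy).
import numpy as np

rng = np.random.default_rng(1)
W = 1000.0            # side of the square observation window [m]
lam_p = 5e-6          # density of hotspot centers [1/m^2]
mean_users, sigma_u = 20, 40.0   # mean users per hotspot, user spread [m]
mean_sbs, sigma_s = 3, 25.0      # mean SBSs per hotspot, SBS spread [m]

n_parents = rng.poisson(lam_p * W**2)
parents = rng.uniform(0.0, W, size=(n_parents, 2))

def daughters(centers, mean_n, sigma):
    pts = []
    for c in centers:
        n = rng.poisson(mean_n)                         # Poisson count per cluster
        pts.append(c + rng.normal(0.0, sigma, (n, 2)))  # Gaussian displacements
    return np.vstack(pts) if pts else np.empty((0, 2))

users = daughters(parents, mean_users, sigma_u)
sbss = daughters(parents, mean_sbs, sigma_s)
print(len(parents), "hotspots,", len(users), "users,", len(sbss), "SBSs")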
Pietro Danzi, Marko Angjelichinoski, Čedomir Stefanović, Petar Popovski
Subjects: Information Theory (cs.IT)
In standard implementations of distributed secondary control for DC
MicroGrids (MGs), the exchange of local measurements among neighboring control
agents is enabled via off-the-shelf wireless solutions, such as IEEE 802.11.
However, Denial of Service (DoS) attacks on the wireless interface through
jamming prevent the secondary control system from performing its main tasks,
which might compromise the stability of the MG. In this paper, we propose a
novel, robust and secure secondary control reconfiguration strategy, tailored
to counteract DoS attacks. Specifically, upon detecting the impairment of the
wireless interface, the jammed secondary control agent notifies its peers via a
secure, low-rate powerline channel based on Power Talk communication. This
triggers reconfiguration of the wireless communication graph through primary
control mode switching, where the jammed agents leave the secondary control by
switching to current source mode, and are replaced by nonjammed current sources
that switch to voltage source mode and join the secondary control. The strategy
fits within the software-defined networking framework, where the network
control is split from the data plane using a reliable and secure side Power
Talk communication channel, created via software modification of the MG primary
control loops. The simulation results illustrate the feasibility of the
solution and show that the MG resilience and performance can indeed be
improved via software-defined networking approaches.
Marko Angjelichinoski, Pietro Danzi, Čedomir Stefanović, Petar Popovski
Subjects: Information Theory (cs.IT)
We propose a novel framework for secure and reliable authentication of
Distributed Energy Resources to the centralized secondary/tertiary control
system of a DC MicroGrid (MG), networked using the IEEE 802.11 wireless
interface. The key idea is to perform the authentication using power talk,
which is a powerline communication technique executed by the primary control
loops of the power electronic converters, without the use of dedicated modem
hardware. In addition, the scheme also promotes direct and active
participation of the control system in the authentication process, a feature
not commonly encountered in current networked control systems for MicroGrids.
PLECS-based simulations verify the viability of our scheme.
Dirk Liebhold, Gabriele Nebe, Angeles Vazquez-Castro
Subjects: Information Theory (cs.IT)
We develop a network coding technique based on flags of subspaces and a
corresponding network channel model. To define error-correcting codes, we
introduce a new distance on the flag variety, the Grassmann distance on flags,
and compare it to the commonly used gallery distance for full flags.
Elsa Dupraz, Thomas Maugey, Aline Roumy, Michel Kieffer
Comments: Submitted to IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
This paper introduces a new source coding paradigm called Massive Random
Access (MRA). In MRA, a set of correlated sources is jointly encoded and stored
on a server, and clients want to access only a subset of the sources. Since
the number of simultaneous clients can be huge, the server is only authorized
to extract a bitstream from the stored data: no re-encoding can be performed
before the transmission of the specific client’s request. In this paper, we
formally define the MRA framework and we introduce the notion of rate-storage
region to characterize the performance of MRA. From an information theoretic
analysis, we derive achievable rate-storage bounds for lossless source coding
of i.i.d. and non i.i.d. sources, and rate-storage distortion regions for
Gaussian sources. We also show two practical implementations of MRA systems
based on rate-compatible LDPC codes. Both the theoretical and the experimental
results demonstrate that MRA systems can reach the same transmission rates as
in traditional point-to-point source coding schemes, while having a reasonable
storage cost overhead. These results constitute a breakthrough for many recent
data transmission applications in which only a part of the data is requested by
the clients.
Francisco Revson F. Pereira, Giuliano G. La Guardia, Francisco M. de Assis
Comments: 14 pages, 2 tables
Subjects: Information Theory (cs.IT); Algebraic Geometry (math.AG)
In this paper, we construct new families of convolutional codes. Such codes
are obtained by means of algebraic geometry codes. Additionally, more families
of convolutional codes are constructed by means of puncturing, extending,
expanding and by the direct product code construction applied to algebraic
geometry codes. The parameters of the new convolutional codes are better than
or comparable to the ones available in the literature. In particular, a family of
almost near MDS codes is presented.
Oliver Johnson, Matthew Aldridge, Jonathan Scarlett
Subjects: Information Theory (cs.IT); Probability (math.PR)
We consider the nonadaptive group testing problem in the case that each item
appears in a constant number of tests, chosen uniformly at random with
replacement, so that the testing matrix has (almost) constant column weights.
We analyse the performance of simple and practical algorithms in a range of
sparsity regimes, showing that the performance is consistently improved in
comparison with more standard Bernoulli designs. In particular, using a
constant column weight design, the DD (definite defectives) algorithm is shown to outperform all
possible algorithms for Bernoulli designs in a broad range of sparsity regimes,
and to beat the best-known theoretical guarantees of existing practical
algorithms in all sparsity regimes.
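To make the DD algorithm concrete, here is a short sketch (my own, with
invented parameters) of a near-constant column weight design, where each item
is placed in L tests drawn with replacement, followed by the two DD steps:
items appearing in any negative test are definitely non-defective, and a
remaining candidate is declared definitely defective if it is the sole
candidate in some positive test.

# Definite Defectives (DD) decoding with a constant-column-weight design:
# each item is placed in L tests chosen uniformly at random with replacement.
import numpy as np

rng = np.random.default_rng(2)
n, k, T, L = 500, 10, 200, 14      # items, defectives, tests, tests per item

cols = [np.unique(rng.integers(0, T, size=L)) for _ in range(n)]  # column supports
defective = set(rng.choice(n, size=k, replace=False))
positive = np.zeros(T, dtype=bool)
for i in defective:
    positive[cols[i]] = True        # a test is positive iff it hits a defective

# Step 1 (COMP): items in any negative test are definitely non-defective.
candidates = {i for i in range(n) if positive[cols[i]].all()}

# Step 2: a candidate is definitely defective if it is the only candidate
# in some positive test.
test_members = [set() for _ in range(T)]
for i in candidates:
    for t in cols[i]:
        test_members[t].add(i)
declared = {i for i in candidates
            if any(test_members[t] == {i} for t in cols[i])}

print("declared defective:", sorted(declared))
print("missed:", sorted(defective - declared), "(DD makes no false positives)")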
Xihua Zou, Wei Pan, Ge Yu, Bin Luo, Lianshan Yan
Comments: 5 pages, 6 figures
Subjects: Information Theory (cs.IT)
Frequency offset modulation (FOM) is proposed as a new concept to provide
both high energy efficiency and high spectral efficiency for communications. In
the FOM system, an array of transmitters (TXs) is deployed and only one TX is
activated for data transmission at any signaling time instant. The TX index,
distinguished by a very slight frequency offset within the entire occupied
bandwidth, is exploited to implicitly convey a group of information bits
without any extra power or signal radiation, saving power and spectral
resources. Moreover, the FOM removes the stringent requirements on
distinguishable spatial channels and perfect a priori channel knowledge, while retaining the advantages
of no inter-channel interference and no need of inter-antenna synchronization.
In addition, a hybrid solution integrating the FOM and the spatial modulation
is discussed to further improve the energy efficiency and spectral efficiency.
Consequently, the FOM will be an enabling and green solution to support
ever-increasing high-capacity data traffic in a variety of interdisciplinary
fields.
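As a toy illustration of the FOM idea (my own construction, not the authors'
system model), the sketch below maps a group of bits to one of M transmitter
indices, emits a baseband tone at the corresponding slight frequency offset
(chosen here as exactly one FFT bin so detection is exact), and recovers the
index at the receiver with an FFT peak search.

# Toy frequency offset modulation: the bit group selects which TX (and hence
# which small frequency offset) is active; the receiver detects the offset.
import numpy as np

fs = 1.0e6            # sample rate [Hz]
N = 1000              # samples per symbol (so fs/N = 1 kHz bin spacing)
M = 4                 # number of TXs -> log2(M) bits per symbol
df = 1.0e3            # frequency offset step between TX indices [Hz]

def tx(bits):
    idx = int("".join(map(str, bits)), 2)          # bits -> active TX index
    f = idx * df                                   # slight offset of that TX
    t = np.arange(N) / fs
    return np.exp(2j * np.pi * f * t)              # baseband tone

def rx(sig, snr_db=10.0):
    noise = (np.random.randn(N) + 1j * np.random.randn(N)) / np.sqrt(2)
    sig = sig + noise * 10 ** (-snr_db / 20)
    spec = np.abs(np.fft.fft(sig))
    bins = [int(round(i * df * N / fs)) % N for i in range(M)]
    idx = int(np.argmax(spec[bins]))               # pick the strongest offset
    return [int(b) for b in format(idx, "02b")]

print(rx(tx([1, 0])))   # should recover [1, 0]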
Siddhartha Kumar, Eirik Rosnes, Alexandre Graell i Amat
Comments: Submitted to the 2017 IEEE International Symposium on Information Theory
Subjects: Information Theory (cs.IT)
We propose an information-theoretic private information retrieval (PIR)
scheme for distributed storage systems where data is stored using a linear
systematic code of rate (R > 1/2). The proposed scheme generalizes the PIR
scheme for data stored using maximum distance separable codes recently proposed
by Tajeddine and El Rouayheb for the scenario of a single spy node. We further
propose an algorithm to optimize the communication price of privacy (cPoP)
using the structure of the underlying linear code. As an example, we apply the
proposed algorithm to several distributed storage codes, showing that the cPoP
can be significantly reduced by exploiting the structure of the distributed
storage code.
Gaopeng Jian, Rongquan Feng
Subjects: Information Theory (cs.IT)
Since Ding et al. proposed a general method for constructing linear codes
from defining sets, researchers have obtained a large number of linear
codes with few weights by choosing appropriate defining sets. Let
(\mathbb{F}_q) be a finite field with (q=p^m) elements, where (p) is an odd
prime and (m) is a positive integer. Let (\text{Tr}) denote the trace function
from (\mathbb{F}_q) to (\mathbb{F}_p) and let (D=\{(x,y) \in \mathbb{F}_q^2
\backslash \{(0,0)\} : \text{Tr}(x+y^{p^k+1})=0\}), where (k) is a positive
integer. We define a (p)-ary linear code (C_D) by [
C_D=\{c(a,b)=(\text{Tr}(ax+by))_{(x,y) \in D} : a,b \in \mathbb{F}_q\}. ] In
this paper, we use Weil sums to investigate the weight distribution of (C_D).
We show that the code has three nonzero weights and that it can be used to
construct secret sharing schemes.
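For very small parameters, the weight distribution of (C_D) can be checked by
brute force. The sketch below (illustrative only; the parameter choice p = 3,
m = 2, k = 1 is mine) builds (\mathbb{F}_9 = \mathbb{F}_3[x]/(x^2+1)), forms
the defining set (D), and tabulates all codeword weights; per the abstract one
expects few (at most three) distinct nonzero weights.

# Brute-force weight distribution of C_D over F_9 = F_3[x]/(x^2 + 1),
# with p = 3, m = 2, k = 1 (so y^(p^k + 1) = y^4). Illustrative check only.
from collections import Counter
from itertools import product

p = 3

def add(u, v):
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

def mul(u, v):  # (a0 + a1*x)(b0 + b1*x) with x^2 = -1
    return ((u[0] * v[0] - u[1] * v[1]) % p,
            (u[0] * v[1] + u[1] * v[0]) % p)

def power(u, e):
    r = (1, 0)
    for _ in range(e):
        r = mul(r, u)
    return r

def tr(u):      # trace to F_3: Tr(a) = a + a^3
    return add(u, power(u, 3))[0]   # second coordinate is always 0

F9 = list(product(range(p), repeat=2))
D = [(x, y) for x in F9 for y in F9
     if (x, y) != ((0, 0), (0, 0)) and tr(add(x, power(y, 4))) == 0]

weights = Counter()
for a in F9:
    for b in F9:
        w = sum(tr(add(mul(a, x), mul(b, y))) != 0 for (x, y) in D)
        weights[w] += 1

print(sorted(weights.items()))   # (weight, multiplicity) pairs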
Qian Gao, Chen Gong, Zhengyuan Xu
Comments: This work was submitted to the IEEE Transactions on Wireless Communications on Feb. 16, 2016 and is currently under review. An abridged version of this manuscript was accepted by the IEEE Globecom 2016
Subjects: Information Theory (cs.IT)
In this paper, we investigate the problem of the joint transceiver and offset
design (JTOD) for point-to-point multiple-input-multiple-output (MIMO) and
multiple user multiple-input-single-output (MU-MISO) visible light
communication (VLC) systems. Both uplink and downlink multi-user scenarios are
considered. The shot noise induced by the incoming signals is considered,
leading to a more realistic MIMO VLC channel model. Under key lighting
constraints, we formulate non-convex optimization problems aiming at minimizing
the sum mean squared error. To optimize the transceiver and the offset jointly,
we resort to a gradient projection based procedure. When only imperfect
channel state information is available, a semidefinite programming (SDP) based
scheme is proposed to obtain robust transceiver and offset. The proposed method
is shown to non-trivially outperform the conventional scaled zero forcing (ZF)
and singular value decomposition (SVD) based equalization methods. The robust
scheme works particularly well when the signal is much stronger than the noise.
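The gradient projection step can be sketched generically: take a gradient step
on the (non-convex) MSE objective and project back onto the feasible set. The
snippet below is a minimal sketch in which a made-up box constraint stands in
for the paper's lighting constraints and a generic least-squares objective
stands in for the sum-MSE.

# Minimal projected-gradient loop; the box constraint is a stand-in for the
# paper's lighting constraints, and the objective is a generic placeholder.
import numpy as np

def project(x, lo, hi):
    return np.clip(x, lo, hi)          # Euclidean projection onto a box

def grad_f(x, A, b):
    return A.T @ (A @ x - b)           # gradient of 0.5*||Ax - b||^2

rng = np.random.default_rng(3)
A, b = rng.normal(size=(8, 4)), rng.normal(size=8)
x = np.zeros(4)
step = 0.05
for _ in range(500):
    x = project(x - step * grad_f(x, A, b), lo=0.0, hi=1.0)
print("constrained minimizer estimate:", np.round(x, 3))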
Biao He, An Liu, Nan Yang, Vincent K. N. Lau
Subjects: Information Theory (cs.IT)
This paper proposes a new design of non-orthogonal multiple access (NOMA)
under secrecy considerations. We focus on a NOMA system where a transmitter
sends confidential messages to multiple users in the presence of an external
eavesdropper. The optimal designs of decoding order, transmission rates, and
power allocated to each user are investigated. Considering the practical
passive eavesdropping scenario where the instantaneous channel state of the
eavesdropper is unknown, we adopt the secrecy outage probability as the secrecy
metric. We first consider the problem of minimizing the transmit power subject
to the secrecy outage and quality of service constraints, and derive the
closed-form solution to this problem. We then explore the problem of maximizing
the minimum confidential information rate among users subject to the secrecy
outage and transmit power constraints, and provide an iterative algorithm to
solve this problem. We find that the secrecy outage constraint in the studied
problems does not change the optimal decoding order for NOMA, and one should
increase the power allocated to the user whose channel is relatively bad when
the secrecy constraint becomes more stringent. Finally, we show the advantage
of NOMA over orthogonal multiple access in the studied problems both
analytically and numerically.
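As a point of reference for the secrecy outage metric (the standard
passive-eavesdropping formulation, not the paper's specific multi-user
derivation): with a Rayleigh-faded eavesdropper channel the eavesdropper's SNR
(\gamma_e) is exponentially distributed, and a transmission at codeword rate
(R_b) carrying confidential rate (R_s) is in secrecy outage when
(\log_2(1+\gamma_e) > R_b - R_s), giving outage probability
(\exp(-(2^{R_b-R_s}-1)/\bar{\gamma}_e)). A quick numerical check:

# Secrecy outage probability for a Rayleigh-faded eavesdropper:
# closed form vs. Monte Carlo (standard textbook setting, illustrative).
import numpy as np

Rb, Rs = 3.0, 1.0          # codeword rate and confidential rate [bits/s/Hz]
gamma_e_bar = 2.0          # average eavesdropper SNR

threshold = 2 ** (Rb - Rs) - 1
closed_form = np.exp(-threshold / gamma_e_bar)

rng = np.random.default_rng(4)
gamma_e = rng.exponential(gamma_e_bar, size=200000)
monte_carlo = np.mean(np.log2(1 + gamma_e) > Rb - Rs)

print(f"closed form: {closed_form:.4f}, Monte Carlo: {monte_carlo:.4f}")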
Sergey Loyka, Charalambos D. Charalambous
Comments: Submitted to the IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
A discrete compound channel with memory is considered, where no stationarity,
ergodicity or information stability is required, and where the uncertainty set
can be arbitrary. When the discrete noise is additive but otherwise arbitrary
and there is no cost constraint on the input, it is shown that the causal
feedback does not increase the capacity. This extends the earlier result
obtained for general single-state channels with full transmitter (Tx) channel
state information (CSI) to the compound setting. It is further shown that, for
this compound setting and under a mild technical condition on the additive
noise, the addition of the full Tx CSI does not increase the capacity either,
so that the worst-case and compound channel capacities are the same. This can
also be expressed as a saddle-point in the information-theoretic game between
the transmitter (who selects the input distribution) and nature (which
selects the channel state), even though the objective function (the
inf-information rate) is not convex/concave in the required way. Cases where the
Tx CSI does increase the capacity are identified.
Conditions under which the strong converse holds for this channel are
studied. The ergodic behaviour of the worst-case noise in an otherwise
information-unstable channel is shown to be both sufficient and necessary for
the strong converse to hold, in both the feedback and no-feedback cases.
Giuliano Gadioli La Guardia (corresponding author), Francisco Revson F. Pereira
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
In this paper we construct several new families of quantum codes with good
and asymptotically good parameters. These new quantum codes are derived from
(classical) algebraic geometry (AG) codes by applying the
Calderbank-Shor-Steane (CSS) construction. Many of these codes have large
minimum distances compared with their code lengths, and they have relatively
small Singleton defects. For example, we construct a family
([[46, 2(t_2 - t_1), d]]_{25}) of quantum codes, where (t_1, t_2) are positive
integers such that (1 < t_1 < t_2 < 23) and (d \geq \min\{46 - 2t_2,
2t_1 - 2\}), of length (n = 46), with minimum distance in the range
(2 \leq d \leq 20), having Singleton defect four. Additionally, by utilizing
(t)-point AG codes, with (t \geq 2), we show how to obtain sequences of
asymptotically good quantum codes.
Xerxes D. Arsiwalla, Paul Verschure
Comments: 16 pages, 6 figures
Subjects: Neurons and Cognition (q-bio.NC); Information Theory (cs.IT); Dynamical Systems (math.DS); Biological Physics (physics.bio-ph)
How much information do large brain networks integrate as a whole, beyond the
sum of their parts? Can the dynamical complexity of such networks be globally
quantified in an information-theoretic way and be meaningfully coupled to brain
function? Recently, measures of dynamical complexity such as integrated
information have been proposed. However, problems related to normalization and
to the Bell number of partitions associated with these measures make these
approaches computationally infeasible for large-scale brain networks. Our goal
in this work is to address this problem. Our formulation of network integrated
information is based on the Kullback-Leibler divergence between the
multivariate distribution on the set of network states versus the corresponding
factorized distribution over its parts. We find that implementing the maximum
information partition optimizes computations. These methods are well-suited for
large networks with linear stochastic dynamics. We compute the integrated
information both for the system’s attractor states and for non-stationary
dynamical states of the network. We then apply this formalism to brain networks
to compute the integrated information for the human brain’s connectome.
Compared to a randomly re-wired network, we find that the specific topology of
the brain generates greater information complexity.
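For a network whose stationary state is multivariate Gaussian (the linear
stochastic setting mentioned above), the KL divergence between the joint
distribution and the distribution factorized across a bipartition has a closed
form. The sketch below (my own simplification to a single fixed bipartition,
not the authors' maximum-information-partition pipeline, on an invented toy
covariance) computes it.

# Integrated information across a bipartition for a zero-mean Gaussian state:
# KL( N(0, Sigma) || N(0, blockdiag(Sigma_A, Sigma_B)) )
#   = 0.5 * (logdet Sigma_A + logdet Sigma_B - logdet Sigma)  [in nats]
import numpy as np

def phi_bipartition(Sigma, part_a):
    n = Sigma.shape[0]
    a = np.asarray(part_a)
    b = np.setdiff1d(np.arange(n), a)
    logdet = lambda M: np.linalg.slogdet(M)[1]
    return 0.5 * (logdet(Sigma[np.ix_(a, a)])
                  + logdet(Sigma[np.ix_(b, b)])
                  - logdet(Sigma))

# Random covariance for a 6-node toy network (illustrative only).
rng = np.random.default_rng(5)
B = rng.normal(size=(6, 6))
Sigma = B @ B.T + 6 * np.eye(6)     # symmetric positive definite
print(f"phi across {{0,1,2}} | {{3,4,5}}: {phi_bipartition(Sigma, [0,1,2]):.4f} nats")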