IT博客汇 | arXiv Paper Daily: Mon, 20 Mar 2017

arXiv Paper Daily: Mon, 20 Mar 2017

我爱机器学习(52ml.net)发表于 2017-03-20 00:00:00

Neural and Evolutionary Computing

Implicit Gradient Neural Networks with a Positive-Definite Mass Matrix for Online Linear Equations Solving

Ke Chen
Comments: Submitted to Information Processing Letters
Subjects: Neural and Evolutionary Computing (cs.NE); Systems and Control (cs.SY)

Motivated by the advantages achieved by implicit analogue net for solving
online linear equations, a novel implicit neural model is designed based on
conventional explicit gradient neural networks in this letter by introducing a
positive-definite mass matrix. In addition to taking the advantages of the
implicit neural dynamics, the proposed implicit gradient neural networks can
still achieve globally exponential convergence to the unique theoretical
solution of linear equations and also global stability even under no-solution
and multi-solution situations. Simulative results verify theoretical
convergence analysis on the proposed neural dynamics.

Reservoir Computing and Extreme Learning Machines using Pairs of Cellular Automata Rules

Nathan McDonald
Comments: accepted to International Joint Conference on Neural Networks (IJCNN 2017)
Subjects: Neural and Evolutionary Computing (cs.NE)

A framework for implementing reservoir computing (RC) and extreme learning
machines (ELMs), two types of artificial neural networks, based on 1D
elementary Cellular Automata (CA) is presented, in which two separate CA rules
explicitly implement the minimum computational requirements of the reservoir
layer: hyperdimensional projection and short-term memory. CAs are cell-based
state machines, which evolve in time in accordance with local rules based on a
cells current state and those of its neighbors. Notably, simple single cell
shift rules as the memory rule in a fixed edge CA afforded reasonable success
in conjunction with a variety of projection rules, potentially significantly
reducing the optimal solution search space. Optimal iteration counts for the CA
rule pairs can be estimated for some tasks based upon the category of the
projection rule. Initial results support future hardware realization, where CAs
potentially afford orders of magnitude reduction in size, weight, and power
(SWaP) requirements compared with floating point RC implementations.

Pattern representation and recognition with accelerated analog neuromorphic systems

Mihai A. Petrovici, Sebastian Schmitt, Johann Klähn, David Stöckel, Anna Schroeder, Guillaume Bellec, Johannes Bill, Oliver Breitwieser, Ilja Bytschok, Andreas Grübl, Maurice Güttler, Andreas Hartel, Stephan Hartmann, Dan Husmann, Kai Husmann, Sebastian Jeltsch, Vitali Karasenko, Mitja Kleider, Christoph Koke, Alexander Kononov, Christian Mauch, Paul Müller, Johannes Partzsch, Thomas Pfeil, Stefan Schiefer, Stefan Scholze, Anand Subramoney, Vasilis Thanasoulis, Bernhard Vogginger, Robert Legenstein, Wolfgang Maass, René Schüffny, Christian Mayr, Johannes Schemmel, Karlheinz Meier
Comments: accepted at ISCAS 2017
Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Despite being originally inspired by the central nervous system, artificial
neural networks have diverged from their biological archetypes as they have
been remodeled to fit particular tasks. In this paper, we review several
possibilites to reverse map these architectures to biologically more realistic
spiking networks with the aim of emulating them on fast, low-power neuromorphic
hardware. Since many of these devices employ analog components, which cannot be
perfectly controlled, finding ways to compensate for the resulting effects
represents a key challenge. Here, we discuss three different strategies to
address this problem: the addition of auxiliary network components for
stabilizing activity, the utilization of inherently robust architectures and a
training method for hardware-emulated networks that functions without perfect
knowledge of the system’s dynamics and parameters. For all three scenarios, we
corroborate our theoretical considerations with experimental results on
accelerated analog neuromorphic platforms.

Computer Vision and Pattern Recognition

PSF field learning based on Optimal Transport Distances

F. M. Ngolè Mboula, J.-L. Starck
Subjects: Computer Vision and Pattern Recognition (cs.CV); Instrumentation and Methods for Astrophysics (astro-ph.IM)

Context: in astronomy, observing large fractions of the sky within a
reasonable amount of time implies using large field-of-view (fov) optical
instruments that typically have a spatially varying Point Spread Function
(PSF). Depending on the scientific goals, galaxies images need to be corrected
for the PSF whereas no direct measurement of the PSF is available. Aims: given
a set of PSFs observed at random locations, we want to estimate the PSFs at
galaxies locations for shapes measurements correction. Contributions: we
propose an interpolation framework based on Sliced Optimal Transport. A
non-linear dimension reduction is first performed based on local pairwise
approximated Wasserstein distances. A low dimensional representation of the
unknown PSFs is then estimated, which in turn is used to derive representations
of those PSFs in the Wasserstein metric. Finally, the interpolated PSFs are
calculated as approximated Wasserstein barycenters. Results: the proposed
method was tested on simulated monochromatic PSFs of the Euclid space mission
telescope (to be launched in 2020). It achieves a remarkable accuracy in terms
of pixels values and shape compared to standard methods such as Inverse
Distance Weighting or Radial Basis Function based interpolation methods.

Towards Diverse and Natural Image Descriptions via a Conditional GAN

Bo Dai, Dahua Lin, Raquel Urtasun, Sanja Fidler
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Despite the substantial progress in recent years, the image captioning
techniques are still far from being perfect.Sentences produced by existing
methods, e.g. those based on RNNs, are often overly rigid and lacking in
variability. This issue is related to a learning principle widely used in
practice, that is, to maximize the likelihood of training samples. This
principle encourages high resemblance to the “ground-truth” captions while
suppressing other reasonable descriptions. Conventional evaluation metrics,
e.g. BLEU and METEOR, also favor such restrictive methods. In this paper, we
explore an alternative approach, with the aim to improve the naturalness and
diversity — two essential properties of human expression. Specifically, we
propose a new framework based on Conditional Generative Adversarial Networks
(CGAN), which jointly learns a generator to produce descriptions conditioned on
images and an evaluator to assess how well a description fits the visual
content. It is noteworthy that training a sequence generator is nontrivial. We
overcome the difficulty by Policy Gradient, a strategy stemming from
Reinforcement Learning, which allows the generator to receive early feedback
along the way. We tested our method on two large datasets, where it performed
competitively against real people in our user study and outperformed other
methods on various tasks.

Color Orchestra: Ordering Color Palettes for Interpolation and Prediction

Huy Q. Phan, Hongbo Fu, Antoni B. Chan
Comments: IEEE Transactions on Visualization and Computer Graphics
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Color theme or color palette can deeply influence the quality and the feeling
of a photograph or a graphical design. Although color palettes may come from
different sources such as online crowd-sourcing, photographs and graphical
designs, in this paper, we consider color palettes extracted from fine art
collections, which we believe to be an abundant source of stylistic and unique
color themes. We aim to capture color styles embedded in these collections by
means of statistical models and to build practical applications upon these
models. As artists often use their personal color themes in their paintings,
making these palettes appear frequently in the dataset, we employed density
estimation to capture the characteristics of palette data. Via density
estimation, we carried out various predictions and interpolations on palettes,
which led to promising applications such as photo-style exploration, real-time
color suggestion, and enriched photo recolorization. It was, however,
challenging to apply density estimation to palette data as palettes often come
as unordered sets of colors, which make it difficult to use conventional
metrics on them. To this end, we developed a divide-and-conquer sorting
algorithm to rearrange the colors in the palettes in a coherent order, which
allows meaningful interpolation between color palettes. To confirm the
performance of our model, we also conducted quantitative experiments on
datasets of digitized paintings collected from the Internet and received
favorable results.

Auxiliary Manifold Embedding for Fully Convolutional Networks

Christoph Baur, Shadi Albarqouni, Nassir Navab
Comments: 9 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Deep learning usually requires large amounts of labeled training data, but
annotating data is costly and tedious. The framework of semi-supervised
learning provides the means to use both labeled data and arbitrary amounts of
unlabeled data for training. Recently, semi-supervised deep learning has been
intensively studied for standard CNN architectures. However, Fully
Convolutional Networks (FCNs) set the state-of-the-art for many image
segmentation tasks. To the best of our knowledge, there is no existing
semi-supervised learning method for such FCNs yet. We lift the concept of
auxiliary manifold embedding for semi-supervised learning to FCNs with the help
of Random Feature Embedding. In our experiments on the challenging task of MS
Lesion Segmentation, we leverage the proposed framework for the purpose of
domain adaptation and report substantial improvements over the baseline model.

Comparison of Different Methods for Tissue Segmentation in Histopathological Whole-Slide Images

Péter Bándi, Rob van de Loo, Milad Intezar, Daan Geijs, Francesco Ciompi, Bram van Ginneken, Jeroen van der Laak, Geert Litjens
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

Tissue segmentation is an important pre-requisite for efficient and accurate
diagnostics in digital pathology. However, it is well known that whole-slide
scanners can fail in detecting all tissue regions, for example due to the
tissue type, or due to weak staining because their tissue detection algorithms
are not robust enough. In this paper, we introduce two different convolutional
neural network architectures for whole slide image segmentation to accurately
identify the tissue sections. We also compare the algorithms to a published
traditional method. We collected 54 whole slide images with differing stains
and tissue types from three laboratories to validate our algorithms. We show
that while the two methods do not differ significantly they outperform their
traditional counterpart (Jaccard index of 0.937 and 0.929 vs. 0.870, p < 0.01).

Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery

Thomas Schlegl, Philipp Seeböck, Sebastian M. Waldstein, Ursula Schmidt-Erfurth, Georg Langs
Comments: To be published in the proceedings of the international conference on Information Processing in Medical Imaging (IPMI), 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

Obtaining models that capture imaging markers relevant for disease
progression and treatment monitoring is challenging. Models are typically based
on large amounts of data with annotated examples of known markers aiming at
automating detection. High annotation effort and the limitation to a vocabulary
of known markers limit the power of such approaches. Here, we perform
unsupervised learning to identify anomalies in imaging data as candidates for
markers. We propose AnoGAN, a deep convolutional generative adversarial network
to learn a manifold of normal anatomical variability, accompanying a novel
anomaly scoring scheme based on the mapping from image space to a latent space.
Applied to new data, the model labels anomalies, and scores image patches
indicating their fit into the learned distribution. Results on optical
coherence tomography images of the retina demonstrate that the approach
correctly identifies anomalous images, such as images containing retinal fluid
or hyperreflective foci.

Computer Aided Detection of Anemia-like Pallor

Sohini Roychowdhury, Donny Sun, Matthew Bihis, Johnny Ren, Paul Hage, Humairat H. Rahman
Comments: 4 pages,2 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Paleness or pallor is a manifestation of blood loss or low hemoglobin
concentrations in the human blood that can be caused by pathologies such as
anemia. This work presents the first automated screening system that utilizes
pallor site images, segments, and extracts color and intensity-based features
for multi-class classification of patients with high pallor due to anemia-like
pathologies, normal patients and patients with other abnormalities. This work
analyzes the pallor sites of conjunctiva and tongue for anemia screening
purposes. First, for the eye pallor site images, the sclera and conjunctiva
regions are automatically segmented for regions of interest. Similarly, for the
tongue pallor site images, the inner and outer tongue regions are segmented.
Then, color-plane based feature extraction is performed followed by machine
learning algorithms for feature reduction and image level classification for
anemia. In this work, a suite of classification algorithms image-level
classifications for normal (class 0), pallor (class 1) and other abnormalities
(class 2). The proposed method achieves 86% accuracy, 85% precision and 67%
recall in eye pallor site images and 98.2% accuracy and precision with 100%
recall in tongue pallor site images for classification of images with pallor.
The proposed pallor screening system can be further fine-tuned to detect the
severity of anemia-like pathologies using controlled set of local images that
can then be used for future benchmarking purposes.

Learning Robust Visual-Semantic Embeddings

Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov
Comments: 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

Many of the existing methods for learning joint embedding of images and text
use only supervised information from paired images and its textual attributes.
Taking advantage of the recent success of unsupervised learning in deep neural
networks, we propose an end-to-end learning framework that is able to extract
more robust multi-modal representations across domains. The proposed method
combines representation learning models (i.e., auto-encoders) together with
cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
joint embeddings for semantic and visual features. A novel technique of
unsupervised-data adaptation inference is introduced to construct more
comprehensive embeddings for both labeled and unlabeled data. We evaluate our
method on Animals with Attributes and Caltech-UCSD Birds 200-2011 dataset with
a wide range of applications, including zero and few-shot image recognition and
retrieval, from inductive to transductive settings. Empirically, we show that
our framework improves over the current state of the art on many of the
considered tasks.

Need for Speed: A Benchmark for Higher Frame Rate Object Tracking

Hamed Kiani Galoogahi, Ashton Fagg, Chen Huang, Deva Ramanan, Simon Lucey
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we propose the first higher frame rate video dataset (called
Need for Speed – NfS) and benchmark for visual object tracking. The dataset
consists of 100 videos (380K frames) captured with now commonly available
higher frame rate (240 FPS) cameras from real world scenarios. All frames are
annotated with axis aligned bounding boxes and all sequences are manually
labelled with nine visual attributes – such as occlusion, fast motion,
background clutter, etc. Our benchmark provides an extensive evaluation of many
recent and state-of-the-art trackers on higher frame rate sequences. We ranked
each of these trackers according to their tracking accuracy and real-time
performance. One of our surprising conclusions is that at higher frame rates,
simple trackers such as correlation filters outperform complex methods based on
deep networks. This suggests that for practical applications (such as in
robotics or embedded vision), one needs to carefully tradeoff bandwidth
constraints associated with higher frame rate acquisition, computational costs
of real-time analysis, and the required application accuracy. Our dataset and
benchmark allows for the first time (to our knowledge) systematic exploration
of such issues, and will be made available to allow for further research in
this space.

DropRegion Training of Inception Font Network for High-Performance Chinese Font Recognition

Shuangping Huangm Zhuoyao Zhong, Lianwen Jin, Shuye Zhang, Haobin Wang
Comments: 15 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Chinese font recognition (CFR) has gained significant attention in recent
years. However, due to the sparsity of labeled font samples and the structural
complexity of Chinese characters, CFR is still a challenging task. In this
paper, a DropRegion method is proposed to generate a large number of stochastic
variant font samples whose local regions are selectively disrupted and an
inception font network (IFN) with two additional convolutional neural network
(CNN) structure elements, i.e., a cascaded cross-channel parametric pooling
(CCCP) and global average pooling, is designed. Because the distribution of
strokes in a font image is non-stationary, an elastic meshing technique that
adaptively constructs a set of local regions with equalized information is
developed. Thus, DropRegion is seamlessly embedded in the IFN, which enables
end-to-end training; the proposed DropRegion-IFN can be used for high
performance CFR. Experimental results have confirmed the effectiveness of our
new approach for CFR.

Understanding Traffic Density from Large-Scale Web Camera Data

Shanghang Zhang, Guanhang Wu, Joao P. Costeira, Jose M. F. Moura
Comments: Accepted by CVPR 2017. Preprint version was uploaded on this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we estimate traffic density from low quality videos captured
by city web cameras (webcams). Webcam videos have low resolution, low frame
rate, high occlusion and large perspective, making most existing methods lose
their efficacy. To deeply understand traffic density, we explore both deep
learning based and optimization based methods. To avoid individual vehicle
detection and tracking, both methods map the image into vehicle density map,
one based on rank constrained regression and the other one based on fully
convolution networks (FCN). The regression based method learns different
weights for different blocks in the image to increase freedom degrees of
weights and embed perspective information. The FCN based method jointly
estimates vehicle density map and vehicle count with a residual learning
framework to perform end-to-end dense prediction, allowing arbitrary image
resolution, and adapting to different vehicle scales and perspectives. We
analyze and compare both methods, and get insights from optimization based
method to improve deep model. Since existing datasets do not cover all the
challenges in our work, we collected and labelled a large-scale traffic video
dataset, containing 60 million frames from 212 webcams. Both methods are
extensively evaluated and compared on different counting tasks and three
datasets, with experimental results demonstrating their effectiveness and
robustness. In particular, FCN based method significantly reduces the mean
absolute value from 10.99 to 5.31 on the public dataset TRANCOS compared with
the state-of-the-art baseline.

Towards Closing the Energy Gap Between HOG and CNN Features for Embedded Vision

Amr Suleiman, Yu-Hsin Chen, Joel Emer, Vivienne Sze
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Computer vision enables a wide range of applications in robotics/drones,
self-driving cars, smart Internet of Things, and portable/wearable electronics.
For many of these applications, local embedded processing is preferred due to
privacy and/or latency concerns. Accordingly, energy-efficient embedded vision
hardware delivering real-time and robust performance is crucial. While deep
learning is gaining popularity in several computer vision algorithms, a
significant energy consumption difference exists compared to traditional
hand-crafted approaches. In this paper, we provide an in-depth analysis of the
computation, energy and accuracy trade-offs between learned features such as
deep Convolutional Neural Networks (CNN) and hand-crafted features such as
Histogram of Oriented Gradients (HOG). This analysis is supported by
measurements from two chips that implement these algorithms. Our goal is to
understand the source of the energy discrepancy between the two approaches and
to provide insight about the potential areas where CNNs can be improved and
eventually approach the energy-efficiency of HOG while maintaining its
outstanding performance accuracy.

Automatically identifying wild animals in camera trap images with deep learning

Mohammed Sadegh Norouzzadeh, Anh Nguyen, Margaret Kosmala, Ali Swanson, Craig Packer, Jeff Clune
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

Having accurate, detailed, and up-to-date information about wildlife location
and behavior across broad geographic areas would revolutionize our ability to
study, conserve, and manage species and ecosystems. Currently such data are
mostly gathered manually at great expense, and thus are sparsely and
infrequently collected. Here we investigate the ability to automatically,
accurately, and inexpensively collect such data from motion sensor cameras.
These camera traps enable pictures of wildlife to be collected inexpensively,
unobtrusively, and at high-volume. However, identifying the animals, animal
attributes, and behaviors in these pictures remains an expensive,
time-consuming, manual task often performed by researchers, hired technicians,
or crowdsourced teams of human volunteers. In this paper, we demonstrate that
such data can be automatically extracted by deep neural networks (aka deep
learning), which is a cutting-edge type of artificial intelligence. In
particular, we use the existing human-labeled images from the Snapshot
Serengeti dataset to train deep convolutional neural networks for identifying
48 species in 3.2 million images taken from Tanzania’s Serengeti National Park.
We train neural networks that automatically identify animals with over 92%
accuracy. More importantly, we can choose to have our system classify only the
images it is highly confident about, allowing valuable human time to be focused
only on challenging images. In this case, our automatic animal identification
system saves approximately ~8.3 years (at 40 hours per week) of human labeling
effort (i.e. over 17,000 hours) while operating on a 3.2-million-image dataset
at the same 96% accuracy level of crowdsourced teams of human volunteers. Those
efficiency gains immediately highlight the importance of using deep neural
networks to automate data extraction from camera trap images.

Low-rank and Sparse NMF for Joint Endmembers' Number Estimation and Blind Unmixing of Hyperspectral Images

Paris V. Giampouras, Athanasios A. Rontogiannis, Konstantinos D. Koutroumbas
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Estimation of the number of endmembers existing in a scene constitutes a
critical task in the hyperspectral unmixing process. The accuracy of this
estimate plays a crucial role in subsequent unsupervised unmixing steps i.e.,
the derivation of the spectral signatures of the endmembers (endmembers’
extraction) and the estimation of the abundance fractions of the pixels. A
common practice amply followed in literature is to treat endmembers’ number
estimation and unmixing, independently as two separate tasks, providing the
outcome of the former as input to the latter. In this paper, we go beyond this
computationally demanding strategy. More precisely, we set forth a multiple
constrained optimization framework, which encapsulates endmembers’ number
estimation and unsupervised unmixing in a single task. This is attained by
suitably formulating the problem via a low-rank and sparse nonnegative matrix
factorization rationale, where low-rankness is promoted with the use of a
sophisticated (ell_2/ell_1) norm penalty term. An alternating proximal
algorithm is then proposed for minimizing the emerging cost function. The
results obtained by simulated and real data experiments verify the
effectiveness of the proposed approach.

Artificial Intelligence

Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks

Denis Deratani Mauá, Cassio P. de Campos
Comments: 14 pages
Subjects: Artificial Intelligence (cs.AI)

We discuss the computational complexity of approximating maximum a posteriori
inference in sum-product networks. We first show NP-hardness in three-level
trees by a reduction from maximum independent set; this implies
non-approximability within a sublinear factor. We show that this is a tight
bound, as we can find an approximation within a linear factor in three-level
networks. We then show that in four-level trees it is NP-hard to approximate
the problem within a factor (2^{f(n)}) for any sublinear function (f) of the
size of the input (n). Again, this is bound is tight, as we prove that the
usual max-product algorithm finds (in any network) approximations within factor
(2^{c n}) from some constant (c < 1). Last, we present a simple algorithm, and
show that it provably produces solutions at least as good as, and potentially
much better than, the max-product algorithm.

A Visual Web Tool to Perform What-If Analysis of Optimization Approaches

Sascha Van Cauwelaert, Michele Lombardi, Pierre Schaus
Subjects: Artificial Intelligence (cs.AI); Performance (cs.PF)

In Operation Research, practical evaluation is essential to validate the
efficacy of optimization approaches. This paper promotes the usage of
performance profiles as a standard practice to visualize and analyze
experimental results. It introduces a Web tool to construct and export
performance profiles as SVG or HTML files. In addition, the application relies
on a methodology to estimate the benefit of hypothetical solver improvements.
Therefore, the tool allows one to employ what-if analysis to screen possible
research directions, and identify those having the best potential. The approach
is showcased on two Operation Research technologies: Constraint Programming and
Mixed Integer Linear Programming.

Generalised Reichenbachian Common Cause Systems

Claudio Mazzola
Subjects: Other Statistics (stat.OT); Artificial Intelligence (cs.AI)

The principle of the common cause claims that if an improbable coincidence
has occurred, there must exist a common cause. This is generally taken to mean
that positive correlations between non-causally related events should disappear
when conditioning on the action of some underlying common cause. The extended
interpretation of the principle, by contrast, urges that common causes should
be called for in order to explain positive deviations between the estimated
correlation of two events and the expected value of their correlation. The aim
of this paper is to provide the extended reading of the principle with a
general probabilistic model, capturing the simultaneous action of a system of
multiple common causes. To this end, two distinct models are elaborated, and
the necessary and sufficient conditions for their existence are determined.

Modeling Relational Data with Graph Convolutional Networks

Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Databases (cs.DB); Learning (cs.LG)

Knowledge bases play a crucial role in many applications, for example
question answering and information retrieval. Despite the great effort invested
in creating and maintaining them, even the largest representatives (e.g., Yago,
DBPedia or Wikidata) are highly incomplete. We introduce relational graph
convolutional networks (R-GCNs) and apply them to two standard knowledge base
completion tasks: link prediction (recovery of missing facts, i.e.
subject-predicate-object triples) and entity classification (recovery of
missing attributes of entities). R-GCNs are a generalization of graph
convolutional networks, a recent class of neural networks operating on graphs,
and are developed specifically to deal with highly multi-relational data,
characteristic of realistic knowledge bases. Our methods achieve competitive
results on standard benchmarks for both tasks.

Particle Value Functions

Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

The policy gradients of the expected return objective can react slowly to
rare rewards. Yet, in some cases agents may wish to emphasize the low or high
returns regardless of their probability. Borrowing from the economics and
control literature, we review the risk-sensitive value function that arises
from an exponential utility and illustrate its effects on an example. This
risk-sensitive value function is not always applicable to reinforcement
learning problems, so we introduce the particle value function defined by a
particle filter over the distributions of an agent’s experience, which bounds
the risk-sensitive one. We illustrate the benefit of the policy gradients of
this objective in Cliffworld.

Information Retrieval

Global Entity Ranking Across Multiple Languages

Prantik Bhattacharyya, Nemanja Spasojevic
Comments: 2 Pages, 1 Figure, 2 Tables, WWW2017 Companion, WWW 2017 Companion
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Social and Information Networks (cs.SI)

We present work on building a global long-tailed ranking of entities across
multiple languages using Wikipedia and Freebase knowledge bases. We identify
multiple features and build a model to rank entities using a ground-truth
dataset of more than 10 thousand labels. The final system ranks 27 million
entities with 75% precision and 48% F1 score. We provide performance evaluation
and empirical evidence of the quality of ranking across languages, and open the
final ranked lists for future research.

Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture

Yuanliang Meng, Anna Rumshisky, Alexey Romanov
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

In this paper, we propose to use a set of simple, uniform in architecture
LSTM-based models to recover different kinds of temporal relations from text.
Using the shortest dependency path between entities as input, the same
architecture is used to extract intra-sentence, cross-sentence, and document
creation time relations. A “double-checking” technique reverses entity pairs in
classification, boosting the recall of positive cases and reducing
misclassifications between opposite classes. An efficient pruning algorithm
resolves conflicts globally. Evaluated on QA-TempEval (SemEval2015 Task 5), our
proposed technique outperforms state-of-the-art methods by a large margin.

Search Engine Drives the Evolution of Social Networks

Cai Fu, Chenchen Peng, Xiao-Yang Liu
Comments: 9 pages, 11 figures
Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR); Physics and Society (physics.soc-ph)

The search engine is tightly coupled with social networks and is primarily
designed for users to acquire interested information. Specifically, the search
engine assists the information dissemination for social networks, i.e.,
enabling users to access interested contents with keywords-searching and
promoting the process of contents-transferring from the source users directly
to potential interested users. Accompanying such processes, the social network
evolves as new links emerge between users with common interests. However, there
is no clear understanding of such a “chicken-and-egg” problem, namely, new
links encourage more social interactions, and vice versa. In this paper, we aim
to quantitatively characterize the social network evolution phenomenon driven
by a search engine. First, we propose a search network model for social network
evolution. Second, we adopt two performance metrics, namely, degree
distribution and network diameter. Theoretically, we prove that the degree
distribution follows an intensified power-law, and the network diameter
shrinks. Third, we quantitatively show that the search engine accelerates the
rumor propagation in social networks. Finally, based on four real-world data
sets (i.e., CDBLP, Facebook, Weibo Tweets, P2P), we verify our theoretical
findings. Furthermore, we find that the search engine dramatically increases
the speed of rumor propagation.

Computation and Language

Construction of a Japanese Word Similarity Dataset

Yuya Sakaizawa, Mamoru Komachi
Comments: 5 pages
Subjects: Computation and Language (cs.CL)

An evaluation of distributed word representation is generally conducted using
a word similarity task and/or a word analogy task. There are many datasets
readily available for these tasks in English. However, evaluating distributed
representation in languages that do not have such resources (e.g., Japanese) is
difficult. Therefore, as a first step toward evaluating distributed
representations in Japanese, we constructed a Japanese word similarity dataset.
To the best of our knowledge, our dataset is the first resource that can be
used to evaluate distributed representations in Japanese. Moreover, our dataset
contains various parts of speech and includes rare words in addition to common
words.

Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling

Wenpeng Li, BinBin Zhang, Lei Xie, Dong Yu
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

Deep learning models (DLMs) are state-of-the-art techniques in speech
recognition. However, training good DLMs can be time consuming especially for
production-size models and corpora. Although several parallel training
algorithms have been proposed to improve training efficiency, there is no clear
guidance on which one to choose for the task in hand due to lack of systematic
and fair comparison among them. In this paper we aim at filling this gap by
comparing four popular parallel training algorithms in speech recognition,
namely asynchronous stochastic gradient descent (ASGD), blockwise model-update
filtering (BMUF), bulk synchronous parallel (BSP) and elastic averaging
stochastic gradient descent (EASGD), on 1000-hour LibriSpeech corpora using
feed-forward deep neural networks (DNNs) and convolutional, long short-term
memory, DNNs (CLDNNs). Based on our experiments, we recommend using BMUF as the
top choice to train acoustic models since it is most stable, scales well with
number of GPUs, can achieve reproducible results, and in many cases even
outperforms single-GPU SGD. ASGD can be used as a substitute in some cases.

Global Entity Ranking Across Multiple Languages

We present work on building a global long-tailed ranking of entities across
multiple languages using Wikipedia and Freebase knowledge bases. We identify
multiple features and build a model to rank entities using a ground-truth
dataset of more than 10 thousand labels. The final system ranks 27 million
entities with 75% precision and 48% F1 score. We provide performance evaluation
and empirical evidence of the quality of ranking across languages, and open the
final ranked lists for future research.

Learning Robust Visual-Semantic Embeddings

Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov
Comments: 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

Many of the existing methods for learning joint embedding of images and text
use only supervised information from paired images and its textual attributes.
Taking advantage of the recent success of unsupervised learning in deep neural
networks, we propose an end-to-end learning framework that is able to extract
more robust multi-modal representations across domains. The proposed method
combines representation learning models (i.e., auto-encoders) together with
cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
joint embeddings for semantic and visual features. A novel technique of
unsupervised-data adaptation inference is introduced to construct more
comprehensive embeddings for both labeled and unlabeled data. We evaluate our
method on Animals with Attributes and Caltech-UCSD Birds 200-2011 dataset with
a wide range of applications, including zero and few-shot image recognition and
retrieval, from inductive to transductive settings. Empirically, we show that
our framework improves over the current state of the art on many of the
considered tasks.

Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture

Yuanliang Meng, Anna Rumshisky, Alexey Romanov
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

In this paper, we propose to use a set of simple, uniform in architecture
LSTM-based models to recover different kinds of temporal relations from text.
Using the shortest dependency path between entities as input, the same
architecture is used to extract intra-sentence, cross-sentence, and document
creation time relations. A “double-checking” technique reverses entity pairs in
classification, boosting the recall of positive cases and reducing
misclassifications between opposite classes. An efficient pruning algorithm
resolves conflicts globally. Evaluated on QA-TempEval (SemEval2015 Task 5), our
proposed technique outperforms state-of-the-art methods by a large margin.

Distributed, Parallel, and Cluster Computing

Communication Primitives in Cognitive Radio Networks

Seth Gilbert, Fabian Kuhn, Chaodong Zheng
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Cognitive radio networks are a new type of multi-channel wireless network in
which different nodes can have access to different sets of channels. By
providing multiple channels, they improve the efficiency and reliability of
wireless communication. However, the heterogeneous nature of cognitive radio
networks also brings new challenges to the design and analysis of distributed
algorithms.

In this paper, we focus on two fundamental problems in cognitive radio
networks: neighbor discovery, and global broadcast. We consider a network
containing (n) nodes, each of which has access to (c) channels. We assume the
network has diameter (D), and each pair of neighbors have at least (kgeq 1),
and at most (k_{max}leq c), shared channels. We also assume each node has at
most (Delta) neighbors. For the neighbor discovery problem, we design a
randomized algorithm CSeek which has time complexity
( ilde{O}((c^2/k)+(k_{max}/k)cdotDelta)). CSeek is flexible and robust,
which allows us to use it as a generic “filter” to find “well-connected”
neighbors with an even shorter running time. We then move on to the global
broadcast problem, and propose CGCast, a randomized algorithm which takes
( ilde{O}((c^2/k)+(k_{max}/k)cdotDelta+DcdotDelta)) time. CGCast uses
CSeek to achieve communication among neighbors, and uses edge coloring to
establish an efficient schedule for fast message dissemination.

Towards the end of the paper, we give lower bounds for solving the two
problems. These lower bounds demonstrate that in many situations, CSeek and
CGCast are near optimal.

Block CUR : Decomposing Large Distributed Matrices

Urvashi Oswal, Swayambhoo Jain, Kevin S. Xu, Brian Eriksson
Subjects: Machine Learning (stat.ML); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Learning (cs.LG)

A common problem in large-scale data analysis is to approximate a matrix
using a combination of specifically sampled rows and columns, known as CUR
decomposition. Unfortunately, in many real-world environments, the ability to
sample specific individual rows or columns of the matrix is limited by either
system constraints or cost. In this paper, we consider matrix approximation by
sampling predefined blocks of columns (or rows) from the matrix. This regime is
commonly found when data is distributed across multiple nodes in a compute
cluster, where such blocks correspond to columns (or rows) of the matrix stored
on the same node, which can be retrieved with much less overhead than
retrieving individual columns stored across different nodes. We propose a novel
algorithm for sampling useful column blocks and provide guarantees for the
quality of the approximation. We demonstrate the practical utility of this
algorithm for computing the block CUR decomposition of large matrices in a
distributed setting using Apache Spark. Using our proposed block CUR
algorithms, we can achieve a significant speed-up compared to a regular CUR
decomposition with the same quality of approximation.

Computation Peer Offloading for Energy-Constrained Mobile Edge Computing in Small-Cell Networks

Lixing Chen, Sheng Zhou, Jie Xu
Subjects: Computer Science and Game Theory (cs.GT); Distributed, Parallel, and Cluster Computing (cs.DC)

The (ultra-)dense deployment of small-cell base stations (SBSs) endowed with
cloud-like computing functionalities paves the way for pervasive mobile edge
computing (MEC), enabling ultra-low latency and location-awareness for a
variety of emerging mobile applications and the Internet of Things. To handle
spatially uneven computation workloads in the network, cooperation among SBSs
via workload peer offloading is essential to avoid large computation latency at
overloaded SBSs and provide high quality of service to end users. However,
performing effective peer offloading faces many unique challenges in small cell
networks due to limited energy resources committed by self-interested SBS
owners, uncertainties in the system dynamics and co-provisioning of radio
access and computing services. This paper develops a novel online SBS peer
offloading framework, called OPEN, by leveraging the Lyapunov technique, in
order to maximize the long-term system performance while keeping the energy
consumption of SBSs below individual long-term constraints. OPEN works online
without requiring information about future system dynamics, yet provides
provably near-optimal performance compared to the oracle solution that has the
complete future information. In addition, this paper formulates a novel peer
offloading game among SBSs, analyzes its equilibrium and efficiency loss in
terms of the price of anarchy in order to thoroughly understand SBSs’ strategic
behaviors, thereby enabling decentralized and autonomous peer offloading
decision making. Extensive simulations are carried out and show that peer
offloading among SBSs dramatically improves the edge computing performance.

Learning

Deep Sets

Manzil Zaheer, Satwik Kottur, Siamak Ravanbhakhsh, Barnabas Poczos, Ruslan Ssalakhutdinov, Alexander Smola
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

In this paper, we study the problem of designing objective functions for
machine learning problems defined on finite emph{sets}. In contrast to
traditional objective functions defined for machine learning problems operating
on finite dimensional vectors, the new objective functions we propose are
operating on finite sets and are invariant to permutations. Such problems are
widespread, ranging from estimation of population statistics
citep{poczos13aistats}, via anomaly detection in piezometer data of embankment
dams citep{Jung15Exploration}, to cosmology
citep{Ntampaka16Dynamical,Ravanbakhsh16ICML1}. Our main theorem characterizes
the permutation invariant objective functions and provides a family of
functions to which any permutation invariant objective function must belong.
This family of functions has a special structure which enables us to design a
deep network architecture that can operate on sets and which can be deployed on
a variety of scenarios including both unsupervised and supervised learning
tasks. We demonstrate the applicability of our method on population statistic
estimation, point cloud classification, set expansion, and image tagging.

Online Learning for Offloading and Autoscaling in Energy Harvesting Mobile Edge Computing

Jie Xu, Lixing Chen, Shaolei Ren
Comments: arXiv admin note: text overlap with arXiv:1701.01090 by other authors
Subjects: Learning (cs.LG); Networking and Internet Architecture (cs.NI)

Mobile edge computing (a.k.a. fog computing) has recently emerged to enable
in-situ processing of delay-sensitive applications at the edge of mobile
networks. Providing grid power supply in support of mobile edge computing,
however, is costly and even infeasible (in certain rugged or under-developed
areas), thus mandating on-site renewable energy as a major or even sole power
supply in increasingly many scenarios. Nonetheless, the high intermittency and
unpredictability of renewable energy make it very challenging to deliver a high
quality of service to users in energy harvesting mobile edge computing systems.
In this paper, we address the challenge of incorporating renewables into mobile
edge computing and propose an efficient reinforcement learning-based resource
management algorithm, which learns on-the-fly the optimal policy of dynamic
workload offloading (to the centralized cloud) and edge server provisioning to
minimize the long-term system cost (including both service delay and
operational cost). Our online learning algorithm uses a decomposition of the
(offline) value iteration and (online) reinforcement learning, thus achieving a
significant improvement of learning rate and run-time performance when compared
to standard reinforcement learning algorithms such as Q-learning. We prove the
convergence of the proposed algorithm and analytically show that the learned
policy has a simple monotone structure amenable to practical implementation.
Our simulation results validate the efficacy of our algorithm, which
significantly improves the edge computing performance compared to fixed or
myopic optimization schemes and conventional reinforcement learning algorithms.

Conditional Accelerated Lazy Stochastic Gradient Descent

Guanghui Lan, Sebastian Pokutta, Yi Zhou, Daniel Zink
Comments: 33 pages, 9 figures
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

In this work we introduce a conditional accelerated lazy stochastic gradient
descent algorithm with optimal number of calls to a stochastic first-order
oracle and convergence rate (Oleft(frac{1}{varepsilon^2}
ight)) improving
over the projection-free, Online Frank-Wolfe based stochastic gradient descent
of Hazan and Kale [2012] with convergence rate
(Oleft(frac{1}{varepsilon^4}
ight)).

Particle Value Functions

Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

The policy gradients of the expected return objective can react slowly to
rare rewards. Yet, in some cases agents may wish to emphasize the low or high
returns regardless of their probability. Borrowing from the economics and
control literature, we review the risk-sensitive value function that arises
from an exponential utility and illustrate its effects on an example. This
risk-sensitive value function is not always applicable to reinforcement
learning problems, so we introduce the particle value function defined by a
particle filter over the distributions of an agent’s experience, which bounds
the risk-sensitive one. We illustrate the benefit of the policy gradients of
this objective in Cliffworld.

Nonconvex One-bit Single-label Multi-label Learning

Shuang Qiu, Tingjin Luo, Jieping Ye, Ming Lin
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

We study an extreme scenario in multi-label learning where each training
instance is endowed with a single one-bit label out of multiple labels. We
formulate this problem as a non-trivial special case of one-bit rank-one matrix
sensing and develop an efficient non-convex algorithm based on alternating
power iteration. The proposed algorithm is able to recover the underlying
low-rank matrix model with linear convergence. For a rank-(k) model with (d_1)
features and (d_2) classes, the proposed algorithm achieves (O(epsilon))
recovery error after retrieving (O(k^{1.5}d_1 d_2/epsilon)) one-bit labels
within (O(kd)) memory. Our bound is nearly optimal in the order of
(O(1/epsilon)). This significantly improves the state-of-the-art sampling
complexity of one-bit multi-label learning. We perform experiments to verify
our theory and evaluate the performance of the proposed algorithm.

Modeling Relational Data with Graph Convolutional Networks

Knowledge bases play a crucial role in many applications, for example
question answering and information retrieval. Despite the great effort invested
in creating and maintaining them, even the largest representatives (e.g., Yago,
DBPedia or Wikidata) are highly incomplete. We introduce relational graph
convolutional networks (R-GCNs) and apply them to two standard knowledge base
completion tasks: link prediction (recovery of missing facts, i.e.
subject-predicate-object triples) and entity classification (recovery of
missing attributes of entities). R-GCNs are a generalization of graph
convolutional networks, a recent class of neural networks operating on graphs,
and are developed specifically to deal with highly multi-relational data,
characteristic of realistic knowledge bases. Our methods achieve competitive
results on standard benchmarks for both tasks.

Machine learning approach for early detection of autism by combining questionnaire and home video screening

Halim Abbas, Ford Garberson, Eric Glover, Dennis P Wall
Subjects: Computers and Society (cs.CY); Learning (cs.LG)

Existing screening tools for early detection of autism are expensive,
cumbersome, time-intensive, and sometimes fall short in predictive value. In
this work, we apply Machine Learning (ML) to gold standard clinical data
obtained across thousands of children at risk for autism spectrum disorders to
create a low-cost, quick, and easy to apply autism screening tool that performs
as well or better than most widely used standardized instruments. This new tool
combines two screening methods into a single assessment, one based on short,
structured parent-report questionnaires and the other on tagging key behaviors
from short, semi-structured home videos of children. To overcome the scarcity,
sparsity, and imbalance of training data, we apply creative feature selection,
feature engineering, and novel feature encoding techniques. We allow for
inconclusive determination where appropriate in order to boost screening
accuracy when conclusive. We demonstrate a significant accuracy improvement
over standard screening tools in a clinical study sample of 162 children.

Block CUR : Decomposing Large Distributed Matrices

A common problem in large-scale data analysis is to approximate a matrix
using a combination of specifically sampled rows and columns, known as CUR
decomposition. Unfortunately, in many real-world environments, the ability to
sample specific individual rows or columns of the matrix is limited by either
system constraints or cost. In this paper, we consider matrix approximation by
sampling predefined blocks of columns (or rows) from the matrix. This regime is
commonly found when data is distributed across multiple nodes in a compute
cluster, where such blocks correspond to columns (or rows) of the matrix stored
on the same node, which can be retrieved with much less overhead than
retrieving individual columns stored across different nodes. We propose a novel
algorithm for sampling useful column blocks and provide guarantees for the
quality of the approximation. We demonstrate the practical utility of this
algorithm for computing the block CUR decomposition of large matrices in a
distributed setting using Apache Spark. Using our proposed block CUR
algorithms, we can achieve a significant speed-up compared to a regular CUR
decomposition with the same quality of approximation.

Comparison of Different Methods for Tissue Segmentation in Histopathological Whole-Slide Images

Tissue segmentation is an important pre-requisite for efficient and accurate
diagnostics in digital pathology. However, it is well known that whole-slide
scanners can fail in detecting all tissue regions, for example due to the
tissue type, or due to weak staining because their tissue detection algorithms
are not robust enough. In this paper, we introduce two different convolutional
neural network architectures for whole slide image segmentation to accurately
identify the tissue sections. We also compare the algorithms to a published
traditional method. We collected 54 whole slide images with differing stains
and tissue types from three laboratories to validate our algorithms. We show
that while the two methods do not differ significantly they outperform their
traditional counterpart (Jaccard index of 0.937 and 0.929 vs. 0.870, p < 0.01).

Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery

Obtaining models that capture imaging markers relevant for disease
progression and treatment monitoring is challenging. Models are typically based
on large amounts of data with annotated examples of known markers aiming at
automating detection. High annotation effort and the limitation to a vocabulary
of known markers limit the power of such approaches. Here, we perform
unsupervised learning to identify anomalies in imaging data as candidates for
markers. We propose AnoGAN, a deep convolutional generative adversarial network
to learn a manifold of normal anatomical variability, accompanying a novel
anomaly scoring scheme based on the mapping from image space to a latent space.
Applied to new data, the model labels anomalies, and scores image patches
indicating their fit into the learned distribution. Results on optical
coherence tomography images of the retina demonstrate that the approach
correctly identifies anomalous images, such as images containing retinal fluid
or hyperreflective foci.

Learning Robust Visual-Semantic Embeddings

Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov
Comments: 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

Many of the existing methods for learning joint embedding of images and text
use only supervised information from paired images and its textual attributes.
Taking advantage of the recent success of unsupervised learning in deep neural
networks, we propose an end-to-end learning framework that is able to extract
more robust multi-modal representations across domains. The proposed method
combines representation learning models (i.e., auto-encoders) together with
cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
joint embeddings for semantic and visual features. A novel technique of
unsupervised-data adaptation inference is introduced to construct more
comprehensive embeddings for both labeled and unlabeled data. We evaluate our
method on Animals with Attributes and Caltech-UCSD Birds 200-2011 dataset with
a wide range of applications, including zero and few-shot image recognition and
retrieval, from inductive to transductive settings. Empirically, we show that
our framework improves over the current state of the art on many of the
considered tasks.

Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling

Wenpeng Li, BinBin Zhang, Lei Xie, Dong Yu
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

Deep learning models (DLMs) are state-of-the-art techniques in speech
recognition. However, training good DLMs can be time consuming especially for
production-size models and corpora. Although several parallel training
algorithms have been proposed to improve training efficiency, there is no clear
guidance on which one to choose for the task in hand due to lack of systematic
and fair comparison among them. In this paper we aim at filling this gap by
comparing four popular parallel training algorithms in speech recognition,
namely asynchronous stochastic gradient descent (ASGD), blockwise model-update
filtering (BMUF), bulk synchronous parallel (BSP) and elastic averaging
stochastic gradient descent (EASGD), on 1000-hour LibriSpeech corpora using
feed-forward deep neural networks (DNNs) and convolutional, long short-term
memory, DNNs (CLDNNs). Based on our experiments, we recommend using BMUF as the
top choice to train acoustic models since it is most stable, scales well with
number of GPUs, can achieve reproducible results, and in many cases even
outperforms single-GPU SGD. ASGD can be used as a substitute in some cases.

Automatically identifying wild animals in camera trap images with deep learning

Mohammed Sadegh Norouzzadeh, Anh Nguyen, Margaret Kosmala, Ali Swanson, Craig Packer, Jeff Clune
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

Having accurate, detailed, and up-to-date information about wildlife location
and behavior across broad geographic areas would revolutionize our ability to
study, conserve, and manage species and ecosystems. Currently such data are
mostly gathered manually at great expense, and thus are sparsely and
infrequently collected. Here we investigate the ability to automatically,
accurately, and inexpensively collect such data from motion sensor cameras.
These camera traps enable pictures of wildlife to be collected inexpensively,
unobtrusively, and at high-volume. However, identifying the animals, animal
attributes, and behaviors in these pictures remains an expensive,
time-consuming, manual task often performed by researchers, hired technicians,
or crowdsourced teams of human volunteers. In this paper, we demonstrate that
such data can be automatically extracted by deep neural networks (aka deep
learning), which is a cutting-edge type of artificial intelligence. In
particular, we use the existing human-labeled images from the Snapshot
Serengeti dataset to train deep convolutional neural networks for identifying
48 species in 3.2 million images taken from Tanzania’s Serengeti National Park.
We train neural networks that automatically identify animals with over 92%
accuracy. More importantly, we can choose to have our system classify only the
images it is highly confident about, allowing valuable human time to be focused
only on challenging images. In this case, our automatic animal identification
system saves approximately ~8.3 years (at 40 hours per week) of human labeling
effort (i.e. over 17,000 hours) while operating on a 3.2-million-image dataset
at the same 96% accuracy level of crowdsourced teams of human volunteers. Those
efficiency gains immediately highlight the importance of using deep neural
networks to automate data extraction from camera trap images.

Information Theory

On the Minimization of Convex Functionals of Probability Distributions Under Band Constraints

Michael Fauss, Abdelhak M. Zoubir
Comments: 11 pages, 5 figures, 1 table, to be submitted to the IEEE Transactions on Signal Processing
Subjects: Information Theory (cs.IT)

The problem of minimizing convex functionals of probability distributions is
solved under the assumption that the density of every distribution is bounded
from above and below. First, a system of sufficient and necessary first order
optimality conditions, which characterize global minima as solutions of a
fixed-point equation, is derived. Based on these conditions, two algorithms are
proposed that iteratively solve the fixed-point equation via a block coordinate
descent strategy. While the first algorithm is conceptually simpler and more
efficient, it is not guaranteed to converge for objective functions that are
not strictly convex. This shortcoming is overcome in the second algorithm,
which uses an additional outer proximal iteration, and, which is proven to
converge under very mild assumptions. Two examples are given to demonstrate the
theoretical usefulness of the optimality conditions as well as the high
efficiency and accuracy of the proposed numerical algorithms.

Performance Analysis of Ultra-Dense Networks with Elevated Base Stations

Italo Atzeni, Jesús Arnau, Marios Kountouris
Comments: 6 pages, 4 figures. To be presented at SpaSWiN’17 (WiOpt workshops), May 2017
Subjects: Information Theory (cs.IT)

This paper analyzes the downlink performance of ultra-dense networks with
elevated base stations (BSs). We consider a general dual-slope pathloss model
with distance-dependent probability of line-of-sight (LOS) transmission between
BSs and receivers. Specifically, we consider the scenario where each link may
be obstructed by randomly placed buildings. Using tools from stochastic
geometry, we show that both coverage probability and area spectral efficiency
decay to zero as the BS density grows large. Interestingly, we show that the BS
height alone has a detrimental effect on the system performance even when the
standard single-slope pathloss model is adopted.

Globally Optimal Beamforming Design for Downlink CoMP transmission with Limited Backhaul Capacity

Kien-Giang Nguyen, Quang-Doanh Vu, Markku Juntti, Le-Nam Tran
Comments: 5 pages, 2 figures; Accepted for publication, ICASSP 2017
Subjects: Information Theory (cs.IT)

This paper considers a multicell downlink channel in which multiple base
stations (BSs) cooperatively serve users by jointly precoding shared data
transported from a central processor over limited-capacity backhaul links. We
jointly design the beamformers and BS-user link selection so as to maximize the
sum rate subject to user-specific signal-to-interference-noise (SINR)
requirements, per-BS backhaul capacity and per-BS power constraints. As
existing solutions for the considered problem are suboptimal and their
optimality remains unknown due to the lack of globally optimal solutions, we
characterized this gap by proposing a globally optimal algorithm for the
problem of interest. Specifically, the proposed method is customized from a
generic framework of a branch and bound algorithm applied to discrete monotonic
optimization. We show that the proposed algorithm converges after a finite
number of iterations, and can serve as a benchmark for existing suboptimal
solutions and those that will be developed for similar contexts in the future.
In this regard, we numerically compare the proposed optimal solution to a
current state-of-the-art, which show that this suboptimal method only attains
70% to 90% of the optimal performance.

Energy Efficient Precoding C-RAN Downlink with Compression at Fronthaul

Kien-Giang Nguyen, Quang-Doanh Vu, Markku Juntti, Le-Nam Tran
Comments: 6 pages, 3 figures, accepted to IEEE ICC 2017 – Signal Processing for Communications Symposium
Subjects: Information Theory (cs.IT)

This paper considers a downlink transmission of cloud radio access network
(C-RAN) in which precoded baseband signals at a common baseband unit are
compressed before being forwarded to radio units (RUs) through limited
fronthaul capacity links. We investigate the joint design of precoding,
multivariate compression and RU-user selection which maximizes the energy
efficiency of downlink C-RAN networks. The considered problem is inherently a
rank-constrained mixed Boolean nonconvex program for which a globally optimal
solution is difficult and computationally expensive to find. In order to derive
practically appealing solutions, we invoke some useful relaxation and
transformation techniques to arrive at a more tractable (but still nonconvex)
continuous program. To solve the relaxation problem, we propose an iterative
procedure based on DC algorithms which is provably convergent. Numerical
results demonstrate the superior of the proposed solution in terms of
achievable energy efficiency compared to existing schemes.

A Novel Robust Transceiver Design for MIMO Interference Channel

Ali Dalir, Hassan Aghaeinia
Subjects: Information Theory (cs.IT)

This paper focuses on robust transceiver design for throughput enhancement on
the interference channel (IC), under imperfect channel state information (CSI).
In this paper, algorithm is proposed to improve the throughput of the
multi-input multi-output (MIMO) IC. Each transmitter and receiver has
respectively M and N antennas and IC operates in a time division duplex mode.
In the proposed algorithm, each transceiver adjusts its filter to minimize the
estimated variance of signal-to-interference-plus-noise ratio (SINR) to hedge
against the variability due to CSI error. Taylor expansion is exploited to
approximate the effect of CSI imperfection on variance. Monte Carlo simulations
are employed to investigate improvement in sum rate performance of the proposed
algorithms and the advantage of incorporating variation minimization into the
transceiver design.

A Tight Upper Bound on the Second-Order Coding Rate of Parallel Gaussian Channels with Feedback

Silas L. Fong, Vincent Y. F. Tan
Comments: 18 pages. arXiv admin note: text overlap with arXiv:1410.2390
Subjects: Information Theory (cs.IT)

This paper investigates the asymptotic expansion for the maximum coding rate
of a parallel Gaussian channel with feedback under the following setting: A
peak power constraint is imposed on every transmitted codeword, and the average
error probability of decoding the transmitted message is non-vanishing as the
blocklength increases. It is well known that the presence of feedback does not
increase the first-order asymptotics of the channel, i.e., capacity, in the
asymptotic expansion, and the closed-form expression of the capacity can be
obtained by the well-known water-filling algorithm. The main contribution of
this paper is a self-contained proof of an upper bound on the second-order
asymptotics of the parallel Gaussian channel with feedback. The proof
techniques involve developing an information spectrum bound followed by using
Curtiss’ theorem to show that a sum of dependent random variables associated
with the information spectrum bound converges in distribution to a sum of
independent random variables, thus facilitating the use of the usual central
limit theorem. Combined with existing achievability results, our result implies
that the presence of feedback does not improve the second-order asymptotics.

A Contract-based Incentive Mechanism for Energy Harvesting-based Internet of Things

Zhanwei Hou, He Chen, Yonghui Li, Zhu Han, Branka Vucetic
Subjects: Computer Science and Game Theory (cs.GT); Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

By enabling wireless devices to be charged wirelessly and remotely, radio
frequency energy harvesting (RFEH) has become a promising technology to power
the unattended Internet of Things (IoT) low-power devices. To enable this, in
future IoT networks, besides the conventional data access points (DAPs)
responsible for collecting data from IoT devices, energy access points (EAPs)
should be deployed to transfer radio frequency (RF) energy to IoT devices to
maintain their sustainable operations. In practice, the DAPs and EAPs may be
operated by different operators and a DAP should provide certain incentives to
motivate the surrounding EAPs to charge its associated IoT device(s) to assist
its data collection. Motivated by this, in this paper we develop a contract
theory-based incentive mechanism for the energy trading in RFEH assisted IoT
systems. The necessary and sufficient condition for the feasibility of the
formulated contract is analyzed. The optimal contract is derived to maximize
the DAP’s expected utility as well as the social welfare. Simulation results
demonstrate the feasibility and effectiveness of the proposed incentive
mechanism.

Invertibility of graph translation and support of Laplacian Fiedler vectors

Matthew Begué, Kasso A. Okoudjou
Comments: 21 pages, 7 figures
Subjects: Functional Analysis (math.FA); Information Theory (cs.IT)

The graph Laplacian operator is widely studied in spectral graph theory
largely due to its importance in modern data analysis. Recently, the Fourier
transform and other time-frequency operators have been defined on graphs using
Laplacian eigenvalues and eigenvectors. We extend these results and prove that
the translation operator to the (i)’th node is invertible if and only if all
eigenvectors are nonzero on the (i)’th node. Because of this dependency on the
support of eigenvectors we study the characteristic set of Laplacian
eigenvectors. We prove that the Fiedler vector of a planar graph cannot vanish
on large neighborhoods and then explicitly construct a family of non-planar
graphs that do exhibit this property.

Network Constrained Distributed Dual Coordinate Ascent for Machine Learning

Myung Cho, Lifeng Lai, Weiyu Xu
Comments: 5 pages, 6 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Learning (cs.LG)

With explosion of data size and limited storage space at a single location,
data are often distributed at different locations. We thus face the challenge
of performing large-scale machine learning from these distributed data through
communication networks. In this paper, we study how the network communication
constraints will impact the convergence speed of distributed machine learning
optimization algorithms. In particular, we give the convergence rate analysis
of the distributed dual coordinate ascent in a general tree structured network.
Furthermore, by considering network communication delays, we optimize the
network-constrained dual coordinate ascent algorithms to maximize its
convergence speed. Our results show that under different network communication
delays, to achieve maximum convergence speed, one needs to adopt
delay-dependent numbers of local and global iterations for distributed dual
coordinate ascent.