Soheil Hashemi, Nicholas Anthony, Hokchhay Tann, R. Iris Bahar, Sherief Reda
Comments: Accepted for conference proceedings in DATE17
Subjects: Neural and Evolutionary Computing (cs.NE)
Deep neural networks are gaining in popularity as they are used to generate
state-of-the-art results for a variety of computer vision and machine learning
applications. At the same time, these networks have grown in depth and
complexity in order to solve harder problems. Given the limitations in power
budgets dedicated to these networks, the importance of low-power, low-memory
solutions has been stressed in recent years. While many dedicated hardware
designs using different precisions have recently been proposed, there exists no
comprehensive study of different bit precisions and arithmetic in both inputs
and network parameters. In this work, we address this issue and perform a study
of different bit-precisions in neural networks (from floating-point to
fixed-point, powers of two, and binary). In our evaluation, we consider and
analyze the effect of precision scaling on both network accuracy and hardware
metrics including memory footprint, power and energy consumption, and design
area. We also investigate training-time methodologies to compensate for the
reduction in accuracy due to limited bit precision and demonstrate that in most
cases, precision scaling can deliver significant benefits in design metrics at
the cost of very modest decreases in network accuracy. In addition, we propose
that a small portion of the benefits achieved when using lower precisions can
be forfeited to increase the network size and therefore the accuracy. We run
our experiments on three well-recognized networks and datasets to show the
generality of our findings. We investigate the trade-offs and highlight the benefits
of using lower precisions in terms of energy and memory footprint.
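
The three reduced-precision formats named in the abstract above (fixed-point, powers of two, binary) can be illustrated with a minimal sketch. The function names, bit widths, and exponent range below are assumptions chosen for illustration; they are not the authors' configuration, and no training-time compensation is shown.

```python
import math

def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Round x to a signed fixed-point grid and saturate at the range limits.

    total_bits and frac_bits are illustrative defaults, not values from the paper.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))
    return q / scale

def quantize_power_of_two(x, min_exp=-7, max_exp=0):
    """Map x to sign(x) * 2**e, with e the nearest integer exponent in range."""
    if x == 0:
        return 0.0
    e = round(math.log2(abs(x)))
    e = max(min_exp, min(max_exp, e))
    return math.copysign(2.0 ** e, x)

def binarize(x):
    """Keep only the sign of x (zero is mapped to +1)."""
    return 1.0 if x >= 0 else -1.0

weights = [0.30, -0.72, 0.05, 1.60]  # toy values, not data from the paper
for w in weights:
    print(w, quantize_fixed_point(w), quantize_power_of_two(w), binarize(w))
```

With a power-of-two weight, a multiplication reduces to a bit shift, and with a binary weight to a sign flip, which is the usual motivation for these formats in low-power hardware.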
Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, Serge Belongie
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
In this paper we aim to leverage the powerful bottom-up discriminative
representations to guide a top-down generative model. We propose a novel
generative model named Stacked Generative Adversarial Networks (SGAN), which is
trained to invert the hierarchical representations of a discriminative
bottom-up deep network. Our model consists of a top-down stack of GANs, each
trained to generate “plausible” lower-level representations, conditioned on
higher-level representations. A representation discriminator is introduced at
each feature hierarchy to encourage the representation manifold of the
generator to align with that of the bottom-up discriminative network, providing
intermediate supervision. In addition, we introduce a conditional loss that
encourages the use of conditional information from the layer above, and a novel
entropy loss that maximizes a variational lower bound on the conditional
entropy of generator outputs. To the best of our knowledge, the entropy loss is
the first attempt to tackle the conditional model collapse problem that is
common in conditional GANs. We first train each GAN of the stack independently,
and then we train the stack end-to-end. Unlike the original GAN that uses a
single noise vector to represent all the variations, our SGAN decomposes
variations into multiple levels and gradually resolves uncertainties in the
top-down generative process. Experiments demonstrate that SGAN is able to
generate diverse and high-quality images, as well as being more interpretable
than a vanilla GAN.
Haik Manukian, Fabio L. Traversa, Massimiliano Di Ventra
Subjects: Emerging Technologies (cs.ET); Neural and Evolutionary Computing (cs.NE)
We propose to use Digital Memcomputing Machines (DMMs), implemented with
self-organizing logic gates (SOLGs), to solve the problem of numerical
inversion. Starting from fixed-point scalar inversion we describe the
generalization to solving linear systems and matrix inversion. This method,
when realized in hardware, will output the result in only one computational
step. As an example, we perform simulations of the scalar case using a 5-bit
logic circuit made of SOLGs, and show that the circuit successfully performs
the inversion. Since this type of numerical inversion can be implemented by DMM
units in hardware, it is scalable, and thus of great benefit to any real-time
computing application.
Sandeep Aswath Narayana
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE)
Continuous improvement in silicon process technologies has made possible the
integration of hundreds of cores on a single chip. However, power and heat have
become dominant constraints in designing these massive multicore chips, causing
issues with reliability, timing variations and reduced lifetime of the chips.
Dynamic Thermal Management (DTM) is a solution to avoid high temperatures on
the die. Typical DTM schemes only address core level thermal issues. However,
the Network-on-chip (NoC) paradigm, which has emerged as an enabling
methodology for integrating hundreds to thousands of cores on the same die can
contribute significantly to the thermal issues. Moreover, typical DTM is
triggered reactively, based on temperature measurements from on-chip thermal
sensors, and so requires long reaction times, whereas a predictive DTM method
estimates future temperature in advance, eliminating the chance of temperature
overshoot. Artificial Neural Networks (ANNs) have been used in various domains
for modeling and prediction with high accuracy due to their ability to learn and
adapt. This thesis concentrates on designing an ANN prediction engine to
predict the thermal profile of the cores and Network-on-Chip elements of the
chip. This thermal profile of the chip is then used by the predictive DTM that
combines both core level and network level DTM techniques. On-chip wireless
interconnect, which was recently envisioned to enable energy-efficient data
exchange between cores in a multicore environment, will be used to provide a
broadcast-capable medium to efficiently distribute thermal control messages to
trigger and manage the DTM schemes.
Bodo Rueckauer, Iulia-Alexandra Lungu, Yuhuang Hu, Michael Pfeiffer
Comments: 9 pages, 2 figures, presented at the workshop “Computing with Spikes” at NIPS 2016, Barcelona, Spain
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Deep convolutional neural networks (CNNs) have shown great potential for
numerous real-world machine learning applications, but performing inference in
large CNNs in real-time remains a challenge. We have previously demonstrated
that traditional CNNs can be converted into deep spiking neural networks
(SNNs), which exhibit similar accuracy while reducing both latency and
computational load as a consequence of their data-driven, event-based style of
computing. Here we provide a novel theory that explains why this conversion is
successful, and derive from it several new tools to convert a larger and more
powerful class of deep networks into SNNs. We identify the main sources of
approximation errors in previous conversion methods, and propose simple
mechanisms to fix these issues. Furthermore, we develop spiking implementations
of common CNN operations such as max-pooling, softmax, and batch-normalization,
which allow almost loss-less conversion of arbitrary CNN architectures into the
spiking domain. Empirical evaluation of different network architectures on the
MNIST and CIFAR10 benchmarks leads to the best SNN results reported to date.
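
The rate-coding idea behind CNN-to-SNN conversion can be sketched with a single integrate-and-fire neuron. This is a generic textbook illustration under stated assumptions (constant input, unit threshold, reset by subtraction, at most one spike per time step); it is not the conversion procedure proposed in the paper, and the function name is hypothetical.

```python
def if_neuron_rate(input_current, steps=1000, threshold=1.0):
    """Simulate an integrate-and-fire neuron and return its spikes per time step.

    The membrane potential integrates a constant input; on crossing the
    threshold the neuron emits one spike and the threshold is subtracted.
    """
    membrane = 0.0
    spikes = 0
    for _ in range(steps):
        membrane += input_current
        if membrane >= threshold:
            membrane -= threshold
            spikes += 1
    return spikes / steps

# The firing rate tracks a ReLU clipped at 1: negative inputs never fire,
# inputs in [0, 1] fire at roughly their own value, larger inputs saturate.
for a in (-0.5, 0.25, 0.3, 1.5):
    print(a, if_neuron_rate(a))
```

The saturation at one spike per step is one example of the kind of approximation error a conversion method has to manage.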
Nabiha Asghar, Pascal Poupart, Jiang Xin, Hang Li
Comments: 8 pages
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We propose an online, end-to-end, deep reinforcement learning technique to
develop generative conversational agents for open-domain dialogue. We use a
unique combination of offline two-phase supervised learning and online
reinforcement learning with human users to train our agent. While most existing
research proposes hand-crafted and developer-defined reward functions for
reinforcement, we devise a novel reward mechanism based on a variant of Beam
Search and one-character user-feedback at each step. Experiments show that our
model, when trained on a small and shallow Seq2Seq network, successfully
promotes the generation of meaningful, diverse and interesting responses, and
can be used to train agents with customized personas and conversational styles.
Tian Qi Chen, Mark Schmidt
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Learning (cs.LG)
Artistic style transfer is an image synthesis problem where the content of an
image is reproduced with the style of another. Recent works show that a
visually appealing style transfer can be achieved by using the hidden
activations of a pretrained convolutional neural network. However, existing
methods either apply (i) an optimization procedure that works for any style
image but is very expensive, or (ii) an efficient feedforward network that only
allows a limited number of trained styles. In this work we propose a simpler
optimization objective based on local matching that combines the content
structure and style textures in a single layer of the pretrained network. We
show that our objective has desirable properties such as a simpler optimization
landscape, intuitive parameter tuning, and consistent frame-by-frame
performance on video. Furthermore, we use 80,000 natural images and 80,000
paintings to train an inverse network that approximates the result of the
optimization. This results in a procedure for artistic style transfer that is
efficient but also allows arbitrary content and style images.
Vincent Sitzmann, Ana Serrano, Amy Pavel, Maneesh Agrawala, Diego Gutierrez, Gordon Wetzstein
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Understanding how humans explore virtual environments is crucial for many
applications, such as developing compression algorithms or designing effective
cinematic virtual reality (VR) content, as well as for developing predictive
computational models. We have recorded 780 head and gaze trajectories from 86
users exploring omni-directional stereo panoramas using VR head-mounted
displays. By analyzing the interplay between visual stimuli, head orientation,
and gaze direction, we demonstrate patterns and biases of how people explore
these panoramas and we present first steps toward predicting time-dependent
saliency. To compare how visual attention and saliency in VR are different from
conventional viewing conditions, we have also recorded users observing the same
scenes in a desktop setup. Based on this data, we show how to adapt existing
saliency predictors to VR, so that insights and tools developed for predicting
saliency in desktop scenarios may directly transfer to these immersive
applications.
Akshat Dave, Anil Kumar Vadathya, Kaushik Mitra
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Generative models are considered the Swiss Army knives of data modelling. In
this paper we leverage the recently proposed recurrent generative model, RIDE,
for applications like image inpainting and compressive image reconstruction.
Recurrent networks can model long range dependencies in images and hence are
suitable to handle global multiplexing in reconstruction from compressive
imaging. We perform MAP inference with RIDE as the prior, using backpropagation
to the inputs and the projected gradient method. We propose an approach based on
entropy thresholding to better preserve texture. Our approach shows comparable
results on the image inpainting task. It shows superior results in compressive
image reconstruction compared to the traditional methods D-AMP and TVAL3, which
uses a global prior of minimizing the TV norm.
Xiaolin Wu, Xi Zhang, Chang Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This article is a sequel to our earlier paper [24]. Our main objective is to
explore the potential of supervised machine learning in face-induced social
computing and cognition, riding on the momentum of much heralded successes of
face processing, analysis and recognition on the tasks of biometric-based
identification. We present a case study of automated statistical inference on
sociopsychological perceptions of female faces controlled for race,
attractiveness, age and nationality. As in [24], our empirical evidence
points to the possibility of teaching computer vision and machine learning
algorithms, using example face images, to predict personality traits and
behavioral predisposition.
Reza Fuad Rachmadi, Keiichi Uchimura, Gou Koutaki
Comments: in Proceeding of 11th International Student Conference on Advanced Science and Technology (ICAST) 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Social event detection in a static image is a very challenging problem, and
it is very useful for Internet of Things applications including automatic photo
organization, ad recommender systems, and image captioning. Several publications
show that the variety of objects, scenes, and people can make it very ambiguous
for the system to decide which event occurs in the image. We propose a spatial
pyramid configuration of a convolutional neural network (CNN) classifier for
social event detection in a static image. By applying the spatial pyramid
configuration to the CNN classifier, the details in the image can be observed
more accurately by the classifier. The USED dataset provided by Ahmad et al.,
which consists of two different image sets, EiMM and SED, is used to evaluate
our proposed method. As a result, the average accuracy of our system
outperforms the baseline method by 15% and 2%, respectively.
Aditya Singh, Saurabh Saini, Rajvi Shah, PJ Narayanan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
User-given tags or labels are valuable resources for semantic understanding
of visual media such as images and videos. Recently, a new type of labeling
mechanism known as hash-tags has become increasingly popular on social media
sites. In this paper, we study the problem of generating relevant and useful
hash-tags for short video clips. Traditional data-driven approaches for tag
enrichment and recommendation use direct visual similarity for label transfer
and propagation. We attempt to learn a direct low-cost mapping from video to
hash-tags using a two-step training process. We first employ a natural language
processing (NLP) technique, skip-gram models with neural network training to
learn a low-dimensional vector representation of hash-tags (Tag2Vec) using a
corpus of 10 million hash-tags. We then train an embedding function to map
video features to the low-dimensional Tag2Vec space. We learn this embedding
for 29 categories of short video clips with hash-tags. A query video without
any tag-information can then be directly mapped to the vector space of tags
using the learned embedding and relevant tags can be found by performing a
simple nearest-neighbor retrieval in the Tag2Vec space. We validate the
relevance of the tags suggested by our system qualitatively and quantitatively
with a user study.
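
The final retrieval step described above, mapping a query video into the tag space and returning its nearest tags, can be sketched as follows. The tag names, the 3-dimensional vectors, and the use of cosine similarity are all assumptions for illustration; the abstract only states that a simple nearest-neighbor retrieval is performed in the Tag2Vec space.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_tags(query_vec, tag_vectors, k=2):
    """Return the k tags whose vectors are most similar to query_vec."""
    ranked = sorted(
        tag_vectors,
        key=lambda tag: cosine(query_vec, tag_vectors[tag]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical tag embeddings; a real Tag2Vec space would be learned from a corpus.
tag_vectors = {
    "#surf": [0.9, 0.1, 0.0],
    "#beach": [0.8, 0.3, 0.1],
    "#cooking": [0.0, 0.2, 0.9],
}
# Hypothetical output of the learned video-to-tag-space embedding function.
video_embedding = [0.9, 0.15, 0.0]
print(nearest_tags(video_embedding, tag_vectors))
```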
Ronnachai Jaroensri, Amy Zhao, Guha Balakrishnan, Derek Lo, Jeremy Schmahmann, John Guttag, Fredo Durand
Comments: 8 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
For many movement disorders, such as Parkinson’s and ataxia, disease
progression is usually assessed visually by a clinician according to a
numerical rating scale, or using questionnaires. These tests are subjective,
time-consuming, and must be administered by a professional. We present an
automated method for quantifying the severity of motion impairment in patients
with ataxia, using only video recordings. We focus on videos of the
finger-to-nose test, a common movement task used to assess ataxia progression
during the course of routine clinical checkups.
Our method uses pose estimation and optical flow techniques to track the
motion of the patient’s hand in a video recording. We extract features that
describe qualities of the motion such as speed and variation in performance.
Using labels provided by an expert clinician, we build a supervised learning
model that predicts severity according to the Brief Ataxia Rating Scale (BARS).
Our model achieves a mean absolute error of 0.363 on a 0-4 scale and a
prediction-label correlation of 0.835 in a leave-one-patient-out experiment.
The accuracy of our system is comparable to the reported inter-rater
correlation among clinicians assessing the finger-to-nose exam using a similar
ataxia rating scale. This work demonstrates the feasibility of using videos to
produce more objective and clinically useful measures of motor impairment.
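
The leave-one-patient-out protocol and the mean absolute error mentioned above can be sketched as below. The helper names and the toy inputs are assumptions for illustration only; they are not the authors' code or data, and the learning model itself is omitted.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) for each patient.

    All recordings of one patient are held out together, so the model is
    never tested on a patient it has seen during training.
    """
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and labels."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Toy example: four videos from three hypothetical patients.
patient_ids = ["a", "a", "b", "c"]
for held_out, train, test in leave_one_patient_out(patient_ids):
    print(held_out, train, test)
```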
Sunil Kumar, J. V. Desa, Shaktidev Mukherjee
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Copy move forgery detection in digital images has become a very popular
research topic in the area of image forensics. Due to the availability of
sophisticated image editing tools and ever increasing hardware capabilities, it
has become an easy task to manipulate digital images. Passive forgery
detection techniques are more relevant as they can be applied without the prior
information about the image in question. Block based techniques are used to
detect copy move forgery, but have limitations of large time complexity and
sensitivity against affine operations like rotation and scaling. Keypoint based
approaches are used to detect forgery in large images where the possibility of
significant post-processing operations like rotation and scaling is higher. A
hybrid approach is proposed using different methods for keypoint detection and
description. Speeded Up Robust Features (SURF) are used to detect the keypoints
in the image and Binary Robust Invariant Scalable Keypoints (BRISK) features
are used to describe features at these keypoints. The proposed method
significantly outperforms the existing SURF-based forgery detection method in
terms of detection speed and is invariant to post-processing operations like
rotation and scaling. The proposed method is also invariant to other commonly
applied post-processing operations like adding Gaussian noise and JPEG
compression.
Marcel Sheeny de Moraes, Sankha Mukherjee, Neil M Robertson
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Human interaction recognition is a challenging problem in computer vision and
has been researched over the years due to its important applications. With the
development of deep models for the human pose estimation problem, this work
aims to verify the effectiveness of using the human pose in order to recognize
the human interaction in monocular videos. This paper developed a method based
on 5 steps: detect each person in the scene, track them, retrieve the human
pose, extract features based on the pose and finally recognize the interaction
using a classifier. The Two-Person interaction dataset was used for the
development of this methodology. Using a whole-sequence evaluation approach, it
achieved an average accuracy of 87.56% over all interactions. Yun et al. achieved
91.10% using the same dataset; however, their methodology used a depth sensor
to recognize the interaction. The methodology developed in this paper shows
that an RGB camera can be as effective as depth cameras to recognize the
interaction between two persons using the recent development of deep models to
estimate the human pose.
Arjun Raj Rajanna, Kamelia Aryafar, Rajeev Ramchandran, Christye Sisson, Ali Shokoufandeh, Raymond Ptucha
Comments: Published in Proceedings of “IEEE Western NY Image & Signal Processing Workshop”
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Widespread outreach programs using remote retinal imaging have proven to
decrease the risk from diabetic retinopathy, the leading cause of blindness in
the US. However, this process still requires manual verification of image
quality and grading of images for level of disease by a trained human grader
and will continue to be limited by the lack of such scarce resources.
Computer-aided diagnosis of retinal images has recently gained increasing
attention in the machine learning community. In this paper, we introduce a set
of neural networks for diabetic retinopathy classification of fundus retinal
images. We evaluate the efficiency of the proposed classifiers in combination
with preprocessing and augmentation steps on a sample dataset. Our experimental
results show that neural networks in combination with preprocessing on the
images can boost the classification accuracy on this dataset. Moreover, the
proposed models are scalable and can be used in large scale datasets for
diabetic retinopathy detection. The models introduced in this paper can be used
to facilitate the diagnosis and speed up the detection process.
Tomoyoshi Shimobaba, Yutaka Endo, Ryuji Hirayama, Yuki Nagahama, Takayuki Takahashi, Takashi Nishitsuji, Takashi Kakue, Atsushi Shiraki, Naoki Takada, Nobuyuki Masuda, Tomoyoshi Ito
Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
We propose a holographic image restoration method using an autoencoder, which
is an artificial neural network. Because holographic reconstructed images are
often contaminated by direct light, conjugate light, and speckle noise, the
discrimination of reconstructed images may be difficult. In this paper, we
demonstrate the restoration of reconstructed images from holograms that record
page data in holographic memory and QR codes by using the proposed method.
Sergey Zagoruyko, Nikos Komodakis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Attention plays a critical role in human visual experience. Furthermore, it
has recently been demonstrated that attention can also play an important role
in the context of applying artificial neural networks to a variety of tasks
from fields such as computer vision and NLP. In this work we show that, by
properly defining attention for convolutional neural networks, we can actually
use this type of information in order to significantly improve the performance
of a student CNN network by forcing it to mimic the attention maps of a
powerful teacher network. To that end, we propose several novel methods of
transferring attention, showing consistent improvement across a variety of
datasets and convolutional neural network architectures.
J. Dolz, C. Desrosiers, I. Ben Ayed
Comments: Submitted to the special issue of Neuroimage: “Brain Segmentation and Parcellation”
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This study investigates a 3D and fully convolutional neural network (CNN) for
subcortical brain structure segmentation in MRI. 3D CNN architectures have been
generally avoided due to their computational and memory requirements during
inference. We address the problem via small kernels, allowing deeper
architectures. We further model both local and global context by embedding
intermediate-layer outputs in the final prediction, which encourages
consistency between features extracted at different scales and embeds
fine-grained information directly in the segmentation process. Our model is
efficiently trained end-to-end on a graphics processing unit (GPU), in a single
stage, exploiting the dense inference capabilities of fully CNNs.
We performed comprehensive experiments over two publicly available data sets.
First, we demonstrate a state-of-the-art performance on the IBSR dataset. Then,
we report a large-scale multi-site evaluation over 1112 unregistered
subject data sets acquired from 17 different sites (ABIDE data set), with ages
ranging from 7 to 64 years, showing that our method is robust to various
acquisition protocols, demographics and clinical factors. Our method yielded
segmentations that are highly consistent with a standard atlas-based approach,
while running in a fraction of the time needed by atlas-based methods and
avoiding registration/normalization steps. This makes it convenient for massive
multi-site neuroanatomical imaging studies. To the best of our knowledge, our
work is the first to study subcortical structure segmentation on such
large-scale and heterogeneous data.
Renata Rychtarikova, Dalibor Stys
Comments: 12 pages, 5 figures, supplementary data
Subjects: Quantitative Methods (q-bio.QM); Computer Vision and Pattern Recognition (cs.CV); Cell Behavior (q-bio.CB); Subcellular Processes (q-bio.SC)
This article presents an algorithm for the evaluation of organelles’
movements inside of an unmodified live cell. We used a time-lapse image series
obtained using wide-field bright-field photon transmission microscopy as an
algorithm input. The benefit of the algorithm is the application of the Rényi
information entropy, namely a variable called a point information gain, which
makes it possible to highlight the borders of the intracellular organelles and
to localize the organelles’ centers of mass with the precision of one pixel.
Brett W. Israelsen, Nisar Ahmed, Kenneth Center, Roderick Green, Winston Bennett Jr
Comments: submitted copy
Journal-ref: SciTech 2017, paper 2545524
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)
A key requirement for the current generation of artificial decision-makers is
that they should adapt well to changes in unexpected situations. This paper
addresses the situation in which an AI for aerial dog fighting, with tunable
parameters that govern its behavior, must optimize behavior with respect to an
objective function that is evaluated and learned through simulations. Bayesian
optimization with a Gaussian Process surrogate is used as the method for
investigating the objective function. One key benefit is that during
optimization, the Gaussian Process learns a global estimate of the true
objective function, with predicted outcomes and a statistical measure of
confidence in areas that haven’t been investigated yet. Having a model of the
objective function is important for being able to understand possible outcomes
in the decision space; for example this is crucial for training and providing
feedback to human pilots. However, standard Bayesian optimization does not
perform consistently or provide an accurate Gaussian Process surrogate function
for highly volatile objective functions. We treat these problems by introducing
a novel sampling technique called Hybrid Repeat/Multi-point Sampling. This
technique gives the AI the ability to learn optimum behaviors in a highly uncertain
environment. More importantly, it not only improves the reliability of the
optimization, but also creates a better model of the entire objective surface.
With this improved model the agent is equipped to more accurately/efficiently
predict performance in unexplored scenarios.
Filippo Bistaffa, Alessandro Farinelli, Jesús Cerquides, Juan A. Rodríguez-Aguilar, Sarvapali D. Ramchurn
Comments: Accepted for publication, cite as “in press”
Journal-ref: ACM Transactions on Intelligent Systems and Technology, 2017, Volume 8, Issue 4
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)
Coalition formation typically involves the coming together of multiple,
heterogeneous, agents to achieve both their individual and collective goals. In
this paper, we focus on a special case of coalition formation known as
Graph-Constrained Coalition Formation (GCCF) whereby a network connecting the
agents constrains the formation of coalitions. We focus on this type of problem
given that in many real-world applications, agents may be connected by a
communication network or only trust certain peers in their social network. We
propose a novel representation of this problem based on the concept of edge
contraction, which allows us to model the search space induced by the GCCF
problem as a rooted tree. Then, we propose an anytime solution algorithm
(CFSS), which is particularly efficient when applied to a general class of
characteristic functions called (m+a) functions. Moreover, we show how CFSS can
be efficiently parallelised to solve GCCF using a non-redundant partition of
the search space. We benchmark CFSS on both synthetic and realistic scenarios,
using a real-world dataset consisting of the energy consumption of a large
number of households in the UK. Our results show that, in the best case, the
serial version of CFSS is 4 orders of magnitude faster than the state of the
art, while the parallel version is 9.44 times faster than the serial version on
a 12-core machine. Moreover, CFSS is the first approach to provide anytime
approximate solutions with quality guarantees for very large systems of agents
(i.e., with more than 2700 agents).
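The idea of turning edge contraction into a rooted search tree can be sketched as follows. This is only an illustrative toy, not the authors' CFSS implementation: the function names and the red/green edge marking used here to avoid visiting a coalition structure twice are assumptions of this sketch, and no characteristic function or bounding is included.

```python
def contract(edges, a, b):
    """Merge coalitions a and b; a merged edge is red if any parallel edge was red."""
    merged = a | b
    new_edges = {}
    for pair, red in edges.items():
        if pair == frozenset([a, b]):
            continue  # the contracted edge itself disappears
        # Re-attach edges that touched a or b to the merged coalition.
        new_pair = frozenset(merged if c in (a, b) else c for c in pair)
        new_edges[new_pair] = new_edges.get(new_pair, False) or red
    return new_edges


def _visit(coalitions, edges):
    """Depth-first walk of the search tree; each node is one coalition structure."""
    yield coalitions
    edges = dict(edges)  # local copy so siblings do not see our red marks
    for pair in list(edges):
        if edges[pair]:
            continue  # red edges may not be contracted
        a, b = pair
        child = (coalitions - {a, b}) | {a | b}
        yield from _visit(child, contract(edges, a, b))
        edges[pair] = True  # mark red: later siblings must keep a and b apart


def coalition_structures(nodes, links):
    """Enumerate partitions of nodes whose coalitions are connected in the graph."""
    coalitions = frozenset(frozenset([n]) for n in nodes)
    edges = {frozenset([frozenset([u]), frozenset([v])]): False for u, v in links}
    return _visit(coalitions, edges)


# Hypothetical toy graphs: a triangle and a three-node path.
triangle = list(coalition_structures([1, 2, 3], [(1, 2), (1, 3), (2, 3)]))
path = list(coalition_structures([1, 2, 3], [(1, 2), (2, 3)]))
print(len(triangle), len(path))
```

On the path graph the structure that groups agents 1 and 3 without agent 2 is never generated, because no edge connects them directly.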
Peter Christen
Comments: 12 pages
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
Record linkage is the process of identifying records that refer to the same
entities from several databases. This process is challenging because commonly
no unique entity identifiers are available. Linkage therefore has to rely on
partially identifying attributes, such as names and addresses of people. Recent
years have seen the development of novel techniques for linking data from
diverse application areas, where a major focus has been on linking complex data
that contain records about different types of entities. Advanced approaches
that exploit both the similarities between record attributes as well as the
relationships between entities to identify clusters of matching records have
been developed.
In this application paper we study the novel problem where rather than
different types of entities we have databases where the same entity can have
different roles, and where these roles change over time. We specifically
develop novel techniques for linking historical birth, death, marriage and
census records with the aim to reconstruct the population covered by these
records over a period of several decades. Our experimental evaluation on real
Scottish data shows that even with advanced linkage techniques that consider
group, relationship, and temporal aspects it is challenging to achieve high
quality linkage from such complex data.
Mehdi Kargahi (University of Tehran), Ashutosh Trivedi (University of Colorado Boulder)
Journal-ref: EPTCS 232, 2016
Subjects: Systems and Control (cs.SY); Artificial Intelligence (cs.AI); Robotics (cs.RO)
The first International Workshop on Verification and Validation of
Cyber-Physical Systems (V2CPS-16) was held in conjunction with the 12th
International Conference on integrated Formal Methods (iFM 2016) in
Reykjavik, Iceland. The purpose of V2CPS-16 was to bring together researchers
and experts in the fields of formal verification and cyber-physical systems
(CPS) to cover the theme of this workshop, namely a wide spectrum of
verification and validation methods including (but not limited to) control,
simulation, formal methods, etc.
A CPS is an integration of networked computational and physical processes
with meaningful inter-effects; the former monitors, controls, and affects the
latter, while the latter also impacts the former. CPSs have applications in a
wide range of systems spanning robotics, transportation, communication,
infrastructure, energy, and manufacturing. Many safety-critical systems such as
chemical processes, medical devices, aircraft flight control, and automotive
systems, are indeed CPSs. The advanced capabilities of CPSs require complex
software and synthesis algorithms, which are hard to verify. In fact, many
problems in this area are undecidable. Thus, a major step is to find particular
abstractions of such systems which might be algorithmically verifiable
regarding specific properties of such systems, describing the partial/overall
behaviors of CPSs.
Brett Israelsen, Nisar Ahmed
Journal-ref: BayesOpt Workshop, NIPS 2016
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)
A key drawback of the current generation of artificial decision-makers is
that they do not adapt well to changes in unexpected situations. This paper
addresses the situation in which an AI for aerial dog fighting, with tunable
parameters that govern its behavior, will optimize behavior with respect to an
objective function that must be evaluated and learned through simulations. Once
this objective function has been modeled, the agent can then choose its desired
behavior in different situations. Bayesian optimization with a Gaussian Process
surrogate is used as the method for investigating the objective function. One
key benefit is that during optimization the Gaussian Process learns a global
estimate of the true objective function, with predicted outcomes and a
statistical measure of confidence in areas that haven’t been investigated yet.
However, standard Bayesian optimization does not perform consistently or
provide an accurate Gaussian Process surrogate function for highly volatile
objective functions. We treat these problems by introducing a novel sampling
technique called Hybrid Repeat/Multi-point Sampling. This technique gives the
AI the ability to learn optimum behaviors in a highly uncertain environment. More
importantly, it not only improves the reliability of the optimization, but also
creates a better model of the entire objective surface. With this improved
model the agent is equipped to better adapt behaviors.
Nabiha Asghar, Pascal Poupart, Jiang Xin, Hang Li
Comments: 8 pages
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
We propose an online, end-to-end, deep reinforcement learning technique to
develop generative conversational agents for open-domain dialogue. We use a
unique combination of offline two-phase supervised learning and online
reinforcement learning with human users to train our agent. While most existing
research proposes hand-crafted and developer-defined reward functions for
reinforcement, we devise a novel reward mechanism based on a variant of Beam
Search and one-character user-feedback at each step. Experiments show that our
model, when trained on a small and shallow Seq2Seq network, successfully
promotes the generation of meaningful, diverse and interesting responses, and
can be used to train agents with customized personas and conversational styles.
Yushi Yao, Guangjian Li
Comments: 15 pages
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Traditional sentiment analysis often uses a sentiment dictionary to extract
sentiment information from text and classify documents. However, emerging
informal words and phrases in user-generated content call for context-aware
analysis, since they usually carry special meanings in a particular context.
Because word vectors perform well at representing inter-word relations, we use
sentiment word vectors to identify these special words. Based on the
distributed language model word2vec, we present a novel method for representing
the sentiment of a word in a particular context, specifically to identify words
with abnormal sentiment polarity in long answers. Results show that the
improved model represents words with special meanings better, while still
performing well on special idiomatic patterns. Finally, we discuss the meaning
of vector representations in the field of sentiment, which may differ from
general object-based conditions.
Philipp Meerkamp (Bloomberg LP), Zhengyi Zhou (AT&T Labs Research)
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)
We present an architecture for information extraction from text that augments
an existing parser with a character-level neural network. To train the neural
network, we compute a measure of consistency of extracted data with existing
databases, and use it as a form of noisy supervision. Our architecture combines
the ability of constraint-based information extraction systems to easily
incorporate domain knowledge and constraints with the ability of deep neural
networks to leverage large amounts of data to learn complex features. The
system led to large improvements over a mature and highly tuned
constraint-based information extraction system used at Bloomberg for financial
language text. At the same time, the new system massively reduces the
development effort, allowing rule-writers to write high-recall constraints
while relying on the deep neural network to remove false positives and boost
precision.
Radu Soricut, Nan Ding
Comments: 10 pages
Subjects: Computation and Language (cs.CL)
We present a dual contribution to the task of machine reading-comprehension:
a technique for creating large-sized machine-comprehension (MC) datasets using
paragraph-vector models; and a novel, hybrid neural-network architecture that
combines the representation power of recurrent neural networks with the
discriminative power of fully-connected multi-layered networks. We use the
MC-dataset generation technique to build a dataset of around 2 million
examples, for which we empirically determine the high-ceiling of human
performance (around 91% accuracy), as well as the performance of a variety of
computer models. Among all the models we have experimented with, our hybrid
neural-network architecture achieves the highest performance (83.2% accuracy).
The remaining gap to the human-performance ceiling provides enough room for
future model improvements.
Zhiguo Wang, Haitao Mi, Wael Hamza, Radu Florian
Comments: 8
Subjects: Computation and Language (cs.CL)
Previous machine comprehension (MC) datasets are either too small to train
end-to-end deep learning models, or not difficult enough to evaluate the
ability of current MC techniques. The newly released SQuAD dataset alleviates
these limitations, and gives us a chance to develop more realistic MC models.
Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM)
model, which is an end-to-end system that directly predicts the answer
beginning and ending points in a passage. Our model first adjusts each
word-embedding vector in the passage by multiplying a relevancy weight computed
against the question. Then, we encode the question and weighted passage by
using bi-directional LSTMs. For each point in the passage, our model matches
the context of this point against the encoded question from multiple
perspectives and produces a matching vector. Given those matched vectors, we
employ another bi-directional LSTM to aggregate all the information and predict
the beginning and ending points. Experimental results on the test set of SQuAD
show that our model achieves a competitive result on the leaderboard.
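The first step of the model, rescaling each passage word embedding by a relevancy weight computed against the question, can be illustrated with a small sketch. The abstract does not define the weight; here it is assumed to be the maximum cosine similarity between the passage word and any question word, and the function names and two-dimensional toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weight_passage(passage, question):
    """Scale each passage embedding by its relevancy to the question.

    Assumption of this sketch: relevancy is the maximum cosine similarity
    between the passage word and any question word.
    """
    weighted = []
    for p in passage:
        relevancy = max(cosine(p, q) for q in question)
        weighted.append([relevancy * x for x in p])
    return weighted


# Hypothetical toy embeddings.
question = [[1.0, 0.0], [0.0, 1.0]]
passage = [[2.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]
weighted = weight_passage(passage, question)
print(weighted)
```

A passage word aligned with a question word keeps its embedding unchanged, while a word unrelated to every question word is scaled toward zero before the bi-directional LSTM encoding.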
Bruno Nicenboim, Shravan Vasishth
Subjects: Computation and Language (cs.CL); Applications (stat.AP); Machine Learning (stat.ML)
Research on interference has provided evidence that the formation of
dependencies between non-adjacent words relies on a cue-based retrieval
mechanism. Two different models can account for one of the main predictions of
interference, i.e., a slowdown at a retrieval site, when several items share a
feature associated with a retrieval cue: Lewis and Vasishth’s (2005)
activation-based model and McElree’s (2000) direct access model. Even though
these two models have been used almost interchangeably, they are based on
different assumptions and predict differences in the relationship between
reading times and response accuracy. The activation-based model follows the
assumptions of ACT-R, and its retrieval process behaves as a lognormal race
between accumulators of evidence with a single variance. Under this model,
accuracy of the retrieval is determined by the winner of the race and retrieval
time by its rate of accumulation. In contrast, the direct access model assumes
a model of memory where only the probability of retrieval varies between items;
in this model, differences in latencies are a by-product of the possibility of
repairing incorrect retrievals. We implemented both models in a Bayesian
hierarchical framework in order to evaluate them and compare them. We show that
some aspects of the data are better fit under the direct access model than
under the activation-based model. We suggest that this finding does not rule
out the possibility that retrieval may be behaving as a race model with
assumptions that follow less closely the ones from the ACT-R framework. We show
that by introducing a modification of the activation model, i.e., by assuming
that the accumulation of evidence for retrieval of incorrect items is not only
slower but noisier (i.e., different variances for the correct and incorrect
items), the model can provide a fit as good as the one of the direct access
model.
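The race idea behind the activation-based account can be made concrete with a small simulation. This is only a sketch under assumed parameters, not the Bayesian hierarchical implementation described above: the two-accumulator setup, the values of `mu_correct`, `mu_incorrect` and `sigma`, and the function names are all hypothetical.

```python
import random


def race_trial(rng, mu_correct, mu_incorrect, sigma):
    """One retrieval: two accumulators with lognormal finishing times race."""
    t_correct = rng.lognormvariate(mu_correct, sigma)
    t_incorrect = rng.lognormvariate(mu_incorrect, sigma)
    # The faster accumulator wins; it fixes both the response and the latency.
    if t_correct <= t_incorrect:
        return True, t_correct
    return False, t_incorrect


def simulate(n_trials, mu_correct=6.0, mu_incorrect=6.5, sigma=0.5, seed=0):
    """Return (accuracy, mean latency) over n_trials simulated retrievals."""
    rng = random.Random(seed)
    trials = [race_trial(rng, mu_correct, mu_incorrect, sigma)
              for _ in range(n_trials)]
    accuracy = sum(1 for correct, _ in trials if correct) / n_trials
    mean_latency = sum(t for _, t in trials) / n_trials
    return accuracy, mean_latency


accuracy, mean_latency = simulate(5000)
print(accuracy, mean_latency)
```

Both accumulators share a single `sigma` here, matching the single-variance assumption; the modified model would correspond to passing a larger spread to the incorrect accumulator.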
Gustavo Henrique Paetzold, Lucia Specia
Subjects: Computation and Language (cs.CL)
Parallel corpora have driven great progress in the field of Text
Simplification. However, most sentence alignment algorithms either support a
limited range of alignment types or simply ignore valuable clues
present in comparable documents. We address this problem by introducing a new
set of flexible vicinity-driven paragraph and sentence alignment algorithms
that support 1-N, N-1, N-N and long-distance null alignments without the need
for hard-to-replicate supervised models.
Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson
Journal-ref: ICASSP 2017
Subjects: Computation and Language (cs.CL)
Mismatched transcriptions have been proposed as a means to acquire
probabilistic transcriptions from non-native speakers of a language. Prior work
has demonstrated the value of these transcriptions by successfully adapting
cross-lingual ASR systems for different target languages. In this work, we
describe two techniques to refine these probabilistic transcriptions: a
noisy-channel model of non-native phone misperception is trained using a
recurrent neural network, and decoded using minimally-resourced
language-dependent pronunciation constraints. Both innovations improve quality
of the transcript, and both innovations reduce phone error rate of a
trained ASR, by 7% and 9% respectively.
Xiang Kong, Jeung-Yoon Choi, Stefanie Shattuck-Hufnagel
Comments: ICASSP 2017
Journal-ref: ICASSP 2017
Subjects: Computation and Language (cs.CL)
This paper describes methods for evaluating automatic speech recognition
(ASR) systems in comparison with human perception results, using measures
derived from linguistic distinctive features. Error patterns in terms of
manner, place and voicing are presented, along with an examination of confusion
matrices via a distinctive-feature-distance metric. These evaluation methods
contrast with conventional performance criteria that focus on the phone or word
level, and are intended to provide a more detailed profile of ASR system
performance, as well as a means for direct comparison with human perception
results at the sub-phonemic level.
Robert Speer, Joshua Chin, Catherine Havasi
Subjects: Computation and Language (cs.CL)
Machine learning about language can be improved by supplying it with specific
knowledge and sources of external information. We present here a new version of
the linked open data resource ConceptNet that is particularly well suited to be
used with modern NLP techniques such as word embeddings.
ConceptNet is a knowledge graph that connects words and phrases of natural
language with labeled edges. Its knowledge is collected from many sources that
include expert-created resources, crowd-sourcing, and games with a purpose. It
is designed to represent the general knowledge involved in understanding
language, improving natural language applications by allowing the application
to better understand the meanings behind the words people use.
When ConceptNet is combined with word embeddings acquired from distributional
semantics (such as word2vec), it provides applications with understanding that
they would not acquire from distributional semantics alone, nor from narrower
resources such as WordNet or DBPedia. We demonstrate this with state-of-the-art
results on intrinsic evaluations of word relatedness that translate into
improvements on applications of word vectors, including solving SAT-style
analogies.
Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun
Subjects: Computation and Language (cs.CL)
We introduce a new model, the Recurrent Entity Network (EntNet). It is
equipped with a dynamic long-term memory which allows it to maintain and update
a representation of the state of the world as it receives new data. For
language understanding tasks, it can reason on-the-fly as it reads text, not
just when it is required to answer a question or respond as is the case for a
Memory Network (Sukhbaatar et al., 2015). Like a Neural Turing Machine or
Differentiable Neural Computer (Graves et al., 2014; 2016) it maintains a fixed
size memory and can learn to perform location and content-based read and write
operations. However, unlike those models it has a simple parallel architecture
in which several memory locations can be updated simultaneously. The EntNet
sets a new state-of-the-art on the bAbI tasks, and is the first method to solve
all the tasks in the 10k training examples setting. We also demonstrate that it
can solve a reasoning task which requires a large number of supporting facts,
which other methods are not able to solve, and can generalize past its training
horizon. It can also be practically used on large scale datasets such as
Children’s Book Test, where it obtains competitive performance, reading the
story in a single pass.
Aditya Singh, Saurabh Saini, Rajvi Shah, PJ Narayanan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
User-given tags or labels are valuable resources for semantic understanding
of visual media such as images and videos. Recently, a new type of labeling
mechanism known as hash-tags has become increasingly popular on social media
sites. In this paper, we study the problem of generating relevant and useful
hash-tags for short video clips. Traditional data-driven approaches for tag
enrichment and recommendation use direct visual similarity for label transfer
and propagation. We attempt to learn a direct low-cost mapping from video to
hash-tags using a two-step training process. We first employ a natural language
processing (NLP) technique, skip-gram models with neural network training to
learn a low-dimensional vector representation of hash-tags (Tag2Vec) using a
corpus of 10 million hash-tags. We then train an embedding function to map
video features to the low-dimensional Tag2Vec space. We learn this embedding
for 29 categories of short video clips with hash-tags. A query video without
any tag-information can then be directly mapped to the vector space of tags
using the learned embedding and relevant tags can be found by performing a
simple nearest-neighbor retrieval in the Tag2Vec space. We validate the
relevance of the tags suggested by our system qualitatively and quantitatively
with a user study.
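The final retrieval step, mapping a query video into the tag space and returning the closest tags, can be sketched as below. The abstract does not state the distance measure, so cosine similarity is an assumption of this sketch, and the tag names, three-dimensional vectors, and function names are hypothetical stand-ins for a learned Tag2Vec space and embedding function.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_tags(query_vec, tag_vectors, k=2):
    """Return the k tags whose vectors are most similar to the query."""
    ranked = sorted(tag_vectors,
                    key=lambda tag: cosine(query_vec, tag_vectors[tag]),
                    reverse=True)
    return ranked[:k]


# Hypothetical tag vectors; a real Tag2Vec space would be learned from a corpus.
tag_vectors = {
    "#surfing": [0.9, 0.1, 0.0],
    "#beach": [0.8, 0.3, 0.1],
    "#cooking": [0.0, 0.2, 0.9],
}
# Hypothetical output of the learned video-to-tag embedding function.
query = [0.85, 0.2, 0.05]
suggested = nearest_tags(query, tag_vectors, k=2)
print(suggested)
```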
Yiyan Wang, Haotian Xu, Zhijian Ou
Comments: accepted by ICASSP2017
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)
State-of-the-art i-vector based speaker verification relies on variants of
Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis. We
are mainly motivated by the recent work on the Joint Bayesian (JB) method,
which was originally proposed for discriminant analysis in face verification. We
apply JB to speaker verification and make three contributions beyond the
original JB. 1) In contrast to the EM iterations with approximated statistics
in the original JB, EM iterations with exact statistics are employed and
give better performance. 2) We propose to do simultaneous diagonalization
(SD) of the within-class and between-class covariance matrices to achieve
efficient testing, which has broader application scope than the SVD-based
efficient testing method in the original JB. 3) We scrutinize similarities and
differences between various Gaussian PLDAs and JB, complementing the previous
analysis of comparing JB only with Prince-Elder PLDA. Extensive experiments are
conducted on NIST SRE10 core condition 5, empirically validating the
superiority of JB with faster convergence rate and 9 – 13% EER reduction
compared with state-of-the-art PLDA.
Yuan Tang
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)
TF.Learn is a high-level Python module for distributed machine learning
inside TensorFlow. It provides an easy-to-use Scikit-learn style interface to
simplify the process of creating, configuring, training, evaluating, and
experimenting with a machine learning model. TF.Learn integrates a wide range of
state-of-the-art machine learning algorithms built on top of TensorFlow’s low level
APIs for small to large-scale supervised and unsupervised problems. This module
focuses on bringing machine learning to non-specialists using a general-purpose
high-level language as well as researchers who want to implement, benchmark,
and compare their new methods in a structured environment. Emphasis is put on
ease of use, performance, documentation, and API consistency.
Sandeep Aswath Narayana
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE)
Continuous improvement in silicon process technologies has made possible the
integration of hundreds of cores on a single chip. However, power and heat have
become dominant constraints in designing these massive multicore chips causing
issues with reliability, timing variations and reduced lifetime of the chips.
Dynamic Thermal Management (DTM) is a solution to avoid high temperatures on
the die. Typical DTM schemes only address core level thermal issues. However,
the Network-on-chip (NoC) paradigm, which has emerged as an enabling
methodology for integrating hundreds to thousands of cores on the same die can
contribute significantly to the thermal issues. Moreover, the typical DTM is
triggered reactively based on temperature measurements from on-chip thermal
sensors, requiring long reaction times, whereas a predictive DTM method estimates
future temperature in advance, eliminating the chance of temperature overshoot.
Artificial Neural Networks (ANNs) have been used in various domains for
modeling and prediction with high accuracy due to their ability to learn and
adapt. This thesis concentrates on designing an ANN prediction engine to
predict the thermal profile of the cores and Network-on-Chip elements of the
chip. This thermal profile of the chip is then used by the predictive DTM that
combines both core level and network level DTM techniques. An on-chip wireless
interconnect, which has recently been envisioned to enable energy-efficient data
exchange between cores in a multicore environment, will be used to provide a
broadcast-capable medium to efficiently distribute thermal control messages to
trigger and manage the DTM schemes.
Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney
Comments: 30 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Primal and dual block coordinate descent methods are iterative methods for
solving regularized and unregularized optimization problems. Distributed-memory
parallel implementations of these methods have become popular in analyzing
large machine learning datasets. However, existing implementations communicate
at every iteration which, on modern data center and supercomputing
architectures, often dominates the cost of floating-point computation. Recent
results on communication-avoiding Krylov subspace methods suggest that large
speedups are possible by re-organizing iterative algorithms to avoid
communication. We show how applying similar algorithmic transformations can
lead to primal and dual block coordinate descent methods that only communicate
every s iterations (where s is a tuning parameter) instead of every iteration
for the regularized least-squares problem. We derive communication-avoiding
variants of the primal and dual block coordinate descent methods which reduce
the number of synchronizations by a factor of s on distributed-memory parallel
machines without altering the convergence rate. Our communication-avoiding
algorithms attain modeled strong scaling speedups of 14x and 165x on a modern
supercomputer using MPI and Apache Spark, respectively. Our algorithms attain
modeled weak scaling speedups of 12x and 396x on the same machine using MPI and
Apache Spark, respectively.
Francesco Paolo Schiavo, Vladimiro Sassone, Luca Nicoletti, Andrea Margheri
Comments: Technical Report Edited by Francesco Paolo Schiavo, Vladimiro Sassone, Luca Nicoletti and Andrea Margheri
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
This document is the main high-level architecture specification of the
SUNFISH cloud federation solution. Its main objective is to introduce the
concept of Federation-as-a-Service (FaaS) and the SUNFISH platform. FaaS is the
new and innovative cloud federation service proposed by the SUNFISH project.
The document defines the functionalities of FaaS, its governance and precise
objectives. With respect to these objectives, the document proposes the
high-level architecture of the SUNFISH platform: the software architecture that
permits realising a FaaS federation. More specifically, the document describes
all the components forming the platform, the offered functionalities and their
high-level interactions underlying the main FaaS functionalities. The document
concludes by outlining the main implementation strategies towards the actual
implementation of the proposed cloud federation solution.
Victor Dorobantu, Per Andre Stromhaug, Jess Renteria
Subjects: Learning (cs.LG)
The vanishing and exploding gradient problems are well-studied obstacles that
make it difficult for recurrent neural networks to learn long-term time
dependencies. We propose a reparameterization of standard recurrent neural
networks to update linear transformations in a provably norm-preserving way
through Givens rotations. Additionally, we use the absolute value function as
an element-wise non-linearity to preserve the norm of backpropagated signals
over the entire network. We show that this reparameterization reduces the
number of parameters and maintains the same algorithmic complexity as a
standard recurrent neural network, while outperforming standard recurrent
neural networks with orthogonal initializations and Long Short-Term Memory
networks on the copy problem.
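The norm-preserving property can be checked numerically with a short sketch. This is not the authors' parameterization (the abstract does not say how the rotations are composed or how inputs enter the recurrence); the function names, rotation angles, and three-dimensional state are hypothetical. It only shows that a product of Givens rotations followed by an element-wise absolute value leaves the Euclidean norm of a vector unchanged.

```python
import math


def givens_rotate(vec, i, j, theta):
    """Rotate coordinates i and j of vec by angle theta (one Givens rotation)."""
    out = list(vec)
    c, s = math.cos(theta), math.sin(theta)
    out[i] = c * vec[i] - s * vec[j]
    out[j] = s * vec[i] + c * vec[j]
    return out


def norm(vec):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in vec))


def step(hidden, rotations):
    """Apply a sequence of Givens rotations, then an element-wise absolute value."""
    for i, j, theta in rotations:
        hidden = givens_rotate(hidden, i, j, theta)
    return [abs(x) for x in hidden]


# Hypothetical hidden state and rotation angles (the angles play the role of
# the learned parameters in this sketch).
h = [3.0, -4.0, 12.0]
rotations = [(0, 1, 0.7), (1, 2, -1.3), (0, 2, 2.1)]
h_next = step(h, rotations)
print(norm(h), norm(h_next))
```

Because the absolute value has a derivative of magnitude one wherever it is defined, the same argument applies to the magnitude of signals passed backward through it.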
Sulin Liu, Sinno Jialin Pan, Qirong Ho
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
In this paper, we propose a distributed multi-task learning framework that
jointly learns predictive models for each task and the relationships between
tasks, in an alternating fashion, in the parameter server paradigm. In
our framework, we first offer a general dual form for a family of regularized
multi-task relationship learning methods. Subsequently, we propose a
communication-efficient primal-dual distributed optimization algorithm to solve
the dual problem by carefully designing local subproblems to make the dual
problem decomposable. Moreover, we provide a theoretical convergence analysis
for the proposed algorithm, which is specific for distributed multi-task
relationship learning. We conduct extensive experiments on both synthetic and
real-world datasets to evaluate our proposed framework in terms of scalability,
effectiveness, and convergence.
Daniel Jiwoong Im, He Ma, Chris Dongjoo Kim, Graham Taylor
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Generative Adversarial Networks have become one of the most studied
frameworks for unsupervised learning due to their intuitive formulation. They
have also been shown to be capable of generating convincing examples in limited
domains, such as low-resolution images. However, they still prove difficult to
train in practice and tend to ignore modes of the data generating distribution.
Quantitatively capturing effects such as mode coverage and more generally the
quality of the generative model still remain elusive. We propose Generative
Adversarial Parallelization, a framework in which many GANs or their variants
are trained simultaneously, exchanging their discriminators. This eliminates
the tight coupling between a generator and discriminator, leading to improved
convergence and improved coverage of modes. We also propose an improved variant
of the recently proposed Generative Adversarial Metric and show how it can
score individual GANs or their collections under the GAP model.
Daniel Jiwoong Im, Michael Tao, Kristin Branson
Subjects: Learning (cs.LG)
The training of deep neural networks is a high-dimensional optimization problem
with respect to the loss function of a model. Unfortunately, these functions
are of high dimension and non-convex and hence difficult to characterize. In
this paper, we empirically investigate the geometry of the loss functions for
state-of-the-art networks with multiple stochastic optimization methods. We do
this through several experiments that are visualized on polygons to understand
how and when these stochastic optimization methods find minima.
Filip L. Iliev, Valentin G. Stanev, Velimir V. Vesselinov, Boian S. Alexandrov
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Non-negative matrix factorization (NMF) is a well-known unsupervised learning
method that has been successfully used for blind source separation of
non-negative additive signals.NMF method requires the number of the original
sources to be known a priori. Recently, we reported a method, we called NMFk,
which by coupling the original NMF multiplicative algorithm with a custom
semi-supervised clustering allows us to estimate the number of the sources
based on the robustness of the reconstructed solutions. Here, an extension of
NMFk is developed, called ShiftNMFk, which by combining NMFk with previously
formulated ShiftNMF algorithm, Akaike Information Criterion (AIC), and a custom
procedure for estimating the source locations is capable of identifying: (a)
the number of the unknown sources, (b) the eventual delays in the signal
propagation, (c) the locations of the sources, and (d) the speed of propagation
of each of the signals in the medium. Our new method is a natural extension of
NMFk that can be used for sources identification based only on observational
data. We demonstrate how our novel method identifies the components of
synthetic data sets, discuss its limitations, and present a Julia language
implementation of the ShiftNMFk algorithm.
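As a rough illustration of the "original NMF multiplicative algorithm" that
the abstract builds on, the sketch below implements the standard
multiplicative updates for the Frobenius-norm objective in plain Python. It is
not the authors' NMFk or ShiftNMFk code (theirs is in Julia and adds
clustering, delays, and AIC-based model selection); the function names, the
random initialization, and the small eps guard are assumptions made for this
example only.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - W H."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x k) times H (k x m).

    Hypothetical helper for illustration; k (the number of sources) must be
    supplied by the caller, which is exactly the limitation NMFk addresses.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start (an assumption, not the authors' choice).
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors
```

Calling nmf(v, k=2) on a small non-negative matrix returns the two factors and
the residual history. Because every update multiplies by a non-negative ratio,
the factors stay non-negative without any explicit projection step.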
Valentin G. Stanev, Filip L. Iliev, Velimir V. Vesselinov, Boian S. Alexandrov
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
These records are then used to estimate properties of the contaminant
sources, e.g., locations, release strengths and model parameters representing
contaminant migration (e.g., velocity, dispersivity, etc.). These estimates are
essential for a reliable assessment of the contamination hazards and risks. If
there is more than one contaminant source (with different locations and
strengths), the observed records represent contaminant mixtures; typically, the
number of sources is unknown. The mixing ratios of the different contaminant
sources at the detectors are also unknown; this further reduces the reliability
and increases the complexity of the inverse-model analyses. To circumvent some of these
challenges, we have developed a novel hybrid source identification method
coupling machine learning and inverse-analysis methods, called Green-NMFk.
It decomposes the observed mixtures using the Non-negative Matrix
Factorization method for Blind Source Separation, coupled with a custom
semi-supervised clustering algorithm, and uses Green’s functions of the
advection-diffusion equation. Our method is capable of identifying the unknown
number, locations, and properties of a set of contaminant sources from measured
contaminant-source mixtures with unknown mixing ratios, without any additional
information. It also estimates the contaminant transport properties, such as
velocity and dispersivity. Green-NMFk is not limited to contaminant transport
but can be applied directly to any problem controlled by a parabolic partial
differential equation where mixtures of an unknown number of physical sources are
monitored at multiple locations. Green-NMFk can also be applied with different
Green’s functions; for example, representing anomalous (non-Fickian) dispersion
or wave propagation in dispersive media.
Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, Serge Belongie
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
In this paper we aim to leverage the powerful bottom-up discriminative
representations to guide a top-down generative model. We propose a novel
generative model named Stacked Generative Adversarial Networks (SGAN), which is
trained to invert the hierarchical representations of a discriminative
bottom-up deep network. Our model consists of a top-down stack of GANs, each
trained to generate “plausible” lower-level representations, conditioned on
higher-level representations. A representation discriminator is introduced at
each feature hierarchy to encourage the representation manifold of the
generator to align with that of the bottom-up discriminative network, providing
intermediate supervision. In addition, we introduce a conditional loss that
encourages the use of conditional information from the layer above, and a novel
entropy loss that maximizes a variational lower bound on the conditional
entropy of generator outputs. To the best of our knowledge, the entropy loss is
the first attempt to tackle the conditional model collapse problem that is
common in conditional GANs. We first train each GAN of the stack independently,
and then we train the stack end-to-end. Unlike the original GAN that uses a
single noise vector to represent all the variations, our SGAN decomposes
variations into multiple levels and gradually resolves uncertainties in the
top-down generative process. Experiments demonstrate that SGAN is able to
generate diverse and high-quality images, as well as being more interpretable
than a vanilla GAN.
Ahmad El Sallab, Mohammed Abdou, Etienne Perot, Senthil Yogamani
Comments: Presented at the Machine Learning for Intelligent Transportation Systems Workshop, NIPS 2016
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Robotics (cs.RO)
Reinforcement learning is considered to be a strong AI paradigm which can be
used to teach machines through interaction with the environment and learning
from their mistakes, but it has not yet been successfully used for automotive
applications. There has recently been a revival of interest in the topic,
however, driven by the ability of deep learning algorithms to learn good
representations of the environment. Motivated by Google DeepMind’s successful
demonstrations of learning for games from Breakout to Go, we will propose
different methods for autonomous driving using deep reinforcement learning.
This is of particular interest as it is difficult to pose autonomous driving as
a supervised learning problem as it has a strong interaction with the
environment including other vehicles, pedestrians and roadworks. As this is a
relatively new area of research for autonomous driving, we will formulate two
main categories of algorithms: 1) Discrete actions category, and 2) Continuous
actions category. For the discrete actions category, we will deal with Deep
Q-Network Algorithm (DQN) while for the continuous actions category, we will
deal with Deep Deterministic Actor Critic Algorithm (DDAC). In addition, we
will also examine the performance of these two categories on an open
source car racing simulator called TORCS (The Open Racing Car
Simulator). Our simulation results demonstrate learning of autonomous
maneuvering in a scenario of complex road curvatures and simple interaction
with other vehicles. Finally, we explain the effect of some restricted
conditions, put on the car during the learning phase, on the convergence time
for finishing its learning phase.
Tian Qi Chen, Mark Schmidt
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Learning (cs.LG)
Artistic style transfer is an image synthesis problem where the content of an
image is reproduced with the style of another. Recent works show that a
visually appealing style transfer can be achieved by using the hidden
activations of a pretrained convolutional neural network. However, existing
methods either apply (i) an optimization procedure that works for any style
image but is very expensive, or (ii) an efficient feedforward network that only
allows a limited number of trained styles. In this work we propose a simpler
optimization objective based on local matching that combines the content
structure and style textures in a single layer of the pretrained network. We
show that our objective has desirable properties such as a simpler optimization
landscape, intuitive parameter tuning, and consistent frame-by-frame
performance on video. Furthermore, we use 80,000 natural images and 80,000
paintings to train an inverse network that approximates the result of the
optimization. This results in a procedure for artistic style transfer that is
efficient but also allows arbitrary content and style images.
Brett W. Israelsen, Nisar Ahmed, Kenneth Center, Roderick Green, Winston Bennett Jr
Comments: submitted copy
Journal-ref: SciTech 2017, paper 2545524
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)
A key requirement for the current generation of artificial decision-makers is
that they should adapt well to changes in unexpected situations. This paper
addresses the situation in which an AI for aerial dog fighting, with tunable
parameters that govern its behavior, must optimize behavior with respect to an
objective function that is evaluated and learned through simulations. Bayesian
optimization with a Gaussian Process surrogate is used as the method for
investigating the objective function. One key benefit is that during
optimization, the Gaussian Process learns a global estimate of the true
objective function, with predicted outcomes and a statistical measure of
confidence in areas that haven’t been investigated yet. Having a model of the
objective function is important for being able to understand possible outcomes
in the decision space; for example this is crucial for training and providing
feedback to human pilots. However, standard Bayesian optimization does not
perform consistently or provide an accurate Gaussian Process surrogate function
for highly volatile objective functions. We treat these problems by introducing
a novel sampling technique called Hybrid Repeat/Multi-point Sampling. This
technique gives the AI the ability to learn optimum behaviors in a highly uncertain
environment. More importantly, it not only improves the reliability of the
optimization, but also creates a better model of the entire objective surface.
With this improved model the agent is equipped to more accurately/efficiently
predict performance in unexplored scenarios.
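To make the idea of a Gaussian Process surrogate with "a statistical measure
of confidence" concrete, the sketch below computes the standard GP posterior
mean and variance at a query point for a one-dimensional input. It covers only
the textbook surrogate, not the Hybrid Repeat/Multi-point Sampling technique
of the paper. The squared-exponential kernel, its length scale, the noise
level, and all function names are assumptions chosen for this example.

```python
import math


def rbf(x1, x2, length=1.0):
    """Squared-exponential kernel (assumed here; the paper's kernel is not given)."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        # Bring the largest remaining pivot into position for stability.
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def gp_posterior(xs, ys, x_star, noise=1e-6, length=1.0):
    """Posterior mean and variance of a zero-mean GP at x_star.

    xs, ys: previously evaluated inputs and objective values.
    noise: assumed observation-noise variance added to the kernel diagonal.
    """
    k = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_star, length) for a in xs]
    alpha = solve(k, ys)          # (K + noise I)^{-1} y
    v = solve(k, k_star)          # (K + noise I)^{-1} k_*
    mean = sum(ks * al for ks, al in zip(k_star, alpha))
    var = rbf(x_star, x_star, length) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, max(var, 0.0)    # clamp tiny negative round-off
```

Near an evaluated point the variance collapses toward zero, while far from all
data it returns to the prior variance of 1. That spread is the confidence
measure an acquisition rule would use to decide where to simulate next.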
Long-Gang Pang, Kai Zhou, Nan Su, Hannah Petersen, Horst Stöcker, Xin-Nian Wang
Subjects: High Energy Physics – Phenomenology (hep-ph); Learning (cs.LG); Nuclear Theory (nucl-th); Machine Learning (stat.ML)
Supervised learning with a deep convolutional neural network is used to
identify the QCD equation of state (EoS) employed in relativistic hydrodynamic
simulations of heavy-ion collisions. The final-state particle spectra
\(\rho(p_T,\Phi)\) provide directly accessible information from experiments.
High-level correlations of \(\rho(p_T,\Phi)\) learned by the neural network act
as an “EoS-meter”, effective in detecting the nature of the QCD transition. The
EoS-meter is model independent and insensitive to other simulation input,
especially the initial conditions. Thus it provides a formidable
direct connection of heavy-ion collision observables with the bulk properties of
QCD.
Mona Alshahrani, Mohammed Asif Khan, Omar Maddouri, Akira R Kinjo, Núria Queralt-Rosinach, Robert Hoehndorf
Subjects: Quantitative Methods (q-bio.QM); Learning (cs.LG); Molecular Networks (q-bio.MN)
Motivation: Biological data and knowledge bases increasingly rely on Semantic
Web technologies and the use of knowledge graphs for data integration,
retrieval and federated queries. In recent years, feature learning methods
that are applicable to graph-structured data have become available, but have
not yet been widely applied and evaluated on structured biological knowledge.
Results: We develop a novel method for feature learning on biological knowledge
graphs. Our method combines symbolic methods, in particular knowledge
representation using symbolic logic and automated reasoning, with neural
networks to generate embeddings of nodes that encode for related information
within knowledge graphs. Through the use of symbolic logic, these embeddings
contain both explicit and implicit information. We apply these embeddings to
the prediction of edges in the knowledge graph representing problems of
function prediction, finding candidate genes of diseases, protein-protein
interactions, or drug target relations, and demonstrate performance that
matches and sometimes outperforms traditional approaches based on manually
crafted features. Our method can be applied to any biological knowledge graph,
and will thereby open up the increasing amount of Semantic Web based knowledge
bases in biology to use in machine learning and data analytics.
Availability and Implementation: this https URL
Contact: robert.hoehndorf@kaust.edu.sa
Yuan Tang
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)
TF.Learn is a high-level Python module for distributed machine learning
inside TensorFlow. It provides an easy-to-use Scikit-learn style interface to
simplify the process of creating, configuring, training, evaluating, and
experimenting with a machine learning model. TF.Learn integrates a wide range of
state-of-the-art machine learning algorithms built on top of TensorFlow’s low level
APIs for small to large-scale supervised and unsupervised problems. This module
focuses on bringing machine learning to non-specialists using a general-purpose
high-level language as well as researchers who want to implement, benchmark,
and compare their new methods in a structured environment. Emphasis is put on
ease of use, performance, documentation, and API consistency.
Philipp Meerkamp (Bloomberg LP), Zhengyi Zhou (AT&T Labs Research)
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)
We present an architecture for information extraction from text that augments
an existing parser with a character-level neural network. To train the neural
network, we compute a measure of consistency of extracted data with existing
databases, and use it as a form of noisy supervision. Our architecture combines
the ability of constraint-based information extraction systems to easily
incorporate domain knowledge and constraints with the ability of deep neural
networks to leverage large amounts of data to learn complex features. The
system led to large improvements over a mature and highly tuned
constraint-based information extraction system used at Bloomberg for financial
language text. At the same time, the new system massively reduces the
development effort, allowing rule-writers to write high-recall constraints
while relying on the deep neural network to remove false positives and boost
precision.
Alec Koppel, Garrett Warnell, Ethan Stump, Alejandro Ribeiro
Comments: Submitted to JMLR on 11/24/2016
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Despite their attractiveness, the popular perception is that techniques for
nonparametric function approximation do not scale to streaming data due to an
intractable growth in the amount of storage they require. To solve this problem
in a memory-affordable way, we propose an online technique based on functional
stochastic gradient descent in tandem with supervised sparsification based on
greedy function subspace projections. The method, called parsimonious online
learning with kernels (POLK), provides a controllable tradeoff between its
solution accuracy and the amount of memory it requires. We derive conditions
under which the generated function sequence converges almost surely to the
optimal function, and we establish that the memory requirement remains finite.
We evaluate POLK for kernel multi-class logistic regression and kernel
hinge-loss classification on three canonical data sets: a synthetic Gaussian
mixture model, the MNIST hand-written digits, and the Brodatz texture database.
On all three tasks, we observe a favorable tradeoff of objective function
evaluation, classification performance, and complexity of the nonparametric
regressor extracted by the proposed method.
Sam Work
Comments: MSc dissertation, qualitative analysis, machine learning researchers
Subjects: Computers and Society (cs.CY); Learning (cs.LG)
This MSc dissertation considers the effects of the current corporate interest
on researchers in the field of machine learning. Situated within the field’s
cyclical history of academic, public and corporate interest, this dissertation
investigates how current researchers view recent developments and negotiate
their own research practices within an environment of increased commercial
interest and funding. The original research consists of in-depth interviews
with 12 machine learning researchers working in both academia and industry.
Building on theory from science, technology and society studies, this
dissertation problematizes the traditional narratives of the neoliberalization
of academic research by allowing the researchers themselves to discuss how
their career choices, working environments and interactions with others in the
field have been affected by the reinvigorated corporate interest of recent
years.
Yiyan Wang, Haotian Xu, Zhijian Ou
Comments: accepted by ICASSP2017
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)
State-of-the-art i-vector based speaker verification relies on variants of
Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis. We
are mainly motivated by recent work on the joint Bayesian (JB) method,
which was originally proposed for discriminant analysis in face verification. We
apply JB to speaker verification and make three contributions beyond the
original JB. 1) In contrast to the EM iterations with approximated statistics
in the original JB, EM iterations with exact statistics are employed and
give better performance. 2) We propose simultaneous diagonalization
(SD) of the within-class and between-class covariance matrices to achieve
efficient testing, which has broader application scope than the SVD-based
efficient testing method in the original JB. 3) We scrutinize similarities and
differences between various Gaussian PLDAs and JB, complementing the previous
analysis of comparing JB only with Prince-Elder PLDA. Extensive experiments are
conducted on NIST SRE10 core condition 5, empirically validating the
superiority of JB with faster convergence rate and 9 – 13% EER reduction
compared with state-of-the-art PLDA.
Bodo Rueckauer, Iulia-Alexandra Lungu, Yuhuang Hu, Michael Pfeiffer
Comments: 9 pages, 2 figures, presented at the workshop “Computing with Spikes” at NIPS 2016, Barcelona, Spain
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Deep convolutional neural networks (CNNs) have shown great potential for
numerous real-world machine learning applications, but performing inference in
large CNNs in real-time remains a challenge. We have previously demonstrated
that traditional CNNs can be converted into deep spiking neural networks
(SNNs), which exhibit similar accuracy while reducing both latency and
computational load as a consequence of their data-driven, event-based style of
computing. Here we provide a novel theory that explains why this conversion is
successful, and derive from it several new tools to convert a larger and more
powerful class of deep networks into SNNs. We identify the main sources of
approximation errors in previous conversion methods, and propose simple
mechanisms to fix these issues. Furthermore, we develop spiking implementations
of common CNN operations such as max-pooling, softmax, and batch-normalization,
which allow almost loss-less conversion of arbitrary CNN architectures into the
spiking domain. Empirical evaluation of different network architectures on the
MNIST and CIFAR10 benchmarks leads to the best SNN results reported to date.
Brett Israelsen, Nisar Ahmed
Journal-ref: BayesOpt Workshop, NIPS 2016
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)
A key drawback of the current generation of artificial decision-makers is
that they do not adapt well to changes in unexpected situations. This paper
addresses the situation in which an AI for aerial dog fighting, with tunable
parameters that govern its behavior, will optimize behavior with respect to an
objective function that must be evaluated and learned through simulations. Once
this objective function has been modeled, the agent can then choose its desired
behavior in different situations. Bayesian optimization with a Gaussian Process
surrogate is used as the method for investigating the objective function. One
key benefit is that during optimization the Gaussian Process learns a global
estimate of the true objective function, with predicted outcomes and a
statistical measure of confidence in areas that haven’t been investigated yet.
However, standard Bayesian optimization does not perform consistently or
provide an accurate Gaussian Process surrogate function for highly volatile
objective functions. We treat these problems by introducing a novel sampling
technique called Hybrid Repeat/Multi-point Sampling. This technique gives the
AI the ability to learn optimum behaviors in a highly uncertain environment. More
importantly, it not only improves the reliability of the optimization, but also
creates a better model of the entire objective surface. With this improved
model the agent is equipped to better adapt behaviors.
Javier de la Cruz
Subjects: Information Theory (cs.IT)
In this paper we define and study a family of codes which come close to being
MRD codes, so we call them AMRD codes (almost MRD). An AMRD code is a code with
rank defect equal to 1. AMRD codes whose duals are AMRD are called dually AMRD.
Dually AMRD codes are the closest to the MRD codes given that both they and
their dual codes are almost optimal. Necessary and sufficient conditions for
the codes to be dually AMRD are given. Furthermore we show that dually AMRD
codes and codes of rank defect one and maximum 2-generalized weight coincide
when the size of the matrix divides the dimension.
Chaoyang Jiang, Yeng Chai Soh, Hua Li, Mustafa K. Masood, Zhe Wei, Xiaoli Zhou, Deqing Zhai
Comments: 17 pages
Subjects: Information Theory (cs.IT); Graphics (cs.GR)
Current CFD calibration work has mainly focused on the CFD model calibration.
However, no known work has considered the calibration of the CFD results. In
this paper, we take inspiration from the image editing problem to develop a
methodology to calibrate CFD simulation results based on sparse sensor
observations. We formulate the calibration of CFD results as an optimization
problem. The cost function consists of two terms. One term guarantees a good
local adjustment of the simulation results based on the sparse sensor
observations. The other term transmits the adjustment from local regions around
sensing locations to the global domain. The proposed method can enhance the CFD
simulation results while preserving the overall original profile. An experiment
in an air-conditioned room was implemented to verify the effectiveness of the
proposed method. In the experiment, four sensor observations were used to
calibrate a simulated thermal map with 167×365 data points. The experimental
results show that the proposed method is effective and practical.
P K Deekshith, Trupthi Chougule, Shreya Turmari, Ramya Raju, Rakshitha Ram, Vinod Sharma
Subjects: Information Theory (cs.IT)
In this work, we consider a scenario wherein an energy harvesting wireless
radio equipment sends information to multiple receivers alongside powering
them. In addition to harvesting the incoming radio frequency (RF) energy, the
receivers also harvest energy from their environment (e.g., solar energy). This
communication framework is captured by a fading Gaussian Broadcast Channel
(GBC) with energy harvesting transmitter and receivers. In order to ensure
some quality of service (QoS) in data reception among the receivers, we
impose a minimum-rate requirement on data transmission. For the
setting in place, we characterize the fundamental limits in jointly
transmitting information and power subject to a QoS guarantee, for three
cardinal receiver structures, namely ideal, time-switching,
and power-splitting. We show that a time-switching receiver can
switch between information reception mode and energy harvesting mode
without the transmitter’s knowledge of the same and without
any extra rate loss. We also prove that, for the same amount of power
transferred, on average, a power-splitting receiver supports higher data rates
compared to a time-switching receiver.
P K Deekshith, Trupthi Chougule, Shreya Turmari, Ramya Raju, Rakshitha Ram, Vinod Sharma
Subjects: Information Theory (cs.IT)
We consider the problem of Simultaneous Wireless Information and Power
Transfer (SWIPT) over a fading multiple access channel with additive Gaussian
noise. The transmitters as well as the receiver harvest energy from ambient
sources. We assume that the transmitters have two classes of data to send, viz.
delay sensitive and delay tolerant data. Each transmitter sends the delay
sensitive data at a certain minimum rate irrespective of the channel conditions
(fading states). In addition, if the channel conditions are good, the delay
tolerant data is sent. Along with data, the transmitters also transfer power
to aid the receiver in meeting its energy requirements. In this setting, we
characterize the minimum-rate capacity region which provides the
fundamental limit of transferring information and power simultaneously with
minimum rate guarantees. Owing to the limitations of current technology, these
limits might not be achievable in practice. Among the practical receiver
structures proposed for SWIPT in the literature, two popular architectures are the
time switching and power splitting receivers. For each of
these architectures, we derive the minimum-rate capacity regions. We show that
power splitting receivers, although more complex, provide a larger capacity
region.
Jae-Nam Shim, Hongseok Park, GeeYong Suk, Chan-Byoung Chae, Dong Ku Kim
Subjects: Information Theory (cs.IT)
In this paper, we consider the Cramer-Rao lower bound (CRLB) for estimation
with a lens-embedded antenna array with deterministic parameters. Unlike the
CRLB of a uniform linear array (ULA), the CRLB for the direction of arrival
(DoA) of a lens-embedded antenna array is dominated not only by the angle but
also by the characteristics of the lens. The derivation is based on
approximating the amplitude of the received signal with a lens by a Gaussian
function. We confirm, by simulation with the beam propagation method, that the
parameters needed to design a lens can be derived from the standard deviation
of the Gaussian, which represents the characteristic of the received signal. A
well-designed lens antenna shows better performance than a ULA in terms of
estimating the DoA. This derivation is useful because the result can serve as
a guideline for designing the lens parameters to satisfy a given purpose.
Emil Björnson, Luca Sanguinetti, Merouane Debbah
Comments: 5 pages, 3 figures, 1 table
Journal-ref: presented at Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, USA, Nov. 2016
Subjects: Information Theory (cs.IT)
This work investigates the impact of imperfect statistical information in the
uplink of massive MIMO systems. In particular, we first show why covariance
information is needed and then propose two schemes for covariance matrix
estimation. A lower bound on the spectral efficiency (SE) of any combining
scheme is derived, under imperfect covariance knowledge, and a closed-form
expression is computed for maximum-ratio combining. We show that having
covariance information is not critical, but that it is relatively easy to
acquire it and to achieve SE close to the ideal case of having perfect
statistical information.
Reuben George Stephen, Rui Zhang
Comments: Presented in IEEE Online Conference on Green Communications (Online GreenComm), Nov. 2016 (Invited Paper)
Subjects: Information Theory (cs.IT)
Cloud radio access network (CRAN), in which remote radio heads (RRHs) are
deployed to serve users in a target area, and connected to a central processor
(CP) via limited-capacity links termed the fronthaul, is a promising candidate
for the next-generation wireless communication systems. Due to the
content-centric nature of future wireless communications, it is desirable to
cache popular contents beforehand at the RRHs, to reduce the burden on the
fronthaul and achieve energy saving through cooperative transmission. This
motivates our study in this paper on the energy efficient transmission in an
orthogonal frequency division multiple access (OFDMA)-based CRAN with multiple
RRHs and users, where the RRHs can prefetch popular contents. We consider a
joint optimization of the user-subcarrier (SC) assignment, RRH selection and transmit power
allocation over all the SCs to minimize the total transmit power of the RRHs,
subject to the RRHs’ individual fronthaul capacity constraints and the users’
minimum rate constraints, while taking into account the caching status at the
RRHs. Although the problem is non-convex, we propose a Lagrange duality based
solution, which can be efficiently computed with good accuracy. We compare the
minimum transmit power required by the proposed algorithm with different
caching strategies against the case without caching by simulations, which show
the significant energy saving with caching.
Difan Zou, Chen Gong, Zhengyuan Xu
Subjects: Information Theory (cs.IT)
In optical wireless scattering communication, the received signal in each symbol
interval is captured by a photomultiplier tube (PMT) and then sampled through
very short but finite interval sampling. The resulting samples form a signal
vector for symbol detection. The upper and lower bounds on transmission rate of
such a processing system are studied. It is shown that the gap between the two
bounds approaches zero as the thermal noise and shot noise variances approach
zero. The maximum a posteriori (MAP) signal detection is performed and a low
computational complexity receiver is derived under piecewise polynomial
approximation. Meanwhile, the threshold based signal detection is also studied,
where two threshold selection rules are proposed based on the detection error
probability and the Kullback-Leibler (KL) distance. For the latter, it is shown
that the KL distance is not sensitive to the threshold selection for small shot
and thermal noise variances, and thus the threshold can be selected among a
wide range without significant loss from the optimal KL distance. The
performances of the transmission rate bounds, the signal detection, and the
threshold selection approaches are evaluated by the numerical results.
Hassan Khodaiemehr, Mohammad-Reza Sadeghi, Daniel Panario
Comments: 44 pages, 6 figures. Part of this work has been presented at ISIT 2016, Spain
Subjects: Information Theory (cs.IT)
LDPC lattices were the first family of lattices which have an efficient
decoding algorithm in high dimensions over an AWGN channel. Considering
Construction D’ of lattices with one binary LDPC code as underlying code gives
the well known Construction A LDPC lattices or 1-level LDPC lattices.
The block-fading (BF) channel is a useful model for various wireless communication
channels in both indoor and outdoor environments. Frequency-hopping schemes and
orthogonal frequency division multiplexing (OFDM) can conveniently be modelled
as block-fading channels. Applying lattices in this type of channel entails
dividing a lattice point into multiple blocks such that fading is constant
within a block but changes, independently, across blocks. The design of
lattices for BF channels offers a challenging problem, which differs greatly
from its counterparts like AWGN channels. Recently, the original binary
Construction A for lattices, due to Forney, has been generalized to a lattice
construction from totally real and complex multiplication fields. This
generalized Construction A of lattices provides signal space diversity
intrinsically, which is the main requirement for the signal sets designed for
fading channels. In this paper we construct full diversity LDPC lattices for
block-fading channels using Construction A over totally real number fields. We
propose a new iterative decoding method for this family of lattices which has
complexity that grows linearly in the dimension of the lattice. In order to
implement our decoding algorithm, we propose the definition of a parity check
matrix and Tanner graph for full diversity Construction A lattices. We also
prove that the constructed LDPC lattices together with the proposed decoding
method admit diversity order n-1 over an n-block-fading channel.
Kuikui Li, Chenchen Yang, Zhiyong Chen, Meixia Tao
Comments: submitted to IEEE Trans. Wireless Communications
Subjects: Information Theory (cs.IT)
In this paper, we study the probabilistic caching for an N-tier wireless
heterogeneous network (HetNet) using stochastic geometry. A general and
tractable expression of the successful delivery probability (SDP) is first
derived. We then optimize the caching probabilities for maximizing the SDP in
the high signal-to-noise ratio (SNR) region. The problem is proved to be convex
and solved efficiently. We next establish an interesting connection between
N-tier HetNets and single-tier networks. Unlike the single-tier network where
the optimal performance only depends on the cache size, the optimal performance
of N-tier HetNets depends also on the BS densities. The performance upper bound
is, however, determined by an equivalent single-tier network. We further show
that with even caching probabilities regardless of content popularities, to
achieve a target SDP, the BS density of a tier can be reduced by increasing the
cache size of the tier when the cache size is larger than a threshold;
otherwise the BS density and BS cache size can be increased simultaneously. It
is also found analytically that the BS density of a tier is inverse to the BS
cache size of the same tier and is linear in the BS cache sizes of other tiers.
Hela Jedda, Amine Mezghani, A. Lee Swindlehurst, Josef A. Nossek
Comments: 5 pages, submitted to ICC 2017
Subjects: Information Theory (cs.IT)
We consider a multi-user (MU) multiple-input-single-output (MISO) downlink
system with M single-antenna users and N transmit antennas with a nonlinear
power amplifier (PA) at each antenna. Instead of emitting constant envelope
(CE) signals from the antennas to have highly power efficient PAs, we relax the
CE constraint and allow the transmit signals to have instantaneous power less
than or equal to the available power at each PA. The PA power efficiency
decreases, but simulation results show that the same performance in terms of
bit-error-ratio (BER) can be achieved with less transmitted power and less PA
power consumption. We propose a linear and a nonlinear precoder design to
mitigate the multi-user interference (MUI) under the constraint of a maximal
instantaneous per-antenna peak power.
Yirui Cong, Xiangyun Zhou, Rodney A. Kennedy
Comments: This paper is accepted for publication in Globecom’16
Subjects: Information Theory (cs.IT)
This paper studies a wireless network consisting of multiple
transmitter-receiver pairs sharing the same spectrum where interference is
regarded as noise. Previously, the throughput region of such a network was
characterized for either one time slot or an infinite time horizon. This work
aims to close the gap by investigating the throughput region for transmissions
over a finite time horizon. We derive an efficient algorithm to examine the
achievability of any given rate in the finite-horizon throughput region and
provide the rate-achieving policy. The computational efficiency of our
algorithm comes from the use of A* search with a carefully chosen heuristic
function and a tree pruning strategy. We also show that the celebrated
max-weight algorithm, which finds all achievable rates in the infinite-horizon
throughput region, fails to work for the finite-horizon throughput region.
Qurrat-Ul-Ain Nadeem, Abla Kammoun, Mérouane Debbah, and Mohamed-Slim Alouini
Subjects: Information Theory (cs.IT)
Massive multiple-input-multiple-output (MIMO) transmission is a promising
technology to improve the capacity and reliability of wireless systems.
However, the number of antennas that can be equipped at a base station (BS) is
limited by the BS form factor, posing a challenge to the deployment of massive
linear arrays. To cope with this limitation, this work discusses Full Dimension
MIMO (FD-MIMO), which is currently an active area of research and
standardization in the 3rd Generation Partnership Project (3GPP) for evolution
towards fifth generation (5G) cellular systems. FD-MIMO utilizes an active
antenna system (AAS) with a 2D planar array structure, which provides the
ability of adaptive electronic beamforming in the 3D space. This paper presents
the design of the AAS and the ongoing efforts in the 3GPP to develop the
corresponding 3D channel model. The compact structure of large-scale antenna arrays
drastically increases the spatial correlation in FD-MIMO systems. In order to
account for its effects, the generalized spatial correlation functions for
channels constituted by individual antenna elements and overall antenna ports
in the AAS are derived. Exploiting the quasi-static channel covariance matrices
of the users, the problem of determining the optimal downtilt weight vector for
antenna ports, which maximizes the minimum signal-to-interference ratio of a
multi-user multiple-input-single-output system, is formulated as a fractional
optimization problem. A quasi-optimal solution is obtained through the
application of semi-definite relaxation and Dinkelbach’s method. Finally, the
user-group specific elevation beamforming scenario is devised, which offers
significant performance gains as confirmed through simulations. These results
have direct application in the analysis of 5G FD-MIMO systems.
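As a rough illustration of Dinkelbach's method named above, the following Python sketch maximizes a generic ratio f(x)/g(x) with g(x) > 0 over a finite candidate set. It is not the paper's algorithm: the function name `dinkelbach`, the candidate grid, and the toy objective are assumptions made for illustration, and the semi-definite relaxation step is omitted entirely.

```python
def dinkelbach(candidates, f, g, tol=1e-9, max_iter=100):
    """Maximize f(x)/g(x) over a finite candidate set, assuming g(x) > 0.

    Each iteration solves the parametric problem max_x f(x) - lam * g(x)
    and then updates lam to the ratio achieved by the maximizer.
    """
    x = candidates[0]
    lam = f(x) / g(x)
    for _ in range(max_iter):
        # Parametric subproblem (solved here by exhaustive search).
        x = max(candidates, key=lambda c: f(c) - lam * g(c))
        gap = f(x) - lam * g(x)
        if gap <= tol:
            # No candidate improves on the current ratio lam.
            break
        lam = f(x) / g(x)
    return x, lam


# Toy fractional objective (hypothetical, for illustration only).
def f(x):
    return 2.0 * x + 1.0


def g(x):
    return x * x + 1.0


candidates = [i / 100.0 for i in range(301)]
x_best, ratio_best = dinkelbach(candidates, f, g)
```

The appeal of the method is that each step replaces the ratio by a difference, which is usually easier to optimize; in the sketch the subproblem is solved by brute force only to keep the example self-contained.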
Yahia A. Eldemerdash, Octavia A. Dobre
Subjects: Information Theory (cs.IT)
Signal identification represents the task of a receiver to identify the
signal type and its parameters, with applications to both military and
commercial communications. In this paper, we investigate the identification of
spatial multiplexing (SM) and Alamouti (AL) space-time block code (STBC) with
single carrier frequency division multiple access (SC-FDMA) signals, when the
receiver is equipped with a single antenna. We develop a discriminating feature
based on a fourth-order statistic of the received signal, as well as a constant
false alarm rate decision criterion which relies on the statistical properties
of the feature estimate. Furthermore, we present the theoretical performance
analysis of the proposed identification algorithm. The algorithm does not
require channel or noise power estimation, modulation classification, or block
synchronization. Simulation results show the validity of the proposed
algorithm, as well as very good agreement with the theoretical analysis.
Anastasios Giovanidis, Apostolos Avranas
Comments: 14 pages, double column, 5 figures, 15 sub-figures in total. arXiv admin note: substantial text overlap with arXiv:1602.07623
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT); Multimedia (cs.MM); Performance (cs.PF)
This article introduces a novel family of decentralised caching policies,
applicable to wireless networks with finite storage at the edge-nodes
(stations). These policies, which are based on the Least-Recently-Used
replacement principle, are here referred to as spatial multi-LRU. They update
cache inventories in a way that provides content diversity to users who are
covered by, and thus have access to, more than one station. Two variations are
proposed, the multi-LRU-One and -All, which differ in the number of replicas
inserted in the involved caches. We analyse their performance under two types
of traffic demand, the Independent Reference Model (IRM) and a model that
exhibits temporal locality. For IRM, we propose a Che-like approximation to
predict the hit probability, which gives very accurate results. Numerical
evaluations show that the performance of multi-LRU improves as the
multi-coverage areas grow, and that it is close to the performance of
centralised policies when multi-coverage is sufficient. For IRM traffic,
multi-LRU-One is preferable to multi-LRU-All, whereas when the traffic exhibits
temporal locality the -All variation can perform better. Both variations
outperform the simple LRU. When popularity knowledge is not accurate, the new
policies can perform better than centralised ones.
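A minimal Python sketch of how the two variants described above could differ. The abstract only states that multi-LRU-One and multi-LRU-All differ in the number of replicas inserted in the involved caches; the behaviour on a hit (refreshing one holder versus all holders) and the choice of the first covering cache for multi-LRU-One are assumptions made here, and the names `LRUCache` and `request` are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least-recently-used item."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def touch(self, key):
        """Mark key as most recently used; return True if it was cached."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        return False

    def insert(self, key):
        """Insert key as most recent, evicting the oldest item if full."""
        self.items[key] = True
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)


def request(covering, key, policy):
    """Serve one request from a user covered by the caches in `covering`.

    policy is "one" or "all". Returns True on a hit, False on a miss.
    """
    holders = [c for c in covering if key in c.items]
    if holders:
        # Hit: refresh one holder ("one") or every holder ("all") (assumed).
        targets = holders if policy == "all" else holders[:1]
        for cache in targets:
            cache.touch(key)
        return True
    # Miss: insert one replica ("one") or one per covering cache ("all").
    targets = covering if policy == "all" else covering[:1]
    for cache in targets:
        cache.insert(key)
    return False
```

With the "one" policy, neighbouring caches tend to hold different objects, which is the content diversity the abstract refers to; with "all", replicas are duplicated across the covering caches.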
M. Hassan Najafi, David J. Lilja
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)
This work proposes a high-capacity scheme for separable reversible data
hiding in encrypted images. At the sender side, the original uncompressed image
is encrypted using an encryption key. One or several data hiders use the MSB of
some image pixels to hide additional data. Given the encrypted image containing
this additional data, with only one of those data hiding keys, the receiver can
extract the corresponding embedded data, although the image content will remain
inaccessible. With all of the embedding keys, the receiver can extract all of
the embedded data. Finally, with the encryption key, the receiver can decrypt
the received data and reconstruct the original image perfectly by exploiting
the spatial correlation of natural
images. Based on the proposed method a receiver could recover the original
image perfectly even when it does not have the data embedding key(s) and the
embedding rate is high.
Cristina Perfecto, Javier Del Ser, Mehdi Bennis
Comments: 14 pages, 6 figures
Subjects: Networking and Internet Architecture (cs.NI); Computer Science and Game Theory (cs.GT); Information Theory (cs.IT)
Recently, millimeter-wave bands have been postulated as a means to accommodate
the foreseen extreme bandwidth demands in vehicular communications, which
result from the dissemination of sensory data to nearby vehicles for enhanced
environmental awareness and improved safety level. However, the literature is
particularly scarce with regard to principled resource allocation schemes that
deal with the challenging radio conditions posed by the high mobility of
vehicular scenarios. In this work we propose a novel framework that blends
together Matching Theory and Swarm Intelligence to dynamically and efficiently
pair vehicles and optimize both transmission and reception beamwidths. This is
done by jointly considering Channel (CSI) and Queue (QSI) State Information
when establishing vehicle-to-vehicle (V2V) links. To validate the proposed
framework, simulation results are presented and discussed, in which the
throughput performance as well as the latency/reliability trade-offs of the
proposed approach are assessed and compared to several baseline approaches
recently proposed in the literature. The results obtained in our study, with
performance gains in terms of reliability and delay of up to 25% for
ultra-dense vehicular scenarios and on average 50% more paired vehicles than
some of the baselines, shed light on the operational limits and practical
feasibility of mmWave bands
as a viable radio access solution for future high-rate V2V communications.
Oliver Lang, Michael Lunglmayr, Mario Huemer
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT)
We propose a novel iterative algorithm for estimating a deterministic but
unknown parameter vector in the presence of Gaussian model uncertainties. This
iterative algorithm is based on a system model where an overall noise term
describes both the measurement noise and the noise resulting from the model
uncertainties. This overall noise term is a function of the true parameter
vector. The proposed iterative algorithm can be applied to structured as well
as unstructured models, and it outperforms prior-art algorithms for a broad
range of applications.
Rashish Tandon, Qi Lei, Alexandros G. Dimakis, Nikos Karampatziakis
Comments: 13 pages, Presented at the Machine Learning Systems Workshop at NIPS 2016
Subjects: Machine Learning (stat.ML); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Computation (stat.CO)
We propose a novel coding theoretic framework for mitigating stragglers in
distributed learning. We show how carefully replicating data blocks and coding
across gradients can provide tolerance to failures and stragglers for
synchronous Gradient Descent. We implement our scheme in MPI and show how we
compare against baseline architectures in running time and generalization
error.
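The idea of replicating data blocks and coding across gradients can be illustrated with a small three-worker example that tolerates one straggler. The sketch below is an assumption-laden illustration, not the authors' MPI implementation: the coefficient matrix `B`, the decoding table `DECODE`, and the helpers `encode` and `decode` are hypothetical names, and the coefficients are one valid choice among many. Each worker stores two of the three data blocks and sends a single linear combination of their partial gradients; the full gradient g1 + g2 + g3 is recoverable from any two workers.

```python
# Row w holds the weights worker w applies to the partial gradients (g1, g2, g3).
# A zero weight means the worker does not store that data block.
B = [
    [0.5, 1.0, 0.0],   # worker 0 sends g1/2 + g2
    [0.0, 1.0, -1.0],  # worker 1 sends g2 - g3
    [0.5, 0.0, 1.0],   # worker 2 sends g1/2 + g3
]

# Weights that combine the messages of each pair of finished workers
# into the full gradient g1 + g2 + g3.
DECODE = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def encode(worker, grads):
    """Return the coded message of `worker` given the three partial gradients."""
    dim = len(grads[0])
    return [sum(B[worker][k] * grads[k][j] for k in range(3)) for j in range(dim)]


def decode(received):
    """Recover g1 + g2 + g3 from a dict {worker_id: message} with >= 2 entries."""
    pair = tuple(sorted(received))[:2]
    a, b = DECODE[pair]
    u, v = received[pair[0]], received[pair[1]]
    return [a * x + b * y for x, y in zip(u, v)]
```

The price of this straggler tolerance is that every worker processes two data blocks instead of one.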