Elike Hodo, Xavier Bellekens, Andrew Hamilton, Pierre-louis Dubouilh, Ephraim Iorkyase, Christos Tachtatzis, Robert Atkinson
Comments: Published in The 2016 International Symposium on Networks, Computers and Communications (IEEE ISNCC’16), Hammamet, Tunisia, 2016
Subjects: Neural and Evolutionary Computing (cs.NE); Cryptography and Security (cs.CR)
The Internet of things (IoT) is still in its infancy and has attracted much
interest in many industrial sectors including medical fields, logistics
tracking, smart cities and automobiles. However, as a paradigm, it is
susceptible to a range of significant intrusion threats. This paper presents a
threat analysis of the IoT and uses an Artificial Neural Network (ANN) to
combat these threats. A multi-layer perceptron, a type of supervised ANN, is
trained using internet packet traces and then assessed on its ability to thwart
Distributed Denial of Service (DDoS/DoS) attacks. This paper focuses on the
classification of normal and threat patterns on an IoT network. The ANN
procedure is validated against a simulated IoT network. The experimental
results demonstrate 99.4% accuracy in detecting various DDoS/DoS attacks.
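To make the detection step concrete, the sketch below trains a small multi-layer perceptron to separate benign flows from flood-like DDoS/DoS flows. The flow-level features, the synthetic data and the network size are illustrative assumptions, not the authors' packet traces or experimental setup.

```python
# Minimal sketch (not the paper's pipeline): an MLP classifier on hypothetical
# per-flow features such as packet rate, mean packet size, duration and SYN ratio.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_normal = rng.normal([100, 500, 2.0, 0.1], [30, 150, 0.5, 0.05], size=(500, 4))
X_attack = rng.normal([5000, 60, 0.2, 0.9], [1500, 20, 0.1, 0.05], size=(500, 4))
X = np.vstack([X_normal, X_attack])
y = np.array([0] * 500 + [1] * 500)          # 0 = normal traffic, 1 = DDoS/DoS

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_tr)
clf = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0)
clf.fit(scaler.transform(X_tr), y_tr)
print("held-out accuracy:", clf.score(scaler.transform(X_te), y_te))
```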
Benjamin Doerr, Christian Gießen, Carsten Witt, Jing Yang
Comments: An extended abstract of this report will appear in the proceedings of the 2017 Genetic and Evolutionary Computation Conference (GECCO 2017)
Subjects: Neural and Evolutionary Computing (cs.NE)
We propose a new way to self-adjust the mutation rate in population-based
evolutionary algorithms. Roughly speaking, it consists of creating half the
offspring with a mutation rate that is twice the current mutation rate and the
other half with half the current rate. The mutation rate is then updated to the
rate used in that subpopulation which contains the best offspring.
We analyze how the $(1+\lambda)$ evolutionary algorithm with this
self-adjusting mutation rate optimizes the OneMax test function. We prove that
this dynamic version of the $(1+\lambda)$ EA finds the optimum in an expected
optimization time (number of fitness evaluations) of
$O(n\lambda/\log\lambda + n\log n)$. This time is asymptotically smaller than
the optimization time of the classic $(1+\lambda)$ EA. Previous work shows that
this performance is best possible among all $\lambda$-parallel mutation-based
unbiased black-box algorithms.
This result shows that the new way of adjusting the mutation rate can find
optimal dynamic parameter values on the fly. Since our adjustment mechanism is
simpler than the ones previously used for adjusting the mutation rate and does
not have parameters itself, we are optimistic that it will find other
applications.
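As a concrete illustration of the adjustment rule, here is a minimal sketch of the two-rate self-adjusting $(1+\lambda)$ EA on OneMax: half the offspring mutate at twice the current rate, half at half of it, and the rate of the best offspring is adopted. The rate bounds, $\lambda$ and $n$ below are assumptions for illustration, not the paper's exact setup or tie-breaking rules.

```python
# Sketch of the two-rate self-adjusting (1+lambda) EA on OneMax (illustrative parameters).
import random

def onemax(x):
    return sum(x)

def mutate(x, p):
    return [1 - b if random.random() < p else b for b in x]

def self_adjusting_ea(n=100, lam=8, max_evals=200_000):
    parent = [random.randint(0, 1) for _ in range(n)]
    r = 2.0                                   # mutation probability is r / n
    evals = 0
    while onemax(parent) < n and evals < max_evals:
        offspring = []
        for i in range(lam):
            rate = (2 * r if i < lam // 2 else r / 2) / n   # two subpopulations
            offspring.append((mutate(parent, rate), rate))
            evals += 1
        best, best_rate = max(offspring, key=lambda o: onemax(o[0]))
        if onemax(best) >= onemax(parent):
            parent = best
        r = min(max(best_rate * n, 2.0), n / 4)  # adopt the winning rate, kept in bounds
    return evals

print("fitness evaluations:", self_adjusting_ea())
```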
Mohammad Javad Shafiee, Elnaz Barshan, Alexander Wong
Comments: 8 pages. arXiv admin note: substantial text overlap with arXiv:1609.01360
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
A promising paradigm for achieving highly efficient deep neural networks is
the idea of evolutionary deep intelligence, which mimics biological evolution
processes to progressively synthesize more efficient networks. A crucial design
factor in evolutionary deep intelligence is the genetic encoding scheme used to
simulate heredity and determine the architectures of offspring networks. In
this study, we take a deeper look at the notion of synaptic cluster-driven
evolution of deep neural networks which guides the evolution process towards
the formation of a highly sparse set of synaptic clusters in offspring
networks. Utilizing a synaptic cluster-driven genetic encoding, the
probabilistic encoding of synaptic traits considers not only individual
synaptic properties but also inter-synaptic relationships within a deep neural
network. This process results in highly sparse offspring networks which are
particularly tailored for parallel computational devices such as GPUs and deep
neural network accelerator chips. Comprehensive experimental results using four
well-known deep neural network architectures (LeNet-5, AlexNet, ResNet-56, and
DetectNet) on two different tasks (object categorization and object detection)
demonstrate the efficiency of the proposed method. The cluster-driven genetic
encoding scheme synthesizes networks that can achieve state-of-the-art
performance with a significantly smaller number of synapses than the
original ancestor network ($\sim$125-fold decrease in synapses for MNIST).
Furthermore, the improved cluster efficiency in the generated offspring
networks ($\sim$9.71-fold decrease in clusters for MNIST and a $\sim$8.16-fold
decrease in clusters for KITTI) is particularly useful for accelerated
performance on parallel computing hardware architectures such as those in GPUs
and deep neural network accelerator chips.
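The following sketch is one possible reading of a cluster-driven probabilistic encoding, shown only to make the idea tangible: a synapse is carried over to the offspring only if its whole cluster is first sampled from a cluster-level probability, and the synapse itself is then sampled from a synapse-level probability. The cluster assignment, the probability formulas and the layer size are illustrative assumptions, not the authors' encoding scheme.

```python
# Illustrative (assumed) cluster-then-synapse sampling for one weight matrix.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))                 # ancestor layer weights (hypothetical)
clusters = rng.integers(0, 8, size=W.shape)   # hypothetical synapse-to-cluster map

def synthesize_offspring(W, clusters):
    absW = np.abs(W)
    mask = np.zeros_like(W, dtype=bool)
    for c in np.unique(clusters):
        in_c = clusters == c
        p_cluster = min(1.0, absW[in_c].mean() / absW.mean())   # inter-synaptic (cluster) term
        if rng.random() > p_cluster:
            continue                                            # drop the whole cluster
        p_syn = absW[in_c] / absW[in_c].max()                   # individual synaptic term
        mask[in_c] = rng.random(p_syn.shape) < p_syn
    return W * mask

offspring = synthesize_offspring(W, clusters)
print("synapses kept:", int((offspring != 0).sum()), "of", W.size)
```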
Aditya Shukla, Vinay Kumar, Udayan Ganguly
Comments: Eight pages, ten figures and two tables
Subjects: Neural and Evolutionary Computing (cs.NE)
Spiking Neural Networks (SNNs) naturally invite hardware implementation as
they are based on biology. For learning, spike-timing-dependent plasticity
(STDP) may be implemented using energy-efficient waveform superposition on
memristor-based synapses. However, system-level implementation poses three
challenges. First, a classic dilemma is that recognition requires reading the
current of short voltage spikes, which is disturbed by the large voltage
waveforms simultaneously applied to the same memristor for real-time learning,
i.e., the simultaneous read-write dilemma. Second, the hardware needs to
exactly replicate the software implementation for easy adaptation of the
algorithm to hardware. Third, the devices used in hardware simulations must be
realistic. In this paper, we present an approach that addresses these
concerns. First, learning and recognition occur simultaneously and
asynchronously in separate arrays in real time, avoiding complex,
non-biomimetic clock-based signal management. Second, we show that the
hardware emulates the software at every stage by comparing SPICE
(circuit-simulator) and MATLAB (software implementation of the SNN algorithm)
results. As an example, the hardware shows 97.5 per cent classification
accuracy on the Fisher Iris dataset, equivalent to the software. Third, STDP
is implemented using a model of a synaptic device based on an HfO2 memristor.
We show that an increasingly realistic memristor model slightly reduces the
hardware performance (85 per cent), highlighting the need to engineer RRAM
characteristics specifically for SNNs.
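For readers unfamiliar with STDP, the sketch below shows the standard pair-based exponential STDP window that such memristive synapses are designed to emulate; the time constants and amplitudes are generic textbook values, not the device parameters of this paper.

```python
# Generic pair-based STDP rule (illustrative constants, times in milliseconds).
import numpy as np

def stdp_dw(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    dt = t_post - t_pre
    if dt >= 0:
        return a_plus * np.exp(-dt / tau_plus)    # pre before post: potentiation
    return -a_minus * np.exp(dt / tau_minus)      # post before pre: depression

# Sweep the timing difference to trace out the classic STDP window.
for dt in (-40, -10, -1, 1, 10, 40):
    print(f"dt = {dt:+d} ms -> dw = {stdp_dw(0.0, float(dt)):+.4f}")
```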
Yaoyuan Zhang, Zhenxu Ye, Yansong Feng, Dongyan Zhao, Rui Yan
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Sentence simplification reduces semantic complexity to benefit people with
language impairments. Previous simplification studies at the sentence level and
word level have achieved promising results but still face great challenges. In
sentence-level studies, sentences are fluent after simplification but sometimes
are not really simplified. In word-level studies, words are simplified but
grammar errors can be introduced because the words are used differently before
and after simplification. In this paper, we propose a two-step simplification
framework by combining both the word-level and the sentence-level
simplifications, making use of their corresponding advantages. Based on the
two-step framework, we implement a novel constrained neural generation model to
simplify sentences given simplified words. The final results on Wikipedia and
Simple Wikipedia aligned datasets indicate that our method yields better
performance than various baselines.
Rishidev Chaudhuri, Ila Fiete
Comments: 35 pages, 7 figures
Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE)
The brain must robustly store a large number of memories, corresponding to
the many events and scenes a person encounters over a lifetime. However, the
number of memory states in existing neural network models either grows only weakly
with network size, or recall performance fails catastrophically with vanishingly
little noise. Here we show that it is possible to construct an associative
content-addressable memory (ACAM) with exponentially many stable states and
robust error-correction. The network possesses expander graph connectivity on a
restricted Boltzmann machine architecture. The expansion property allows simple
neural network dynamics to perform at par with modern error-correcting codes.
Appropriate networks can be constructed with sparse random connections combined
with glomerular nodes and a local associative learning rule, using low
dynamic-range weights. Thus, sparse quasi-random constraint structures —
characteristic of an important class of modern error-correcting codes — may
provide for high-performance computation in artificial neural networks and the
brain.
Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a new autoencoder-type architecture that is trainable in an
unsupervised mode, sustains both generation and inference, and has the quality
of conditional and unconditional samples boosted by adversarial learning.
Unlike previous hybrids of autoencoders and adversarial networks, the
adversarial game in our approach is set up directly between the encoder and the
generator, and no external mappings are trained in the process of learning. The
game objective compares the divergences of each of the real and the generated
data distributions with the canonical distribution in the latent space. We show
that the direct generator-vs-encoder game leads to a tight coupling of the two
components, resulting in samples and reconstructions of a comparable quality to
some recently-proposed more complex architectures.
Miguel A Bautista, Artsiom Sanakoyeu, Björn Ommer
Comments: Accepted for publication at IEEE Computer Vision and Pattern Recognition 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Unsupervised learning of visual similarities is of paramount importance to
computer vision, particularly due to the lack of training data for fine-grained
similarities. Deep learning of similarities is often based on relationships
between pairs or triplets of samples. Many of these relations are unreliable
and mutually contradicting, implying inconsistencies when trained without
supervision information that relates different tuples or triplets to each
other. To overcome this problem, we use local estimates of reliable
(dis-)similarities to initially group samples into compact surrogate classes
and use local partial orders of samples to classes to link classes to each
other. Similarity learning is then formulated as a partial ordering task with
soft correspondences of all samples to classes. Adopting a strategy of
self-supervision, a CNN is trained to optimally represent samples in a mutually
consistent manner while updating the classes. The similarity learning and
grouping procedure are integrated in a single model and optimized jointly. The
proposed unsupervised approach shows competitive performance on detailed pose
estimation and object classification.
Steffen Wolf, Lukas Schott, Ullrich Köthe, Fred Hamprecht
Comments: The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Learned boundary maps are known to outperform hand-crafted ones as a basis
for the watershed algorithm. We show, for the first time, how to train
watershed computation jointly with boundary map prediction. The estimator for
the merging priorities is cast as a neural network that is convolutional
(over space) and recurrent (over iterations). The latter allows learning of
complex shape priors. The method gives the best known seeded segmentation
results on the CREMI segmentation challenge.
Maedeh Aghaei, Federico Parezzan, Mariella Dimiccoli, Petia Radeva, Marco Cristani
Comments: To appear in the 12th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2017)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In our society and century, clothing is no longer used only as a means of
body protection. Our paper builds upon the evidence, studied within the social
sciences, that clothing brings a clear communicative message in terms of social
signals, influencing the impression and behaviour of others towards a person.
In fact, clothing correlates with personality traits, both in terms of
self-assessment and assessments that unacquainted people give to an individual.
The consequences of these facts are important: the influence of clothing on the
decision making of individuals has been investigated in the literature, showing
that it represents a discriminative factor to differentiate among diverse
groups of people. Unfortunately, this has been observed after cumbersome and
expensive manual annotations, on very restricted populations, limiting the
scope of the resulting claims. With this position paper, we want to sketch the
main steps of the very first systematic analysis, driven by social signal
processing techniques, of the relationship between clothing and social signals,
both sent and perceived. Thanks to human parsing technologies, which exhibit
high robustness owing to deep learning architectures, we are now able to
isolate visual patterns characterising a large variety of garments. These
algorithms will be used to capture statistical relations on a large corpus of
evidence to confirm the sociological findings and to go beyond the state of the
art.
Xiaoming Deng, Shuo Yang, Yinda Zhang, Ping Tan, Liang Chang, Hongan Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a novel 3D neural network architecture for 3D hand pose estimation
from a single depth image. Different from previous works that mostly run on 2D
depth image domain and require intermediate or post process to bring in the
supervision from 3D space, we convert the depth map to a 3D volumetric
representation, and feed it into a 3D convolutional neural network (CNN) to
directly produce the pose in 3D, requiring no further processing. Our system
does not require a ground truth reference point for initialization, and our
network architecture naturally integrates both local features and global
context in 3D space. To increase the coverage of the hand pose space of the
training data, we render synthetic depth images by transferring hand poses
from existing real image datasets. We evaluate our algorithm on two public
benchmarks and achieve state-of-the-art performance. The synthetic hand pose
dataset will be made available.
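To illustrate the kind of depth-to-volume conversion described, here is a minimal sketch that back-projects a depth map with a pinhole camera model and fills a voxel occupancy grid; the camera intrinsics, grid resolution and bounds are assumptions, and the paper's actual volumetric representation may differ.

```python
# Assumed depth-to-voxel conversion: back-project pixels, mark occupied cells.
import numpy as np

def depth_to_voxels(depth, fx=475.0, fy=475.0, grid=32,
                    bounds=((-150, 150), (-150, 150), (200, 500))):
    h, w = depth.shape
    cx, cy = w / 2, h / 2
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    pts = pts[z.reshape(-1) > 0]                        # drop invalid depth values
    vol = np.zeros((grid, grid, grid), dtype=np.float32)
    lo = np.array([b[0] for b in bounds], dtype=np.float32)
    hi = np.array([b[1] for b in bounds], dtype=np.float32)
    idx = ((pts - lo) / (hi - lo) * grid).astype(int)
    keep = np.all((idx >= 0) & (idx < grid), axis=1)
    vol[tuple(idx[keep].T)] = 1.0                       # occupancy grid fed to a 3D CNN
    return vol

depth = np.random.uniform(250, 450, size=(120, 160))    # synthetic depth map (mm)
print(depth_to_voxels(depth).sum(), "occupied voxels")
```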
Hamed R. Tavakoli, Jorma Laaksonen, Esa Rahtu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper revisits recognition of natural image pleasantness by employing
deep convolutional neural networks and affordable eye trackers. There exist
several approaches to recognize image pleasantness: (1) computer vision, and
(2) psychophysical signals. For natural images, computer vision approaches have
not been as successful as for abstract paintings and are lagging behind
psychophysical signals such as eye movements. Despite better results, the
scalability of eye movements is adversely affected by the sensor cost. While
the introduction of affordable sensors has helped the scalability issue by
making the sensors more accessible, the application of such sensors in a
loosely controlled human-computer interaction setup is not yet studied for
affective image tagging. On the other hand, deep convolutional neural networks
have boosted the performance of vision-based techniques significantly in recent
years. To investigate the current status in regard to affective image tagging,
we (1) introduce a new eye movement dataset using an affordable eye tracker,
(2) study the use of deep neural networks for pleasantness recognition, (3)
investigate the gap between deep features and eye movements. To meet these
ends, we record eye movements in a less controlled setup, akin to daily
human-computer interaction. We assess features from eye movements, visual
features, and their combination. Our results show that (1) recognizing natural
image pleasantness from eye movement under less restricted setup is difficult
and previously used techniques are prone to fail, and (2) visual class
categories are strong cues for predicting pleasantness, due to their
correlation with emotions, necessitating careful study of this phenomenon. This
latter finding is alarming, as some deep learning approaches may fit to the
class category bias.
Dieu Linh Tran, Robert Walecki, Ognjen (Oggi) Rudovic, Stefanos Eleftheriadis, Bjoern Schuller, Maja Pantic
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Variational (deep) parametric auto-encoders (VAE) have shown a great
potential for unsupervised extraction of latent representations from large
amounts of data. The human face exhibits an inherent hierarchy in facial
representations (encoded in facial action units (AUs) and their intensity).
This makes VAE a sophisticated method for learning facial features for AU
intensity estimation. Yet, most existing methods apply classifiers learned
separately from the encoded features. On the other hand, non-parametric
(probabilistic) approaches, such as Gaussian Processes (GPs), typically
outperform their parametric counterparts, but cannot deal easily with large
amounts of data. In this paper, we propose a novel VAE semi-parametric modeling
framework, named DeepCoder, which combines the modeling power of parametric
(convolutional) and nonparametric (ordinal GPs) VAEs, for joint learning of (1)
latent representations at multiple levels in a task hierarchy, and (2)
classification of multiple ordinal outputs (AU intensities). We show on
benchmark datasets for AU intensity estimation that the proposed DeepCoder
significantly outperforms state-of-the-art approaches, and related parametric
VAEs, deep learning and parametric models.
Xiaoyong Shen, Hongyun Gao, Xin Tao, Chao Zhou, Jiaya Jia
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Estimating correspondence between two images and extracting the foreground
object are two challenges in computer vision. With dual-lens smart phones, such
as the iPhone 7 Plus and Huawei P9, coming onto the market, two images of
slightly different views provide us with new information to unify the two
topics. We propose
a joint method to tackle them simultaneously via a joint fully connected
conditional random field (CRF) framework. The regional correspondence is used
to handle textureless regions in matching and make our CRF system
computationally efficient. Our method is evaluated over 2,000 new image pairs,
and produces promising results on challenging portrait images.
Ryo Yonetani, Vishnu Naresh Boddeti, Kris M. Kitani, Yoichi Sato
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
We propose a privacy-preserving framework for learning visual classifiers by
leveraging distributed private image data. This framework is designed to
aggregate multiple classifiers updated locally using private data and to ensure
that no private information about the data is exposed during its learning
procedure. We utilize a homomorphic cryptosystem that can aggregate the local
classifiers while they are encrypted and thus kept secret. To overcome the high
computational cost of homomorphic encryption of high-dimensional classifiers,
we (1) impose sparsity constraints on local classifier updates and (2) propose
a novel efficient encryption scheme named doubly-permuted homomorphic
encryption (DPHE) which is tailored to sparse high-dimensional data. DPHE (i)
decomposes sparse data into its constituent non-zero values and their
corresponding support indices, (ii) applies homomorphic encryption only to the
non-zero values, and (iii) employs double permutations on the support indices
to make them secret. Our experimental evaluation on several public datasets
demonstrates that the proposed approach significantly outperforms other
privacy-preserving methods and achieves comparable performance against
state-of-the-art visual recognition methods without privacy preservation.
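To make the data layout of DPHE easier to picture, the sketch below splits a sparse classifier update into its non-zero values and support indices, then pads and permutes them; the homomorphic cryptosystem is replaced by a placeholder, and the padding and single slot permutation shown here only hint at the full double-permutation protocol, so treat every detail as an illustrative assumption.

```python
# Assumed pre-encryption layout for a sparse update vector (crypto is stubbed out).
import numpy as np

rng = np.random.default_rng(0)

def encrypt(values):
    # Placeholder for an additively homomorphic cryptosystem; values are
    # returned unchanged only to keep this sketch runnable.
    return values.copy()

def dphe_encode(w, pad_to):
    support = np.flatnonzero(w)                          # indices of non-zero entries
    values = w[support]
    pad = pad_to - len(support)                          # hide the true sparsity level
    unused = np.setdiff1d(np.arange(len(w)), support)
    support = np.concatenate([support, rng.choice(unused, pad, replace=False)])
    values = np.concatenate([values, np.zeros(pad)])
    perm = rng.permutation(len(support))                 # shuffle the slots
    # A second, independent permutation of the index space would complete the
    # "double permutation"; it is omitted here for brevity.
    return encrypt(values[perm]), support[perm]

w = np.zeros(1000)
w[rng.choice(1000, 20, replace=False)] = rng.normal(size=20)
enc_vals, shuffled_support = dphe_encode(w, pad_to=32)
print(len(enc_vals), "encrypted slots; first support indices:", shuffled_support[:5])
```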
Franziska Mueller, Dushyant Mehta, Oleksandr Sotnychenko, Srinath Sridhar, Dan Casas, Christian Theobalt
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present an approach for real-time, robust and accurate hand pose
estimation from moving egocentric RGB-D cameras in cluttered real environments.
Existing methods typically fail for hand-object interactions in cluttered
scenes imaged from egocentric viewpoints, common for virtual or augmented
reality applications. Our approach uses two subsequently applied Convolutional
Neural Networks (CNNs) to localize the hand and regress 3D joint locations.
Hand localization is achieved by using a CNN to estimate the 2D position of the
hand center in the input, even in the presence of clutter and occlusions. The
localized hand position, together with the corresponding input depth value, is
used to generate a normalized cropped image that is fed into a second CNN to
regress relative 3D hand joint locations in real time. For added accuracy,
robustness and temporal stability, we refine the pose estimates using a
kinematic pose tracking energy. To train the CNNs, we introduce a new
photorealistic dataset that uses a merged reality approach to capture and
synthesize large amounts of annotated data of natural hand interaction in
cluttered scenes. Through quantitative and qualitative evaluation, we show that
our method is robust to self-occlusion and occlusions by objects, particularly
in moving egocentric perspectives.
Yuta Matsuzaki, Kazushige Okayasu, Takaaki Imanari, Naomichi Kobayashi, Yoshihiro Kanehara, Ryousuke Takasawa, Akio Nakamura, Hirokatsu Kataoka
Comments: 4 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we aim to estimate the winner of a world-wide film festival from
the exhibited movie poster. The task is extremely challenging because the
estimation must be done with only an exhibited movie poster, without any film
ratings or box-office takings. In order to tackle this problem, we have
created a new database which consists of all movie posters included in the
four biggest film festivals. The movie poster database (MPDB) contains
historic movies spanning over 80 years which were nominated for a movie award
each year. We apply several feature types, namely hand-crafted, mid-level and
deep features, to extract various information from a movie poster. Our
experiments yield suggestive findings: for example, Academy Award estimation
works better with a color feature, and a facial emotion feature generally
performs well on the MPDB. The paper suggests a possibility of modeling human
taste for movie recommendation.
Weidong Yin, Yanwei Fu, Leonid Sigal, Xiangyang Xue
Comments: 10 pages, submitted to ICCV 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Generating and manipulating human facial images using high-level attribute
controls are important and interesting problems. The models proposed in
previous work can solve one of these two problems (generation or manipulation),
but not both coherently. This paper proposes a novel model that learns how to
both generate and modify the facial image from high-level semantic attributes.
Our key idea is to formulate a Semi-Latent Facial Attribute Space (SL-FAS) to
systematically learn the relationship between user-defined and latent attributes,
as well as between those attributes and RGB imagery. As part of this newly
formulated space, we propose a new model — SL-GAN which is a specific form of
Generative Adversarial Network. Finally, we present an iterative training
algorithm for SL-GAN. The experiments on recent CelebA and CASIA-WebFace
datasets validate the effectiveness of our proposed framework. We will also
make data, pre-trained models and code available.
Marc Bolaños, Álvaro Peris, Francisco Casacuberta, Sergi Soler, Petia Radeva
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Egocentric vision consists of acquiring images throughout the day from a
first-person point of view using wearable cameras. The automatic analysis of
this information allows the discovery of daily patterns that can improve the
quality of life of the user. A natural topic that arises in egocentric vision
is storytelling, that is, how to understand and tell the story lying behind
the pictures.
In this paper, we tackle storytelling as an egocentric sequences description
problem. We propose a novel methodology that exploits information from
temporally neighboring events, matching precisely the nature of egocentric
sequences. Furthermore, we present a new method for multimodal data fusion
consisting of a multi-input attention recurrent network. We also publish the
first dataset for egocentric image sequences description, consisting of 1,339
events with 3,991 descriptions, from 55 days acquired by 11 people.
Furthermore, we prove that our proposal outperforms classical attentional
encoder-decoder methods for video description.
Abhijit Guha Roy, Sailesh Conjeti, Sri Phani Krishna Karri, Debdoot Sheet, Amin Katouzian, Christian Wachinger, Nassir Navab
Comments: Manuscript Under Review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Optical coherence tomography (OCT) is extensively used for diagnosis of
diabetic macular edema due to its non-invasive imaging based assessment of the
retinal layers. In this paper, we propose a new fully convolutional deep
learning architecture, termed ReLayNet, for segmentation of retinal layers and
fluid masses in eye OCT scans. ReLayNet uses a contracting path of
convolutional blocks (encoders) to learn a hierarchy of contextual features,
followed by an expansive path of convolutional blocks (decoders) for semantic
segmentation. Additionally, skip connections relaying encoder outputs to
matched decoder inputs are introduced to recover spatial information lost
during downsampling. ReLayNet is trained with stochastic gradient descent to
optimize a joint loss function comprising both weighted logistic regression
and Dice overlap loss. The framework is validated on a publicly available
benchmark dataset with comparisons against five state-of-the-art segmentation
methods which include two deep learning based approaches. Additionally, eight
incremental baselines are defined and compared with, to validate the individual
contributions of the proposed framework. We demonstrate that ReLayNet can
reliably segment the retinal layers and accumulated fluids with improved
performance in retinal thickness estimation and contour delineation. With a
segmentation time of 5s per volume, it is well suited for clinical
applications.
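As a rough illustration of the joint objective, the sketch below combines a pixel-weighted multi-class cross-entropy term with a soft Dice overlap term; the pixel weights, smoothing constant and mixing coefficient are assumptions, not ReLayNet's exact loss.

```python
# Assumed joint loss: weighted cross-entropy plus soft Dice, in plain numpy.
import numpy as np

def joint_loss(probs, onehot, pixel_weights, alpha=0.5, eps=1e-7):
    """probs, onehot: (N, C) arrays of predictions and labels; pixel_weights: (N,)."""
    ce = -(pixel_weights * np.sum(onehot * np.log(probs + eps), axis=1)).mean()
    inter = np.sum(probs * onehot, axis=0)
    dice = 1.0 - ((2 * inter + eps) / (probs.sum(axis=0) + onehot.sum(axis=0) + eps)).mean()
    return alpha * ce + (1 - alpha) * dice

# Toy example: 4 pixels, 3 classes (e.g. two retinal layers plus fluid).
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6], [0.3, 0.4, 0.3]])
onehot = np.eye(3)[[0, 1, 2, 1]]
weights = np.array([1.0, 1.0, 2.0, 1.0])     # up-weight pixels near layer boundaries
print("joint loss:", joint_loss(probs, onehot, weights))
```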
Dan Xu, Elisa Ricci, Wanli Ouyang, Xiaogang Wang, Nicu Sebe
Comments: Accepted as a spotlight paper at CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper addresses the problem of depth estimation from a single still
image. Inspired by recent works on multi-scale convolutional neural networks
(CNN), we propose a deep model which fuses complementary information derived
from multiple CNN side outputs. Different from previous methods, the
integration is obtained by means of continuous Conditional Random Fields
(CRFs). In particular, we propose two different variations, one based on a
cascade of multiple CRFs, the other on a unified graphical model. By designing
a novel CNN implementation of mean-field updates for continuous CRFs, we show
that both proposed models can be regarded as sequential deep networks and that
training can be performed end-to-end. Through extensive experimental evaluation
we demonstrate the effectiveness of the proposed approach and establish new
state-of-the-art results on publicly available datasets.
Upal Mahbub, Sayantan Sarkar, Rama Chellappa
Comments: 18 pages, 22 figures, 3 tables, submitted to IEEE Transactions on Image Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Generic face detection algorithms do not perform well in the mobile domain
due to significant presence of occluded and partially visible faces. One
promising technique to handle the challenge of partial faces is to design face
detectors based on facial segments. In this paper two different approaches of
facial segment-based face detection are discussed, namely, proposal-based
detection and detection by end-to-end regression. Methods that follow the first
approach rely on generating face proposals that contain facial segment
information. The three detectors following this approach, namely Facial
Segment-based Face Detector (FSFD), SegFace and DeepSegFace, discussed in this
paper, perform binary classification on each proposal based on features learned
from facial segments. The process of proposal generation, however, needs to be
handled separately, which can be very time consuming, and is not truly
necessary given the nature of the active authentication problem. Hence a novel
algorithm, Deep Regression-based User Image Detector (DRUID) is proposed, which
shifts from the classification to the regression paradigm, thus obviating the
need for proposal generation. DRUID has a unique network architecture with
customized loss functions, is trained using a relatively small amount of data
by utilizing a novel data augmentation scheme and is fast since it outputs the
bounding boxes of a face and its segments in a single pass. Being robust to
occlusion by design, the facial segment-based face detection methods,
especially DRUID, show superior performance over other state-of-the-art face
detectors in terms of precision-recall and ROC curve on two mobile face
datasets.
Anoop Cherian, Basura Fernando, Mehrtash Harandi, Stephen Gould
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Most popular deep models for action recognition split video sequences into
short sub-sequences consisting of a few frames; frame-based features are then
pooled for recognizing the activity. Usually, this pooling step discards the
temporal order of the frames, which could otherwise be used for better
recognition. Towards this end, we propose a novel pooling method, generalized
rank pooling (GRP), that takes as input features from the intermediate layers
of a CNN that is trained on tiny sub-sequences, and produces as output the
parameters of a subspace which (i) provides a low-rank approximation to the
features and (ii) preserves their temporal order. We propose to use these
parameters as a compact representation for the video sequence, which is then
used in a classification setup. We formulate an objective for computing this
subspace as a Riemannian optimization problem on the Grassmann manifold, and
propose an efficient conjugate gradient scheme for solving it. Experiments on
several activity recognition datasets show that our scheme leads to
state-of-the-art performance.
Dan Wang, Heyan Huang, Chi Lu, Bo-Si Feng, Liqiang Nie, Guihua Wen, Xian-Ling Mao
Comments: 7 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recently, hashing methods have been widely used in large-scale image
retrieval. However, most existing hashing methods do not consider the
hierarchical relation of labels, which means that they ignore the rich
information stored in the hierarchy. Moreover, most previous works treat
each bit in a hash code equally, which does not meet the scenario of
hierarchical labeled data. In this paper, we propose a novel deep hashing
method, called supervised hierarchical deep hashing (SHDH), to perform hash
code learning for hierarchical labeled data. Specifically, we define a novel
similarity formula for hierarchical labeled data by weighting each layer, and
design a deep convolutional neural network to obtain a hash code for each data
point. Extensive experiments on several real-world public datasets show that
the proposed method outperforms the state-of-the-art baselines in the image
retrieval task.
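One way to picture a layer-weighted similarity for hierarchically labeled data is sketched below: two items are more similar the deeper their label paths agree, with an assumed geometric weight per layer. This is an illustration of the idea, not the paper's exact formula.

```python
# Assumed prefix-wise, layer-weighted similarity for hierarchical labels.
def hierarchical_similarity(labels_a, labels_b, layer_weights=None):
    """labels_a, labels_b: label paths from root to leaf, e.g. ['animal', 'dog', 'husky']."""
    depth = min(len(labels_a), len(labels_b))
    if layer_weights is None:
        layer_weights = [2.0 ** -k for k in range(depth)]   # assumed geometric weights
    total = sum(layer_weights[:depth])
    score = 0.0
    for k in range(depth):
        if labels_a[k] != labels_b[k]:
            break                                            # agreement must be prefix-wise
        score += layer_weights[k]
    return score / total

print(hierarchical_similarity(['animal', 'dog', 'husky'], ['animal', 'dog', 'poodle']))  # partial match
print(hierarchical_similarity(['animal', 'dog', 'husky'], ['vehicle', 'car', 'sedan']))  # no match
```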
Li Sulimowicz, Ishfaq Ahmad
Comments: 6 pages, 5 figures, ICME conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The sheer volume and size of histopathological images (e.g., 10^6 MPixels)
underscores the need for faster and more accurate Regions-of-interest (ROI)
detection algorithms. In this paper, we propose such an algorithm, which has
four main components that help achieve greater accuracy and faster speed:
First, while using coarse-to-fine topology preserving segmentation as the
baseline, the proposed algorithm uses a superpixel regularity optimization
scheme for avoiding irregular and extremely small superpixels. Second, the
proposed technique employs a prediction strategy to focus only on important
superpixels at finer image levels. Third, the algorithm reuses the information
gained from the coarsest image level at other finer image levels. Both the
second and the third components drastically lower the complexity. Fourth, the
algorithm employs a highly effective parallelization scheme using adaptive
data partitioning, which gains high speedup. Experimental results, conducted on
the BSD500 [1] and 500 whole-slide histological images from the National Lung
Screening Trial (NLST) dataset, confirm that the proposed algorithm gained a
13-fold speedup compared with the baseline, and around 160-fold compared with
SLIC [11], without losing accuracy.
Xiaoyong Shen, Ying-Cong Chen, Xin Tao, Jiaya Jia
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a principled convolutional neural pyramid (CNP) framework for
general low-level vision and image processing tasks. It is based on the
essential finding that many applications require large receptive fields for
structure understanding. But corresponding neural networks for regression
either stack many layers or apply large kernels to achieve it, which is
computationally very costly. Our pyramid structure can greatly enlarge the
field while not sacrificing computational efficiency. Extra benefits include
adaptive network depth and progressive upsampling for quasi-real-time testing on
VGA-size input. Our method benefits a broad set of applications, such as
depth/RGB image restoration, completion, noise/artifact removal, edge
refinement, image filtering, image enhancement and colorization.
Mohammad Javad Shafiee, Elnaz Barshan, Alexander Wong
Comments: 8 pages. arXiv admin note: substantial text overlap with arXiv:1609.01360
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
A promising paradigm for achieving highly efficient deep neural networks is
the idea of evolutionary deep intelligence, which mimics biological evolution
processes to progressively synthesize more efficient networks. A crucial design
factor in evolutionary deep intelligence is the genetic encoding scheme used to
simulate heredity and determine the architectures of offspring networks. In
this study, we take a deeper look at the notion of synaptic cluster-driven
evolution of deep neural networks which guides the evolution process towards
the formation of a highly sparse set of synaptic clusters in offspring
networks. Utilizing a synaptic cluster-driven genetic encoding, the
probabilistic encoding of synaptic traits considers not only individual
synaptic properties but also inter-synaptic relationships within a deep neural
network. This process results in highly sparse offspring networks which are
particularly tailored for parallel computational devices such as GPUs and deep
neural network accelerator chips. Comprehensive experimental results using four
well-known deep neural network architectures (LeNet-5, AlexNet, ResNet-56, and
DetectNet) on two different tasks (object categorization and object detection)
demonstrate the efficiency of the proposed method. The cluster-driven genetic
encoding scheme synthesizes networks that can achieve state-of-the-art
performance with a significantly smaller number of synapses than the
original ancestor network ($\sim$125-fold decrease in synapses for MNIST).
Furthermore, the improved cluster efficiency in the generated offspring
networks ($\sim$9.71-fold decrease in clusters for MNIST and a $\sim$8.16-fold
decrease in clusters for KITTI) is particularly useful for accelerated
performance on parallel computing hardware architectures such as those in GPUs
and deep neural network accelerator chips.
Silvia Chiappa, Sébastien Racaniere, Daan Wierstra, Shakir Mohamed
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Models that can simulate how environments change in response to actions can
be used by agents to plan and act efficiently. We improve on previous
environment simulators from high-dimensional pixel observations by introducing
recurrent neural networks that are able to make temporally and spatially
coherent predictions for hundreds of time-steps into the future. We present an
in-depth analysis of the factors affecting performance, providing the most
extensive attempt to advance the understanding of the properties of these
models. We address the issue of computational inefficiency with a model that
does not need to generate a high-dimensional image at each time-step. We show
that our approach can be used to improve exploration and is adaptable to many
diverse environments, namely 10 Atari games, a 3D car racing environment, and
complex 3D mazes.
Utku Kose, Selcuk Sert
Comments: 8 pages, 6 figures
Journal-ref: Ecoforum Journal, 6, 1 (10), 2017
Subjects: Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)
Content marketing is today one of the most remarkable approaches in the
context of companies' marketing processes. The value of this kind of marketing
has grown over time, thanks to the latest developments in computer and
communication technologies. Nowadays, social media based platforms in
particular are of great importance in enabling companies to design
multimedia-oriented, interactive content. On the other hand, there is still
more to do for improved content marketing approaches. In this context, the
objective of this study is to focus on intelligent content marketing, which
can be done by using artificial intelligence. Artificial intelligence is today
one of the most remarkable research fields and can easily be applied in a
multidisciplinary way. Thus, this study aims to discuss its potential for
improving content marketing. In detail, the study enables readers to improve
their awareness of the intersection of content marketing and artificial
intelligence. Furthermore, the authors introduce some example
models of intelligent content marketing, which can be achieved by using current
Web technologies and artificial intelligence techniques.
Yaoyuan Zhang, Zhenxu Ye, Yansong Feng, Dongyan Zhao, Rui Yan
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Sentence simplification reduces semantic complexity to benefit people with
language impairments. Previous simplification studies at the sentence level and
word level have achieved promising results but still face great challenges. In
sentence-level studies, sentences are fluent after simplification but sometimes
are not really simplified. In word-level studies, words are simplified but
grammar errors can be introduced because the words are used differently before
and after simplification. In this paper, we propose a two-step simplification
framework by combining both the word-level and the sentence-level
simplifications, making use of their corresponding advantages. Based on the
two-step framework, we implement a novel constrained neural generation model to
simplify sentences given simplified words. The final results on Wikipedia and
Simple Wikipedia aligned datasets indicate that our method yields better
performance than various baselines.
Nicolas Tremblay (CNRS, GIPSA-CICS), Simon Barthelme (CNRS, GIPSA-VIBS), Pierre-Olivier Amblard (CNRS, GIPSA-CICS)
Comments: in French
Subjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM); Learning (cs.LG)
We consider the problem of sampling k-bandlimited graph signals, i.e., linear
combinations of the first k graph Fourier modes. We know that a set of k nodes
embedding all k-bandlimited signals always exists, thereby enabling their
perfect reconstruction after sampling. Unfortunately, to exhibit such a set,
one needs to partially diagonalize the graph Laplacian, which becomes
prohibitive at large scale. We propose a novel strategy based on determinantal
point processes that side-steps partial diagonalisation and enables
reconstruction with only O(k) samples. While doing so, we exhibit a new general
algorithm to sample determinantal processes, faster than the state-of-the-art
algorithm by a factor of order k.
Mohammad Javad Shafiee, Elnaz Barshan, Alexander Wong
Comments: 8 pages. arXiv admin note: substantial text overlap with arXiv:1609.01360
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
A promising paradigm for achieving highly efficient deep neural networks is
the idea of evolutionary deep intelligence, which mimics biological evolution
processes to progressively synthesize more efficient networks. A crucial design
factor in evolutionary deep intelligence is the genetic encoding scheme used to
simulate heredity and determine the architectures of offspring networks. In
this study, we take a deeper look at the notion of synaptic cluster-driven
evolution of deep neural networks which guides the evolution process towards
the formation of a highly sparse set of synaptic clusters in offspring
networks. Utilizing a synaptic cluster-driven genetic encoding, the
probabilistic encoding of synaptic traits considers not only individual
synaptic properties but also inter-synaptic relationships within a deep neural
network. This process results in highly sparse offspring networks which are
particularly tailored for parallel computational devices such as GPUs and deep
neural network accelerator chips. Comprehensive experimental results using four
well-known deep neural network architectures (LeNet-5, AlexNet, ResNet-56, and
DetectNet) on two different tasks (object categorization and object detection)
demonstrate the efficiency of the proposed method. The cluster-driven genetic
encoding scheme synthesizes networks that can achieve state-of-the-art
performance with a significantly smaller number of synapses than the
original ancestor network ($\sim$125-fold decrease in synapses for MNIST).
Furthermore, the improved cluster efficiency in the generated offspring
networks ($\sim$9.71-fold decrease in clusters for MNIST and a $\sim$8.16-fold
decrease in clusters for KITTI) is particularly useful for accelerated
performance on parallel computing hardware architectures such as those in GPUs
and deep neural network accelerator chips.
Hossein Soleimani, Adarsh Subbaswamy, Suchi Saria
Comments: The first two authors contributed equally to this work
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)
Treatment effects can be estimated from observational data as the difference
in potential outcomes. In this paper, we address the challenge of estimating
the potential outcome when treatment-dose levels can vary continuously over
time. Further, the outcome variable may not be measured at a regular frequency.
Our proposed solution represents the treatment response curves using linear
time-invariant dynamical systems—this provides a flexible means for modeling
response over time to highly variable dose curves. Moreover, for multivariate
data, the proposed method: uncovers shared structure in treatment response and
the baseline across multiple markers; and, flexibly models challenging
correlation structure both across and within signals over time. For this, we
build upon the framework of multiple-output Gaussian Processes. On simulated
and a challenging clinical dataset, we show significant gains in accuracy over
state-of-the-art models.
Eric S. Tellez, Daniela Moctezuma, Sabino Miranda-Jímenez, Mario Graff
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
The amount of textual data generated in environments such as social media,
blogs, online newspapers, and so on, has attracted the attention of the
scientific community in order to automate and improve several tasks that were
previously performed manually, such as sentiment analysis, user profiling, or text
categorization, just to mention a few. Fortunately, several of these activities
can be posed as a classification problem, i.e., a problem where one is
interested in developing a function, from a set of texts with associated
labels, capable of predicting a label given an unseen text. In this
contribution, we propose a text classifier named $\mu$TC. $\mu$TC is composed
of a number of easy-to-implement text transformations, a text representation,
and a machine learning algorithm that, when these parts are correctly
configured, produce a competitive classifier even on informal written text. We
provide a detailed description of $\mu$TC along with an extensive experimental
comparison with the relevant state-of-the-art methods. $\mu$TC was compared on
30 different datasets, obtaining the best performance (regarding accuracy) on 18 of
them. The different datasets include several problems like topic and polarity
classification, spam detection, user profiling and authorship attribution.
Furthermore, it is important to note that our approach allows the technology
to be used even by users without knowledge of machine learning and natural
language processing.
Rose Catherine, William Cohen
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG)
Recently, deep learning methods have been shown to improve the performance of
recommender systems over traditional methods, especially when review text is
available. For example, a recent model, DeepCoNN, uses neural nets to learn one
latent representation for the text of all reviews written by a target user, and
a second latent representation for the text of all reviews for a target item,
and then combines these latent representations to obtain state-of-the-art
performance on recommendation tasks. We show that (unsurprisingly) much of the
predictive value of review text comes from reviews of the target user for the
target item. We then introduce a way in which this information can be used in
recommendation, even when the target user’s review for the target item is not
available. Our model, called TransNets, extends the DeepCoNN model by
introducing an additional latent layer representing the target user-target item
pair. We then regularize this layer, at training time, to be similar to another
latent representation of the target user’s review of the target item. We show
that TransNets and extensions of it improve substantially over the previous
state-of-the-art.
Ali Mottaghi, Kayhan Behdin, Ashkan Esmaeili, Mohammadreza Heydari, Farokh Marvasti
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Learning (cs.LG); Multimedia (cs.MM)
In this paper, we design a system in order to perform the real-time beat
tracking for an audio signal. We use Onset Strength Signal (OSS) to detect the
onsets and estimate the tempos. Then, we form Cumulative Beat Strength Signal
(CBSS) by taking advantage of OSS and estimated tempos. Next, we perform peak
detection by extracting the periodic sequence of beats among all CBSS peaks. In
simulations, we can see that our proposed algorithm, Online Beat TrAckINg
(OBTAIN), outperforms state-of-the-art results in terms of prediction accuracy
while maintaining comparable and practical computational complexity. The
real-time performance can be observed visually, as illustrated in the simulations.
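As background for the OSS stage, the sketch below computes an onset strength signal as half-wave rectified spectral flux on a synthetic click track; the frame sizes and the test signal are assumptions, and the paper's exact OSS definition may differ.

```python
# Assumed onset strength signal: half-wave rectified spectral flux.
import numpy as np

def onset_strength(x, n_fft=1024, hop=512):
    window = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * window for i in range(0, len(x) - n_fft, hop)]
    mags = np.abs(np.fft.rfft(np.array(frames), axis=1))
    flux = np.diff(mags, axis=0)
    return np.maximum(flux, 0.0).sum(axis=1)      # rises in spectral energy mark onsets

# Synthetic test: clicks every 0.5 s (120 BPM) over 4 seconds of noise.
sr = 22050
x = 0.01 * np.random.randn(4 * sr)
for beat in np.arange(0, 4, 0.5):
    i = int(beat * sr)
    x[i:i + 200] += np.hanning(200)
oss = onset_strength(x)
print("strongest frames:", np.argsort(oss)[-8:])
```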
Yi-Kun Tang, Xian-Ling Mao, Heyan Huang, Guihua Wen
Comments: 7 pages
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Recently, topic modeling has been widely used to discover the abstract topics
in text corpora. Most of the existing topic models are based on the assumption
of three-layer hierarchical Bayesian structure, i.e. each document is modeled
as a probability distribution over topics, and each topic is a probability
distribution over words. However, this assumption is not optimal. Intuitively,
it is more reasonable to assume that each topic is a probability distribution
over concepts, and each concept is in turn a probability distribution over words,
i.e., adding a latent concept layer between the topic layer and the word layer in
the traditional three-layer assumption. In this paper, we verify the proposed
assumption by incorporating the new assumption in two representative topic
models, and obtain two novel topic models. Extensive experiments were conducted
among the proposed models and corresponding baselines, and the results show
that the proposed models significantly outperform the baselines in terms of
case study and perplexity, which means the new assumption is more reasonable
than the traditional one.
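The proposed assumption corresponds to a four-layer generative process, which the short sketch below simulates: a document draws a topic, the topic draws a latent concept, and the concept draws a word. The Dirichlet priors and the toy vocabulary are assumptions for illustration, not the paper's models.

```python
# Assumed topic -> concept -> word generative process (toy sizes and priors).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["gene", "protein", "cell", "stock", "market", "trade"]
K, C, V = 2, 3, len(vocab)                         # topics, concepts, vocabulary size

theta = rng.dirichlet(np.ones(K))                  # document's topic distribution
topic_concept = rng.dirichlet(np.ones(C), size=K)  # each topic: distribution over concepts
concept_word = rng.dirichlet(np.ones(V), size=C)   # each concept: distribution over words

def generate_document(n_words=10):
    words = []
    for _ in range(n_words):
        z = rng.choice(K, p=theta)                            # sample a topic
        c = rng.choice(C, p=topic_concept[z])                 # sample a concept given the topic
        words.append(vocab[rng.choice(V, p=concept_word[c])]) # sample a word given the concept
    return words

print(" ".join(generate_document()))
```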
Yaoyuan Zhang, Zhenxu Ye, Yansong Feng, Dongyan Zhao, Rui Yan
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Sentence simplification reduces semantic complexity to benefit people with
language impairments. Previous simplification studies at the sentence level and
word level have achieved promising results but still face great challenges. In
sentence-level studies, sentences are fluent after simplification but sometimes
are not really simplified. In word-level studies, words are simplified but
grammar errors can be introduced because the words are used differently before
and after simplification. In this paper, we propose a two-step simplification
framework by combining both the word-level and the sentence-level
simplifications, making use of their corresponding advantages. Based on the
two-step framework, we implement a novel constrained neural generation model to
simplify sentences given simplified words. The final results on Wikipedia and
Simple Wikipedia aligned datasets indicate that our method yields better
performance than various baselines.
Loïc Vial, Andon Tchechmedjiev, Didier Schwab
Subjects: Computation and Language (cs.CL)
This article compares four probabilistic algorithms (global algorithms) for
Word Sense Disambiguation (WSD) in terms of the number of scorer calls (local
algorithm) and the F1 score as determined by a gold-standard scorer. Two
algorithms come from the state of the art, a Simulated Annealing Algorithm
(SAA) and a Genetic Algorithm (GA), as well as two state-of-the-art
probabilistic search algorithms that we adapt to WSD for the first time,
namely a Cuckoo Search Algorithm (CSA) and a Bat Search algorithm (BS). As WSD
requires evaluating exponentially many word sense combinations (with branching
factors of up to 6 or more), probabilistic algorithms allow approximate
solutions to be found in tractable time by sampling the search space. We find
that CSA, GA and SAA all eventually converge to similar results (0.98 F1
score), but CSA gets there faster (in fewer scorer calls) and reaches 0.95 F1
before SAA, in fewer scorer calls. For BS, a strict convergence criterion
prevents it from reaching above 0.89 F1.
Edilson A. Corrêa Jr., Vanessa Queiroz Marinho, Leandro Borges dos Santos
Comments: Published in Proceedings of SemEval-2017, 5 pages
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
This paper describes our multi-view ensemble approach to SemEval-2017 Task 4
on Sentiment Analysis in Twitter, specifically, the Message Polarity
Classification subtask for English (subtask A). Our system is a voting
ensemble, where each base classifier is trained in a different feature space.
The first space is a bag-of-words model and has a Linear SVM as base
classifier. The second and third spaces are two different strategies of
combining word embeddings to represent sentences and use a Linear SVM and a
Logistic Regressor as base classifiers. The proposed system was ranked 18th out
of 38 systems considering F1 score and 20th considering recall.
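A minimal multi-view voting ensemble in this spirit is sketched below: each base classifier is trained on a different feature space and predictions are combined by majority vote. The toy data, and the n-gram views standing in for the paper's word-embedding representations, are illustrative assumptions.

```python
# Assumed multi-view voting ensemble for polarity classification.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["great phone, love it", "worst service ever", "absolutely fantastic",
         "terrible and slow", "very happy with this", "not worth the money",
         "works perfectly fine", "broke after one day"]
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0])      # 1 = positive, 0 = negative

# Each "view" pairs a different feature space with its own base classifier.
views = [
    make_pipeline(TfidfVectorizer(), LinearSVC()),                                         # bag-of-words + SVM
    make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000)), # word n-grams + LR
    make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)), LinearSVC()),   # char n-grams + SVM
]
for view in views:
    view.fit(texts, labels)

def vote(new_texts):
    preds = np.array([view.predict(new_texts) for view in views])  # (n_views, n_texts)
    return (preds.sum(axis=0) >= 2).astype(int)                    # majority of three views

print(vote(["really great experience", "awful, do not buy"]))
```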
Steffen Eger, Erik-Lân Do Dinh, Ilia Kuznetsov, Masoud Kiaeeha, Iryna Gurevych
Subjects: Computation and Language (cs.CL)
This paper describes our approach to the SemEval 2017 Task 10: “Extracting
Keyphrases and Relations from Scientific Publications”, specifically to Subtask
(B): “Classification of identified keyphrases”. We explored three different
deep learning approaches: a character-level convolutional neural network (CNN),
a stacked learner with an MLP meta-classifier, and an attention based Bi-LSTM.
From these approaches, we created an ensemble of differently
hyper-parameterized systems, achieving a micro-F1 score of 0.63 on the test
data. Our approach ranks 2nd (score of 1st placed system: 0.64) out of four
according to this official score. However, we erroneously trained 2 out of 3
neural nets (the stacker and the CNN) on only roughly 15% of the full data,
namely, the original development set. When trained on the full data
(training+development), our ensemble has a micro-F1 score of 0.69. Our code
is available from this https URL
Rik van Noord, Johan Bos
Comments: To appear in Proceedings of SemEval, 2017
Subjects: Computation and Language (cs.CL)
We evaluate a semantic parser based on a character-based sequence-to-sequence
model in the context of the SemEval-2017 shared task on semantic parsing for
AMRs. With data augmentation, super characters, and POS-tagging we gain major
improvements in performance compared to a baseline model. The overall accuracy
however is still lower than a state-of-the-art AMR parser. An ensemble
combining our neural semantic parser with an existing, traditional parser
yields a small gain in performance.
Nathan Schneider, Jena D. Hwang, Archna Bhatia, Na-Rae Han, Vivek Srikumar, Tim O'Gorman, Omri Abend
Subjects: Computation and Language (cs.CL)
This document describes an inventory of 50 semantic labels designed to
characterize the use of adpositions and case markers at a somewhat coarse level
of granularity. Version 2 is a revision of the supersense inventory proposed
for English by Schneider et al. (2015, 2016) and documented in PrepWiki
(henceforth “v1”), which in turn was based on previous schemes. The present
inventory was developed after extensive review of the v1 corpus annotations for
English, as well as consideration of adposition and case phenomena in Hebrew,
Hindi, and Korean. Examples in this document are limited to English; a
multilingual and more detailed online lexical resource is forthcoming.
Yi-Kun Tang, Xian-Ling Mao, Heyan Huang, Guihua Wen
Comments: 7 pages
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Recently, topic modeling has been widely used to discover the abstract topics
in text corpora. Most of the existing topic models are based on the assumption
of three-layer hierarchical Bayesian structure, i.e. each document is modeled
as a probability distribution over topics, and each topic is a probability
distribution over words. However, this assumption is not optimal. Intuitively,
it is more reasonable to assume that each topic is a probability distribution
over concepts, and each concept is in turn a probability distribution over words,
i.e., adding a latent concept layer between the topic layer and the word layer in
the traditional three-layer assumption. In this paper, we verify the proposed
assumption by incorporating the new assumption in two representative topic
models, and obtain two novel topic models. Extensive experiments were conducted
among the proposed models and corresponding baselines, and the results show
that the proposed models significantly outperform the baselines in terms of
case study and perplexity, which means the new assumption is more reasonable
than the traditional one.
Vicky Zayats, Mari Ostendorf
Comments: Submitted to TACL
Subjects: Computation and Language (cs.CL)
This paper presents a novel approach for modeling threaded discussions on
social media using a graph-structured bidirectional LSTM which represents both
hierarchical and temporal conversation structure. In experiments with a task of
predicting popularity of comments in Reddit discussions, the proposed model
outperforms a node-independent architecture for different sets of input
features. Analyses show a benefit to the model over the full course of the
discussion, improving detection in both early and late stages. Further, the use
of language cues with the bidirectional tree state updates helps with
identifying controversial comments.
Eric S. Tellez, Daniela Moctezuma, Sabino Miranda-Jímenez, Mario Graff
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
The amount of textual data generated in environments such as social media,
blogs, and online newspapers has attracted the attention of the scientific
community, which seeks to automate and improve several tasks that were
previously performed manually, such as sentiment analysis, user profiling, or
text categorization, to mention a few. Fortunately, several of these activities
can be posed as a classification problem, i.e., a problem where one is
interested in developing a function, from a set of texts with associated
labels, capable of predicting a label given an unseen text. In this
contribution, we propose a text classifier, named (mu)TC. (mu)TC is composed
of a number of easy-to-implement text transformations, a text representation,
and a machine learning algorithm that together produce a competitive classifier,
even on informally written text, when these parts are correctly configured. We provide a
detailed description of (mu)TC along with an extensive experimental comparison
with the relevant state-of-the-art methods. (mu)TC was compared on 30
different datasets, obtaining the best performance (in terms of accuracy) on 18
of them. The datasets cover several problems, including topic and polarity
classification, spam detection, user profiling, and authorship attribution.
Furthermore, our approach makes the technology accessible even to users without
knowledge of machine learning and natural language processing.
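A minimal sketch of the kind of pipeline the abstract describes (a text transformation, a text representation, and a classifier), using scikit-learn components as stand-ins; the actual (mu)TC transformations and its configuration search are not reproduced here:

    # Hypothetical stand-in for a (mu)TC-style pipeline: lowercasing as the text
    # transformation, character n-gram TF-IDF as the representation, and a linear
    # SVM as the classifier; the real system searches over many such configurations.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    texts = ["great product, loved it", "terrible, would not buy", "ok but overpriced"]
    labels = ["pos", "neg", "neg"]

    clf = Pipeline([
        ("represent", TfidfVectorizer(lowercase=True, analyzer="char_wb", ngram_range=(2, 4))),
        ("classify", LinearSVC()),
    ])
    clf.fit(texts, labels)
    print(clf.predict(["loved this product"]))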
Rose Catherine, William Cohen
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG)
Recently, deep learning methods have been shown to improve the performance of
recommender systems over traditional methods, especially when review text is
available. For example, a recent model, DeepCoNN, uses neural nets to learn one
latent representation for the text of all reviews written by a target user, and
a second latent representation for the text of all reviews for a target item,
and then combines these latent representations to obtain state-of-the-art
performance on recommendation tasks. We show that (unsurprisingly) much of the
predictive value of review text comes from reviews of the target user for the
target item. We then introduce a way in which this information can be used in
recommendation, even when the target user’s review for the target item is not
available. Our model, called TransNets, extends the DeepCoNN model by
introducing an additional latent layer representing the target user-target item
pair. We then regularize this layer, at training time, to be similar to another
latent representation of the target user’s review of the target item. We show
that TransNets and extensions of it improve substantially over the previous
state-of-the-art.
Xavier Bellekens, Christos Tachtatzis, Robert Atkinson, Craig Renfrew, Tony Kirkham
Comments: Published in The 7th International Conference of Security of Information and Networks, SIN 2014, Glasgow, UK, September, 2014
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)
Large industrial systems that combine services and applications have become
targets for cyber criminals and are challenging from the security, monitoring
and auditing perspectives. Security log analysis is a key step for uncovering
anomalies, detecting intrusion, and enabling incident response. The constant
increase in link speeds, threats, and users produces large volumes of log data
that are increasingly difficult to analyse on a Central Processing Unit (CPU).
This paper presents a massively parallel Graphics Processing Unit (GPU) LOg
Processing (GLoP) library, which can also be used for Deep Packet Inspection
(DPI), using a prefix matching technique and harnessing the full power of
off-the-shelf technologies. GLoP implements two different algorithms using
different GPU memory types and is compared against CPU counterpart
implementations. The library can be used for processing nodes with single or
multiple GPUs as well as GPU cloud farms. The results show a throughput of
20 Gbps and demonstrate that modern GPUs can be utilised to increase the
operational speed of large-scale log processing scenarios, saving precious time
before and after an intrusion has occurred.
Xavier Bellekens, Christos Tachtatzis, Robert Atkinson, Craig Renfrew, Tony Kirkham
Comments: Published in The 7th International Conference of Security of Information and Networks, SIN 2014, Glasgow, UK, September, 2014
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)
Pattern Matching is a computationally intensive task used in many research
fields and real world applications. Due to the ever-growing volume of data to
be processed, and increasing link speeds, the number of patterns to be matched
has risen significantly. In this paper we explore the parallel capabilities of
modern General Purpose Graphics Processing Unit (GPGPU) applications for
high-speed pattern matching. A highly compressed failure-less Aho-Corasick algorithm
is presented for Intrusion Detection Systems on off-the-shelf hardware. This
approach maximises the bandwidth for data transfers between the host and the
Graphics Processing Unit (GPU). Experiments are performed on multiple alphabet
sizes, demonstrating the capabilities of the library to be used in different
research fields, while sustaining an adequate throughput for intrusion
detection systems or DNA sequencing. The work also explores the performance
impact of adequate prefix matching for different alphabet sizes and varying
numbers of patterns, achieving speeds of up to 8 Gbps and low memory
consumption for intrusion detection systems.
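For background, a failure-less variant of Aho-Corasick simply walks a trie of patterns from every starting offset of the input and never follows failure links, which suits massively parallel hardware; a compact sketch of that idea (the paper's compressed GPU data layout is not reproduced):

    def build_trie(patterns):
        """Build a nested-dict trie; accepting nodes store the matched pattern under '$'."""
        trie = {}
        for p in patterns:
            node = trie
            for ch in p:
                node = node.setdefault(ch, {})
            node["$"] = p
        return trie

    def match_all(text, trie):
        """Return (offset, pattern) for every occurrence of every pattern in text."""
        hits = []
        for i in range(len(text)):
            node = trie
            for ch in text[i:]:
                if ch not in node:
                    break
                node = node[ch]
                if "$" in node:
                    hits.append((i, node["$"]))
        return hits

    print(match_all("gattaca", build_trie(["att", "taca", "cat"])))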
Mireya Paredes, Graham Riley, Mikel Lujan
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
The Breadth-First Search (BFS) algorithm is an important building block for
graph analysis of large datasets. The parallelisation of BFS has been shown to be
challenging because of its inherent characteristics, including irregular memory
access patterns, data dependencies and workload imbalance, which limit its
scalability. We investigate the optimisation and vectorisation of the hybrid
BFS (a combination of top-down and bottom-up approaches for BFS) on the Xeon
Phi, which has advanced vector processing capabilities. The results show that
our new implementation improves performance by 33%, for a graph of one million
vertices, compared to the state-of-the-art.
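As context for the optimisation target, a hybrid BFS alternates between a top-down sweep over the frontier and a bottom-up sweep over unvisited vertices once the frontier grows large; the sketch below shows only the switching logic (the threshold and data layout are placeholders, not the paper's vectorised Xeon Phi implementation):

    def hybrid_bfs(adj, source, switch_ratio=0.05):
        """adj: dict mapping vertex -> list of neighbours. Returns vertex -> BFS level."""
        n = len(adj)
        dist = {source: 0}
        frontier = {source}
        level = 0
        while frontier:
            next_frontier = set()
            if len(frontier) < switch_ratio * n:
                # top-down step: expand every frontier vertex
                for u in frontier:
                    for v in adj[u]:
                        if v not in dist:
                            dist[v] = level + 1
                            next_frontier.add(v)
            else:
                # bottom-up step: every unvisited vertex searches for a frontier parent
                for v in adj:
                    if v not in dist and any(u in frontier for u in adj[v]):
                        dist[v] = level + 1
                        next_frontier.add(v)
            frontier = next_frontier
            level += 1
        return dist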
Joseph Tassarotti
Comments: 12 pages
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
In this paper we present a method for obtaining tail-bounds for random
variables satisfying certain probabilistic recurrences that arise in the
analysis of randomized parallel divide and conquer algorithms. In such
algorithms, some computation is initially done to process an input x, which is
then randomly split into subproblems (h_1(x), …, h_n(x)), and the algorithm
proceeds recursively in parallel on each subproblem. The total work on input x,
W(x), then satisfies a probabilistic recurrence of the form (W(x) = a(x) +
sum_{i=1}^n W (h_i(x))), and the span (the longest chain of sequential
dependencies), satisfies (S(x) = b(x) + max_{i=1}^n S(h_i(x))), where a(x) and
b(x) are the work and span to split x and combine the results of the recursive
calls.
Karp has previously presented methods for obtaining tail-bounds in the case
when n = 1, and under certain stronger assumptions for the work-recurrence when
n > 1, but left open the question of the span-recurrence. We first show how to
extend his technique to handle the span-recurrence. We then show that in some
cases, the work-recurrence can be bounded under simpler assumptions than Karp’s
by transforming it into a related span-recurrence and applying our first
result. We demonstrate our results by deriving tail bounds for the work and
span of quicksort and the height of a randomly generated binary search tree.
Samuel Pollard, Boyana Norris
Comments: 12 pages, 4 figures, Submitted to EuroPar 2017
Subjects: Performance (cs.PF); Distributed, Parallel, and Cluster Computing (cs.DC)
The rapidly growing number of large network analysis problems has led to the
emergence of many parallel and distributed graph processing systems—one
survey in 2014 identified over 80. Since then, the landscape has evolved; some
packages have become inactive while more are being developed. Determining the
best approach for a given problem is infeasible for most developers. To enable
easy, rigorous, and repeatable comparison of the capabilities of such systems,
we present an approach and associated software for analyzing the performance
and scalability of parallel, open-source graph libraries. We demonstrate our
approach on five graph frameworks: GraphMat, the Graph500, the Graph Algorithm
Platform, GraphBIG, and PowerGraph, using synthetic and real-world datasets. We
examine previously overlooked aspects of parallel graph processing performance,
such as phases of execution and energy usage, for three algorithms:
breadth-first search, single-source shortest paths, and PageRank, and compare
our results to Graphalytics.
Subhojyoti Mukherjee, K. P. Naveen, Nandan Sudarsanam, Balaraman Ravindran
Subjects: Learning (cs.LG)
In this paper we propose the Augmented-UCB (AugUCB) algorithm for a
fixed-budget version of the thresholding bandit problem (TBP), where the
objective is to identify a set of arms whose quality is above a threshold. A
key feature of AugUCB is that it uses both mean and variance estimates to
eliminate arms that have been sufficiently explored; to the best of our
knowledge this is the first algorithm to employ such an approach for the
considered TBP. Theoretically, we obtain an upper bound on the loss
(probability of mis-classification) incurred by AugUCB. Although UCBEV in the
literature provides a better guarantee, it is important to emphasize that UCBEV
has access to the problem complexity (whose computation requires the arms’ means
and variances), and hence is not realistic in practice; this is in contrast to
AugUCB, whose implementation does not require any such complexity inputs. We
conduct extensive simulation experiments to validate the performance of AugUCB.
Through our simulation work, we establish that AugUCB, owing to its utilization
of variance estimates, performs significantly better than the state-of-the-art
APT, CSAR, and other non-variance-based algorithms.
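The sketch below illustrates the general idea of a variance-aware, fixed-budget elimination scheme for the thresholding problem; the confidence width and constants are placeholders, not the exact AugUCB terms analysed in the paper:

    import numpy as np

    def threshold_bandit(pull, n_arms, tau, budget):
        """pull(i) returns one stochastic reward of arm i; returns arms judged above tau."""
        counts = np.zeros(n_arms)
        sums = np.zeros(n_arms)
        sq_sums = np.zeros(n_arms)
        active = set(range(n_arms))
        for _ in range(budget):
            i = min(active, key=lambda a: counts[a])   # sample the least-pulled active arm
            r = pull(i)
            counts[i] += 1; sums[i] += r; sq_sums[i] += r * r
            mean = sums[i] / counts[i]
            var = max(sq_sums[i] / counts[i] - mean ** 2, 0.0)
            # illustrative variance-aware confidence width (placeholder constants)
            width = np.sqrt(2.0 * var * np.log(budget) / counts[i]) + np.log(budget) / counts[i]
            if len(active) > 1 and (mean + width < tau or mean - width > tau):
                active.discard(i)                      # arm resolved; stop sampling it
        means = np.divide(sums, counts, out=np.zeros(n_arms), where=counts > 0)
        return [i for i in range(n_arms) if means[i] >= tau]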
Sejun Park, Yunhun Jang, Andreas Galanis, Jinwoo Shin, Daniel Stefankovic, Eric Vigoda
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
The Gibbs sampler is a particularly popular Markov chain used for learning
and inference problems in Graphical Models (GMs). These tasks are
computationally intractable in general, and the Gibbs sampler often suffers
from slow mixing. In this paper, we study the Swendsen-Wang dynamics, which is a
more sophisticated Markov chain designed to overcome bottlenecks that impede
the Gibbs sampler. We prove O(log n) mixing time for attractive binary
pairwise GMs (i.e., ferromagnetic Ising models) on stochastic partitioned
graphs having n vertices, under some mild conditions, including low-temperature
regions where the Gibbs sampler provably mixes exponentially slowly. Our
experiments also confirm that the Swendsen-Wang sampler significantly
outperforms the Gibbs sampler when they are used for learning parameters of
attractive GMs.
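For reference, one Swendsen-Wang step on a ferromagnetic Ising model performs bond percolation on edges whose endpoints agree and then flips each resulting cluster with probability 1/2; a compact illustration with unit couplings (our own sketch, not the paper's partitioned-graph setting):

    import math
    import random

    def swendsen_wang_step(spins, edges, beta):
        """spins: dict vertex -> +1/-1; edges: list of (u, v); beta: inverse temperature."""
        p_open = 1.0 - math.exp(-2.0 * beta)
        parent = {v: v for v in spins}

        def find(x):                      # union-find with path halving
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for u, v in edges:                # open agreeing edges with prob. 1 - exp(-2*beta)
            if spins[u] == spins[v] and random.random() < p_open:
                parent[find(u)] = find(v)

        flip = {}
        for v in spins:                   # flip each connected cluster with prob. 1/2
            root = find(v)
            if root not in flip:
                flip[root] = random.random() < 0.5
            if flip[root]:
                spins[v] = -spins[v]
        return spins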
Maciej Zieba, Lei Wang
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Triplet networks are widely used models that are characterized by good
performance in classification and retrieval tasks. In this work we propose to
train a triplet network by using it as the discriminator in Generative
Adversarial Nets (GANs). We make use of the discriminator’s strong
representation learning capability to increase the predictive quality of the
model. We evaluated our approach on the Cifar10 and MNIST datasets and observed
a significant improvement in classification performance using a simple k-NN
method.
Vibin Vijay, Raghunath Vp, Amarjot Singh, SN Omar
Comments: Accepted at the 7th IEEE International Advance Computing Conference (IACC-2017)
Subjects: Learning (cs.LG)
Clustering is a useful data exploratory method with its wide applicability in
multiple fields. However, data clustering greatly relies on initialization of
cluster centers that can result in large intra-cluster variance and dead
centers, therefore leading to sub-optimal solutions. This paper proposes a
novel variance based version of the conventional Moving K-Means (MKM) algorithm
called Variance Based Moving K-Means (VMKM) that can partition data into
optimal homogeneous clusters, irrespective of cluster initialization. The
algorithm utilizes a novel distance metric and a unique data element selection
criterion to transfer selected elements between clusters, achieving low
intra-cluster variance and subsequently avoiding dead centers. Quantitative and
qualitative comparisons with various clustering techniques are performed on four
datasets selected from image processing, bioinformatics, remote sensing, and the
stock market. An extensive analysis highlights the superior
performance of the proposed method over other techniques.
Dan Elbaz, Michael Zibulevsky
Subjects: Learning (cs.LG); Sound (cs.SD)
Long short-term memory (LSTM; Hochreiter et al.) is a powerful type of neural
network that can capture long-range dependencies and nonlinear dynamics.
Frequency modulation (FM) is a nonlinear encoding of information in a carrier
wave. In this paper we present a novel application of software defined radio
(SDR) based on LSTM recurrent neural networks to decode an audio signal from
its noisy frequency-demodulated version.
Rose Catherine, William Cohen
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG)
Recently, deep learning methods have been shown to improve the performance of
recommender systems over traditional methods, especially when review text is
available. For example, a recent model, DeepCoNN, uses neural nets to learn one
latent representation for the text of all reviews written by a target user, and
a second latent representation for the text of all reviews for a target item,
and then combines these latent representations to obtain state-of-the-art
performance on recommendation tasks. We show that (unsurprisingly) much of the
predictive value of review text comes from reviews of the target user for the
target item. We then introduce a way in which this information can be used in
recommendation, even when the target user’s review for the target item is not
available. Our model, called TransNets, extends the DeepCoNN model by
introducing an additional latent layer representing the target user-target item
pair. We then regularize this layer, at training time, to be similar to another
latent representation of the target user’s review of the target item. We show
that TransNets and extensions of it improve substantially over the previous
state-of-the-art.
Edilson A. Corrêa Jr., Vanessa Queiroz Marinho, Leandro Borges dos Santos
Comments: Published in Proceedings of SemEval-2017, 5 pages
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
This paper describes our multi-view ensemble approach to SemEval-2017 Task 4
on Sentiment Analysis in Twitter, specifically, the Message Polarity
Classification subtask for English (subtask A). Our system is a voting
ensemble, where each base classifier is trained in a different feature space.
The first space is a bag-of-words model and has a Linear SVM as base
classifier. The second and third spaces are two different strategies of
combining word embeddings to represent sentences and use a Linear SVM and a
Logistic Regressor as base classifiers. The proposed system was ranked 18th out
of 38 systems considering F1 score and 20th considering recall.
Silvia Chiappa, Sébastien Racaniere, Daan Wierstra, Shakir Mohamed
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Models that can simulate how environments change in response to actions can
be used by agents to plan and act efficiently. We improve on previous
environment simulators from high-dimensional pixel observations by introducing
recurrent neural networks that are able to make temporally and spatially
coherent predictions for hundreds of time-steps into the future. We present an
in-depth analysis of the factors affecting performance, providing the most
extensive attempt to advance the understanding of the properties of these
models. We address the issue of computational inefficiency with a model that
does not need to generate a high-dimensional image at each time-step. We show
that our approach can be used to improve exploration and is adaptable to many
diverse environments, namely 10 Atari games, a 3D car racing environment, and
complex 3D mazes.
Nicolas Tremblay (CNRS, GIPSA-CICS), Simon Barthelme (CNRS, GIPSA-VIBS), Pierre-Olivier Amblard (CNRS, GIPSA-CICS)
Comments: in French
Subjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM); Learning (cs.LG)
We consider the problem of sampling k-bandlimited graph signals, i.e., linear
combinations of the first k graph Fourier modes. We know that a set of k nodes
embedding all k-bandlimited signals always exists, thereby enabling their
perfect reconstruction after sampling. Unfortunately, to exhibit such a set,
one needs to partially diagonalize the graph Laplacian, which becomes
prohibitive at large scale. We propose a novel strategy based on determinantal
point processes that side-steps partial diagonalisation and enables
reconstruction with only O(k) samples. While doing so, we exhibit a new general
algorithm to sample determinantal point processes, faster than the
state-of-the-art algorithm by an order of k.
Ali Mottaghi, Kayhan Behdin, Ashkan Esmaeili, Mohammadreza Heydari, Farokh Marvasti
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Learning (cs.LG); Multimedia (cs.MM)
In this paper, we design a system in order to perform the real-time beat
tracking for an audio signal. We use Onset Strength Signal (OSS) to detect the
onsets and estimate the tempos. Then, we form Cumulative Beat Strength Signal
(CBSS) by taking advantage of OSS and estimated tempos. Next, we perform peak
detection by extracting the periodic sequence of beats among all CBSS peaks. In
simulations, we can see that our proposed algorithm, Online Beat TrAckINg
(OBTAIN), outperforms state-of-the-art results in terms of prediction accuracy
while maintaining comparable and practical computational complexity. The
real-time performance is visually traceable, as illustrated in the simulations.
Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn, Claire Mathieu
Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)
Hierarchical clustering is a recursive partitioning of a dataset into
clusters at an increasingly finer granularity. Motivated by the fact that most
work on hierarchical clustering was based on providing algorithms, rather than
optimizing a specific objective, Dasgupta framed similarity-based hierarchical
clustering as a combinatorial optimization problem, where a `good’ hierarchical
clustering is one that minimizes some cost function. He showed that this cost
function has certain desirable properties.
We take an axiomatic approach to defining `good’ objective functions for both
similarity and dissimilarity-based hierarchical clustering. We characterize a
set of “admissible” objective functions (which includes Dasgupta’s) that
have the property that when the input admits a `natural’ hierarchical
clustering, that clustering attains the optimal value.
Equipped with a suitable objective function, we analyze the performance of
practical algorithms, as well as develop better algorithms. For
similarity-based hierarchical clustering, Dasgupta showed that the divisive
sparsest-cut approach achieves an (O(log^{3/2} n))-approximation. We give a
refined analysis of the algorithm and show that it in fact achieves an
(O(sqrt{log n}))-approximation (Charikar and Chatziafratis independently proved
that it is an (O(sqrt{log n}))-approximation). This improves upon the LP-based
(O(log n))-approximation of Roy and Pokutta. For dissimilarity-based hierarchical
clustering, we show that the classic average-linkage algorithm gives a factor-2
approximation, and provide a simple and better algorithm that gives a factor-3/2
approximation.
Finally, we consider a `beyond-worst-case’ scenario through a generalisation of
the stochastic block model for hierarchical clustering. We show that Dasgupta’s
cost function has desirable properties for these inputs and we provide a simple
(1 + o(1))-approximation in this setting.
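For orientation, the objective introduced by Dasgupta for similarity inputs, stated here in its standard form and in the document's notation style, charges every pair by the size of the smallest cluster that still joins it: (cost(T) = sum_{i,j} w(i,j) |leaves(T[i vee j])|), where (T[i vee j]) is the subtree rooted at the least common ancestor of leaves i and j; a good tree therefore separates dissimilar points near the root and keeps similar points together until the clusters are small.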
Maria Schuld, Francesco Petruccione
Comments: 19 pages, 9 figures
Subjects: Quantum Physics (quant-ph); Learning (cs.LG); Statistics Theory (math.ST)
Quantum machine learning witnesses an increasing amount of quantum algorithms
for data-driven decision making, a problem with potential applications ranging
from automated image recognition to medical diagnosis. Many of those algorithms
are implementations of quantum classifiers, or models for the classification of
data inputs with a quantum computer. Following the success of collective
decision making with ensembles in classical machine learning, this paper
introduces the concept of quantum ensembles of quantum classifiers. Creating
the ensemble corresponds to a state preparation routine, after which the
quantum classifiers are evaluated in parallel and their combined decision is
accessed by a single-qubit measurement. This framework naturally allows for
exponentially large ensembles in which — similar to Bayesian learning — the
individual classifiers do not have to be trained. As an example, we analyse an
exponentially large quantum ensemble in which each classifier is weighted
according to its performance in classifying the training data, leading to new
results for quantum as well as classical machine learning.
Jannicke Pearkes, Wojciech Fedorko, Alison Lister, Colin Gay
Comments: 20 pages, 13 figures
Subjects: High Energy Physics – Experiment (hep-ex); Learning (cs.LG); High Energy Physics – Phenomenology (hep-ph); Machine Learning (stat.ML)
Recent literature on deep neural networks for tagging of highly energetic
jets resulting from top quark decays has focused on image based techniques or
multivariate approaches using high level jet substructure variables. Here a
sequential approach to this task is taken by using an ordered sequence of jet
constituents as training inputs. Unlike previous approaches, this strategy does
not result in a loss of information during pixelisation or the calculation of
high level features. New preprocessing methods that do not alter key physical
quantities such as the jet mass are developed. The jet classification method
achieves a background rejection of 45 at a 50% efficiency operating point for
reconstruction-level jets in the transverse momentum range of 600 to 2500 GeV,
and is insensitive to multiple proton-proton interactions at the levels expected
throughout LHC Run 2.
Gen Li, Yuantao Gu
Comments: 30 pages, 6 figures
Subjects: Information Theory (cs.IT); Learning (cs.LG)
Dimension reduction plays an essential role when decreasing the complexity of
solving large-scale problems. The well-known Johnson-Lindenstrauss (JL) Lemma
and Restricted Isometry Property (RIP) admit the use of random projection to
reduce the dimension while keeping the Euclidean distance, which leads to the
boom of Compressed Sensing and the field of sparsity related signal processing.
Recently, successful applications of sparse models in computer vision and
machine learning have increasingly hinted that the underlying structure of high
dimensional data looks more like a union of subspaces (UoS). In this paper,
motivated by JL Lemma and an emerging field of Compressed Subspace Clustering,
we study for the first time the RIP of Gaussian random matrix for compressing
two subspaces. We theoretically prove that, with high probability, the affinity
or distance between two projected subspaces is concentrated around its
estimate. When the ambient dimension after projection is sufficiently large,
the affinity and distance between two subspaces remain almost unchanged after
random projection. Numerical experiments verify the theoretical work.
Hossein Soleimani, Adarsh Subbaswamy, Suchi Saria
Comments: The first two authors contributed equally to this work
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)
Treatment effects can be estimated from observational data as the difference
in potential outcomes. In this paper, we address the challenge of estimating
the potential outcome when treatment-dose levels can vary continuously over
time. Further, the outcome variable may not be measured at a regular frequency.
Our proposed solution represents the treatment response curves using linear
time-invariant dynamical systems—this provides a flexible means for modeling
response over time to highly variable dose curves. Moreover, for multivariate
data, the proposed method: uncovers shared structure in treatment response and
the baseline across multiple markers; and, flexibly models challenging
correlation structure both across and within signals over time. For this, we
build upon the framework of multiple-output Gaussian Processes. On simulated
and a challenging clinical dataset, we show significant gains in accuracy over
state-of-the-art models.
Dong Yu, Xuankai Chang, Yanmin Qian
Comments: 5 pages, 6 figures, InterSpeech2017
Subjects: Sound (cs.SD); Learning (cs.LG)
In this paper, we propose a novel technique for direct recognition of
multiple speech streams given the single channel of mixed speech, without first
separating them. Our technique is based on permutation invariant training (PIT)
for automatic speech recognition (ASR). In PIT-ASR, we compute the average
cross entropy (CE) over all frames in the whole utterance for each possible
output-target assignment, pick the one with the minimum CE, and optimize for
that assignment. PIT-ASR forces all the frames of the same speaker to be
aligned with the same output layer. This strategy elegantly solves the label
permutation problem and speaker tracing problem in one shot. Our experiments on
artificially mixed AMI data showed that the proposed approach is very
promising.
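A minimal sketch of the utterance-level permutation invariant criterion described above: the frame-averaged cross entropy is computed for every output-to-speaker assignment and the smallest value is the training loss (array names and shapes below are illustrative):

    import itertools
    import numpy as np

    def pit_cross_entropy(log_probs, targets):
        """log_probs: list of S arrays of shape (T, C), per-frame log-posteriors per output layer.
        targets: list of S arrays of shape (T,), per-frame label indices per speaker.
        Returns the minimum utterance-averaged cross entropy over all assignments."""
        S = len(log_probs)
        T = log_probs[0].shape[0]
        best = np.inf
        for perm in itertools.permutations(range(S)):
            ce = 0.0
            for out, spk in enumerate(perm):
                ce -= log_probs[out][np.arange(T), targets[spk]].mean()
            best = min(best, ce / S)
        return best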
Chunting Zhou, Graham Neubig
Comments: Accepted by ACL 2017
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
Labeled sequence transduction is a task of transforming one sequence into
another sequence that satisfies desiderata specified by a set of labels. In
this paper we propose multi-space variational encoder-decoders, a new model for
labeled sequence transduction with semi-supervised learning. The generative
model can use neural networks to handle both discrete and continuous latent
variables to exploit various features of data. Experiments show that our model
provides not only a powerful supervised framework but also can effectively take
advantage of the unlabeled data. On the SIGMORPHON morphological inflection
benchmark, our model outperforms single-model state-of-the-art results by a
large margin for the majority of languages.
Greg Yang
Comments: 72 pages, 22 figures. Comments welcome
Subjects: Commutative Algebra (math.AC); Computational Complexity (cs.CC); Discrete Mathematics (cs.DM); Learning (cs.LG); Combinatorics (math.CO)
In computational complexity, a complexity class is given by a set of problems
or functions, and a basic challenge is to show separations of complexity
classes (A not= B), especially when (A) is known to be a subset of (B). In this
paper we introduce a homological theory of functions that can be used to
establish complexity separations, while also providing other interesting
consequences. We propose to associate a topological space (S_A) to each class
of functions (A), such that, to separate complexity classes (A subseteq B’),
it suffices to observe a change in “the number of holes”, i.e. homology, in
(S_A) as a subclass (B) of (B’) is added to (A). In other words, if the
homologies of (S_A) and (S_{A cup B}) are different, then (A not= B’). We
develop the underlying theory of functions based on combinatorial and
homological commutative algebra and Stanley-Reisner theory, and recover Minsky
and Papert’s 1969 result that parity cannot be computed by nonmaximal degree
polynomial threshold functions. In the process, we derive a “maximal principle”
for polynomial threshold functions that is used to extend this result further
to arbitrary symmetric functions. A surprising coincidence is demonstrated,
where the maximal dimension of “holes” in (S_A) upper bounds the VC dimension
of (A), with equality for common computational cases such as the class of
polynomial threshold functions or the class of linear functionals in (mathbb
F_2), or common algebraic cases such as when the Stanley-Reisner ring of (S_A)
is Cohen-Macaulay. As another interesting application of our theory, we prove a
result that a priori has nothing to do with complexity separation: it
characterizes when a vector subspace intersects the positive cone, in terms of
homological conditions. By analogy to Farkas’ result doing the same with
*linear conditions*, we call our theorem the Homological Farkas Lemma.
Dongzhu Liu, Kaibin Huang
Comments: 30 pages, submitted to IEEE Trans. on Wireless Communication
Subjects: Information Theory (cs.IT)
Multimedia content especially videos is expected to dominate data traffic in
next-generation mobile networks. Caching popular content at the network edge
has emerged to be a solution for low-latency content delivery. Compared with
the traditional wireless communication, content delivery has a key
characteristic that many signals coexisting in the air carry identical popular
content. They, however, can interfere with each other at a receiver if their
modulation-and-coding (MAC) schemes are adapted to individual channels
following the classic approach. To address this issue, we present a novel idea
of content adaptive MAC (CAMAC) where adapting MAC schemes to content ensures
that all signals carrying identical content are encoded using an identical MAC
scheme, achieving spatial MAC alignment. Consequently, interference can be
harnessed as signals, to improve the reliability of wireless delivery. In the
remaining part of the paper, we focus on quantifying the gain CAMAC can bring
to a content-delivery network using a stochastic-geometry model. Specifically,
content helpers are distributed as a Poisson point process, each of which
transmits a file from a content database based on a given popularity
distribution. It is discovered that the successful content-delivery probability
is closely related to the distribution of the ratio of two independent shot
noise processes, named a shot-noise ratio. The distribution itself is an open
mathematical problem that we tackle in this work. Using stable-distribution
theory and tools from stochastic geometry, the distribution function is derived
in closed form. Extending the result in the context of content-delivery
networks with CAMAC yields the content-delivery probability in different closed
forms. In addition, the gain in the probability due to CAMAC is shown to grow
with the level of skewness in the content popularity distribution.
Shun Watanabe
Comments: 5 pages, 0 figure
Subjects: Information Theory (cs.IT)
We show a reduction method to construct a code for the Gray-Wyner (GW)
network from a given code for the Wyner-Ahlswede-Körner (WAK) network. By
combining this reduction with a converse bound on the GW network, we derive a
converse bound on the WAK network. The derived bound gives an alternative proof
of the strong converse theorem for the WAK network.
Gang Yang, Ying-Chang Liang, Rui Zhang
Comments: 32 pages, 10 figures, journal paper
Subjects: Information Theory (cs.IT)
Ambient backscatter communication (AmBC) enables radio-frequency (RF) powered
backscatter devices (BDs) (e.g., sensors, tags) to modulate their information
bits over ambient RF carriers in an over-the-air manner. This technology, also
called “modulation in the air”, has thus emerged as a promising solution to
achieve green communications for the future Internet-of-Things. This paper studies
an AmBC system by leveraging the ambient orthogonal frequency division
multiplexing (OFDM) modulated signals in the air. We first model such AmBC
system from a spread-spectrum communication perspective, upon which a novel
joint design for BD waveform and receiver detector is proposed. The BD symbol
period is designed to be, in general, an integer multiple of the OFDM
symbol period, and the waveform for BD bit `0′ maintains the same state within
a BD symbol period, while the waveform for BD bit `1′ has a state transition in
the middle of each OFDM symbol period within a BD symbol period. In the
receiver detector design, we construct the test statistic that cancels out the
direct-link interference by exploiting the repeating structure of the ambient
OFDM signals due to the use of cyclic prefix. For the system with a
single-antenna receiver, the maximum-likelihood detector is proposed to recover
the BD bits, for which the optimal threshold is obtained in closed-form
expression. For the system with a multi-antenna receiver, we propose a new test
statistic, and derive the optimal detector. Moreover, practical timing
synchronization algorithms are proposed, and we also analyze the effect of
various system parameters on the system performance. Finally, extensive
numerical results are provided to verify that the proposed transceiver design
can improve the system bit-error-rate (BER) performance and the operating range
significantly, and achieve much higher data rate, as compared to the
conventional design.
Gourab Ghatak, Antonio De Domenico, Marceau Coupechoux
Comments: A 7-page version is submitted to IEEE GLOBECOM 2017
Subjects: Information Theory (cs.IT)
We characterize a multi tier network with classical macro cells, and multi
radio access technology (RAT) small cells, which are able to operate in
microwave and millimeter-wave (mm-wave) bands. The small cells are assumed to
be deployed along roads modeled as a Poisson line process. This
characterization is more realistic compared to the classical Poisson point
processes typically used in the literature. In this context, we derive the
association and RAT selection probabilities of the typical user under various
system parameters such as the small cell deployment density and mm-wave antenna
gain, and with varying street densities. Finally, we calculate the signal to
interference plus noise ratio (SINR) coverage probability for the typical user
considering a tractable dominant interference based model for mm-wave
interference. Our analysis reveals the need to deploy more small cells per
street in cities with more streets to maintain coverage, and highlights that
mm-wave RAT in small cells can help to improve the SINR performance of the
users.
Nicola Durante, Alessandro Siciliano
Comments: submitted; 22 pages
Subjects: Information Theory (cs.IT); Combinatorics (math.CO)
In this paper we construct infinite families of non-linear maximum rank
distance codes by using the setting of bilinear forms of a finite vector space.
We also give a geometric description of such codes by using the cyclic model
for the field reduction of finite geometries and we show that these families
contain the non-linear maximum rank distance codes recently provided by
Cossidente, Marino and Pavese.
Gen Li, Yuantao Gu
Comments: 30 pages, 6 figures
Subjects: Information Theory (cs.IT); Learning (cs.LG)
Dimension reduction plays an essential role when decreasing the complexity of
solving large-scale problems. The well-known Johnson-Lindenstrauss (JL) Lemma
and Restricted Isometry Property (RIP) admit the use of random projection to
reduce the dimension while keeping the Euclidean distance, which leads to the
boom of Compressed Sensing and the field of sparsity related signal processing.
Recently, successful applications of sparse models in computer vision and
machine learning have increasingly hinted that the underlying structure of high
dimensional data looks more like a union of subspaces (UoS). In this paper,
motivated by JL Lemma and an emerging field of Compressed Subspace Clustering,
we study for the first time the RIP of Gaussian random matrix for compressing
two subspaces. We theoretically prove that, with high probability, the affinity
or distance between two projected subspaces is concentrated around its
estimate. When the ambient dimension after projection is sufficiently large,
the affinity and distance between two subspaces remain almost unchanged after
random projection. Numerical experiments verify the theoretical work.
Felix Krahmer, Christian Kruschel, Michael Sandbichler
Comments: 21 pages, 1 figure
Subjects: Information Theory (cs.IT)
This chapter gives an overview of recovery guarantees for total variation
minimization in compressed sensing for different measurement scenarios. In
addition to summarizing the results in the area, we illustrate why an approach
that is common for synthesis sparse signals fails and different techniques are
necessary. Lastly, we discuss a generalization of recent results for Gaussian
measurements to the subgaussian case.
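The prototypical problem surveyed here, written in the document's notation style, is total variation minimization: given noisy measurements (y = Ax + e) with (||e||_2 <= epsilon), one solves (min_z ||nabla z||_1) subject to (||Az - y||_2 <= epsilon), where (nabla) is a discrete gradient operator; guarantees of this type bound the reconstruction error (||z - x||) in terms of the number and kind of measurements.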
Geonu Kim, Jungwoo Lee
Comments: 15 pages. arXiv admin note: text overlap with arXiv:1701.07340
Subjects: Information Theory (cs.IT)
Recently, locally repairable codes (LRCs) with local erasure correction
constraints that are unequal and disjoint have been proposed. In this work, we
study the same topic and provide some improved and additional results.
Penghang Yin, Zhe Sun, Wenlong Jin, Jack Xin
Subjects: Information Theory (cs.IT); Physics and Society (physics.soc-ph)
A computational method, based on (ell_1)-norm minimization, is proposed for
the traffic flow correction problem. Without extra information, the problem is
generally ill-posed when a large portion of the link sensors are unhealthy. It
is possible, however, to correct the corruptions accurately with the proposed
method if there are only a few bad sensors, which are located at certain links.
We mathematically quantify these links that are robust to miscounts and relate
them to the geometric structure of the traffic network. In a more realistic
setting, besides the unhealthy link sensors, if small measurement noise is
present at the other sensors, our method is guaranteed to give an estimated
traffic flow fairly close to the ground truth. Both toy and real-world examples
are provided to demonstrate the effectiveness of the proposed method.
Mustafa Ozmen, M. Cenk Gursoy
Subjects: Information Theory (cs.IT)
In this paper, throughput and energy efficiency of secure wireless
transmission of delay sensitive data generated by random sources is studied. A
fading broadcast model in which the transmitter sends confidential and common
messages to two receivers is considered. It is assumed that the common and
confidential data, generated from Markovian sources, is stored in buffers prior
to transmission, and the transmitter operates under constraints on buffer/delay
violation probability. Under such statistical quality of service (QoS)
constraints, effective capacity of time-varying wireless transmissions and
effective bandwidth of Markovian sources are employed to determine the
throughput. In particular, secrecy capacity is used to describe the service
rate of buffers containing confidential messages. Moreover, energy per bit is
used as the energy efficiency metric and energy efficiency is studied in the
low signal-to-noise ratio (SNR) regime. Specifically, the minimum energy per bit required
for the reliable communication of common and confidential messages is
determined and wideband slope expressions are identified. The impact of
buffer/delay constraints, correlation between channels, source
characteristics/burstiness, channel knowledge at the transmitter, power
allocation, and secrecy requirements on the throughput and energy efficiency of
common and confidential message transmissions is identified.
Sajjad Beygi, Shirin Jalali, Arian Maleki, Urbashi Mitra
Subjects: Information Theory (cs.IT)
Modern image and video compression codes employ elaborate structures existing
in such signals to encode them into a small number of bits. Compressed sensing
recovery algorithms, on the other hand, use such signals’ structures to recover
them from a few linear observations. Despite the steady progress in the field of
compressed sensing, structures that are often used for signal recovery are
still much simpler than those employed by state-of-the-art compression codes.
The main goal of this paper is to bridge this gap through answering the
following question: Can one employ a given compression code to build an
efficient (polynomial time) compressed sensing recovery algorithm? In response
to this question, the compression-based gradient descent (C-GD) algorithm is
proposed. C-GD, which is a low-complexity iterative algorithm, is able to
employ a generic compression code for compressed sensing and therefore elevates
the scope of structures used in compressed sensing to those used by compression
codes. The convergence performance of C-GD and its required number of
measurements in terms of the rate-distortion performance of the compression
code are theoretically analyzed. It is also shown that C-GD is robust to
additive white Gaussian noise. Finally, the presented simulation results show
that combining C-GD with commercial image compression codes such as JPEG2000
yields state-of-the-art performance in imaging applications.
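A rough sketch of a compression-based projected gradient iteration in the spirit of C-GD, where the projection step is a round trip through a generic lossy compress/decompress pair; the function names are placeholders rather than the paper's interface:

    import numpy as np

    def compression_gradient_descent(A, y, compress, decompress, steps=200, eta=None):
        """Recover x from y = A @ x + noise, using a compression code as the signal model.
        compress/decompress: placeholder callables implementing a lossy code."""
        m, n = A.shape
        if eta is None:
            eta = 1.0 / np.linalg.norm(A, 2) ** 2   # step size from the spectral norm of A
        x = np.zeros(n)
        for _ in range(steps):
            x = x + eta * A.T @ (y - A @ x)          # gradient step on the data-fit term
            x = decompress(compress(x))              # project onto the code's reproduction set
        return x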
Nan Jiang, Yansha Deng, Xin Kang, Arumugam Nallanathan
Comments: This manuscript has been submitted to IEEE Transactions on Wireless Communications
Subjects: Information Theory (cs.IT)
Massive Internet of Things (mIoT) has provided an auspicious opportunity to
build powerful and ubiquitous connections, but it faces a plethora of new
challenges; cellular networks are potential solutions due to their
scalability, reliability, and efficiency. The contention-based random access
procedure (RACH) is the first step of connection establishment between IoT
devices and Base Stations (BSs) in the cellular-based mIoT network, where
modelling the interactions between the static properties of the physical layer
network and the dynamic properties of the queue evolving in each IoT device is
challenging. To tackle this, we provide a novel traffic-aware spatio-temporal
model to analyze RACH in cellular-based mIoT networks, where the physical layer
network is modelled and analyzed based on stochastic geometry in the spatial
domain, and the queue evolution is analyzed based on probability theory in the
time domain. For performance evaluation, we derive the exact expressions for the
domain. For performance evaluation, we derive the exact expressions for the
preamble transmission success probabilities of a randomly chosen IoT device
with different RACH transmission schemes in each time slot, which offer
insights into the effectiveness of each transmission scheme. Our derived
analytical results are verified by realistic simulations capturing the evolution
of packets in each IoT device. This mathematical model and analytical framework
can be applied to evaluate the performance of other types of RACH transmission
schemes in cellular-based networks by simply integrating their packet
evolution principles.
Yi Li, M. Cenk Gursoy, Senem Velipasalar
Subjects: Information Theory (cs.IT)
Recently, wireless caching techniques have been studied to satisfy lower
delay requirements and offload traffic from peak periods. By storing parts of
the popular files at the mobile users, users can locate some of their requested
files in their own caches or the caches at their neighbors. In the latter case,
when a user receives files from its neighbors, device-to-device (D2D)
communication is enabled. D2D communication underlaid with cellular networks is
also a new paradigm for the upcoming 5G wireless systems. By allowing a pair of
adjacent D2D users to communicate directly, D2D communication can achieve
higher throughput, better energy efficiency and lower traffic delay. In this
work, we propose a very efficient caching algorithm for D2D-enabled cellular
networks to minimize the average transmission delay. Instead of searching over
all possible solutions, our algorithm finds the best <file,user> pairs, which
provide the largest delay improvement in each iteration, to form a caching
policy with very low transmission delay and high throughput. This algorithm is also
extended to address a more general scenario, in which the distributions of
fading coefficients and values of system parameters potentially change over
time. Via numerical results, the superiority of the proposed algorithm is
verified by comparing it with a naive algorithm, in which all users simply
cache their favorite files.
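A sketch of the greedy selection loop the abstract describes: in every iteration the <file,user> pair with the largest estimated reduction in average delay is cached, until no cache can be improved (the delay model is left abstract as a placeholder for the paper's objective):

    def greedy_caching(users, files, cache_size, delay_reduction):
        """delay_reduction(f, u, policy) -> estimated drop in average delay if user u caches file f.
        Returns policy: dict mapping user -> set of cached files."""
        policy = {u: set() for u in users}
        while True:
            best, best_gain = None, 0.0
            for u in users:
                if len(policy[u]) >= cache_size:
                    continue
                for f in files:
                    if f in policy[u]:
                        continue
                    gain = delay_reduction(f, u, policy)
                    if gain > best_gain:
                        best, best_gain = (f, u), gain
            if best is None:
                return policy
            f, u = best
            policy[u].add(f)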
Xiucong Sun, Chao Han, Pei Chen
Comments: 25 pages, 6 figures, Ready for publication in Aerospace Science and Technology, 2017
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Information Theory (cs.IT)
Precise (sub-meter level) real-time navigation using a space-capable
single-frequency global positioning system (GPS) receiver and ultra-rapid
(real-time) ephemerides from the international global navigation satellite
systems service is proposed for low-Earth-orbiting (LEO) satellites. The C/A
code and L1 carrier phase measurements are combined and single-differenced to
eliminate first-order ionospheric effects and receiver clock offsets. A
random-walk process is employed to model the phase ambiguities in order to
absorb the time-varying and satellite-specific higher-order measurement errors
as well as the GPS clock correction errors. A sequential Kalman filter which
incorporates the known orbital dynamic model is developed to estimate orbital
states and phase ambiguities without matrix inversion. Real flight data from
the single-frequency GPS receiver onboard China’s SJ-9A small satellite are
processed to evaluate the orbit determination accuracy. Statistics from
internal orbit assessments indicate that three-dimensional accuracies of better
than 0.50 m and 0.55 mm/s are achieved for position and velocity, respectively.
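For context, estimating states `without matrix inversion' typically means processing scalar measurements one at a time, so each Kalman gain involves only a scalar division; a generic sketch of such a sequential measurement update (not the paper's specific orbital dynamic model or ambiguity states):

    import numpy as np

    def sequential_kalman_update(x, P, zs, hs, rs):
        """Process scalar measurements one by one.
        x: state estimate (n,); P: covariance (n, n);
        zs: scalar measurements; hs: corresponding rows of H, each (n,); rs: measurement variances."""
        for z, h, r in zip(zs, hs, rs):
            s = float(h @ P @ h) + r          # innovation variance (a scalar, no matrix inverse)
            k = (P @ h) / s                   # Kalman gain
            x = x + k * (z - float(h @ x))    # state update
            P = P - np.outer(k, h @ P)        # covariance update
        return x, P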