Yandan Wang, Wei Wen, Linghao Song, Hai Li
Comments: Best Paper Award of ASP-DAC 2017
Subjects: Neural and Evolutionary Computing (cs.NE)
Brain-inspired neuromorphic computing has demonstrated remarkable advantages
over the traditional von Neumann architecture due to its high energy efficiency and
parallel data processing. However, the limited resolution of synaptic weights
degrades system accuracy and thus impedes the use of neuromorphic systems. In
this work, we propose three orthogonal methods to learn synapses with one-level
precision, namely, distribution-aware quantization, quantization regularization
and bias tuning, to make image classification accuracy comparable to the
state-of-the-art. Experiments on both multi-layer perceptron and convolutional
neural networks show that the accuracy drop can be well controlled within 0.19%
(5.53%) for MNIST (CIFAR-10) database, compared to an ideal system without
quantization.
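A minimal numpy sketch of what one-level, distribution-aware quantization can look like; the threshold rule and level choice below are hypothetical stand-ins, not the authors' exact method:

    import numpy as np

    def quantize_one_level(w, percentile=50.0):
        # Quantize a weight matrix to a single magnitude level {-a, +a}.
        # The level a is taken from the weight distribution (a percentile
        # of |w|) rather than being a fixed constant, which is the spirit
        # of distribution-aware quantization.
        a = np.percentile(np.abs(w), percentile)
        return np.where(w >= 0.0, a, -a)

    # toy example: one fully connected layer of an MLP
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.1, size=(784, 300))
    w_q = quantize_one_level(w)
    print(np.unique(w_q))           # the two synaptic levels
    print(np.mean((w - w_q) ** 2))  # quantization error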
Filippos Kokkinos, Alexandros Potamianos
Comments: Submitted to EACL2017 for review
Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
We introduce a tree-structured attention neural network for sentences and
small phrases and apply it to the problem of sentiment classification. Our
model expands the current recursive models by incorporating structural
information around a node of a syntactic tree using both bottom-up and top-down
information propagation. Also, the model utilizes structural attention to
identify the most salient representations during the construction of the
syntactic tree. To our knowledge, the proposed models achieve state-of-the-art
performance on the Stanford Sentiment Treebank dataset.
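A small numpy sketch of the structural attention step, i.e. weighting the representations gathered around a tree node by their salience; the scoring function is a hypothetical stand-in, not the authors' model:

    import numpy as np

    def softmax(x):
        e = np.exp(x - np.max(x))
        return e / e.sum()

    def structural_attention(node_states, query):
        # node_states: (n, d) representations collected around a node,
        # e.g. from bottom-up and top-down passes; query: learned (d,)
        # vector scoring how salient each representation is.
        scores = node_states @ query
        alpha = softmax(scores)
        return alpha @ node_states, alpha   # attended summary and weights

    rng = np.random.default_rng(0)
    states = rng.normal(size=(3, 8))
    q = rng.normal(size=8)
    summary, weights = structural_attention(states, q)
    print(weights.round(3), summary.shape)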
Du Yong Kim
Comments: 10 pages, 2 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In most multi-object tracking algorithms, tuning of model parameters is of
critical importance for reliable performance. In particular, we are interested
in designing a robust tracking algorithm that is able to handle an unknown false
measurement rate. The proposed algorithm is based on the coupling of two random
finite set filters that share tracking parameters. Performance evaluation with
visual surveillance and cell microscopy images demonstrates the effectiveness
of the tracking algorithm for real-world scenarios.
Changzhe Jiao, Alina Zare
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Signature-based detectors for hyperspectral target detection rely on knowing
the specific target signature in advance. However, target signatures are often
difficult or impossible to obtain. Furthermore, common methods for obtaining
target signatures, such as from laboratory measurements or manual selection
from an image scene, usually do not capture the discriminative features of
the target class. In this paper, an approach for estimating a discriminative target
signature from imprecise labels is presented. The proposed approach maximizes
the response of the hybrid sub-pixel detector within a multiple instance
learning framework and estimates a set of discriminative target signatures.
After learning target signatures, any signature based detector can then be
applied on test data. Both simulated and real hyperspectral target detection
experiments are shown to illustrate the effectiveness of the method.
Caner Sahin, Rigas Kouskouridas, Tae-Kyun Kim
Subjects: Computer Vision and Pattern Recognition (cs.CV)
State-of-the-art techniques for 6D object pose recovery depend on
occlusion-free point clouds to accurately register objects in 3D space. To deal
with this shortcoming, we introduce a novel architecture called Iterative Hough
Forest with Histogram of Control Points that is capable of estimating the 6D
pose of occluded and cluttered objects given a candidate 2D bounding box. Our
Iterative Hough Forest (IHF) is learnt using parts extracted only from the
positive samples. These parts are represented with Histogram of Control Points
(HoCP), a “scale-variant” implicit volumetric description, which we derive from
recently introduced Implicit B-Splines (IBS). The rich discriminative
information provided by the scale-variant HoCP features is leveraged during
inference. An automatic variable size part extraction framework iteratively
refines the object’s initial pose that is roughly aligned due to the extraction
of coarsest parts, the ones occupying the largest area in image pixels. The
iterative refinement is accomplished based on finer (smaller) parts that are
represented with more discriminative control point descriptors by using our
Iterative Hough Forest. Experiments conducted on a publicly available dataset
show that our approach achieves better registration performance than the
state-of-the-art methods.
Mattia Rossi, Pascal Frossard
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Light field cameras can capture the 3D information in a scene with a single
shot. This special feature makes light field cameras very appealing for a
variety of applications: from the popular post-capture refocus, to depth
estimation and image-based rendering. However, light field cameras suffer by
design from strong limitations in their spatial resolution, which should
therefore be augmented by computational methods. On the one hand, off-the-shelf
single-frame and multi-frame super-resolution algorithms are not ideal for
light field data, as they do not consider its particular structure. On the
other hand, the few super-resolution algorithms explicitly tailored for light
field data exhibit significant limitations, such as the need to estimate an
explicit disparity map at each view. In this work we propose a new light field
super-resolution algorithm meant to address these limitations. We adopt an
approach akin to multi-frame super-resolution, where the complementary
information in the different light field views is used to augment the spatial
resolution of the whole light field. We show that coupling the multi-frame
approach with a graph regularizer that enforces the light field structure via
non-local self-similarities makes it possible to avoid the costly and challenging
disparity estimation step for all the views. Extensive experiments show that
the proposed algorithm compares favorably to the other state-of-the-art methods
for light field super-resolution, both in terms of PSNR and in terms of visual
quality. Moreover, unlike other light field super-resolution methods, the new
algorithm provides reconstructed light field views of uniform quality, an
important feature for any light field application.
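A toy 1-D numpy sketch of the kind of graph-regularised objective described above, minimising a data term ||Dx - y||^2 plus a graph Laplacian smoothness term by gradient descent; the operators and graph here are illustrative assumptions, not the paper's implementation:

    import numpy as np

    def downsample(x):              # toy decimation-by-2 operator D
        return x[::2]

    def upsample(r, n):             # adjoint of D: zero insertion to length n
        out = np.zeros(n)
        out[::2] = r
        return out

    def graph_laplacian(edges, n):  # unnormalised Laplacian of a similarity graph
        L = np.zeros((n, n))
        for i, j, w in edges:
            L[i, i] += w; L[j, j] += w
            L[i, j] -= w; L[j, i] -= w
        return L

    def sr_step(x, y, L, lam, step):
        # One gradient step on ||D x - y||^2 + lam * x^T L x, where the
        # Laplacian couples samples that are mutually similar (here a toy chain).
        grad = 2 * upsample(downsample(x) - y, x.size) + 2 * lam * (L @ x)
        return x - step * grad

    n = 8
    y = np.arange(n // 2, dtype=float)               # low-resolution observation
    L = graph_laplacian([(i, i + 1, 1.0) for i in range(n - 1)], n)
    x = upsample(y, n)                               # initial high-resolution guess
    for _ in range(200):
        x = sr_step(x, y, L, lam=0.1, step=0.1)
    print(x.round(2))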
Tony Lindeberg
Comments: 31 pages, 7 figures, submitted to Journal of Mathematical Imaging and Vision in October 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper presents a theory for discretizing the affine Gaussian scale-space
concept so that scale-space properties hold also for the discrete
implementation.
Two ways of discretizing spatial smoothing with affine Gaussian kernels are
presented: (i) by solving a semi-discretized affine diffusion equation, as derived
by necessity from the requirement of a semi-group structure over a continuum of
scale parameters parameterized by a family of spatial covariance matrices and
obeying non-creation of new structures from any finer to any coarser scale, as
formalized by the requirement of non-enhancement of local extrema; and (ii) by a
set of parameterized 3×3 kernels derived from an additional discretization
of the above theory along the scale direction and with the parameters of the
kernels having a direct interpretation in terms of the covariance matrix of the
composed discrete smoothing operation.
We show how convolutions with the first family of kernels can be implemented
in terms of a closed form expression for the Fourier transform and analyse how
a remaining degree of freedom in the theory can be exploited to ensure a
positive discretization and optionally also achieve higher-order discrete
approximation of the angular dependency of the shapes of the affine Gaussian
kernels.
We also show how discrete directional derivative approximations can be
efficiently implemented to approximate affine Gaussian derivatives as
constituting a canonical model for receptive fields over a purely spatial image
domain and with close relations to receptive fields in biological vision.
Changsoo Je, Kyuhyoung Choi, Sang Wook Lee
Comments: 8 pages, 5 figures. Updated version of a conference paper
Journal-ref: Proc. 30th Fall Semiannual Conference of Korea Information Science
Society, vol. 2, pp. 661-663, Seoul, Korea, October, 2003
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
In this paper, we present a novel method for rapid high-resolution range
sensing using a green-blue stripe pattern. We use green and blue for designing a
high-frequency stripe projection pattern. For accurate and reliable range
recovery, we identify the stripe patterns by our color-stripe segmentation and
unwrapping algorithms. The experimental result for a naked human face shows the
effectiveness of our method.
Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The recent work of Gatys et al., who characterized the style of an image by
the statistics of convolutional neural network filters, ignited a renewed
interest in the texture generation and image stylization problems. While their
image generation technique uses a slow optimization process, recently several
authors have proposed to learn generator neural networks that can produce
similar outputs in one quick forward pass. While generator networks are
promising, they are still inferior in visual quality and diversity compared to
generation-by-optimization. In this work, we advance them in two significant
ways. First, we introduce an instance normalization module to replace batch
normalization with significant improvements to the quality of image
stylization. Second, we improve diversity by introducing a new learning
formulation that encourages generators to sample unbiasedly from the Julesz
texture ensemble, which is the equivalence class of all images characterized by
certain filter responses. Together, these two improvements take feed forward
texture synthesis and image stylization much closer to the quality of
generation-via-optimization, while retaining the speed advantage.
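A minimal numpy sketch of the instance normalization module mentioned above: statistics are computed per sample and per channel rather than pooled over the batch as in batch normalization; shapes and parameter handling are illustrative:

    import numpy as np

    def instance_norm(x, gamma, beta, eps=1e-5):
        # x: (N, C, H, W) feature maps; each (sample, channel) map is
        # normalised by its own mean and variance.
        mean = x.mean(axis=(2, 3), keepdims=True)
        var = x.var(axis=(2, 3), keepdims=True)
        x_hat = (x - mean) / np.sqrt(var + eps)
        return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

    x = np.random.randn(4, 16, 32, 32)       # N=4 images, C=16 channels
    gamma, beta = np.ones(16), np.zeros(16)
    y = instance_norm(x, gamma, beta)
    print(y.mean(axis=(2, 3)).round(6).max())  # per-instance means are ~0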
Hamid Reza Shahdoosti
Comments: 9 pages; 2 figures, conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Among the existing fusion algorithms, the wavelet fusion method is the most
frequently discussed one in recent publications because the wavelet approach
preserves the spectral characteristics of the multispectral image better than
other methods. The Brovey transform is another popular fusion method, used for its
ability to preserve the spatial information of the PAN image. This study presents a
new fusion approach that integrates the advantages of both the Brovey (which
preserves a high degree of spatial information) and the wavelet (which
preserves a high degree of spectral information) techniques to reduce the
colour distortion of fusion results. Visual and statistical analyses show that
the proposed algorithm clearly improves the merging quality in terms of
correlation coefficient and UIQI, compared to fusion methods including IHS,
Brovey, PCA, HPF, discrete wavelet transform (DWT), and à-trous wavelet.
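For reference, the Brovey component the abstract builds on can be written in a few lines; each upsampled multispectral band is rescaled by the ratio of the PAN image to the sum of the MS bands. This is a sketch of the classical transform only, not the proposed hybrid method:

    import numpy as np

    def brovey_fusion(ms, pan, eps=1e-6):
        # ms: (bands, H, W) multispectral bands upsampled to PAN resolution
        # pan: (H, W) panchromatic band, co-registered with ms
        ratio = pan / (ms.sum(axis=0) + eps)
        return ms * ratio   # PAN spatial detail injected into each band

    ms = np.random.rand(3, 64, 64)    # toy multispectral bands
    pan = np.random.rand(64, 64)      # toy panchromatic band
    print(brovey_fusion(ms, pan).shape)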
Andrea Baraldi, Francesca Despini, Sergio Teggi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Multispectral (MS) image panchromatic (PAN) sharpening algorithms proposed to
the remote sensing community are ever increasing in number and variety. Their
aim is to sharpen a coarse spatial resolution MS image with a fine spatial
resolution PAN image acquired simultaneously by a spaceborne or airborne Earth
observation (EO) optical imaging sensor pair. Unfortunately, to date, no
standard evaluation procedure for MS image PAN sharpening outcome and process
is community agreed upon, in contrast with the Quality Assurance Framework for
Earth Observation (QA4EO) guidelines proposed by the intergovernmental Group on
Earth Observations (GEO). In general, process is easier to measure, but outcome is
more important. The original contribution of the present study is fourfold.
First, existing procedures for quantitative quality assessment (Q2A) of the
(sole) PAN sharpened MS product are critically reviewed. Their conceptual and
implementation drawbacks are highlighted to be overcome for quality
improvement. Second, a novel (to the best of these authors’ knowledge, the
first) protocol for Q2A of MS image PAN sharpening product and process is
designed, implemented and validated by independent means. Third, within this
protocol, an innovative categorization of spectral and spatial image quality
indicators and metrics is presented. Fourth, according to this new taxonomy, an
original third-order isotropic multi-scale gray-level co-occurrence matrix
(TIMS-GLCM) calculator and a TIMS-GLCM texture feature extractor are proposed
to replace popular second-order GLCMs.
Andrea Baraldi, João V. B. Soares
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In recent years, two sets of planar (two-dimensional, 2D) shape attributes,
provided with an intuitive physical meaning, were proposed to the remote
sensing community by, respectively, Nagao & Matsuyama and Shackelford & Davis
in their seminal works on the increasingly popular geographic object-based
image analysis (GEOBIA) paradigm. These two known sets of simple geometric
features were selected as initial conditions by the present research and
technological development software project, whose multi-objective goal was to
accomplish: (i) a minimally dependent and maximally informative design
(representation) of a general-purpose (application-independent) dictionary of
intuitive 2D shape terms and (ii) an effective and efficient implementation of
2D shape descriptors. To comply with the Quality Assurance Framework for Earth
Observation (QA4EO) guidelines, the proposed suite of geometric functions is
validated by means of a novel quantitative quality assurance (Q2A) policy,
centered on inter-feature dependence (causality) assessment. This multivariate
feature validation strategy is alternative to traditional classification
accuracy estimation, which is inherently case-specific, and cross-correlation
estimation, because statistical cross-correlation does not imply causation. The
project deliverable is a validated general-purpose software suite of seven
off-the-shelf (ready-for-use) 2D shape descriptors, provided with an intuitive
physical meaning. Therefore, it is eligible for use in (GE)OBIA systems in
operating mode, expected to mimic human reasoning based on a convergence of
evidence approach.
Andrea Baraldi, Dirk Tiede, Stefan Lang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Capable of automated near-real-time superpixel detection and quality
assessment in an uncalibrated monitor-typical red-green-blue (RGB) image,
depicted in either true or false colors, an original low-level computer vision
(CV) lightweight computer program, called RGB Image Automatic Mapper (RGBIAM),
is designed and implemented. Constrained by the Calibration Validation (CalVal)
requirements of the Quality Assurance Framework for Earth Observation (QA4EO)
guidelines, RGBIAM requires as mandatory an uncalibrated RGB image
pre-processing first stage, consisting of an automated statistical model-based
color constancy algorithm. The RGBIAM hybrid inference pipeline comprises: (I)
a direct quantitative to nominal (QN) RGB variable transform, where RGB pixel
values are mapped onto a prior dictionary of color names, equivalent to a
static polyhedralization of the RGB cube. Prior color naming is the deductive
counterpart of inductive vector quantization (VQ), whose typical VQ error
function to minimize is a root mean square error (RMSE). In the output
multi-level color map domain, superpixels are automatically detected in linear time
as connected sets of pixels featuring the same color label. (II) An inverse
nominal to quantitative (NQ) RGB variable transform, where a superpixelwise
constant RGB image approximation is generated in linear time to assess a VQ
error image. The hybrid direct and inverse RGBIAM QNQ transform is: (i)
general-purpose, data- and application-independent. (ii) Automated, i.e., it
requires no user-machine interaction. (iii) Near-real-time, with a computational complexity
increasing linearly with the image size. (iv) Implemented in tile streaming
mode, to cope with massive images. Collected outcome and process quality
indicators, including degree of automation, computational efficiency, VQ rate
and VQ error, are consistent with theoretical expectations.
Andrea Baraldi, Michael Laurence Humber, Dirk Tiede, Stefan Lang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The European Space Agency (ESA) defines an Earth Observation (EO) Level 2
product as a multispectral (MS) image corrected for geometric, atmospheric,
adjacency and topographic effects, stacked with its scene classification map
(SCM) whose legend includes quality layers such as cloud and cloud-shadow. No
ESA EO Level 2 product has ever been systematically generated at the ground
segment. To contribute toward filling an information gap from EO big sensory
data to the ESA EO Level 2 product, a Stage 4 validation (Val) of an
off-the-shelf Satellite Image Automatic Mapper (SIAM) lightweight computer program for
prior-knowledge-based MS color naming was conducted by independent means. A
time-series of annual Web Enabled Landsat Data (WELD) image composites of the
conterminous U.S. (CONUS) was selected as the input dataset. The annual SIAM-WELD
maps of the CONUS were validated in comparison with the U.S. National Land
Cover Data (NLCD) 2006 map. These test and reference maps share the same
spatial resolution and spatial extent, but their map legends are not the same
and must be harmonized. For the sake of readability this paper is split into
two. The previous Part 1 Theory provided the multidisciplinary background of a
priori color naming. The present Part 2 Validation presents and discusses Stage
4 Val results collected from the test SIAM WELD map time series and the
reference NLCD map by an original protocol for wall-to-wall thematic map
quality assessment without sampling, where the test and reference map legends
can differ in agreement with the Part 1. Conclusions are that the SIAM-WELD
maps instantiate a Level 2 SCM product whose legend is the FAO Land Cover
Classification System (LCCS) taxonomy at the Dichotomous Phase (DP) Level 1
vegetation/nonvegetation, Level 2 terrestrial/aquatic or superior LCCS level.
Andrea Baraldi, Michael Laurence Humber, Dirk Tiede, Stefan Lang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The European Space Agency (ESA) defines an Earth Observation (EO) Level 2
product as a multispectral (MS) image corrected for geometric, atmospheric,
adjacency and topographic effects, stacked with its scene classification map
(SCM), whose legend includes quality layers such as cloud and cloud-shadow. No
ESA EO Level 2 product has ever been systematically generated at the ground
segment. To contribute toward filling an information gap from EO big data to
the ESA EO Level 2 product, an original Stage 4 validation (Val) of the
Satellite Image Automatic Mapper (SIAM) lightweight computer program was
conducted by independent means on an annual Web-Enabled Landsat Data (WELD)
image composite time-series of the conterminous U.S. The core of SIAM is a
one-pass prior-knowledge-based decision tree for MS reflectance space
hyperpolyhedralization into static color names, presented in the literature in
recent years. For the sake of readability this paper is split into two. The
present Part 1 Theory provides the multidisciplinary background of a priori
color naming in cognitive science, from linguistics to computer vision. To cope
with dictionaries of MS color names and land cover class names that do not
coincide and must be harmonized, an original hybrid guideline is proposed to
identify a categorical variable pair relationship. An original quantitative
measure of categorical variable pair association is also proposed. The
subsequent Part 2 Validation discusses Stage 4 Val results collected by an
original protocol for wall-to-wall thematic map quality assessment without
sampling where the test and reference map legends can differ. Conclusions are
that the SIAM-WELD maps instantiate a Level 2 SCM product whose legend is the 4
class taxonomy of the FAO Land Cover Classification System at the Dichotomous
Phase Level 1 vegetation/nonvegetation and Level 2 terrestrial/aquatic.
Yiren Zhou, Sibo Song, Ngai-Man Cheung
Comments: 5 pages, 8 figures, ICASSP 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Image blur and image noise are common distortions during image acquisition.
In this paper, we systematically study the effect of image distortions on the
deep neural network (DNN) image classifiers. First, we examine the DNN
classifier performance under four types of distortions. Second, we propose two
approaches to alleviate the effect of image distortion: re-training and
fine-tuning with noisy images. Our results suggest that, under certain
conditions, fine-tuning with noisy images can alleviate much of the effect of
distorted inputs, and is more practical than re-training.
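A small sketch of how a fine-tuning set with the two distortion families (Gaussian blur and Gaussian noise) could be generated; parameter values are illustrative and not taken from the paper:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def distort(image, noise_sigma=0.0, blur_sigma=0.0, rng=None):
        # image: array with values in [0, 1]; returns a blurred and/or
        # noise-corrupted copy usable as a fine-tuning training example.
        rng = rng or np.random.default_rng()
        out = image
        if blur_sigma > 0:
            out = gaussian_filter(out, sigma=blur_sigma)
        if noise_sigma > 0:
            out = out + rng.normal(0.0, noise_sigma, size=out.shape)
        return np.clip(out, 0.0, 1.0)

    # build a mixed distorted set for fine-tuning an existing classifier
    rng = np.random.default_rng(0)
    clean = rng.random((8, 32, 32))
    finetune_set = [distort(im, noise_sigma=0.05, blur_sigma=1.0, rng=rng)
                    for im in clean]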
Nannan Wang, Xinbo Gao, Jie Li
Comments: A Submission to IEEE Transactions on Neural Networks and Learning Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Face sketch synthesis plays an important role in both digital entertainment
and law enforcement. It generally consists of two parts: neighbor selection and
reconstruction weight representation. State-of-the-art face sketch synthesis
methods perform neighbor selection online in a data-driven manner by K nearest
neighbor (K-NN) searching. However, the online search increases the time
consumed by synthesis. Moreover, since these methods need to traverse
the whole training dataset for neighbor selection, the computational complexity
increases with the scale of the training database and hence these methods have
limited scalability. In addition, state-of-the-art methods consider that all
selected neighbors contribute equally to the reconstruction weight computation
process, while the distinct similarities between the test patch and these
neighbors are neglected. In this paper, we employ offline random sampling in
place of online K-NN search to improve the synthesis efficiency. A locality
constraint is introduced to model the distinct correlations between the test
patch and randomly sampled patches. Extensive experiments on public face sketch
databases demonstrate the superiority of the proposed method in comparison to
state-of-the-art methods, in terms of both synthesis quality and time
consumption. The proposed method could be extended to other heterogeneous face
image transformation problems such as face hallucination. We will release the
source codes of our proposed methods and the evaluation metrics for future
study online: this http URL
Amir Sadeghian, Alexandre Alahi, Silvio Savarese
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a multi-cue metric learning framework to tackle the popular yet
unsolved Multi-Object Tracking (MOT) problem. One of the key challenges of
tracking methods is to effectively compute a similarity score that models
multiple cues from the past such as object appearance, motion, or even
interactions. This is particularly challenging when objects get occluded or
share similar appearance properties with surrounding objects. To address this
challenge, we cast the problem as a metric learning task that jointly reasons
on multiple cues across time. Our framework learns to encode long-term temporal
dependencies across multiple cues with a hierarchical Recurrent Neural Network.
We demonstrate the strength of our approach by tracking multiple objects using
their appearance, motion, and interactions. Our method outperforms previous
works by a large margin on multiple publicly available datasets including the
challenging MOT benchmark.
Charika De Alvis, Lionel Ott, Fabio Ramos
Comments: In International Conference On Intelligent Robots and Systems, 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Robots typically possess sensors of different modalities, such as colour
cameras, inertial measurement units, and 3D laser scanners. Often, solving a
particular problem becomes easier when more than one modality is used. However,
while there are undeniable benefits to combining sensors of different modalities,
the process tends to be complicated. Segmenting scenes observed by the robot
into a discrete set of classes is a central requirement for autonomy as
understanding the scene is the first step to reason about future situations.
Scene segmentation is commonly performed using either image data or 3D point
cloud data. In computer vision many successful methods for scene segmentation
are based on conditional random fields (CRF) where the maximum a posteriori
(MAP) solution to the segmentation can be obtained by inference. In this paper
we devise a new CRF inference method for scene segmentation that incorporates
global constraints, enforcing that sets of nodes are assigned the same class
label. To do this efficiently, the CRF is formulated as a relaxed quadratic
program whose MAP solution is found using a gradient-based optimisation
approach. The proposed method is evaluated on images and 3D point cloud data
gathered in urban environments where image data provides the appearance
features needed by the CRF, while the 3D point cloud data provides global
spatial constraints over sets of nodes. Comparisons with belief propagation,
conventional quadratic programming relaxation, and higher-order potential CRFs
show the benefits of the proposed method.
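A toy numpy sketch of MAP inference through a relaxed quadratic program with a soft "same label within a node set" term, solved by projected gradient descent; the energy and penalty below are simplified assumptions, not the authors' exact formulation:

    import numpy as np

    def project_simplex(v):
        # Euclidean projection of a vector onto the probability simplex.
        u = np.sort(v)[::-1]
        css = np.cumsum(u) - 1.0
        rho = np.nonzero(u - css / np.arange(1, v.size + 1) > 0)[0][-1]
        return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

    def relaxed_crf_map(unary, edges, groups, mu, iters=200, step=0.05):
        # Minimise sum_i <unary_i, q_i> + sum_(i,j,w) w*||q_i - q_j||^2
        #          + mu * sum_g sum_{i in g} ||q_i - mean_g||^2
        # over per-node label distributions q_i on the simplex.
        n, k = unary.shape
        q = np.full((n, k), 1.0 / k)
        for _ in range(iters):
            grad = unary.copy()
            for i, j, w in edges:
                d = 2.0 * w * (q[i] - q[j])
                grad[i] += d
                grad[j] -= d
            for g in groups:                      # soft global constraint
                grad[g] += 2.0 * mu * (q[g] - q[g].mean(axis=0))
            q = np.array([project_simplex(row) for row in q - step * grad])
        return q.argmax(axis=1)

    unary = np.array([[0.2, 1.0], [0.3, 0.9], [1.0, 0.1], [0.9, 0.2]])  # label costs
    edges = [(0, 1, 1.0), (2, 3, 1.0)]
    print(relaxed_crf_map(unary, edges, groups=[[0, 1], [2, 3]], mu=1.0))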
Zeshan Hussain, Tariq Patanam, Hardie Cate
Comments: 7 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we introduce a framework for classifying images according to
high-level sentiment. We subdivide the task into three primary problems:
emotion classification on faces, human pose estimation, and 3D estimation and
clustering of groups of people. We introduce novel algorithms for matching body
parts to a common individual and clustering people in images based on physical
location and orientation. Our results outperform several baseline approaches.
Caner Gacav, Burak Benligiray, Cihan Topal
Comments: International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Facial expression recognition methods use a combination of geometric and
appearance-based features. Spatial features are derived from displacements of
facial landmarks, and carry geometric information. These features are either
selected based on prior knowledge, or dimension-reduced from a large pool. In
this study, we produce a large number of potential spatial features using two
combinations of facial landmarks. Among these, we search for a descriptive
subset of features using sequential forward selection. The chosen feature
subset is used to classify facial expressions in the extended Cohn-Kanade
dataset (CK+), and delivers 88.7% recognition accuracy without using any
appearance-based features.
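A generic sketch of the sequential forward selection loop; the scoring criterion here is a hypothetical stand-in, whereas the paper scores subsets by expression recognition performance:

    import numpy as np

    def sequential_forward_selection(X, y, score_fn, n_select):
        # Greedily add the feature whose inclusion maximises score_fn(X_subset, y).
        selected, remaining = [], list(range(X.shape[1]))
        while remaining and len(selected) < n_select:
            best_f, best_s = None, -np.inf
            for f in remaining:
                s = score_fn(X[:, selected + [f]], y)
                if s > best_s:
                    best_f, best_s = f, s
            selected.append(best_f)
            remaining.remove(best_f)
        return selected

    def toy_score(Xs, y):        # toy criterion: distance between class centroids
        return np.linalg.norm(Xs[y == 0].mean(0) - Xs[y == 1].mean(0))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))
    y = (X[:, 3] + X[:, 7] > 0).astype(int)
    print(sequential_forward_selection(X, y, toy_score, n_select=2))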
Hardie Cate, Fahim Dalvi, Zeshan Hussain
Comments: 8 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We use CNNs to build a system that both classifies images of faces based on a
variety of different facial attributes and generates new faces given a set of
desired facial characteristics. After introducing the problem and providing
context in the first section, we discuss recent work related to image
generation in Section 2. In Section 3, we describe the methods used to
fine-tune our CNN and generate new images using a novel approach inspired by a
Gaussian mixture model. In Section 4, we discuss our working dataset and
describe our preprocessing steps and handling of facial attributes. Finally, in
Sections 5, 6 and 7, we explain our experiments and results and conclude in the
following section. Our classification system has 82% test accuracy.
Furthermore, our generation pipeline successfully creates well-formed faces.
Hardie Cate, Fahim Dalvi, Zeshan Hussain
Comments: 5 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Devices like the Myo armband available in the market today enable us to
collect data about the position of a user’s hands and fingers over time. We can
use these technologies for sign language translation since each sign is roughly
a combination of gestures across time. In this work, we utilize a dataset
collected by a group at the University of South Wales, which contains
parameters, such as hand position, hand rotation, and finger bend, for 95
unique signs. For each input stream representing a sign, we predict which sign
class this stream falls into. We begin by implementing baseline SVM and
logistic regression models, which perform reasonably well on high quality data.
Lower quality data requires a more sophisticated approach, so we explore
different methods in temporal classification, including long short-term memory
architectures and sequential pattern mining methods.
Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao
Comments: 10 pages, 10 figures. Submitted to CVPR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Deep Convolutional Neural Networks (DCNNs) are capable of learning
unprecedentedly effective image representations. However, their ability in
handling significant local and global image rotations remains limited. In this
paper, we propose Active Rotating Filters (ARFs) that actively rotate during
convolution and produce feature maps with location and orientation explicitly
encoded. An ARF acts as a virtual filter bank containing the filter itself and
its multiple unmaterialised rotated versions. During back-propagation, an ARF
is collectively updated using errors from all its rotated versions. DCNNs using
ARFs, referred to as Oriented Response Networks (ORNs), can produce
within-class rotation-invariant deep features while maintaining inter-class
discrimination for classification tasks. The oriented response produced by ORNs
can also be used for image and object orientation estimation tasks. Over
multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we
consistently observe that replacing regular filters with the proposed ARFs
leads to significant reduction in the number of network parameters and
improvement in classification performance. We report the best results on
several commonly used benchmarks.
Zelun Luo, Boya Peng, De-An Huang, Alexandre Alahi, Li Fei-Fei
Comments: Computer Vision and Pattern Recognition (CVPR) 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present an unsupervised representation learning approach that compactly
encodes the motion dependencies in videos. Given a pair of images from a video
clip, our framework learns to predict the long-term 3D motions. To reduce the
complexity of the learning framework, we propose to describe the motion as a
sequence of atomic 3D flows computed with RGB-D modality. We use a Recurrent
Neural Network based Encoder-Decoder framework to predict these sequences of
flows. We argue that in order for the decoder to reconstruct these sequences,
the encoder must learn a robust video representation that captures long-term
motion dependencies and spatial-temporal relations. We demonstrate the
effectiveness of our learned temporal representations on activity
classification across multiple modalities and datasets such as NTU RGB+D and
MSR Daily Activity 3D. Our framework is generic to any input modality, i.e.,
RGB, Depth, and RGB-D videos.
Pichao Wang, Wanqing Li, Song Liu, Zhimin Gao, Chang Tang, Philip Ogunbona
Comments: arXiv admin note: text overlap with arXiv:1608.06338
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper proposes three simple, compact yet effective representations of
depth sequences, referred to respectively as Dynamic Depth Images (DDI),
Dynamic Depth Normal Images (DDNI) and Dynamic Depth Motion Normal Images
(DDMNI). These dynamic images are constructed from a sequence of depth maps
using bidirectional rank pooling to effectively capture the spatial-temporal
information. Such image-based representations enable us to fine-tune the
existing ConvNets models trained on image data for classification of depth
sequences, without introducing a large number of parameters to learn. Upon the
proposed representations, a Convolutional Neural Network (ConvNet) based method is
developed for gesture recognition and evaluated on the Large-scale Isolated
Gesture Recognition at the ChaLearn Looking at People (LAP) challenge 2016. The
method achieved 55.57% classification accuracy and ranked 2nd place in this
challenge, but was very close to the best performance even though we only used
depth data.
George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan Tompson, Chris Bregler, Kevin Murphy
Comments: Paper describing an improved version of the G-RMI entry to the 2016 COCO keypoints challenge (this http URL)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a method for multi-person detection and 2-D keypoint localization
(human pose estimation) that achieves state-of-the-art results on the
challenging COCO keypoints task. It is a simple, yet powerful, top-down
approach consisting of two stages.
In the first stage, we predict the location and scale of boxes which are
likely to contain people; for this we use the Faster RCNN detector with an
Inception-ResNet architecture. In the second stage, we estimate the keypoints
of the person potentially contained in each proposed bounding box. For each
keypoint type we predict dense heatmaps and offsets using a fully convolutional
ResNet. To combine these outputs we introduce a novel aggregation procedure to
obtain highly localized keypoint predictions. We also use a novel form of
keypoint-based Non-Maximum-Suppression (NMS), instead of the cruder box-level
NMS, and a novel form of keypoint-based confidence score estimation, instead of
box-level scoring.
Our final system achieves an average precision of 0.636 on the COCO test-dev set
and 0.628 on the test-standard set, outperforming the CMU-Pose winner of the 2016
COCO keypoints challenge. Further, by using additional labeled data we obtain
an even higher average precision of 0.668 on the test-dev set and 0.658 on the
test-standard set, thus achieving a roughly 10% improvement over the previous
best performing method on the same challenge.
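A hedged sketch of keypoint-level NMS, suppressing person candidates whose poses are near-duplicates under object keypoint similarity (OKS) rather than suppressing overlapping boxes; the constants and the simplified per-keypoint fall-off are assumptions, not the paper's exact settings:

    import numpy as np

    def oks(kp_a, kp_b, scale, kappa=0.1):
        # similarity between two (K, 2) keypoint sets for a person of given scale
        d2 = np.sum((kp_a - kp_b) ** 2, axis=1)
        return np.mean(np.exp(-d2 / (2.0 * (scale * kappa) ** 2)))

    def keypoint_nms(detections, oks_thresh=0.5):
        # detections: list of dicts with 'score', 'keypoints' (K,2), 'scale';
        # keep a candidate only if its pose is not too similar to a kept one.
        kept = []
        for det in sorted(detections, key=lambda d: d['score'], reverse=True):
            if all(oks(det['keypoints'], k['keypoints'], k['scale']) < oks_thresh
                   for k in kept):
                kept.append(det)
        return kept

    rng = np.random.default_rng(0)
    pose = rng.uniform(0, 100, size=(17, 2))
    dets = [{'score': 0.9, 'keypoints': pose, 'scale': 50.0},
            {'score': 0.7, 'keypoints': pose + 1.0, 'scale': 50.0},   # duplicate
            {'score': 0.6, 'keypoints': pose + 60.0, 'scale': 50.0}]  # distinct
    print(len(keypoint_nms(dets)))   # 2: the near-duplicate pose is suppressed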
Hao Sun, Alina Zare
Subjects: Computer Vision and Pattern Recognition (cs.CV)
A map-guided superpixel segmentation method for hyperspectral imagery is
developed and introduced. The proposed approach develops a
hyperspectral-appropriate version of the SLIC superpixel segmentation
algorithm, leverages map information to guide segmentation, and incorporates
the semi-supervised Partial Membership Latent Dirichlet Allocation (sPM-LDA) to
obtain a final superpixel segmentation. The proposed method is applied to two
real hyperspectral data sets and quantitative cluster validity metrics indicate
that the proposed approach outperforms existing hyperspectral superpixel
segmentation methods.
Bernhard Schmitzer, Benedikt Wirth
Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
We propose a unifying framework for generalising the Wasserstein-1 metric to
a discrepancy measure between nonnegative measures of different mass. This
generalization inherits the convexity and computational efficiency from the
Wasserstein-1 metric, and it includes several previous approaches from the
literature as special cases. For various specific instances of the generalized
Wasserstein-1 metric we furthermore demonstrate their usefulness in
applications by numerical experiments.
Valentina Franzoni
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Probability (math.PR)
One of the main problems that emerges in the classic approach to semantics is
the difficulty in acquisition and maintenance of ontologies and semantic
annotations. On the other hand, the Internet explosion and the massive
diffusion of mobile smart devices have led to the creation of a worldwide system
whose information is checked and fed daily by the contributions of millions
of users who interact in a collaborative way. Search engines, continually
exploring the Web, are a natural source of information on which to base a
modern approach to semantic annotation. A promising idea is that it is possible
to generalize the semantic similarity, under the assumption that semantically
similar terms behave similarly, and define collaborative proximity measures
based on the indexing information returned by search engines. The PMING
Distance is a proximity measure used in data mining and information retrieval,
whose collaborative information expresses the degree of relationship between two
terms, using only the number of documents returned as the result of a query on a
search engine. In this work, the PMING Distance is updated, providing a novel
formal algebraic definition, which corrects previous works. The novel point of
view underlines the features of the PMING to be a locally normalized linear
combination of the Pointwise Mutual Information and Normalized Google Distance.
The analyzed measure dynamically reflects the collaborative changes made to web
resources.
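The two ingredients named above can be computed directly from search engine hit counts; the combination shown at the end is a hypothetical sketch in the spirit of PMING, not its exact normalisation:

    import math

    def pmi(fx, fy, fxy, N):
        # pointwise mutual information from hit counts f(x), f(y), f(x,y)
        # and an estimate N of the number of indexed documents
        return math.log((fxy * N) / (fx * fy))

    def ngd(fx, fy, fxy, N):
        # Normalized Google Distance from the same counts
        lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
        return (max(lx, ly) - lxy) / (math.log(N) - min(lx, ly))

    def pming(fx, fy, fxy, N, mu=0.5):
        # hypothetical locally normalised linear combination of PMI and NGD
        return mu * (1.0 - pmi(fx, fy, fxy, N) / math.log(N)) \
               + (1.0 - mu) * ngd(fx, fy, fxy, N)

    # toy counts as if returned by a search engine for two related terms
    print(pming(fx=120000, fy=80000, fxy=30000, N=10**10))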
Thomas E. Portegys
Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI)
Artificial intelligence research to a great degree focuses on the brain and
behaviors that the brain generates. But the brain, an extremely complex
structure resulting from millions of years of evolution, can be viewed as a
solution to problems posed by an environment existing in space and time. The
environment generates signals that produce sensory events within an organism.
Building an internal spatial and temporal model of the environment allows an
organism to navigate and manipulate the environment. Higher intelligence might
be the ability to process information coming from a larger extent of
space-time. In keeping with nature’s penchant for extending rather than
replacing, the purpose of the mammalian neocortex might then be to record
events from distant reaches of space and time and render them, as though yet
near and present, to the older, deeper brain whose instinctual roles have
changed little over eons. Here this notion is embodied in a model called
morphognosis (morpho = shape and gnosis = knowledge). Its basic structure is a
pyramid of event recordings called a morphognostic. At the apex of the pyramid
are the most recent and nearby events. Receding from the apex are less recent
and possibly more distant events. A morphognostic can thus be viewed as a
structure of progressively larger chunks of space-time knowledge. A set of
morphognostics forms long-term memories that are learned by exposure to the
environment. A cellular automaton is used as the platform to investigate the
morphognosis model, using a simulated organism that learns and navigates its
world in search of food, as well as the game of Pong.
Mehmet E. Basbug, Barbara E. Engelhardt
Comments: Under review at AISTATS 2017
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
We present a general framework, the coupled compound Poisson factorization
(CCPF), to capture the missing-data mechanism in extremely sparse data sets by
coupling a hierarchical Poisson factorization with an arbitrary data-generating
model. We derive a stochastic variational inference algorithm for the resulting
model and, as examples of our framework, implement three different
data-generating models—a mixture model, linear regression, and factor
analysis—to robustly model non-random missing data in the context of
clustering, prediction, and matrix factorization. In all three cases, we test
our framework against models that ignore the missing-data mechanism on large
scale studies with non-random missing data, and we show that explicitly
modeling the missing-data mechanism substantially improves the quality of the
results, as measured using data log likelihood on a held-out test set.
Yadollah Yaghoobzadeh, Hinrich Schütze
Comments: 13 pages, accepted in EACL 2016
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Entities are essential elements of natural language. In this paper, we
present methods for learning multi-level representations of entities on three
complementary levels: character (character patterns in entity names extracted,
e.g., by neural networks), word (embeddings of words in entity names) and
entity (entity embeddings). We investigate state-of-the-art learning methods on
each level and find large differences, e.g., for deep learning models,
traditional ngram features and the subword model of fasttext (Bojanowski et
al., 2016) on the character level; for word2vec (Mikolov et al., 2013) on the
word level; and for the order-aware model wang2vec (Ling et al., 2015a) on the
entity level. We confirm experimentally that each level of representation
contributes complementary information and a joint representation of all three
levels improves the existing embedding based baseline for fine-grained entity
typing by a large margin. Additionally, we show that adding information from
entity descriptions further improves multi-level representations of entities.
Thanh Vu, Alistair Willis, Udo Kruschwitz, Dawei Song
Comments: 4 pages, 2 figures, the 2017 ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR)
Subjects: Information Retrieval (cs.IR)
Recent research has shown the usefulness of using collective user interaction
data (e.g., query logs) to recommend query modification suggestions for
Intranet search. However, most of the query suggestion approaches for Intranet
search follow a “one size fits all” strategy, whereby different users who
submit an identical query would get the same query suggestion list. This is
problematic, as even with the same query, different users may have different
topics of interest, which may change over time in response to the user’s
interaction with the system. We address the problem by proposing a personalised
query suggestion framework for Intranet search. For each search session, we
construct two temporal user profiles: a click user profile using the user’s
clicked documents and a query user profile using the user’s submitted queries.
We then use the two profiles to re-rank the non-personalised query suggestion
list returned by a state-of-the-art query suggestion method for Intranet
search. Experimental results on a large-scale query logs collection show that
our personalised framework significantly improves the quality of suggested
queries.
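A minimal sketch of the re-ranking step, scoring each candidate suggestion against the two temporal user profiles represented as term vectors; the profile construction and mixing weight are illustrative assumptions, not the paper's exact scoring:

    import numpy as np

    def cosine(a, b, eps=1e-12):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

    def rerank_suggestions(suggestions, click_profile, query_profile, alpha=0.5):
        # suggestions: list of (query_string, term_vector) pairs from a
        # non-personalised suggester; re-rank by profile similarity.
        def score(vec):
            return (alpha * cosine(vec, click_profile)
                    + (1 - alpha) * cosine(vec, query_profile))
        return sorted(suggestions, key=lambda s: score(s[1]), reverse=True)

    # toy 4-term vocabulary
    click_profile = np.array([1.0, 0.0, 2.0, 0.0])
    query_profile = np.array([0.5, 0.0, 1.0, 0.0])
    suggestions = [("printer setup", np.array([0.0, 1.0, 0.0, 1.0])),
                   ("holiday policy", np.array([1.0, 0.0, 1.0, 0.0]))]
    print([q for q, _ in rerank_suggestions(suggestions, click_profile, query_profile)])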
Roberto Pagano, Massimo Quadrana, Mehdi Elahi, Paolo Cremonesi
Subjects: Information Retrieval (cs.IR)
One of the main challenges in Recommender Systems (RSs) is the New User
problem which happens when the system has to generate personalised
recommendations for a new user whom the system has no information about. Active
Learning tries to solve this problem by acquiring user preference data with the
maximum quality and with the minimum acquisition cost. Although there are a
variety of works in the active learning for RSs research area, almost all of them
have focused only on the single-domain recommendation scenario. However,
several real-world RSs operate in the cross-domain scenario, where the system
generates recommendations in the target domain by exploiting user preferences
in both the target and auxiliary domains. In such a scenario, the performance
of active learning strategies can be significantly influenced and typical
active learning strategies may fail to perform properly. In this paper, we
address this limitation, by evaluating active learning strategies in a novel
evaluation framework, explicitly suited for the cross-domain recommendation
scenario. We show that having access to the preferences of the users in the
auxiliary domain may have a huge impact on the performance of active learning
strategies w.r.t. the classical, single-domain scenario.
Dominik Wurzer, Yumeng Qin
Subjects: Information Retrieval (cs.IR)
Newswire and Social Media are the major sources of information in our time.
While the topical demographics of Western media have been the subject of studies
in the past, less is known about Chinese media. In this paper, we apply event
detection and tracking technology to examine the information overlap and
differences between Chinese and Western traditional and social media.
Our experiments reveal a biased interest of China towards the West, which
becomes particularly apparent when comparing the interest in celebrities.
Valentina Franzoni
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Probability (math.PR)
One of the main problems that emerges in the classic approach to semantics is
the difficulty in acquisition and maintenance of ontologies and semantic
annotations. On the other hand, the Internet explosion and the massive
diffusion of mobile smart devices have led to the creation of a worldwide system
whose information is checked and fed daily by the contributions of millions
of users who interact in a collaborative way. Search engines, continually
exploring the Web, are a natural source of information on which to base a
modern approach to semantic annotation. A promising idea is that it is possible
to generalize the semantic similarity, under the assumption that semantically
similar terms behave similarly, and define collaborative proximity measures
based on the indexing information returned by search engines. The PMING
Distance is a proximity measure used in data mining and information retrieval,
whose collaborative information expresses the degree of relationship between two
terms, using only the number of documents returned as the result of a query on a
search engine. In this work, the PMING Distance is updated, providing a novel
formal algebraic definition, which corrects previous works. The novel point of
view underlines the features of the PMING to be a locally normalized linear
combination of the Pointwise Mutual Information and Normalized Google Distance.
The analyzed measure dynamically reflects the collaborative changes made to web
resources.
Jun Wang, Qiang Tang
Comments: 14 pages
Subjects: Cryptography and Security (cs.CR); Information Retrieval (cs.IR)
Privacy issues of recommender systems have become a hot topic for society,
as such systems are appearing in every corner of our life. In contrast to the
fact that many secure multi-party computation protocols have been proposed to
prevent information leakage in the process of recommendation computation, very
little has been done to restrict the information leakage from the
recommendation results. In this paper, we apply the differential privacy
concept to neighborhood-based recommendation methods (NBMs) under a
probabilistic framework. We first present a solution that directly calibrates
Laplace noise into the training process to find the maximum a posteriori
similarity parameters in a differentially private way. Then we connect differential
privacy to NBMs by exploiting a recent observation that sampling from the
scaled posterior distribution of a Bayesian model results in provably
differentially private systems. Our experiments show that both solutions allow
promising accuracy with a modest privacy budget, and the second solution yields
better accuracy if the sampling asymptotically converges. We also compare our
solutions to the recent differentially private matrix factorization (MF)
recommender systems, and show that our solutions achieve better accuracy when
the privacy budget is reasonably small. This is an interesting result because
MF systems often offer better accuracy when differential privacy is not
applied.
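For reference, the Laplace calibration underlying the first solution looks like this in general form; this is a generic sketch of the mechanism, not the paper's training procedure:

    import numpy as np

    def laplace_mechanism(value, sensitivity, epsilon, rng=None):
        # Release a statistic with epsilon-differential privacy by adding
        # Laplace noise whose scale is sensitivity / epsilon.
        rng = rng or np.random.default_rng()
        return value + rng.laplace(0.0, sensitivity / epsilon, size=np.shape(value))

    rng = np.random.default_rng(0)
    similarities = np.array([0.8, 0.3, 0.5])   # toy item-item similarities
    private = laplace_mechanism(similarities, sensitivity=1.0, epsilon=0.5, rng=rng)
    print(private)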
Anca Dumitrache, Lora Aroyo, Chris Welty
Comments: In review for ACM Transactions on Interactive Intelligent Systems (TiiS) Special Issue on Human-Centered Machine Learning
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Cognitive computing systems require human labeled data for evaluation, and
often for training. The standard practice used in gathering this data minimizes
disagreement between annotators, and we have found this results in data that
fails to account for the ambiguity inherent in language. We have proposed the
CrowdTruth method for collecting ground truth through crowdsourcing, which
reconsiders the role of people in machine learning based on the observation
that disagreement between annotators provides a useful signal for phenomena
such as ambiguity in the text. We report on using this method to build an
annotated data set for medical relation extraction for the “cause” and “treat”
relations, and how this data performed in a supervised training experiment. We
demonstrate that by modeling ambiguity, labeled data gathered from crowd
workers can (1) reach the level of quality of domain experts for this task
while reducing the cost, and (2) provide better training data at scale than
distant supervision. We further propose and validate new weighted measures for
precision, recall, and F-measure, that account for ambiguity in both human and
machine performance on this task.
Wenpeng Yin, Hinrich Schütze
Comments: EACL’2017 long paper. arXiv admin note: substantial text overlap with arXiv:1604.06896
Subjects: Computation and Language (cs.CL)
This work studies comparatively two typical sentence matching tasks: textual
entailment (TE) and answer selection (AS), observing that weaker phrase
alignments are more critical in TE, while stronger phrase alignments deserve
more attention in AS. The key to reaching this observation lies in phrase
detection, phrase representation, phrase alignment, and more importantly how to
connect those aligned phrases of different matching degrees with the final
classifier. Prior work (i) has limitations in phrase generation and
representation, or (ii) conducts alignment at word and phrase levels by
handcrafted features or (iii) utilizes a single framework of alignment without
considering the characteristics of specific tasks, which limits the framework’s
effectiveness across tasks. We propose an architecture based on Gated Recurrent
Unit that supports (i) representation learning of phrases of arbitrary
granularity and (ii) task-specific attentive pooling of phrase alignments
between two sentences. Experimental results on TE and AS match our observation
and show the effectiveness of our approach.
Weinan Zhang, Ting Liu, Yifa Wang, Qingfu Zhu
Subjects: Computation and Language (cs.CL)
In this paper, we focus on the personalized response generation for
conversational systems. Based on the sequence to sequence learning, especially
the encoder-decoder framework, we propose a two-phase approach, namely
initialization then adaptation, to model the responding style of humans and then
generate personalized responses. For evaluation, we propose a novel human aided
method to evaluate the performance of the personalized response generation
models by online real-time conversation and offline human judgement. Moreover,
the lexical divergence of the responses generated by the 5 personalized models
indicates that the proposed two-phase approach achieves good results on
modeling the responding style of humans and generating personalized responses
for conversational systems.
Yadollah Yaghoobzadeh, Hinrich Schütze
Comments: 13 pages, accepted in EACL 2016
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Entities are essential elements of natural language. In this paper, we
present methods for learning multi-level representations of entities on three
complementary levels: character (character patterns in entity names extracted,
e.g., by neural networks), word (embeddings of words in entity names) and
entity (entity embeddings). We investigate state-of-the-art learning methods on
each level and find large differences, e.g., for deep learning models,
traditional ngram features and the subword model of fasttext (Bojanowski et
al., 2016) on the character level; for word2vec (Mikolov et al., 2013) on the
word level; and for the order-aware model wang2vec (Ling et al., 2015a) on the
entity level. We confirm experimentally that each level of representation
contributes complementary information and a joint representation of all three
levels improves the existing embedding based baseline for fine-grained entity
typing by a large margin. Additionally, we show that adding information from
entity descriptions further improves multi-level representations of entities.
Fan Xu, Mingwen Wang, Maoxi Li
Comments: 12
Journal-ref: International Journal on Natural Language Computing (IJNLC) Vol.
5, No.6, December 2016
Subjects: Computation and Language (cs.CL)
Identifying different varieties of the same language is more challenging than
identifying unrelated languages. In this paper, we propose an approach
to discriminate language varieties or dialects of Mandarin Chinese for
Mainland China, Hong Kong, Taiwan, Macao, Malaysia and Singapore, a.k.a. the
Greater China Region (GCR). When applied to the dialects identification of the
GCR, we find that the commonly used character-level or word-level uni-gram
feature is not very efficient since there exist several specific problems such
as the ambiguity and context-dependent characteristic of words in the dialects
of the GCR. To overcome these challenges, we use not only the general features
like character-level n-gram, but also many new word-level features, including
PMI-based and word alignment-based features. A series of evaluation results on
both the news and open-domain dataset from Wikipedia show the effectiveness of
the proposed approach.
Mohaddeseh Bastan, Shahram Khadivi, Mohammad Mehdi Homayounpour
Comments: 6 pages, Submitted in ICEE 2017
Subjects: Computation and Language (cs.CL)
Neural Machine Translation (NMT) is a new approach for Machine Translation
(MT), and due to its success, it has attracted the attention of many researchers
in the field. In this paper, we study NMT model on Persian-English language
pairs, to analyze the model and investigate the appropriateness of the model
for scarce-resourced scenarios, the situation that exists for Persian-centered
translation systems. We adjust the model for the Persian language and find the
best parameters and hyper-parameters for two tasks: translation and
transliteration. We also apply some preprocessing tasks to the Persian dataset,
which yields an increase of about one point in BLEU score. Also, we have
modified the loss function to enhance the word alignment of the model. This new
loss function yields a total improvement of 1.87 BLEU points in translation
quality.
Filippos Kokkinos, Alexandros Potamianos
Comments: Submitted to EACL2017 for review
Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
We introduce a tree-structured attention neural network for sentences and
small phrases and apply it to the problem of sentiment classification. Our
model expands the current recursive models by incorporating structural
information around a node of a syntactic tree using both bottom-up and top-down
information propagation. Also, the model utilizes structural attention to
identify the most salient representations during the construction of the
syntactic tree. To our knowledge, the proposed models achieve state-of-the-art
performance on the Stanford Sentiment Treebank dataset.
Valentina Franzoni
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Probability (math.PR)
One of the main problems that emerges in the classic approach to semantics is
the difficulty in acquisition and maintenance of ontologies and semantic
annotations. On the other hand, the Internet explosion and the massive
diffusion of mobile smart devices have led to the creation of a worldwide system
whose information is checked and fed daily by the contributions of millions
of users who interact in a collaborative way. Search engines, continually
exploring the Web, are a natural source of information on which to base a
modern approach to semantic annotation. A promising idea is that it is possible
to generalize the semantic similarity, under the assumption that semantically
similar terms behave similarly, and define collaborative proximity measures
based on the indexing information returned by search engines. The PMING
Distance is a proximity measure used in data mining and information retrieval,
which collaborative information express the degree of relationship between two
terms, using only the number of documents returned as result for a query on a
search engine. In this work, the PMINIG Distance is updated, providing a novel
formal algebraic definition, which corrects previous works. The novel point of
view underlines the features of the PMING to be a locally normalized linear
combination of the Pointwise Mutual Information and Normalized Google Distance.
The analyzed measure dynamically reflects the collaborative change made on the
web resources.
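For readers unfamiliar with the two ingredients named above, the following is a minimal Python sketch of how PMI and the Normalized Google Distance can be estimated from nothing but search-engine result counts. The hit counts, the index size N, and the combination weight mu are hypothetical, and the exact local normalization of the PMING Distance is defined only in the paper.

```python
import math

def pmi(fx, fy, fxy, N):
    """Pointwise Mutual Information estimated from search-engine hit counts.
    fx, fy: result counts for the single terms; fxy: count for the joint query;
    N: (assumed) total number of indexed documents."""
    return math.log(fxy * N / (fx * fy))

def ngd(fx, fy, fxy, N):
    """Normalized Google Distance (Cilibrasi and Vitanyi) from the same counts."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(N) - min(lx, ly))

def pming_like(fx, fy, fxy, N, mu=0.5):
    """Illustrative linear combination of PMI and NGD; the weight mu and the
    local normalization used by the actual PMING Distance are not reproduced
    here, this only sketches the ingredients named in the abstract."""
    return mu * (-pmi(fx, fy, fxy, N)) + (1.0 - mu) * ngd(fx, fy, fxy, N)

# Example with made-up hit counts:
print(pming_like(fx=4_000_000, fy=2_500_000, fxy=800_000, N=10**10))
```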
Nguyen Cong Luong, Ping Wang, Dusit Niyato, Wen Yonggang, Zhu Han
Comments: IEEE Communications Surveys and Tutorials, 2017
Subjects: Computer Science and Game Theory (cs.GT); Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)
This paper presents a comprehensive literature review on applications of
economic and pricing models for resource management in cloud networking. To
achieve sustainable profit advantage, cost reduction, and flexibility in
provisioning of cloud resources, resource management in cloud networking
requires adaptive and robust designs to address many issues, e.g., resource
allocation, bandwidth reservation, request allocation, and workload allocation.
Economic and pricing models have received a lot of attention as they can lead
to desirable performance in terms of social welfare, fairness, truthfulness,
profit, user satisfaction, and resource utilization. This paper reviews
applications of the economic and pricing models to develop adaptive algorithms
and protocols for resource management in cloud networking. Besides, we survey a
variety of incentive mechanisms using the pricing strategies in sharing
resources in edge computing. In addition, we consider using pricing models in
cloud-based Software Defined Wireless Networking (cloud-based SDWN). Finally,
we highlight important challenges, open issues and future research directions
of applying economic and pricing models to cloud networking.
Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed
Subjects: Social and Information Networks (cs.SI); Distributed, Parallel, and Cluster Computing (cs.DC); Combinatorics (math.CO); Machine Learning (stat.ML)
Graphlets are induced subgraphs of a large network and are important for understanding and modeling complex networks. Despite their practical importance, the use of graphlets has been severely limited to applications and domains with relatively small graphs. Most previous work has focused on exact algorithms; however, it is often too expensive to compute graphlets exactly in massive networks with billions of edges, and an approximate count is usually sufficient for many applications. In this work, we propose an unbiased graphlet estimation framework that is (a) fast, with significant speedups compared to the state-of-the-art, (b) parallel, with nearly linear speedups, (c) accurate, with <1% relative error, (d) scalable and space-efficient for massive networks with billions of edges, and (e) flexible for a variety of real-world settings, estimating both macro- and micro-level graphlet statistics (e.g., counts) of connected and disconnected graphlets. In addition, an adaptive approach is introduced that finds the smallest sample size required to obtain estimates within a given user-defined error bound. On 300 networks from 20 domains, we obtain <1% relative error for all graphlets. This is significantly more accurate than existing methods while using less data. Moreover, it takes a few seconds on billion-edge graphs (as opposed to days/weeks). These are by far the largest graphlet computations to date.
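As an illustration of why sampling suffices, the sketch below gives an unbiased estimate of the simplest connected 3-node graphlet count (triangles) via uniform edge sampling; it is a toy stand-in for, not a reproduction of, the estimation framework described above.

```python
import random
from collections import defaultdict

def estimate_triangles(edges, num_samples=10_000, seed=0):
    """Unbiased triangle-count estimate via uniform edge sampling.
    For an edge (u, v), the number of common neighbors equals the number of
    triangles containing that edge; summing over all edges counts each
    triangle three times, so E[common(e)] = 3T/|E| for a uniform edge e."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0
    for _ in range(num_samples):
        u, v = rng.choice(edges)
        total += len(adj[u] & adj[v])
    return len(edges) * total / (3.0 * num_samples)

# Tiny sanity check on a 4-clique, which contains exactly 4 triangles:
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(estimate_triangles(k4, num_samples=5000))  # should be close to 4
```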
Tapabrata Ghosh
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
We present QuickNet, a fast and accurate network architecture that is both faster and significantly more accurate than other fast deep architectures such as SqueezeNet. Furthermore, it uses fewer parameters than previous networks, making it more memory efficient. We do this by making two major modifications to the reference Darknet model (Redmon et al., 2015): 1) the use of depthwise separable convolutions and 2) the use of parametric rectified linear units. We make the observation that parametric rectified linear units are computationally equivalent to leaky rectified linear units at test time, and the observation that separable convolutions can be interpreted as a compressed Inception network (Chollet, 2016). Using these observations, we derive a network architecture, which we call QuickNet, that is both faster and more accurate than previous models. Our architecture provides at least four major advantages: (1) a smaller model size, which is more tenable on memory-constrained systems; (2) a significantly faster network, which is more tenable on computationally constrained systems; (3) a high accuracy of 95.7 percent on the CIFAR-10 dataset, which outperforms all but one result published so far, although we note that the two are orthogonal approaches and can be combined; and (4) orthogonality to previous model compression approaches, allowing further speed gains to be realized.
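To make the two modifications concrete, here is a minimal PyTorch sketch of a depthwise separable convolution block followed by a PReLU activation; the channel counts and layer arrangement are illustrative assumptions and do not reproduce the actual QuickNet architecture.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    """Depthwise separable convolution followed by PReLU, the two ingredients
    the abstract highlights. Layer sizes here are illustrative only."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixing channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        # PReLU has a learnable slope; with the slope frozen at test time it
        # behaves like a leaky ReLU, as the abstract observes.
        self.act = nn.PReLU(out_ch)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Example: a CIFAR-10-sized input batch through one block.
block = DepthwiseSeparableBlock(3, 32)
out = block(torch.randn(8, 3, 32, 32))
print(out.shape)  # torch.Size([8, 32, 32, 32])
```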
Mehmet E. Basbug, Barbara E. Engelhardt
Comments: Under review at AISTATS 2017
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
We present a general framework, the coupled compound Poisson factorization
(CCPF), to capture the missing-data mechanism in extremely sparse data sets by
coupling a hierarchical Poisson factorization with an arbitrary data-generating
model. We derive a stochastic variational inference algorithm for the resulting
model and, as examples of our framework, implement three different
data-generating models—a mixture model, linear regression, and factor
analysis—to robustly model non-random missing data in the context of
clustering, prediction, and matrix factorization. In all three cases, we test
our framework against models that ignore the missing-data mechanism on large
scale studies with non-random missing data, and we show that explicitly
modeling the missing-data mechanism substantially improves the quality of the
results, as measured using data log likelihood on a held-out test set.
Peter Bloem, Steven de Rooij
Subjects: Learning (cs.LG)
We introduce a new learning method for network motifs: interesting or
informative subgraph patterns in a network. Current methods for finding motifs
rely on the frequency of the motif: specifically, subgraphs are motifs when
their frequency in the data is high compared to the expected frequency under a
null model. To compute this expectation, the search for motifs is normally
repeated on as many as 1000 random graphs sampled from the null model, a
prohibitively expensive step. We use ideas from the Minimum Description Length
(MDL) literature to define a new measure of motif relevance. This has several
advantages: the subgraph count on samples from the null model can be
eliminated, and the search for motif candidates within the data itself can be
greatly simplified. Our method allows motif analysis to scale to networks with
billions of links, provided that a fast null model is used.
Xun Zhou, Changle Li, Zhe Liu, Tom H. Luan, Zhifang Miao, Lina Zhu, Lei Xiong
Subjects: Learning (cs.LG); Applications (stat.AP)
The Intelligent Transportation System (ITS) targets a coordinated traffic system by applying advanced wireless communication technologies for road traffic scheduling. Towards accurate road traffic control, short-term traffic forecasting, which predicts the road traffic at a particular site over a short period, is often useful and important. In existing works, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model is a popular approach. The scheme, however, encounters two challenges: 1) the analysis of related data is insufficient, so some important features of the data may be neglected; and 2) with data presenting different features, it is unlikely that one predictive model can fit all situations. To tackle the above issues, in this work we develop a hybrid model to improve the accuracy of SARIMA. Specifically, we first explore the autocorrelation and distribution features of traffic flow to revise the structure of the time series model. Based on the Gaussian distribution of traffic flow, a hybrid model with a Bayesian learning algorithm is developed, which can effectively expand the application scenarios of SARIMA. We show the efficiency and accuracy of our proposal using both analysis and experimental studies. Using real-world trace data, we show that the proposed predicting approach achieves satisfactory performance in practice.
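As a point of reference for the baseline being improved, the sketch below fits a plain SARIMA model to a synthetic hourly traffic series with statsmodels; the series, the chosen orders, and the forecast horizon are all assumptions made for illustration, and the paper's Bayesian hybrid step is not reproduced.

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hypothetical traffic-flow series: hourly vehicle counts with a daily
# (24-hour) seasonal pattern plus Gaussian noise, standing in for real
# loop-detector data.
rng = np.random.default_rng(0)
hours = np.arange(24 * 28)
flow = 300 + 120 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 20, hours.size)

# A plain SARIMA fit; the orders below are illustrative, not the ones
# selected in the paper.
model = SARIMAX(flow, order=(1, 0, 1), seasonal_order=(1, 1, 1, 24))
result = model.fit(disp=False)
forecast = result.forecast(steps=24)   # next-day hourly prediction
print(forecast[:6])
```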
John Cristian Borges Gamboa
Comments: Written as part of the Seminar on Collaborative Intelligence in the TU Kaiserslautern. January 2016
Subjects: Learning (cs.LG)
In many real-world applications, e.g., speech recognition or sleep stage classification, data are captured over the course of time, constituting a Time-Series. Time-Series often contain temporal dependencies that cause two otherwise identical points of time to belong to different classes or predict different behavior. This characteristic generally increases the difficulty of analysing them. Existing techniques often depended on hand-crafted features that were expensive to create and required expert knowledge of the field. With the advent of Deep Learning, new models for unsupervised learning of features for Time-Series analysis and forecasting have been developed. Such new developments are the topic of this paper: a review of the main Deep Learning techniques is presented, and some applications to Time-Series analysis are summarized. The results make it clear that Deep Learning has a lot to contribute to the field.
Tian Zhao, Xiaobing Huang, Yu Cao
Subjects: Programming Languages (cs.PL); Learning (cs.LG)
In recent years, Deep Learning (DL) has found great success in domains such as multimedia understanding. However, the complex nature of multimedia data makes it difficult to develop DL-based software. The state-of-the-art tools, such as Caffe, TensorFlow, Torch7, and CNTK, while successful in their applicable domains, are programming libraries with a fixed user interface, internal representation, and execution environment. This makes it difficult to implement portable and customized DL applications.
In this paper, we present DeepDSL, a domain specific language (DSL) embedded in Scala, that compiles deep networks written in DeepDSL to Java source code. DeepDSL provides (1) intuitive constructs to support compact encoding of deep networks; (2) symbolic gradient derivation of the networks; (3) static analysis for memory consumption and error detection; and (4) DSL-level optimization to improve memory and runtime efficiency.
DeepDSL programs are compiled into compact, efficient, customizable, and portable Java source code, which operates the CUDA and cuDNN interfaces on Nvidia GPUs via a Java Native Interface (JNI) library. We evaluated DeepDSL with a number of popular DL networks. Our experiments show that the compiled programs have very competitive runtime performance and memory efficiency compared to the existing libraries.
Elike Hodo, Xavier Bellekens, Andrew Hamilton, Christos Tachtatzis, Robert Atkinson
Subjects: Cryptography and Security (cs.CR); Learning (cs.LG)
Intrusion detection has attracted considerable interest from researchers and industry. The community, after many years of research, still faces the problem of building reliable and efficient IDS capable of handling large quantities of data with changing patterns in real-time situations. The
work presented in this manuscript classifies intrusion detection systems (IDS).
Moreover, a taxonomy and survey of shallow and deep networks intrusion
detection systems is presented based on previous and current works. This
taxonomy and survey reviews machine learning techniques and their performance
in detecting anomalies. Feature selection which influences the effectiveness of
machine learning (ML) IDS is discussed to explain the role of feature selection
in the classification and training phase of ML IDS. Finally, a discussion of
the false and true positive alarm rates is presented to help researchers model
reliable and efficient machine learning based intrusion detection systems.
Michele Svanera, Sergio Benini, Gal Raz, Talma Hendler, Rainer Goebel, Giancarlo Valente
Comments: Presented at MLINI-2016 workshop, 2016 (arXiv:1701.01437)
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neurons and Cognition (q-bio.NC)
Deep neural networks have been developed drawing inspiration from the brain
visual pathway, implementing an end-to-end approach: from image data to video
object classes. However, building an fMRI decoder with the typical structure of a Convolutional Neural Network (CNN), i.e., learning multiple levels of representation, seems impractical due to the lack of brain data. As a possible solution, this work presents the first hybrid fMRI and deep-features decoding approach: collected fMRI and deep-learned representations of video object
classes are linked together by means of Kernel Canonical Correlation Analysis.
In decoding, this allows exploiting the discriminatory power of CNN by relating
the fMRI representation to the last layer of CNN (fc7). We show the
effectiveness of embedding fMRI data onto a subspace related to deep features
in distinguishing semantic visual categories based solely on brain imaging
data.
Ping Li
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
The recently proposed “generalized min-max” (GMM) kernel can be efficiently
linearized via hashing or the Nyström method, with direct applications in
large-scale machine learning and fast near neighbor search. As a tuning-free
kernel, the linearized GMM kernel has been extensively compared with the
normalized random Fourier features (NRFF). On classification tasks, the
tuning-free GMM kernel performs (surprisingly) well compared to the best-tuned
RBF kernel. Nevertheless, one would naturally expect that the GMM kernel ought
to be further improved if we introduce tuning parameters.
In this paper, we study two simple constructions of tunable GMM kernels: (i)
the exponentiated-GMM (or eGMM) kernel, and (ii) the powered-GMM (or pGMM)
kernel. One can of course introduce additional (and more complex) kernels by
(e.g.,) combining eGMM and pGMM kernels. The pGMM kernel can still be
efficiently linearized by modifying the original hashing procedure for the GMM
kernel. On 47 publicly available classification datasets from the UCI
repository, we verify that the eGMM and pGMM kernels (typically) improve over
the original GMM kernel. On a fraction of datasets, the improvements can be
astonishingly significant.
We hope our introduction of tunable kernels could offer practitioners the
flexibility of choosing appropriate kernels and methods for large-scale search
& learning for their specific applications.
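For context, the base generalized min-max kernel can be computed directly as in the Python sketch below; the powered variant shown is only a guess at how a tuning exponent might enter, and the paper should be consulted for the precise eGMM and pGMM definitions.

```python
import numpy as np

def gmm_kernel(x, y):
    """Generalized min-max (GMM) kernel. Each vector is first split into its
    positive and negative parts so that all coordinates are nonnegative, then
    the kernel is sum(min) / sum(max) over the transformed coordinates."""
    xt = np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)])
    yt = np.concatenate([np.maximum(y, 0), np.maximum(-y, 0)])
    return np.minimum(xt, yt).sum() / np.maximum(xt, yt).sum()

def pgmm_kernel(x, y, p=1.0):
    """A powered variant: raising the magnitudes to the power p before the
    min-max ratio. The exponent placement is an assumption made for
    illustration, not the paper's exact pGMM definition."""
    return gmm_kernel(np.sign(x) * np.abs(x) ** p, np.sign(y) * np.abs(y) ** p)

x, y = np.array([0.5, -1.2, 3.0]), np.array([0.7, -0.4, 2.0])
print(gmm_kernel(x, y), pgmm_kernel(x, y, p=2.0))
```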
Ian Blake, Shu Lin
Comments: 15 pages, 1 figure, 2 tables
Subjects: Information Theory (cs.IT); Combinatorics (math.CO)
A number of recent works have used a variety of combinatorial constructions
to derive Tanner graphs for LDPC codes and some of these have been shown to
perform well in terms of their probability of error curves and error floors.
Such graphs are bipartite and many of these constructions yield biregular
graphs where the degree of left vertices is a constant (c+1) and that of the
right vertices is a constant (d+1). Such graphs are termed ((c+1,d+1))
biregular bipartite graphs here. One property of interest in such work is the
girth of the graph and the number of short cycles in the graph, cycles of
length either the girth or slightly larger. Such numbers have been shown to be
related to the error floor of the probability of error curve of the related
LDPC code. Using known results of graph theory, it is shown how the girth and
the number of cycles of length equal to the girth may be computed for these
((c+1,d+1)) biregular bipartite graphs knowing only the parameters (c) and (d)
and the numbers of left and right vertices. While numerous algorithms to
determine the number of short cycles in arbitrary graphs exist, the reduction
of the problem from an algorithm to a computation for these biregular bipartite
graphs is of interest.
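While the paper derives the girth and short-cycle counts in closed form from (c), (d) and the numbers of left and right vertices, a brute-force BFS computation of the girth is easy to state and useful as a sanity check; the Python sketch below is such a generic computation, not the paper's formula.

```python
from collections import deque

def girth(adj):
    """Shortest-cycle length of an undirected graph given as an adjacency
    dict {vertex: iterable of neighbors}. BFS from every vertex; whenever a
    non-tree edge closes a cycle, dist[u] + dist[w] + 1 bounds its length,
    and the minimum over all start vertices equals the girth. O(V*E) time."""
    best = float('inf')
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w], parent[w] = dist[u] + 1, u
                    q.append(w)
                elif parent[u] != w:
                    best = min(best, dist[u] + dist[w] + 1)
    return best

# Complete bipartite graph K_{2,3}, a (c+1, d+1) = (3, 2) biregular example;
# its girth is 4.
left, right = ['a', 'b'], [0, 1, 2]
adj = {u: list(right) for u in left}
adj.update({v: list(left) for v in right})
print(girth(adj))  # 4
```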
Chang-Sik Choi, Jae Oh Woo, Jeffrey G. Andrews
Comments: Submitted to IEEE Transactions on Wireless Communications
Subjects: Information Theory (cs.IT)
We propose a new cellular network model that captures both strong repulsion
and no correlation between base stations. The base station locations are
modeled as the superposition of two independent stationary point processes: a
random shifted grid with intensity ( lambda_g ) for the repulsive base
stations and an independent Poisson point process with intensity ( lambda_p )
for the uncorrelated base stations. Assuming that users are associated with
base stations that provide the strongest average receive signal power, we
obtain the probability that a typical user is associated with either a
repulsive base station or an uncorrelated base station. Assuming Rayleigh
fading channels, we derive the expression for the coverage probability of the
typical user. The following three observations can be made. First, the
association and coverage probability of the typical user are fully
characterized by two system parameters: the intensity ratio ( rho_lambda ) and the transmit power ratio ( eta ). Second, in user association, a bias toward the repulsive base stations exists. Finally, the coverage expression is a monotonically increasing function with respect to the number of repulsive base stations, or the inverse of the intensity ratio ( rho_lambda^{-1} ).
Assaf Kartowsky, Ido Tal
Comments: 5 pages, submitted to ISIT 2017
Subjects: Information Theory (cs.IT)
Consider a channel with a given input distribution. Our aim is to degrade it
to a channel with at most L output letters. One such degradation method is the so-called “greedy-merge” algorithm. We derive an upper bound on the reduction
in mutual information between input and output. For fixed input alphabet size
and variable L, the upper bound is within a constant factor of an
algorithm-independent lower bound. Thus, we establish that greedy-merge is
optimal in the power-law sense.
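For concreteness, the following is a brute-force Python sketch of the greedy-merge idea: starting from a joint distribution induced by an assumed input distribution, repeatedly merge the pair of output letters whose merger costs the least mutual information until at most L letters remain. The channel matrix and input distribution are made up for illustration, and the paper's bounds concern this kind of procedure rather than being reproduced by it.

```python
import numpy as np

def mutual_info(joint):
    """I(X;Y) in nats for a joint distribution given as a 2-D array."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / (px @ py)[mask])).sum())

def greedy_merge(p_x, W, L):
    """Degrade a channel W (rows = inputs, columns = outputs) to at most L
    output letters by repeatedly merging the pair of output letters whose
    merger loses the least mutual information. A brute-force search per step,
    meant only to illustrate the greedy-merge idea."""
    joint = np.asarray(p_x, dtype=float)[:, None] * np.asarray(W, dtype=float)
    while joint.shape[1] > L:
        best, best_pair = None, None
        for i in range(joint.shape[1]):
            for j in range(i + 1, joint.shape[1]):
                merged = np.delete(joint, j, axis=1)
                merged[:, i] = joint[:, i] + joint[:, j]
                mi = mutual_info(merged)
                if best is None or mi > best:
                    best, best_pair = mi, (i, j)
        i, j = best_pair
        joint[:, i] += joint[:, j]
        joint = np.delete(joint, j, axis=1)
    return joint / joint.sum(axis=1, keepdims=True)  # back to W(y|x)

# Example: quantize a 5-output channel down to 3 output letters.
W = np.array([[0.5, 0.3, 0.1, 0.07, 0.03],
              [0.03, 0.07, 0.1, 0.3, 0.5]])
print(greedy_merge([0.5, 0.5], W, L=3))
```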
Silas L. Fong, Vincent Y. F. Tan, Ayfer Özgür
Comments: 48 pages
Subjects: Information Theory (cs.IT)
This paper investigates the achievable rates of an additive white Gaussian
noise (AWGN) energy-harvesting (EH) channel with an infinite battery under the
assumption that the error probabilities do not vanish as the blocklength
increases. The EH process is characterized by a sequence of blocks of harvested
energy. The harvested energy remains constant within a block while the
harvested energy across different blocks is characterized by a sequence of
independent and identically distributed (i.i.d.) random variables. The blocks
have length (L), which can be interpreted as the coherence time of the energy
arrival process. If (L) is a constant or grows sublinearly in the blocklength
(n), we fully characterize the first-order coding rate. In addition, we obtain
lower and upper bounds on the second-order coding rate, which are proportional
to (-sqrt{frac{L}{n}}) for any fixed error probability (<1/2). If (L) grows
linearly in (n), we obtain lower and upper bounds on the first-order coding
rate, which coincide whenever the EH random variable is continuous. Our results
suggest that correlation in the energy-arrival process decreases the effective
blocklength by a factor of (L).
Alessandro Biason, Subhrakanti Dey, Michele Zorzi
Comments: 12 pages, 9 figures, submitted to IEEE Transactions on Mobile Computing, Dec. 2016. arXiv admin note: text overlap with arXiv:1610.07284
Subjects: Information Theory (cs.IT)
Designing decentralized policies for wireless communication networks is a
crucial problem, which has only been partially solved in the literature so far.
In this paper, we propose the Decentralized Markov Decision Process (Dec-MDP)
framework to analyze a wireless sensor network with multiple users which access
a common wireless channel. We consider devices with energy harvesting
capabilities, so that they aim at balancing the energy arrivals with the data
departures and with the probability of colliding with other nodes. Randomly
over time, an access point triggers a SYNC slot, wherein it recomputes the
optimal transmission parameters of the whole network, and distributes this
information. Every node receives its own policy, which specifies how it should
access the channel in the future, and, thereafter, proceeds in a fully
decentralized fashion, without interacting with other entities in the network.
We propose a multi-layer Markov model, where an external MDP manages the jumps
between SYNC slots, and an internal Dec-MDP computes the optimal policy in the
near future. We numerically show that, because of the harvesting, a fully
orthogonal scheme (e.g., TDMA-like) is suboptimal in energy harvesting
scenarios, and the optimal trade-off lies between an orthogonal and a random
access system.
Shudi Yang, Xiangli Kong, Chunming Tang
Comments: 19 pages
Subjects: Information Theory (cs.IT)
Recently, linear codes constructed from defining sets have been studied
extensively. They may have nice parameters if the defining set is chosen
properly. Let ( m > 2 ) be a positive integer. For an odd prime ( p ), let ( r=p^m ) and ( mathrm{Tr} ) be the absolute trace function from ( mathbb{F}_r ) onto ( mathbb{F}_p ). In this paper, we give a construction of linear codes by defining the code ( C_{D}={(mathrm{Tr}(ax))_{xin D}: a in mathbb{F}_{r} }, ) where ( D = left{ xin mathbb{F}_{r} : mathrm{Tr}(x)=1, mathrm{Tr}(x^2)=0 right}. ) Its complete weight enumerator and weight
enumerator are determined explicitly by employing cyclotomic numbers and Gauss
sums. In addition, we obtain several optimal linear codes with a few weights.
They have higher rate compared with other codes, which enables them to have
essential applications in areas such as association schemes and secret sharing
schemes.
Abhishek K. Gupta, Jeffrey G. Andrews, Robert W. Heath Jr
Subjects: Information Theory (cs.IT)
Blocking objects (blockages) between a transmitter and receiver cause
wireless communication links to transition from line-of-sight (LOS) to
non-line-of-sight (NLOS) propagation, which can greatly reduce the received
power, particularly at higher frequencies such as millimeter wave (mmWave). We
consider a cellular network in which a mobile user attempts to connect to two
or more base stations (BSs) simultaneously, to increase the probability of at
least one LOS link, which is a form of macrodiversity. We develop a framework
for determining the LOS probability as a function of the number of BSs, when
taking into account the correlation between blockages: for example, a single
blockage close to the device — including the user’s own body — could block
multiple BSs. We consider the impact of the size of blocking objects on the
system reliability probability and show that macrodiversity gains are higher
when the blocking objects are small. We also show that the BS density must
scale as the square of the blockage density to maintain a given level of
reliability.
Xiugang Wu, Leighton Pate Barnes, Ayfer Ozgur
Comments: Submitted to IEEE Trans. on Information Theory
Subjects: Information Theory (cs.IT)
Consider a memoryless relay channel, where the channel from the relay to the
destination is an isolated bit pipe of capacity (C_0). Let (C(C_0)) denote the
capacity of this channel as a function of (C_0). What is the critical value of
(C_0) such that (C(C_0)) first equals (C(infty))? This is a long-standing open
problem posed by Cover and named “The Capacity of the Relay Channel,” in (Open
Problems in Communication and Computation), Springer-Verlag, 1987. In
this paper, we answer this question in the Gaussian case and show that (C(C_0))
cannot equal (C(infty)) unless (C_0=infty), regardless of the SNR of the
Gaussian channels, while the cut-set bound would suggest that (C(infty)) can
be achieved at finite (C_0). Our approach is geometric and relies on a
strengthening of the isoperimetric inequality on the sphere by using the Riesz
rearrangement inequality.
Khalil Elkhalil, Abla Kammoun, Tareq Y. Al-Naffouri, Mohamed-Slim Alouini
Comments: Submitted to IEEE signal processing letters
Subjects: Information Theory (cs.IT)
This paper focuses on the computation of the positive moments of one-sided correlated random Gram matrices. Closed-form expressions for the moments can be obtained easily, but their numerical evaluation is prone to numerical instability, especially in high-dimensional settings. This letter provides a numerically stable method that efficiently computes the positive moments in closed-form. The developed expressions are more accurate and can lead to higher accuracy levels when fed to moment-based approaches. As an application, we show
how the obtained moments can be used to approximate the marginal distribution
of the eigenvalues of random Gram matrices.
Yang You, Chong Qin, Yi Gong
Subjects: Information Theory (cs.IT)
Full-duplex is a promising technology, and great system performance enhancement can be achieved, especially when it is applied to base stations. In this paper, we consider a multi-user OFDMA network consisting of one full-duplex base station, several uplink users, and a number of downlink users. A joint subcarrier, downlink-user and uplink-user mapping and power allocation optimization problem is investigated. In detail, we first decompose the 3-dimensional mapping problem into three 2-dimensional sub-problems and solve them iteratively using the classical Hungarian method. Then, based on the dual method, we sequentially solve the power allocation and 3-dimensional mapping problem at each dual point, and the optimal solution to the dual problem is derived using the sub-gradient method. Unlike existing methods that solve only one sub-problem, and with high computation complexity, we tackle both of them and successfully reduce the computation complexity from exponential to polynomial order. Numerical simulations are conducted to verify the proposed system.
Alexander Zhdanov
Subjects: Information Theory (cs.IT)
In this paper, we consider coding of short data frames (192 bits) by IRA codes. A new interleaver for IRA codes based on a Gruenbaum graph is proposed. The difference of the proposed algorithm from known methods consists in the following: the permutation is performed using a much smaller interleaver which is derived from the Gruenbaum graph by finding a Hamiltonian path in this graph, enumerating the visited vertices in ascending order and passing them again, in the permuted order, through the edges not included in the Hamiltonian path. For the IRA code, the obtained interleaver provides a 0.7-0.8 dB gain over a convolutional code decoded by the Viterbi algorithm.
Annina Bracher, Eran Hof, Amos Lapidoth
Comments: 76 pages, submitted to IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
The secrecy of a distributed-storage system for passwords is studied. The
encoder, Alice, observes a length-n password and describes it using two hints,
which she stores in different locations. The legitimate receiver, Bob, observes
both hints. In one scenario the requirement is that the expected number of
guesses it takes Bob to guess the password approach one as n tends to infinity,
and in the other that the expected size of the shortest list that Bob must form
to guarantee that it contain the password approach one. The eavesdropper, Eve,
sees only one of the hints. Assuming that Alice cannot control which hints Eve
observes, the largest normalized (by n) exponent that can be guaranteed for the
expected number of guesses it takes Eve to guess the password is characterized
for each scenario. Key to the proof are new results on Arikan’s guessing and
Bunte and Lapidoth’s task-encoding problem; in particular, the paper
establishes a close relation between the two problems. A rate-distortion
version of the model is also discussed, as is a generalization that allows for
Alice to produce {delta} (not necessarily two) hints, for Bob to observe {nu} (not necessarily two) of the hints, and for Eve to observe {eta} (not necessarily one) of the hints. The generalized model is robust against {delta} - {nu} disk failures.
Igal Sason, Sergio Verdú
Comments: Submitted to the IEEE Trans. on Information Theory, September 2016. A shortened version is submitted to ISIT 2017
Subjects: Information Theory (cs.IT); Probability (math.PR); Statistics Theory (math.ST)
This paper gives upper and lower bounds on the minimum error probability of
Bayesian (M)-ary hypothesis testing in terms of the Arimoto-Rényi conditional
entropy of an arbitrary order (alpha). The improved tightness of these bounds
over their specialized versions with the Shannon conditional entropy
((alpha=1)) is demonstrated. In particular, in the case where (M) is finite,
we show how to generalize Fano’s inequality under both the conventional and
list-decision settings. As a counterpart to the generalized Fano’s inequality,
allowing (M) to be infinite, a lower bound on the Arimoto-Rényi conditional
entropy is derived as a function of the minimum error probability. Explicit
upper and lower bounds on the minimum error probability are obtained as a
function of the Arimoto-Rényi conditional entropy for both positive and negative (alpha). Furthermore, we give upper bounds on the minimum error probability as a function of the Rényi divergence and the Chernoff information. In the setup of discrete memoryless channels, we analyze the exponentially vanishing decay of the Arimoto-Rényi conditional entropy of the
transmitted codeword given the channel output when averaged over a code
ensemble.
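For reference, the Arimoto form of the conditional Rényi entropy in which these bounds are stated can be written as follows (a standard statement of the definition; the paper should be consulted for the exact conventions it adopts):

```latex
% Arimoto's conditional Renyi entropy of order alpha (alpha > 0, alpha != 1),
% together with the unconditional Renyi entropy it reduces to when Y is absent.
H_\alpha(X \mid Y) \;=\; \frac{\alpha}{1-\alpha}\,
\log \sum_{y} P_Y(y) \left( \sum_{x} P_{X\mid Y}(x\mid y)^{\alpha} \right)^{1/\alpha},
\qquad
H_\alpha(X) \;=\; \frac{1}{1-\alpha}\, \log \sum_{x} P_X(x)^{\alpha}.
```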
Qitian Chen, Dong Kang, Yichu He, Tsung-Hui Chang, Ya-Feng Liu
Comments: 13 pages, 2 figures, Accepted by IEEE Signal Processing Letters
Subjects: Information Theory (cs.IT)
In this letter, we consider the joint power and admission control (JPAC)
problem by assuming that only the channel distribution information (CDI) is
available. Under this assumption, we formulate a new chance (probabilistic)
constrained JPAC problem, where the signal to interference plus noise ratio
(SINR) outage probability of the supported links is enforced to be not greater
than a prespecified tolerance. To efficiently deal with the chance SINR
constraint, we employ the sample approximation method to convert it into
finitely many linear constraints. Then, we propose a convex approximation based
deflation algorithm for solving the sample approximation JPAC problem. Compared
to the existing works, this letter proposes a novel two-timescale JPAC
approach, where admission control is performed by the proposed deflation
algorithm based on the CDI in a large timescale and transmission power is
adapted instantly with fast fading in a small timescale. The effectiveness of
the proposed algorithm is illustrated by simulations.
Nan Zhao
Subjects: Information Theory (cs.IT)
Interference alignment (IA) is a promising solution for interference
management in wireless networks. On the other hand, simultaneous wireless
information and power transfer (SWIPT) has become an emerging technique.
Although some works have been done on IA and SWIPT, these two important areas
have traditionally been addressed separately in the literature. In this paper,
we propose to use a common framework to jointly study IA and SWIPT. We analyze
the performance of SWIPT in IA networks. Specifically, we derive the upper
bound of the power that can be harvested in IA networks. In addition, we show
that, to improve the performance of wireless power transfer and information
transmission, users should be dynamically selected as energy harvesting (EH) or
information decoding (ID) terminals. Furthermore, we design two
easy-to-implement SWIPT-user selection (SWIPT-US) algorithms in IA networks. To
optimize the ID and EH performance of SWIPT in IA networks, a power-splitting
optimization (PSO) algorithm is proposed when power splitters are available,
and its closed-form optimal solutions are derived. Power allocation in the PSO
algorithm is also studied to further optimize the performance. Simulation
results are presented to show the effectiveness of the proposed algorithms.
Eldho K. Thomas, Vincent Y. F. Tan, Alexander Vardy, Mehul Motani
Comments: 4 pages; 1 figure; To appear in the IEEE Communication Letters
Subjects: Information Theory (cs.IT)
We study the application of polar codes in deletion channels by analyzing the
cascade of a binary erasure channel (BEC) and a deletion channel. We show how
polar codes can be used effectively on a BEC with a single deletion, and
propose a list decoding algorithm with a cyclic redundancy check for this case.
The decoding complexity is (O(N^2log N)), where (N) is the blocklength of the
code. An important contribution is an optimization of the amount of redundancy
added to minimize the overall error probability. Our theoretical results are
corroborated by numerical simulations which show that the list size can be
reduced to one and the original message can be recovered with high probability
as the length of the code grows.
Changlong Wang, Jigen Peng
Comments: 22 pages, 7 figures
Subjects: Information Theory (cs.IT)
The joint sparse recovery problem is a generalization of the single measurement vector problem widely studied in Compressed Sensing; it aims to recover a set of jointly sparse vectors, i.e., vectors whose nonzero entries are concentrated at common locations. Meanwhile, l_p-minimization applied to matrices is widely used in a large number of algorithms designed for this problem. The main contribution of this paper is two theoretical results about this technique. The first is to prove that for every multiple measurement system of linear equations, there exists a constant p* such that the original unique sparse solution can also be recovered by minimizing the l_p quasi-norm over matrices whenever 0 < p < p*. The second is to give an analytic expression for such a p*. Finally, we present the results of one example to confirm the validity of our conclusions.
Haiyan Guo, Zhen Yang, Linghua Zhang, Jia Zhu, Yulong Zou
Comments: 14 pages, IEEE Transactions on Communications, 2017
Subjects: Information Theory (cs.IT)
In this paper, we examine the physical layer security for cooperative
wireless networks with multiple intermediate nodes, where the
decode-and-forward (DF) protocol is considered. We propose a new joint relay
and jammer selection (JRJS) scheme for protecting wireless communications
against eavesdropping, where an intermediate node is selected as the relay for
the sake of forwarding the source signal to the destination and meanwhile, the
remaining intermediate nodes are employed to act as friendly jammers which
broadcast the artificial noise for disturbing the eavesdropper. We further
investigate the power allocation among the source, relay and friendly jammers
for maximizing the secrecy rate of the proposed JRJS scheme and derive a closed-form sub-optimal solution. Specifically, all the intermediate nodes
which successfully decode the source signal are considered as relay candidates.
For each candidate, we derive the sub-optimal closed-form power allocation
solution and obtain the secrecy rate result of the corresponding JRJS scheme.
Then, the candidate which is capable of achieving the highest secrecy rate is
selected as the relay. Two assumptions about the channel state information
(CSI), namely the full CSI (FCSI) and partial CSI (PCSI), are considered.
Simulation results show that the proposed JRJS scheme outperforms the
conventional pure relay selection, pure jamming and GSVD based beamforming
schemes in terms of secrecy rate. Additionally, the proposed FCSI based power
allocation (FCSI-PA) and PCSI based power allocation (PCSI-PA) schemes both
achieve higher secrecy rates than the equal power allocation (EPA) scheme.
Shota Saito, Hideki Yagi, Toshiyasu Matsushima
Subjects: Information Theory (cs.IT)
This paper investigates the problem of variable-length lossy source coding.
We deal with the case where both the excess distortion probability and the
overflow probability of codeword length are less than or equal to positive
constants. The infimum of the thresholds on the overflow probability is
characterized by a smooth max entropy-based quantity. Both non-asymptotic and
asymptotic cases are analyzed. To show the achievability results, we do not
utilize the random coding argument but give an explicit code construction.
Jiejing Wen, Minghui Yang, Fangwei Fu, Keqin Feng
Comments: 10 pages
Subjects: Information Theory (cs.IT); Combinatorics (math.CO)
Strong external difference family (SEDF) and its generalizations GSEDF,
BGSEDF in a finite abelian group (G) are combinatorial designs introduced by Paterson and Stinson [7] in 2016, with applications in communication theory to the construction of optimal strong algebraic manipulation detection codes. In this paper we first present some general constructions of these combinatorial
designs by using difference sets and partial difference sets in (G). Then, as
applications of the general constructions, we construct series of SEDF, GSEDF
and BGSEDF in finite fields by using cyclotomic classes.
Mathieu Lauriere, Dave Touchette
Comments: v1, 39 pages, 2 figures
Subjects: Quantum Physics (quant-ph); Computational Complexity (cs.CC); Information Theory (cs.IT)
In the context of two-party interactive quantum communication protocols, we
study a recently defined notion of quantum information cost (QIC), which
possesses most of the important properties of its classical analogue. Although
this definition has the advantage to be valid for fully quantum inputs and
tasks, its interpretation for classical tasks remained rather obscure. Also,
the link between this new notion and other notions of information cost for
quantum protocols that had previously appeared in the literature was not clear,
if existent at all.
We settle both these issues: for quantum communication with classical inputs,
we provide an alternate characterization of QIC in terms of information about
the input registers, avoiding any reference to the notion of a purification of
the classical input state. We provide an exact operational interpretation of
this alternative characterization as the sum of the cost of transmitting
information about the classical inputs and the cost of forgetting information
about these inputs. To obtain this characterization, we prove a general lemma,
the Information Flow Lemma, assessing exactly the transfer of information in
general interactive quantum processes. Furthermore, we clarify the link between
QIC and IC of classical protocols by simulating quantumly classical protocols.
Finally, we apply these concepts to argue that any quantum protocol that does
not forget information solves Disjointness on n bits in Omega(n)
communication, completely losing the quadratic quantum speedup. This provides a
specific sense in which forgetting information is a necessary feature of
interactive quantum protocols. We also apply these concepts to prove that QIC
at zero-error is exactly n for the Inner Product function, and n(1 - o(1)) for
a random Boolean function on n+n bits.
Ryutaroh Matsumoto
Comments: IEEEtran.cls, 4 pages, no figure. Submitted to the 2017 IEEE ISIT symposium
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
We propose a unitary procedure to reconstruct quantum secret for a quantum
secret sharing scheme constructed from stabilizer quantum error-correcting
codes. Erasure correcting procedures for stabilizer codes need to add missing
shares for reconstruction of quantum secret while unitary reconstruction
procedures for a certain class of quantum secret sharing schemes are known to work
without adding missing shares. The proposed procedure also works without adding
missing shares.
Zhaoqiang Liu, Vincent Y. F. Tan
Comments: 13 pages, 4 figures, journal
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Methodology (stat.ME)
We propose a geometric assumption on nonnegative data matrices such that
under this assumption, we are able to provide upper bounds (both deterministic
and probabilistic) on the relative error of nonnegative matrix factorization
(NMF). The algorithm we propose first uses the geometric assumption to obtain
an exact clustering of the columns of the data matrix; subsequently, it employs
several rank-one NMFs to obtain the final decomposition. When applied to data
matrices generated from our statistical model, we observe that our proposed
algorithm produces factor matrices with comparable relative errors vis-à-vis
classical NMF algorithms but with much faster speeds. On face image and
hyperspectral imaging datasets, we demonstrate that our algorithm provides an
excellent initialization for applying other NMF algorithms at a low
computational cost. Finally, we show on face and text datasets that the
combinations of our algorithm and several classical NMF algorithms outperform
other algorithms in terms of clustering performance.
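The two-stage structure described above (cluster the columns, then fit a rank-one NMF per cluster) can be sketched as follows; k-means on normalized columns stands in for the paper's geometric clustering step, so this is an illustrative approximation rather than the authors' algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_then_rank_one_nmf(V, k, seed=0):
    """Cluster the columns of a nonnegative matrix V, then fit a rank-one
    factorization to each cluster and stack the pieces into W (d x k) and
    H (k x n). KMeans here is only a stand-in for the paper's clustering."""
    d, n = V.shape
    cols = V / (np.linalg.norm(V, axis=0, keepdims=True) + 1e-12)
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(cols.T)
    W = np.zeros((d, k))
    H = np.zeros((k, n))
    for c in range(k):
        idx = np.where(labels == c)[0]
        if idx.size == 0:
            continue
        A = V[:, idx]
        # The leading singular pair of a nonnegative matrix can be chosen
        # nonnegative, so it yields a rank-one NMF of this cluster.
        u, s, vt = np.linalg.svd(A, full_matrices=False)
        W[:, c] = np.abs(u[:, 0])
        H[c, idx] = s[0] * np.abs(vt[0])
    return W, H

# Tiny example: each column is a scaled copy of one of 3 nonnegative
# prototypes plus a little noise, so a rank-3 factorization should fit well.
rng = np.random.default_rng(0)
proto = rng.random((20, 3))
coeff = np.abs(rng.normal(1.0, 0.1, (3, 100)))
labels_true = rng.integers(0, 3, 100)
V = proto[:, labels_true] * coeff[labels_true, np.arange(100)] + 0.01 * rng.random((20, 100))
W, H = cluster_then_rank_one_nmf(V, k=3)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # small relative error
```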