Priyank Mathur, Arkajyoti Misra, Emrah Budur
Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
The increase in the use of microblogging came along with the rapid growth on
short linguistic data. On the other hand deep learning is considered to be the
new frontier to extract meaningful information out of large amount of raw data
in an automated manner. In this study, we engaged these two emerging fields to
come up with a robust language identifier on demand, namely Language
Identification Engine (LIDE). As a result, we achieved 95.12% accuracy in
Discriminating between Similar Languages (DSL) Shared Task 2015 dataset, which
is comparable to the maximum reported accuracy of 95.54% achieved so far.
Mehrdad J. Gangeh, Hamid R. Tizhoosh, Kan Wu, Dun Huang, Hadi Tadayyon, Gregory J. Czarnota
Comments: Accepted at BHI 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recent advances in using quantitative ultrasound (QUS) methods have provided
a promising framework to non-invasively and inexpensively monitor or predict
the effectiveness of therapeutic cancer responses. One of the earliest steps in
using QUS methods is contouring a region of interest (ROI) inside the tumour in
ultrasound B-mode images. While manual segmentation is a very time-consuming
and tedious task for human experts, auto-contouring is also an extremely
difficult task for computers due to the poor quality of ultrasound B-mode
images. However, for the purpose of cancer response prediction, a rough
boundary of the tumour as an ROI is only needed. In this research, a
semi-automated tumour localization approach is proposed for ROI estimation in
ultrasound B-mode images acquired from patients with locally advanced breast
cancer (LABC). The proposed approach comprised several modules, including 1)
feature extraction using keypoint descriptors, 2) augmenting the feature
descriptors with the distance of the keypoints to the user-input pixel as the
centre of the tumour, 3) supervised learning using a support vector machine
(SVM) to classify keypoints as “tumour” or “non-tumour”, and 4) computation of
an ellipse as an outline of the ROI representing the tumour. Experiments with
33 B-mode images from 10 LABC patients yielded promising results with an
accuracy of 76.7% based on the Dice coefficient performance measure. The
results demonstrated that the proposed method can potentially be used as the
first stage in a computer-assisted cancer response prediction system for
semi-automated contouring of breast tumours.
Anli Lim, Bharath Ramesh, Yue Yang, Cheng Xiang, Zhi Gao, Feng Lin
Comments: Journal Paper
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper describes the development of a novel algorithm to tackle the
problem of real-time video stabilization for unmanned aerial vehicles (UAVs).
There are two main components in the algorithm: (1) By designing a suitable
model for the global motion of UAV, the proposed algorithm avoids the necessity
of estimating the most general motion model, projective transformation, and
considers simpler motion models, such as rigid transformation and similarity
transformation. (2) To achieve a high processing speed, optical-flow based
tracking is employed in lieu of conventional tracking and matching methods used
by state-of-the-art algorithms. These two new ideas resulted in a real-time
stabilization algorithm, developed over two phases. Stage I considers
processing the whole sequence of frames in the video while achieving an average
processing speed of 50fps on several publicly available benchmark videos. Next,
Stage II undertakes the task of real-time video stabilization using a
multi-threading implementation of the algorithm designed in Stage I.
Liang Lin, Keze Wang, Deyu Meng, Wangmeng Zuo, Lei Zhang
Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper aims to develop a novel cost-effective framework for face
identification, which progressively maintains a batch of classifiers with the
increasing face images of different individuals. By naturally combining two
recently rising techniques: active learning (AL) and self-paced learning (SPL),
our framework is capable of automatically annotating new instances and
incorporating them into training under weak expert re-certification. We first
initialize the classifier using a few annotated samples for each individual,
and extract image features using the convolutional neural nets. Then, a number
of candidates are selected from the unannotated samples for classifier
updating, in which we apply the current classifiers ranking the samples by the
prediction confidence. In particular, our approach utilizes the high-confidence
and low-confidence samples in the self-paced and the active user-query way,
respectively. The neural nets are later fine-tuned based on the updated
classifiers. Such heuristic implementation is formulated as solving a concise
active SPL optimization problem, which also advances the SPL development by
supplementing a rational dynamic curriculum constraint. The new model finely
accords with the “instructor-student-collaborative” learning mode in human
education. The advantages of this proposed framework are two-folds: i) The
required number of annotated samples is significantly decreased while the
comparable performance is guaranteed. A dramatic reduction of user effort is
also achieved over other state-of-the-art active learning techniques. ii) The
mixture of SPL and AL effectively improves not only the classifier accuracy
compared to existing AL/SPL methods but also the robustness against noisy data.
We evaluate our framework on two challenging datasets, and demonstrate very
promising results. (this http URL)
Keze Wang, Dongyu Zhang, Ya Li, Ruimao Zhang, Liang Lin
Comments: Accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recent successes in learning-based image classification, however, heavily
rely on the large number of annotated training samples, which may require
considerable human efforts. In this paper, we propose a novel active learning
framework, which is capable of building a competitive classifier with optimal
feature representation via a limited amount of labeled training instances in an
incremental learning manner. Our approach advances the existing active learning
methods in two aspects. First, we incorporate deep convolutional neural
networks into active learning. Through the properly designed framework, the
feature representation and the classifier can be simultaneously updated with
progressively annotated informative samples. Second, we present a
cost-effective sample selection strategy to improve the classification
performance with less manual annotations. Unlike traditional methods focusing
on only the uncertain samples of low prediction confidence, we especially
discover the large amount of high confidence samples from the unlabeled set for
feature learning. Specifically, these high confidence samples are automatically
selected and iteratively assigned pseudo-labels. We thus call our framework
“Cost-Effective Active Learning” (CEAL) standing for the two
advantages.Extensive experiments demonstrate that the proposed CEAL framework
can achieve promising results on two challenging image classification datasets,
i.e., face recognition on CACD database [1] and object categorization on
Caltech-256 [2].
Utku Aydonat, Shane O'Connell, Davor Capalija, Andrew C. Ling, Gordon R. Chiu
Comments: To be published at FPGA 2017
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
Convolutional neural nets (CNNs) have become a practical means to perform
vision tasks, particularly in the area of image classification. FPGAs are well
known to be able to perform convolutions efficiently, however, most recent
efforts to run CNNs on FPGAs have shown limited advantages over other devices
such as GPUs. Previous approaches on FPGAs have often been memory bound due to
the limited external memory bandwidth on the FPGA device. We show a novel
architecture written in OpenCL(TM), which we refer to as a Deep Learning
Accelerator (DLA), that maximizes data reuse and minimizes external memory
bandwidth. Furthermore, we show how we can use the Winograd transform to
significantly boost the performance of the FPGA. As a result, when running our
DLA on Intel’s Arria 10 device we can achieve a performance of 1020 img/s, or
23 img/s/W when running the AlexNet CNN benchmark. This comes to 1382 GFLOPs
and is 10x faster with 8.4x more GFLOPS and 5.8x better efficiency than the
state-of-the-art on FPGAs. Additionally, 23 img/s/W is competitive against the
best publicly known implementation of AlexNet on nVidia’s TitanX GPU.
Zimi Li, Nir Oren, Simon Parsons
Subjects: Artificial Intelligence (cs.AI)
In this paper we investigate the links between instantiated argumentation
systems and the axioms for non-monotonic reasoning described in [9] with the
aim of characterising the nature of argument based reasoning. In doing so, we
consider two possible interpretations of the consequence relation, and describe
which axioms are met by ASPIC+ under each of these interpretations. We then
consider the links between these axioms and the rationality postulates. Our
results indicate that argument based reasoning as characterised by ASPIC+ is –
according to the axioms of [9] – non-cumulative and non-monotonic, and
therefore weaker than the weakest non-monotonic reasoning systems they
considered possible. This weakness underpins ASPIC+’s success in modelling
other reasoning systems, and we conclude by considering the relationship
between ASPIC+ and other weak logical systems.
Zhengbing Hu, Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Viktoriia O. Samitova
Journal-ref: I.J. Intelligent Systems and Applications, 2017, Vol. 9, No. 1,
pp. 67-74
Subjects: Artificial Intelligence (cs.AI)
A fuzzy clustering algorithm for multidimensional data is proposed in this
article. The data is described by vectors whose components are linguistic
variables defined in an ordinal scale. The obtained results confirm the
efficiency of the proposed approach.
Grant Molnar
Comments: 5 pages
Subjects: Artificial Intelligence (cs.AI)
Since Leonard Savage’s epoch-making memoir, Subjective Expected Utility
Theory has been the presumptive model for decision-making. Savage provided an
act-based axiomatization of standard expected utility theory. In this article,
we provide a Savage-like axiomatization of nonstandard expected utility theory.
It corresponds to a weakening of Savage’s (6^{th}) axiom.
Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Programming Languages (cs.PL); Computation (stat.CO)
We propose Edward, a Turing-complete probabilistic programming language.
Edward builds on two compositional representations—random variables and
inference. By treating inference as a first class citizen, on a par with
modeling, we show that probabilistic programming can be as flexible and
computationally efficient as traditional deep learning. For flexibility, Edward
makes it easy to fit the same model using a variety of composable inference
methods, ranging from point estimation, to variational inference, to MCMC. In
addition, Edward can reuse the modeling representation as part of inference,
facilitating the design of rich variational models and generative adversarial
networks. For efficiency, Edward is integrated into TensorFlow, providing
significant speedups over existing probabilistic systems. For example, on a
benchmark logistic regression task, Edward is at least 35x faster than Stan and
PyMC3.
Xiang Xiang, Trac D. Tran
Comments: Codes available at this https URL and this https URL arXiv admin note: text overlap with arXiv:1410.1606
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Limited annotated data available for the recognition of facial expression and
action units embarrasses the training of deep networks, which can learn
disentangled invariant features. However, a linear model with just several
parameters normally is not demanding in terms of training data. In this paper,
we propose an elegant linear model to untangle confounding factors in
challenging realistic multichannel signals such as 2D face videos. The simple
yet powerful model does not rely on huge training data and is natural for
recognizing facial actions without explicitly disentangling the identity. Base
on well-understood intuitive linear models such as Sparse Representation based
Classification (SRC), previous attempts require a prepossessing of explicit
decoupling which is practically inexact. Instead, we exploit the low-rank
property across frames to subtract the underlying neutral faces which are
modeled jointly with sparse representation on the action components with group
sparsity enforced. On the extended Cohn-Kanade dataset (CK+), our one-shot
automatic method on raw face videos performs as competitive as SRC applied on
manually prepared action components and performs even better than SRC in terms
of true positive rate. We apply the model to the even more challenging task of
facial action unit recognition, verified on the MPI Face Video Database
(MPI-VDB) achieving a decent performance. All the programs and data have been
made publicly available.
Emrah Budur
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Financial institutions have to screen their transactions to ensure that they
are not affiliated with terrorism entities. Developing appropriate solutions to
detect such affiliations precisely while avoiding any kind of interruption to
large amount of legitimate transactions is essential. In this paper, we present
building blocks of a scalable solution that may help financial institutions to
build their own software to extract terrorism entities out of both structured
and unstructured financial messages in real time and with approximate
similarity matching approach.
Priyank Mathur, Arkajyoti Misra, Emrah Budur
Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
The increase in the use of microblogging came along with the rapid growth on
short linguistic data. On the other hand deep learning is considered to be the
new frontier to extract meaningful information out of large amount of raw data
in an automated manner. In this study, we engaged these two emerging fields to
come up with a robust language identifier on demand, namely Language
Identification Engine (LIDE). As a result, we achieved 95.12% accuracy in
Discriminating between Similar Languages (DSL) Shared Task 2015 dataset, which
is comparable to the maximum reported accuracy of 95.54% achieved so far.
Seunghyun Yoon, Hyeongu Yun, Yuna Kim, Gyu-tae Park, Kyomin Jung
Comments: AAAI workshop on Crowdsourcing, Deep Learning and Artificial Intelligence Agents, Feb 2017, San Francisco CA, USA
Subjects: Computation and Language (cs.CL)
In this paper, we propose an efficient transfer leaning methods for training
a personalized language model using a recurrent neural network with long
short-term memory architecture. With our proposed fast transfer learning
schemes, a general language model is updated to a personalized language model
with a small amount of user data and a limited computing resource. These
methods are especially useful for a mobile device environment while the data is
prevented from transferring out of the device for privacy purposes. Through
experiments on dialogue data in a drama, it is verified that our transfer
learning methods have successfully generated the personalized language model,
whose output is more similar to the personal language style in both qualitative
and quantitative aspects.
Emrah Budur
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Financial institutions have to screen their transactions to ensure that they
are not affiliated with terrorism entities. Developing appropriate solutions to
detect such affiliations precisely while avoiding any kind of interruption to
large amount of legitimate transactions is essential. In this paper, we present
building blocks of a scalable solution that may help financial institutions to
build their own software to extract terrorism entities out of both structured
and unstructured financial messages in real time and with approximate
similarity matching approach.
Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha
Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)
We study large-scale kernel methods for acoustic modeling in speech
recognition and compare their performance to deep neural networks (DNNs). We
perform experiments on four speech recognition datasets, including the TIMIT
and Broadcast News benchmark tasks, and compare these two types of models on
frame-level performance metrics (accuracy, cross-entropy), as well as on
recognition metrics (word/character error rate). In order to scale kernel
methods to these large datasets, we use the random Fourier feature method of
Rahimi and Recht (2007). We propose two novel techniques for improving the
performance of kernel acoustic models. First, in order to reduce the number of
random features required by kernel models, we propose a simple but effective
method for feature selection. The method is able to explore a large number of
non-linear features while maintaining a compact model more efficiently than
existing approaches. Second, we present a number of frame-level metrics which
correlate very strongly with recognition performance when computed on the
heldout set; we take advantage of these correlations by monitoring these
metrics during training in order to decide when to stop learning. This
technique can noticeably improve the recognition performance of both DNN and
kernel models, while narrowing the gap between them. Additionally, we show that
the linear bottleneck method of Sainath et al. (2013) improves the performance
of our kernel models significantly, in addition to speeding up training and
making the models more compact. Together, these three methods dramatically
improve the performance of kernel acoustic models, making their performance
comparable to DNNs on the tasks we explored.
Navid Mirnouri
Comments: 6 pages,2 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
New directions in computing and algorithms has lead to some new applications
that have tolerance to imprecision. Although, These applications are creating
large volumes of data which exceeds the capability of today’s computing
systems. Therefore, researchers are trying to find new techniques to alleviate
this crisis. Approximate Computing is one promising technique that uses a trade
off between precision and efficiency of computing. Acceleration is another
solution that uses specialized logics in order to do computations in a way that
is more power efficient. Another technique is Data compression which is used in
memory systems in order to save capacity and bandwidth.
Christof Schlaak, Maher Fakih, Ralf Stemmer
Comments: Presented at HIP3ES, 2017 7 pages, 6 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Operating Systems (cs.OS)
Timing and power consumption play an important role in the design of embedded
systems. Furthermore, both properties are directly related to the safety
requirements of many embedded systems. With regard to availability
requirements, power considerations are of uttermost importance for battery
operated systems. Validation of timing and power requires observability of
these properties. In many cases this is difficult, because the observability is
either not possible or requires big extra effort in the system validation
process. In this paper, we present a measurement-based approach for the joint
timing and power analysis of Synchronous Dataflow (SDF) applications running on
a shared memory multiprocessor systems-on-chip (MPSoC) architecture. As a
proof-of-concept, we implement an MPSoC system with configurable power and
timing measurement interfaces inside a Field Programmable Gate Array (FPGA).
Our experiments demonstrate the viability of our approach being able of
accurately analyzing different mappings of image processing applications (Sobel
filter and JPEG encoder) on an FPGA-based MPSoC implementation.
Utku Aydonat, Shane O'Connell, Davor Capalija, Andrew C. Ling, Gordon R. Chiu
Comments: To be published at FPGA 2017
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
Convolutional neural nets (CNNs) have become a practical means to perform
vision tasks, particularly in the area of image classification. FPGAs are well
known to be able to perform convolutions efficiently, however, most recent
efforts to run CNNs on FPGAs have shown limited advantages over other devices
such as GPUs. Previous approaches on FPGAs have often been memory bound due to
the limited external memory bandwidth on the FPGA device. We show a novel
architecture written in OpenCL(TM), which we refer to as a Deep Learning
Accelerator (DLA), that maximizes data reuse and minimizes external memory
bandwidth. Furthermore, we show how we can use the Winograd transform to
significantly boost the performance of the FPGA. As a result, when running our
DLA on Intel’s Arria 10 device we can achieve a performance of 1020 img/s, or
23 img/s/W when running the AlexNet CNN benchmark. This comes to 1382 GFLOPs
and is 10x faster with 8.4x more GFLOPS and 5.8x better efficiency than the
state-of-the-art on FPGAs. Additionally, 23 img/s/W is competitive against the
best publicly known implementation of AlexNet on nVidia’s TitanX GPU.
Guy Avni, Shubham Goel, Thomas A. Henzinger, Guillermo Rodriguez-Navas
Comments: Accepted to TACAS 2017
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Time-triggered switched networks are a deterministic communication
infrastructure used by real-time distributed embedded systems. Due to the
criticality of the applications running over them, developers need to ensure
that end-to-end communication is dependable and predictable. Traditional
approaches assume static networks that are not flexible to changes caused by
reconfigurations or, more importantly, faults, which are dealt with in the
application using redundancy. We adopt the concept of handling faults in the
switches from non-real-time networks while maintaining the required
predictability.
We study a class of forwarding schemes that can handle various types of
failures. We consider probabilistic failures. For a given network with a
forwarding scheme and a constant (ell), we compute the {em score} of the
scheme, namely the probability (induced by faults) that at least (ell)
messages arrive on time. We reduce the scoring problem to a reachability
problem on a Markov chain with a “product-like” structure. Its special
structure allows us to reason about it symbolically, and reduce the scoring
problem to #SAT. Our solution is generic and can be adapted to different
networks and other contexts. Also, we show the computational complexity of the
scoring problem is #P-complete, and we study methods to estimate the score. We
evaluate the effectiveness of our techniques with an implementation.
Bruno Dantas, Calmenelias Fleitas, Alexandre P. Francisco, José Simão, Cátia Vaz
Comments: 19 pages, 10 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Software Engineering (cs.SE)
Biosciences have been revolutionized by next generation sequencing (NGS)
technologies in last years, leading to new perspectives in medical, industrial
and environmental applications. And although our motivation comes from
biosciences, the following is true for many areas of science: published results
are usually hard to reproduce either because data is not available or tools are
not readily available, which delays the adoption of new methodologies and
hinders innovation. Our focus is on tool readiness and pipelines availability.
Even though most tools are freely available, pipelines for data analysis are in
general barely described and their configuration is far from trivial, with many
parameters to be tuned.
In this paper we discuss how to effectively build and use pipelines, relying
on state of the art computing technologies to execute them without users need
to configure, install and manage tools, servers and complex workflow management
systems. We perform an in depth comparative analysis of state of the art
frameworks and systems. The NGSPipes framework is proposed showing that we can
have public pipelines ready to process and analyse experimental data, produced
for instance by high-throughput technologies, but without relying on
centralized servers or Web services.
The NGSPipes framework and underlying architecture provides a major step
towards open science and true collaboration in what concerns tools and
pipelines among computational biology researchers and practitioners. We show
that it is possible to execute data analysis pipelines in a decentralized and
platform independent way. Approaches like the one proposed are crucial for
archiving and reusing data analysis pipelines at medium/long-term. NGSPipes
framework is freely available at this http URL
Joshua J. Daymude, Robert Gmyr, Andrea W. Richa, Christian Scheideler, Thim Strothmann
Subjects: Emerging Technologies (cs.ET); Distributed, Parallel, and Cluster Computing (cs.DC)
In this paper we consider programmable matter that consists of
computationally limited devices (which we call particles) that are able to
self-organize in order to achieve a desired collective goal without the need
for central control or external intervention. Particles can establish and
release bonds and can actively move in a self-organized way. We investigate the
feasibility of solving fundamental problems relevant for programmable matter.
As a model for such self-organizing particle systems, we use the geometric
amoebot model that has received some attention in the last years. We present an
efficient local-control algorithm for leader election, that elects a leader
particle in O(n) rounds with high probability, where n is the number of
particles in the system. Our algorithm relies only on local information (e.g.,
particles do not have ids, nor do they know n, or have any sort of global
coordinate system), and requires only a constant-size memory per particle.
Santiago Badia, Marc Olm
Subjects: Numerical Analysis (cs.NA); Distributed, Parallel, and Cluster Computing (cs.DC)
In this work, we propose two-level space-time domain decomposition
preconditioners for parabolic problems discretized using finite elements. They
are motivated as an extension to space-time of balancing domain decomposition
by constraints preconditioners. The key ingredients to be defined are the
sub-assembled space and operator, the coarse degrees of freedom (DOFs) in which
we want to enforce continuity among subdomains at the preconditioner level, and
the transfer operator from the sub-assembled to the original finite element
space. With regard to the sub-assembled operator, a perturbation of the time
derivative is needed to end up with a well-posed preconditioner. The set of
coarse DOFs includes the time average (at the space-time subdomain) of
classical space constraints plus new constraints between consecutive subdomains
in time. Numerical experiments show that the proposed schemes are weakly
scalable in time, i.e., we can efficiently exploit increasing computational
resources to solve more time steps in the same {total elapsed} time. Further,
the scheme is also weakly space-time scalable, since it leads to asymptotically
constant iterations when solving larger problems both in space and time.
Excellent {wall clock} time weak scalability is achieved for space-time
parallel solvers on some thousands of cores.
Deanna Needell, Tina Woolf
Comments: 5 pages, 2 figures
Subjects: Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Asynchronous parallel computing and sparse recovery are two areas that have
received recent interest. Asynchronous algorithms are often studied to solve
optimization problems where the cost function takes the form (sum_{i=1}^M
f_i(x)), with a common assumption that each (f_i) is sparse; that is, each
(f_i) acts only on a small number of components of (xinmathbb{R}^n). Sparse
recovery problems, such as compressed sensing, can be formulated as
optimization problems, however, the cost functions (f_i) are dense with respect
to the components of (x), and instead the signal (x) is assumed to be sparse,
meaning that it has only (s) non-zeros where (sll n). Here we address how one
may use an asynchronous parallel architecture when the cost functions (f_i) are
not sparse in (x), but rather the signal (x) is sparse. We propose an
asynchronous parallel approach to sparse recovery via a stochastic greedy
algorithm, where multiple processors asynchronously update a vector in shared
memory containing information on the estimated signal support. We include
numerical simulations that illustrate the potential benefits of our proposed
asynchronous method.
Arnim Bleier
Comments: NIPS 2016 Workshop: Advances in Approximate Bayesian Inference
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Dirichlet process mixture models (DPMM) are a cornerstone of Bayesian
non-parametrics. While these models free from choosing the number of components
a-priori, computationally attractive variational inference often reintroduces
the need to do so, via a truncation on the variational distribution. In this
paper we present a truncation-free hybrid inference for DPMM, combining the
advantages of sampling-based MCMC and variational methods. The proposed
hybridization enables more efficient variational updates, while increasing
model complexity only if needed. We evaluate the properties of the hybrid
updates and their empirical performance in single- as well as mixed-membership
models. Our method is easy to implement and performs favorably compared to
existing schemas.
Valeriya Naumova, Karin Schnass
Comments: 22 pages, 8 figures
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
This paper extends the recently proposed and theoretically justified
iterative thresholding and (K) residual means algorithm ITKrM to learning
dicionaries from incomplete/masked training data (ITKrMM). It further adapts
the algorithm to the presence of a low rank component in the data and provides
a strategy for recovering this low rank component again from incomplete data.
Several synthetic experiments show the advantages of incorporating information
about the corruption into the algorithm. Finally, image inpainting is
considered as application example, which demonstrates the superior performance
of ITKrMM in terms of speed and/or reconstruction quality compared to its
closest dictionary learning counterpart.
Jielei Chu, Hongjun Wang, Hua Meng, Peng Jin, Tianrui Li (Senior member, IEEE)
Comments: 13pages
Subjects: Learning (cs.LG)
Restricted Boltzmann machines (RBMs) and their variants are usually trained
by contrastive divergence (CD) learning, but the training procedure is an
unsupervised learning approach, without any guidances of the background
knowledge. To enhance the expression ability of traditional RBMs, in this
paper, we propose pairwise constraints restricted Boltzmann machine with
Gaussian visible units (pcGRBM) model, in which the learning procedure is
guided by pairwise constraints and the process of encoding is conducted under
these guidances. The pairwise constraints are encoded in hidden layer features
of pcGRBM. Then, some pairwise hidden features of pcGRBM flock together and
another part of them are separated by the guidances. In order to deal with
real-valued data, the binary visible units are replaced by linear units with
Gausian noise in the pcGRBM model. In the learning process of pcGRBM, the
pairwise constraints are iterated transitions between visible and hidden units
during CD learning procedure. Then, the proposed model is inferred by
approximative gradient descent method and the corresponding learning algorithm
is designed in this paper. In order to compare the availability of pcGRBM and
traditional RBMs with Gaussian visible units, the features of the pcGRBM and
RBMs hidden layer are used as input ‘data’ for K-means, spectral clustering
(SP) and affinity propagation (AP) algorithms, respectively. A thorough
experimental evaluation is performed with sixteen image datasets of Microsoft
Research Asia Multimedia (MSRA-MM). The experimental results show that the
clustering performance of K-means, SP and AP algorithms based on pcGRBM model
are significantly better than traditional RBMs. In addition, the pcGRBM model
for clustering task shows better performance than some semi-supervised
clustering algorithms.
Jan Žegklitz, Petr Pošík
Comments: Submitted to Journal of Heuristics
Subjects: Learning (cs.LG)
Recently, several algorithms for symbolic regression (SR) emerged which
employ a form of multiple linear regression (LR) to produce generalized linear
models. The use of LR allows the algorithms to create models with relatively
small error right from the beginning of the search; such algorithms are thus
claimed to be (sometimes by orders of magnitude) faster than SR algorithms
based on vanilla genetic programming. However, a systematic comparison of these
algorithms on a common set of problems is still missing. In this paper we
conceptually and experimentally compare several representatives of such
algorithms (GPTIPS, FFX, and EFS). They are applied as off-the-shelf,
ready-to-use techniques, mostly using their default settings. The methods are
compared on several synthetic and real-world SR benchmark problems. Their
performance is also related to the performance of three conventional machine
learning algorithms — multiple regression, random forests and support vector
regression.
Riccardo Satta, Stefano Cavallari, Eraldo Pomponi, Daniele Grasselli, Davide Picheo, Carlo Annis
Comments: keywords: predictive maintenance, condition-based maintenance, prognosis, machine learning, dissimilarity-based representation, HVAC. 15 pages
Subjects: Learning (cs.LG)
The goal of predictive maintenance is to forecast the occurrence of faults of
an appliance, in order to proactively take the necessary actions to ensure its
availability. In many application scenarios, predictive maintenance is applied
to a set of homogeneous appliances. In this paper, we firstly review taxonomies
and main methodologies currently used for condition-based maintenance;
secondly, we argue that the mutual dissimilarities of the behaviours of all
appliances of this set (the “cohort”) can be exploited to detect upcoming
faults. Specifically, inspired by dissimilarity-based representations, we
propose a novel machine learning approach based on the analysis of concurrent
mutual differences of the measurements coming from the cohort. We evaluate our
method over one year of historical data from a cohort of 17 HVAC (Heating,
Ventilation and Air Conditioning) systems installed in an Italian hospital. We
show that certain kinds of faults can be foreseen with an accuracy, measured in
terms of area under the ROC curve, as high as 0.96.
Deanna Needell, Tina Woolf
Comments: 5 pages, 2 figures
Subjects: Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Asynchronous parallel computing and sparse recovery are two areas that have
received recent interest. Asynchronous algorithms are often studied to solve
optimization problems where the cost function takes the form (sum_{i=1}^M
f_i(x)), with a common assumption that each (f_i) is sparse; that is, each
(f_i) acts only on a small number of components of (xinmathbb{R}^n). Sparse
recovery problems, such as compressed sensing, can be formulated as
optimization problems, however, the cost functions (f_i) are dense with respect
to the components of (x), and instead the signal (x) is assumed to be sparse,
meaning that it has only (s) non-zeros where (sll n). Here we address how one
may use an asynchronous parallel architecture when the cost functions (f_i) are
not sparse in (x), but rather the signal (x) is sparse. We propose an
asynchronous parallel approach to sparse recovery via a stochastic greedy
algorithm, where multiple processors asynchronously update a vector in shared
memory containing information on the estimated signal support. We include
numerical simulations that illustrate the potential benefits of our proposed
asynchronous method.
Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Programming Languages (cs.PL); Computation (stat.CO)
We propose Edward, a Turing-complete probabilistic programming language.
Edward builds on two compositional representations—random variables and
inference. By treating inference as a first class citizen, on a par with
modeling, we show that probabilistic programming can be as flexible and
computationally efficient as traditional deep learning. For flexibility, Edward
makes it easy to fit the same model using a variety of composable inference
methods, ranging from point estimation, to variational inference, to MCMC. In
addition, Edward can reuse the modeling representation as part of inference,
facilitating the design of rich variational models and generative adversarial
networks. For efficiency, Edward is integrated into TensorFlow, providing
significant speedups over existing probabilistic systems. For example, on a
benchmark logistic regression task, Edward is at least 35x faster than Stan and
PyMC3.
Ori Katz, Ronen Talmon, Yu-Lun Lo, Hau-Tieng Wu
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Data Analysis, Statistics and Probability (physics.data-an)
The problem of information fusion from multiple data-sets acquired by
multimodal sensors has drawn significant research attention over the years. In
this paper, we focus on a particular problem setting consisting of a physical
phenomenon or a system of interest observed by multiple sensors. We assume that
all sensors measure some aspects of the system of interest with additional
sensor-specific and irrelevant components. Our goal is to recover the variables
relevant to the observed system and to filter out the nuisance effects of the
sensor-specific variables. We propose an approach based on manifold learning,
which is particularly suitable for problems with multiple modalities, since it
aims to capture the intrinsic structure of the data and relies on minimal prior
model knowledge. Specifically, we propose a nonlinear filtering scheme, which
extracts the hidden sources of variability captured by two or more sensors,
that are independent of the sensor-specific components. In addition to
presenting a theoretical analysis, we demonstrate our technique on real
measured data for the purpose of sleep stage assessment based on multiple,
multimodal sensor measurements. We show that without prior knowledge on the
different modalities and on the measured system, our method gives rise to a
data-driven representation that is well correlated with the underlying sleep
process and is robust to noise and sensor-specific effects.
Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha
Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)
We study large-scale kernel methods for acoustic modeling in speech
recognition and compare their performance to deep neural networks (DNNs). We
perform experiments on four speech recognition datasets, including the TIMIT
and Broadcast News benchmark tasks, and compare these two types of models on
frame-level performance metrics (accuracy, cross-entropy), as well as on
recognition metrics (word/character error rate). In order to scale kernel
methods to these large datasets, we use the random Fourier feature method of
Rahimi and Recht (2007). We propose two novel techniques for improving the
performance of kernel acoustic models. First, in order to reduce the number of
random features required by kernel models, we propose a simple but effective
method for feature selection. The method is able to explore a large number of
non-linear features while maintaining a compact model more efficiently than
existing approaches. Second, we present a number of frame-level metrics which
correlate very strongly with recognition performance when computed on the
heldout set; we take advantage of these correlations by monitoring these
metrics during training in order to decide when to stop learning. This
technique can noticeably improve the recognition performance of both DNN and
kernel models, while narrowing the gap between them. Additionally, we show that
the linear bottleneck method of Sainath et al. (2013) improves the performance
of our kernel models significantly, in addition to speeding up training and
making the models more compact. Together, these three methods dramatically
improve the performance of kernel acoustic models, making their performance
comparable to DNNs on the tasks we explored.
Adel Javanmard
Comments: 23 pages
Subjects: Computer Science and Game Theory (cs.GT); Learning (cs.LG); Machine Learning (stat.ML)
We consider a firm that sells a large number of products to its customers in
an online fashion. Each product is described by a high dimensional feature
vector, and the market value of a product is assumed to be linear in the values
of its features. Parameters of the valuation model are unknown and can change
over time. The firm sequentially observes a product’s features and can use the
historical sales data (binary sale/no sale feedbacks) to set the price of
current product, with the objective of maximizing the collected revenue. We
measure the performance of a dynamic pricing policy via regret, which is the
expected revenue loss compared to a clairvoyant that knows the sequence of
model parameters in advance.
We propose a pricing policy based on projected stochastic gradient descent
(PSGD) and characterize its regret in terms of time (T), features dimension
(d), and the temporal variability in the model parameters, (delta_t). We
consider two settings. In the first one, feature vectors are chosen
antagonistically by nature and we prove that the regret of PSGD pricing policy
is of order (O(sqrt{T} + sum_{t=1}^T sqrt{t}delta_t)). In the second
setting (referred to as stochastic features model), the feature vectors are
drawn independently from an unknown distribution. We show that in this case,
the regret of PSGD pricing policy is of order (O(d^2 log T + sum_{t=1}^T
tdelta_t)).
Xiang Xiang, Trac D. Tran
Comments: Codes available at this https URL and this https URL arXiv admin note: text overlap with arXiv:1410.1606
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Limited annotated data available for the recognition of facial expression and
action units embarrasses the training of deep networks, which can learn
disentangled invariant features. However, a linear model with just several
parameters normally is not demanding in terms of training data. In this paper,
we propose an elegant linear model to untangle confounding factors in
challenging realistic multichannel signals such as 2D face videos. The simple
yet powerful model does not rely on huge training data and is natural for
recognizing facial actions without explicitly disentangling the identity. Base
on well-understood intuitive linear models such as Sparse Representation based
Classification (SRC), previous attempts require a prepossessing of explicit
decoupling which is practically inexact. Instead, we exploit the low-rank
property across frames to subtract the underlying neutral faces which are
modeled jointly with sparse representation on the action components with group
sparsity enforced. On the extended Cohn-Kanade dataset (CK+), our one-shot
automatic method on raw face videos performs as competitive as SRC applied on
manually prepared action components and performs even better than SRC in terms
of true positive rate. We apply the model to the even more challenging task of
facial action unit recognition, verified on the MPI Face Video Database
(MPI-VDB) achieving a decent performance. All the programs and data have been
made publicly available.
Rafah El-Khatib, Nicolas Macris, Tom Richardson, Rüdiger Urbanke
Comments: 5 pages, 1 figure, submitted to the IEEE International Symposium on Information Theory (ISIT) 2014
Subjects: Information Theory (cs.IT)
Potential functionals have been introduced recently as an important tool for
the analysis of coupled scalar systems (e.g. density evolution equations). In
this contribution, we investigate interesting properties of this potential.
Using the tool of displacement convexity, we show that, under mild assumptions
on the system, the potential functional is displacement convex. Furthermore, we
give the conditions on the system such that the potential is strictly
displacement convex, in which case the minimizer is unique.
Yuhua Sun, Tongjiang Yan, Qiang Wang
Comments: 15 pages
Subjects: Information Theory (cs.IT)
Pseudo-random sequences with good statistical property, such as low
autocorrelation, high linear complexity and large 2-adic complexity, have been
applied in stream cipher. In general, it is difficult to give both the linear
complexity and 2-adic complexity of a periodic binary sequence. Cai and Ding
cite{Cai Ying} gave a class of sequences with almost optimal autocorrelation
by constructing almost difference sets. Wang cite{Wang Qi} proved that one
type of those sequences by Cai and Ding has large linear complexity. Sun et al.
cite{Sun Yuhua} showed that another type of sequences by Cai and Ding has also
large linear complexity. Additionally, Sun et al. also generalized the
construction by Cai and Ding using (d)-form function with difference-balanced
property. In this paper, we first give the detailed autocorrelation
distribution of the sequences was generalized from Cai and Ding cite{Cai Ying}
by Sun et al. cite{Sun Yuhua}. Then, inspired by the method of Hu cite{Hu
Honggang}, we analyse their 2-adic complexity and give a lower bound on the
2-adic complexity of these sequences. Our result show that the 2-adic
complexity of these sequences is at least (N-mathrm{log}_2sqrt{N+1}) and that
it reach (N-1) in many cases, which are large enough to resist the rational
approximation algorithm (RAA) for feedback with carry shift registers (FCSRs).
Rafah El-Khatib, Nicolas Macris
Comments: 5 pages, 3 figures, submitted to the IEEE International Symposium on Information Theory (ISIT) 2016
Subjects: Information Theory (cs.IT)
We consider the dynamics of belief propagation decoding of spatially coupled
Low-Density Parity-Check codes. It has been conjectured that after a short
transient phase, the profile of “error probabilities” along the spatial
direction of a spatially coupled code develops a uniquely-shaped wave-like
solution that propagates with constant velocity v. Under this assumption, and
for transmission over general Binary Memoryless Symmetric channels, we derive a
formula for v. We also propose approximations that are simpler to compute and
support our findings using numerical data.
Rafah El-Khatib, Nicolas Macris
Comments: 5 pages, 5 figures, submitted to the Information Theory Workshop (ITW) 2016 in Cambridge, UK
Subjects: Information Theory (cs.IT)
We consider spatially coupled systems governed by a set of scalar density
evolution equations. Such equations track the behavior of message-passing
algorithms used, for example, in coding, sparse sensing, or
constraint-satisfaction problems. Assuming that the “profile” describing the
average state of the algorithm exhibits a solitonic wave-like behavior after
initial transient iterations, we derive a formula for the propagation velocity
of the wave. We illustrate the formula with two applications, namely
Generalized LDPC codes and compressive sensing.
Anqi He, Lifeng Wang, Yue Chen, Kai-Kit Wong, Maged Elkashlan
Comments: 13 pages, 9 figures
Subjects: Information Theory (cs.IT)
One of key 5G scenarios is that device-to-device (D2D) and massive
multiple-input multiple-output (MIMO) will be co-existed. However, interference
in the uplink D2D underlaid massive MIMO cellular networks needs to be
coordinated, due to the vast cellular and D2D transmissions. To this end, this
paper introduces a spatially dynamic power control solution for mitigating the
cellular-to-D2D and D2D-to-cellular interference. In particular, the proposed
D2D power control policy is rather flexible including the special cases of no
D2D links or using maximum transmit power. Under the considered power control,
an analytical approach is developed to evaluate the spectral efficiency (SE)
and energy efficiency (EE) in such networks. Thus, the exact expressions of SE
for a cellular user or D2D transmitter are derived, which quantify the impacts
of key system parameters such as massive MIMO antennas and D2D density.
Moreover, the D2D scale properties are obtained, which provide the sufficient
conditions for achieving the anticipated SE. Numerical results corroborate our
analysis and show that the proposed power control solution can efficiently
mitigate interference between the cellular and D2D tier. The results
demonstrate that there exists the optimal D2D density for maximizing the area
SE of D2D tier. In addition, the achievable EE of a cellular user can be
comparable to that of a D2D user.
K. Li, J. Zhang, Z. Li, S. Cong
Subjects: Information Theory (cs.IT); Quantum Physics (quant-ph)
Quantum state tomography is a fundamental technique for quantum technology,
with many applications in quantum control and quantum communication. Due to the
exponential complexity of the resources required for QST, people are looking
for approaches that identify quantum states with less efforts and faster speed.
In this Letter, we provide a tailored efficient method for reconstructing mixed
quantum states up to (12) (or even more) qubits from an incomplete set of
observables subject to noises. Our method is applicable to any pure state
(
ho), and can be extended to many states of interest in quantum information
tasks such as (W) state, GHZ state and cluster states that are matrix product
operators of low dimensions. The method applies the quantum density matrix
constraints to a quantum compressive sensing optimization problem, and exploits
a modified Quantum Alternating Direction Multiplier Method (Quantum-ADMM) to
accelerate the convergence. Our algorithm takes (8,35, 226) seconds to
reconstruct arbitrary superposition state density matrices of (10,11,12) qubits
with acceptable fidelity respectively, using less than (1 \%) measurements of
expectation on a normal desktop, which is the fastest realization to date. We
further discuss applications of this method using experimental data of mixed
states obtained in an ion trap experiment up to (8) qubits.
Soheil Khavari Moghaddam, S. Mohammad Razavizadeh
Comments: 20 pages, 7 figures
Subjects: Information Theory (cs.IT)
3D beamforming is a promising approach for interference coordination in
cellular networks which brings significant improvements in comparison with
conventional 2D beamforming techniques. This paper investigates the problem of
joint beamforming design and tilt angle adaptation of the BS antenna array for
maximizing energy efficiency (EE) in downlink of multi-cell multi-user
coordinated cellular networks. An iterative algorithm based on fractional
programming approach is introduced to solve the resulting non-convex
optimization problem. In each iteration, users are clustered based on their
elevation angle. Then, optimization of the tilt angle is carried out through a
reduced complexity greedy search to find the best tilt angle for a given
placement of the users. Numerical results show that the proposed algorithm
achieves higher EE compared to the 2D beamforming techniques.
Meryem Benammar, Abdellatif Zaidi
Comments: Submitted to IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
In this work, we investigate two source coding models, a emph{Helper}
problem and a emph{Gray-Wyner} problem, under equivocation constraints.
Specifically, in the Helper problem, an encoder communicates with a legitimate
receiver through noise-free rate-limited public and private links; and an
external passive eavesdropper intercepts every information that is sent on the
public link. We study two classes of this model: i) when a pair of arbitrarily
correlated discrete memoryless sources is to be encoded such that one component
has to be recovered lossily at the legitimate receiver while the equivocation
about both components at the eavesdropper must be maintained no smaller than
some prescribed level; and ii) when the legitimate receiver reproduces both
components, one of which, that is recovered losslessly, has to be concealed
from the eavesdropper to some equivocation level. For both classes problems, we
establish single-letter characterizations of optimal
rate-distortion-equivocation tradeoffs in the discrete memoryless case. Next,
we extend our results to the case of two legitimate receivers, i.e., Gray-Wyner
network with equivocation constraints. Here, two legitimate receivers are
connected to the encoder each through a dedicated error-free private link as
well as a common error-free public link; and an external passive eavesdropper
overhears on the public link. We study two classes of this model that are
extensions of the aforementioned instances of Helper problems to the case of
two receivers. For each of the two classes, we establish a single-letter
characterization of the optimal rate-distortion-equivocation region. Throughout
the paper, the analysis sheds light on the role of the private links, and we
illustrate the results by computing them for some binary examples. Also, we
make some meaningful connections, e.g., with problems of secret-sharing and
encryption.
Wenyi Zhang, Lingyan Huang
Comments: 5 pages
Subjects: Information Theory (cs.IT)
OR multi-access channel is a simple model where the channel output is the
Boolean OR among the Boolean channel inputs. We revisit this model, showing
that employing Bloom filter, a randomized data structure, as channel inputs
achieves its capacity region with joint decoding and the symmetric sum rate of
(ln 2) bits per channel use without joint decoding. We then proceed to the
“many-access” regime where the number of potential users grows without bound,
treating both activity recognition and message transmission problems,
establishing scaling laws which are optimal within a constant factor, based on
Bloom filter channel inputs.
Meryem Benammar, Abdellatif Zaidi
Comments: Submitted to IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
In this work, we establish a full single-letter characterization of the
rate-distortion region of an instance of the Gray-Wyner model with side
information at the decoders. Specifically, in this model an encoder observes a
pair of memoryless, arbitrarily correlated, sources ((S^n_1,S^n_2)) and
communicates with two receivers over an error-free rate-limited link of
capacity (R_0), as well as error-free rate-limited individual links of
capacities (R_1) to the first receiver and (R_2) to the second receiver. Both
receivers reproduce the source component (S^n_2) losslessly; and Receiver (1)
also reproduces the source component (S^n_1) lossily, to within some prescribed
fidelity level (D_1). Also, Receiver (1) and Receiver (2) are equipped
respectively with memoryless side information sequences (Y^n_1) and (Y^n_2).
Important in this setup, the side information sequences are arbitrarily
correlated among them, and with the source pair ((S^n_1,S^n_2)); and are not
assumed to exhibit any particular ordering. Furthermore, by specializing the
main result to two Heegard-Berger models with successive refinement and
scalable coding, we shed light on the roles of the common and private
descriptions that the encoder should produce and what they should carry
optimally. We develop intuitions by analyzing the developed single-letter
optimal rate-distortion regions of these models, and discuss some insightful
binary examples.
Fabio Giovanneschi, Kumar Vijay Mishra, Maria Antonia Gonzalez-Huici, Yonina C. Eldar, Joachim H. G. Ender
Comments: 4 pages, IGARRS2017
Subjects: Information Theory (cs.IT)
Sparse decomposition of ground penetration radar (GPR) signals facilitates
the use of compressed sensing techniques for faster data acquisition and
enhanced feature extraction for target classification. In this paper, we
investigate use of an online dictionary learning (ODL) technique in the context
of GPR to bring down the learning time as well as improve identification of
abandoned anti-personnel landmines. Our experimental results using real data
from an L-band GPR for PMN/PMA2, ERA and T72 mines show that ODL reduces
learning time by 94\% and increases clutter detection by 10\% over the
classical K-SVD algorithm. Moreover, our methods could be helpful in cognitive
operation of the GPR where the system adapts the range sampling based on the
learned dictionary.
Erdem Biyik, Jean Barbier, Mohamad Dia
Subjects: Information Theory (cs.IT)
Sparse superposition (SS) codes were originally proposed as a
capacity-achieving communication scheme over the additive white Gaussian noise
channel (AWGNC) [1]. Very recently, it was discovered that these codes are
universal, in the sense that they achieve capacity over any memoryless channel
under generalized approximate message-passing (GAMP) decoding [2], although
this decoder has never been stated for SS codes. In this contribution we
introduce the GAMP decoder for SS codes, we confirm empirically the
universality of this communication scheme through its study on various channels
and we provide the main analysis tools: state evolution and potential. We also
compare the performance of GAMP with the Bayes-optimal MMSE decoder. We
empirically illustrate that despite the presence of a phase transition
preventing GAMP to reach the optimal performance, spatial coupling allows to
boost the performance that eventually tends to capacity in a proper limit. We
also prove that, in contrast with the AWGNC case, SS codes for binary input
channels have a vanishing error floor in the limit of large codewords.
Moreover, the performance of Hadamard-based encoders is assessed for practical
implementations.
J. Harshan, Sang-Yoon Chang, Yih-Chun Hu
Comments: To appear in the Proc. of IEEE WCNC 2017
Subjects: Information Theory (cs.IT)
Physical-layer group secret-key (GSK) generation is an effective way of
generating secret keys in wireless networks, wherein the nodes exploit inherent
randomness in the wireless channels to generate group keys, which are
subsequently applied to secure messages while broadcasting, relaying, and other
network-level communications. While existing GSK protocols focus on securing
the common source of randomness from external eavesdroppers, they assume that
the legitimate nodes of the group are trusted. In this paper, we address
insider attacks from the legitimate participants of the wireless network during
the key generation process. Instead of addressing conspicuous attacks such as
switching-off communication, injecting noise, or denying consensus on group
keys, we introduce stealth attacks that can go undetected against
state-of-the-art GSK schemes. We propose two forms of attacks, namely: (i)
different-key attacks, wherein an insider attempts to generate different keys
at different nodes, especially across nodes that are out of range so that they
fail to recover group messages despite possessing the group key, and (ii)
low-rate key attacks, wherein an insider alters the common source of randomness
so as to reduce the key-rate. We also discuss various detection techniques,
which are based on detecting anomalies and inconsistencies on the channel
measurements at the legitimate nodes. Through simulations we show that GSK
generation schemes are vulnerable to insider-threats, especially on topologies
that cannot support additional secure links between neighbouring nodes to
verify the attacks.
J. Harshan, Amin Sakzad, Emanuele Viterbo
Comments: To appear in the Proc. of IEEE WCNC 2017. Substantial text overlap with arXiv:1308.4201v1
Subjects: Information Theory (cs.IT)
In multiple-input multiple-output (MIMO) fading channels, the design
criterion for full-diversity space-time block codes (STBCs) is primarily
determined by the decoding method at the receiver. Although constructions of
STBCs have predominantly matched the maximum-likelihood (ML) decoder, design
criteria and constructions of full-diversity STBCs have also been reported for
low-complexity linear receivers. A new receiver architecture called
Integer-Forcing (IF) linear receiver has been proposed to MIMO channels by Zhan
et al. which showed promising results for the high-rate V-BLAST encoding
scheme. In this work we address the design of full-diversity STBCs for IF
linear receivers. We derive an upper bound on the probability of decoding
error, and show that STBCs that satisfy the non-vanishing singular value (NVS)
property provide full-diversity for the IF receiver. We also present simulation
results to demonstrate that linear designs with NVS property provide full
diversity for IF receiver. As a special case of our analysis on STBCs, we
present an upper bound on the error probability for the V-BLAST architecture
presented by Zhan emph{et al.}, and demonstrate that the IF linear receivers
provide full receive diversity. Our results supplement the existing outage
probability based results for the IF receiver.
Anoop Thomas, B. Sundar Rajan
Comments: 12 pages with 6 tables
Subjects: Information Theory (cs.IT)
The index coding problem has been generalized recently to accommodate
receivers which demand functions of messages and which possess functions of
messages. The connections between index coding and matroid theory have been
well studied in the recent past. Index coding solutions were first connected to
multi linear representation of matroids. For vector linear index codes discrete
polymatroids which can be viewed as a generalization of the matroids was used.
It was shown that a vector linear solution to an index coding problem exists if
and only if there exists a representable discrete polymatroid satisfying
certain conditions. In this work we explore the connections between generalized
index coding and discrete polymatroids. The conditions that needs to be
satisfied by a representable discrete polymatroid for a generalized index
coding problem to have a vector linear solution is established. From a discrete
polymatroid we construct an index coding problem with coded side information
and shows that if the index coding problem has a certain optimal length
solution then the discrete polymatroid satisfies certain properties. From a
matroid we construct a similar generalized index coding problem and shows that
the index coding problem has a binary scalar linear solution of optimal length
if and only if the matroid is binary representable.
Fariborz Salehi, Kishore Jaganathan, Babak Hassibi
Comments: To appear in ICASSP 2017
Subjects: Information Theory (cs.IT); Optimization and Control (math.OC)
Phaseless super-resolution is the problem of recovering an unknown signal
from measurements of the magnitudes of the low frequency Fourier transform of
the signal. This problem arises in applications where measuring the phase, and
making high-frequency measurements, are either too costly or altogether
infeasible. The problem is especially challenging because it combines the
difficult problems of phase retrieval and classical super-resolution