Ronald T. Kneusel, Michael C. Mozer
Subjects: Human-Computer Interaction (cs.HC); Neural and Evolutionary Computing (cs.NE)
Advances in machine learning have produced systems that attain human-level
performance on certain visual tasks, e.g., object identification. Nonetheless,
other tasks requiring visual expertise are unlikely to be entrusted to machines
for some time, e.g., satellite and medical imagery analysis. We describe a
human-machine cooperative approach to visual search, the aim of which is to
outperform either human or machine acting alone. The traditional route to
augmenting human performance with automatic classifiers is to draw boxes around
regions of an image deemed likely to contain a target. Human experts typically
reject this type of hard highlighting. We propose instead a soft highlighting
technique in which the saliency of regions of the visual field is modulated in
a graded fashion based on classifier confidence level. We report on experiments
with both synthetic and natural images showing that soft highlighting achieves
a performance synergy surpassing that attained by hard highlighting.
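Soft highlighting lends itself to a compact sketch. Below is a minimal numpy illustration of the idea as we read it, with the confidence map, dimming factor, and gamma exponent all our own illustrative choices rather than the paper's:

```python
import numpy as np

def soft_highlight(image, confidence, gamma=1.5, floor=0.3):
    """Modulate image saliency in a graded fashion by classifier confidence.

    image      : (H, W) grayscale array in [0, 1]
    confidence : (H, W) classifier confidence map in [0, 1]
    gamma      : illustrative contrast exponent for the modulation curve
    floor      : illustrative dimming level for zero-confidence regions
    """
    # Graded gain: low-confidence regions are dimmed, high-confidence
    # regions stay near full contrast; no hard boxes are drawn.
    gain = confidence ** gamma
    return floor * image + (1.0 - floor) * gain * image

# Example: random image with one high-confidence "target" blob.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
conf = np.zeros((64, 64))
conf[20:30, 20:30] = 0.9
out = soft_highlight(img, conf)
```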
Nija Mani, Gursaran, Ashish Mani
Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Quantum-inspired Evolutionary Algorithms (QEAs) were proposed more than a decade ago
and have been employed to solve a wide range of difficult search and
optimization problems. A number of changes have been proposed to improve the
performance of the canonical QEA. However, the canonical QEA is one of the few
evolutionary algorithms that use a search operator with a relatively large
number of parameters. It is well known that the performance of evolutionary
algorithms depends on the specific parameter values chosen for a given problem.
The advantage of having a large number of parameters in an operator is that the
search process can be made more powerful even with a single operator, without
requiring a combination of other operators for exploration and exploitation.
However, tuning operators with a large number of parameters is complex and
computationally expensive. This paper proposes a novel heuristic method for
tuning the parameters of the canonical QEA. The tuned QEA outperforms the
canonical QEA on a class of discrete combinatorial optimization problems, which
validates the design of the proposed parameter tuning framework. The proposed
framework can also be used to tune other algorithms with both large and small
numbers of tunable parameters.
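For context, the canonical QEA's search operator is the Q-bit rotation gate, whose angles are among the many parameters such a tuning framework would set. A hedged numpy sketch of that operator; the angle scheme below is illustrative, not the paper's tuned values:

```python
import numpy as np

def rotate_qbits(alpha, beta, x, b, delta=0.05 * np.pi):
    """One canonical-QEA-style rotation-gate update (illustrative).

    alpha, beta : Q-bit amplitudes, alpha**2 + beta**2 == 1 per bit
    x, b        : current binary solution and best-so-far solution
    delta       : rotation-angle magnitude, one of the many tunable
                  parameters a tuning heuristic would set
    """
    # Rotate each Q-bit toward the best solution's bit value.
    theta = np.where(x == b, 0.0, np.where(b == 1, delta, -delta))
    a_new = np.cos(theta) * alpha - np.sin(theta) * beta
    b_new = np.sin(theta) * alpha + np.cos(theta) * beta
    return a_new, b_new

def observe(alpha, rng):
    """Collapse Q-bits to bits: P(bit = 1) = beta^2 = 1 - alpha^2."""
    return (rng.random(alpha.shape) > alpha**2).astype(int)
```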
Jaime Fernando Delgado Saa, Mujdat Cetin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
A brain-computer interface (BCI) is a system that aims to establish a
non-muscular communication path for subjects suffering from a
neurodegenerative disease. Many BCI systems make use of the phenomena of
event-related synchronization and de-synchronization of brain waves as the main
feature for classifying different cognitive tasks. However, the temporal
dynamics of the electroencephalographic (EEG) signals contain additional
information that can be incorporated into the inference engine in order to
improve the performance of BCIs. This information about the dynamics of the
signals has been exploited previously in BCIs by means of generative and
discriminative methods. In particular, hidden Markov models (HMMs) have been
used in previous works. These methods have the disadvantage that model
parameters such as the number of hidden states and the number of Gaussian
mixtures need to be fixed a priori. In this work, we propose a Bayesian
nonparametric model for brain signal classification that does not require a
priori selection of the number of hidden states and the number of Gaussian
mixtures of an HMM. The results show that the proposed model outperforms other
HMM-based methods as well as the winning algorithm of BCI competition IV.
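The Bayesian nonparametric route around fixing the number of states typically relies on a Dirichlet-process-style prior. A minimal stick-breaking sketch in numpy, assuming a truncated DP; the concentration value is illustrative:

```python
import numpy as np

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Instead of fixing the number of hidden states a priori, a DP prior
    lets the data decide how many states carry appreciable weight.
    """
    betas = rng.beta(1.0, alpha, size=n_atoms)      # stick proportions
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining                         # mixture weights

rng = np.random.default_rng(0)
w = stick_breaking(alpha=2.0, n_atoms=50, rng=rng)
print(w.sum(), (w > 1e-3).sum())  # ~1.0; only a few states dominate
```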
Fang Zhao, Jiashi Feng, Jian Zhao, Wenhan Yang, Shuicheng Yan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Face recognition techniques have developed significantly in recent
years. However, recognizing faces with partial occlusion is still challenging
for existing face recognizers, a capability heavily desired in real-world
applications concerning surveillance and security. Although much research
effort has been devoted to developing face de-occlusion methods, most of them
can only work well under constrained conditions, such as when all the faces
come from a pre-defined closed set. In this paper, we propose a robust
LSTM-Autoencoders (RLA) model to effectively restore partially occluded faces
even in the wild. The RLA model consists of two LSTM components, which aim at
occlusion-robust face encoding and recurrent occlusion removal respectively.
The first one, named the multi-scale spatial LSTM encoder, reads facial patches
of various scales sequentially to output a latent representation;
occlusion-robustness is achieved owing to the fact that the occlusion affects
only some of the patches. Receiving the representation learned by the encoder,
the LSTM decoder with a dual-channel architecture reconstructs the overall face
and detects occlusion simultaneously, and by virtue of the LSTM, the decoder
breaks the task of face de-occlusion down into restoring the occluded part step
by step. Moreover, to minimize identity information loss and guarantee face
recognition accuracy over recovered faces, we introduce an identity-preserving
adversarial training scheme to further improve RLA. Extensive experiments on
both synthetic and real datasets of faces with occlusion clearly demonstrate
the effectiveness of our proposed RLA in removing different types of facial
occlusion at various locations. The proposed method also provides a
significantly larger performance gain than other de-occlusion methods in
promoting recognition performance over partially occluded faces.
Jian Shi, Yue Dong, Hao Su, Stella X. Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We consider the non-Lambertian object intrinsic problem of recovering diffuse
albedo, shading, and specular highlights from a single image of an object.
We build a large-scale object intrinsics database based on existing 3D models
in the ShapeNet database. Rendered with realistic environment maps, millions of
synthetic images of objects and their corresponding albedo, shading, and
specular ground-truth images are used to train an encoder-decoder CNN. Once
trained, the network can decompose an image into the product of albedo and
shading components, along with an additive specular component.
Our CNN delivers accurate and sharp results in this classical inverse problem
of computer vision, with sharp details attributable to skip-layer connections
at corresponding resolutions from the encoder to the decoder. Benchmarked on our
ShapeNet and MIT intrinsics datasets, our model consistently outperforms the
state-of-the-art by a large margin.
We train and test our CNN on different object categories. Perhaps surprisingly,
especially from the CNN classification perspective, our intrinsics CNN
generalizes very well across categories. Our analysis shows that feature
learning at the encoder stage is more crucial for developing a universal
representation across categories.
We apply our synthetic data trained model to images and videos downloaded
from the internet, and observe robust and realistic intrinsics results. Quality
non-Lambertian intrinsics could open up many interesting applications such as
image-based albedo and specular editing.
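The intrinsic image model itself is compact enough to state in code. A minimal numpy sketch of the forward (rendering) direction that the ground-truth images obey; the trained network performs the inverse decomposition:

```python
import numpy as np

def compose(albedo, shading, specular):
    """Non-Lambertian intrinsic image model: I = albedo * shading + specular."""
    return albedo * shading + specular

# The trained decomposer inverts this; here we only check the identity.
rng = np.random.default_rng(0)
A, S, R = rng.random((3, 32, 32))     # toy albedo, shading, specular
I = compose(A, S, R)
assert np.allclose(I - R, A * S)
```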
Lilei Zheng, Ying Zhang, Stefan Duffner, Khalid Idrissi, Christophe Garcia, Atilla Baskurt
Comments: 17 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper presents a deep nonlinear metric learning framework for data
visualization on an image dataset. We propose the Triangular Similarity and
prove its equivalence to the Cosine Similarity in measuring a data pair. Based
on this novel similarity, a geometrically motivated loss function – the
triangular loss – is then developed for optimizing a metric learning system
comprising two identical CNNs. It is shown that this deep nonlinear system can
be efficiently trained by a hybrid algorithm based on the conventional
backpropagation algorithm. More interestingly, benefiting from classical
manifold learning theories, the proposed system offers two different views to
visualize the outputs, the second of which provides better classification
results than the state-of-the-art methods in the visualizable spaces.
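The exact definitions live in the paper; as a rough, assumption-laden illustration of a triangle-inequality-motivated pair loss of the kind described (this particular formulation is our guess, not necessarily the authors' Triangular Similarity):

```python
import numpy as np

def triangular_loss(a, b, similar):
    """Triangle-inequality-motivated pair loss (illustrative formulation).

    For a similar pair we shrink the triangle slack
    ||a|| + ||b|| - ||a + b||, which is zero iff a and b point the same
    way; for a dissimilar pair we use a - b instead, which is zero iff
    they point in opposite directions.
    """
    s = 1.0 if similar else -1.0
    c = a + s * b
    return np.linalg.norm(a) + np.linalg.norm(b) - np.linalg.norm(c)

a = np.array([1.0, 0.2]); b = np.array([0.9, 0.3])
print(triangular_loss(a, b, similar=True))   # small: nearly aligned
print(triangular_loss(a, b, similar=False))  # large: pair should separate
```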
Song Wang, Li Sun, Wei Fan, Jun Sun, Satoshi Naoi, Koichi Shirahata, Takuya Fukagai, Yasumoto Tomita, Atsushi Ike
Comments: Submitted to ICME 2017 and all the methods in this paper are patented
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Nowadays CNNs are widely used in practical image classification
applications. However, designing a CNN model is highly specialized work that is
difficult for ordinary users. Moreover, even for CNN experts, selecting an
optimal model for a specific task may still require a lot of time (to train
many different models). To solve this problem, we propose an automated CNN
recommendation system for the image classification task. Our system is able to
precisely evaluate the complexity of the classification task and the
classification ability of a CNN model. Using these evaluation results, the
system can recommend an optimal CNN model that matches the task well. The
recommendation process is very fast since no model training is needed.
Experimental results show that the evaluation methods are accurate and
reliable.
Keke Tang, Peng Song, Xiaoping Chen
Comments: to be published in the Proceedings of the Asian Conference on Computer Vision (ACCV’2016)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Depth scans acquired from different views may contain nuisances such as
noise, occlusion, and varying point density. We propose a novel Signature of
Geometric Centroids descriptor, supporting direct shape matching on the scans,
without requiring any preprocessing such as scan denoising or converting into a
mesh. First, we construct the descriptor by voxelizing the local shape within a
uniquely defined local reference frame and concatenating geometric centroid and
point density features extracted from each voxel. Second, we compare two
descriptors by employing only corresponding voxels that are both non-empty,
thus supporting matching of incomplete local shapes such as those close to scan
boundaries. Third, we propose a descriptor saliency measure and compute it from a
descriptor-graph to improve shape matching performance. We demonstrate the
descriptor’s robustness and effectiveness for shape matching by comparing it
with three state-of-the-art descriptors, and applying it to object/scene
reconstruction and 3D object recognition.
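A sketch of the construction step as described, voxelizing a local neighborhood and concatenating per-voxel centroid and density features. Note the paper first builds a uniquely defined local reference frame, which this toy omits, and the grid size and radius are illustrative:

```python
import numpy as np

def sgc_descriptor(points, center, radius=1.0, grid=4):
    """Signature-of-Geometric-Centroids-style local descriptor (sketch).

    Voxelize points within `radius` of `center` on a grid^3 lattice;
    each voxel contributes its point count (density) and the centroid
    of its points, concatenated into one feature vector.
    """
    local = points[np.linalg.norm(points - center, axis=1) < radius] - center
    # Map coordinates in [-radius, radius) to voxel indices in [0, grid).
    idx = np.floor((local + radius) / (2 * radius) * grid).astype(int)
    idx = np.clip(idx, 0, grid - 1)
    flat = idx[:, 0] * grid * grid + idx[:, 1] * grid + idx[:, 2]

    density = np.zeros(grid**3)
    centroid = np.zeros((grid**3, 3))
    for v in range(grid**3):
        mask = flat == v
        density[v] = mask.sum()
        if density[v] > 0:
            centroid[v] = local[mask].mean(axis=0)
    # Empty voxels stay zero; matching later uses only voxels that are
    # non-empty in *both* descriptors, tolerating incomplete scans.
    return np.concatenate([density, centroid.ravel()])

pts = np.random.default_rng(0).random((500, 3))
d = sgc_descriptor(pts, center=pts[0])
```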
Shah Rez Khan, Martin Feldman, Bahadir K. Gunturk
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Through pixel-wise optical coding of images during exposure time, it is
possible to extract sub-exposure images from a single capture. Such a
capability can be used for different purposes, including high-speed imaging,
high-dynamic-range imaging and compressed sensing. In this paper, we
demonstrate a sub-exposure image extraction method, where the exposure coding
pattern is inspired by the frequency-division multiplexing idea from
communication systems. The coding masks modulate the sub-exposure images so
that they are placed in non-overlapping regions of the Fourier domain. The
sub-exposure image
extraction process involves digital filtering of the captured signal with
proper band-pass filters. The prototype imaging system incorporates a Liquid
Crystal over Silicon (LCoS) based spatial light modulator synchronized with a
camera for pixel-wise exposure coding.
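A 1-D toy of the frequency-division idea: each sub-exposure is modulated by a distinct carrier during exposure, so the single coded capture can be separated by filtering in the Fourier domain. The carrier frequencies, signals, and filter cutoff below are illustrative:

```python
import numpy as np

n = 512
t = np.arange(n)
# Two smooth (band-limited) signals, a 1-D stand-in for sub-exposure images.
subs = np.stack([1.0 + 0.5 * np.sin(2 * np.pi * 3 * t / n),
                 1.0 + 0.5 * np.cos(2 * np.pi * 5 * t / n)])

# A distinct carrier per sub-exposure places each one in its own
# non-overlapping Fourier band, as the coding masks do optically.
carriers = np.stack([np.cos(2 * np.pi * 60 * t / n),
                     np.cos(2 * np.pi * 180 * t / n)])
captured = (subs * carriers).sum(axis=0)       # the single coded capture

for k in range(2):
    demod = captured * carriers[k] * 2.0       # shift band k to baseband
    spec = np.fft.rfft(demod)
    spec[40:] = 0.0                            # crude low-pass after demod
    recovered = np.fft.irfft(spec, n)
    print(f"sub-exposure {k}: mean abs error "
          f"{np.abs(recovered - subs[k]).mean():.4f}")
```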
Gwangbeen Park, Woobin Im
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)
We present a novel method for image-text multi-modal representation learning.
To our knowledge, this work is the first to apply the adversarial learning
concept to multi-modal learning without exploiting image-text pair information
to learn multi-modal features. We use only category information, in contrast
with most previous methods, which use image-text pair information for
multi-modal embedding. In this paper, we show that multi-modal features can be
learned without image-text pair information, and that our method produces more
similar image and text distributions in the multi-modal feature space than
other methods that do use image-text pair information. We also show that our
multi-modal features carry universal semantic information, even though they
were trained for category prediction. Our model is trained end-to-end with
backpropagation, is intuitive, and is easily extended to other multi-modal
learning work.
Jinho Lee, Brian Kenji Iwana, Shouta Ide, Seiichi Uchida
Comments: 6 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Tracking is one of the most important but still difficult tasks in computer
vision and pattern recognition. The main difficulties in the tracking field are
appearance variation and occlusion. Most traditional tracking methods set
parameters or templates for the target objects in advance and must be modified
when conditions change. Thus, we propose a new and robust tracking method using a
Fully Convolutional Network (FCN) to obtain an object probability map and
Dynamic Programming (DP) to seek the globally optimal path through all frames
of the video. Our proposed method addresses the object appearance variation
problem with the use of an FCN and deals with occlusion via DP. We show that our method
is effective in tracking various single objects through video frames.
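The DP step admits a compact sketch: treat the per-frame probability maps as emission scores and find the motion-constrained path with maximal total log-probability. A minimal numpy version; the window size and toy maps are our assumptions:

```python
import numpy as np

def dp_track(prob_maps, motion_radius=2):
    """Globally optimal object path through per-frame probability maps.

    prob_maps : (T, H, W) object-probability map per frame (e.g. from an
                FCN); the path moves at most `motion_radius` pixels/frame.
    """
    logp = np.log(prob_maps + 1e-9)
    T, H, W = logp.shape
    score = logp[0].copy()
    back = np.zeros((T, H, W), dtype=int)          # flattened predecessor

    for t in range(1, T):
        new = np.full((H, W), -np.inf)
        for r in range(H):
            for c in range(W):
                r0, r1 = max(0, r - motion_radius), min(H, r + motion_radius + 1)
                c0, c1 = max(0, c - motion_radius), min(W, c + motion_radius + 1)
                win = score[r0:r1, c0:c1]
                k = int(np.argmax(win))
                new[r, c] = win.ravel()[k] + logp[t, r, c]
                back[t, r, c] = (r0 + k // (c1 - c0)) * W + (c0 + k % (c1 - c0))
        score = new

    # Backtrack from the best final position; an occluded frame only
    # lowers that frame's term, so the global path can bridge it.
    path = [np.unravel_index(int(np.argmax(score)), (H, W))]
    for t in range(T - 1, 0, -1):
        r, c = path[-1]
        path.append(divmod(back[t, r, c], W))
    return path[::-1]

rng = np.random.default_rng(0)
maps = rng.random((5, 12, 12)) * 0.1
for t in range(5):
    maps[t, 3 + t, 4] = 0.9                        # object drifting downward
print(dp_track(maps))
```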
Joseph Redmon, Ali Farhadi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We introduce YOLO9000, a state-of-the-art, real-time object detection system
that can detect over 9000 object categories. First we propose various
improvements to the YOLO detection method, both novel and drawn from prior
work. The improved model, YOLOv2, is state-of-the-art on standard detection
tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At
40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like
Faster RCNN with ResNet and SSD while still running significantly faster.
Finally we propose a method to jointly train on object detection and
classification. Using this method we train YOLO9000 simultaneously on the COCO
detection dataset and the ImageNet classification dataset. Our joint training
allows YOLO9000 to predict detections for object classes that don’t have
labelled detection data. We validate our approach on the ImageNet detection
task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite
only having detection data for 44 of the 200 classes. On the 156 classes not in
COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes;
it predicts detections for more than 9000 different object categories. And it
still runs in real-time.
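The joint-training idea rests on predicting conditional probabilities over a label hierarchy (the paper's WordTree), composing an absolute class probability as a product of conditionals along the path to the root. A toy sketch; the tree and probabilities below are illustrative, not the actual WordTree:

```python
# Hierarchy-conditioned classification in the WordTree style: the
# network predicts P(node | parent); an absolute class probability is
# the product of conditionals on the path to the root.
parent = {"norfolk_terrier": "terrier", "terrier": "dog",
          "dog": "animal", "animal": None}
cond = {"animal": 1.0, "dog": 0.8, "terrier": 0.6, "norfolk_terrier": 0.3}

def p_absolute(node):
    """P(node) = P(node|parent) * P(parent|grandparent) * ... * P(root)."""
    p = 1.0
    while node is not None:
        p *= cond[node]
        node = parent[node]
    return p

print(p_absolute("norfolk_terrier"))  # 0.3 * 0.6 * 0.8 * 1.0 = 0.144
```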
Yuyin Zhou, Lingxi Xie, Wei Shen, Elliot Fishman, Alan Yuille
Comments: Submitted to IPMI 2017 (13 pages, 4 figures)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Deep neural networks have been widely adopted for automatic organ
segmentation from CT-scanned images. However, the segmentation accuracy on some
small organs (e.g., the pancreas) is sometimes unsatisfactory, arguably
because deep networks are easily distracted by the complex and variable
background region which occupies a large fraction of the input volume.
In this paper, we propose a coarse-to-fine approach to deal with this
problem. We train two deep neural networks using different regions of the input
volume. The first one, the coarse-scaled model, takes the entire volume as its
input. It is used for roughly locating the spatial position of the pancreas.
The second one, the fine-scaled model, only sees a small input region covering
the pancreas, thus eliminating the background noise and providing more accurate
segmentation especially around the boundary areas. At the testing stage, we
first use the coarse-scaled model to roughly locate the pancreas, then adopt
the fine-scaled model to refine the initial segmentation in an iterative manner
to obtain increasingly better segmentation. We evaluate our algorithm on the
NIH pancreas segmentation dataset with 82 volumes, and outperform the
state-of-the-art [18] by more than 4% measured by the Dice-Sorensen Coefficient
(DSC). In addition, we report 62.43% DSC for our worst case, significantly
better than 34.11% reported in [18].
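The testing-stage loop is simple to sketch. A hedged Python outline in which `coarse_model` and `fine_model` are hypothetical stand-ins for the two trained networks:

```python
import numpy as np

def coarse_to_fine_segment(volume, coarse_model, fine_model,
                           n_iters=5, margin=10):
    """Iterative coarse-to-fine inference (sketch; models are stand-ins).

    coarse_model(volume) -> binary mask over the whole input volume
    fine_model(cropped)  -> binary mask over a small cropped region
    """
    mask = coarse_model(volume)                 # rough localization
    for _ in range(n_iters):
        # Tight bounding box (plus a margin) around the current mask;
        # assumes the coarse stage found at least one voxel.
        idx = np.argwhere(mask)
        lo = np.maximum(idx.min(axis=0) - margin, 0)
        hi = np.minimum(idx.max(axis=0) + 1 + margin, volume.shape)
        sl = tuple(slice(a, b) for a, b in zip(lo, hi))
        # Refine inside the crop only, shutting out background clutter.
        refined = np.zeros_like(mask)
        refined[sl] = fine_model(volume[sl])
        if (refined == mask).all():             # converged
            break
        mask = refined
    return mask
```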
Alexander Kolesnikov, Christoph H. Lampert
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We introduce a new technique for probabilistic modeling of natural images
that combines the advantages of classic multi-scale and modern deep learning
models. By explicitly representing natural images at different scales we derive
a model that can capture high level image structure in a computationally
efficient way. We show experimentally that our model achieves new
state-of-the-art image modeling performance on the CIFAR-10 dataset and at the
same time is much faster than competitive models. We also evaluate the proposed
technique on a human faces dataset and demonstrate the potential of our model
to generate nearly photorealistic face samples.
Benjamin Berkels, Benedikt Wirth
Subjects: Computer Vision and Pattern Recognition (cs.CV); Data Analysis, Statistics and Probability (physics.data-an)
Nowadays, modern electron microscopes deliver images at atomic scale. The
precise atomic structure encodes information about material properties. Thus,
an important ingredient in the image analysis is to locate the centers of the
atoms shown in micrographs as precisely as possible. Here, we consider scanning
transmission electron microscopy (STEM), which acquires data in a rastering
pattern, pixel by pixel. Due to this rastering combined with the magnification
to atomic scale, movements of the specimen even at the nanometer scale lead to
random image distortions that make precise atom localization difficult. Given a
series of STEM images, we derive a Bayesian method that jointly estimates the
distortion in each image and reconstructs the underlying atomic grid of the
material by fitting the atom bumps with suitable bump functions. The resulting
highly non-convex minimization problems are solved numerically with a trust
region approach. Well-posedness of the reconstruction method and the model
behavior for faster and faster rastering are investigated using variational
techniques. The performance of the method is finally evaluated on both
synthetic and real experimental data.
Kaihua Zhang, Xuejun Li, Qingshan Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Video object segmentation is challenging due to factors such as fast motion,
cluttered backgrounds, arbitrary object appearance variation and shape
deformation. Most existing methods only explore appearance information between
two consecutive frames, which does not make full use of the long-term nonlocal
information that helps keep the learned appearance stable; hence they tend to
fail when the targets undergo large viewpoint changes and significant non-rigid
deformations. In this paper, we propose a simple yet effective approach to mine
long-term spatio-temporally nonlocal appearance information for unsupervised
video segmentation. The motivation of our algorithm comes from the
spatio-temporal nonlocality of region appearance reoccurrence in a video.
Specifically, we first generate a set of superpixels to represent the
foreground and background, and then update the appearance of each superpixel
with its long-term spatio-temporally nonlocal counterparts, found by an
approximate nearest-neighbor search with an efficient KD-tree algorithm. With
the updated appearances, we formulate a spatio-temporal graphical model
comprising the superpixel label consistency potentials. Finally, we generate
the segmentation by optimizing the graphical model, iteratively updating the
appearance model and estimating the labels. Extensive evaluations on the
SegTrack and Youtube-Objects datasets demonstrate the effectiveness of the
proposed method, which performs favorably against state-of-the-art methods.
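The nonlocal appearance update admits a short sketch with an off-the-shelf KD-tree; the feature dimensionality, neighbor count, and blending weight here are illustrative, not the paper's settings:

```python
import numpy as np
from scipy.spatial import cKDTree

def nonlocal_update(features, k=5, mix=0.5):
    """Update each superpixel's appearance with its spatio-temporally
    nonlocal counterparts found by (approximate) nearest-neighbor search.

    features : (N, D) appearance vectors of superpixels pooled across
               many frames (the long-term, nonlocal pool)
    """
    tree = cKDTree(features)
    _, idx = tree.query(features, k=k + 1)   # k+1: nearest hit is self
    neighbor_mean = features[idx[:, 1:]].mean(axis=1)
    # Blend each appearance with its reoccurring counterparts,
    # stabilizing it against viewpoint change and deformation.
    return (1 - mix) * features + mix * neighbor_mean

rng = np.random.default_rng(0)
stable = nonlocal_update(rng.random((1000, 16)))
```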
Shervin Ardeshir, Sandesh Sharma, Ali Borji
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG)
Human identification remains one of the most challenging tasks in the computer
vision community due to drastic changes in visual features across different
viewpoints, lighting conditions, occlusion, etc. Most of the literature has
focused on human re-identification across viewpoints that are not too
drastically different in nature. Cameras usually capture oblique or side views
of humans, leaving room for a lot of geometric and visual reasoning. Given the
recent popularity of egocentric and top-view vision, re-identification across
these two drastically different views can now be explored. Given an egocentric
and a top-view video, our goal is to identify the cameraman in the content of
the top-view video, and also to re-identify the people visible in the
egocentric video by matching them to the identities present in the top-view
video. We propose a CRF-based method to address the two problems. Our
experimental results demonstrate the effectiveness of the proposed approach on
a variety of videos recorded from the two views.
Aaditya Prakash, Nick Moran, Solomon Garber, Antonella DiLillo, James Storer
Comments: Accepted to Data Compression Conference, 11 pages, 5 figures
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
It has long been considered a significant problem to improve the visual
quality of lossy image and video compression. Recent advances in computing
power, together with the availability of large training data sets, have
increased interest in applying deep learning CNNs to image recognition and
image processing tasks. Here, we present a powerful CNN tailored to the
specific task of semantic image understanding to achieve higher visual quality
in lossy compression. A modest increase in complexity is incorporated into the
encoder, which allows a standard, off-the-shelf JPEG decoder to be used. While
JPEG encoding may be optimized for generic images, the process is ultimately
unaware of the specific content of the image to be compressed. Our technique
makes JPEG content-aware by designing and training a model to identify multiple
semantic regions in a given image. Unlike object detection techniques, our
model does not require labeling of object positions and is able to identify
objects in a single pass. We present a new CNN architecture directed
specifically at image compression: by adding a complete set of features for
every class, and then taking a threshold over the sum of all feature
activations, we generate a map that highlights semantically salient regions so
that they can be encoded at higher quality than background regions. Experiments
are presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark
dataset, in which our algorithm achieves higher visual quality for the same
compressed size.
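The map-building step described above is easy to sketch; the class count, threshold, and quality levels below are illustrative choices:

```python
import numpy as np

def saliency_map(feature_maps, threshold=0.5):
    """Semantic-saliency map from per-class feature activations.

    feature_maps : (C, H, W) one activation map per class (from the CNN)
    Sums activations over all classes and thresholds the normalized sum,
    marking regions to encode at higher JPEG quality.
    """
    total = feature_maps.sum(axis=0)
    total = (total - total.min()) / (np.ptp(total) + 1e-9)
    return total > threshold                    # True = salient region

def quality_map(salient, q_high=90, q_low=40):
    """Map the binary saliency mask to per-region JPEG quality levels."""
    return np.where(salient, q_high, q_low)

rng = np.random.default_rng(0)
fm = rng.random((10, 32, 32))
print(quality_map(saliency_map(fm)).mean())
```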
Antoine Saillenfest, Jean-Louis Dessalles, Olivier Auber
Comments: This study was supported by grants from the programme Futur&Ruptures and from the ‘Chaire Modelisation des Imaginaires, Innovation et Creation’, this http URL
Journal-ref: Proceedings of the Seventh International Conference on
Computational Creativity (ICCC-2016). Paris, France
Subjects: Artificial Intelligence (cs.AI)
We propose to apply Simplicity Theory (ST) to model interest in creative
situations. ST has been designed to describe and predict interest in
communication. Here we use ST to derive a decision rule that we apply to a
simplified version of a creative game, the Poietic Generator. The decision rule
produces what can be regarded as an elementary form of creativity. This study
is meant as a proof of principle. It suggests that some creative actions may be
motivated by the search for unexpected simplicity.
Vishal Kakkar, Shirish K. Shevade, S Sundararajan, Dinesh Garg
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
AUC (Area under the ROC curve) is an important performance measure for
applications where the data is highly imbalanced. Learning to maximize AUC
performance is thus an important research problem. Using a max-margin based
surrogate loss function, the AUC optimization problem can be approximated as a
pairwise rankSVM learning problem. Batch learning methods for solving the
kernelized version of this problem suffer from scalability and may not result
in sparse classifiers. Recent years have witnessed an increased interest in the
development of online or single-pass online learning algorithms that design a
classifier by maximizing the AUC performance. The AUC performance of nonlinear
classifiers, designed using online methods, is not comparable with that of
nonlinear classifiers designed using batch learning algorithms on many
real-world datasets. Motivated by these observations, we design a scalable
algorithm for maximizing AUC performance by greedily adding the required number
of basis functions into the classifier model. The resulting sparse classifiers
perform faster inference. Our experimental results show that the number of
basis functions required can be an order of magnitude smaller than in the
kernel RankSVM model, without much effect on the AUC performance.
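The rankSVM view of AUC is worth spelling out: AUC counts correctly ordered positive-negative pairs, and the max-margin surrogate replaces the 0/1 pair loss with a hinge. A minimal numpy sketch; the margin and toy data are illustrative:

```python
import numpy as np

def pairwise_auc_hinge(scores_pos, scores_neg, margin=1.0):
    """Max-margin surrogate for AUC: mean hinge loss over all
    positive-negative pairs (the rankSVM view of AUC optimization)."""
    diff = scores_pos[:, None] - scores_neg[None, :]   # all pos/neg pairs
    return np.maximum(0.0, margin - diff).mean()

def auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of correctly ordered pos/neg pairs."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return ((diff > 0) + 0.5 * (diff == 0)).mean()

rng = np.random.default_rng(0)
pos = rng.normal(1.0, 1.0, 200)        # imbalanced: few positives
neg = rng.normal(0.0, 1.0, 2000)
print(auc(pos, neg), pairwise_auc_hinge(pos, neg))
```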
Samuel L Smith
Comments: 4 pages
Subjects: Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS); Human-Computer Interaction (cs.HC)
Algorithms which sort lists of real numbers into ascending order have been
studied for decades. They are typically based on a series of pairwise
comparisons and run entirely on chip. However, people routinely sort lists that
depend on subjective or complex judgements that cannot be automated. Examples
include marketing research, where surveys are used to learn about customer
preferences for products; the recruiting process, where interviewers attempt to
rank potential employees; and sporting tournaments, where we infer team
rankings from a series of one-on-one matches. We develop a novel sorting
algorithm, where each pairwise comparison reflects a subjective human judgement
about which element is bigger or better. We introduce a finite and large error
rate to each judgement, and we take the cost of each comparison to
significantly exceed the cost of other computational steps. The algorithm must
request the most informative sequence of comparisons from the user in order to
identify the correct sorted list with minimum human input. Our Discrete
Adiabatic Monte Carlo approach exploits the gradual acquisition of information
by tracking a set of plausible hypotheses which are updated after each
additional comparison.
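Not the paper's Discrete Adiabatic Monte Carlo, but an exhaustive toy of the same idea for tiny lists: maintain a posterior over all orderings, query the pairwise comparison that splits the posterior mass closest to 50/50, and update it after each noisy answer:

```python
import itertools
import numpy as np

def noisy_sort_step(hypotheses, weights, error_rate, rng, true_order):
    """One round: ask the most informative comparison of a noisy judge,
    then Bayes-update the posterior over candidate orderings."""
    n = len(hypotheses[0])
    # Most informative query: the pair whose outcome splits the current
    # posterior mass closest to 50/50.
    best_pair, best_gap = None, 1.0
    for i, j in itertools.combinations(range(n), 2):
        p = sum(w for h, w in zip(hypotheses, weights)
                if h.index(i) < h.index(j))
        if abs(p - 0.5) < best_gap:
            best_pair, best_gap = (i, j), abs(p - 0.5)

    i, j = best_pair
    truth = true_order.index(i) < true_order.index(j)
    answer = truth if rng.random() > error_rate else not truth  # noisy judge

    # Hypotheses consistent with the answer gain weight.
    like = np.array([1 - error_rate if (h.index(i) < h.index(j)) == answer
                     else error_rate for h in hypotheses])
    weights = weights * like
    return weights / weights.sum()

rng = np.random.default_rng(0)
hyps = list(itertools.permutations(range(4)))
w = np.full(len(hyps), 1 / len(hyps))
true = (2, 0, 3, 1)
for _ in range(12):
    w = noisy_sort_step(hyps, w, error_rate=0.1, rng=rng, true_order=true)
print(hyps[int(np.argmax(w))])          # most plausible ordering
```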
Anuj Karpatne, Gowtham Atluri, James Faghmous, Michael Steinbach, Arindam Banerjee, Auroop Ganguly, Shashi Shekhar, Nagiza Samatova, Vipin Kumar
Comments: submitted to IEEE TKDE (in review)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Data science models, although successful in a number of commercial domains,
have had limited applicability in scientific problems involving complex
physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm
that aims to leverage the wealth of scientific knowledge for improving the
effectiveness of data science models in enabling scientific discovery. The
overarching vision of TGDS is to introduce scientific consistency as an
essential component for learning generalizable models. Further, by producing
scientifically interpretable models, TGDS aims to advance our scientific
understanding by discovering novel domain insights. Indeed, the paradigm of
TGDS has started to gain prominence in a number of scientific disciplines such
as turbulence modeling, material discovery, quantum chemistry, bio-medical
science, bio-marker discovery, climate science, and hydrology. In this paper,
we formally conceptualize the paradigm of TGDS and present a taxonomy of
research themes in TGDS. We describe several approaches for integrating domain
knowledge in different research themes using illustrative examples from
different disciplines. We also highlight some of the promising avenues of novel
research for realizing the full potential of theory-guided data science.
Giannis Karamanolakis, Elias Iosif, Athanasia Zlatintsi, Aggelos Pikrakis, Alexandros Potamianos
Subjects: Information Retrieval (cs.IR)
The recent development of Audio-based Distributional Semantic Models (ADSMs)
enables the computation of audio and lexical vector representations in a joint
acoustic-semantic space. In this work, these joint representations are applied
to the problem of automatic tag generation. The predicted tags together with
their corresponding acoustic representation are exploited for the construction
of acoustic-semantic clip embeddings. The proposed algorithms are evaluated on
the task of similarity measurement between music clips. Acoustic-semantic
models are shown to outperform the state-of-the-art for this task and produce
high quality tags for audio/music clips.
Kamal Sarkar, Debanjan Das, Indra Banerjee, Mamta Kumari, Prasenjit Biswas
Comments: 8th meeting of Forum for Information Retrieval Evaluation 2016, 2016
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
In this paper, we describe the methodology used and the results obtained by
us for completing the tasks given under the shared task on Consumer Health
Information Search (CHIS) collocated with the Forum for Information Retrieval
Evaluation (FIRE) 2016, ISI Kolkata. The shared task consists of two sub-tasks:
(1) task 1, where, given a query and a document/set of documents associated
with that query, the task is to classify the sentences in the document as
relevant to the query or not, and (2) task 2, where the relevant sentences need
to be further classified as supporting the claim made in the query or opposing
it. We participated in both sub-tasks. The percentage accuracy obtained by our
system for task 1 was 73.39, the third highest among the 9 teams that
participated in the shared task.
Anubhav Gupta, M. Narasimha Murty
Comments: KDD Cup Workshop, 2016
Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR)
Ranking in bibliographic information networks is a widely studied problem due
to its many applications, such as the advertising industry, funding, and search
engines. Most existing work on ranking in bibliographic information networks is
based on ranking research papers and their authors. But the bibliographic
information network can be used to solve other important problems as well. The
KDD Cup 2016 competition considers one such problem: measuring the impact of
research institutions, i.e., ranking research institutions. The competition
took place in three phases. In this paper, we discuss our solutions for ranking
institutions in each phase. We participated under the team name “anu@TASL” and
our solutions achieved an average NDCG@20 score of 0.7483, placing eleventh in
the contest.
Amir Hossein Akhavan Rahnama
Journal-ref: IEEE 2014 International Conference on Control, Decision and
Information Technologies (CoDIT)
Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC); Information Retrieval (cs.IR)
The big data trend has forced data-centric systems to handle continuous, fast
data streams. In recent years, real-time analytics on stream data has formed
into a new research field, which aims to answer queries about
what-is-happening-now with negligible delay. The real challenge with real-time
stream data processing is that it is impossible to store all instances of the
data, and therefore online analytical algorithms are utilized. To perform
real-time analytics, pre-processing of the data should be performed so that
only a short summary of the stream is stored in main memory. In addition, due
to the high speed of arrival, the average processing time for each instance of
data should be such that incoming instances are not lost without being
captured. Lastly, the learner needs to provide high analytical accuracy.
Sentinel is a distributed system written in Java that aims to solve this
challenge by enforcing that both the processing and the learning are done in
distributed form. Sentinel is built on top of Apache Storm, a distributed
computing platform. Sentinel's learner, the Vertical Hoeffding Tree, is a
parallel decision-tree learning algorithm based on the VFDT, with the ability
to enable parallel classification in distributed environments. Sentinel also
uses SpaceSaving to keep a summary of the data stream and stores this summary
in a synopsis data structure. An application of Sentinel to the Twitter Public
Stream API is shown and the results are discussed.
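SpaceSaving itself is a short, well-known algorithm; a minimal Python version of the kind of stream synopsis described (the capacity and toy stream are illustrative, and real implementations use a faster min-tracking structure):

```python
class SpaceSaving:
    """SpaceSaving heavy-hitters summary: approximate counts of the most
    frequent stream items in bounded memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def offer(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
        else:
            # Evict the current minimum; the newcomer inherits its
            # count + 1, which upper-bounds its true frequency.
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1

    def top(self, k):
        return sorted(self.counts.items(), key=lambda kv: -kv[1])[:k]

ss = SpaceSaving(capacity=3)
for token in ["a", "b", "a", "c", "a", "b", "d"] * 100:
    ss.offer(token)
print(ss.top(2))   # "a" and "b" dominate; counts are approximate
```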
Lang-Chi Yu, Hung-yi Lee, Lin-shan Lee
Subjects: Computation and Language (cs.CL)
Headline generation for spoken content is important since spoken content is
difficult to display on a screen and browse. It is a special type of
abstractive summarization, in which the summary is generated word by word from
scratch without using any part of the original content. Many deep learning
approaches to headline generation from text documents have been proposed
recently, all requiring huge quantities of training data, which are difficult
to obtain for spoken document summarization. In this paper, we propose an ASR
error modeling approach to learn the underlying structure of ASR error patterns
and incorporate this model in an Attentive Recurrent Neural Network (ARNN)
architecture. In this way, the model for abstractive headline generation for
spoken content can be learned from abundant text data and the ASR data for some
recognizers. Experiments showed very encouraging results and verified that the
proposed ASR error model works well even when the input spoken content is
recognized by a recognizer very different from the one the model learned from.
Karthik Bangalore Mani
Comments: 4 pages, 10 figures
Subjects: Computation and Language (cs.CL)
We develop models and extract relevant features for automatic text
summarization, and investigate the performance of different models on the DUC
2001 dataset. Two models were developed: a ridge regressor and a multi-layer
perceptron. Their hyperparameters were varied and performance was recorded. We
split the summarization task into two main steps: sentence ranking followed by
sentence selection. In the first step, given a document, we sort the sentences
by their importance; in the second step, to obtain non-redundant sentences, we
weed out sentences that have high similarity with previously selected
sentences.
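The two-step pipeline reduces to a few lines; a sketch with a hypothetical similarity matrix standing in for whatever sentence-similarity measure is used:

```python
import numpy as np

def select_summary(sentences, importance, sim, k=3, max_sim=0.7):
    """Two-step summarization: rank sentences by predicted importance,
    then greedily select, weeding out near-duplicates of earlier picks.

    importance : (N,) scores from the ranking model (regressor/MLP)
    sim        : (N, N) pairwise sentence-similarity matrix
    """
    order = np.argsort(-importance)            # step 1: sentence ranking
    chosen = []
    for i in order:                            # step 2: sentence selection
        if all(sim[i, j] < max_sim for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return [sentences[i] for i in chosen]
```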
Jiwei Li, Will Monroe, Dan Jurafsky
Subjects: Computation and Language (cs.CL)
While neural networks have been successfully applied to many natural language
processing tasks, they come at the cost of interpretability. In this paper, we
propose a general methodology to analyze and interpret decisions from a neural
model by observing the effects on the model of erasing various parts of the
representation, such as input word-vector dimensions, intermediate hidden
units, or input words. We present several approaches to analyzing the effects
of such erasure, from computing the relative difference in evaluation metrics,
to using reinforcement learning to erase the minimum set of input words in
order to flip a neural model’s decision. In a comprehensive analysis of
multiple NLP tasks, including linguistic feature classification, sentence-level
sentiment analysis, and document-level sentiment aspect prediction, we show
that the proposed methodology not only offers clear explanations about neural
model decisions, but also provides a way to conduct error analysis on neural
models.
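The simplest of the erasure analyses, word-level importance by score difference, fits in a few lines. The scoring function below is a toy stand-in for a real neural model:

```python
import numpy as np

def word_importance(words, score_fn):
    """Importance of each input word via erasure: the relative drop in
    the model's score when that word is removed.

    score_fn : any black-box model score, e.g. the log-probability the
               model assigns to its own decision (hypothetical here).
    """
    base = score_fn(words)
    drops = []
    for i in range(len(words)):
        erased = words[:i] + words[i + 1:]
        drops.append((base - score_fn(erased)) / abs(base))
    return np.array(drops)

# Toy stand-in model: scores sentences by a fixed word lexicon.
lexicon = {"great": 2.0, "terrible": -2.0, "movie": 0.1, "the": 0.0}
score = lambda ws: sum(lexicon.get(w, 0.0) for w in ws) + 1e-6
print(word_importance(["the", "movie", "was", "great"], score))
```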
Konstantinos Pappas, Rada Mihalcea
Comments: 8 pages, 3 figures, 12 tables
Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Automatic profiling of social media users is an important task for supporting
a multitude of downstream applications. While a number of studies have used
social media content to extract and study collective social attributes, there
is a lack of substantial research that addresses the detection of a user’s
industry. We frame this task as classification using both feature engineering
and ensemble learning. Our industry-detection system uses both posted content
and profile information to detect a user’s industry with 64.3% accuracy,
significantly outperforming the majority baseline in a taxonomy of fourteen
industry classes. Our qualitative analysis suggests that a person’s industry
not only affects the words used and their perceived meanings, but also the
number and type of emotions being expressed.
Kamal Sarkar
Comments: This work is awarded SPRINGER BEST PAPER AWARD, DPIL track @ 8th meeting of Forum for Information Retrieval Evaluation 2016, 2016
Subjects: Computation and Language (cs.CL)
In this work, we describe a system that detects paraphrases in Indian
Languages as part of our participation in the shared Task on detecting
paraphrases in Indian Languages (DPIL) organized by Forum for Information
Retrieval Evaluation (FIRE) in 2016. Our paraphrase detection method uses a
multinomial logistic regression model trained with a variety of features which
are basically lexical and semantic level similarities between two sentences in
a pair. The performance of the system has been evaluated against the test set
released for the FIRE 2016 shared task on DPIL. Our system achieves the highest
f-measure, 0.95, on task 1 in the Punjabi language. The performance of our
system on task 1 in the Hindi language is an f-measure of 0.90. Of the 11 teams
that participated in the shared task, only four participated in all four
languages (Hindi, Punjabi, Malayalam and Tamil); the remaining 7 teams
participated in one of the four languages. We participated in both task 1 and
task 2 for all four Indian languages. The overall average performance of our
system across task 1 and task 2 over all four languages is an F1-score of 0.81,
the second highest among the four systems that participated in all four
languages.
Antonin Bergeaud, Yoann Potiron, Juste Raimbault
Comments: 28 pages; 9 figures; 5 Supplementary Materials
Subjects: Physics and Society (physics.soc-ph); Computation and Language (cs.CL)
In this paper, we extend some usual classification techniques arising from a
large-scale data-mining and network approach. This new technology, designed in
particular to be suitable for big data, is used to construct an open
consolidated database from raw data on 4 million patents taken from the US
patent office from 1976 onward. To build the patent network, we look not only
at each patent title, but also examine the full abstract and extract the
relevant keywords accordingly. We refer to this classification as the semantic
approach, in contrast with the more common technological approach, which takes
the topology induced by the US Patent Office technological classes. Moreover,
we document that the two approaches have markedly different topological
measures, with strong statistical evidence that they feature different models.
This suggests that our method is a useful tool for extracting endogenous
information.
Huamin Li, Yuval Kluger, Mark Tygert
Comments: 17 pages, 21 tables, 6 algorithms in pseudocode
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (cs.NA); Numerical Analysis (math.NA); Computation (stat.CO)
As illustrated via numerical experiments with an implementation in Spark (the
popular platform for distributed computation), randomized algorithms solve two
ubiquitous problems: (1) calculating a full principal component analysis or
singular value decomposition of a highly rectangular matrix, and (2)
calculating a low-rank approximation in the form of a singular value
decomposition to an arbitrary matrix. Several optimizations to recently
introduced methods yield results that are uniformly superior to those of the
stock implementations.
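For reference, the core of the randomized approach (in the style of Halko, Martinsson and Tropp) fits in a few lines of numpy; the oversampling and power-iteration counts are illustrative, and the paper's contribution lies in the distributed Spark implementation and its optimizations, not in this single-machine sketch:

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, power_iters=2):
    """Randomized low-rank SVD, the kind of algorithm benchmarked in the
    paper's Spark implementation."""
    rng = np.random.default_rng(0)
    m, n = A.shape
    # Sample the range of A with a Gaussian test matrix.
    Q = A @ rng.standard_normal((n, rank + oversample))
    # Power iterations sharpen the spectrum for slowly decaying tails.
    for _ in range(power_iters):
        Q, _ = np.linalg.qr(A @ (A.T @ Q))
    Q, _ = np.linalg.qr(Q)
    # Project to the small subspace and take an exact SVD there.
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 10)) @ rng.standard_normal((10, 100))
U, s, Vt = randomized_svd(A, rank=10)       # tall-skinny, truly rank 10
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))  # ~1e-14
```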
Ahmed Mateen, Lareab Chaudhary
Comments: 5 pages
Journal-ref: International Journal of Computer Applications Foundation of
Computer Science (FCS), NY, USA Volume 152 – Number 8 Year of Publication:
2016
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
This research paper concerns the handling of data in warehousing so as to
reduce data wastage and provide better results. Data warehousing and on-line
analytical processing (OLAP) are vital elements of decision support, which has
increasingly become a focal point of the database industry. Many commercial
products and services are now available, and all of the principal database
management system vendors now have offerings in this area. Decision support
places some quite different requirements on database technology compared to
traditional on-line transaction processing applications. This article gives an
overview of data warehousing and OLAP technologies, with an emphasis on their
new requirements: tools for extracting, cleaning, and loading information into
the back end of a data warehouse; the multidimensional data model typical of
OLAP; front-end client tools for querying and data analysis; server extensions
for efficient query processing; and tools for metadata management and warehouse
administration. In addition to surveying the state of the art, this article
also identifies a number of promising research issues, some of which are
related to data wastage problems. We use some new techniques to reduce data
wastage and obtain better results: we take sample values, place them in an
ANOVA table, and present the results through graphs showing performance.
Ahmed Mateen, Lareab Chaudhary
Comments: 5 pages
Journal-ref: International Journal of Computer Applications Foundation of
Computer Science (FCS), NY, USA Volume 151 – Number 7 Year of Publication:
2016
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Data warehousing and on-line analytical processing (OLAP) are essential aspects
of decision support, which has increasingly become a focal point of the data
source business. This paper offers an outline of data warehousing and OLAP
systems, with a highlight on their latest requirements. We describe back-end
tools for extracting, cleaning, and loading information into a data warehouse;
the multidimensional data model typical of OLAP; front-end user tools for
querying and data evaluation; server extensions for efficient query processing;
and tools for metadata management and warehouse administration. Insights based
on complete data about customer actions, product performance, and market
behavior are powering advances and competition in the internet space. In this
research, we examine the business motivation and the efficiency of servers
operating a data warehouse at petabyte scale, using some new techniques to
obtain better and more efficient results; our test shows the data dropping rate
in the data warehouse. The engine has been in production at Yahoo! since 2007
and presently manages more than half a dozen petabytes of data.
Asim Kadav, Erik Kruus
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)
Emerging workloads, such as graph processing and machine learning, are
approximate because of the scale of data involved and the stochastic nature of
the underlying algorithms. These algorithms are often distributed over multiple
machines using bulk-synchronous processing (BSP) or other synchronous
processing paradigms such as map-reduce. However, data parallel processing
primitives such as repeated barrier and reduce operations introduce high
synchronization overheads. Hence, many existing data-processing platforms use
asynchrony and staleness to improve data-parallel job performance. Often, these
systems simply change the synchronous communication to asynchronous between the
worker nodes in the cluster. This improves the throughput of data processing
but results in poor accuracy of the final output since different workers may
progress at different speeds and process inconsistent intermediate outputs.
In this paper, we present ASAP, a model that provides asynchronous and
approximate processing semantics for data-parallel computation. ASAP provides
fine-grained worker synchronization using NOTIFY-ACK semantics that allows
independent workers to run asynchronously. ASAP also provides stochastic
reduce, which offers approximate but guaranteed convergence to the same result
as an aggregated all-reduce. In our results, we show that ASAP reduces
synchronization costs and provides 2-10X speedups in convergence and up to 10X
savings in network costs for distributed machine learning applications, while
providing strong convergence guarantees.
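A toy simulation of the stochastic-reduce idea, assuming each worker averages with a few random peers per round instead of joining a global all-reduce; convergence is to a consensus value near (approximately, not exactly) the global mean, consistent with the approximate semantics described. This is our own illustration, not ASAP's actual protocol:

```python
import numpy as np

def stochastic_reduce(values, rounds=30, peers=2, seed=0):
    """Each worker repeatedly averages its parameter with a few random
    peers; all workers drift toward a common value near the global mean,
    without any global barrier or all-reduce."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float).copy()
    n = len(x)
    for _ in range(rounds):
        nxt = x.copy()
        for w in range(n):
            others = rng.choice(n, size=peers, replace=False)
            nxt[w] = (x[w] + x[others].sum()) / (peers + 1)
        x = nxt
    return x

vals = np.arange(8.0)                       # one parameter per worker
print(vals.mean(), stochastic_reduce(vals)) # all entries near 3.5
```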
Ji Liu, Shaoshuai Mou, A. Stephen Morse, Brian D. O. Anderson, Changbin Yu
Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC)
The distributed averaging problem is that of computing the average value of a
set of numbers possessed by the agents in a distributed network using only
communication between neighboring agents. Gossiping is a
well-known approach to the problem which seeks to iteratively arrive at a
solution by allowing each agent to interchange information with at most one
neighbor at each iterative step. Crafting a gossiping protocol which
accomplishes this is challenging because gossiping is an inherently
collaborative process which can lead to deadlocks unless careful precautions
are taken to ensure that it does not. Many gossiping protocols are
request-based which means simply that a gossip between two agents will occur
whenever one of the two agents accepts a request to gossip placed by the other.
In this paper, we present three deterministic request-based protocols. We show
by example that the first can deadlock. The second is guaranteed to avoid
deadlocks and requires fewer transmissions per iteration than standard
broadcast-based distributed averaging protocols by exploiting the idea of local
ordering together with the notion of an agent’s neighbor queue; the protocol
requires the simplest queue updates, which provides an in-depth understanding
of how local ordering and queue updates avoid deadlocks. It is shown that a
third protocol which uses a slightly more complicated queue update rule can
lead to significantly faster convergence; a worst case bound on convergence
rate is provided.
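A toy of request-based gossip averaging; the random pairing below sidesteps the deadlock question that the paper's local-ordering and neighbor-queue machinery actually addresses, so treat it only as an illustration of the averaging dynamics:

```python
import numpy as np

def request_based_gossip(values, neighbors, rounds=50, seed=0):
    """Each round, every free agent requests a gossip with one neighbor;
    an accepted request triggers the pairwise average
    x_i, x_j <- (x_i + x_j) / 2, which preserves the network mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float).copy()
    for _ in range(rounds):
        busy = set()
        for i in rng.permutation(len(x)):
            if i in busy:
                continue
            j = rng.choice(neighbors[i])        # place a request
            if j != i and j not in busy:        # request accepted
                x[i] = x[j] = (x[i] + x[j]) / 2.0
                busy |= {int(i), int(j)}
    return x

ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(request_based_gossip(np.arange(6.0), ring))  # all near 2.5
```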
Ana Lava, Mahdi Jelodari Mamaghani, Siamak Mohammadi, Steve Furber
Comments: 7 pages, 6 figures, submitted to IEEE Design and Test Journal – special issue on Accelerators in October 2016
Subjects: Hardware Architecture (cs.AR); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
Flexibility at the hardware level is the main driving force behind adaptive
systems, whose aim is to realise microarchitecture reconfiguration 'online'.
This feature allows the software/hardware stack to tolerate drastic changes in
workload in data centres. With the emergence of FPGA reconfigurability, this
technology is becoming a mainstream computing paradigm. Adaptivity is usually
accompanied by high-level tools that facilitate multi-dimensional design space
exploration. An essential aspect of this space is memory orchestration, where
on-chip and off-chip memory distribution significantly influences how the
architecture copes with critical spatial and timing constraints, e.g. Place and
Route. This paper proposes a memory-smart technique for a particular class of
adaptive systems: Elastic Circuits, which enjoy slack elasticity at a fine
level of granularity. We explore retiming of a set of popular benchmarks by
investigating the memory distribution within and among accelerators. The area,
performance and power patterns are adopted by our high-level synthesis
framework, with respect to the behaviour of the input descriptions, to improve
the quality of the synthesised elastic circuits.
Aaditya Prakash, Nick Moran, Solomon Garber, Antonella DiLillo, James Storer
Comments: Accepted to Data Compression Conference, 11 pages, 5 figures
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
It has long been considered a significant problem to improve the visual
quality of lossy image and video compression. Recent advances in computing
power, together with the availability of large training data sets, have
increased interest in applying deep learning CNNs to image recognition and
image processing tasks. Here, we present a powerful CNN tailored to the
specific task of semantic image understanding to achieve higher visual quality
in lossy compression. A modest increase in complexity is incorporated into the
encoder, which allows a standard, off-the-shelf JPEG decoder to be used. While
JPEG encoding may be optimized for generic images, the process is ultimately
unaware of the specific content of the image to be compressed. Our technique
makes JPEG content-aware by designing and training a model to identify multiple
semantic regions in a given image. Unlike object detection techniques, our
model does not require labeling of object positions and is able to identify
objects in a single pass. We present a new CNN architecture directed
specifically to image compression: by adding a complete set of features for
every class, and then taking a threshold over the sum of all feature
activations, we generate a map that highlights semantically-salient regions so
that they can be encoded at higher quality than background regions.
Experiments are presented on the Kodak PhotoCD dataset and the MIT Saliency
Benchmark dataset, in which our algorithm achieves higher visual quality for
the same compressed size.
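To make the map-to-quality step concrete, here is a minimal Python sketch of the thresholding described above; the `activations` array (per-class feature maps from the trained CNN) and the per-block quality assignment are our illustrative stand-ins for the encoder's quantization-table scaling, not the authors' code.

import numpy as np

def saliency_mask(activations, thresh=0.5):
    """Sum the per-class feature maps and threshold the total activation."""
    total = activations.sum(axis=0)
    total = (total - total.min()) / (total.ptp() + 1e-9)   # scale to [0, 1]
    return total > thresh

def per_block_quality(mask, base_q=50, salient_q=90, block=8):
    """Map the binary saliency mask to a JPEG quality factor per 8x8 block."""
    H, W = mask.shape
    qmap = np.full((H // block, W // block), base_q, dtype=int)
    for i in range(H // block):
        for j in range(W // block):
            if mask[i*block:(i+1)*block, j*block:(j+1)*block].any():
                qmap[i, j] = salient_q    # spend more bits on salient blocks
    return qmap

Because only the quantization varies per block, any standard JPEG decoder can reconstruct the result, matching the off-the-shelf-decoder property claimed above.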
Li-Yeh Chuang, Chao-Hsuan Ke, Cheng-Hong Yang
Comments: 5 pages, 2 figures, 4 tables
Journal-ref: IMECS2008_pp146-150
Subjects: Learning (cs.LG)
Gene expression data is widely used in disease analysis and cancer diagnosis.
However, since gene expression data could contain thousands of genes
simultaneously, successful microarray classification is rather difficult.
Feature selection is an important preprocessing step for any classification
task. Selecting a useful gene subset as input to a classifier not only
decreases the computational time and cost, but also increases classification
accuracy. In this study, we applied the information gain method as a filter
approach, and an improved binary particle swarm optimization as a wrapper
approach, to implement feature selection; the selected gene subsets were used
to evaluate classification performance. Experimental results show that the
proposed method selects smaller gene subsets while obtaining better
classification accuracy.
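For orientation, here is a sketch of the filter-plus-wrapper pipeline in Python; this is the canonical binary PSO with a sigmoid velocity transform, not the authors' improved variant, and the classifier, swarm sizes and constants are illustrative.

import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def bpso_select(X, y, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    d = X.shape[1]
    pos = rng.integers(0, 2, size=(n_particles, d))   # one bit mask per particle
    vel = rng.normal(0, 1, size=(n_particles, d))

    def fitness(mask):
        if not mask.any():
            return 0.0
        clf = KNeighborsClassifier(n_neighbors=1)     # wrapper classifier
        return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        # sigmoid of velocity gives the probability that a bit is set
        pos = (rng.random((n_particles, d)) < 1 / (1 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest.astype(bool)

# Filter stage first: rank genes by information gain and keep the top ones,
# e.g. scores = mutual_info_classif(X, y); top = np.argsort(scores)[-200:],
# then run bpso_select on X[:, top] as the wrapper stage.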
Anuj Karpatne, Gowtham Atluri, James Faghmous, Michael Steinbach, Arindam Banerjee, Auroop Ganguly, Shashi Shekhar, Nagiza Samatova, Vipin Kumar
Comments: submitted to IEEE TKDE (in review)
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Data science models, although successful in a number of commercial domains,
have had limited applicability in scientific problems involving complex
physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm
that aims to leverage the wealth of scientific knowledge for improving the
effectiveness of data science models in enabling scientific discovery. The
overarching vision of TGDS is to introduce scientific consistency as an
essential component for learning generalizable models. Further, by producing
scientifically interpretable models, TGDS aims to advance our scientific
understanding by discovering novel domain insights. Indeed, the paradigm of
TGDS has started to gain prominence in a number of scientific disciplines such
as turbulence modeling, material discovery, quantum chemistry, bio-medical
science, bio-marker discovery, climate science, and hydrology. In this paper,
we formally conceptualize the paradigm of TGDS and present a taxonomy of
research themes in TGDS. We describe several approaches for integrating domain
knowledge in different research themes using illustrative examples from
different disciplines. We also highlight some of the promising avenues of novel
research for realizing the full potential of theory-guided data science.
Taco S. Cohen, Max Welling
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
It has long been recognized that the invariance and equivariance properties
of a representation are critically important for success in many vision tasks.
In this paper we present Steerable Convolutional Neural Networks, an efficient
and flexible class of equivariant convolutional networks. We show that
steerable CNNs achieve state-of-the-art results on the CIFAR image
classification benchmark. The mathematical theory of steerable representations
reveals a type system in which any steerable representation is a composition of
elementary feature types, each one associated with a particular kind of
symmetry. We show how the parameter cost of a steerable filter bank depends on
the types of the input and output features, and show how to use this knowledge
to construct CNNs that utilize parameters effectively.
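For context, the equivariance property underlying steerable representations can be stated compactly; the following LaTeX rendering uses our own notation (a group $G$ acting on inputs and features), not the paper's exact formalism.

\begin{equation}
  \Phi(\pi(g)\, x) = \pi'(g)\, \Phi(x), \qquad \forall g \in G,
\end{equation}

where $\pi$ and $\pi'$ are representations of $G$ on the input and feature spaces, respectively. A representation is steerable when $\pi'$ decomposes into the elementary feature types mentioned above, which is what makes the parameter cost depend on the input and output types.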
Mayra Z. Rodriguez, Cesar H. Comin, Dalcimar Casanova, Odemir M. Bruno, Diego R. Amancio, Francisco A. Rodrigues, Luciano da F. Costa
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Many real-world systems can be studied in terms of pattern recognition tasks,
so that proper use (and understanding) of machine learning methods in practical
applications becomes essential. While a myriad of classification methods have
been proposed, there is no consensus on which methods are more suitable for a
given dataset. As a consequence, it is important to comprehensively compare
methods in many possible scenarios. In this context, we performed a systematic
comparison of 7 well-known clustering methods available in the R language. In
order to account for the many possible variations of data, we considered
artificial datasets with several tunable properties (number of classes,
separation between classes, etc). In addition, we evaluated the sensitivity of
the clustering methods with regard to their parameter configurations. The
results revealed that, when considering the default configurations of the
adopted methods, the spectral approach usually outperformed the other
clustering algorithms. We also found that the default configurations of the
adopted implementations did not always yield accurate results. In these cases,
a simple approach based on random selection of parameter values proved to be a
good alternative for improving performance. All in all, the reported
comparison provides guidance for choosing among clustering algorithms.
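The random-parameter-selection remedy is simple to reproduce; the study works in R, but the following Python/scikit-learn sketch (our illustration, with an arbitrary search range for the kernel width) conveys the idea on artificial data with known labels.

import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import adjusted_rand_score
from sklearn.datasets import make_blobs

X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=1.5, random_state=0)
rng = np.random.default_rng(0)

best_score, best_gamma = -1.0, None
for _ in range(20):
    gamma = 10 ** rng.uniform(-3, 1)          # randomly drawn kernel width
    labels = SpectralClustering(n_clusters=4, gamma=gamma,
                                random_state=0).fit_predict(X)
    score = adjusted_rand_score(y_true, labels)
    if score > best_score:
        best_score, best_gamma = score, gamma
print("best ARI %.3f at gamma = %.4g" % (best_score, best_gamma))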
Andreas Henelius, Kai Puolamäki, Henrik Boström, Panagiotis Papapetrou
Comments: 29 pages, 5 figures, 5 tables
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
Clustering is a widely used unsupervised learning method for finding
structure in the data. However, the resulting clusters are typically presented
without any guarantees on their robustness; slightly changing the used data
sample or re-running a clustering algorithm involving some stochastic component
may lead to completely different clusters. There is, hence, a need for
techniques that can quantify the instability of the generated clusters. In this
study, we propose a technique for quantifying the instability of a clustering
solution and for finding robust clusters, termed core clusters, which
correspond to clusters where the co-occurrence probability of each data item
within a cluster is at least $1 - \alpha$. We demonstrate how solving the core
clustering problem is linked to finding the largest maximal cliques in a graph.
We show that the method can be used with both clustering and classification
algorithms. The proposed method is tested on both simulated and real datasets.
The results show that the obtained clusters indeed meet the guarantees on
robustness.
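The clique connection can be sketched directly: estimate co-occurrence probabilities by re-running a stochastic clustering, keep pairs that co-occur with probability at least $1 - \alpha$, and read off maximal cliques. The Python below is our minimal rendering (KMeans and the run count are illustrative), not the paper's procedure.

import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

def core_clusters(X, k, alpha=0.1, runs=50):
    n = len(X)
    co = np.zeros((n, n))
    for seed in range(runs):              # repeat the stochastic clustering
        labels = KMeans(n_clusters=k, n_init=1,
                        random_state=seed).fit_predict(X)
        co += (labels[:, None] == labels[None, :])
    co /= runs                            # empirical co-occurrence probabilities
    G = nx.Graph()
    G.add_nodes_from(range(n))
    G.add_edges_from((i, j) for i in range(n) for j in range(i + 1, n)
                     if co[i, j] >= 1 - alpha)
    # maximal cliques of the thresholded graph are candidate core clusters
    return sorted(nx.find_cliques(G), key=len, reverse=True)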
Jesse H. Krijthe, Marco Loog
Comments: Presented at RRPR 2016: 1st Workshop on Reproducible Research in Pattern Recognition
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
In this paper, we discuss the approaches we took and trade-offs involved in
making a paper on a conceptual topic in pattern recognition research fully
reproducible. We discuss our definition of reproducibility, the tools used, how
the analysis was set up, show some examples of alternative analyses the code
enables and discuss our views on reproducibility.
Vishal Kakkar, Shirish K. Shevade, S Sundararajan, Dinesh Garg
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
AUC (Area under the ROC curve) is an important performance measure for
applications where the data is highly imbalanced. Learning to maximize AUC
performance is thus an important research problem. Using a max-margin based
surrogate loss function, AUC optimization problem can be approximated as a
pairwise rankSVM learning problem. Batch learning methods for solving the
kernelized version of this problem suffer from scalability and may not result
in sparse classifiers. Recent years have witnessed an increased interest in the
development of online or single-pass online learning algorithms that design a
classifier by maximizing the AUC performance. The AUC performance of nonlinear
classifiers, designed using online methods, is not comparable with that of
nonlinear classifiers designed using batch learning algorithms on many
real-world datasets. Motivated by these observations, we design a scalable
algorithm for maximizing AUC performance by greedily adding the required number
of basis functions into the classifier model. The resulting sparse classifiers
perform faster inference. Our experimental results show that the number of
basis functions required can be an order of magnitude smaller than in the
kernel RankSVM model, without much loss in AUC performance.
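As a reminder of the surrogate being optimized, a standard max-margin pairwise rankSVM formulation of AUC maximization reads as follows; this is a common textbook form in our notation, not necessarily the paper's exact statement.

\begin{equation}
  \min_{f}\; \frac{\lambda}{2}\|f\|^2
  + \frac{1}{|P|\,|N|}\sum_{i \in P}\sum_{j \in N}
    \max\bigl(0,\; 1 - (f(x_i) - f(x_j))\bigr),
\end{equation}

where $P$ and $N$ index the positive and negative examples. Each hinge term penalizes a positive example scored below (or too close to) a negative one, so driving the sum down approximately maximizes AUC; greedily adding basis functions to $f$ keeps the resulting classifier sparse.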
Asim Kadav, Erik Kruus
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)
Emerging workloads, such as graph processing and machine learning, are
approximate because of the scale of data involved and the stochastic nature of
the underlying algorithms. These algorithms are often distributed over multiple
machines using bulk-synchronous processing (BSP) or other synchronous
processing paradigms such as map-reduce. However, data parallel processing
primitives such as repeated barrier and reduce operations introduce high
synchronization overheads. Hence, many existing data-processing platforms use
asynchrony and staleness to improve data-parallel job performance. Often, these
systems simply change the synchronous communication to asynchronous between the
worker nodes in the cluster. This improves the throughput of data processing
but results in poor accuracy of the final output since different workers may
progress at different speeds and process inconsistent intermediate outputs.
In this paper, we present ASAP, a model that provides asynchronous and
approximate processing semantics for data-parallel computation. ASAP provides
fine-grained worker synchronization using NOTIFY-ACK semantics that allows
independent workers to run asynchronously. ASAP also provides stochastic reduce
that provides approximate but guaranteed convergence to the same result as an
aggregated all-reduce. In our results, we show that ASAP reduces
synchronization costs, provides 2-10X speedups in convergence and up to 10X
savings in network costs for distributed machine learning applications, and
offers strong convergence guarantees.
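To illustrate the flavor of NOTIFY-ACK synchronization (as opposed to a global barrier), here is a minimal two-worker Python sketch; the message format, counters, and scheduling are our simplifications for exposition, not the ASAP implementation.

import queue, threading

boxes = {}   # worker name -> inbox queue

class Worker(threading.Thread):
    def __init__(self, name, peers, steps=3):
        super().__init__()
        self.name, self.peers, self.steps = name, peers, steps
        boxes[name] = queue.Queue()

    def run(self):
        for step in range(self.steps):
            update = (self.name, step)             # stand-in for a gradient block
            for p in self.peers:                   # NOTIFY only the dependents
                boxes[p].put(("NOTIFY", self.name, update))
            acks = notifies = 0
            # wait only for this step's peer messages, not a global barrier
            while acks < len(self.peers) or notifies < len(self.peers):
                kind, sender, payload = boxes[self.name].get()
                if kind == "NOTIFY":               # consume peer update, then ACK
                    notifies += 1
                    boxes[sender].put(("ACK", self.name, None))
                else:
                    acks += 1

a = Worker("a", ["b"]); b = Worker("b", ["a"])   # inboxes exist before start
a.start(); b.start(); a.join(); b.join()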
Liang Zhang, Gang Wang, Daniel Romero, Georgios B. Giannakis
Subjects: Optimization and Control (math.OC); Learning (cs.LG); Numerical Analysis (cs.NA)
Owing to their low-complexity iterations, Frank-Wolfe (FW) solvers are well
suited for various large-scale learning tasks. When block-separable constraints
are also present, randomized FW has been shown to further reduce complexity by
updating only a fraction of coordinate blocks per iteration. In this context,
the present work develops feasibility-ensuring step sizes, and provably
convergent randomized block Frank-Wolfe (RB-FW) solvers that are flexible in
selecting the number of blocks to update per iteration. Convergence rates of
RB-FW are established through computational bounds on a primal sub-optimality
measure, and on the duality gap. Different from existing convergence analysis,
which only applies to a step-size sequence that does not generally lead to
feasible iterates, the analysis here includes two classes of step-size
sequences that not only guarantee feasibility of the iterates, but also enhance
flexibility in choosing decay rates. The novel convergence results are markedly
broadened to encompass also nonconvex objectives, and further assert that RB-FW
with exact line-search reaches a stationary point at rate
$\mathcal{O}(1/\sqrt{t})$. Performance of RB-FW with different step sizes and
number of blocks is demonstrated in two applications, namely charging of
electrical vehicles and structural support vector machines. Simulated tests
demonstrate the impressive performance improvement of RB-FW relative to
existing randomized single-block FW methods.
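A minimal sketch of the randomized block FW template follows, assuming a product-of-simplices constraint and the classic $2/(t+2)$ decay rather than the feasibility-ensuring step sizes developed in the paper; names are ours.

import numpy as np

def rb_fw(grad_f, x, blocks, n_iters=100, n_update=1, rng=None):
    """Randomized block Frank-Wolfe over a product of probability simplices."""
    rng = rng or np.random.default_rng(0)
    for t in range(n_iters):
        gamma = 2.0 / (t + 2.0)                   # illustrative step size
        g = grad_f(x)
        chosen = rng.choice(len(blocks), size=n_update, replace=False)
        for b in chosen:
            idx = blocks[b]
            s = np.zeros(len(idx))
            s[np.argmin(g[idx])] = 1.0            # LMO: best simplex vertex
            x[idx] = (1 - gamma) * x[idx] + gamma * s   # convex combination
    return x

# Toy use: minimize 0.5 * x^T x over two 3-dimensional simplices.
grad = lambda x: x
x0 = np.repeat(1.0 / 3.0, 6)
blocks = [np.arange(0, 3), np.arange(3, 6)]
x_star = rb_fw(grad, x0.copy(), blocks)

Because each update is a convex combination of a feasible point and a simplex vertex, iterates stay feasible by construction, which is the property the paper's step-size analysis secures in the general case.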
Chris Hodapp
Comments: 5 pages, 5 figures
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
With large volumes of health care data comes the research area of
computational phenotyping, making use of techniques such as machine learning to
describe illnesses and other clinical concepts from the data itself. The
“traditional” approach of using supervised learning relies on a domain expert,
and has two main limitations: requiring skilled humans to supply correct labels
limits its scalability and accuracy, and relying on existing clinical
descriptions limits the sorts of patterns that can be found. For instance, it
may fail to acknowledge that a disease treated as a single condition may really
have several subtypes with different phenotypes, as seems to be the case with
asthma and heart disease. Some recent papers instead report successes using
unsupervised learning. This shows great potential for finding patterns in
Electronic Health Records that would otherwise be hidden and that can lead to
greater understanding of conditions and treatments. This work implements a
method derived strongly from Lasko et al., but implements it in Apache Spark
and Python and generalizes it to laboratory time-series data in MIMIC-III. It
is released as an open-source tool for exploration, analysis, and
visualization, available at this https URL
Torsten A. Enßlin, Jakob Knollmüller
Comments: 19 pages, 5 figures, submitted
Subjects: Machine Learning (stat.ML); Instrumentation and Methods for Astrophysics (astro-ph.IM); Information Theory (cs.IT); Learning (cs.LG)
The inference of correlated signal fields with unknown correlation structures
is of high scientific and technological relevance, but poses significant
conceptual and numerical challenges. To address these, we develop the
correlated signal inference (CSI) algorithm within information field theory
(IFT) and discuss its numerical implementation. To this end, we introduce the
free energy exploration (FrEE) strategy for numerical information field theory
(NIFTy) applications. The FrEE strategy is to let the mathematical structure of
the inference problem determine the dynamics of the numerical solver. FrEE uses
the Gibbs free energy formalism for all involved unknown fields and correlation
structures without marginalization of nuisance quantities. It thereby avoids
the complexity that marginalization often imposes on IFT equations. FrEE
simultaneously solves for the mean and the uncertainties of signal, nuisance,
and auxiliary fields, while exploiting any analytically calculable quantity.
Finally, FrEE uses a problem-specific and self-tuning exploration strategy to
swiftly identify the optimal field estimates as well as their uncertainty maps.
For all estimated fields, properly weighted posterior samples drawn from their
exact, fully non-Gaussian distributions can be generated. Here, we develop the
FrEE strategies for the CSI of a normal, a log-normal, and a Poisson log-normal
IFT signal inference problem and demonstrate their performances via their NIFTy
implementations.
Muhammad Yousefnezhad, Daoqiang Zhang
Comments: Accepted in SIAM International Conference on Data Mininig (SDM), Houston, Texas, USA, April/27-29, 2017
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neurons and Cognition (q-bio.NC)
Multivariate Pattern (MVP) classification holds enormous potential for
decoding visual stimuli in the human brain by employing task-based fMRI data
sets. MVP techniques face a wide range of challenges, e.g., decreasing noise
and sparsity, defining effective regions of interest (ROIs), visualizing
results, and reducing the cost of brain studies. In overcoming these
challenges, this paper proposes a novel model of neural representation, which
can automatically detect the active regions for each visual stimulus and then
utilize these anatomical regions for visualizing and analyzing the functional
activities. Therefore, this model provides an opportunity for neuroscientists
to ask what the effect of a stimulus is on each of the detected regions,
rather than just studying the fluctuation of voxels in manually selected ROIs.
Moreover, our method analyzes snapshots of brain images to decrease sparsity,
rather than using whole fMRI time series. Further, a new Gaussian smoothing
method is proposed for removing voxel noise at the level of ROIs. The proposed
method enables us to combine
different fMRI data sets for reducing the cost of brain studies. Experimental
studies on 4 visual categories (words, consonants, objects and nonsense photos)
confirm that the proposed method achieves superior performance to
state-of-the-art methods.
Gwangbeen Park, Woobin Im
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)
We present a novel method for image-text multi-modal representation learning.
To our knowledge, this work is the first to apply the adversarial learning
concept to multi-modal learning without exploiting image-text pair information
to learn a multi-modal feature. We use only category information, in contrast
with most previous methods, which use image-text pair information for
multi-modal embedding. In this paper, we show that a multi-modal feature can
be achieved without image-text pair information, and that our method produces
image and text distributions in the multi-modal feature space that are more
similar than those of other methods which use image-text pair information. We
also show that our multi-modal feature carries universal semantic information,
even though it was trained for category prediction. Our model is trained
end-to-end by backpropagation, is intuitive, and is easily extended to other
multi-modal learning work.
Yuemeng Li, Xintao Wu, Aidong Lu
Comments: 10 pages
Subjects: Social and Information Networks (cs.SI); Learning (cs.LG); Physics and Society (physics.soc-ph)
It has been shown that the adjacency eigenspace of a network contains key
information of its underlying structure. However, there has been no study on
spectral analysis of the adjacency matrices of directed signed graphs. In this
paper, we derive theoretical approximations of spectral projections from such
directed signed networks using matrix perturbation theory. We use the derived
theoretical results to study the influences of negative intra-cluster and
inter-cluster directed edges on node spectral projections. We then develop a spectral
clustering based graph partition algorithm, SC-DSG, and conduct evaluations on
both synthetic and real datasets. Both theoretical analysis and empirical
evaluation demonstrate the effectiveness of the proposed algorithm.
Martin Genzel, Gitta Kutyniok
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Statistics Theory (math.ST)
In this paper, we study the challenge of feature selection based on a
relatively small collection of sample pairs $\{(x_i, y_i)\}_{1 \leq i \leq m}$.
The observations $y_i \in \mathbb{R}$ are thereby supposed to follow a noisy
single-index model, depending on a certain set of signal variables. A major
difficulty is that these variables usually cannot be observed directly, but
rather arise as hidden factors in the actual data vectors $x_i \in
\mathbb{R}^d$ (feature variables). We will prove that a successful variable
selection is still possible in this setup, even when the applied estimator does
not have any knowledge of the underlying model parameters and only takes the
‘raw’ samples $\{(x_i, y_i)\}_{1 \leq i \leq m}$ as input. The model
assumptions of our results will be fairly general, allowing for non-linear
observations, arbitrary convex signal structures as well as strictly convex
loss functions. This is particularly appealing for practical purposes, since in
many applications, already standard methods, e.g., the Lasso or logistic
regression, yield surprisingly good outcomes. Apart from a general discussion
of the practical scope of our theoretical findings, we will also derive a
rigorous guarantee for a specific real-world problem, namely sparse feature
extraction from (proteomics-based) mass spectrometry data.
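For orientation, a generic noisy single-index model of the kind described takes the form below; this is our schematic rendering, and the paper's precise assumptions differ in detail.

\begin{equation}
  y_i = g(\langle a_0, s_i \rangle) + \varepsilon_i, \qquad x_i = A\, s_i,
\end{equation}

where $s_i$ are the hidden signal variables arising as factors of the observed features $x_i$, $a_0$ selects the active variables, $g$ is a possibly unknown non-linearity, and $\varepsilon_i$ is noise. The point of the result above is that estimators such as the Lasso succeed without knowing $g$ or $A$.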
Longzhuang He, Jintao Wang, Jian Song, Lajos Hanzo
Journal-ref: IEEE Transactions on Wireless Communications, 2016
Subjects: Information Theory (cs.IT)
Massive spatial modulation aided multiple-input multiple-output (SM-MIMO)
systems have recently been proposed as a novel combination of spatial
modulation (SM) and of conventional massive MIMO, where the base station (BS)
is equipped with a large number of antennas and simultaneously serves multiple
user equipment (UE) that employ SM for their uplink transmission. Since the
massive SM-MIMO concept combines the benefits of both the SM and massive MIMO
techniques, it has recently attracted substantial research interest. In this
paper, we study the achievable uplink spectral efficiency (SE) of a multi-cell
massive SM-MIMO system, and derive closed-form expressions to asymptotically
lower-bound the SE yielded by two linear BS combining schemes, including
maximum ratio (MR) combining and zero forcing (ZF) combining, when the BS is
equipped with a sufficiently large number of antennas. The derivation takes
into account the impact of transmitter spatial correlations, of imperfect
channel estimations, of user-specific power controls and of different pilot
reuse factors. The proposed asymptotic bounds are shown to be tight, even when
the scale of BS antennas is limited. The new SE results facilitate a
system-level investigation of the optimal number of uplink transmit antennas
(TAs) $N$ with respect to SE maximization. Explicitly, we provide theoretical
insights on the SE of massive SM-MIMO systems. Furthermore, we demonstrate that
massive SM-MIMO systems are capable of outperforming the SE of conventional
massive MIMOs relying on single-TA UEs.
Lei Chen, Ahmed G. Helmy, Guangrong Yue, Shaoqian Li, Naofal Al-Dhahir
Comments: 7 pages, 2 figures, IEEE GLOBECOM 2016
Subjects: Information Theory (cs.IT)
Differential space time block coding (STBC) achieves full spatial diversity
and avoids channel estimation overhead. Over highly frequency-selective
channels, STBC is integrated with orthogonal frequency division multiplexing
(OFDM) to achieve high performance. However, low-cost implementation of
differential STBC-OFDM using direct-conversion transceivers is sensitive to
In-phase/Quadrature-phase imbalance (IQI). In this paper, we quantify the
performance impact of IQI at the receiver front-end on differential STBC-OFDM
systems and propose a compensation algorithm to mitigate its effect. The
proposed receiver IQI compensation works in an adaptive decision-directed
manner without using known pilots or training sequences, which reduces the rate
loss due to training overhead. Our numerical results show that our proposed
compensation algorithm can effectively mitigate receive IQI in differential
STBC-OFDM.
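For reference, receive-side IQ imbalance is commonly modeled as below; this is a standard textbook parametrization in our notation, not necessarily the paper's exact one, with $g$ and $\theta$ denoting the gain and phase mismatches.

\begin{equation}
  r(t) = \mu\, s(t) + \nu\, s^{*}(t), \qquad
  \mu = \tfrac{1}{2}\bigl(1 + g e^{-j\theta}\bigr), \quad
  \nu = \tfrac{1}{2}\bigl(1 - g e^{j\theta}\bigr),
\end{equation}

so the imbalance mixes the desired signal with its complex conjugate (the image), and it is this self-interference that the decision-directed compensation must undo.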
Lei Chen, Ahmed G. Helmy, Guangrong Yue, Shaoqian Li, Naofal Al-Dhahir
Comments: IEEE Transactions on Vehicular Technology, 15 pages, 11 figures
Subjects: Information Theory (cs.IT)
Differential space time block coding (STBC) achieves full spatial diversity
and avoids channel estimation overhead. Over highly frequency-selective
channels, STBC is integrated with orthogonal frequency division multiplexing
(OFDM) to efficiently mitigate intersymbol interference effects. However,
low-cost implementation of STBC-OFDM with direct-conversion transceivers is
sensitive to In-phase/Quadrature-phase imbalance (IQI). In this paper, we
quantify the performance impact of IQI at both the transmitter and receiver
radio frequency front-ends on differential STBC-OFDM systems, which has not
been investigated before in the literature. In addition, we propose a
widely-linear compensation algorithm at the receiver to mitigate the
performance degradation caused by the IQI at the transmitter and receiver
ends. Moreover, a parameter-based generalized algorithm is proposed to extract
the IQI parameters and improve the performance under high mobility. The
adaptive compensation
algorithms are blind and work in a decision-directed manner without using known
pilots or training sequences. Numerical results show that our proposed
compensation algorithms can effectively mitigate IQI in differential STBC-OFDM.
Peng Deng, Mohsen Kavehrad
Comments: 11 pages, 11 figures
Subjects: Information Theory (cs.IT); Optics (physics.optics)
Degradation in the performance of visible light communication (VLC) systems
induced by light-emitting diode (LED) nonlinearity distortion can be
controlled by optimizing the DC bias point of the LED.
theoretically analyze and experimentally demonstrate the effect of white LED DC
bias on nonlinear modulation bandwidth and dynamic range of the VLC system. The
linear dynamic range is enhanced by using series-connected LED chips, and the
modulation bandwidth is extended to 40 MHz by post-equalization without using a
blue filter. The experimental results well match the theoretical model of LED
nonlinear modulation characteristics. The results show that the modulation
bandwidth increases and saturates with an increase in LED DC bias current due
to nonlinear effect of carrier lifetime and junction capacitance. The optimized
DC-bias current that corresponds to the minimum BER increases with the increase
of data rate. A 60-Mbps NRZ transmission can be achieved with a BER below the
$10^{-3}$ threshold by properly adjusting the LED DC bias point.
Muhammad R. A. Khandaker, Christos Masouros, Kai-Kit Wong
Comments: To be submitted to the IEEE Transactions on Signal Processing
Subjects: Information Theory (cs.IT)
Conventionally, interference and noise are treated as catastrophic elements
in wireless communications. However, it has been shown recently that exploiting
known interference constructively can even contribute to signal detection
ability at the receiving end. This paper exploits this concept to design
artificial noise (AN) beamformers constructive to the intended receiver (IR)
yet keeping AN disruptive to possible eavesdroppers (Eves). The scenario
considered here is a multiple-input single-output (MISO) wiretap channel with
multiple eavesdroppers. Both perfect and imperfect channel information have
been considered. The main objective is to improve the receive
signal-to-interference and noise ratio (SINR) at IR through exploitation of AN
power in an attempt to minimize the total transmit power, while confusing the
Eves. Numerical simulations demonstrate that the proposed constructive AN
precoding approach yields superior performance over conventional AN schemes in
terms of transmit power as well as symbol error rate (SER).
A Pranav
Comments: 6 pages, Submitted to ICIIS 2016
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
In this paper, we have studied the bit partitioning schemes for the multicell
multiple-input and single-output (MISO) infrastructure. Zero-forcing
beamforming is used to null out the interference signals, and random vector
quantization is used to quantize the channel vectors. For minimal feedback period (MFP),
the upper bound of rate loss is calculated and optimal bit partitioning among
the channels is shown. For adaptive feedback period scheme (AFP), joint
optimization schemes of feedback period and bit partitioning are proposed.
Finally, we compare the sum rate efficiency of each scheme and conclude that
minimal feedback period outperforms other schemes.
Yongpeng Wu, Jun-Bo Wang, Jue Wang, Robert Schober, Chengshan Xiao
Subjects: Information Theory (cs.IT)
In this paper, we investigate secure transmission over the large-scale
multiple-antenna wiretap channel with finite alphabet inputs. First, we show
analytically that a generalized singular value decomposition (GSVD) based
design, which is optimal for Gaussian inputs, may exhibit a severe performance
loss for finite alphabet inputs in the high signal-to-noise ratio (SNR) regime.
In light of this, we propose a novel Per-Group-GSVD (PG-GSVD) design which can
effectively compensate the performance loss caused by the GSVD design. More
importantly, the computational complexity of the PG-GSVD design is orders of
magnitude lower than that of the existing design for finite alphabet inputs in
[1] while the resulting performance loss is minimal. Numerical results indicate
that the proposed PG-GSVD design can be efficiently implemented in large-scale
multiple-antenna systems and achieves significant performance gains compared to
the GSVD design.
Ahmed Ewaisha, Cihan Tepedelenlioglu
Comments: WCNC 2017
Journal-ref: Wireless Communications and Networking Conference 2017
Subjects: Information Theory (cs.IT)
We consider the joint scheduling-and-power-allocation problem of a downlink
cellular system. The system consists of two groups of users: real-time (RT) and
non-real-time (NRT) users. Given some average power constraint on the base
station, the problem is to find an algorithm that satisfies the RT and NRT
quality-of-service (QoS) constraints. The RT QoS constraints guarantee that
the portion of RT packets missing their deadline is no more than a
pre-specified threshold. On the other hand, the NRT QoS only guarantees the
stability of their queues. We propose a sum-rate-maximizing algorithm that
satisfies all QoS and average power constraints. The proposed power allocation
policy has a closed-form expression for the two groups of users. However, the
power policy of the RT users differs in structure from that of the NRT users.
The proposed algorithm is optimal for the on-off channel model with a
polynomial-time scheduling complexity. Using extensive simulations, the
throughput of the proposed algorithm is shown to exceed that of existing
approaches.
Ahmed Ewaisha, Cihan Tepedelenlioglu
Comments: Asilomar 2013. arXiv admin note: text overlap with arXiv:1410.7460, arXiv:1601.00608
Journal-ref: 2013 Asilomar Conference on Signals, Systems and Computers
Subjects: Information Theory (cs.IT)
We study the throughput-vs-delay trade-off in an overlay multi-channel
single-secondary-user cognitive radio system. Due to the limited sensing
capabilities of the cognitive radio user, channels are sensed sequentially.
Maximizing the throughput in such a problem is well-studied in the literature.
Yet, in real-time applications, hard delay constraints need to be considered
besides throughput. In this paper, optimal stopping rule and optimal power
allocation are discussed to maximize the secondary user’s throughput, subject
to an average delay constraint. We provide a low complexity approach to the
optimal solution of this problem. Simulation results show that this solution
allows the secondary user to meet the delay constraint without sacrificing
throughput significantly. They also show the benefits of the optimal power
allocation strategy over the constant power allocation strategy.
Takuya Ohto, Koji Yamamoto, Seong-Lyun Kim, Takayuki Nishio, Masahiro Morikura
Comments: This work has been submitted to the IEEE for possible publication
Subjects: Information Theory (cs.IT)
The coverage probability and average ergodic rate of normalized SNR-based
scheduling in a downlink cellular network are derived by modeling the locations
of the base stations and users as two independent Poisson point processes.
scheduler selects the user with the largest instantaneous SNR normalized by the
average SNR. Our results confirm that normalized SNR scheduling increases the
coverage probability due to the multi-user diversity gain.
Hirofumi Tsuda, Ken Umeno
Comments: 9 pages, 1 figure. arXiv admin note: text overlap with arXiv:1610.04340
Subjects: Information Theory (cs.IT)
Signal to Noise Ratio (SNR) is an important index for wireless
communications. In CDMA systems, spreading sequences are utilized. This series
of papers shows a method for deriving spreading sequences as solutions of a
non-linear program: maximize SNR. In this paper, we consider a
frequency-selective wide-sense-stationary uncorrelated-scattering (WSSUS)
channel and evaluate the worst case of SNR. Then, we derive the new expression
of SNR whose main term consists of the periodic correlation terms and the
aperiodic correlation terms. In general, there is a relation between SNR and
mean-square correlations, which are indices for performance of spreading
sequences. Then, we show the relation between our expression and them. With
this expression, we can maximize SNR with the Lagrange multiplier method. In
Part II, with this expression, we construct two types of optimization problems
and evaluate them.
Sandra Bender, Meik Dörpinghaus, Gerhard Fettweis
Subjects: Information Theory (cs.IT)
We consider a continuous-time bandlimited additive white Gaussian noise
channel with 1-bit output quantization. On such a channel the information is
carried by the temporal distances of the zero-crossings of the transmit signal.
The set of input signals is constrained by the bandwidth of the channel and an
average power constraint. We derive a lower bound on the capacity by
lower-bounding the achievable rate for a given set of waveforms with
exponentially distributed zero-crossing distances. We focus on the behaviour in
the high signal-to-noise ratio regime and characterize the achievable rate over
the available bandwidth and the signal-to-noise ratio.
Ebrahim Bedeer, Mohamed Hossam Ahmed, Halim Yanikomeroglu
Subjects: Information Theory (cs.IT)
In this paper, we investigate the sequence estimation problem of binary and
quadrature phase shift keying faster-than-Nyquist (FTN) signaling and propose
two novel low-complexity sequence estimation techniques based on concepts of
successive interference cancellation. To the best of our knowledge, this is the
first approach in the literature to detect FTN signaling on a symbol-by-symbol
basis. In particular, based on the structure of the self-interference inherited
in FTN signaling, we first find the operating region boundary—defined by the
root-raised cosine (rRC) pulse shape, its roll-off factor, and the time
acceleration parameter of the FTN signaling—where perfect estimation of the
transmit data symbols on a symbol-by-symbol basis is guaranteed, assuming
noise-free transmission. For noisy transmission, we then propose a novel
low-complexity technique that works within the operating region and is capable
of estimating the transmit data symbols on a symbol-by-symbol basis. To reduce
the error propagation of the proposed successive symbol-by-symbol sequence
estimator (SSSSE), we propose a successive symbol-by-symbol with go-back-$K$
sequence estimator (SSSgb$K$SE) that goes back to re-estimate up to $K$
symbols, and subsequently improves the estimation accuracy of the current data
symbol. Simulation results show that the proposed sequence estimation
techniques perform well for low intersymbol interference (ISI) scenarios and
can significantly increase the data rate and spectral efficiency. Additionally,
results reveal that choosing the value of $K$ as low as $2$ or $3$ data symbols
is sufficient to significantly improve the bit-error-rate performance. Results
also show that the performance of the proposed SSSgb$K$SE, with $K = 1$ or $2$,
surpasses the performance of the lowest complexity equalizers reported in the
literature, with reduced computational complexity.
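To convey the symbol-by-symbol idea, here is a minimal Python sketch of successive estimation with interference cancellation; for simplicity it assumes causal ISI taps (FTN self-interference from an rRC pulse is actually two-sided) and omits the go-back-$K$ refinement, so it is our illustration rather than the proposed SSSSE.

import numpy as np

def ssse(r, h, constellation=(-1.0, 1.0)):
    """Decide symbols one at a time, cancelling ISI from past decisions.

    r: received samples (one per symbol); h: causal ISI taps, h[0] main tap.
    """
    est = np.zeros(len(r))
    for n in range(len(r)):
        # subtract the interference contributed by already-decided symbols
        isi = sum(h[k] * est[n - k] for k in range(1, min(n + 1, len(h))))
        z = (r[n] - isi) / h[0]
        est[n] = min(constellation, key=lambda s: abs(z - s))  # nearest symbol
    return est

# Toy use: BPSK over mild ISI, noise-free as in the perfect-estimation regime.
h = np.array([1.0, 0.4, 0.1])
sym = np.random.default_rng(1).choice([-1.0, 1.0], size=20)
r = np.convolve(sym, h)[:len(sym)]
assert np.array_equal(ssse(r, h), sym)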
Jie Xu, Lingjie Duan, Rui Zhang
Comments: To appear in IEEE Wireless Communications Letters
Subjects: Information Theory (cs.IT)
This letter studies an emerging wireless communication intervention problem
at the physical layer, where a legitimate spoofer aims to spoof a malicious
link from Alice to Bob, by replacing Alice’s transmitted source message with
its target message at Bob side. From an information-theoretic perspective, we
are interested in characterizing the maximum achievable spoofing rate of this
new spoofing channel, which is equivalent to the maximum achievable rate of the
target message at Bob, under the condition that Bob cannot decode the source
message from Alice. We propose a novel combined spoofing approach, where the
spoofer sends its own target message, combined with a processed version of the
source message to cancel the source message at Bob. For both cases when Bob
treats interference as noise (TIN) or applies successive interference
cancelation (SIC), we obtain the maximum achievable spoofing rates by
optimizing the power allocation between the target and source messages at the
spoofer.
Alexander B. Boyd, Dibyendu Mandal, Paul M. Riechers, James P. Crutchfield
Comments: 8 pages, 2 figures, supplementary material; this http URL
Subjects: Statistical Mechanics (cond-mat.stat-mech); Information Theory (cs.IT); Adaptation and Self-Organizing Systems (nlin.AO)
A central result that arose in applying information theory to the stochastic
thermodynamics of nonlinear dynamical systems is the Information-Processing
Second Law (IPSL): the physical entropy of the universe can decrease if
compensated by the Shannon-Kolmogorov-Sinai entropy change of appropriate
information-carrying degrees of freedom. In particular, the asymptotic-rate
IPSL precisely delineates the thermodynamic functioning of autonomous
Maxwellian demons and information engines. How do these systems begin to
function as engines, Landauer erasers, and error correctors? Here, we identify
a minimal, inescapable transient dissipation engendered by physical information
processing not captured by asymptotic rates, but critical to adaptive
thermodynamic processes such as those found in biological systems. As a
component of transient dissipation, we also identify an
implementation-dependent cost that
varies from one physical substrate to another for the same information
processing task. Applying these results to producing structured patterns from a
structureless information reservoir, we show that “retrodictive” generators
achieve the minimal costs. The results establish the thermodynamic toll imposed
by a physical system’s structure as it comes to optimally transduce
information.
Mihailo Stojnic
Subjects: Probability (math.PR); Information Theory (cs.IT); Statistics Theory (math.ST)
In our companion paper \cite{Stojnicgscomp16} we introduce a collection of
fairly powerful statistical comparison results. They relate to a general
comparison concept and an upgrade of it that we call the lifting procedure.
Here we provide a different generic principle (which we call fully bilinear)
that in certain cases turns out to be stronger than the corresponding one from
\cite{Stojnicgscomp16}. Moreover, we show how the principle that we introduce
here can also be pushed through the lifting machinery of
\cite{Stojnicgscomp16}. Finally, as was the case in \cite{Stojnicgscomp16},
here we also show how the well-known Slepian max and Gordon minmax comparison
principles can be obtained as special cases of the mechanisms presented here.
We also create their lifted upgrades, which happen to be stronger than the
corresponding ones in \cite{Stojnicgscomp16}. A fairly large
collection of results obtained through numerical experiments is also provided.
It is observed that these results are in an excellent agreement with what the
theory predicts.
Mihailo Stojnic
Subjects: Probability (math.PR); Information Theory (cs.IT); Statistics Theory (math.ST)
In this paper we introduce a collection of powerful statistical comparison
results. We first present the results that we obtained while developing a
general comparison concept. After that we introduce a separate lifting
procedure that is a comparison concept on its own. We then show how in certain
scenarios the lifting procedure basically represents a substantial upgrade over
the general strategy. We complement the introduced results with a fairly large
collection of numerical experiments that are in an overwhelming agreement with
what the theory predicts. We also show how many well known comparison results
(e.g. Slepian’s max and Gordon’s minmax principle) can be obtained as special
cases. Moreover, it turns out that the minmax principle can be viewed as a
single max principle as well. The range of applications is enormous. It starts
with revisiting many of the results we created in recent years in various
mathematical fields and recognizing that they are fully self-contained as their
starting blocks are specialized variants of the concepts introduced here.
Further upgrades relate to core comparison extensions on the one side and more
practically oriented modifications on the other. Those that we deem the most
important we discuss in several separate companion papers to ensure preserving
the introductory elegance and simplicity of what is presented here.
C. Aghamohammadi, J. P. Crutchfield
Comments: 13 pages, 4 figures; this http URL
Subjects: Statistical Mechanics (cond-mat.stat-mech); Information Theory (cs.IT); Chaotic Dynamics (nlin.CD)
We analyze the thermodynamic costs of the three main approaches to generating
random numbers via the recently introduced Information Processing Second Law.
Given access to a specified source of randomness, a random number generator
(RNG) produces samples from a desired target probability distribution. This
differs from pseudorandom number generators (PRNG) that use wholly
deterministic algorithms and from true random number generators (TRNG) in which
the randomness source is a physical system. For each class, we analyze the
thermodynamics of generators based on algorithms implemented as finite-state
machines, as these allow for direct bounds on the required physical resources.
This establishes bounds on heat dissipation and work consumption during the
operation of three main classes of RNG algorithms—including those of von
Neumann, Knuth and Yao, and Roche and Hoshi—and for PRNG methods. We
introduce a general TRNG and determine its thermodynamic costs exactly for
arbitrary target distributions. The results highlight the significant
differences between the three main approaches to random number generation: One
is work producing, one is work consuming, and the other is potentially
dissipation neutral. Notably, TRNGs can both generate random numbers and
convert thermal energy to stored work. These thermodynamic costs on information
creation complement Landauer’s limit on the irreducible costs of information
destruction.
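Of the algorithms analyzed above, the von Neumann scheme is the simplest to state; the following Python rendering (ours) shows it as the finite-state procedure the analysis assumes: read input bits in pairs, emit on '01'/'10', discard '00'/'11'.

import random

def von_neumann(bits):
    """Turn i.i.d. biased coin flips into unbiased bits."""
    out = []
    for b0, b1 in zip(bits[::2], bits[1::2]):
        if b0 != b1:           # '01' -> 0, '10' -> 1; equal pairs are discarded
            out.append(b0)
    return out

biased = [1 if random.random() < 0.8 else 0 for _ in range(10_000)]
fair = von_neumann(biased)
# with bias p = 0.8, each pair survives with probability 2p(1-p) = 0.32,
# so roughly 1600 unbiased output bits are expected here

Since P(01) = P(10) = p(1-p) for any bias p, the surviving bits are exactly fair, which is why the scheme needs no knowledge of the source distribution.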
Eliel Hojman, Thomas Chaigne, Oren Solomon, Sylvain Gigan, Emmanuel Bossy, Yonina C. Eldar, Ori Katz
Subjects: Optics (physics.optics); Information Theory (cs.IT)
In deep tissue photoacoustic imaging the spatial resolution is inherently
limited by the acoustic wavelength. Recently, it was demonstrated that it is
possible to surpass the acoustic diffraction limit by analyzing fluctuations in
a set of photoacoustic images obtained under unknown speckle illumination
patterns. Here, we propose an approach to boost reconstruction fidelity and
resolution, while reducing the number of acquired images by utilizing a
compressed sensing computational reconstruction framework. The approach takes
into account prior knowledge of the system response and sparsity of the target
structure. We provide proof of principle experiments of the approach and
demonstrate that improved performance is obtained when both speckle
fluctuations and object priors are used. We numerically study the expected
performance as a function of the measurement signal-to-noise ratio and sample
spatial sparsity. The presented reconstruction framework can be applied to
analyze existing photoacoustic experimental datasets containing dynamic
fluctuations.
Martin Genzel
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT)
In this paper, we study the issue of estimating a structured signal $x_0 \in
\mathbb{R}^n$ from non-linear and noisy Gaussian observations. Supposing that
$x_0$ is contained in a certain convex subset $K \subset \mathbb{R}^n$, we
prove that accurate recovery is already feasible if the number of observations
exceeds the effective dimension of $K$, which is a common measure for the
complexity of signal classes. It will turn out that the possibly unknown
non-linearity of our model affects the error rate only by a multiplicative
constant. This achievement is based on recent works by Plan and Vershynin, who
have suggested to treat the non-linearity rather as noise which perturbs a
linear measurement process. Using the concept of restricted strong convexity,
we show that their results for the generalized Lasso can be extended to a
fairly large class of convex loss functions. Moreover, we shall allow for the
presence of adversarial noise so that even deterministic model inaccuracies can
be coped with. These generalizations particularly give further evidence of why
many standard estimators perform surprisingly well in practice, although they
do not rely on any knowledge of the underlying output rule. To this end, our
results provide a unified and general framework for signal reconstruction in
high dimensions, covering various challenges from the fields of compressed
sensing, signal processing, and statistical learning.
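For concreteness, the generalized Lasso in the Plan-Vershynin setting referred to above is the constrained empirical-risk estimator sketched below; this is our schematic statement of a standard form, with the loss generalization described in the text.

\begin{equation}
  \hat{x} \in \operatorname*{argmin}_{x \in K}\;
  \frac{1}{m}\sum_{i=1}^{m} \mathcal{L}\bigl(\langle a_i, x \rangle, y_i\bigr),
\end{equation}

where the $a_i$ are the Gaussian measurement vectors. Taking $\mathcal{L}(u, y) = (u - y)^2$ recovers the ordinary generalized Lasso, while the results above extend to general strictly convex losses $\mathcal{L}$, with the unknown non-linearity of the output rule absorbed into a multiplicative constant in the error rate.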