Mohammad Javad Shafiee, Audrey G. Chung, Farzad Khalvati, Masoom A. Haider, Alexander Wong
Comments: 8 pages
Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
While lung cancer is the second most diagnosed form of cancer in men and
women, a sufficiently early diagnosis can be pivotal to patient survival.
Imaging-based, or radiomics-driven, detection methods have been developed to
aid diagnosticians, but largely rely on hand-crafted features which may not
fully encapsulate the differences between cancerous and healthy tissue.
Recently, the concept of discovery radiomics was introduced, where custom
abstract features are discovered from readily available imaging data. We
propose a novel evolutionary deep radiomic sequencer discovery approach based
on evolutionary deep intelligence. Motivated by patient privacy concerns and
the idea of operational artificial intelligence, the evolutionary deep radiomic
sequencer discovery approach organically evolves increasingly efficient deep
radiomic sequencers that produce significantly more compact yet similarly
descriptive radiomic sequences over multiple generations. As a result, this
framework improves operational efficiency and enables diagnosis to be run
locally at the radiologist’s computer while maintaining detection accuracy. We
evaluated the evolved deep radiomic sequencer (EDRS) discovered via the
proposed evolutionary deep radiomic sequencer discovery framework against
state-of-the-art radiomics-driven and discovery radiomics methods using
clinical lung CT data with pathologically-proven diagnostic data. The evolved
deep radiomic sequencer shows state-of-the-art diagnostic accuracy (88.78%)
relative to previous radiomics approaches.
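As a toy illustration of the generation-to-generation evolution described above, the sketch below (Python/NumPy) samples offspring synapses with probability proportional to parent synaptic strength, yielding a progressively sparser network; the function name, the survival budget, and the omission of any retraining step are simplifying assumptions, not the authors' implementation.

    import numpy as np

    def evolve_layer(weights, survival_budget=0.8, rng=np.random.default_rng(0)):
        # Offspring synapses survive with probability proportional to the
        # parent's synaptic strength (here: weight magnitude), scaled so the
        # expected density shrinks each generation.
        strength = np.abs(weights)
        prob = survival_budget * strength / strength.max()
        mask = rng.random(weights.shape) < prob
        return weights * mask, mask

    parent = np.random.default_rng(1).normal(size=(64, 64))
    offspring, mask = evolve_layer(parent)
    print("synapses kept: %.1f%%" % (100 * mask.mean()))  # sparser than the parent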
Hamed Zamani, W. Bruce Croft
Comments: to appear in the proceedings of The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’17)
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Learning a high-dimensional dense representation for vocabulary terms, also
known as a word embedding, has recently attracted much attention in natural
language processing and information retrieval tasks. The embedding vectors are
typically learned based on term proximity in a large corpus. This means that
the objective in well-known word embedding algorithms, e.g., word2vec, is to
accurately predict adjacent word(s) for a given word or context. However, this
objective is not necessarily equivalent to the goal of many information
retrieval (IR) tasks. The primary objective in various IR tasks is to capture
relevance instead of term proximity, syntactic, or even semantic similarity.
This is the motivation for developing unsupervised relevance-based word
embedding models that learn word representations based on query-document
relevance information. In this paper, we propose two learning models with
different objective functions; one learns a relevance distribution over the
vocabulary set for each query, and the other classifies each term as belonging
to the relevant or non-relevant class for each query. To train our models, we
used over six million unique queries and the top ranked documents retrieved in
response to each query, which are assumed to be relevant to the query. We
extrinsically evaluate our learned word representation models using two IR
tasks: query expansion and query classification. Both query expansion
experiments on four TREC collections and query classification experiments on
the KDD Cup 2005 dataset suggest that the relevance-based word embedding models
significantly outperform state-of-the-art proximity-based embedding models,
such as word2vec and GloVe.
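For concreteness, a minimal sketch (Python) of how such relevance-based training targets could be derived: for each query, the top-ranked documents are treated as pseudo-relevant and their term frequencies define a relevance distribution over the vocabulary. Uniform document weighting and whitespace tokenization are simplifying assumptions, not the paper's exact estimator.

    from collections import Counter

    def relevance_targets(query_to_topdocs):
        # Estimate, for each query, a relevance distribution over the
        # vocabulary from its top-ranked (assumed-relevant) documents.
        targets = {}
        for query, docs in query_to_topdocs.items():
            counts = Counter(tok for doc in docs for tok in doc.split())
            total = sum(counts.values())
            targets[query] = {term: c / total for term, c in counts.items()}
        return targets

    demo = {"jaguar speed": ["the jaguar runs fast", "top speed of the jaguar"]}
    print(relevance_targets(demo)["jaguar speed"])  # weights sum to 1 per query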
Zhen Li, Sheng Wang, Yizhou Yu, Jinbo Xu
Subjects: Biomolecules (q-bio.BM); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM)
Computational prediction of membrane protein (MP) structures is very
challenging partially due to lack of sufficient solved structures for homology
modeling. Recently, direct evolutionary coupling analysis (DCA) has shed some
light on protein contact prediction and, accordingly, contact-assisted folding,
but DCA is effective only on some very large families since it uses information
from only a single protein family. This paper presents a deep
transfer learning method that can significantly improve MP contact prediction
by learning contact patterns and complex sequence-contact relationship from
thousands of non-membrane proteins (non-MPs). Tested on 510 non-redundant MPs,
our deep model (learned from only non-MPs) achieves a top L/10 long-range
contact prediction accuracy of 0.69, better than our deep model trained on only
MPs (0.63) and much better than a representative DCA method, CCMpred (0.47),
and the CASP11 winner MetaPSICOV (0.55). The accuracy of our deep model can be
further improved to 0.72 when trained on a mix of non-MPs and MPs. When only
contacts in transmembrane regions are evaluated, our method achieves top L/10
long-range accuracies of 0.62, 0.57, and 0.53 when trained on a mix of non-MPs
and MPs, on non-MPs only, and on MPs only, respectively, still much better than MetaPSICOV
(0.45) and CCMpred (0.40). All these results suggest that sequence-structure
relationship learned by our deep model from non-MPs generalizes well to MP
contact prediction. Improved contact prediction also leads to better
contact-assisted folding. Using only top predicted contacts as restraints, our
deep learning method can fold 160 and 200 of 510 MPs with TMscore>0.6 when
trained by non-MPs only and by a mix of non-MPs and MPs, respectively, while
CCMpred and MetaPSICOV can do so for only 56 and 77 MPs, respectively. Our
contact-assisted folding also greatly outperforms homology modeling.
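For reference, the top L/10 long-range accuracy reported above is typically computed as below (Python/NumPy); the sequence-separation threshold of 24 residues is the common convention and is an assumption here, as are the random stand-in matrices.

    import numpy as np

    def top_l_over_k_accuracy(pred, truth, k=10, min_sep=24):
        # Fraction of the L/k highest-scored long-range residue pairs
        # (|i - j| >= min_sep) that are true contacts.
        L = pred.shape[0]
        i, j = np.triu_indices(L, k=min_sep)
        order = np.argsort(pred[i, j])[::-1][: L // k]
        return truth[i[order], j[order]].mean()

    rng = np.random.default_rng(0)
    scores = rng.random((100, 100))            # stand-in for a predicted contact map
    contacts = rng.random((100, 100)) < 0.05   # stand-in for the native contact map
    print(top_l_over_k_accuracy(scores, contacts))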
Andrea Palazzi, Davide Abati, Simone Calderara, Francesco Solera, Rita Cucchiara
Comments: Submitted to IEEE Transactions on PAMI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this work we aim to predict the driver’s focus of attention. The goal is
to estimate what a person would pay attention to while driving, and which part
of the scene around the vehicle is more critical for the task. To this end we
propose a new computer vision model based on a multi-path deep architecture
that integrates three sources of information: visual cues, motion and scene
semantics. We also introduce DR(eye)VE, the largest dataset of driving scenes,
enriched with eye-tracking annotations and other sensors’ measurements. This
dataset features more than 500,000 registered frames, matching ego-centric
views (from glasses worn by drivers) and car-centric views (from a roof-mounted
camera). Results highlight that several attentional patterns are shared across
drivers and can be reproduced to some extent. This may benefit an ADAS system
by providing indication of which elements in the scene are likely to capture
the driver’s attention.
Hao Dong, Guang Yang, Fangde Liu, Yuanhan Mo, Yike Guo
Journal-ref: Medical Image Understanding and Analysis (MIUA) 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The system gives the error “Bad character(s) in field Abstract” for no
reason. Please refer to the manuscript for the full abstract.
Sung-Ho Bae, Mohamed Elgharib, Mohamed Hefeeda, Wojciech Matusik
Comments: 8 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Single-image-based view generation (SIVG) is important for producing 3D
stereoscopic content. Here, handling different spatial resolutions as input and
optimizing both reconstruction accuracy and processing speed is desirable.
The latest approaches are based on convolutional neural networks (CNNs) and
generate promising results. However, their use of fully connected layers and a
pre-trained VGG network forces a compromise between reconstruction accuracy and
processing speed, and restricts them to a specific input resolution. To remedy
these problems, we propose exploiting fully convolutional networks (FCNs) for
SIVG and present two FCN architectures. The first, DeepView_ren, combines an
FCN with a view-rendering network; the second, DeepView_dec, uses decoupled
networks for the luminance and chrominance signals. To train our networks, we
present a large dataset of 2M stereoscopic images. Results show that both
architectures improve accuracy and speed over the state of the art.
DeepView_ren achieves accuracy competitive with the state of the art while
being the fastest of all, with 5x faster speed and 24x lower memory consumption
than the state of the art. DeepView_dec attains much higher accuracy, while
still being 2.5x faster with 12x lower memory consumption. We evaluated our
approach with both objective and subjective studies.
Babak Ehteshami Bejnordi, Guido Zuidhof, Maschenka Balkenhol, Meyke Hermsen, Peter Bult, Bram van Ginneken, Nico Karssemeijer, Geert Litjens, Jeroen van der Laak
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Automated classification of histopathological whole-slide images (WSI) of
breast tissue requires analysis at very high resolutions with a large
contextual area. In this paper, we present context-aware stacked convolutional
neural networks (CNN) for classification of breast WSIs into normal/benign,
ductal carcinoma in situ (DCIS), and invasive ductal carcinoma (IDC). We first
train a CNN using high pixel resolution patches to capture cellular level
information. The feature responses generated by this model are then fed as
input to a second CNN, stacked on top of the first. Training of this stacked
architecture with large input patches enables learning of fine-grained
(cellular) details and global interdependence of tissue structures. Our system
is trained and evaluated on a dataset containing 221 WSIs of H&E stained breast
tissue specimens. The system achieves an AUC of 0.962 for the binary
classification of non-malignant and malignant slides and obtains a three class
accuracy of 81.3% for classification of WSIs into normal/benign, DCIS, and IDC,
demonstrating its potential for routine diagnostics.
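A minimal sketch (Python/PyTorch) of the stacked arrangement described above: a first CNN extracts cellular-level feature responses, and a second CNN consumes them over a larger context. Layer sizes and depths are illustrative assumptions, not the paper's architecture.

    import torch
    import torch.nn as nn

    class PatchCNN(nn.Module):
        # Stage 1: high-resolution patches in, cellular-level feature maps out.
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2))
        def forward(self, x):
            return self.features(x)

    class StackedCNN(nn.Module):
        # Stage 2: consumes stage-1 feature responses over a larger context.
        def __init__(self, n_classes=3):   # normal/benign, DCIS, IDC
            super().__init__()
            self.stage1 = PatchCNN()
            self.stage2 = nn.Sequential(
                nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(128, n_classes))
        def forward(self, x):
            return self.stage2(self.stage1(x))

    logits = StackedCNN()(torch.randn(2, 3, 256, 256))
    print(logits.shape)  # torch.Size([2, 3])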
Shirui Li, Alper Yilmaz, Changlin Xiao, Hua Li
Comments: It has 6 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we propose a new method to detect 4D spatiotemporal interest
points through an implicit surface; we refer to these points as 4D-ISIPs. We
represent our 3D object model with a 3D volume that stores a truncated signed
distance function (TSDF) for every voxel. The TSDF encodes the distance between
spatial points and the object surface, which is an implicit surface
representation. Our novelty is to detect points whose local neighborhood has
significant variations along both the spatial and temporal directions. We built
a system to acquire a 3D human motion dataset using only one Kinect.
Experimental results show that our method can detect 4D-ISIPs for different
human actions.
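A simplified sketch (Python) of the two ingredients: a TSDF voxel representation and an interest-point score based on local variation. Here the saliency measure is plain TSDF variance over a local 4D window, a simplification of the paper's detector; the window size and the random demo volumes are assumptions.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def tsdf(signed_dist, trunc=0.05):
        # Truncated signed distance: clamp to [-trunc, trunc] and normalize.
        return np.clip(signed_dist, -trunc, trunc) / trunc

    def interest_points(tsdf_volumes, win=3, top=5):
        # tsdf_volumes: (T, X, Y, Z) stack of TSDF grids over time. Score each
        # voxel by TSDF variance in a local 4D window, so only points varying
        # in both space and time score highly.
        v = np.stack(tsdf_volumes)
        mean = uniform_filter(v, size=win)
        var = uniform_filter(v * v, size=win) - mean ** 2
        idx = np.argsort(var.ravel())[::-1][:top]
        return np.column_stack(np.unravel_index(idx, v.shape))  # rows of (t, x, y, z)

    frames = [np.random.default_rng(t).normal(size=(16, 16, 16)) for t in range(5)]
    print(interest_points([tsdf(f) for f in frames]))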
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)
Existing methods for visual reasoning attempt to directly map inputs to
outputs using black-box architectures without explicitly modeling the
underlying reasoning processes. As a result, these black-box models often learn
to exploit biases in the data rather than learning to perform visual reasoning.
Inspired by module networks, this paper proposes a model for visual reasoning
that consists of a program generator that constructs an explicit representation
of the reasoning process to be performed, and an execution engine that executes
the resulting program to produce an answer. Both the program generator and the
execution engine are implemented by neural networks, and are trained using a
combination of backpropagation and REINFORCE. Using the CLEVR benchmark for
visual reasoning, we show that our model significantly outperforms strong
baselines and generalizes better in a variety of settings.
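The training combination mentioned above mixes ordinary backpropagation (for the differentiable execution engine) with REINFORCE (for the program generator, whose program sampling is discrete). A toy sketch (Python/PyTorch) of the REINFORCE term only, with made-up numbers and a simple mean baseline:

    import torch

    # Log-probabilities of two sampled programs and their rewards
    # (1.0 if the executed program produced the correct answer).
    log_probs = torch.tensor([-1.2, -0.7], requires_grad=True)
    rewards = torch.tensor([1.0, 0.0])
    baseline = rewards.mean()                  # simple variance-reduction baseline
    loss = -((rewards - baseline) * log_probs).sum()
    loss.backward()
    print(log_probs.grad)   # descent raises log p of above-baseline programs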
Riku Shigematsu, David Feng, Shaodi You, Nick Barnes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recently, deep Convolutional Neural Networks (CNN) have demonstrated strong
performance on RGB salient object detection. Although depth information can
help improve detection results, the exploration of CNNs for RGB-D salient
object detection remains limited. Here we propose a novel deep CNN architecture
for RGB-D salient object detection that exploits high-level, mid-level, and
low-level features. Further, we present novel depth features that capture the ideas
of background enclosure and depth contrast that are suitable for a learned
approach. We show improved results compared to state-of-the-art RGB-D salient
object detection methods. We also show that the low-level and mid-level depth
features both contribute to improvements in the results. In particular, our
method achieves an F-score of 0.848 on the RGBD1000 dataset, which is 10.7%
better than the second-best method.
Hirokatsu Kataoka, Kaori Abe, Akio Nakamura, Yutaka Satoh
Comments: CVPR 2017 Workshop Submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
The paper presents a novel concept of collaborative descriptors that combine
deeply learned and hand-crafted features. To achieve this, we apply
convolutional maps for pre-processing; namely, the convolutional maps are used
as input to the hand-crafted features. We recorded an increase in performance
of +17.06% (multi-class object recognition) and +24.71% (car detection) when
moving from grayscale input to convolutional maps. Although the framework is
straightforward, the concept should carry over to improved representations.
Vincenzo Lomonaco, Davide Maltoni
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)
Continuous/Lifelong learning of high-dimensional data streams is a
challenging research problem. In fact, fully retraining models each time new
data become available is infeasible, due to computational and storage issues,
while naïve incremental strategies have been shown to suffer from
catastrophic forgetting. In the context of real-world object recognition
applications (e.g., robotic vision), where continuous learning is crucial, very
few datasets and benchmarks are available to evaluate and compare emerging
techniques. In this work we propose CORe50, a new dataset and benchmark
specifically designed for continuous object recognition, and introduce baseline
approaches for different continuous learning scenarios.
Mahdieh Poostchi, Ali Shafiekhani, Kannappan Palaniappan, Guna Seetharaman
Comments: 5 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Weighting pixel contributions according to their location is a key feature in
many fundamental image processing tasks, including filtering, object modeling,
and distance matching. Several techniques have been proposed that incorporate
spatial information to increase the accuracy and boost the performance of
detection, tracking, and recognition systems at the cost of speed. However, it
is still not clear how to efficiently extract weighted local histograms in
constant time using an integral histogram. This paper presents a novel
algorithm to accurately compute multi-scale spatially weighted local histograms
in constant time using the Spatially Weighted Integral Histogram (SWIH) for
fast search. We applied our spatially weighted integral histogram approach to
fast tracking and obtained more accurate and robust target localization than
with a plain histogram.
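To make the constant-time idea concrete, here is a minimal sketch (Python/NumPy) of a weighted integral histogram with a fixed global weight map: per-bin cumulative sums are precomputed once, and any rectangle's weighted histogram then costs O(bins) via inclusion-exclusion. Handling weights defined relative to each window's center, the harder problem SWIH addresses, is not shown.

    import numpy as np

    def weighted_integral_histogram(img, weights, n_bins=16):
        # Per-bin 2D cumulative sums of weighted bin indicators.
        bins = np.minimum((img.astype(float) / 256 * n_bins).astype(int), n_bins - 1)
        ih = np.zeros((n_bins,) + img.shape)
        for b in range(n_bins):
            ih[b] = np.cumsum(np.cumsum(np.where(bins == b, weights, 0.0), axis=0), axis=1)
        return ih

    def rect_histogram(ih, top, left, bottom, right):
        # Inclusion-exclusion over four corners: constant time per bin,
        # independent of the rectangle's area.
        h = ih[:, bottom, right].copy()
        if top > 0:  h -= ih[:, top - 1, right]
        if left > 0: h -= ih[:, bottom, left - 1]
        if top > 0 and left > 0: h += ih[:, top - 1, left - 1]
        return h

    img = np.random.default_rng(0).integers(0, 256, size=(64, 64))
    w = np.ones((64, 64))   # e.g., replace with a Gaussian map for center-weighting
    ih = weighted_integral_histogram(img, w)
    print(rect_histogram(ih, 8, 8, 23, 23).sum())  # 256.0: 16x16 pixels, unit weights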
Andrew Knyazev, Alexander Malyshev
Comments: 5 pages, 8 figures. To appear in Proceedings of SampTA 2017: Sampling Theory and Applications, 12th International Conference, July 3-7, 2017, Tallinn, Estonia
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Information Theory (cs.IT); Numerical Analysis (cs.NA); Numerical Analysis (math.NA)
Signal reconstruction from a sample using an orthogonal projector onto a
guiding subspace is theoretically well justified, but may be difficult to
practically implement. We propose more general guiding operators, which
increase signal components in the guiding subspace relative to those in a
complementary subspace, e.g., iterative low-pass edge-preserving filters for
super-resolution of images. Two examples of super-resolution illustrate our
technology: a no-flash RGB photo guided using a high resolution flash RGB
photo, and a depth image guided using a high resolution RGB photo.
Akshay Kumar Gupta
Comments: 9 pages, 3 figures, 3 tables
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Visual question answering (or VQA) is a new and exciting problem that
combines natural language processing and computer vision techniques. We present
a survey of the various datasets and models that have been used to tackle this
task. The first part of the survey details the various datasets for VQA and
compares them along some common factors. The second part of this survey details
the different approaches for VQA, classified into four types: non-deep learning
models, deep learning models without attention, deep learning models with
attention, and other models which do not fit into the first three. Finally, we
compare the performances of these approaches and provide some directions for
future work.
Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi, Raphael Feraud
Comments: IJCAI 2017
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
We consider a novel formulation of the multi-armed bandit model, which we
call the contextual bandit with restricted context, where only a limited number
of features can be accessed by the learner at every iteration. This novel
formulation is motivated by different online problems arising in clinical
trials, recommender systems and attention modeling. Herein, we adapt the
standard multi-armed bandit algorithm known as Thompson Sampling to take
advantage of our restricted context setting, and propose two novel algorithms,
called Thompson Sampling with Restricted Context (TSRC) and Windows Thompson
Sampling with Restricted Context (WTSRC), for handling stationary and
nonstationary environments, respectively. Our empirical results demonstrate the
advantages of the proposed approaches on several real-life datasets.
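As a rough, heavily simplified illustration (Python/NumPy) of the restricted-context ingredient: at each step the learner may observe only a budgeted subset of features, chosen here by Thompson sampling from Beta posteriors over feature usefulness, before playing an arm by Thompson sampling as well. A faithful TSRC conditions the arm posteriors on the observed features; this sketch omits that and uses a synthetic payoff.

    import numpy as np

    rng = np.random.default_rng(0)
    n_arms, n_feats, budget = 3, 10, 2
    arm_succ, arm_fail = np.ones(n_arms), np.ones(n_arms)      # Beta(1,1) per arm
    feat_succ, feat_fail = np.ones(n_feats), np.ones(n_feats)  # feature usefulness

    for t in range(2000):
        # Restricted context: observe only `budget` features, picked by TS.
        feats = np.argsort(rng.beta(feat_succ, feat_fail))[-budget:]
        arm = int(np.argmax(rng.beta(arm_succ, arm_fail)))     # TS over arms
        reward = float(rng.random() < 0.3 + 0.2 * arm)         # synthetic payoff
        arm_succ[arm] += reward; arm_fail[arm] += 1 - reward
        feat_succ[feats] += reward; feat_fail[feats] += 1 - reward

    print(arm_succ / (arm_succ + arm_fail))  # posterior means favor the best arm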
Jiyuan Zhang, Yang Feng, Dong Wang, Yang Wang, Andrew Abel, Shiyue Zhang, Andi Zhang
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
It has been shown that Chinese poems can be successfully generated by
sequence-to-sequence neural models, particularly with the attention mechanism.
A potential problem of this approach, however, is that neural models can only
learn abstract rules, while poem generation is a highly creative process that
involves not only rules but also innovations for which pure statistical models
are not appropriate in principle. This work proposes a memory-augmented neural
model for Chinese poem generation, where the neural model and the augmented
memory work together to balance the requirements of linguistic accordance and
aesthetic innovation, leading to innovative generations that are still
rule-compliant. In addition, it is found that the memory mechanism provides
interesting flexibility that can be used to generate poems with different
styles.
Gagan Madan
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Relation Extraction refers to the task of populating a database with tuples
of the form r(e_1, e_2), where r is a relation and e_1, e_2 are
entities. Distant supervision is one such technique which tries to
automatically generate training examples based on an existing KB such as
Freebase. This paper is a survey of some of the techniques in distant
supervision which primarily rely on Probabilistic Graphical Models (PGMs).
Rui L. Lopes, Alípio Jorge
Comments: Part of DM4OG 2017 proceedings (arXiv:1705.03451)
Subjects: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
The main task in oil and gas exploration is to gain an understanding of the
distribution and nature of rocks and fluids in the subsurface. Well logs are
records of petro-physical data acquired along a borehole, providing direct
information about what is in the subsurface. The data collected by logging
wells can have significant economic consequences, due to the costs inherent to
drilling wells, and the potential return of oil deposits. In this paper, we
describe preliminary work aimed at building a general framework for well log
prediction.
First, we perform a descriptive and exploratory analysis of the gaps in the
neutron porosity logs of more than a thousand wells in the North Sea. Then, we
generate artificial gaps in the neutron logs that reflect the statistics
collected before. Finally, we compare Artificial Neural Networks, Random
Forests, and three Linear Regression algorithms in the prediction of missing
gaps on a well-by-well basis.
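As an indication of how such a comparison can be set up, a small sketch (Python/scikit-learn) that punches an artificial gap into a synthetic porosity log and scores two of the model families on recovering it; the synthetic log, the depth-only feature, and the gap position are all illustrative assumptions.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    depth = np.linspace(0, 1000, 2000)                      # synthetic well log
    porosity = 0.3 + 0.05 * np.sin(depth / 40) + rng.normal(0, 0.01, depth.size)

    gap = slice(900, 1000)                                  # artificial gap
    mask = np.ones(depth.size, dtype=bool)
    mask[gap] = False

    for model in (RandomForestRegressor(random_state=0), LinearRegression()):
        pred = model.fit(depth[mask, None], porosity[mask]).predict(depth[gap][:, None])
        rmse = np.sqrt(np.mean((pred - porosity[gap]) ** 2))
        print(type(model).__name__, "RMSE: %.4f" % rmse)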
Yan Li, Zhaohan Sun
Subjects: Artificial Intelligence (cs.AI)
In the most common settings of the Markov Decision Process (MDP), an agent
evaluates a policy based on the expectation of the (discounted) sum of rewards.
However, in many applications this criterion might not be suitable, for two
reasons: first, in risk-averse situations the expectation of accumulated
rewards is not robust enough, which is the case when the distribution of
accumulated reward is heavily skewed; second, many applications naturally take
several objectives into consideration when evaluating a policy. For instance,
in autonomous driving an agent needs to balance speed and safety when choosing
an appropriate decision. In this paper, we consider evaluating a policy based
on a sequence of quantiles it induces on a set of target states; our idea is to
reformulate the original problem as a multi-objective MDP problem with a
naturally defined lexicographic preference. To compute an optimal policy, we
propose an algorithm, FLMDP, that can solve general multi-objective MDPs with
lexicographic reward preferences.
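The lexicographic preference can be stated compactly: one quantile vector is preferred over another if it is larger in the first component where they differ. A tiny sketch (Python) with illustrative numbers:

    def lex_better(q_a, q_b, eps=1e-9):
        # True if quantile vector q_a is lexicographically preferred to q_b.
        for a, b in zip(q_a, q_b):
            if a > b + eps:
                return True
            if b > a + eps:
                return False
        return False  # equal under the preference

    print(lex_better([0.9, 0.7], [0.9, 0.4]))  # True: tied first, better second
    print(lex_better([0.8, 0.9], [0.9, 0.1]))  # False: first objective dominates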
Jae Young Lee, Richard S. Sutton
Comments: 20 pages, 2 figures (i.e., 6 sub-figures), 2 tables, 5 main ideal algorithms, and 1 algorithm for implementation. For a summary, a short simplified RLDM-conf. version is available at <this http URL>
Subjects: Artificial Intelligence (cs.AI); Systems and Control (cs.SY)
Policy iteration (PI) is a recursive process of policy evaluation and
improvement for solving an optimal decision-making problem, e.g., a
reinforcement learning (RL) or optimal control problem, and has served as the
foundation for developing RL methods. Motivated by integral PI (IPI) schemes in optimal control and RL
methods in continuous time and space (CTS), this paper proposes on-policy IPI
to solve the general RL problem in CTS, with its environment modeled by an
ordinary differential equation (ODE). In such continuous domain, we also
propose four off-policy IPI methods—two are the ideal PI forms that use
advantage and Q-functions, respectively, and the other two are natural
extensions of the existing off-policy IPI schemes to our general RL framework.
Compared to the IPI methods in optimal control, the proposed IPI schemes can be
applied to more general situations and do not require an initial stabilizing
policy to run; they are also strongly relevant to the RL algorithms in CTS such
as advantage updating, Q-learning, and value-gradient based (VGB) greedy policy
improvement. Our on-policy IPI is basically model-based but can be made
partially model-free; each off-policy method is also either partially or
completely model-free. The mathematical properties of the IPI
methods—admissibility, monotone improvement, and convergence towards the
optimal solution—are all rigorously proven, together with the equivalence of
on- and off-policy IPI. Finally, the IPI methods are simulated with an
inverted-pendulum model to support the theory and verify the performance.
Steven Stenberg Hansen
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)
We present a new deep meta reinforcement learner, which we call Deep Episodic
Value Iteration (DEVI). DEVI uses a deep neural network to learn a similarity
metric for a non-parametric model-based reinforcement learning algorithm. Our
model is trained end-to-end via back-propagation. Despite being trained using
the model-free Q-learning objective, we show that DEVI’s model-based internal
structure provides ‘one-shot’ transfer to changes in reward and transition
structure, even for tasks with very high-dimensional state spaces.
Ankur Bapna, Gokhan Tur, Dilek Hakkani-Tur, Larry Heck
Comments: 8 + 2 pages
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)
Conversational Language Understanding (CLU) is a key component of
goal-oriented dialogue systems that parse user utterances into semantic
frame representations. Traditionally, CLU does not utilize the dialogue
history beyond the previous system turn, and contextual ambiguities are
resolved by the downstream components. In this paper, we explore novel
approaches for modeling dialogue context in a recurrent neural network (RNN)
based language understanding system. We propose the Hierarchical Dialogue
Encoder Network, which allows encoding context from the dialogue history in
chronological order. We compare the performance of our proposed architecture
with two context models, one that uses just the previous turn context and
another that encodes dialogue context in a memory network but loses the order
of utterances in the dialogue history. Experiments with a multi-domain dialogue
dataset demonstrate that the proposed architecture results in reduced semantic
frame error rates.
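A minimal sketch (Python/PyTorch) of the hierarchical encoding idea: a token-level GRU summarizes each utterance, and a dialogue-level GRU runs over those summaries in chronological order. Sizes and the GRU choice are illustrative assumptions, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class HierarchicalDialogueEncoder(nn.Module):
        # Token-level GRU summarizes each utterance; dialogue-level GRU runs
        # over those summaries in chronological order.
        def __init__(self, vocab=1000, emb=64, hid=128):
            super().__init__()
            self.embed = nn.Embedding(vocab, emb)
            self.utt_rnn = nn.GRU(emb, hid, batch_first=True)
            self.dlg_rnn = nn.GRU(hid, hid, batch_first=True)

        def forward(self, dialogue):            # dialogue: (turns, tokens) token ids
            emb = self.embed(dialogue)          # (turns, tokens, emb)
            _, utt = self.utt_rnn(emb)          # utt: (1, turns, hid) final states
            ctx, _ = self.dlg_rnn(utt)          # treat turns as the sequence axis
            return ctx.squeeze(0)               # (turns, hid): context at each turn

    enc = HierarchicalDialogueEncoder()
    print(enc(torch.randint(0, 1000, (5, 12))).shape)  # torch.Size([5, 128])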
Matthew Lamm, Mihail Eric
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Today’s artificial assistants are typically prompted to perform tasks through
direct, imperative commands such as “Set a timer” or “Pick up the
box”. However, to progress toward more natural exchanges between humans and
these assistants, it is important to understand the way non-imperative
utterances can indirectly elicit action of an addressee. In this paper, we
investigate command types in the setting of a grounded, collaborative game. We
focus on a less understood family of utterances for eliciting agent action,
locatives like “The chair is in the other room”, and demonstrate how these
utterances indirectly command in specific game state contexts. Our work shows
that models with domain-specific grounding can effectively realize the
pragmatic reasoning that is necessary for more robust natural language
interaction with robots.
Laura Perez-Beltrachini, Claire Gardent
Subjects: Computation and Language (cs.CL)
Recently, several data-sets associating data to text have been created to
train data-to-text surface realisers. It is unclear however to what extent the
surface realisation task exercised by these data-sets is linguistically
challenging. Do these data-sets provide enough variety to encourage the
development of generic, high-quality data-to-text surface realisers? In this
paper, we argue that these data-sets have important drawbacks. We back up our
claim using statistics, metrics and manual evaluation. We conclude by eliciting
a set of criteria for the creation of a data-to-text benchmark which could help
better support the development, evaluation and comparison of linguistically
sophisticated data-to-text surface realisers.
Shantanu Kumar
Subjects: Computation and Language (cs.CL)
Relation Extraction is an important sub-task of Information Extraction which
has the potential of employing deep learning (DL) models with the creation of
large datasets using distant supervision. In this review, we compare the
contributions and pitfalls of the various DL models that have been used for the
task, to help guide the path ahead.
Ahmed Khalifa, Gabriella A. B. Barros, Julian Togelius
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
DeepTingle is a text prediction and classification system trained on the
collected works of the renowned fantastic gay erotica author Chuck Tingle.
Whereas the writing assistance tools you use every day (in the form of
predictive text, translation, grammar checking and so on) are trained on
generic, purportedly “neutral” datasets, DeepTingle is trained on a very
specific, internally consistent but externally arguably eccentric dataset. This
allows us to foreground and confront the norms embedded in data-driven
creativity and productivity assistance tools. As such tools effectively
function as extensions of our cognition into technology, it is important to
identify the norms they embed within themselves and, by extension, us.
DeepTingle is realized as a web application based on LSTM networks and the
GloVe word embedding, implemented in JavaScript with Keras-JS.
Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer
Comments: Accepted at ACL 2017
Subjects: Computation and Language (cs.CL)
We present TriviaQA, a challenging reading comprehension dataset containing
over 650K question-answer-evidence triples. TriviaQA includes 95K
question-answer pairs authored by trivia enthusiasts and independently gathered
evidence documents, six per question on average, that provide high quality
distant supervision for answering the questions. We show that, in comparison to
other recently introduced large-scale datasets, TriviaQA (1) has relatively
complex, compositional questions, (2) has considerable syntactic and lexical
variability between questions and corresponding answer-evidence sentences, and
(3) requires more cross-sentence reasoning to find answers. We also present two
baseline algorithms: a feature-based classifier and a state-of-the-art neural
network that performs well on SQuAD reading comprehension. Neither approach
comes close to human performance (23% and 40% vs. 80%), suggesting that
TriviaQA is a challenging testbed that is worth significant future study.
(Data and code are available at this http URL.)
Hamid Reza Hassanzadeh, Ying Sha, May D. Wang
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)
Multiple cause-of-death data provides a valuable source of information that
can be used to enhance health standards by predicting health related
trajectories in societies with large populations. These data are often
available in large quantities across U.S. states and require Big Data
techniques to uncover complex hidden patterns. We design two different classes
of models suitable for large-scale analysis of mortality data: a Hadoop-based
ensemble of random forests trained over N-grams, and DeepDeath, a deep
classifier based on recurrent neural networks (RNNs). We apply both classes
to the mortality data provided by the National Center for Health Statistics and
show that while both perform significantly better than the random classifier,
the deep model, which utilizes long short-term memory (LSTM) networks,
surpasses the N-gram based models and is capable of learning the temporal
aspect of the data without the need for building ad-hoc, expert-driven features.
Lantian Li, Yixiang Chen, Ying Shi, Zhiyuan Tang, Dong Wang
Comments: deep neural networks, speaker verification, speaker feature
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)
Recently, deep neural networks (DNNs) have been used to learn speaker
features. However, the quality of the learned features is not sufficiently
good, so a complex back-end model, either neural or probabilistic, has to be
used to address the residual uncertainty when they are applied to speaker
verification, just as with raw features. This paper presents a convolutional time-delay deep
neural network structure (CT-DNN) for speaker feature learning. Our
experimental results on the Fisher database demonstrated that this CT-DNN can
produce high-quality speaker features: even with a single feature (0.3 seconds
including the context), the EER can be as low as 7.68%. This effectively
confirmed that the speaker trait is largely a deterministic short-time property
rather than a long-time distributional pattern, and therefore can be extracted
from just dozens of frames.
Masahiro Kazama, Minami Sugimoto, Chizuru Hosokawa, Keisuke Matsushima, Lav R. Varshney, Yoshiki Ishikawa
Subjects: Computers and Society (cs.CY); Computation and Language (cs.CL)
We propose a novel system which can transform a recipe into any selected
regional style (e.g., Japanese, Mediterranean, or Italian). This system has
three characteristics. First, the system can identify the degree of dietary
style mixture of any selected recipe. Second, the system can visualize such
dietary style mixtures using barycentric Newton diagrams. Third, the system can
suggest ingredient substitutions through an extended word2vec model, such that
a recipe becomes more authentic for any selected dietary style. Drawing on a
large number of recipes from Yummly, an example shows how the proposed system
can transform a traditional Japanese recipe, Sukiyaki, into French style.
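The substitution step can be pictured as a word-vector analogy. A sketch (Python/gensim) under the assumption of a word2vec model pre-trained on recipe text; the file name and cuisine tokens are hypothetical, and the paper's extended word2vec differs in detail.

    from gensim.models import KeyedVectors

    # Assumed: a word2vec model pre-trained on recipe text and saved locally;
    # the file name and cuisine tokens below are hypothetical.
    vectors = KeyedVectors.load("recipe_word2vec.kv")
    # "soy_sauce is to japanese as ? is to french"
    candidates = vectors.most_similar(positive=["soy_sauce", "french"],
                                      negative=["japanese"], topn=5)
    for ingredient, score in candidates:
        print(ingredient, round(score, 3))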
Tuomo Lempiäinen, Jukka Suomela
Comments: 13 pages, 1 figure
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computational Complexity (cs.CC)
While the relationship of time and space is an established topic in
traditional centralised complexity theory, this is not the case in distributed
computing. We aim to remedy this by studying the time and space complexity of
algorithms in a weak message-passing model of distributed computing. While a
constant number of communication rounds implies a constant number of states
visited during the execution, the other direction is not clear at all. We
consider several graph families and show that indeed, there exist non-trivial
graph problems that are solvable by constant-space algorithms but that require
a non-constant running time. This provides us with a new complexity class for
distributed computing and raises interesting questions about the existence of
further combinations of time and space complexity.
Sebastian Brandt, Jara Uitto, Roger Wattenhofer
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Consider a small group of mobile agents whose goal is to locate a certain
cell in a two-dimensional infinite grid. The agents operate in an asynchronous
environment, where in each discrete time step, an arbitrary subset of the
agents execute one atomic look-compute-move cycle. The protocol controlling
each agent is determined by a (possibly distinct) finite automaton. The only
means of communication is to sense the states of the agents sharing the same
grid cell. Whenever an agent moves, the destination cell of the movement is
chosen by the agent’s automaton from the set of neighboring grid cells. We
study the minimum number of agents required to locate the target cell within
finite time and our main result states a tight lower bound for agents endowed
with a global compass. Furthermore, we show that the lack of such a compass
makes the problem strictly more difficult and present tight upper and lower
bounds for this case.
Deepthi Devaki Akkoorath, José Brandão, Annette Bieniusa, Carlos Baquero
Comments: 16 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Concurrent linearizable access to shared objects can be prohibitively
expensive in a high contention workload. Many applications apply ad-hoc
techniques to eliminate the need of synchronous atomic updates, which may
result in non-linearizable implementations. We propose a new programming model
which leverages such patterns for concurrent access to objects in a shared
memory system. In this model, each thread maintains different views on the
shared object – a thread-local view and a global view. As the thread-local view
is not shared, it can be updated without incurring synchronization costs. These
local updates become visible to other threads only after the thread-local view
is merged with the global view. This enables better performance at the expense
of linearizability. We show that it is possible to maintain thread-local views
and to perform merge efficiently for several data types and evaluate their
performance and scalability compared to linearizable implementations. Further,
we discuss the consistency semantics of the data types and the associated
programming model.
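A minimal sketch (Python) of the local/global-view pattern for the simplest data type, a counter: increments touch only the thread-local view and take no lock, and become globally visible only on merge, trading linearizability for synchronization-free updates. The class name and API are illustrative, not the paper's.

    import threading

    class MergeableCounter:
        # Thread-local view absorbs increments without synchronization;
        # merge() folds it into the global view under a lock.
        def __init__(self):
            self.global_value = 0
            self.lock = threading.Lock()
            self.local = threading.local()

        def increment(self, n=1):
            self.local.value = getattr(self.local, "value", 0) + n  # no lock taken

        def merge(self):
            with self.lock:
                self.global_value += getattr(self.local, "value", 0)
            self.local.value = 0

        def read(self):
            with self.lock:
                return self.global_value

    c = MergeableCounter()
    c.increment(); c.increment()
    print(c.read())   # 0: local updates not yet visible to readers
    c.merge()
    print(c.read())   # 2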
Wei Liu, Kai Wu, Jialin Liu, Feng Chen, Dong Li
Comments: 10 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
HPC applications pose high demands on I/O performance and storage capability.
The emerging non-volatile memory (NVM) techniques offer low-latency, high
bandwidth, and persistence for HPC applications. However, the existing I/O
stack is designed and optimized based on an assumption of disk-based storage.
To effectively use NVM, we must re-examine the existing high performance
computing (HPC) I/O sub-system to properly integrate NVM into it. With NVM as a
fast storage, the previous assumption of inferior storage performance (e.g., of
hard drives) is no longer valid. The performance problem caused by slow storage
may be mitigated, and the existing mechanisms that narrow the performance gap
between storage and CPU may be unnecessary and result in large overhead. Thus,
fully understanding the impact of introducing NVM into the HPC software stack
demands a thorough performance study.
In this paper, we analyze and model the performance of I/O intensive HPC
applications with NVM as a block device. We study the performance from three
perspectives: (1) the impact of NVM on the performance of traditional page
cache; (2) a performance comparison between MPI individual I/O and POSIX I/O;
and (3) the impact of NVM on the performance of collective I/O. We reveal the
diminishing effects of page cache, minor performance difference between MPI
individual I/O and POSIX I/O, and performance disadvantage of collective I/O on
NVM due to unnecessary data shuffling. We also model the performance of MPI
collective I/O and study the complex interaction between data shuffling,
storage performance, and I/O access patterns.
Giuseppe A. Di Luna, Paola Flocchini, Nicola Santoro, Giovanni Viglietta, Yukiko Yamauchi
Comments: 68 pages, 9 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Robotics (cs.RO)
Shape formation is a basic distributed problem for systems of computational
mobile entities. Intensively studied for systems of autonomous mobile robots,
it has recently been investigated in the realm of programmable matter. Namely,
it has been studied in the geometric Amoebot model, where the anonymous
entities, called particles, operate on a hexagonal tessellation of the plane
and have limited computational power (they have constant memory), strictly
local interaction and communication capabilities (only with particles in
neighboring nodes of the grid), and limited motorial capabilities (from a grid
node to an empty neighboring node); their activation is controlled by an
adversarial scheduler. Recent investigations have shown how, starting from a
well-structured configuration in which the particles form a (not necessarily
complete) triangle, the particles can form a large class of shapes. This result
has been established under several assumptions: agreement on the clockwise
direction (i.e., chirality), a sequential activation schedule, and
randomization (i.e., particles can flip coins).
In this paper we provide a characterization of which shapes can be formed
deterministically starting from any simply connected initial configuration of
n particles. As a byproduct, if randomization is allowed, then any input
shape can be formed from any initial (simply connected) shape by our
deterministic algorithm, provided that n is large enough.
Our algorithm works without chirality, proving that chirality is
computationally irrelevant for shape formation. Furthermore, it works under a
strong adversarial scheduler, not necessarily sequential. We also consider the
complexity of shape formation in terms of the total number of moves performed
by the particles executing a universal shape formation algorithm, and we prove
that our solution is asymptotically optimal, with O(n^2) moves.
Lixing Chen, Jie Xu
Comments: Submitted to ACM/IEEE Symposium on Edge Computing (SEC 2017)
Subjects: Computer Science and Game Theory (cs.GT); Distributed, Parallel, and Cluster Computing (cs.DC)
Small cell base stations (SBSs) endowed with cloud-like computing
capabilities are considered as a key enabler of edge computing (EC), which
provides ultra-low latency and location-awareness for a variety of emerging
mobile applications and the Internet of Things. However, due to the limited
computation resources of an individual SBS, providing computation services of
high quality to its users faces significant challenges when it is overloaded
with an excessive amount of computation workload. In this paper, we propose
collaborative edge computing among SBSs by forming SBS coalitions to share
computation resources with each other, thereby accommodating more computation
workload in the edge system and reducing reliance on the remote cloud. A novel
SBS coalition formation algorithm is developed based on the coalitional game
theory to cope with various new challenges in small-cell-based edge systems,
including the co-provisioning of radio access and computing services,
cooperation incentives, and potential security risks. To address these
challenges, the proposed method (1) allows collaboration at both the user-SBS
association stage and the SBS peer offloading stage by exploiting the ultra
dense deployment of SBSs, (2) develops a payment-based incentive mechanism that
implements proportionally fair utility division to form stable SBS coalitions,
and (3) builds a social trust network for managing security risks among SBSs
due to collaboration. Systematic simulations in practical scenarios are carried
out to evaluate the efficacy and performance of the proposed method, showing
that a tremendous improvement in edge computing performance can be achieved.
Sabrina Müller, Cem Tekin, Mihaela van der Schaar, Anja Klein
Comments: 18 pages, 6 figures
Subjects: Learning (cs.LG); Human-Computer Interaction (cs.HC); Social and Information Networks (cs.SI)
In mobile crowdsourcing, mobile users accomplish outsourced human
intelligence tasks. Mobile crowdsourcing requires an appropriate task
assignment strategy, since different workers may have different performance in
terms of acceptance rate and quality. Task assignment is challenging, since a
worker’s performance (i) may fluctuate, depending on both the worker’s current
context and the task context, and (ii) is not known a priori, but has to be
learned over time. However, learning context-specific worker performance requires
access to context information, which workers may not grant to a central entity.
Moreover, evaluating worker performance might require costly quality
assessments. In this paper, we propose a context-aware hierarchical online
learning algorithm addressing the problem of performance maximization in mobile
crowdsourcing. In our algorithm, a local controller (LC) in the mobile device
of a worker regularly observes its worker’s context, the worker’s decisions to
accept or decline tasks, and the quality of the completed tasks. Based on these
observations,
the LC regularly estimates its worker’s context-specific performance. The
mobile crowdsourcing platform (MCSP) then selects workers based on performance
estimates received from the LCs. This hierarchical approach enables the LCs to
learn context-specific worker performance and it enables the MCSP to select
suitable workers. In addition, our algorithm preserves worker context locally,
and it keeps the number of required quality assessments low. We prove that our
algorithm converges to the optimal task assignment strategy. Moreover, the
algorithm outperforms simpler task assignment strategies in experiments based
on synthetic and real data.
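The hierarchical split is concrete enough to sketch. Below is a minimal,
hypothetical Python illustration (names like LocalController and mcsp_select
are assumptions; the paper's actual algorithm additionally uses confidence
bounds and carries convergence guarantees): the LC keeps raw context
observations on the device and reports only aggregate estimates, from which
the MCSP ranks workers.

    from collections import defaultdict

    class LocalController:
        """Runs on a worker's device; raw context data never leaves it (illustrative)."""
        def __init__(self):
            self.total = defaultdict(float)  # cumulative observed performance per context
            self.count = defaultdict(int)    # number of observations per context

        def observe(self, context, performance):
            # `performance` combines task acceptance and completion quality in [0, 1].
            self.total[context] += performance
            self.count[context] += 1

        def estimate(self, context):
            # Only this aggregate is reported to the platform, not the context history.
            if self.count[context] == 0:
                return 1.0  # optimistic default encourages exploring unseen workers
            return self.total[context] / self.count[context]

    def mcsp_select(local_controllers, task_context, k):
        """The platform ranks workers by the estimates reported by their LCs."""
        ranked = sorted(
            range(len(local_controllers)),
            key=lambda i: local_controllers[i].estimate(task_context),
            reverse=True,
        )
        return ranked[:k]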
Pierre-François Marteau, Saeid Soheily-Khah, Nicolas Béchet
Comments: 24 pages, working paper
Subjects: Learning (cs.LG)
Starting from the identification of a drawback in the Isolation Forest (IF)
algorithm that limits its use in the scope of anomaly detection, we propose two
extensions that allow us, first, to overcome the previously mentioned
limitation and, second, to provide the algorithm with some supervised learning
capability. The resulting Hybrid Isolation Forest (HIF) is first evaluated on a
synthetic dataset to analyze the effect of the newly introduced meta-parameters
and to verify that the addressed limitation of the IF algorithm is effectively
overcome. We then compare the two algorithms on the ISCX benchmark dataset, in
the context of a network intrusion detection application. Our experiments show
that HIF not only outperforms IF but is also competitive with the one-class and
two-class SVM baselines while being more computationally efficient.
Hassan Ismkhan
Subjects: Learning (cs.LG)
The aim of k-means is to minimize the sum of squared Euclidean distances from
the mean (SSEDM) of each cluster. k-means can effectively optimize this
function, but it is too sensitive to the initial centers (seeds). This paper
proposes a method for initializing k-means using the concept of a useful
nearest center for each data point.
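For reference, a minimal numpy sketch (an illustration only; the paper's
"useful nearest center" seeding is not specified in the abstract) of the SSEDM
objective and of the Lloyd iteration whose final value the seeds determine:

    import numpy as np

    def ssedm(X, labels, centers):
        """Sum of squared Euclidean distances of points from their cluster mean."""
        return sum(np.sum((X[labels == j] - c) ** 2) for j, c in enumerate(centers))

    def lloyd(X, seeds, iters=100):
        """Plain Lloyd iteration; the final SSEDM depends strongly on `seeds`.
        Assumes no cluster ever becomes empty (true for reasonable seeds)."""
        centers = seeds.astype(float)
        for _ in range(iters):
            # Assign each point to its nearest center, then recompute the means.
            labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
            centers = np.array([X[labels == j].mean(axis=0) for j in range(len(centers))])
        return labels, centers

Running lloyd from two different seed sets on the same data and comparing the
resulting ssedm values makes the seed sensitivity that motivates the paper
directly visible.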
Mostafa Rahmani, George Atia
Subjects: Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
Random column sampling is not guaranteed to yield data sketches that preserve
the underlying structures of the data and may not sample sufficiently from
less-populated data clusters. Also, adaptive sampling can often provide
accurate low rank approximations, yet may fall short of producing descriptive
data sketches, especially when the cluster centers are linearly dependent.
Motivated by these observations, this paper introduces a novel randomized column sampling
tool dubbed Spatial Random Sampling (SRS), in which data points are sampled
based on their proximity to randomly sampled points on the unit sphere. The
most compelling feature of SRS is that the corresponding probability of
sampling from a given data cluster is proportional to the surface area the
cluster occupies on the unit sphere, independently of the size of the cluster
population. Although it is fully randomized, SRS is shown to provide
descriptive and balanced data representations. The proposed idea addresses a
pressing need in data science and holds potential to inspire many novel
approaches for analysis of big data.
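The sampling mechanism described above admits a direct numpy sketch (a
hedged reading of the abstract, not the authors' reference code; it assumes
no data point is the zero vector):

    import numpy as np

    def spatial_random_sampling(X, m, rng=np.random.default_rng(0)):
        """SRS sketch: project the rows of X onto the unit sphere, draw m random
        points on the sphere, and keep the data point closest to each of them."""
        U = X / np.linalg.norm(X, axis=1, keepdims=True)   # points on the sphere
        D = rng.standard_normal((m, X.shape[1]))
        D /= np.linalg.norm(D, axis=1, keepdims=True)      # m random sphere points
        # Nearest point on the sphere = largest inner product with the direction.
        idx = np.unique(np.argmax(U @ D.T, axis=0))
        return X[idx]

Because each random direction selects the nearest projected point, a cluster is
hit with probability proportional to the spherical surface area it covers,
matching the balance property claimed in the abstract.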
Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi, Raphael Feraud
Comments: IJCAI 2017
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
We consider a novel formulation of the multi-armed bandit model, which we
call the contextual bandit with restricted context, where only a limited number
of features can be accessed by the learner at every iteration. This novel
formulation is motivated by different online problems arising in clinical
trials, recommender systems and attention modeling. Herein, we adapt the
standard multi-armed bandit algorithm known as Thompson Sampling to take
advantage of our restricted context setting, and propose two novel algorithms,
called Thompson Sampling with Restricted Context (TSRC) and Windows
Thompson Sampling with Restricted Context (WTSRC), for handling stationary and
nonstationary environments, respectively. Our empirical results demonstrate the
advantages of the proposed approaches on several real-life datasets.
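As background, a minimal sketch of the Bernoulli Thompson Sampling routine
that TSRC builds on; the restricted-context mechanism (deciding which of the
limited features to observe each round) is the paper's contribution and is
omitted here:

    import numpy as np

    rng = np.random.default_rng(0)

    def thompson_sampling(pull, n_arms, horizon):
        """Bernoulli Thompson Sampling with Beta(1, 1) priors on each arm."""
        wins = np.ones(n_arms)    # Beta alpha parameters
        losses = np.ones(n_arms)  # Beta beta parameters
        for _ in range(horizon):
            theta = rng.beta(wins, losses)  # one posterior sample per arm
            arm = int(np.argmax(theta))     # play the arm with the highest sample
            reward = pull(arm)              # observe a 0/1 reward
            wins[arm] += reward
            losses[arm] += 1 - reward
        return wins, losses

TSRC would additionally select, at every iteration, the subset of context
features to query before forming the posterior sample, and WTSRC would do so
over a sliding window to track nonstationarity.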
Feng Nan, Venkatesh Saligrama
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
We point out an issue with Theorem 5 appearing in “Group-based active query
selection for rapid diagnosis in time-critical situations”. Theorem 5 bounds
the expected number of queries for a greedy algorithm to identify the class of
an item to within a constant factor of optimal. The theorem is based on the
correctness of a result on the minimization of adaptive submodular functions. We
present an example that shows that a critical step in Theorem A.11 of “Adaptive
Submodularity: Theory and Applications in Active Learning and Stochastic
Optimization” is incorrect.
Lantian Li, Yixiang Chen, Ying Shi, Zhiyuan Tang, Dong Wang
Comments: deep neural networks, speaker verification, speaker feature
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)
Recently deep neural networks (DNNs) have been used to learn speaker
features. However, the quality of the learned features is not sufficiently
good, so a complex back-end model, either neural or probabilistic, has to be
used to address the residual uncertainty when applied to speaker verification,
just as with raw features. This paper presents a convolutional time-delay deep
neural network structure (CT-DNN) for speaker feature learning. Our
experimental results on the Fisher database demonstrated that this CT-DNN can
produce high-quality speaker features: even with a single feature (0.3 seconds
including the context), the EER can be as low as 7.68%. This effectively
confirmed that the speaker trait is largely a deterministic short-time property
rather than a long-time distributional pattern, and therefore can be extracted
from just dozens of frames.
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)
Existing methods for visual reasoning attempt to directly map inputs to
outputs using black-box architectures without explicitly modeling the
underlying reasoning processes. As a result, these black-box models often learn
to exploit biases in the data rather than learning to perform visual reasoning.
Inspired by module networks, this paper proposes a model for visual reasoning
that consists of a program generator that constructs an explicit representation
of the reasoning process to be performed, and an execution engine that executes
the resulting program to produce an answer. Both the program generator and the
execution engine are implemented by neural networks, and are trained using a
combination of backpropagation and REINFORCE. Using the CLEVR benchmark for
visual reasoning, we show that our model significantly outperforms strong
baselines and generalizes better in a variety of settings.
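Since the sampled programs are discrete, gradients cannot flow from the
execution engine into the program generator directly; the standard REINFORCE
(score-function) estimator used for such non-differentiable steps is

    \[
    \nabla_\theta \, \mathbb{E}_{z \sim p_\theta(z \mid x)}\big[R(z)\big]
    = \mathbb{E}_{z \sim p_\theta(z \mid x)}\big[\big(R(z) - b\big)\,
      \nabla_\theta \log p_\theta(z \mid x)\big],
    \]

where $z$ is a sampled program for question $x$, $R(z)$ is the reward (e.g.,
answer correctness after execution), and $b$ is a baseline that reduces the
variance of the gradient estimate.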
Hirokatsu Kataoka, Kaori Abe, Akio Nakamura, Yutaka Satoh
Comments: CVPR 2017 Workshop Submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
The paper presents a novel concept of collaborative descriptors that combine
deeply learned and hand-crafted features. To realize this concept, we apply
convolutional maps as a pre-processing step, namely, the convolutional maps are
used as the input to hand-crafted feature extraction. We recorded an increase
in the performance rate of +17.06% (multi-class object recognition) and +24.71%
(car detection) when switching from grayscale input to convolutional maps.
Although the framework is straightforward, the concept can be carried over to
improve other representations.
Mohammad Javad Shafiee, Audrey G. Chung, Farzad Khalvati, Masoom A. Haider, Alexander Wong
Comments: 8 pages
Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
While lung cancer is the second most diagnosed form of cancer in men and
women, a sufficiently early diagnosis can be pivotal in patient survival rates.
Imaging-based, or radiomics-driven, detection methods have been developed to
aid diagnosticians, but largely rely on hand-crafted features which may not
fully encapsulate the differences between cancerous and healthy tissue.
Recently, the concept of discovery radiomics was introduced, where custom
abstract features are discovered from readily available imaging data. We
propose a novel evolutionary deep radiomic sequencer discovery approach based
on evolutionary deep intelligence. Motivated by patient privacy concerns and
the idea of operational artificial intelligence, the evolutionary deep radiomic
sequencer discovery approach organically evolves increasingly more efficient
deep radiomic sequencers that produce significantly more compact yet similarly
descriptive radiomic sequences over multiple generations. As a result, this
framework improves operational efficiency and enables diagnosis to be run
locally at the radiologist’s computer while maintaining detection accuracy. We
evaluated the evolved deep radiomic sequencer (EDRS) discovered via the
proposed evolutionary deep radiomic sequencer discovery framework against
state-of-the-art radiomics-driven and discovery radiomics methods using
clinical lung CT data with pathologically-proven diagnostic data. The evolved
deep radiomic sequencer shows state-of-the-art diagnostic accuracy (88.78%)
relative to previous radiomics approaches.
Steven Stenberg Hansen
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)
We present a new deep meta reinforcement learner, which we call Deep Episodic
Value Iteration (DEVI). DEVI uses a deep neural network to learn a similarity
metric for a non-parametric model-based reinforcement learning algorithm. Our
model is trained end-to-end via back-propagation. Despite being trained using
the model-free Q-learning objective, we show that DEVI’s model-based internal
structure provides “one-shot” transfer to changes in reward and transition
structure, even for tasks with very high-dimensional state spaces.
Ahmed Khalifa, Gabriella A. B. Barros, Julian Togelius
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
DeepTingle is a text prediction and classification system trained on the
collected works of the renowned fantastic gay erotica author Chuck Tingle.
Whereas the writing assistance tools you use every day (in the form of
predictive text, translation, grammar checking and so on) are trained on
generic, purportedly “neutral” datasets, DeepTingle is trained on a very
specific, internally consistent but externally arguably eccentric dataset. This
allows us to foreground and confront the norms embedded in data-driven
creativity and productivity assistance tools. As such tools effectively
function as extensions of our cognition into technology, it is important to
identify the norms they embed within themselves and, by extension, us.
DeepTingle is realized as a web application based on LSTM networks and the
GloVe word embedding, implemented in JavaScript with Keras-JS.
Hamed Zamani, W. Bruce Croft
Comments: to appear in the proceedings of The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’17)
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Learning a high-dimensional dense representation for vocabulary terms, also
known as a word embedding, has recently attracted much attention in natural
language processing and information retrieval tasks. The embedding vectors are
typically learned based on term proximity in a large corpus. This means that
the objective in well-known word embedding algorithms, e.g., word2vec, is to
accurately predict adjacent word(s) for a given word or context. However, this
objective is not necessarily equivalent to the goal of many information
retrieval (IR) tasks. The primary objective in various IR tasks is to capture
relevance instead of term proximity, syntactic, or even semantic similarity.
This is the motivation for developing unsupervised relevance-based word
embedding models that learn word representations based on query-document
relevance information. In this paper, we propose two learning models with
different objective functions; one learns a relevance distribution over the
vocabulary set for each query, and the other classifies each term as belonging
to the relevant or non-relevant class for each query. To train our models, we
used over six million unique queries and the top ranked documents retrieved in
response to each query, which are assumed to be relevant to the query. We
extrinsically evaluate our learned word representation models using two IR
tasks: query expansion and query classification. Both query expansion
experiments on four TREC collections and query classification experiments on
the KDD Cup 2005 dataset suggest that the relevance-based word embedding models
significantly outperform state-of-the-art proximity-based embedding models,
such as word2vec and GloVe.
Vincenzo Lomonaco, Davide Maltoni
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)
Continuous/Lifelong learning of high-dimensional data streams is a
challenging research problem. In fact, fully retraining models each time new
data become available is infeasible, due to computational and storage issues,
while naïve incremental strategies have been shown to suffer from
catastrophic forgetting. In the context of real-world object recognition
applications (e.g., robotic vision), where continuous learning is crucial, very
few datasets and benchmarks are available to evaluate and compare emerging
techniques. In this work we propose CORe50, a new dataset and benchmark
specifically designed for continuous object recognition, and introduce baseline
approaches for different continuous learning scenarios.
Michael T. Lash, Jason Slater, Philip M. Polgreen, Alberto M. Segre
Comments: 8 pages
Subjects: Computers and Society (cs.CY); Learning (cs.LG); Applications (stat.AP)
This large-scale study, consisting of 24.5 million hand hygiene opportunities
spanning 19 distinct facilities in 10 different states, uses linear predictive
models to expose factors that may affect hand hygiene compliance. We examine
the use of features such as temperature, relative humidity, influenza
prevalence and severity, day/night shift, federal holidays and the presence of
new residents in predicting daily hand hygiene compliance. The results suggest
that colder temperatures and federal holidays have an adverse effect on hand
hygiene compliance rates, and that individual cultures and attitudes regarding
hand hygiene seem to exist among facilities.
Hamid Reza Hassanzadeh, Ying Sha, May D. Wang
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)
Multiple cause-of-death data provides a valuable source of information that
can be used to enhance health standards by predicting health related
trajectories in societies with large populations. These data are often
available in large quantities across U.S. states and require Big Data
techniques to uncover complex hidden patterns. We design two different classes
of models suitable for large-scale analysis of mortality data: a Hadoop-based
ensemble of random forests trained over N-grams, and DeepDeath, a deep
classifier based on recurrent neural networks (RNNs). We apply both classes
to the mortality data provided by the National Center for Health Statistics and
show that while both perform significantly better than a random classifier,
the deep model, which utilizes long short-term memory networks (LSTMs),
surpasses the N-gram based models and is capable of learning the temporal
aspect of the data without the need for ad-hoc, expert-driven feature
engineering.
Honglun Zhang, Haiyang Wang, Xiaming Chen, Yongkun Wang, Yaohui Jin
Comments: 9 pages, in Chinese, 7 figures, CCFBD2016
Subjects: Computers and Society (cs.CY); Learning (cs.LG)
P2P lending has emerged as an innovative and flexible alternative to
conventional lending institutions such as banks, where lenders and borrowers
transact directly and benefit each other without complicated verifications.
However, due to the lack of specialized laws, delegated monitoring and
effective management, P2P platforms may spawn potential risks, such as
withdrawal failures, regulatory investigations and even absconding owners,
which cause great losses to lenders and are especially serious and notorious in
China. Although abundant public information and data related to P2P platforms
are available on the Internet, the challenges of multiple sources and
heterogeneity remain. In this paper, we propose a novel deep learning model,
OMNIRank, which comprehends multi-dimensional features of P2P platforms for
risk quantification and produces scores for ranking. We first construct a
large-scale flexible crawling framework and obtain large amounts of
multi-source heterogeneous data on domestic P2P platforms dating back to 2007
from the Internet. Cleaning steps such as duplicate and noise removal, null
handling, format unification and fusion are applied to improve data quality.
Then we extract deep features of P2P platforms via text comprehension, topic
modeling, knowledge graphs and sentiment analysis, which are delivered as
inputs to OMNIRank, a deep learning model for risk quantification of P2P
platforms. Finally, based on the rankings generated by OMNIRank, we build rich
data visualizations and interactions, providing lenders with comprehensive
information support, decision suggestions and safety guarantees.
Ankur Bapna, Gokhan Tur, Dilek Hakkani-Tur, Larry Heck
Comments: 8 + 2 pages
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)
Conversational Language Understanding (CLU) is a key component of
goal-oriented dialogue systems that parses user utterances into semantic frame
representations. Traditionally, CLU does not utilize the dialogue history
beyond the previous system turn, and contextual ambiguities are resolved by the
downstream components. In this paper, we explore novel approaches for modeling
dialogue context in a recurrent neural network (RNN) based language
understanding system. We propose the Hierarchical Dialogue Encoder Network,
which allows encoding context from the dialogue history in chronological order.
We compare the performance of our proposed architecture with two context
models: one that uses just the previous turn's context, and another that
encodes dialogue context in a memory network but loses the order of utterances
in the dialogue history. Experiments with a multi-domain dialogue dataset
demonstrate that the proposed architecture results in reduced semantic frame
error rates.
Zhen Li, Sheng Wang, Yizhou Yu, Jinbo Xu
Subjects: Biomolecules (q-bio.BM); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM)
Computational prediction of membrane protein (MP) structures is very
challenging partially due to lack of sufficient solved structures for homology
modeling. Recently, direct evolutionary coupling analysis (DCA) has shed some
light on protein contact prediction and, accordingly, contact-assisted folding,
but DCA is effective only on very large families since it uses information
from only a single protein family. This paper presents a deep
transfer learning method that can significantly improve MP contact prediction
by learning contact patterns and complex sequence-contact relationship from
thousands of non-membrane proteins (non-MPs). Tested on 510 non-redundant MPs,
our deep model (learned from only non-MPs) has top L/10 long-range contact
prediction accuracy 0.69, better than our deep model trained by only MPs (0.63)
and much better than a representative DCA method CCMpred (0.47) and the CASP11
winner MetaPSICOV (0.55). The accuracy of our deep model can be further
improved to 0.72 when trained by a mix of non-MPs and MPs. When only contacts
in transmembrane regions are evaluated, our method has top L/10 long-range
accuracy 0.62, 0.57, and 0.53 when trained by a mix of non-MPs and MPs, by
non-MPs only, and by MPs only, respectively, still much better than MetaPSICOV
(0.45) and CCMpred (0.40). All these results suggest that sequence-structure
relationship learned by our deep model from non-MPs generalizes well to MP
contact prediction. Improved contact prediction also leads to better
contact-assisted folding. Using only top predicted contacts as restraints, our
deep learning method can fold 160 and 200 of 510 MPs with TMscore>0.6 when
trained by non-MPs only and by a mix of non-MPs and MPs, respectively, while
CCMpred and MetaPSICOV can do so for only 56 and 77 MPs, respectively. Our
contact-assisted folding also greatly outperforms homology modeling.
Sanghamitra Dutta, Viveck Cadambe, Pulkit Grover
Comments: To appear in ISIT 2017
Subjects: Information Theory (cs.IT)
We consider the problem of computing the convolution of two long vectors
using parallel processing units in the presence of “stragglers”. Stragglers
refer to the small fraction of faulty or slow processors that delays the entire
computation in time-critical distributed systems. We first show that splitting
the vectors into smaller pieces and using a linear code to encode these pieces
provides better resilience against stragglers than replication-based schemes
under a simple, worst-case straggler analysis. We then demonstrate that under
commonly used models of computation time, coding can dramatically improve the
probability of finishing the computation within a target “deadline” time. As
opposed to the more commonly used technique of expected computation time
analysis, we quantify the exponents of the probability of failure in the limit
of large deadlines. Our exponent metric captures the probability of failing to
finish before a specified deadline time, i.e., the behavior of the “tail”.
Moreover, our technique also allows for simple closed form expressions for more
general models of computation time, e.g. shifted Weibull models instead of only
shifted exponentials. Thus, through this problem of coded convolution, we
establish the utility of a novel asymptotic failure exponent analysis for
distributed systems.
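A toy numpy sketch of the coding idea follows (an illustration under
simplifying assumptions: a random Gaussian generator matrix stands in for the
specific codes and computation-time models the paper analyzes). It exploits
the fact that convolution is linear in each block of the split vector, so the
full result can be decoded from any k finished workers:

    import numpy as np

    def coded_convolution(x, h, n_workers=5, k=3, rng=np.random.default_rng(0)):
        """Split x into k blocks, encode them across n_workers workers with a
        random linear (MDS-like) code, and decode from any k finished workers."""
        B = len(x) // k                        # block length (assumes k divides len(x))
        blocks = x.reshape(k, B)
        G = rng.standard_normal((n_workers, k))  # generator matrix, any k rows invertible a.s.
        coded = G @ blocks                       # one coded block per worker
        # Each worker convolves its coded block with h (linear in the block).
        results = np.array([np.convolve(c, h) for c in coded])
        alive = [0, 2, 4]                        # pretend workers 1 and 3 straggle
        block_convs = np.linalg.solve(G[alive], results[alive])  # recover conv(x_i, h)
        # Stitch the block convolutions together with the correct offsets.
        out = np.zeros(len(x) + len(h) - 1)
        for i, bc in enumerate(block_convs):
            out[i * B : i * B + len(bc)] += bc
        return out

    # Sanity check against the direct computation:
    x = np.arange(12.0); h = np.array([1.0, -1.0, 0.5])
    assert np.allclose(coded_convolution(x, h), np.convolve(x, h))

With replication, tolerating the same two stragglers would require duplicating
whole blocks; here any k of the n_workers responses suffice.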
Jad Hachem, Nikhil Karamchandani, Sharayu Moharir, Suhas Diggavi
Comments: 7 pages, 2 figures. A shorter version of this paper is to appear in IEEE ISIT 2017
Subjects: Information Theory (cs.IT)
We study the coded caching problem when we are allowed to match users to
caches based on their requested files. We focus on the case where caches are
divided into clusters and each user can be assigned to a unique cache from a
specific cluster. We show that neither the coded delivery strategy
(approximately optimal when the user-cache assignment is pre-fixed) nor the
uncoded replication strategy (approximately optimal when all caches belong to a
single cluster) is sufficient for all memory regimes. We propose a hybrid
solution that combines ideas from both schemes and that performs at least as
well as either strategy in most memory regimes. Finally, we show that this
hybrid strategy is approximately optimal in most memory regimes.
Zhiyong Zhou, Jun Yu
Subjects: Information Theory (cs.IT)
We study the recovery results of $\ell_p$-constrained compressive sensing
(CS) with $p \geq 1$ via the robust width property and determine conditions on
the number of measurements for Gaussian matrices under which the property holds
with high probability. Our paper extends the existing results in Cahill and
Mixon (2014) from $\ell_2$-constrained CS to the $\ell_p$-constrained case with
$p \geq 1$ and complements the recovery analysis for robust CS with the
$\ell_p$ loss function.
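In this setting, with measurements $y = \Phi x + e$ and noise bounded as
$\|e\|_p \le \epsilon$, the recovery program under study has the generic
constrained form (notation assumed here for illustration)

    \[
    \hat{x} = \arg\min_{z} \|z\|_1
    \quad \text{subject to} \quad \|\Phi z - y\|_p \le \epsilon,
    \]

and the robust width property supplies conditions under which
$\|\hat{x} - x\|_2$ is stably controlled by $\epsilon$ and by how far $x$ is
from being sparse.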
Harsh Tataria, Peter J. Smith, Michail Matthaiou, Pawel A. Dmochowski
Comments: 7 pages, 3 figures, accepted for publication in the proceedings of IEEE ICC, to be held in Paris, France, May 2017
Subjects: Information Theory (cs.IT)
Closed-form approximations to the expected per-terminal
signal-to-interference-plus-noise-ratio (SINR) and ergodic sum spectral
efficiency of a large multiuser multiple-input multiple-output system are
presented. Our analysis assumes correlated Ricean fading with maximum ratio
combining on the uplink, where the base station (BS) is equipped with a uniform
linear array (ULA) with physical size restrictions. Unlike previous studies,
our model caters for the presence of unequal correlation matrices and unequal
Rice factors for each terminal. As the number of BS antennas grows without
bound, with a finite number of terminals, we derive the limiting expected
per-terminal SINR and ergodic sum spectral efficiency of the system. Our
findings suggest that with restrictions on the size of the ULA, the expected
SINR saturates with increasing operating signal-to-noise-ratio (SNR) and BS
antennas. Whilst unequal correlation matrices result in higher performance, the
presence of strong line-of-sight (LoS) has the opposite effect. Our analysis
accommodates changes in system dimensions, SNR, LoS levels, spatial correlation
levels and variations in fixed physical spacings of the BS array.
Tsuyoshi Yoshida, Magnus Karlsson, Erik Agrell
Comments: 4 pages, 3 figures
Subjects: Information Theory (cs.IT); Optics (physics.optics)
High-throughput optical communication systems utilize binary soft-decision
forward error correction (SD-FEC) with bit interleaving over the bit channels.
The generalized mutual information (GMI) is an achievable rate in such systems
and is known to be a good predictor of the bit error rate after SD-FEC decoding
(post-FEC BER) for uniform signaling. However, for probabilistically shaped
(nonuniform) signaling, we find that the GMI, even if normalized with the
signal entropy, is less correlated with the post-FEC BER. We show that the
mutual information based on the distribution of the single bit signal, and its
asymmetric log-likelihood ratio, are better predictors of the post-FEC BER. In
simulations over the Gaussian channel, we find that the prediction accuracy,
quantified as the peak-to-peak deviation of the post-FEC BER within a set of
different modulation formats and distributions, can be improved more than 10
times compared with the normalized GMI.
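For context, one commonly used entropy-based normalization in this literature
(stated here as background, with $H$ the signal entropy and $m$ the number of
bits per symbol; it is an assumption of this note, not a formula taken from
the abstract) is

    \[
    \mathrm{NGMI} = 1 - \frac{H - \mathrm{GMI}}{m},
    \]

which reduces to $\mathrm{GMI}/m$ for uniform signaling; the paper's point is
that even such a normalized GMI correlates poorly with the post-FEC BER under
probabilistic shaping, motivating the proposed bitwise predictors.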
Oskari Tervo, Le-Nam Tran, Symeon Chatzinotas, Markku Juntti, Björn Ottersten
Comments: 5 pages, 4 figures, accepted for IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC 2017)
Subjects: Information Theory (cs.IT)
This paper studies energy-efficient joint transmit and receive beamforming in
multi-cell multi-user multiple-input multiple-output systems. We consider
the conventional network energy efficiency metric, where the users can receive
unicasting streams in addition to the group-specific common multicasting
streams which have certain rate constraints. The goal is to use the
transmission resources more efficiently to improve the energy efficiency, when
the users are equipped with multiple antennas. Numerical results show the
achieved energy efficiency gains by using the additional degrees of freedom of
the multicasting transmission to private message unicasting.
Fengyou Sun, Yuming Jiang
Subjects: Information Theory (cs.IT)
This paper presents a set of new results directly exploring the special
characteristics of the wireless channel capacity process. An appealing finding
is that, for typical fading channels, their instantaneous capacity and
cumulative capacity are both light-tailed. A direct implication of this finding
is that the cumulative capacity and subsequently the delay and backlog
performance can be upper-bounded by some exponential distributions, which is
often assumed but not justified in the wireless network performance analysis
literature. In addition, various bounds are derived for distributions of the
cumulative capacity and the delay-constrained capacity, considering three
representative dependence structures in the capacity process, namely
comonotonicity, independence, and Markovian dependence. To help gain insight
into the performance of a wireless channel whose capacity process is too
complex to analyze or for which detailed information is lacking, stochastic
orders are introduced on the capacity process; based on these, results
comparing the delay and delay-constrained capacity performance are obtained.
Moreover, the impact of
self-interference in communication, which is an open problem in stochastic
network calculus (SNC), is investigated and original results are derived. The
obtained results in this paper complement the SNC literature, easing its
application to wireless networks and its extension towards a calculus for
wireless networks.
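As a hedged notational aside (symbols assumed here, not taken from the
abstract): with per-slot instantaneous capacities $C_i$, the cumulative
capacity over the interval $(s, t]$ is

    \[
    S(s, t) = \sum_{i = s + 1}^{t} C_i,
    \]

and light-tailedness of such quantities is what permits exponential bounds of
the form $\Pr\{D > d\} \le \alpha e^{-\theta d}$, $\theta > 0$, on the delay
$D$, the assumption often made but previously unjustified in the SNC
literature.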
Giuseppe Caire
Subjects: Information Theory (cs.IT)
A well-known lower bound widely used in the massive MIMO literature hinges on
the fact that, thanks to the large number of antennas, the “useful signal
coefficient” behaves almost deterministically, i.e., it tends to be close to
its strictly positive mean value because of the massive MIMO coherent
combining. If this “channel hardening effect” does not hold, the bound degrades
significantly, to the point of becoming completely useless when the useful
signal coefficient has zero mean. Erroneously, this poor behavior of the bound
has led people to believe that “knowing the effective channel coefficient” is
critical for obtaining good performance. It has also led to the conclusion that
beamformed DL pilots are needed to learn the useful signal coefficient or, more
recently, to the proposal/rediscovery of “blind” techniques to estimate this
useful signal coefficient explicitly.
bounding technique, also very well known in the information theoretic
literature, but perhaps not so well-known in the massive MIMO literature, shows
that the unknown statistical fluctuations of the useful signal coefficient have
very small impact on the system performance, provided that the useful signal
coefficient varies slowly over time and frequency, i.e., that the coherence
block length is reasonably large. In this note, we derive and compare the two
bounds, and discuss their relative merit and applicability.
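As a hedged rendering of the first bound discussed (notation assumed here):
with effective channel coefficient $g$ and noise power $\sigma^2$, treating
$\mathbb{E}[g]$ as the known channel and the fluctuation around it as extra
noise yields a lower bound of the form

    \[
    R \ge \log_2\!\left(1 + \frac{|\mathbb{E}[g]|^2}{\mathrm{Var}(g) + \sigma^2}\right),
    \]

which is tight under channel hardening ($\mathrm{Var}(g) \ll
|\mathbb{E}[g]|^2$) but collapses to zero when $\mathbb{E}[g] = 0$, exactly
the failure mode the note attributes to it.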
Pavel Loskot
Comments: no figures, no math
Subjects: Information Theory (cs.IT)
This comment recalls a previously proposed encoding scheme involving two
synchronized random number generators (RNGs) to compress the transmission
message. It is also claimed that the recently proposed random number modulation
(RNM) scheme suffers considerably from severe error propagation, and that,
in general, the overall energy consumption is minimized when all information
bits are transmitted as fast as possible with the minimum latency.
Daniel Heinlein, Sascha Kurz
Comments: 30 pages, 3 tables
Subjects: Combinatorics (math.CO); Information Theory (cs.IT)
We study asymptotic lower and upper bounds for the sizes of constant
dimension codes with respect to the subspace or injection distance, which is
used in random linear network coding. In this context we review known upper
bounds and show relations between them. A slightly improved version of the
so-called linkage construction is presented; it is used, e.g., to construct
constant dimension codes with subspace distance $d = 4$ and codeword dimension
$k = 3$, for all field sizes $q$ and sufficiently large dimensions $v$ of the
ambient space, that exceed the MRD bound of Etzion and Silberstein for codes
containing a lifted MRD code.
Haolei Weng, Arian Maleki
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT)
The class of Lq-regularized least squares (LQLS) estimators is considered for
estimating a p-dimensional vector $\beta$ from its n noisy linear observations
$y = X\beta + w$. The performance of these schemes is studied under the
high-dimensional asymptotic setting in which p grows linearly with n. In this
asymptotic setting, phase transition diagrams (PT) are often used for comparing
the performance of different estimators. Although phase transition analysis is
shown to provide useful information for compressed sensing, the fact that it
ignores the measurement noise not only limits its applicability in many
application areas, but also may lead to misunderstandings. For instance,
consider a linear regression problem in which n > p and the signal is not
exactly sparse. If the measurement noise is ignored in such systems,
regularization techniques, such as LQLS, seem to be irrelevant since even the
ordinary least squares (OLS) returns the exact solution. However, it is
well-known that if n is not much larger than p then the regularization
techniques improve the performance of OLS. In response to this limitation of PT
analysis, we consider the low-noise sensitivity analysis. We show that this
analysis framework (i) reveals the advantage of LQLS over OLS, (ii) captures
the difference between different LQLS estimators even when n > p, and (iii)
provides a fair comparison among different estimators in high signal-to-noise
ratios. As an application of this framework, we will show that under mild
conditions LASSO outperforms other LQLS even when the signal is dense. Finally,
by a simple transformation we connect our low-noise sensitivity framework to
the classical asymptotic regime in which n/p goes to infinity and characterize
how and when regularization techniques offer improvements over ordinary least
squares, and which regularizer gives the most improvement when the sample size
is large.
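For concreteness, the LQLS family has the standard form (a hedged restatement;
$\lambda \ge 0$ is the regularization level):

    \[
    \hat{\beta}(\lambda, q) = \arg\min_{\beta \in \mathbb{R}^p} \;
    \frac{1}{2}\,\|y - X\beta\|_2^2 + \lambda \|\beta\|_q^q,
    \]

with $q = 1$ recovering the LASSO, $q = 2$ ridge regression, and $\lambda = 0$
ordinary least squares, which is the comparison point of the low-noise
sensitivity analysis above.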
Andrew Knyazev, Alexander Malyshev
Comments: 5 pages, 8 figures. To appear in Proceedings of SampTA 2017: Sampling Theory and Applications, 12th International Conference, July 3-7, 2017, Tallinn, Estonia
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Information Theory (cs.IT); Numerical Analysis (cs.NA); Numerical Analysis (math.NA)
Signal reconstruction from a sample using an orthogonal projector onto a
guiding subspace is theoretically well justified, but may be difficult to
practically implement. We propose more general guiding operators, which
increase signal components in the guiding subspace relative to those in a
complementary subspace, e.g., iterative low-pass edge-preserving filters for
super-resolution of images. Two examples of super-resolution illustrate our
technology: a no-flash RGB photo guided using a high resolution flash RGB
photo, and a depth image guided using a high resolution RGB photo.
Stuart M. Marshall, Alastair R. G. Murray, Leroy Cronin
Comments: 21 pages, 7 figures
Subjects: Other Quantitative Biology (q-bio.OT); Information Theory (cs.IT)
One thing that discriminates living things from inanimate matter is their
ability to generate similarly complex or non-random architectures in a large
abundance. From DNA sequences to folded protein structures, living cells,
microbial communities and multicellular structures, the material configurations
in biology can easily be distinguished from non-living material assemblies.
This is also true of the products of complex organisms that can themselves
construct complex tools, machines, and artefacts. Whilst these objects are not
living, they cannot randomly form, as they are the product of a biological
organism and hence are either technological or cultural biosignatures. The
problem is that it is not obvious how it might be possible to generalise an
approach that aims to evaluate complex objects as possible biosignatures.
However, if it were possible, such a self-contained approach could be useful
for exploring the cosmos for new life forms. This would require us to prove
rigorously that a given artefact is too complex to have formed by chance. In
this paper, we present a new type of complexity measure, Pathway Complexity,
that allows us not only to threshold the abiotic-biotic divide, but also to
demonstrate a probabilistic approach, based upon object abundance and
complexity, which can be used to unambiguously assign complex objects as
biosignatures. We hope that this approach not only opens up the search for
biosignatures beyond Earth, but also allows us to explore Earth for new types
of biology, and to recognize when a complex chemical system discovered in the
laboratory could be considered alive.