Raluca Necula (1), Mihaela Breaban (1), Madalina Raschip (1) ((1) Faculty of Computer Science, Alexandru Ioan Cuza University, Iasi, Romania)
Comments: 10 pages, 2 figures
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
The Dynamic Vehicle Routing Problem with Time Windows (DVRPTW) is an
extension of the well-known Vehicle Routing Problem (VRP), which takes into
account the dynamic nature of the problem. This aspect requires the vehicle
routes to be updated in an ongoing manner as new customer requests arrive in
the system and must be incorporated into an evolving schedule during the
working day. Besides the vehicle capacity constraint of the classical
VRP, DVRPTW additionally considers time windows, which better capture
real-world situations. Despite its greater practical importance, few studies
have so far focused on tackling this problem. To this end, this study devises
an ant colony optimization based algorithm for DVRPTW that relies on a joint
solution construction mechanism able to build the vehicle routes in parallel.
This method is coupled with a local search procedure, aimed at further
improving the solutions built by the ants, and with an insertion heuristic,
which tries to reduce the number of vehicles used to service the available
customers. The experiments indicate that the proposed
algorithm is competitive and effective, and on DVRPTW instances with a higher
dynamicity level, it is able to yield better results compared to existing
ant-based approaches.
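A minimal Python sketch, not the authors' code, of the kind of construction step such an ACO algorithm uses: an ant extends its route by drawing the next customer with probability proportional to pheromone^alpha times visibility^beta, restricted to capacity- and time-window-feasible candidates. All names and parameters here are illustrative; the paper's joint mechanism that grows all vehicle routes in parallel is not reproduced.

```python
import random

def select_next(current, unvisited, pheromone, dist, load, time_now,
                demand, ready, due, capacity, alpha=1.0, beta=2.0):
    """Pick the next customer for the current partial route, or None."""
    feasible = []
    for j in unvisited:
        arrival = max(time_now + dist[current][j], ready[j])  # wait if early
        if load + demand[j] <= capacity and arrival <= due[j]:
            w = (pheromone[current][j] ** alpha) * ((1.0 / dist[current][j]) ** beta)
            feasible.append((j, w))
    if not feasible:
        return None  # no feasible extension: close route, start a new vehicle
    total = sum(w for _, w in feasible)   # roulette-wheel selection
    r, acc = random.random() * total, 0.0
    for j, w in feasible:
        acc += w
        if acc >= r:
            return j
    return feasible[-1][0]
```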
Sergi Caelles, Yuhua Chen, Jordi Pont-Tuset, Luc Van Gool
Subjects: Computer Vision and Pattern Recognition (cs.CV)
This paper tackles the problem of semi-supervised video object segmentation,
that is, segmenting an object in a sequence given its mask in the first frame.
One of the main challenges in this scenario is the change of appearance of the
objects of interest. Their semantics, on the other hand, do not vary. This
paper investigates how to take advantage of such invariance via the
introduction of a semantic prior that guides the appearance model.
Specifically, given the segmentation mask of the first frame of a sequence, we
estimate the semantics of the object of interest, and propagate that knowledge
throughout the sequence to improve the results based on an appearance model. We
present Semantically-Guided Video Object Segmentation (SGV), which improves
results over previous state of the art on two different datasets using a
variety of evaluation metrics, while running in half a second per frame.
Kai Cao, Anil K. Jain
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Latent fingerprints are among the most important and widely used forms of evidence in
law enforcement and forensic agencies worldwide. Yet, NIST evaluations show
that the performance of state-of-the-art latent recognition systems is far from
satisfactory. An automated latent fingerprint recognition system with high
accuracy is essential to compare latents found at crime scenes to a large
collection of reference prints to generate a candidate list of possible mates.
In this paper, we propose an automated latent fingerprint recognition algorithm
that utilizes Convolutional Neural Networks (ConvNets) for ridge flow
estimation and minutiae descriptor extraction, and extracts complementary
templates (two minutiae templates and one texture template) to represent the
latent. The comparison scores between the latent and a reference print based on
the three templates are fused to retrieve a short candidate list from the
reference database. Experimental results show that the rank-1 identification
accuracies (query latent is matched with its true mate in the reference
database) are 64.7% for the NIST SD27 and 75.3% for the WVU latent databases,
against a reference database of 100K rolled prints. These results are the best
among published papers on latent recognition and competitive with the
performance (66.7% and 70.8% rank-1 accuracies on NIST SD27 and WVU DB,
respectively) of a leading COTS latent Automated Fingerprint Identification
System (AFIS). By score-level (rank-level) fusion of our system with the
commercial off-the-shelf (COTS) latent AFIS, the overall rank-1 identification
performance can be improved from 64.7% and 75.3% to 73.3% (74.4%) and 76.6%
(78.4%) on NIST SD27 and WVU latent databases, respectively.
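A minimal sketch of score-level fusion of the kind reported here, assuming min-max normalization followed by a weighted sum; the normalization and weighting actually used in the paper are not specified in the abstract.

```python
def minmax(scores):
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def fuse(scores_a, scores_b, w=0.5):
    """Fuse two candidate-list score vectors; higher fused score = likelier mate."""
    a, b = minmax(scores_a), minmax(scores_b)
    return [w * x + (1 - w) * y for x, y in zip(a, b)]
```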
Amal Rannen Triki, Rahaf Aljundi, Mathew B. Blaschko, Tinne Tuytelaars
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
This paper introduces a new lifelong learning solution where a single model
is trained for a sequence of tasks. The main challenge that vision systems face
in this context is catastrophic forgetting: as they tend to adapt to the most
recently seen task, they lose performance on the tasks that were learned
previously. Our method aims at preserving the knowledge of the previous tasks
while learning a new one by using autoencoders. For each task, an
under-complete autoencoder is learned, capturing the features that are crucial
for its achievement. When a new task is presented to the system, we prevent the
reconstructions of the features with these autoencoders from changing, which
has the effect of preserving the information on which the previous tasks are
mainly relying. At the same time, the features are given space to adjust to the
most recent environment as only their projection into a low dimension
submanifold is controlled. The proposed system is evaluated on image
classification tasks and shows a reduction of forgetting over the
state-of-the-art.
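A minimal PyTorch-style sketch of one plausible reading of the idea: the new-task loss is augmented with a penalty that keeps the codes produced by each past task's undercomplete autoencoder stable, so only the projection onto each task-relevant submanifold is constrained. The encoder attribute and the exact penalty form are assumptions, not the paper's objective.

```python
import torch
import torch.nn.functional as F

def lifelong_loss(new_task_loss, feats_new, feats_old, old_autoencoders, lam=1.0):
    """feats_new: shared features under the current model;
    feats_old: same inputs passed through a frozen copy of the previous model."""
    loss = new_task_loss
    for ae in old_autoencoders:              # one undercomplete AE per past task
        with torch.no_grad():
            target = ae.encoder(feats_old)   # codes that mattered for the old task
        loss = loss + lam * F.mse_loss(ae.encoder(feats_new), target)
    return loss
```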
Long-Kai Huang, Qiang Yang, Wei-Shi Zheng
Comments: To appear in IEEE Transactions on Neural Networks and Learning Systems (DOI: 10.1109/TNNLS.2017.2689242)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Although hash function learning algorithms have achieved great success in
recent years, most existing hash models are off-line, which are not suitable
for processing sequential or online data. To address this problem, this work
proposes an online hash model to accommodate data coming in stream for online
learning. Specifically, a new loss function is proposed to measure the
similarity loss between a pair of data samples in Hamming space. Then, a
structured hash model is derived and optimized in a passive-aggressive way.
Theoretical analysis on the upper bound of the cumulative loss for the proposed
online hash model is provided. Furthermore, we extend our online hashing from a
single-model to a multi-model online hashing scheme that trains and retains
multiple diverse hashing models in order to avoid biased updates. The
competitive efficiency and effectiveness of the proposed online hash models are
verified through extensive experiments on several large-scale datasets as
compared to related hashing methods.
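A minimal sketch of a passive-aggressive update for a linear hash projection W (codes = sign(Wx)): no change when the pairwise similarity loss is zero, otherwise the smallest bounded correction that removes it (PA-I style). The paper's structured loss and model are not reproduced; this only illustrates the update style.

```python
import numpy as np

def pa_update(W, grad, loss, C=1.0):
    """W: hash projection; grad: d(loss)/dW for the current pair; loss: scalar."""
    if loss <= 0.0:
        return W                         # passive: pair already well separated
    norm_sq = np.sum(grad ** 2)
    if norm_sq == 0.0:
        return W
    tau = min(C, loss / norm_sq)         # aggressive but bounded step size
    return W - tau * grad
```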
Amit Kumar, Rama Chellappa
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recently, Deep Convolution Networks (DCNNs) have been applied to the task of
face alignment and have shown potential for learning improved feature
representations. Although deeper layers can capture abstract concepts like
pose, it is difficult to capture the geometric relationships among the
keypoints in DCNNs. In this paper, we propose a novel convolution-deconvolution
network for facial keypoint detection. Our model predicts the 2D locations of
the keypoints and their individual visibility along with 3D head pose, while
exploiting the spatial relationships among different keypoints. Different from
existing approaches to modeling these relationships, we propose learnable
transform functions which capture the relationships between keypoints at the
feature level. However, due to extensive variations in pose, not all of these
relationships act at once, and hence we propose a pose-based routing function
which implicitly models the active relationships. Both transform functions and
the routing function are implemented through convolutions in a multi-task
framework. Our approach presents a single-shot keypoint detection method,
making it different from many existing cascade regression-based methods. We
also show that learning these relationships significantly improves the accuracy
of keypoint detection for in-the-wild face images from challenging datasets
such as AFW and AFLW.
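A minimal PyTorch-style sketch of how per-pair transform functions and a pose-based routing gate could look when both are realized with standard layers; the layer shapes, the sigmoid gating, and all names are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class KeypointTransforms(nn.Module):
    def __init__(self, n_keypoints, channels):
        super().__init__()
        # one learnable transform per ordered keypoint pair (kept small here)
        self.transforms = nn.ModuleDict({
            f"{i}_{j}": nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for i in range(n_keypoints) for j in range(n_keypoints) if i != j
        })
        self.router = nn.Linear(3, n_keypoints * n_keypoints)  # 3D head pose in

    def forward(self, feats, pose):
        # feats: list of per-keypoint maps [B, C, H, W]; pose: [B, 3]
        n = len(feats)
        gates = torch.sigmoid(self.router(pose)).view(-1, n, n)
        out = []
        for j in range(n):
            msg = feats[j]
            for i in range(n):
                if i != j:       # gated message from keypoint i to keypoint j
                    g = gates[:, i, j].view(-1, 1, 1, 1)
                    msg = msg + g * self.transforms[f"{i}_{j}"](feats[i])
            out.append(msg)
        return out
```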
Margret Keuper
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Most state-of-the-art motion segmentation algorithms draw their potential
from modeling motion differences of local entities such as point trajectories
in terms of pairwise potentials in graphical models. Inference in instances of
minimum cost multicut problems defined on such graphs allows optimizing the
number of the resulting segments along with the segment assignment. However,
pairwise potentials limit the discriminative power of the employed motion
models to translational differences. More complex models such as Euclidean or
affine transformations call for higher-order potentials and a tractable
inference in the resulting higher-order graphical models. In this paper, we (1)
introduce a generalization of the minimum cost lifted multicut problem to
hypergraphs, and (2) propose a simple primal feasible heuristic that allows for
a reasonably efficient inference in higher-order lifted multicut problem
instances defined on point trajectory hypergraphs for motion
segmentation. The resulting motion segmentations improve over the
state-of-the-art on the FBMS-59 dataset.
Tuan Hoang, Thanh-Toan Do, Dang-Khoa Le Tan, Ngai-Man Cheung
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
We propose a novel approach to improve unsupervised hashing. Specifically, we
propose an embedding method to enhance the discriminative property of features
before passing them into hashing. We propose a very efficient embedding method:
Gaussian Mixture Model embedding (Gemb). Gemb embeds a feature vector into a
low-dimensional vector using a Gaussian Mixture Model. Our experiments show that
the proposed method boosts the hashing performance of many state-of-the-art
methods, e.g. Binary Autoencoder (BA) and Iterative Quantization (ITQ), on
standard evaluation metrics for the three main benchmark datasets.
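A minimal scikit-learn sketch under one plausible reading of the abstract: the low-dimensional embedding of a feature vector is its vector of GMM posterior responsibilities. The paper's exact construction may differ.

```python
from sklearn.mixture import GaussianMixture

def gemb_fit(X, n_components=32, seed=0):
    """Fit the mixture on a training set of feature vectors X: (n, d)."""
    return GaussianMixture(n_components=n_components, covariance_type="diag",
                           random_state=seed).fit(X)

def gemb_embed(gmm, X):
    # embedding: responsibility of each mixture component for each sample
    return gmm.predict_proba(X)          # shape (n_samples, n_components)
```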
Aliaksandr Siarohin, Gloria Zen, Cveta Majtanovic, Xavier Alameda-Pineda, Elisa Ricci, Nicu Sebe
Comments: Accepted at ACM ICMR 2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Recent works have shown that it is possible to automatically predict
intrinsic image properties like memorability. In this paper, we take a step
forward addressing the question: “Can we make an image more memorable?”.
Methods for automatically increasing image memorability would have an impact in
many application fields like education, gaming or advertising. Our work is
inspired by the popular editing-by-applying-filters paradigm adopted in photo
editing applications, like Instagram and Prisma. In this context, the problem
of increasing image memorability maps to that of retrieving “memorabilizing”
filters or style “seeds”. Still, users generally have to go through most of the
available filters before finding the desired solution, thus turning the editing
process into a resource and time consuming task. In this work, we show that it
is possible to automatically retrieve the best style seeds for a given image,
thus remarkably reducing the number of human attempts needed to find a good
match. Our approach leverages recent advances in the field of image
synthesis and adopts a deep architecture for generating a memorable picture
from a given input image and a style seed. Importantly, to automatically select
the best style, a novel learning-based solution, also relying on deep models, is
proposed. Our experimental evaluation, conducted on publicly available
benchmarks, demonstrates the effectiveness of the proposed approach for
generating memorable images through automatic style seed selection.
Yuxin Peng, Xiangteng He, Junjie Zhao
Comments: 13 pages, submitted to IEEE Transactions on Image Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Fine-grained image classification aims to recognize hundreds of subcategories
belonging to the same basic-level category, such as 200 subcategories of bird,
and is highly challenging due to large variance within the same subcategory and
small variance among different subcategories. Existing methods generally find
where the object or its parts are and then discriminate which subcategory the
image belongs to. However, they mainly have two limitations: (1) they rely on
object or part annotations, which are heavily labor-consuming to obtain; (2) they
ignore the spatial relationships between the object and its parts as well as among these
parts, both of which are significantly helpful for finding discriminative
parts. Therefore, this paper proposes the object-part attention driven
discriminative localization (OPADDL) approach for weakly supervised
fine-grained image classification, and the main novelties are: (1) Object-part
attention model integrates two level attentions: object-level attention
localizes objects of images, and part-level attention selects discriminative
parts of object. Both are jointly employed to learn multi-view and multi-scale
features to enhance their mutual promotion. (2) Object-part spatial model
combines two spatial constraints: object spatial constraint ensures selected
parts highly representative, and part spatial constraint eliminates redundancy
and enhances discrimination of selected parts. Both are jointly employed to
exploit the subtle and local differences for distinguishing the subcategories.
Importantly, neither object nor part annotations are used, which avoids the
heavy labor cost of labeling. Compared with more than 10 state-of-the-art
methods on 3 widely used datasets, our OPADDL approach achieves the best
performance.
Weihua Chen, Xiaotang Chen, Jianguo Zhang, Kaiqi Huang
Comments: accepted to CVPR2017
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Person re-identification (ReID) is an important task in wide area video
surveillance which focuses on identifying people across different cameras.
Recently, deep learning networks with a triplet loss have become a common framework
for person ReID. However, the triplet loss pays most of its attention to obtaining
correct orders on the training set. It still suffers from a weaker
generalization capability from the training set to the testing set, thus
resulting in inferior performance. In this paper, we design a quadruplet loss,
which can lead to the model output with a larger inter-class variation and a
smaller intra-class variation compared to the triplet loss. As a result, our
model has a better generalization ability and can achieve a higher performance
on the testing set. In particular, a quadruplet deep network using a
margin-based online hard negative mining is proposed based on the quadruplet
loss for the person ReID. In extensive experiments, the proposed network
outperforms most of the state-of-the-art algorithms on representative datasets,
which clearly demonstrates the effectiveness of our proposed method.
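A minimal PyTorch-style sketch of a quadruplet loss of this kind: the standard triplet term plus a second term that pushes the anchor-positive distance below the distance between two samples from different, non-anchor identities. Margin values are assumptions.

```python
import torch.nn.functional as F

def quadruplet_loss(anchor, positive, neg1, neg2, m1=1.0, m2=0.5):
    """neg1: a negative for the anchor; neg2: a sample of a third identity."""
    d_ap = F.pairwise_distance(anchor, positive)
    d_an = F.pairwise_distance(anchor, neg1)
    d_nn = F.pairwise_distance(neg1, neg2)
    triplet_term = F.relu(d_ap - d_an + m1)   # relative: anchor vs. its negative
    quad_term = F.relu(d_ap - d_nn + m2)      # pushes intra-class below inter-class
    return (triplet_term + quad_term).mean()
```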
Jue Wang, Anoop Cherian, Fatih Porikli, Stephen Gould
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Most popular deep learning based models for action recognition are designed
to generate separate predictions within their short temporal windows, which are
often aggregated by heuristic means to assign an action label to the full video
segment. Given that not all frames from a video characterize the underlying
action, pooling schemes that impose equal importance to all frames might be
unfavorable. In an attempt to tackle this challenge, we propose a novel
pooling scheme, dubbed SVM pooling, based on the notion that among the bag of
features generated by a CNN on all temporal windows, there is at least one
feature that characterizes the action. To this end, we learn a decision
hyperplane that separates this unknown yet useful feature from the rest.
Applying multiple instance learning in an SVM setup, we use the parameters of
this separating hyperplane as a descriptor for the video. Since these
parameters are directly related to the support vectors in a max-margin
framework, they serve as robust representations for pooling of the CNN
features. We devise a joint optimization objective and an efficient solver that
learns these hyperplanes per video and the corresponding action classifiers
over the hyperplanes. Showcased experiments on the standard HMDB and UCF101
datasets demonstrate state-of-the-art performance.
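A minimal scikit-learn sketch of the core intuition: learn a hyperplane separating a video's per-window CNN features (the positive bag) from a pool of negative features, then use the hyperplane parameters as the pooled video descriptor. The paper's joint multiple-instance objective and solver are not reproduced.

```python
import numpy as np
from sklearn.svm import LinearSVC

def svm_pool(video_feats, negative_feats, C=1.0):
    """video_feats: (n_windows, d) CNN features; negative_feats: (m, d)."""
    X = np.vstack([video_feats, negative_feats])
    y = np.hstack([np.ones(len(video_feats)), -np.ones(len(negative_feats))])
    svm = LinearSVC(C=C).fit(X, y)
    # hyperplane parameters serve as the video descriptor
    return np.hstack([svm.coef_.ravel(), svm.intercept_])
```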
Swami Sankaranarayanan, Yogesh Balaji, Carlos D. Castillo, Rama Chellappa
Comments: Code will be released shortly
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Visual Domain adaptation is an actively researched problem in Computer
Vision. In this work, we propose an approach that leverages unsupervised data
to bring the source and target distributions closer in a learned joint feature
space. We accomplish this by inducing a symbiotic relationship between the
learned embedding and a generative adversarial framework. This is in contrast
to methods which use an adversarial framework for realistic data generation and
retraining deep models with such data. We show the strength and generality of
our method by performing experiments on three different tasks: (1) Digit
classification (MNIST, SVHN and USPS datasets) (2) Object recognition using
OFFICE dataset and (3) Face recognition using the Celebrity Frontal Profile
(CFP) dataset.
Cheng Ju, Aurélien Bibaut, Mark J. van der Laan
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Methodology (stat.ME)
Artificial neural networks have been successfully applied to a variety of
machine learning tasks, including image recognition, semantic segmentation, and
machine translation. However, few studies have fully investigated ensembles of
artificial neural networks. In this work, we investigated multiple widely used
ensemble methods, including unweighted averaging, majority voting, the Bayes
Optimal Classifier, and the (discrete) Super Learner, for image recognition
tasks, with deep neural networks as candidate algorithms. We designed several
experiments, with the candidate algorithms being the same network structure
with different model checkpoints within a single training process, networks
with the same structure but trained multiple times stochastically, and networks
with different structures. In addition, we further studied the over-confidence
phenomenon of the neural networks, as well as its impact on the ensemble
methods. Across all of our experiments, the Super Learner achieved the best
performance among all the ensemble methods in this study.
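A minimal sketch of the two simplest ensembles compared, unweighted probability averaging and majority voting over per-network class predictions.

```python
import numpy as np

def average_ensemble(prob_list):
    """prob_list: list of (n_samples, n_classes) softmax outputs."""
    return np.mean(prob_list, axis=0).argmax(axis=1)

def majority_vote(prob_list):
    preds = np.stack([p.argmax(axis=1) for p in prob_list])   # (n_models, n)
    n_classes = prob_list[0].shape[1]
    counts = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
    return counts.argmax(axis=0)          # most-voted class per sample
```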
Henrique Santos, Victor Dantas, Vasco Furtado, Paulo Pinheiro, Deborah L. McGuinness
Comments: Accepted at the 14th Extended Semantic Web Conference (ESWC 2017)
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
In the context of Smart Cities, indicator definitions have been used to
calculate values that enable the comparison among different cities. The
calculation of indicator values is challenging, as it may need to combine
several aspects of quality while addressing different levels of
abstraction. Knowledge graphs (KGs) have been used successfully to support
flexible representation, which can support improved understanding and data
analysis in similar settings. This paper presents an operational description
for a city KG, an indicator ontology that supports indicator discovery and data
visualization, and an application capable of performing metadata analysis to
automatically build and display dashboards according to discovered indicators.
We describe our implementation in an urban mobility setting.
Paul Thaddeus Kazibudzki
Comments: 30 pages, 11 tables, 3 figures
Subjects: Artificial Intelligence (cs.AI)
An overview of current debates and contemporary research devoted to the
modeling of decision making processes and their facilitation directs attention
to the Analytic Hierarchy Process (AHP). At the core of the AHP are various
prioritization procedures (PPs) and consistency measures (CMs) for a Pairwise
Comparison Matrix (PCM) which, in a sense, reflects preferences of decision
makers. Certainly, when judgments about these preferences are perfectly
consistent (cardinally transitive), all PPs coincide and the quality of the
priority ratios (PRs) estimation is exemplary. However, human judgments are
very rarely consistent, thus the quality of PRs estimation may significantly
vary. The scale of these variations depends on the applied PP and utilized CM
for a PCM. This is why it is important to find out which PPs and which CMs for
a PCM lead directly to an improvement of the PRs estimation accuracy. The main
goal of this research is realized through properly designed, coded and
executed simulation algorithms in Wolfram Mathematica 8.0. These results
indicate that the genuine PP and CM for a PCM that are embedded in the AHP
and commonly applied may significantly deteriorate the quality of PRs
estimation; however, the solutions proposed in this paper can significantly
improve the methodology.
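A minimal sketch of the genuine AHP prioritization procedure (the principal right eigenvector of the pairwise comparison matrix) together with Saaty's consistency index CI = (lambda_max - n)/(n - 1); the alternative PPs and CMs studied in the paper are not reproduced.

```python
import numpy as np

def ahp_priorities(pcm):
    vals, vecs = np.linalg.eig(pcm)
    k = np.argmax(vals.real)
    w = np.abs(vecs[:, k].real)
    return w / w.sum(), vals[k].real      # priority ratios, lambda_max

pcm = np.array([[1, 3, 5],
                [1/3, 1, 2],
                [1/5, 1/2, 1]])
w, lam = ahp_priorities(pcm)
ci = (lam - len(pcm)) / (len(pcm) - 1)    # 0 for a perfectly consistent PCM
```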
Henrique Santos, Vasco Furtado
Comments: In Advances in Artificial Intelligence – SBIA 2012
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Although there are increasingly more initiatives for the generation of
semantic knowledge based on user participation, there is still a shortage of
platforms for regular users to create applications on which semantic data can
be exploited and generated automatically. We propose an architecture, called
Semantic Maps (SeMaps), for assisting the authoring and hosting of applications
in which the maps combine the aggregation of a Geographic Information System
and crowd-generated content (called here crowd maps). In these systems, the
digital map works as a blackboard for accommodating stories told by people
about events they want to share with others typically participating in their
social networks. SeMaps offers an environment for the creation and maintenance
of sites based on crowd maps, with the possibility for the user to semantically
characterize what s/he intends to mark on the map. The designer of a
crowd map, by providing a linguistic expression that designates what is to be
marked on the map, is guided in a process that aims to associate a concept
from a common-sense base with this linguistic expression. Thus, the crowd maps
gain command of common-sense inferential relations that define the
meaning of the marker, and are able to make inferences over the network of
linked data. This makes it possible to generate maps that have the power to
perform inferences and access external sources (such as DBpedia) that
constitute information that is useful and appropriate to the context of the
map. In this paper we describe the architecture of SeMaps and how it was
applied in a crowd map authoring tool.
Terence Fusco, Yaxin Bi, Haiying Wang, Fiona Browne
Comments: 8 pages, 5 figures, Dragon 4 Symposium
Subjects: Artificial Intelligence (cs.AI)
The key issue pertaining to the collection of epidemic disease data for our
analysis purposes is that it is a labour-intensive, time-consuming and
expensive process, resulting in the availability of only sparse sample data from
which to develop prediction models. To address this sparse data issue, we present
novel Incremental Transductive methods to circumvent the data collection
process by applying previously acquired data to provide consistent,
confidence-based labelling alternatives to field survey research. We
investigated various reasoning approaches for semi-supervised machine learning,
including Bayesian models for labelling data. The results show that using the
proposed methods, we can label instances of data with a class of vector density
at a high level of confidence. By applying the Liberal and Strict Training
Approaches, we provide a labelling and classification alternative to standalone
algorithms. The methods in this paper are components in the process of reducing
the proliferation of the Schistosomiasis disease and its effects.
Paulo Pinheiro, Deborah L. McGuinness, Henrique Santos
Comments: In Proceedings of the 5th Workshop on Linked Science 2015 – Best Practices and the Road Ahead (LISC 2015)
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Significant efforts have been made to understand and document knowledge
related to scientific measurements. Many of those efforts resulted in one or
more high-quality ontologies that describe some aspects of scientific
measurements, but not in a comprehensive and coherently integrated manner. For
instance, we note that many of these high-quality ontologies are not properly
aligned, and more challenging, that they have different and often conflicting
concepts and approaches for encoding knowledge about empirical measurements. As
a result of this lack of an integrated view, it is often challenging for
scientists to determine whether any two scientific measurements were taken in
semantically compatible manners, thus making it difficult to decide whether
measurements should be analyzed in combination or not. In this paper, we
present the Human-Aware Sensor Network Ontology (HASNetO), a comprehensive
alignment and integration of a sensing infrastructure ontology and a provenance
ontology. HASNetO has been under development for more than one year, and has
been reviewed, shared and used by multiple scientific communities. The ontology
has been in use to support the data management of a number of large-scale
ecological monitoring activities (observations) and empirical experiments.
Henrique Santos, Vasco Furtado, Paulo Pinheiro, Deborah L. McGuinness
Comments: In Proceedings of the 6th Workshop on Semantics for Smarter Cities (S4SC 2015), Bethlehem, PA, USA, October 11-12, 2015
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
As part of Smart Cities initiatives, national, regional and local governments
all over the globe are under the mandate of being more open regarding how they
share their data. Under this mandate, many of these governments are publishing
data under the umbrella of open government data, which includes measurement
data from city-wide sensor networks. Furthermore, many of these data are
published in so-called data portals as documents that may be spreadsheets,
comma-separated value (CSV) data files, or plain documents in PDF or Word
format. The sharing of these documents may be a convenient way for the data
provider to convey and publish data but it is not the ideal way for data
consumers to reuse the data. For example, the problems of reusing the data may
range from difficulty opening a document that is provided in any format that is
not plain text, to the actual problem of understanding the meaning of each
piece of knowledge inside of the document. Our proposal tackles those
challenges by identifying metadata that has been regarded as relevant for
measurement data and providing a schema for this metadata. We further leverage
the Human-Aware Sensor Network Ontology (HASNetO) to build an architecture for
data collected in urban environments. We discuss the use of HASNetO and the
supporting infrastructure to manage both data and metadata in support of the
City of Fortaleza, a large metropolitan area in Brazil.
Guido Montufar, Johannes Rauh
Comments: 8 pages
Subjects: Artificial Intelligence (cs.AI); Optimization and Control (math.OC)
We investigate the geometry of optimal memoryless time independent decision
making in relation to the amount of information that the acting agent has about
the state of the system. We show that the expected long term reward, discounted
or per time step, is maximized by policies that randomize among at most (k)
actions whenever at most (k) world states are consistent with the agent’s
observation. Moreover, we show that the expected reward per time step can be
studied in terms of the expected discounted reward. Our main tool is a
geometric version of the policy improvement lemma, which identifies a
polyhedral cone of policy changes in which the state value function increases
for all states.
Mieczysław Kłopotek
Comments: Pre-publication version of: M.A. Kłopotek: Transferable Plausibility Model – A Probabilistic Interpretation of Mathematical Theory of Evidence. In: O. Hryniewicz, J. Kacprzyk, J. Koronacki, S. Wierzchoń (eds.): Issues in Intelligent Systems Paradigms, Akademicka Oficyna Wydawnicza EXIT, Warszawa 2005, ISBN 83-87674-90-7, pp. 107-118
Subjects: Artificial Intelligence (cs.AI)
This paper suggests a new interpretation of the Dempster-Shafer theory in
terms of probabilistic interpretation of plausibility. A new rule of
combination of independent evidence is shown and its preservation of
interpretation is demonstrated.
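For reference, a minimal sketch of Dempster's classical rule of combination over a small frame of discernment, the baseline against which a new rule is proposed; the paper's own combination rule is not reproduced here.

```python
def dempster_combine(m1, m2):
    """m1, m2: dicts mapping frozenset focal elements to masses."""
    raw, conflict = {}, 0.0
    for a, x in m1.items():
        for b, y in m2.items():
            inter = a & b
            if inter:
                raw[inter] = raw.get(inter, 0.0) + x * y
            else:
                conflict += x * y     # mass assigned to the empty intersection
    return {a: v / (1.0 - conflict) for a, v in raw.items()}

theta = frozenset({"rain", "sun"})
m1 = {frozenset({"rain"}): 0.6, theta: 0.4}
m2 = {frozenset({"sun"}): 0.5, theta: 0.5}
print(dempster_combine(m1, m2))
```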
Farhan Khawar, Nevin L. Zhang, Jinxing Yu
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
Implicit feedback is the simplest form of user feedback that can be used for
item recommendation. It is easy to collect and domain independent. However,
there is a lack of negative examples. Existing works circumvent this problem by
making various assumptions regarding the unconsumed items, which fail to hold
when the user did not consume an item because she was unaware of it. In this
paper we propose Conformative Filtering (CoF) as a novel method for addressing
the lack of negative examples in implicit feedback. The motivation is that if
there is a large group of users who share the same taste and none of them
consumed an item, then it is highly likely that the item is irrelevant to this
taste. We use Hierarchical Latent Tree Analysis (HLTA) to identify taste-based
user groups, and make recommendations for a user based on her memberships in
the groups. Experiments on real-world datasets from different domains show that
CoF has superior performance compared to other baselines and more than 10%
improvement in Recall@5 and Recall@10 is observed.
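A minimal sketch of the group-based negative inference described: within each taste-based user group, items consumed by no member are treated as negatives for all members. The HLTA step that discovers the groups is not reproduced.

```python
import numpy as np

def conformative_labels(R, groups):
    """R: binary user-item matrix; groups: list of user-index arrays.
    Returns a matrix with +1 consumed, -1 inferred negative, 0 unknown."""
    L = np.where(R > 0, 1, 0).astype(int)
    for g in groups:
        untouched = R[g].sum(axis=0) == 0    # no one in the group consumed it
        L[np.ix_(g, np.where(untouched)[0])] = -1
    return L
```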
Brian Paden, Yannik Nager, Emilio Frazzoli
Comments: 7 Pages
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
A landmark-based heuristic is investigated for reducing query phase run-time
of the probabilistic roadmap (PRM) motion planning method. The heuristic is
generated by storing minimum spanning trees from a small number of vertices
within the PRM graph and using these trees to approximate the cost of a
shortest path between any two vertices of the graph. The intermediate step of
preprocessing the graph increases the time and memory requirements of the
classical motion planning technique in exchange for speeding up individual
queries, making the method advantageous in multi-query applications. This paper
investigates these trade-offs on PRM graphs constructed in randomized
environments as well as a practical manipulator simulation. We conclude that the
method is preferable to Dijkstra’s algorithm or the (A^*) algorithm with
conventional heuristics in multi-query applications.
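A minimal sketch of one common landmark construction (ALT-style), assuming the stored trees give shortest-path costs from a few landmark vertices; the triangle inequality then bounds the cost between any two vertices, usable as an admissible A* heuristic on the PRM graph.

```python
import heapq

def dijkstra(adj, src):
    """adj: dict vertex -> list of (neighbor, edge_cost)."""
    dist, pq = {src: 0.0}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def landmark_heuristic(adj, landmarks):
    tables = [dijkstra(adj, l) for l in landmarks]
    def h(u, v):
        best = 0.0                 # |d(l,u) - d(l,v)| <= d(u,v) for each l
        for t in tables:
            if u in t and v in t:
                best = max(best, abs(t[u] - t[v]))
        return best
    return h
```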
Ioan Gabriel Bucur, Tom Claassen, Tom Heskes
Comments: 10 pages, 12 figures, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Methodology (stat.ME)
Causal effect estimation from observational data is an important and much
studied research topic. The instrumental variable (IV) and local causal
discovery (LCD) patterns are canonical examples of settings where a closed-form
expression exists for the causal effect of one variable on another, given the
presence of a third variable. Both rely on faithfulness to infer that the
latter only influences the target effect via the cause variable. In reality, it
is likely that this assumption only holds approximately and that there will be
at least some form of weak interaction. This brings about the paradoxical
situation that, in the large-sample limit, no predictions are made, as
detecting the weak edge invalidates the setting. We introduce an alternative
approach by replacing strict faithfulness with a prior that reflects the
existence of many ‘weak’ (irrelevant) and ‘strong’ interactions. We obtain a
posterior distribution over the target causal effect estimator which shows
that, in many cases, we can still make good estimates. We demonstrate the
approach in an application on a simple linear-Gaussian setting, using the
MultiNest sampling algorithm, and compare it with established techniques to
show our method is robust even when strict faithfulness is violated.
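A minimal sketch of the closed-form estimator in the instrumental-variable pattern that the paper builds on: with instrument Z, cause X and effect Y, strict faithfulness gives beta = Cov(Z, Y)/Cov(Z, X). The Bayesian treatment with a prior over weak interactions is not reproduced.

```python
import numpy as np

def iv_estimate(z, x, y):
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

rng = np.random.default_rng(0)
z = rng.normal(size=10000)
x = 2.0 * z + rng.normal(size=10000)   # instrument -> cause
y = 1.5 * x + rng.normal(size=10000)   # cause -> effect (true beta = 1.5)
print(iv_estimate(z, x, y))            # approximately 1.5
```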
Annamalai Narayanan, Mahinthan Chandramohan, Lihui Chen, Yang Liu
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
Existing Android malware detection approaches use a variety of features such
as security sensitive APIs, system calls, control-flow structures and
information flows in conjunction with Machine Learning classifiers to achieve
accurate detection. Each of these feature sets provides a unique semantic
perspective (or view) of apps’ behaviours with inherent strengths and
limitations. That is, some views are more amenable to detecting certain attacks
but may not be suitable for characterising several others. Most of the
existing malware detection approaches use only one (or a selected few) of the
aforementioned feature sets, which prevents them from detecting a vast majority
of attacks. Addressing this limitation, we propose MKLDroid, a unified
framework that systematically integrates multiple views of apps for performing
comprehensive malware detection and malicious code localisation. The rationale
is that, while a malware app can disguise itself in some views, disguising in
every view while maintaining malicious intent will be much harder.
MKLDroid uses a graph kernel to capture structural and contextual information
from apps’ dependency graphs and identify malice code patterns in each view.
Subsequently, it employs Multiple Kernel Learning (MKL) to find a weighted
combination of the views which yields the best detection accuracy. Besides
multi-view learning, MKLDroid’s unique and salient trait is its ability to
locate fine-grained malice code portions in dependency graphs (e.g.,
methods/classes). Through our large-scale experiments on several datasets
(incl. wild apps), we demonstrate that MKLDroid outperforms three
state-of-the-art techniques consistently, in terms of accuracy while
maintaining comparable efficiency. In our malicious code localisation
experiments on a dataset of repackaged malware, MKLDroid was able to identify
all the malice classes with 94% average recall.
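A minimal sketch of the final multiple kernel learning step, combining per-view Gram matrices with learned non-negative weights; a simple kernel-target-alignment heuristic stands in here for the paper's MKL solver.

```python
import numpy as np

def combine_kernels(kernels, y):
    """kernels: list of (n, n) per-view Gram matrices; y: +/-1 labels."""
    yy = np.outer(y, y)
    align = np.array([np.sum(K * yy) / (np.linalg.norm(K) * np.linalg.norm(yy))
                      for K in kernels])   # alignment with the label kernel
    w = np.clip(align, 0.0, None)
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, kernels)), w
```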
Shubham Toshniwal, Hao Tang, Liang Lu, Karen Livescu
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
End-to-end training of deep learning-based models allows for implicit
learning of intermediate representations based on the final task loss. However,
the end-to-end approach ignores the useful domain knowledge encoded in explicit
intermediate-level supervision. We hypothesize that using intermediate
representations as auxiliary supervision at lower levels of deep networks may
be a good way of combining the advantages of end-to-end training and more
traditional pipeline approaches. We present experiments on conversational
speech recognition where we use lower-level tasks, such as phoneme recognition,
in a multitask training approach with an encoder-decoder model for direct
character transcription. We compare multiple types of lower-level tasks and
analyze the effects of the auxiliary tasks. Our results on the Switchboard
corpus show that this approach improves recognition accuracy over a standard
encoder-decoder model on the Eval2000 test set.
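A minimal PyTorch-style sketch of the multitask setup described: an auxiliary phoneme head is attached to a lower encoder layer and its loss is added, with an assumed weight, to the final character-level loss.

```python
import torch
import torch.nn as nn

class MultitaskEncoder(nn.Module):
    def __init__(self, d_in, d_hid, n_phones):
        super().__init__()
        self.lower = nn.LSTM(d_in, d_hid, num_layers=2, batch_first=True)
        self.upper = nn.LSTM(d_hid, d_hid, num_layers=2, batch_first=True)
        self.phone_head = nn.Linear(d_hid, n_phones)  # auxiliary supervision

    def forward(self, x):
        low, _ = self.lower(x)      # lower layers carry the phoneme task
        high, _ = self.upper(low)   # upper layers feed the character decoder
        return high, self.phone_head(low)

def total_loss(char_loss, phone_loss, lam=0.3):
    # final-task loss plus weighted auxiliary loss; lam is an assumed value
    return char_loss + lam * phone_loss
```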
Aman Chadha, Sushmit Mallik, Ravdeep Johar
Journal-ref: International Journal of Computer Applications 52(20):35-42, 2012
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG); Multimedia (cs.MM)
The aim of a Content-Based Image Retrieval (CBIR) system, also known as Query
by Image Content (QBIC), is to help users to retrieve relevant images based on
their contents. CBIR technologies provide a method to find images in large
databases by using unique descriptors from a trained image. The image
descriptors include texture, color, intensity and shape of the object inside an
image. Several feature-extraction techniques viz., Average RGB, Color Moments,
Co-occurrence, Local Color Histogram, Global Color Histogram and Geometric
Moment have been critically compared in this paper. However, individually these
techniques result in poor performance. So, combinations of these techniques
have also been evaluated and results for the most efficient combination of
techniques have been presented and optimized for each class of image query. We
also propose an improvement in image retrieval performance by introducing the
idea of Query modification through image cropping. It enables the user to
identify a region of interest and modify the initial query to refine and
personalize the image retrieval results.
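A minimal sketch of two of the compared descriptors, average RGB and a global color histogram; bin counts and normalization are assumptions.

```python
import numpy as np

def average_rgb(img):
    """img: (H, W, 3) uint8 array -> 3-vector of mean channel values."""
    return img.reshape(-1, 3).mean(axis=0)

def global_color_histogram(img, bins=8):
    """Joint RGB histogram flattened to a bins**3 vector, L1-normalized."""
    h, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                          range=((0, 256),) * 3)
    h = h.ravel()
    return h / h.sum()
```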
Wei Lu, Qikai Cheng, Christina Lioma
Subjects: Information Retrieval (cs.IR)
TextRank is a variant of PageRank typically used in graphs that represent
documents, and where vertices denote terms and edges denote relations between
terms. Quite often the relation between terms is simple term co-occurrence
within a fixed window of k terms. The output of TextRank when applied
iteratively is a score for each vertex, i.e. a term weight, that can be used
for information retrieval (IR) just like conventional term frequency based term
weights. So far, when computing TextRank term weights over co-occurrence
graphs, the window of term co-occurrence is always fixed. This work departs
from this, and considers dynamically adjusted windows of term co-occurrence
that follow the document structure on a sentence and paragraph level. The
resulting TextRank term weights are used in a ranking function that re-ranks
1000 initially returned search results in order to improve the precision of the
ranking. Experiments with two IR collections show that adjusting the vicinity
of term co-occurrence when computing TextRank term weights can lead to gains in
early precision.
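A minimal sketch of TextRank term weighting over a co-occurrence graph with a fixed window of k terms, the baseline that the paper replaces with sentence- and paragraph-adjusted windows.

```python
import networkx as nx

def textrank_weights(terms, k=4):
    g = nx.Graph()
    for i, t in enumerate(terms):
        for u in terms[i + 1:i + k]:   # co-occurrence within a window of k terms
            if u != t:
                g.add_edge(t, u)
    return nx.pagerank(g, alpha=0.85)  # term -> TextRank weight

weights = textrank_weights("the cat sat on the mat near the cat".split())
```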
Birger Larsen, Christina Lioma, Arjen de Vries
Subjects: Information Retrieval (cs.IR)
The ECIR half-day workshop on Task-Based and Aggregated Search (TBAS) was
held in Barcelona, Spain on 1 April 2012. The program included a keynote talk
by Professor Jarvelin, six full paper presentations, two poster presentations,
and an interactive discussion among the approximately 25 participants. This
report overviews the aims and contents of the workshop and outlines the major
outcomes.
Christina Lioma, Roi Blanco
Subjects: Information Retrieval (cs.IR)
Automatic language processing tools typically assign to terms so-called
weights corresponding to the contribution of terms to information content.
Traditionally, term weights are computed from lexical statistics, e.g., term
frequencies. We propose a new type of term weight that is computed from part of
speech (POS) n-gram statistics. The proposed POS-based term weight represents
how informative a term is in general, based on the POS contexts in which it
generally occurs in language. We suggest five different computations of
POS-based term weights by extending existing statistical approximations of term
information measures. We apply these POS-based term weights to information
retrieval, by integrating them into the model that matches documents to
queries. Experiments with two TREC collections and 300 queries, using TF-IDF &
BM25 as baselines, show that integrating our POS-based term weights to
retrieval always leads to gains (up to +33.7% from the baseline). Additional
experiments with a different retrieval model as baseline (Language Model with
Dirichlet priors smoothing) and our best performing POS-based term weight, show
retrieval gains always and consistently across the whole smoothing range of the
baseline.
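A minimal sketch of how an external term-informativeness weight can be folded into a TF-IDF-style ranking function; the five POS-based estimators themselves are abstracted into a lookup table here.

```python
import math

def score(query_terms, doc_tf, doc_len, df, n_docs, pos_weight):
    """doc_tf: term -> count in the document; df: term -> document frequency;
    pos_weight: term -> POS-based informativeness weight (1.0 if unknown)."""
    s = 0.0
    for t in query_terms:
        if t in doc_tf:
            idf = math.log(n_docs / (1 + df.get(t, 0)))
            s += (doc_tf[t] / doc_len) * idf * pos_weight.get(t, 1.0)
    return s
```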
Christina Lioma, Birger Larsen, Hinrich Schütze, Peter Ingwersen
Subjects: Information Retrieval (cs.IR)
Interactive Information Retrieval refers to the branch of Information
Retrieval that considers the retrieval process with respect to a wide range of
contexts, which may affect the user’s information seeking experience. The
identification and representation of such contexts has been the object of the
principle of Polyrepresentation, a theoretical framework for reasoning about
different representations arising from interactive information retrieval in a
given context. Although the principle of Polyrepresentation has received
attention from many researchers, not much empirical work has been done based on
it. One reason may be that it has not yet been formalised mathematically. In
this paper we propose an up-to-date and exible mathematical formalisation of
the principle of Polyrepresentation for information needs. Specifically, we
apply Subjective Logic to model different representations of information needs
as beliefs marked by degrees of uncertainty. We combine such beliefs using
different logical operators, and we discuss these combinations with respect to
different retrieval scenarios and situations. A formal model is introduced and
discussed, with illustrative applications to the modelling of information
needs.
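A minimal sketch of Subjective Logic's cumulative fusion of two opinions (belief, disbelief, uncertainty), one operator that can merge different representations of an information need; the dogmatic case (both uncertainties zero) is simplified to an average here.

```python
def cumulative_fusion(o1, o2):
    b1, d1, u1 = o1
    b2, d2, u2 = o2
    k = u1 + u2 - u1 * u2
    if k == 0.0:                 # both opinions dogmatic; simplified handling
        return ((b1 + b2) / 2, (d1 + d2) / 2, 0.0)
    return ((b1 * u2 + b2 * u1) / k, (d1 * u2 + d2 * u1) / k, (u1 * u2) / k)

print(cumulative_fusion((0.7, 0.1, 0.2), (0.5, 0.2, 0.3)))
```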
Christina Lioma, Birger Larsen, Peter Ingwersen
Subjects: Information Retrieval (cs.IR)
According to the principle of polyrepresentation, retrieval accuracy may
improve through the combination of multiple and diverse information object
representations about e.g. the context of the user, the information sought, or
the retrieval system. Recently, the principle of polyrepresentation was
mathematically expressed using subjective logic, where the potential
suitability of each representation for improving retrieval performance was
formalised through degrees of belief and uncertainty. No experimental evidence
or practical application has so far validated this model. We extend the work of
Lioma et al. (2010) by providing a practical application and analysis of the
model. We show how to map the abstract notions of belief and uncertainty to
real-life evidence drawn from a retrieval dataset. We also show how to estimate
two different types of polyrepresentation assuming either (a) independence or
(b) dependence between the information objects that are combined. We focus on
the polyrepresentation of different types of context relating to user
information needs (i.e. work task, user background knowledge, ideal answer) and
show that the subjective logic model can predict their optimal combination
prior and independently to the retrieval process.
Christina Lioma, Birger Larsen, Wei Lu
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Typically, every part in most coherent text has some plausible reason for its
presence, some function that it performs to the overall semantics of the text.
Rhetorical relations, e.g. contrast, cause, explanation, describe how the parts
of a text are linked to each other. Knowledge about this so-called discourse
structure has been applied successfully to several natural language processing
tasks. This work studies the use of rhetorical relations for Information
Retrieval (IR): Is there a correlation between certain rhetorical relations and
retrieval performance? Can knowledge about a document’s rhetorical relations be
useful to IR? We present a language model modification that considers
rhetorical relations when estimating the relevance of a document to a query.
Empirical evaluation of different versions of our model on TREC settings shows
that certain rhetorical relations can benefit retrieval effectiveness notably
(> 10% in mean average precision over a state-of-the-art baseline).
Oded Avraham, Yoav Goldberg
Subjects: Computation and Language (cs.CL)
We explore the ability of word embeddings to capture both semantic and
morphological similarity, as affected by the different types of linguistic
properties (surface form, lemma, morphological tag) used to compose the
representation of each word. We train several models, where each uses a
different subset of these properties to compose its representations. By
evaluating the models on semantic and morphological measures, we reveal some
useful insights on the relationship between semantics and morphology.
Qingyu Zhou, Nan Yang, Furu Wei, Chuanqi Tan, Hangbo Bao, Ming Zhou
Comments: Submitted to EMNLP 2017
Subjects: Computation and Language (cs.CL)
Automatic question generation aims to generate questions from a text passage
where the generated questions can be answered by certain sub-spans of the given
passage. Traditional methods mainly use rigid heuristic rules to transform a
sentence into related questions. In this work, we propose to apply the neural
encoder-decoder model to generate meaningful and diverse questions from natural
language sentences. The encoder reads the input text and the answer position,
to produce an answer-aware input representation, which is fed to the decoder to
generate an answer focused question. We conduct a preliminary study on neural
question generation from text with the SQuAD dataset, and the experiment
results show that our method can produce fluent and diverse questions.
Luís Campos, Francisco Couto
Comments: 5 pages, 2 figures
Subjects: Computation and Language (cs.CL)
MRA (Multilingual Report Annotator) is a web application that translates
Radiology text and annotates it with RadLex terms. Its goal is to explore
translation of non-English Radiology reports as a way around the problem that
most Text Mining tools are developed for English. In this brief paper we
explain the language barrier problem and briefly describe the application. MRA
can be found at this https URL
Pengcheng Yin, Graham Neubig
Comments: To appear in ACL 2017
Subjects: Computation and Language (cs.CL); Programming Languages (cs.PL); Software Engineering (cs.SE)
We consider the problem of parsing natural language descriptions into source
code written in a general-purpose programming language like Python. Existing
data-driven methods treat this problem as a language generation task without
considering the underlying syntax of the target programming language. Informed
by previous work in semantic parsing, in this paper we propose a novel neural
architecture powered by a grammar model to explicitly capture the target syntax
as prior knowledge. Experiments find this an effective way to scale up to
generation of complex programs from natural language descriptions, achieving
state-of-the-art results that well outperform previous code generation and
semantic parsing approaches.
Chunting Zhou, Graham Neubig
Comments: Accepted by ACL 2017
Subjects: Computation and Language (cs.CL)
Labeled sequence transduction is a task of transforming one sequence into
another sequence that satisfies desiderata specified by a set of labels. In
this paper we propose multi-space variational encoder-decoders, a new model for
labeled sequence transduction with semi-supervised learning. The generative
model can use neural networks to handle both discrete and continuous latent
variables to exploit various features of data. Experiments show that our model
provides not only a powerful supervised framework but also can effectively take
advantage of the unlabeled data. On the SIGMORPHON morphological inflection
benchmark, our model outperforms single-model state-of-the-art results by a large
margin for the majority of languages.
Yaniv Sheena, Míša Hejná, Yossi Adi, Joseph Keshet
Subjects: Computation and Language (cs.CL)
Pre-aspiration is defined as the period of glottal friction occurring in
sequences of vocalic/consonantal sonorants and phonetically voiceless
obstruents. In this paper, we propose two machine learning methods for
automatic measurement of pre-aspiration duration: feedforward neural network,
which works at the frame level; and structured prediction model, which relies
on manually designed feature functions, and works at the segment level. The
input for both algorithms is a speech signal of an arbitrary length containing
a single obstruent, and the output is a pair of times which constitutes the
pre-aspiration boundaries. We train both models on a set of manually annotated
examples. Results suggest that the structured model is superior to the
frame-based model as it yields higher accuracy in predicting the boundaries and
generalizes to new speakers and new languages. Finally, we demonstrate the
applicability of our structured prediction algorithm by replicating linguistic
analysis of pre-aspiration in Aberystwyth English with high correlation.
Shubham Toshniwal, Hao Tang, Liang Lu, Karen Livescu
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
End-to-end training of deep learning-based models allows for implicit
learning of intermediate representations based on the final task loss. However,
the end-to-end approach ignores the useful domain knowledge encoded in explicit
intermediate-level supervision. We hypothesize that using intermediate
representations as auxiliary supervision at lower levels of deep networks may
be a good way of combining the advantages of end-to-end training and more
traditional pipeline approaches. We present experiments on conversational
speech recognition where we use lower-level tasks, such as phoneme recognition,
in a multitask training approach with an encoder-decoder model for direct
character transcription. We compare multiple types of lower-level tasks and
analyze the effects of the auxiliary tasks. Our results on the Switchboard
corpus show that this approach improves recognition accuracy over a standard
encoder-decoder model on the Eval2000 test set.
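One plausible form of such a multitask objective, sketched under the assumption of a single auxiliary phoneme-recognition loss attached to a lower encoder layer with weight $\lambda$ (the paper's exact losses and weighting may differ):

```latex
% Combined training objective: final-task character loss plus a weighted
% auxiliary loss computed from a lower encoder layer (lambda is assumed).
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\mathrm{char}}(\theta)
\;+\; \lambda\, \mathcal{L}_{\mathrm{phoneme}}(\theta)
```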
Christina Lioma, Birger Larsen, Wei Lu
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Typically, every part of a coherent text has some plausible reason for its
presence, some function that it performs for the overall semantics of the text.
Rhetorical relations, e.g. contrast, cause, explanation, describe how the parts
of a text are linked to each other. Knowledge about this so-called discourse
structure has been applied successfully to several natural language processing
tasks. This work studies the use of rhetorical relations for Information
Retrieval (IR): Is there a correlation between certain rhetorical relations and
retrieval performance? Can knowledge about a document’s rhetorical relations be
useful to IR? We present a language model modification that considers
rhetorical relations when estimating the relevance of a document to a query.
Empirical evaluation of different versions of our model on TREC settings shows
that certain rhetorical relations can benefit retrieval effectiveness notably
(> 10% in mean average precision over a state-of-the-art baseline).
Barun Gorain, Andrzej Pelc
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
We consider the problem of topology recognition in wireless (radio) networks
modeled as undirected graphs. Topology recognition is a fundamental task in
which every node of the network has to output a map of the underlying graph,
i.e., an isomorphic copy of it, and situate itself in this map. In wireless
networks, nodes communicate in synchronous rounds. In each round a node can
either transmit a message to all its neighbors, or stay silent and listen. At
the receiving end, a node $v$ hears a message from a neighbor $w$ in a given
round, if $v$ listens in this round, and if $w$ is its only neighbor that
transmits in this round. Nodes have labels which are (not necessarily
different) binary strings. The length of a labeling scheme is the largest
length of a label. We concentrate on wireless networks modeled by trees, and we
investigate two problems:
- What is the shortest labeling scheme that permits topology recognition in
  all wireless tree networks of diameter $D$ and maximum degree $\Delta$?
- What is the fastest topology recognition algorithm working for all wireless
  tree networks of diameter $D$ and maximum degree $\Delta$, using such a
  short labeling scheme?
We are interested in deterministic topology recognition algorithms. For the
first problem, we show that the minimum length of a labeling scheme allowing
topology recognition in all trees of maximum degree $\Delta \geq 3$ is
$\Theta(\log\log \Delta)$. For such short schemes, used by an algorithm working
for the class of trees of diameter $D \geq 4$ and maximum degree $\Delta \geq 3$,
we show almost matching bounds on the time of topology recognition: an
upper bound $O(D\Delta)$, and a lower bound $\Omega(D\Delta^{\epsilon})$, for
any constant $\epsilon < 1$.
Anthony Gregerson, Aman Chadha, Katherine Morrow
Comments: International Conference on Field-Programmable Technology (ICFPT), Kyoto Research Park, Japan, Dec. 9-11, 2013. hardware design; hardware architecture; cad; computer aided design; IC design; integrated circuit design; partitioning algorithms; field programmable gate arrays; benchmark testing; heuristic algorithms; resource management; dynamic scheduling; digital signal processing
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
Design flows use graph partitioning both as a precursor to place and route
for single devices, and to divide netlists or task graphs among multiple
devices. Partitioners have accommodated FPGA heterogeneity via multi-resource
constraints, but have not yet exploited the corresponding ability to implement
some computations in multiple ways (e.g., LUTs vs. DSP blocks), which could
enable a superior solution. This paper introduces multi-personality graph
partitioning, which incorporates aspects of resource mapping into partitioning.
We present a modified multi-level KLFM partitioning algorithm that also
performs heterogeneous resource mapping for nodes with multiple potential
implementations (multiple personalities). We evaluate several variants of our
multi-personality FPGA circuit partitioner using 21 circuits and benchmark
graphs, and show that dynamic resource mapping improves cut size on average by
27% over static mapping for these circuits. We further show that it improves
deviation from target resource utilizations by 50% over post-partitioning
resource mapping.
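A toy sketch of what "multiple personalities" can look like inside a partitioner: each node carries several candidate resource vectors, and the mapping choice is revisited whenever the node is (re)assigned. The resource names, the two implementations, and the fit heuristic are assumptions for illustration, not the paper's KLFM gain function.

```python
# Each node offers several implementations (resource vectors); when assigning
# a node to a partition, pick the personality that best fits the partition's
# remaining capacities. All names and the heuristic are illustrative.
PERSONALITIES = {
    "mac0": [{"LUT": 40, "DSP": 0},   # all-LUT implementation
             {"LUT": 4, "DSP": 1}],   # DSP-block implementation
}

def best_personality(node, remaining):
    feasible = [p for p in PERSONALITIES[node]
                if all(remaining[r] >= c for r, c in p.items())]
    # Prefer the implementation that stresses the scarcest resource least.
    return min(feasible,
               key=lambda p: max(c / max(remaining[r], 1) for r, c in p.items()))

remaining = {"LUT": 100, "DSP": 1}
print(best_personality("mac0", remaining))  # {'LUT': 40, 'DSP': 0}: saves the DSP
```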
Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum
Comments: 20 pages. Code available here: this https URL
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Many modern clustering methods scale well to a large number of data items, $N$,
but not to a large number of clusters, $K$. This paper introduces PERCH, a new
non-greedy algorithm for online hierarchical clustering that scales to both
massive $N$ and $K$, a problem setting we term extreme clustering. Our algorithm
efficiently routes new data points to the leaves of an incrementally-built
tree. Motivated by the desire for both accuracy and speed, our approach
performs tree rotations for the sake of enhancing subtree purity and
encouraging balancedness. We prove that, under a natural separability
assumption, our non-greedy algorithm will produce trees with perfect dendrogram
purity regardless of online data arrival order. Our experiments demonstrate
that PERCH constructs more accurate trees than other tree-building clustering
algorithms and scales well with both $N$ and $K$, achieving a higher quality
clustering than the strongest flat clustering competitor in nearly half the
time.
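A minimal sketch of the incremental tree building, under assumed simplifications: exact nearest-leaf search by scanning all leaves, and none of the bounding-box approximations, rotations, or balance operations that make PERCH fast and pure.

```python
import numpy as np

# Each arriving point is routed to its nearest leaf, which is then split into
# an internal node holding the old point and the new one. This is only the
# incremental skeleton; PERCH adds bounding-box routing and tree rotations.
class Node:
    def __init__(self, point=None, left=None, right=None):
        self.point, self.left, self.right = point, left, right

    def is_leaf(self):
        return self.left is None

def leaves(node):
    if node.is_leaf():
        return [node.point]
    return leaves(node.left) + leaves(node.right)

def insert(root, x):
    if root is None:
        return Node(point=x)
    if root.is_leaf():                      # split the leaf
        return Node(left=Node(point=root.point), right=Node(point=x))
    # descend toward the child whose closest stored point is nearer to x
    child_is_left = (min(np.linalg.norm(x - p) for p in leaves(root.left)) <=
                     min(np.linalg.norm(x - p) for p in leaves(root.right)))
    if child_is_left:
        root.left = insert(root.left, x)
    else:
        root.right = insert(root.right, x)
    return root

tree = None
for x in np.random.RandomState(0).randn(20, 2):
    tree = insert(tree, x)
print(len(leaves(tree)))  # 20
```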
Hanjun Dai, Elias B. Khalil, Yuyu Zhang, Bistra Dilkina, Le Song
Comments: 26 pages
Subjects: Learning (cs.LG)
Many combinatorial optimization problems over graphs are NP-hard, and require
significant specialized knowledge and trial-and-error to design good heuristics
or approximation algorithms. Can we automate this challenging and tedious
process, and learn the algorithms instead? In many real world applications, it
is typically the case that the same type of optimization problem is solved
again and again on a regular basis, maintaining the same problem structure but
differing in the data. This provides an opportunity for learning heuristic
algorithms which can exploit the structure of such recurring problems. In this
paper, we propose a unique combination of reinforcement learning and graph
embedding to address this challenge. The learned greedy policy behaves like a
meta-algorithm which incrementally constructs a solution, and the action is
determined by the output of a graph embedding network capturing the current
state of the solution. We show that our framework can be applied to a diverse
range of optimization problems over graphs, and provide evidence that our
learning approach can compete with or outperform specialized heuristics or
approximation algorithms for the Minimum Vertex Cover, Maximum Cut and
Traveling Salesman Problems.
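A stub-level sketch of the greedy meta-algorithm for one example problem (Minimum Vertex Cover): nodes are added one at a time according to a score function standing in for the learned graph-embedding network. The scoring stub below is an assumption for illustration only.

```python
import random

# Greedy meta-algorithm: repeatedly add the node with the highest estimated
# score Q(state, v) until the partial solution is feasible (here: a vertex
# cover). q_hat is a placeholder for the trained embedding network.
def q_hat(graph, solution, v):
    # A trained model would embed the graph and partial solution; this stub
    # scores by the number of still-uncovered neighbors plus noise.
    return len([u for u in graph[v] if u not in solution]) + random.random()

def greedy_policy(graph):
    solution = set()
    def uncovered():
        return [(u, v) for u in graph for v in graph[u]
                if u < v and u not in solution and v not in solution]
    while uncovered():
        v = max((v for v in graph if v not in solution),
                key=lambda v: q_hat(graph, solution, v))
        solution.add(v)
    return solution

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(greedy_policy(graph))  # e.g. {0, 2}
```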
Moran Feldman, Christopher Harshaw, Amin Karbasi
Subjects: Learning (cs.LG); Data Structures and Algorithms (cs.DS)
It is known that greedy methods perform well for maximizing monotone
submodular functions. At the same time, such methods perform poorly in the face
of non-monotonicity. In this paper, we show, arguably surprisingly, that
invoking the classical greedy algorithm $O(\sqrt{k})$ times leads to the
(currently) fastest deterministic algorithm, called Repeated Greedy, for
maximizing a general submodular function subject to $k$-independent system
constraints. Repeated Greedy achieves a $(1 + O(1/\sqrt{k}))k$ approximation
using $O(nr\sqrt{k})$ function evaluations (here, $n$ and $r$ denote the size
of the ground set and the maximum size of a feasible solution, respectively).
We then show that by a careful sampling procedure, we can run the greedy
algorithm only once and obtain the (currently) fastest randomized algorithm,
called Sample Greedy, for maximizing a submodular function subject to
$k$-extendible system constraints (a subclass of $k$-independent system
constraints). Sample Greedy achieves a $(k + 3)$-approximation with only
$O(nr/k)$ function evaluations. Finally, we derive an almost matching lower
bound, and show that no polynomial time algorithm can have an approximation
ratio smaller than $k + 1/2 - \varepsilon$. To further support our theoretical
results, we
compare the performance of Repeated Greedy and Sample Greedy with prior art in
a concrete application (movie recommendation). We consistently observe that
while Sample Greedy achieves practically the same utility as the best baseline,
it performs at least two orders of magnitude faster.
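A simplified sketch of the Repeated Greedy idea, with a cardinality constraint standing in for a general $k$-independent system, and omitting the unconstrained-maximization step interleaved in the actual algorithm. The objective and parameters are illustrative.

```python
import math

# Run classical greedy several times, each round restricted to elements not
# selected in earlier rounds, and keep the best solution found.
def greedy(ground, f, is_feasible):
    S = set()
    while True:
        gains = [(f(S | {e}) - f(S), e) for e in ground - S
                 if is_feasible(S | {e})]
        if not gains or max(gains)[0] <= 0:
            return S
        S.add(max(gains)[1])

def repeated_greedy(ground, f, is_feasible, k):
    rounds, best, remaining = math.isqrt(k) + 1, set(), set(ground)
    for _ in range(rounds):
        S = greedy(remaining, f, is_feasible)
        best = max(best, S, key=f)
        remaining -= S
    return best

# Toy non-monotone objective: coverage minus a per-element cost.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d"}}
f = lambda S: len(set().union(*(sets[e] for e in S)) if S else set()) - 0.4 * len(S)
print(repeated_greedy(set(sets), f, lambda S: len(S) <= 2, k=2))
```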
Daniel O'Malley, Velimir V. Vesselinov, Boian S. Alexandrov, Ludmil B. Alexandrov
Subjects: Learning (cs.LG); Quantum Physics (quant-ph); Machine Learning (stat.ML)
D-Wave quantum annealers represent a novel computational architecture and
have attracted significant interest, but have been used for few real-world
computations. Machine learning has been identified as an area where quantum
annealing may be useful. Here, we show that the D-Wave 2X can be effectively
used as part of an unsupervised machine learning method. This method can be
used to analyze large datasets. The D-Wave hardware limits only the number of
features that can be extracted from the dataset. We apply this method to learn the
features from a set of facial images.
Kevin M. Amaral, Ping Chen, Scott Crouter, Wei Ding
Comments: 10 pages, 6 tables, 2 figures
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Accelerometer measurements are the prime type of sensor information that most
people think of when seeking to measure physical activity. On the market, there are
many fitness measuring devices which aim to track calories burned and steps
counted through the use of accelerometers. These measurements, though good
enough for the average consumer, are noisy and unreliable in terms of the
precision of measurement needed in a scientific setting. The contribution of
this paper is an innovative and highly accurate regression method which uses an
intermediary two-stage classification step to better direct the regression of
energy expenditure values from accelerometer counts.
We show that through an additional unsupervised layer of intermediate feature
construction, we can leverage latent patterns within accelerometer counts to
provide better grounds for activity classification than expert-constructed
timeseries features. For this, our approach utilizes a mathematical model
originating in natural language processing, the bag-of-words model, that has in
the past years been appearing in diverse disciplines outside of the natural
language processing field, such as image processing. Further emphasizing the
stochastic side of this natural language connection, we use a Gaussian mixture
model to learn the dictionary upon which the bag-of-words model is built.
Moreover, we show that with the addition of these features, we are able to
improve the regression root mean-squared error of energy expenditure by
approximately 1.4 units over existing state-of-the-art methods.
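A hedged sketch of the described feature pipeline (window length, number of mixture components, and the synthetic counts are assumptions): fit a GMM "dictionary" over windows of counts, then represent a bout of activity as a histogram of its windows' mixture assignments.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Learn a dictionary over short windows of accelerometer counts with a GMM,
# then encode a segment as a bag-of-words histogram of mixture assignments.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=5.0, size=(2000,))          # fake accelerometer counts
win = 10
windows = counts[: len(counts) // win * win].reshape(-1, win).astype(float)

gmm = GaussianMixture(n_components=8, random_state=0).fit(windows)

def bag_of_words(segment_windows, gmm):
    words = gmm.predict(segment_windows)              # dictionary assignments
    return np.bincount(words, minlength=gmm.n_components) / len(words)

features = bag_of_words(windows[:30], gmm)            # one activity bout
print(features)  # histogram feature fed to the classification/regression stage
```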
Yixi Xu, Jean Honorio, Xiao Wang
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
In this paper, we propose a compositional nonparametric method in which a
model is expressed as a labeled binary tree of $2k+1$ nodes, where each node is
either a summation, a multiplication, or the application of one of the $q$
basis functions to one of the $p$ covariates. We show that in order to recover
a labeled binary tree from a given dataset, the sufficient number of samples is
$O(k\log(pq)+\log(k!))$, and the necessary number of samples is
$\Omega(k\log(pq)-\log(k!))$. We implement our method for regression as a greedy search
algorithm, and demonstrate its effectiveness with two synthetic data sets and
one real-world experiment.
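For concreteness, evaluating such a labeled binary tree is a short recursion; the particular basis functions and example tree below are illustrative assumptions.

```python
import math

# Internal nodes are sums or products; leaves apply one of q basis functions
# to one of p covariates. A tree with k internal nodes has 2k+1 nodes total.
BASIS = [math.sin, math.cos, lambda t: t, lambda t: t * t]  # q = 4 (assumed)

def evaluate(node, x):
    """node is ('+', l, r), ('*', l, r), or ('leaf', basis_idx, covariate_idx)."""
    tag = node[0]
    if tag == "leaf":
        _, b, j = node
        return BASIS[b](x[j])
    _, left, right = node
    l, r = evaluate(left, x), evaluate(right, x)
    return l + r if tag == "+" else l * r

# f(x) = sin(x0) + x1 * x1, a 5-node tree (k = 2 internal nodes, 3 leaves).
tree = ("+", ("leaf", 0, 0), ("*", ("leaf", 2, 1), ("leaf", 2, 1)))
print(evaluate(tree, [math.pi / 2, 3.0]))  # 1.0 + 9.0 = 10.0
```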
Diego García-Gil, Julián Luengo, Salvador García, Francisco Herrera
Subjects: Databases (cs.DB); Learning (cs.LG)
In any knowledge discovery process the value of extracted knowledge is
directly related to the quality of the data used. Big Data problems, generated
by massive growth in the scale of data observed in recent years, also follow
the same dictate. A common problem affecting data quality is the presence of
noise, particularly in classification problems, where label noise refers to the
incorrect labeling of training instances, and is known to be a very disruptive
feature of data. However, in this Big Data era, the massive growth in the scale
of the data poses a challenge to traditional proposals created to tackle noise,
as they have difficulties coping with such a large amount of data. New
algorithms need to be proposed to treat the noise in Big Data problems,
providing high quality and clean data, also known as Smart Data. In this paper,
two Big Data preprocessing approaches to remove noisy examples are proposed: a
homogeneous ensemble filter and a heterogeneous ensemble filter, with special
emphasis on their scalability and performance traits. The obtained results show
that these proposals enable the practitioner to efficiently obtain a Smart
Dataset from any Big Data classification problem.
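An in-memory sketch of the heterogeneous ensemble filter idea (the paper targets Big Data frameworks; the classifiers, fold count, and noise level here are illustrative): an instance is discarded when a majority of cross-validated classifiers contradicts its label.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_predict

# Inject label noise, then keep only instances whose labels a majority of
# cross-validated classifiers agrees with.
X, y = make_classification(n_samples=500, random_state=0)
flip = np.random.default_rng(0).random(len(y)) < 0.1   # 10% label noise
y_noisy = np.where(flip, 1 - y, y)

models = [RandomForestClassifier(random_state=0),
          LogisticRegression(max_iter=1000),
          KNeighborsClassifier()]
votes = np.stack([cross_val_predict(m, X, y_noisy, cv=5) for m in models])
wrong = (votes != y_noisy).sum(axis=0)
keep = wrong < 2                                        # majority must agree
X_clean, y_clean = X[keep], y_noisy[keep]
print(f"kept {keep.sum()} of {len(y)} instances")
```

A homogeneous variant would use several differently trained copies of a single classifier in place of the mixed model pool.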
Yi Han, Benjamin I. P. Rubinstein
Comments: 7 pages, 5 figures, 4 tables
Subjects: Cryptography and Security (cs.CR); Learning (cs.LG); Machine Learning (stat.ML)
Despite the wide use of machine learning in adversarial settings including
computer security, recent studies have demonstrated vulnerabilities to evasion
attacks—carefully crafted adversarial samples that closely resemble
legitimate instances, but cause misclassification. In this paper, (1) we
analyse the effectiveness of the gradient-descent method—the leading approach
for generating adversarial samples—against non-linear support vector
machines, and conclude that carefully reduced kernel smoothness can
significantly increase robustness to the attack; (2) we propose a quantity
similar to margin that can efficiently predict potential susceptibility to
gradient-descent attack, before the attack is launched; and (3) we design a new
adversarial sample construction algorithm based on optimising the
multiplicative ratio of class decision functions. Our results demonstrate that
the new method not only increases the attack’s success rate, but also achieves
success with less perturbation.
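A sketch of the gradient-descent attack analysed in point (1), using the closed-form gradient of an RBF-kernel SVM decision function; the data, step size, and stopping rule are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# Starting from a legitimate sample, step along the gradient of the decision
# function to push it across the boundary with little perturbation.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma).fit(X, y)

def decision_gradient(x):
    # f(x) = sum_i a_i K(s_i, x) + b with K(s, x) = exp(-gamma ||s - x||^2),
    # so grad f(x) = sum_i a_i * K(s_i, x) * 2 * gamma * (s_i - x).
    sv, a = clf.support_vectors_, clf.dual_coef_[0]
    k = np.exp(-gamma * ((sv - x) ** 2).sum(axis=1))
    return (a * k) @ (2 * gamma * (sv - x))

x = X[60].copy()                      # a sample predicted as class 1
for _ in range(100):                  # descend f until prediction flips
    x -= 0.05 * decision_gradient(x)
    if clf.decision_function([x])[0] < 0:
        break
print("perturbation norm:", np.linalg.norm(x - X[60]))
```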
Elaine Angelino, Nicholas Larus-Stone, Daniel Alabi, Margo Seltzer, Cynthia Rudin
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
We present the design and implementation of a custom discrete optimization
technique for building rule lists over a categorical feature space. Our
algorithm provides the optimal solution, with a certificate of optimality. By
leveraging algorithmic bounds, efficient data structures, and computational
reuse, we achieve several orders of magnitude speedup in time and a massive
reduction of memory consumption. We demonstrate that our approach produces
optimal rule lists on practical problems in seconds. This framework is a novel
alternative to CART and other decision tree methods.
Cheng Ju, Aurélien Bibaut, Mark J. van der Laan
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Methodology (stat.ME)
Artificial neural networks have been successfully applied to a variety of
machine learning tasks, including image recognition, semantic segmentation, and
machine translation. However, few studies have fully investigated ensembles of
artificial neural networks. In this work, we investigated multiple widely used
ensemble methods, including unweighted averaging, majority voting, the Bayes
Optimal Classifier, and the (discrete) Super Learner, for image recognition
tasks, with deep neural networks as candidate algorithms. We designed several
experiments, with the candidate algorithms being the same network structure
with different model checkpoints within a single training process, networks
with the same structure but trained multiple times stochastically, and networks
with different structures. In addition, we further studied the over-confidence
phenomenon of the neural networks, as well as its impact on the ensemble
methods. Across all of our experiments, the Super Learner achieved best
performance among all the ensemble methods in this study.
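For reference, the two simplest rules above look like this on stored softmax outputs (the probability tensors here are random placeholders for the candidate networks' predictions); the (discrete) Super Learner would instead select the cross-validation-best candidate.

```python
import numpy as np

# Unweighted averaging and majority voting over candidate networks' softmax
# outputs; the random tensors stand in for real model predictions.
rng = np.random.default_rng(0)
n_models, n_samples, n_classes = 5, 8, 3
probs = rng.dirichlet(np.ones(n_classes), size=(n_models, n_samples))

# Unweighted averaging: mean of the predicted class probabilities.
avg_pred = probs.mean(axis=0).argmax(axis=1)

# Majority voting: each network votes for its argmax class.
votes = probs.argmax(axis=2)
maj_pred = np.array([np.bincount(votes[:, i], minlength=n_classes).argmax()
                     for i in range(n_samples)])
print(avg_pred, maj_pred)
```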
Aman Chadha, Sushmit Mallik, Ravdeep Johar
Journal-ref: International Journal of Computer Applications 52(20):35-42, 2012
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG); Multimedia (cs.MM)
The aim of a Content-Based Image Retrieval (CBIR) system, also known as Query
by Image Content (QBIC), is to help users to retrieve relevant images based on
their contents. CBIR technologies provide a method to find images in large
databases by using unique descriptors from a trained image. The image
descriptors include texture, color, intensity and shape of the object inside an
image. Several feature-extraction techniques, viz. Average RGB, Color Moments,
Co-occurrence, Local Color Histogram, Global Color Histogram and Geometric
Moment have been critically compared in this paper. However, individually these
techniques result in poor performance. So, combinations of these techniques
have also been evaluated and results for the most efficient combination of
techniques have been presented and optimized for each class of image query. We
also propose an improvement in image retrieval performance by introducing the
idea of Query modification through image cropping. It enables the user to
identify a region of interest and modify the initial query to refine and
personalize the image retrieval results.
Hong Ren, Nan Liu, Cunhua Pan, Xiaohu You, Lajos Hanzo
Comments: submitted to one journal for possible publication
Subjects: Information Theory (cs.IT)
We propose a quality-of-service (QoS) driven power- and rate-adaptation scheme
for wireless cloud radio access networks (C-RAN), where each remote radio head
(RRH) is connected to the baseband unit (BBU) pool through high-speed optical
links. The RRHs jointly support the users by efficiently exploiting the
enhanced spatial degrees of freedom attained by the powerful cloud computing
facilitated by the BBU pool. Our proposed scheme aims at maximizing the
effective capacity (EC) of the user subject to both per-RRH average- and
peak-power constraints, where the EC is defined as the maximum tele-traffic
arrival rate that can be supported by the C-RAN under the statistical delay-QoS
requirement. We first transform the EC maximization problem into an equivalent
convex optimization problem. By using the Lagrange dual decomposition method
and satisfying the Karush-Kuhn-Tucker (KKT) conditions, the optimal
transmission power of each RRH can be obtained in closed form. Furthermore, an
online tracking method is provided for approximating the average power of each
RRH for the sake of updating the Lagrange dual variables. For the special case
of two RRHs, the expression of the average power to be assigned to each RRH can
be calculated in explicit form, which can be numerically evaluated. Hence, the
Lagrange dual variables can be computed in advance in this special case. Our
simulation results show that the proposed scheme converges rapidly for all the
scenarios considered, and achieves 20% higher EC than the optimization method
in which each RRH's power is optimized independently.
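For reference, the effective capacity of a service process is usually defined via the log-moment generating function of the cumulative service $S(t)$; the paper's exact normalization (e.g. per frame duration and bandwidth) may differ:

```latex
% Standard form of the effective capacity for QoS exponent \theta, with
% S(t) the service accumulated by time t (normalization may differ from
% the paper's):
E_C(\theta) \;=\; -\lim_{t \to \infty} \frac{1}{\theta\, t}
\,\ln \mathbb{E}\!\left[ e^{-\theta\, S(t)} \right]
```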
Hassan Naseri, Visa Koivunen
Subjects: Information Theory (cs.IT); Machine Learning (stat.ML)
A reliable, accurate, and affordable positioning service is highly required
in wireless networks. In this paper, a novel distributed message passing
algorithm is proposed to solve the problem of cooperative network localization
using both distance and angle of arrival estimates. This hybrid approach
combines the distance and angle observations to reduce the uncertainty in
localizing the network nodes. A statistical problem formulation is employed
and approximate minimum mean square error (MMSE) estimates of the node
locations are found. Numerical results are presented to show the improvement in
localization performance compared to existing distance-only and angle-only
localization methods. Moreover, the proposed algorithm improves the
identifiability of the localization problem compared to range-only or
angle-only localization techniques. That is, it can solve the problem with
fewer anchor nodes and fewer connections in the network.
Ha-Vu Tran, Georges Kaddoum, Hung Tran, Een-Kee Hong
Comments: 15 pages, 8 figures
Subjects: Information Theory (cs.IT)
In this paper, we investigate an application of two different beamforming
techniques and propose a novel downlink power minimization scheme for a
two-tier heterogeneous network (HetNet) model. In this context, we apply the time
reversal (TR) technique at a femtocell base station (FBS), whereas we assume
that a macrocell base station (MBS) uses a zero-forcing-based algorithm and that
the communication channels are subject to frequency-selective fading. Additionally,
the HetNet's backhaul connection is unable to support a sufficient throughput for
signaling information exchange between the two tiers. Given the considered HetNet
model, a downlink power minimization scheme is proposed, and closed-form
expressions concerning the optimal solution are provided, taking this
constraint into account. Furthermore, considering imperfect channel estimation
at TR-employed femtocell, a worst-case robust power minimization problem is
formulated. By devising TR worst-case analysis, this robust problem is
transformed into an equivalent formulation that is tractable to solve. The
results presented in our paper show that the TR technique outperforms the
zero-forcing one in the perspective of beamforming methods for femtocell
working environments. Finally, we validate the proposed power loading strategy
for both cases of perfect and imperfect channel estimations.
Lukas Holzbaur, Hannes Bartz, Antonia Wachter-Zeh
Subjects: Information Theory (cs.IT)
A low-complexity method for resolving stall patterns when decoding staircase
codes is proposed. Stall patterns are the dominant contributor to the error
floor in the original decoding method. Our improvement is based on locating
stall patterns by intersecting non-zero syndromes and flipping the
corresponding bits. The approach effectively lowers the error floor of decoding
staircase codes. This allows for a new range of block sizes to be considered
for optical communication at a certain rate or, alternatively, a significantly
decreased error floor for the same block size. Further, an improved analysis of
the error floor behavior is introduced which provides a more accurate
estimation of the contributions to the error floor.
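In schematic form (abstracting away the staircase block structure and the component decoders), the locate-and-flip step described above reduces to flipping the intersections of still-failing rows and columns; the indices below are illustrative.

```python
import numpy as np

# If row decoders i and column decoders j still report non-zero syndromes,
# the stall pattern is likely supported on their intersections: flip those
# bits and hand the block back to the iterative decoder.
def resolve_stall(bits, failed_rows, failed_cols):
    repaired = bits.copy()
    for i in failed_rows:
        for j in failed_cols:
            repaired[i, j] ^= 1        # flip bits at syndrome intersections
    return repaired

block = np.zeros((8, 8), dtype=int)
block[2, 5] = block[2, 6] = block[4, 5] = block[4, 6] = 1   # 4-bit stall
print(resolve_stall(block, failed_rows=[2, 4], failed_cols=[5, 6]).sum())  # 0
```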
Congduan Li, Steven Weber, John MacLaren Walsh
Comments: 20 pages with double column, revision of previous submission arXiv:1507.05728
Subjects: Information Theory (cs.IT)
Recent algorithmic developments have enabled computers to automatically
determine and prove the capacity regions of small hypergraph networks under
network coding. A structural theory relating network coding problems of
different sizes is developed to make best use of this newfound computational
capability. A formal notion of network minimality is developed which removes
components of a network coding problem that are inessential to its core
complexity. Equivalence between different network coding problems under
relabeling is formalized via group actions, and an algorithm that can directly
list single representatives from each equivalence class of minimal networks up
to a prescribed network size is presented. This algorithm, together with rate
region software, is leveraged to create a database containing the rate regions
for all minimal network coding problems with five or fewer sources and edges, a
collection of 744119 equivalence classes representing more than 9 million
networks. In order to best learn from this database, and to leverage it to
infer rate regions and their characteristics of networks at scale, a hierarchy
between different network coding problems is created with a new theory of
combinations and embedding operators.
Junghoon Kim, Bruno Clerckx, Paul D. Mitcheson
Comments: accepted for publication in IEEE WPTC 2017
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
A systematic design of adaptive waveform for Wireless Power Transfer (WPT)
has recently been proposed and shown through simulations to lead to significant
performance benefits compared to traditional non-adaptive and heuristic
waveforms. In this study, we design the first prototype of a closed-loop
wireless power transfer system with adaptive waveform optimization based on
Channel State Information acquisition. The prototype consists of three
important blocks, namely the channel estimator, the waveform optimizer, and the
energy harvester. Software Defined Radio (SDR) prototyping tools are used to
implement a wireless power transmitter and a channel estimator, and a voltage
doubler rectenna is designed to work as an energy harvester. A channel adaptive
waveform with 8 sinewaves is shown through experiments to improve the average
harvested DC power at the rectenna output by 9.8% to 36.8% over a non-adaptive
design with the same number of sinewaves.
Qingqing Wu, Yong Zeng, Rui Zhang
Comments: Submitted for possible publication
Subjects: Information Theory (cs.IT)
Unmanned aerial vehicles (UAVs) have attracted significant interest recently
in wireless communication due to their high maneuverability, flexible
deployment, and low cost. This paper studies a UAV-enabled wireless network
where the UAV is employed as an aerial mobile base station (BS) to serve a
group of users on the ground. To achieve fair performance among users, we
maximize the minimum throughput over all ground users by jointly optimizing the
multiuser communication scheduling and UAV trajectory over a finite horizon.
The formulated problem is shown to be a mixed integer non-convex optimization
problem that is difficult to solve in general. We thus propose an efficient
iterative algorithm by applying the block coordinate descent and successive
convex optimization techniques, which is guaranteed to converge to at least a
locally optimal solution. To achieve fast convergence and stable throughput, we
further propose a low-complexity initialization scheme for the UAV trajectory
design based on the simple circular trajectory. Extensive simulation results
are provided which show significant throughput gains of the proposed design as
compared to other benchmark schemes.
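In outline, the proposed alternating scheme has the usual block-coordinate-descent shape; the skeleton below stubs out the two convex subproblems (which the abstract says are handled with successive convex optimization) and illustrates the control flow only.

```python
# Skeleton of the alternating optimization: solve the scheduling block for a
# fixed trajectory, then update the trajectory for the fixed scheduling,
# until the min-throughput objective stops improving. The three stubs stand
# in for convex subproblem solvers and are not implementations of the paper.
def optimize_scheduling(trajectory):
    raise NotImplementedError  # e.g. a linear program over relaxed variables

def optimize_trajectory(schedule, trajectory):
    raise NotImplementedError  # e.g. a successive convex approximation step

def min_throughput(schedule, trajectory):
    raise NotImplementedError  # objective evaluation

def block_coordinate_descent(init_trajectory, tol=1e-3, max_iters=100):
    trajectory, best = init_trajectory, float("-inf")
    schedule = None
    for _ in range(max_iters):
        schedule = optimize_scheduling(trajectory)
        trajectory = optimize_trajectory(schedule, trajectory)
        value = min_throughput(schedule, trajectory)
        if value - best <= tol:
            break
        best = value
    return schedule, trajectory
```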
Yuhua Sun, Qiang Wang, Tongjiang Yan
Comments: 13 pages. arXiv admin note: text overlap with arXiv:1702.00822, arXiv:1701.03766
Subjects: Information Theory (cs.IT)
Let $p, q$ be distinct primes satisfying $\gcd(p-1,q-1)=d$ and let
$D_i$, $i=0,1,\cdots,d-1$, be Whiteman's generalized cyclotomic classes with
$Z_{pq}^{\ast}=\cup_{i=0}^{d-1}D_i$. In this paper, we give the values of the
Gauss periods $\Omega_0=\sum_{i=0}^{\frac{d}{2}-1}D_{2i}$ and
$\Omega_1=\sum_{i=0}^{\frac{d}{2}-1}D_{2i+1}$. As an application, we determine
a lower bound on the 2-adic complexity of the modified Jacobi sequence. Our
results show that the 2-adic complexity of the modified Jacobi sequence is at
least $pq-p-q-1$ with period $N=pq$. This indicates that the 2-adic complexity
of the modified Jacobi sequence is large enough to resist the attack of the
rational approximation algorithm (RAA) for feedback with carry shift registers
(FCSRs).
Pengda Huang
Subjects: Information Theory (cs.IT)
The industry of wearable remote health monitoring systems keeps growing. In
the diagnosis of cardiovascular disease, the electrocardiography (ECG) waveform
is one of the major tools and is thus widely taken as the monitoring objective.
For the purpose of reducing bit expenditure in these monitoring systems, we
study the compression of ECG signals and propose a new low-complexity
compressor. Different from traditional ECG compressors, most of which are built
on a single sensor, our compression scheme is based on multiple ECG sensors.
The multi-sensor based compression scheme is able to provide more accurate
sensing results. Besides the investigation into the structure of the
compressor, we also jointly optimize the period and the number of bits per
sample in the transmission of the ECG signal. Experiments are performed on
records in the MIT-BIH Arrhythmia database and the European ST-T database.
Experimental results show that our method outperforms conventional ones with
respect to ECG reconstruction accuracy at the same bit rate consumption.
Muppirala Viswa Virinchi, Abhishek Behera, Manoj Gopalkrishnan
Comments: 12 pages, 1 figure
Subjects: Molecular Networks (q-bio.MN); Statistical Mechanics (cond-mat.stat-mech); Information Theory (cs.IT); Adaptation and Self-Organizing Systems (nlin.AO); Biological Physics (physics.bio-ph)
The notion of entropy is shared between statistics and thermodynamics, and is
fundamental to both disciplines. This makes statistical problems particularly
suitable for reaction network implementations. In this paper we show how to
perform a statistical operation known as Information Projection or E projection
with stochastic mass-action kinetics. Our scheme encodes desired conditional
distributions as the equilibrium distributions of reaction systems. To our
knowledge, this is the first scheme to exploit the inherent stochasticity of
reaction networks for information processing. We apply this to the problem of
an artificial cell trying to infer its environment from partial observations.
Mohammad Shahbazi
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
This paper is a comment on the paper "Quantum Mechanics and Algorithmic
Randomness" by Ulvi Yurtsever \cite{Yurtsever}, together with a brief
explanation of the algorithmic randomness of quantum measurement results.
There is a difference between the computability of a probability source
(meaning that there exists an algorithm defining the way the random process
or probability source generates numbers) and the algorithmic randomness of
the sequences or strings produced by that source. We may have a source with
no computable algorithm that nevertheless produces compressible or
incompressible strings. For example, so far there is no computable algorithm
that captures the abstract meaning of randomness, even for the simplest case,
the Bernoulli probability distribution. Historically and philosophically,
many scientists believe that the existence of an algorithm for a random
process is a contradiction, because in their opinion the definition of a
random variable implicitly assumes that there is no reason for the occurrence
of an event and that we only know the probabilities. There is, however, no
need to enter into this matter here. As the paper mentions, all algorithms
for simulating a random process try to pass the statistical tests and to come
close to the abstract meaning of randomness.
Qiuwei Li, Zhihui Zhu, Gongguo Tang
Subjects: Numerical Analysis (cs.NA); Information Theory (cs.IT); Learning (cs.LG); Optimization and Control (math.OC)
This work investigates the geometry of a nonconvex reformulation of
minimizing a general convex loss function $f(X)$ regularized by the matrix
nuclear norm $\|X\|_*$. Nuclear-norm regularized matrix inverse problems are at
the heart of many applications in machine learning, signal processing, and
control. The statistical performance of nuclear norm regularization has been
studied extensively in the literature using convex analysis techniques. Despite
its optimal performance, the resulting optimization problem has high
computational complexity when solved using standard or even tailored fast
convex solvers. To develop faster and more scalable algorithms, we follow the
proposal of Burer and Monteiro to factor the matrix variable $X$ into the
product of two smaller rectangular matrices $X=UV^T$ and also replace the
nuclear norm $\|X\|_*$ with $(\|U\|_F^2+\|V\|_F^2)/2$. In spite of the
nonconvexity of the factored formulation, we prove that when the convex loss
function $f(X)$ is $(2r,4r)$-restricted well-conditioned, each critical point
of the factored problem either corresponds to the optimal solution $X^\star$
of the original convex optimization problem or is a strict saddle point where
the Hessian matrix has a strictly negative eigenvalue. Such a geometric
structure of the factored formulation allows many local search algorithms to
converge to the global optimum with random initializations.
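A small numerical sketch of the factored formulation (with a matrix-completion least-squares loss standing in for the general convex $f$, and illustrative sizes, regularization weight, and step size): gradient descent directly on $U$ and $V$.

```python
import numpy as np

# Minimize f(U V^T) + (lam/2)(||U||_F^2 + ||V||_F^2) by gradient descent,
# where f is a masked least-squares (matrix completion) loss. All problem
# sizes and hyperparameters are illustrative assumptions.
rng = np.random.default_rng(0)
n, m, r, lam, eta = 30, 20, 2, 0.1, 0.01
Y = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))  # rank-r target
mask = rng.random((n, m)) < 0.5                                # observed entries

U = rng.standard_normal((n, r))
V = rng.standard_normal((m, r))
for _ in range(2000):
    R = mask * (U @ V.T - Y)              # gradient of f at X = U V^T
    U, V = U - eta * (R @ V + lam * U), V - eta * (R.T @ U + lam * V)
print("fit error on observed entries:", np.linalg.norm(mask * (U @ V.T - Y)))
```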