Mundher Al-Shabi, Tee Connie, Andrew Beng Jin Teoh
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
With rapid development of the Internet, the web contents become huge. Most of
the websites are publicly available and anyone can access the contents
everywhere such as workplace, home and even schools. Nev-ertheless, not all the
web contents are appropriate for all users, especially children. An example of
these contents is pornography images which should be restricted to certain age
group. Besides, these images are not safe for work (NSFW) in which employees
should not be seen accessing such contents. Recently, convolutional neural
networks have been successfully applied to many computer vision problems.
Inspired by these successes, we propose a mixture of convolutional neural
networks for adult content recognition. Unlike other works, our method is
formulated on a weighted sum of multiple deep neural network models. The
weights of each CNN models are expressed as a linear regression problem learnt
using Ordinary Least Squares (OLS). Experimental results demonstrate that the
proposed model outperforms both single CNN model and the average sum of CNN
models in adult content recognition.
Zhen-Hua Feng, Josef Kittler, William Christmas, Xiao-Jun Wu
Comments: 15 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Appearance variations result in many difficulties in face image analysis. To
deal with this challenge, we present a Unified Tensor-based Active Appearance
Model (UT-AAM) for jointly modelling the geometry and texture information of 2D
faces. In contrast with the classical Tensor-based AAM (T-AAM), the proposed
UT-AAM has four advantages: First, for each type of face information, namely
shape and texture, we construct a tensor model capturing all relevant
appearance variations. This unified tensor model contrasts with the
variation-specific models of T-AAM. Second, a strategy for dealing with
self-occluded faces is proposed to obtain consistent shape and texture
representations of faces across large pose variations. Third, our UT-AAM is
capable of constructing the model from an incomplete training dataset, using
tensor completion methods. Last, we use an effective cascaded-regression-based
method for UT-AAM fitting. With these improvements, the utility of UT-AAM in
practice is considerably enhanced in comparison with the classical T-AAM. As an
example, we demonstrate the improvements in training facial landmark detectors
through the use of UT-AAM to synthesise a large number of virtual samples.
Experimental results obtained using the Multi-PIE and 300-W face datasets
demonstrate the merits of the proposed approach.
Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
Comments: 11 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Referring expressions are natural language constructions used to identify
particular objects within a scene. In this paper, we propose a unified
framework for the tasks of referring expression comprehension and generation.
Our model is composed of three modules: speaker, listener, and reinforcer. The
speaker generates referring expressions, the listener comprehends referring
expressions, and the reinforcer introduces a reward function to guide sampling
of more discriminative expressions. The listener-speaker modules are trained
jointly in an end-to-end learning framework, allowing the modules to be aware
of one another during learning while also benefiting from the discriminative
reinforcer’s feedback. We demonstrate that this unified framework and training
achieves state-of-the-art results for both comprehension and generation on
three referring expression datasets. Project and demo page:
this https URL
Hamza Bendaoudi, Farida Cheriet, J. M. Pierre Langlois
Comments: This paper was accepted and presented at Conference on Design and Architectures for Signal and Image Processing – DASIP 2016
Subjects: Computer Vision and Pattern Recognition (cs.CV); Hardware Architecture (cs.AR)
This paper presents a memory efficient architecture that implements the
Multi-Scale Line Detector (MSLD) algorithm for real-time retinal blood vessel
detection in fundus images on a Zynq FPGA. This implementation benefits from
the FPGA parallelism to drastically reduce the memory requirements of the MSLD
from two images to a few values. The architecture is optimized in terms of
resource utilization by reusing the computations and optimizing the bit-width.
The throughput is increased by designing fully pipelined functional units. The
architecture is capable of achieving a comparable accuracy to its software
implementation but 70x faster for low resolution images. For high resolution
images, it achieves an acceleration by a factor of 323x.
Amir R. Zamir, Te-Lin Wu, Lin Sun, William Shen, Jitendra Malik, Silvio Savarese
Comments: See a video describing the method at this https URL and the website this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Currently, the most successful learning models in computer vision are based
on learning successive representations followed by a decision layer. This is
usually actualized through feedforward multilayer neural networks, e.g.
ConvNets, where each layer forms one of such successive representations.
However, an alternative that can achieve the same goal is a feedback based
approach in which the representation is formed in an iterative manner based on
a feedback received from previous iteration’s output.
We establish that a feedback based approach has several fundamental
advantages over feedforward: it enables making early predictions at the query
time, its output naturally conforms to a hierarchical structure in the label
space (e.g. a taxonomy), and it provides a new basis for Curriculum Learning.
We observe that feedback networks develop a considerably different
representation compared to feedforward counterparts, in line with the
aforementioned advantages. We put forth a general feedback based learning
architecture with the endpoint results on par or better than existing
feedforward networks with the addition of the above advantages. We also
investigate several mechanisms in feedback architectures (e.g. skip connections
in time) and design choices (e.g. feedback length). We hope this study offers
new perspectives in quest for more natural and practical learning models.
Arnav Bhavsar
Subjects: Computer Vision and Pattern Recognition (cs.CV)
In recent years, the usefulness of 3D shape estimation is being realized in
microscopic or close-range imaging, as the 3D information can further be used
in various applications. Due to limited depth of field at such small distances,
the defocus blur induced in images can provide information about the 3D shape
of the object. The task of `shape from defocus’ (SFD), involves the problem of
estimating good quality 3D shape estimates from images with depth-dependent
defocus blur. While the research area of SFD is quite well-established, the
approaches have largely demonstrated results on objects with bulk/coarse shape
variation. However, in many cases, objects studied under microscopes often
involve fine/detailed structures, which have not been explicitly considered in
most methods. In addition, given that, in recent years, large data volumes are
typically associated with microscopy related applications, it is also important
for such SFD methods to be efficient. In this work, we provide an indication of
the usefulness of the Belief Propagation (BP) approach in addressing these
concerns for SFD. BP has been known to be an efficient combinatorial
optimization approach, and has been empirically demonstrated to yield good
quality solutions in low-level vision problems such as image restoration,
stereo disparity estimation etc. For exploiting the efficiency of BP in SFD, we
assume local space-invariance of the defocus blur, which enables the
application of BP in a straightforward manner. Even with such an assumption,
the ability of BP to provide good quality solutions while using non-convex
priors, reflects in yielding plausible shape estimates in presence of fine
structures on the objects under microscopy imaging.
Pichao Wang, Wanqing Li, Chuankun Li, Yonghong Hou
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Convolutional Neural Networks (ConvNets) have recently shown promising
performance in many computer vision tasks, especially image-based recognition.
How to effectively apply ConvNets to sequence-based data is still an open
problem. This paper proposes an effective yet simple method to represent
spatio-temporal information carried in (3D) skeleton sequences into three (2D)
images by encoding the joint trajectories and their dynamics into color
distribution in the images, referred to as Joint Trajectory Maps (JTM), and
adopts ConvNets to learn the discriminative features for human action
recognition. Such an image-based representation enables us to fine-tune
existing ConvNets models for the classification of skeleton sequences without
training the networks afresh. The three JTMs are generated in three orthogonal
planes and provide complimentary information to each other. The final
recognition is further improved through multiply score fusion of the three
JTMs. The proposed method was evaluated on four public benchmark datasets, the
large NTU RGB+D Dataset, MSRC-12 Kinect Gesture Dataset (MSRC-12), G3D Dataset
and UTD Multimodal Human Action Dataset (UTD-MHAD) and achieved the
state-of-the-art results.
Diego Marcos, Michele Volpi, Nikos Komodakis, Devis Tuia
Subjects: Computer Vision and Pattern Recognition (cs.CV)
We propose a method to encode rotation equivariance or invariance into
convolutional neural networks (CNNs). Each convolutional filter is applied with
several orientations and returns a vector field that represents the magnitude
and angle of the highest scoring rotation at the given spatial location. To
propagate information about the main orientation of the different features to
each layer in the network, we propose an enriched orientation pooling, i.e. max
and argmax operators over the orientation space, allowing to keep the
dimensionality of the feature maps low and to propagate only useful
information. We name this approach RotEqNet. We apply RotEqNet to three
datasets: first, a rotation invariant classification problem, the MNIST-rot
benchmark, in which we improve over the state-of-the-art results. Then, a
neuron membrane segmentation benchmark, where we show that RotEqNet can be
applied successfully to obtain equivariance to rotation with a simple fully
convolutional architecture. Finally, we improve significantly the
state-of-the-art on the problem of estimating cars’ absolute orientation in
aerial images, a problem where the output is required to be covariant with
respect to the object’s orientation.
Hang Su, Xiatian Zhu, Shaogang Gong
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Logo detection in unconstrained images is challenging, particularly when only
very sparse labelled training images are accessible due to high labelling
costs. In this work, we describe a model training image synthesising method
capable of improving significantly logo detection performance when only a
handful of (e.g., 10) labelled training images captured in realistic context
are available, avoiding extensive manual labelling costs. Specifically, we
design a novel algorithm for generating Synthetic Context Logo (SCL) training
images to increase model robustness against unknown background clutters,
resulting in superior logo detection performance. For benchmarking model
performance, we introduce a new logo detection dataset TopLogo-10 collected
from top 10 most popular clothing/wearable brandname logos captured in rich
visual context. Extensive comparisons show the advantages of our proposed SCL
model over the state-of-the-art alternatives for logo detection using two
real-world logo benchmark datasets: FlickrLogo-32 and our new TopLogo-10.
Mundher Al-Shabi, Tee Connie, Andrew Beng Jin Teoh
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
With rapid development of the Internet, the web contents become huge. Most of
the websites are publicly available and anyone can access the contents
everywhere such as workplace, home and even schools. Nev-ertheless, not all the
web contents are appropriate for all users, especially children. An example of
these contents is pornography images which should be restricted to certain age
group. Besides, these images are not safe for work (NSFW) in which employees
should not be seen accessing such contents. Recently, convolutional neural
networks have been successfully applied to many computer vision problems.
Inspired by these successes, we propose a mixture of convolutional neural
networks for adult content recognition. Unlike other works, our method is
formulated on a weighted sum of multiple deep neural network models. The
weights of each CNN models are expressed as a linear regression problem learnt
using Ordinary Least Squares (OLS). Experimental results demonstrate that the
proposed model outperforms both single CNN model and the average sum of CNN
models in adult content recognition.
Fahime Sheikhzadeh, Martial Guillaud, Rabab K. Ward
Subjects: Tissues and Organs (q-bio.TO); Computer Vision and Pattern Recognition (cs.CV)
This paper addresses the problem of quantifying biomarkers in multi-stained
tissues, based on color and spatial information. A deep learning based method
that can automatically localize and quantify the cells expressing biomarker(s)
in a whole slide image is proposed. The deep learning network is a fully
convolutional network (FCN) whose input is the true RGB color image of a tissue
and output is a map of the different biomarkers. The FCN relies on a
convolutional neural network (CNN) that classifies each cell separately
according to the biomarker it expresses. In this study, images of
immunohistochemistry (IHC) stained slides were collected and used. More than
4,500 RGB images of cells were manually labeled based on the expressing
biomarkers. The labeled cell images were used to train the CNN (obtaining an
accuracy of 92% in a test set). The trained CNN is then extended to an FCN that
generates a map of all biomarkers in the whole slide image acquired by the
scanner (instead of classifying every cell image). To evaluate our method, we
manually labeled all nuclei expressing different biomarkers in two whole slide
images and used theses as the ground truth. Our proposed method for
immunohistochemical analysis compares well with the manual labeling by humans
(average F-score of 0.96).
Hamid Reza Hassanzadeh, Hadi Sadoghi Yazdi, Abedin Vahedian
Journal-ref: 3rd Iranian Joint Congress on Intelligent Systems and Fuzzy
Systems, 2009
Subjects: Artificial Intelligence (cs.AI)
In this paper we introduce a fuzzy constraint linear discriminant analysis
(FC-LDA). The FC-LDA tries to minimize misclassification error based on
modified perceptron criterion that benefits handling the uncertainty near the
decision boundary by means of a fuzzy linear programming approach with fuzzy
resources. The method proposed has low computational complexity because of its
linear characteristics and the ability to deal with noisy data with different
degrees of tolerance. Obtained results verify the success of the algorithm when
dealing with different problems. Comparing FC-LDA and LDA shows superiority in
classification task.
Matthias Nickles
Comments: Technical Report
Subjects: Artificial Intelligence (cs.AI)
This technical report describes the usage, syntax, semantics and core
algorithms of the probabilistic inductive logic programming framework PrASP.
PrASP is a research software which integrates non-monotonic reasoning based on
Answer Set Programming (ASP), probabilistic inference and parameter learning.
In contrast to traditional approaches to Probabilistic (Inductive) Logic
Programming, our framework imposes only little restrictions on probabilistic
logic programs. In particular, PrASP allows for ASP as well as First-Order
Logic syntax, and for the annotation of formulas with point probabilities as
well as interval probabilities. A range of widely configurable inference
algorithms can be combined in a pipeline-like fashion, in order to cover a
variety of use cases.
Cédric Buron (LIP6), Sylvain Ductor (LIP6), Zahia Guessoum (LIP6, CRESTIC)
Journal-ref: STAIRS 2016, Aug 2016, The Hague, Netherlands. IOS Press, 284, pp.
27 — 38, Frontiers in Artificial Intelligence and Applications
Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Opponent modeling consists in modeling the strategy or preferences of an
agent thanks to the data it provides. In the context of automated negotiation
and with machine learning, it can result in an advantage so overwhelming that
it may restrain some casual agents to be part of the bargaining process. We
qualify as “curious” an agent driven by the desire of negotiating in order to
collect information and improve its opponent model. However, neither
curiosity-based rational-ity nor curiosity-robust protocol have been studied in
automatic negotiation. In this paper, we rely on mechanism design to propose
three extensions of the standard bargaining protocol that limit information
leak. Those extensions are supported by an enhanced rationality model, that
considers the exchanged information. Also, they are theoretically analyzed and
experimentally evaluated.
Erik P Hoel
Comments: 15 pages, 6 figures
Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI)
The causal structure of any system can be analyzed at a multitude of spatial
and temporal scales. It has long been thought that while higher scale (macro)
descriptions of causal structure may be useful to observers, they are at best a
compressed description and at worse leave out critical information. However,
recent research applying information theory to causal analysis has shown that
the causal structure of some systems can actually come into focus (be more
informative) at a macroscale (Hoel et al. 2013). That is, a macro model of a
system (a map) can be more informative than a fully detailed model of the
system (the territory). This has been called causal emergence. While causal
emergence may at first glance seem counterintuitive, this paper grounds the
phenomenon in a classic concept from information theory: Shannon’s discovery of
the channel capacity. I argue that systems have a particular causal capacity,
and that different causal models of those systems take advantage of that
capacity to various degrees. For some systems, only macroscale causal models
use the full causal capacity. Such macroscale causal models can either be
coarse-grains, or may leave variables and states out of the model (exogenous)
in various ways, which can improve the model’s efficacy and its informativeness
via the same mathematical principles of how error-correcting codes take
advantage of an information channel’s capacity. As model choice increase, the
causal capacity of a system approaches the channel capacity. Ultimately, this
provides a general framework for understanding how the causal structure of some
systems cannot be fully captured by even the most detailed microscopic model.
Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
Comments: 11 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Referring expressions are natural language constructions used to identify
particular objects within a scene. In this paper, we propose a unified
framework for the tasks of referring expression comprehension and generation.
Our model is composed of three modules: speaker, listener, and reinforcer. The
speaker generates referring expressions, the listener comprehends referring
expressions, and the reinforcer introduces a reward function to guide sampling
of more discriminative expressions. The listener-speaker modules are trained
jointly in an end-to-end learning framework, allowing the modules to be aware
of one another during learning while also benefiting from the discriminative
reinforcer’s feedback. We demonstrate that this unified framework and training
achieves state-of-the-art results for both comprehension and generation on
three referring expression datasets. Project and demo page:
this https URL
Timothy A. Mann, Hugo Penedones, Shie Mannor, Todd Hester
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Temporal Difference learning or TD((lambda)) is a fundamental algorithm in
the field of reinforcement learning. However, setting TD’s (lambda) parameter,
which controls the timescale of TD updates, is generally left up to the
practitioner. We formalize the (lambda) selection problem as a bias-variance
trade-off where the solution is the value of (lambda) that leads to the
smallest Mean Squared Value Error (MSVE). To solve this trade-off we suggest
applying Leave-One-Trajectory-Out Cross-Validation (LOTO-CV) to search the
space of (lambda) values. Unfortunately, this approach is too computationally
expensive for most practical applications. For Least Squares TD (LSTD) we show
that LOTO-CV can be implemented efficiently to automatically tune (lambda) and
apply function optimization methods to efficiently search the space of
(lambda) values. The resulting algorithm, ALLSTD, is parameter free and our
experiments demonstrate that ALLSTD is significantly computationally faster
than the na”{i}ve LOTO-CV implementation while achieving similar performance.
Ahlam Ansari, Moonish Maknojia, Altamash Shaikh
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Question Answering System (QAS) is used for information retrieval and natural
language processing (NLP) to reduce human effort. There are numerous QAS based
on the user documents present today, but they all are limited to providing
objective answers and process simple questions only. Complex questions cannot
be answered by the existing QAS, as they require interpretation of the current
and old data as well as the question asked by the user. The above limitations
can be overcome by using deep cases and neural network. Hence we propose a
modified QAS in which we create a deep artificial neural network with
associative memory from text documents. The modified QAS processes the contents
of the text document provided to it and find the answer to even complex
questions in the documents.
Massimiliano Dal Mas
Comments: 8 pages, 3 figures; 2 tables; for details see: this http URL
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)
The Folksodriven framework makes it possible for data scientists to define an
ontology environment where searching for buried patterns that have some kind of
predictive power to build predictive models more effectively. It accomplishes
this through an abstractions that isolate parameters of the predictive modeling
process searching for patterns and designing the feature set, too. To reflect
the evolving knowledge, this paper considers ontologies based on folksonomies
according to a new concept structure called “Folksodriven” to represent
folksonomies. So, the studies on the transformational regulation of the
Folksodriven tags are regarded to be important for adaptive folksonomies
classifications in an evolving environment used by Intelligent Systems to
represent the knowledge sharing. Folksodriven tags are used to categorize
salient data points so they can be fed to a machine-learning system and
“featurizing” the data.
Conceição Rocha, Alípio Jorge, Roberta Sionara, Paula Brito, Carlos Pimenta, Solange Rezende
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
This paper deals with the entity extraction task (named entity recognition)
of a text mining process that aims at unveiling non-trivial semantic
structures, such as relationships and interaction between entities or
communities. In this paper we present a simple and efficient named entity
extraction algorithm. The method, named PAMPO (PAttern Matching and POs tagging
based algorithm for NER), relies on flexible pattern matching, part-of-speech
tagging and lexical-based rules. It was developed to process texts written in
Portuguese, however it is potentially applicable to other languages as well.
We compare our approach with current alternatives that support Named Entity
Recognition (NER) for content written in Portuguese. These are Alchemy, Zemanta
and Rembrandt. Evaluation of the efficacy of the entity extraction method on
several texts written in Portuguese indicates a considerable improvement on
(recall) and (F_1) measures.
Massimiliano Dal Mas
Comments: 9 pages, 3 figures; for details see: this http URL
Subjects: Computers and Society (cs.CY); Information Retrieval (cs.IR); Software Engineering (cs.SE); Social and Information Networks (cs.SI)
In this paper we present the FolksoDriven Cloud (FDC) built on Cloud and on
Semantic technologies. Cloud computing has emerged in these recent years as the
new paradigm for the provision of on-demand distributed computing resources.
Semantic Web can be used for relationship between different data and
descriptions of services to annotate provenance of repositories on ontologies.
The FDC service is composed of a back-end which submits and monitors the
documents, and a user front-end which allows users to schedule on-demand
operations and to watch the progress of running processes. The impact of the
proposed method is illustrated on a user since its inception.
Ahlam Ansari, Moonish Maknojia, Altamash Shaikh
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Question Answering System (QAS) is used for information retrieval and natural
language processing (NLP) to reduce human effort. There are numerous QAS based
on the user documents present today, but they all are limited to providing
objective answers and process simple questions only. Complex questions cannot
be answered by the existing QAS, as they require interpretation of the current
and old data as well as the question asked by the user. The above limitations
can be overcome by using deep cases and neural network. Hence we propose a
modified QAS in which we create a deep artificial neural network with
associative memory from text documents. The modified QAS processes the contents
of the text document provided to it and find the answer to even complex
questions in the documents.
Massimiliano Dal Mas
Comments: 8 pages, 3 figures; 2 tables; for details see: this http URL
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)
The Folksodriven framework makes it possible for data scientists to define an
ontology environment where searching for buried patterns that have some kind of
predictive power to build predictive models more effectively. It accomplishes
this through an abstractions that isolate parameters of the predictive modeling
process searching for patterns and designing the feature set, too. To reflect
the evolving knowledge, this paper considers ontologies based on folksonomies
according to a new concept structure called “Folksodriven” to represent
folksonomies. So, the studies on the transformational regulation of the
Folksodriven tags are regarded to be important for adaptive folksonomies
classifications in an evolving environment used by Intelligent Systems to
represent the knowledge sharing. Folksodriven tags are used to categorize
salient data points so they can be fed to a machine-learning system and
“featurizing” the data.
Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
Comments: 11 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Referring expressions are natural language constructions used to identify
particular objects within a scene. In this paper, we propose a unified
framework for the tasks of referring expression comprehension and generation.
Our model is composed of three modules: speaker, listener, and reinforcer. The
speaker generates referring expressions, the listener comprehends referring
expressions, and the reinforcer introduces a reward function to guide sampling
of more discriminative expressions. The listener-speaker modules are trained
jointly in an end-to-end learning framework, allowing the modules to be aware
of one another during learning while also benefiting from the discriminative
reinforcer’s feedback. We demonstrate that this unified framework and training
achieves state-of-the-art results for both comprehension and generation on
three referring expression datasets. Project and demo page:
this https URL
Conceição Rocha, Alípio Jorge, Roberta Sionara, Paula Brito, Carlos Pimenta, Solange Rezende
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
This paper deals with the entity extraction task (named entity recognition)
of a text mining process that aims at unveiling non-trivial semantic
structures, such as relationships and interaction between entities or
communities. In this paper we present a simple and efficient named entity
extraction algorithm. The method, named PAMPO (PAttern Matching and POs tagging
based algorithm for NER), relies on flexible pattern matching, part-of-speech
tagging and lexical-based rules. It was developed to process texts written in
Portuguese, however it is potentially applicable to other languages as well.
We compare our approach with current alternatives that support Named Entity
Recognition (NER) for content written in Portuguese. These are Alchemy, Zemanta
and Rembrandt. Evaluation of the efficacy of the entity extraction method on
several texts written in Portuguese indicates a considerable improvement on
(recall) and (F_1) measures.
Milind Chabbi, Abdelhalim Amer, Shasha Wen, Xu Liu
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
This manuscript serves as a correctness proof of the Hierarchical MCS locks
with Timeout (HMCS-T) described in our paper titled “An Efficient
Abortable-locking Protocol for Multi-level NUMA Systems” appearing in the
proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of
Parallel Programming.
HMCS-T is a very involved protocol. The system is stateful; the values of
prior acquisition efforts affect the subsequent acquisition efforts. Also, the
status of successors, predecessors, ancestors, and descendants affect steps
followed by the protocol. The ability to make the protocol fully non-blocking
leads to modifications to the exttt{next} field, which causes deviation from
the original MCS lock protocol both in acquisition and release. At several
places, unconditional field updates are replaced with SWAP or CAS operations.
We follow a multi-step approach to prove the correctness of HMCS-T. To
demonstrate the correctness of the HMCS-T lock, we use the Spin model checking.
Model checking causes a combinatorial explosion even to simulate a handful of
threads. First, we understand the minimal, sufficient configurations necessary
to prove safety properties of a single level of lock in the tree. We construct
HMCS-T locks that represent these configurations. We model check these
configurations, which proves the correctness of components of an HMCS-T lock.
Finally, building upon these facts, we argue logically for the correctness of
HMCS-T<n>.
Christopher Natoli, Vincent Gramoli
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)
In this paper, we identify a new form of attack, called the Balance attack,
against proof-of-work blockchain systems. The novelty of this attack consists
of delaying network communications between multiple subgroups of nodes with
balanced mining power. Our theoretical analysis captures the precise tradeoff
between the network delay and the mining power of the attacker needed to double
spend in Ethereum with high probability.
We quantify our probabilistic analysis with statistics taken from the R3
consortium, and show that a single machine needs 20 minutes to attack the
consortium. Finally, we run an Ethereum private chain in a distributed system
with similar settings as R3 to demonstrate the feasibility of the approach, and
discuss the application of the Balance attack to Bitcoin. Our results clearly
confirm that main proof-of-work blockchain protocols can be badly suited for
consortium blockchains.
Juno Nam, Jurae Kim
Comments: 19 pages, 5 figures
Subjects: Learning (cs.LG)
Finding the main product of a chemical reaction is one of the important
problems of organic chemistry. This paper describes a method of applying a
neural machine translation model to the prediction of organic chemical
reactions. In order to translate ‘reactants and reagents’ to ‘products’, a
gated recurrent unit based sequence-to-sequence model and a parser to generate
input tokens for model from reaction SMILES strings were built. Training sets
are composed of reactions from the patent databases, and reactions manually
generated applying the elementary reactions in an organic chemistry textbook of
Wade. The trained models were tested by examples and problems in the textbook.
The prediction process does not need manual encoding of rules (e.g., SMARTS
transformations) to predict products, hence it only needs sufficient training
reaction sets to learn new types of reactions.
Timothy A. Mann, Hugo Penedones, Shie Mannor, Todd Hester
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Temporal Difference learning or TD((lambda)) is a fundamental algorithm in
the field of reinforcement learning. However, setting TD’s (lambda) parameter,
which controls the timescale of TD updates, is generally left up to the
practitioner. We formalize the (lambda) selection problem as a bias-variance
trade-off where the solution is the value of (lambda) that leads to the
smallest Mean Squared Value Error (MSVE). To solve this trade-off we suggest
applying Leave-One-Trajectory-Out Cross-Validation (LOTO-CV) to search the
space of (lambda) values. Unfortunately, this approach is too computationally
expensive for most practical applications. For Least Squares TD (LSTD) we show
that LOTO-CV can be implemented efficiently to automatically tune (lambda) and
apply function optimization methods to efficiently search the space of
(lambda) values. The resulting algorithm, ALLSTD, is parameter free and our
experiments demonstrate that ALLSTD is significantly computationally faster
than the na”{i}ve LOTO-CV implementation while achieving similar performance.
Shuai Li, Kui Jia, Xiaogang Wang
Subjects: Learning (cs.LG)
The recent successful deep neural networks are largely trained in a
supervised manner. It {it associates} complex patterns of input samples with
neurons in the last layer, which form representations of {it concepts}. In
spite of their successes, the properties of complex patterns associated a
learned concept remain elusive. In this work, by analyzing how neurons are
associated with concepts in supervised networks, we hypothesize that with
proper priors to regulate learning, neural networks can automatically associate
neurons in the intermediate layers with concepts that are aligned with real
world concepts, when trained only with labels that associate concepts with top
level neurons, which is a plausible way for unsupervised learning. We develop a
prior to verify the hypothesis and experimentally find the proposed priors help
neural networks automatically learn both basic physical concepts at the lower
layers, e.g., rotation of filters, and highly semantic concepts at the higher
layers, e.g., fine-grained categories of an entry-level category.
Hongyuan Mei, Jason Eisner
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Many events occur in the world. Some event types are stochastically excited
or inhibited—in the sense of having their probabilities elevated or
decreased—by patterns in the sequence of previous events. Discovering such
patterns can help us predict which type of event will happen next and when.
Learning such structure should benefit various applications, including medical
prognosis, consumer behavior, and social media activity prediction. We propose
to model streams of discrete events in continuous time, by constructing a
neurally self-modulating multivariate point process. This generative model
allows past events to influence the future in complex ways, by conditioning
future event intensities on the hidden state of a recurrent neural network that
has consumed the stream of past events. We evaluate our model on multiple
datasets and show that it significantly outperforms other strong baselines.
Jason Hartford, Greg Lewis, Kevin Leyton-Brown, Matt Taddy
Subjects: Applications (stat.AP); Learning (cs.LG); Machine Learning (stat.ML)
We are in the middle of a remarkable rise in the use and capability of
artificial intelligence. Much of this growth has been fueled by the success of
deep learning architectures: models that map from observables to outputs via
multiple layers of latent representations. These deep learning algorithms are
effective tools for unstructured prediction, and they can be combined in AI
systems to solve complex automated reasoning problems. This paper provides a
recipe for combining ML algorithms to solve for causal effects in the presence
of instrumental variables — sources of treatment randomization that are
conditionally independent from the response. We show that a flexible IV
specification resolves into two prediction tasks that can be solved with deep
neural nets: a first-stage network for treatment prediction and a second-stage
network whose loss function involves integration over the conditional treatment
distribution. This Deep IV framework imposes some specific structure on the
stochastic gradient descent routine used for training, but it is general enough
that we can take advantage of off-the-shelf ML capabilities and avoid extensive
algorithm customization. We outline how to obtain out-of-sample causal
validation in order to avoid over-fit. We also introduce schemes for both
Bayesian and frequentist inference: the former via a novel adaptation of
dropout training, and the latter via a data splitting routine.
Frédéric Chazal (DATASHAPE), Ilaria Giulini (DATASHAPE), Bertrand Michel (LSTA)
Journal-ref: 30th Conference on Neural Information Processing Systems (NIPS
2016), Dec 2016, Barcelona, Spain. 30th Conference on Neural Information
Processing Systems (NIPS 2016), Barcelona, Spain., 2016
Subjects: Computational Geometry (cs.CG); Learning (cs.LG); Statistics Theory (math.ST)
Approximations of Laplace-Beltrami operators on manifolds through graph
Lapla-cians have become popular tools in data analysis and machine learning.
These discretized operators usually depend on bandwidth parameters whose tuning
remains a theoretical and practical problem. In this paper, we address this
problem for the unnormalized graph Laplacian by establishing an oracle
inequality that opens the door to a well-founded data-driven procedure for the
bandwidth selection. Our approach relies on recent results by Lacour and
Massart [LM15] on the so-called Lepski’s method.
Erik P Hoel
Comments: 15 pages, 6 figures
Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI)
The causal structure of any system can be analyzed at a multitude of spatial
and temporal scales. It has long been thought that while higher scale (macro)
descriptions of causal structure may be useful to observers, they are at best a
compressed description and at worse leave out critical information. However,
recent research applying information theory to causal analysis has shown that
the causal structure of some systems can actually come into focus (be more
informative) at a macroscale (Hoel et al. 2013). That is, a macro model of a
system (a map) can be more informative than a fully detailed model of the
system (the territory). This has been called causal emergence. While causal
emergence may at first glance seem counterintuitive, this paper grounds the
phenomenon in a classic concept from information theory: Shannon’s discovery of
the channel capacity. I argue that systems have a particular causal capacity,
and that different causal models of those systems take advantage of that
capacity to various degrees. For some systems, only macroscale causal models
use the full causal capacity. Such macroscale causal models can either be
coarse-grains, or may leave variables and states out of the model (exogenous)
in various ways, which can improve the model’s efficacy and its informativeness
via the same mathematical principles of how error-correcting codes take
advantage of an information channel’s capacity. As model choice increase, the
causal capacity of a system approaches the channel capacity. Ultimately, this
provides a general framework for understanding how the causal structure of some
systems cannot be fully captured by even the most detailed microscopic model.
Kiryung Lee, Yanjun Li, Kyong Hwan Jin, Jong Chul Ye
Comments: 41 pages, 3 figures
Subjects: Information Theory (cs.IT)
Compressed sensing provided a new sampling paradigm for sparse signals.
Remarkably, it has been shown that practical algorithms provide robust recovery
from noisy linear measurements acquired at a near optimal sample rate. In
real-world applications, a signal of interest is typically sparse not in the
canonical basis but in a certain transform domain, such as the wavelet or the
finite difference domain. The theory of compressed sensing was extended to this
analysis sparsity model. However, existing theoretic results for the analysis
sparsity model have limitations in the following senses: the derivation relies
on particular choices of sensing matrix and sparsifying transform; sample
complexity is significantly suboptimal compared to the analogous result in
compressed sensing with the canonical sparsity model. In this paper, we propose
a novel unified theory for robust recovery of sparse signals in a general
transform domain using convex relaxation methods that overcomes the
aforementioned limitations. Our results do not rely on any specific choice of
sensing matrix and sparsifying transform and provide near optimal sample
complexity when the transforms satisfy certain conditions. In particular, our
theory extends existing near optimal sample complexity results to a wider class
of sensing matrix and sparsifying transform. We also provide extensions of our
results to the scenarios where the atoms in the transform has varying
incoherence parameters and the unknown signal exhibits a structured sparsity
pattern. Numerical results show that the proposed variable density sampling
provides superior recovery performance over previous works, which is aligned
with the presented theory.
Xiliang Luo, Penghao Cai, Xiaoyu Zhang, Die Hu, Cong Shen
Comments: 13 pages, 8 figures
Subjects: Information Theory (cs.IT)
Unlike the time-division duplexing (TDD) systems, the downlink (DL) and
uplink (UL) channels are not reciprocal anymore in the case of
frequency-division duplexing (FDD). However, some long-term parameters, e.g.
the time delays and angles of arrival (AoAs) of the channel paths, still enjoy
reciprocity. In this paper, by efficiently exploiting the aforementioned
limited reciprocity, we address the DL channel state information (CSI) feedback
in a practical wideband massive multiple-input multiple-output (MIMO) system
operating in the FDD mode. With orthogonal frequency-division multiplexing
(OFDM) waveform and assuming frequency-selective fading channels, we propose a
scalable framework for the DL pilots design, DL CSI acquisition, and the
corresponding CSI feedback in the UL. In particular, the base station (BS) can
transmit the FFT-based pilots with the carefully-selected phase shifts. Then
the user can rely on the so-called time-domain aggregate channel (TAC) to
derive the feedback of reduced imensionality according to either its own
knowledge about the statistics of the DL channels or the instruction from the
serving BS. We demonstrate that each user can just feed back one scalar number
per DL channel path for the BS to recover the DL CSIs. Comprehensive numerical
results further corroborate our designs.
Milad Rezaee, Mahtab Mirmohseni, Vaneet Aggarwal (Senior Member, IEEE), Mohammad Reza Aref
Comments: paper has been submitted to IEEE Transactions on Wireless Communications
Subjects: Information Theory (cs.IT)
In this paper, we consider a multi-hop energy harvesting (EH) communication
system in a full-duplex mode, where arrival data and harvested energy curves in
the source and the relays are modeled as general functions. This model includes
the EH system with discrete arrival processes as a special case. We investigate
the throughput maximization problem considering minimum utilized energy in the
source and relays and find the optimal offline algorithm. We show that the
optimal solution of the two-hop transmission problem have three main steps: (i)
Solving a point-to-point throughput maximization problem at the source; (ii)
Solving a point-to-point throughput maximization problem at the relay (after
applying the solution of first step as the input of this second problem); (iii)
Minimizing utilized energy in the source. In addition, we show that how the
optimal algorithm for the completion time minimization problem can be derived
from the proposed algorithm for throughput maximization problem. Also, for the
throughput maximization problem, we propose an online algorithm and show that
it is more efficient than the benchmark one (which is a direct application of
an existing point-to-point online algorithm to the multi-hop system).
Jiejing Wen, Minghui Yang, Keqin Feng
Comments: 6 pages
Subjects: Information Theory (cs.IT)
The notion of strong external difference family (SEDF) in a finite abelian
group ((G,+)) is raised by M. B. Paterson and D. R. Stinson [5] in 2016 and
motivated by its application in communication theory to construct (R)-optimal
regular algebraic manipulation detection code. A series of
((n,m,k,lambda))-SEDF’s have been constructed in [5, 4, 2, 1] with (m=2). In
this note we present an example of (243, 11, 22, 20)-SEDF in finite field
(mathbb{F}_q) ((q=3^5=243).) This is an answer for the following problem
raised in [5] and continuously asked in [4, 2, 1]: if there exists an
((n,m,k,lambda))-SEDF for (mgeq 5).
Gaopeng Jian
Subjects: Information Theory (cs.IT)
The generalized Hamming weights of a linear code have been extensively
studied since Wei first use them to characterize the cryptography performance
of a linear code over the wire-tap channel of type II. In this paper, we
investigate the generalized Hamming weights of three classes of linear codes
constructed through defining sets and determine them partly for some cases.
Particularly, in the semiprimitive case we solve an problem left in Yang et al.
(IEEE Trans. Inf. Theory 61(9): 4905–4913, 2015).
Lele Wang, Ofer Shayevitz
Subjects: Information Theory (cs.IT); Discrete Mathematics (cs.DM); Combinatorics (math.CO)
We introduce the notion of information ratio ( ext{Ir}(H/G)) between two
(simple, undirected) graphs (G) and (H), defined as the supremum of ratios
(k/n) such that there exists a mapping between the strong products (G^k) to
(H^n) that preserves non-adjacency. Operationally speaking, the information
ratio is the maximal number of source symbols per channel use that can be
reliably sent over a channel with a confusion graph (H), where reliability is
measured w.r.t. a source confusion graph (G). Various results are provided,
including in particular lower and upper bounds on ( ext{Ir}(H/G)) in terms of
different graph properties, inequalities and identities for behavior under
strong product and disjoint union, relations to graph cores, and notions of
graph criticality. Informally speaking, ( ext{Ir}(H/G)) can be interpreted as
a measure of similarity between (G) and (H). We make this notion precise by
introducing the concept of information equivalence between graphs, a more
quantitative version of homomorphic equivalence. We then describe a natural
partial ordering over the space of information equivalence classes, and endow
it with a suitable metric structure that is contractive under the strong
product. Various examples and open problems are discussed.
Ricky X. F. Chen
Comments: A brief introduction to Shannon’s information theory in a combinatorial flavor, completed for quite a while. Comments are welcome
Subjects: Information Theory (cs.IT); Combinatorics (math.CO)
This is an introduction to Shannon’s Information Theory. It is more like a
long note so that it is by no means a complete survey or completely
mathematically rigorous. It covers two main topics: entropy and channel
capacity. All related concepts will be developed in a totally combinatorial
flavor. Some issues usually not addressed in the literature will be discussed
here as well. Hopefully, it will be interesting to those interested in
Information Theory.
Yu Liu, Ammar Ghazal, Cheng-Xiang Wang, Xiaohu Ge, Yang Yang, Yapei Zhang
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
The rapid developments of high-speed trains (HSTs) introduce new challenges
to HST wireless communication systems. Realistic HST channel models play a
critical role in designing and evaluating HST communication systems. Due to the
length limitation, bounding of tunnel itself, and waveguide effect, channel
characteristics in tunnel scenarios are very different from those in other HST
scenarios. Therefore, accurate tunnel channel models considering both
large-scale and small-scale fading characteristics are essential for HST
communication systems. Moreover, certain characteristics of tunnel channels
have not been investigated sufficiently. This article provides a comprehensive
review of the measurement campaigns in tunnels and presents some tunnel channel
models using various modeling methods. Finally, future directions in HST tunnel
channel measurements and modeling are discussed.