IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Mon, 2 Jan 2017

    我爱机器学习(52ml.net)发表于 2017-01-02 00:00:00
    love 0

    Neural and Evolutionary Computing

    Adult Content Recognition from Images Using a Mixture of Convolutional Neural Networks

    Mundher Al-Shabi, Tee Connie, Andrew Beng Jin Teoh
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    With rapid development of the Internet, the web contents become huge. Most of
    the websites are publicly available and anyone can access the contents
    everywhere such as workplace, home and even schools. Nev-ertheless, not all the
    web contents are appropriate for all users, especially children. An example of
    these contents is pornography images which should be restricted to certain age
    group. Besides, these images are not safe for work (NSFW) in which employees
    should not be seen accessing such contents. Recently, convolutional neural
    networks have been successfully applied to many computer vision problems.
    Inspired by these successes, we propose a mixture of convolutional neural
    networks for adult content recognition. Unlike other works, our method is
    formulated on a weighted sum of multiple deep neural network models. The
    weights of each CNN models are expressed as a linear regression problem learnt
    using Ordinary Least Squares (OLS). Experimental results demonstrate that the
    proposed model outperforms both single CNN model and the average sum of CNN
    models in adult content recognition.


    Computer Vision and Pattern Recognition

    A Unified Tensor-based Active Appearance Face Model

    Zhen-Hua Feng, Josef Kittler, William Christmas, Xiao-Jun Wu
    Comments: 15 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Appearance variations result in many difficulties in face image analysis. To
    deal with this challenge, we present a Unified Tensor-based Active Appearance
    Model (UT-AAM) for jointly modelling the geometry and texture information of 2D
    faces. In contrast with the classical Tensor-based AAM (T-AAM), the proposed
    UT-AAM has four advantages: First, for each type of face information, namely
    shape and texture, we construct a tensor model capturing all relevant
    appearance variations. This unified tensor model contrasts with the
    variation-specific models of T-AAM. Second, a strategy for dealing with
    self-occluded faces is proposed to obtain consistent shape and texture
    representations of faces across large pose variations. Third, our UT-AAM is
    capable of constructing the model from an incomplete training dataset, using
    tensor completion methods. Last, we use an effective cascaded-regression-based
    method for UT-AAM fitting. With these improvements, the utility of UT-AAM in
    practice is considerably enhanced in comparison with the classical T-AAM. As an
    example, we demonstrate the improvements in training facial landmark detectors
    through the use of UT-AAM to synthesise a large number of virtual samples.
    Experimental results obtained using the Multi-PIE and 300-W face datasets
    demonstrate the merits of the proposed approach.

    A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

    Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
    Comments: 11 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Referring expressions are natural language constructions used to identify
    particular objects within a scene. In this paper, we propose a unified
    framework for the tasks of referring expression comprehension and generation.
    Our model is composed of three modules: speaker, listener, and reinforcer. The
    speaker generates referring expressions, the listener comprehends referring
    expressions, and the reinforcer introduces a reward function to guide sampling
    of more discriminative expressions. The listener-speaker modules are trained
    jointly in an end-to-end learning framework, allowing the modules to be aware
    of one another during learning while also benefiting from the discriminative
    reinforcer’s feedback. We demonstrate that this unified framework and training
    achieves state-of-the-art results for both comprehension and generation on
    three referring expression datasets. Project and demo page:
    this https URL

    Memory Efficient Multi-Scale Line Detector Architecture for Retinal Blood Vessel Segmentation

    Hamza Bendaoudi, Farida Cheriet, J. M. Pierre Langlois
    Comments: This paper was accepted and presented at Conference on Design and Architectures for Signal and Image Processing – DASIP 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Hardware Architecture (cs.AR)

    This paper presents a memory efficient architecture that implements the
    Multi-Scale Line Detector (MSLD) algorithm for real-time retinal blood vessel
    detection in fundus images on a Zynq FPGA. This implementation benefits from
    the FPGA parallelism to drastically reduce the memory requirements of the MSLD
    from two images to a few values. The architecture is optimized in terms of
    resource utilization by reusing the computations and optimizing the bit-width.
    The throughput is increased by designing fully pipelined functional units. The
    architecture is capable of achieving a comparable accuracy to its software
    implementation but 70x faster for low resolution images. For high resolution
    images, it achieves an acceleration by a factor of 323x.

    Feedback Networks

    Amir R. Zamir, Te-Lin Wu, Lin Sun, William Shen, Jitendra Malik, Silvio Savarese
    Comments: See a video describing the method at this https URL and the website this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Currently, the most successful learning models in computer vision are based
    on learning successive representations followed by a decision layer. This is
    usually actualized through feedforward multilayer neural networks, e.g.
    ConvNets, where each layer forms one of such successive representations.
    However, an alternative that can achieve the same goal is a feedback based
    approach in which the representation is formed in an iterative manner based on
    a feedback received from previous iteration’s output.

    We establish that a feedback based approach has several fundamental
    advantages over feedforward: it enables making early predictions at the query
    time, its output naturally conforms to a hierarchical structure in the label
    space (e.g. a taxonomy), and it provides a new basis for Curriculum Learning.
    We observe that feedback networks develop a considerably different
    representation compared to feedforward counterparts, in line with the
    aforementioned advantages. We put forth a general feedback based learning
    architecture with the endpoint results on par or better than existing
    feedforward networks with the addition of the above advantages. We also
    investigate several mechanisms in feedback architectures (e.g. skip connections
    in time) and design choices (e.g. feedback length). We hope this study offers
    new perspectives in quest for more natural and practical learning models.

    Shape Estimation from Defocus Cue for Microscopy Images via Belief Propagation

    Arnav Bhavsar
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In recent years, the usefulness of 3D shape estimation is being realized in
    microscopic or close-range imaging, as the 3D information can further be used
    in various applications. Due to limited depth of field at such small distances,
    the defocus blur induced in images can provide information about the 3D shape
    of the object. The task of `shape from defocus’ (SFD), involves the problem of
    estimating good quality 3D shape estimates from images with depth-dependent
    defocus blur. While the research area of SFD is quite well-established, the
    approaches have largely demonstrated results on objects with bulk/coarse shape
    variation. However, in many cases, objects studied under microscopes often
    involve fine/detailed structures, which have not been explicitly considered in
    most methods. In addition, given that, in recent years, large data volumes are
    typically associated with microscopy related applications, it is also important
    for such SFD methods to be efficient. In this work, we provide an indication of
    the usefulness of the Belief Propagation (BP) approach in addressing these
    concerns for SFD. BP has been known to be an efficient combinatorial
    optimization approach, and has been empirically demonstrated to yield good
    quality solutions in low-level vision problems such as image restoration,
    stereo disparity estimation etc. For exploiting the efficiency of BP in SFD, we
    assume local space-invariance of the defocus blur, which enables the
    application of BP in a straightforward manner. Even with such an assumption,
    the ability of BP to provide good quality solutions while using non-convex
    priors, reflects in yielding plausible shape estimates in presence of fine
    structures on the objects under microscopy imaging.

    Action Recognition Based on Joint Trajectory Maps with Convolutional Neural Networks

    Pichao Wang, Wanqing Li, Chuankun Li, Yonghong Hou
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Convolutional Neural Networks (ConvNets) have recently shown promising
    performance in many computer vision tasks, especially image-based recognition.
    How to effectively apply ConvNets to sequence-based data is still an open
    problem. This paper proposes an effective yet simple method to represent
    spatio-temporal information carried in (3D) skeleton sequences into three (2D)
    images by encoding the joint trajectories and their dynamics into color
    distribution in the images, referred to as Joint Trajectory Maps (JTM), and
    adopts ConvNets to learn the discriminative features for human action
    recognition. Such an image-based representation enables us to fine-tune
    existing ConvNets models for the classification of skeleton sequences without
    training the networks afresh. The three JTMs are generated in three orthogonal
    planes and provide complimentary information to each other. The final
    recognition is further improved through multiply score fusion of the three
    JTMs. The proposed method was evaluated on four public benchmark datasets, the
    large NTU RGB+D Dataset, MSRC-12 Kinect Gesture Dataset (MSRC-12), G3D Dataset
    and UTD Multimodal Human Action Dataset (UTD-MHAD) and achieved the
    state-of-the-art results.

    Rotation equivariant vector field networks

    Diego Marcos, Michele Volpi, Nikos Komodakis, Devis Tuia
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a method to encode rotation equivariance or invariance into
    convolutional neural networks (CNNs). Each convolutional filter is applied with
    several orientations and returns a vector field that represents the magnitude
    and angle of the highest scoring rotation at the given spatial location. To
    propagate information about the main orientation of the different features to
    each layer in the network, we propose an enriched orientation pooling, i.e. max
    and argmax operators over the orientation space, allowing to keep the
    dimensionality of the feature maps low and to propagate only useful
    information. We name this approach RotEqNet. We apply RotEqNet to three
    datasets: first, a rotation invariant classification problem, the MNIST-rot
    benchmark, in which we improve over the state-of-the-art results. Then, a
    neuron membrane segmentation benchmark, where we show that RotEqNet can be
    applied successfully to obtain equivariance to rotation with a simple fully
    convolutional architecture. Finally, we improve significantly the
    state-of-the-art on the problem of estimating cars’ absolute orientation in
    aerial images, a problem where the output is required to be covariant with
    respect to the object’s orientation.

    Deep Learning Logo Detection with Data Expansion by Synthesising Context

    Hang Su, Xiatian Zhu, Shaogang Gong
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Logo detection in unconstrained images is challenging, particularly when only
    very sparse labelled training images are accessible due to high labelling
    costs. In this work, we describe a model training image synthesising method
    capable of improving significantly logo detection performance when only a
    handful of (e.g., 10) labelled training images captured in realistic context
    are available, avoiding extensive manual labelling costs. Specifically, we
    design a novel algorithm for generating Synthetic Context Logo (SCL) training
    images to increase model robustness against unknown background clutters,
    resulting in superior logo detection performance. For benchmarking model
    performance, we introduce a new logo detection dataset TopLogo-10 collected
    from top 10 most popular clothing/wearable brandname logos captured in rich
    visual context. Extensive comparisons show the advantages of our proposed SCL
    model over the state-of-the-art alternatives for logo detection using two
    real-world logo benchmark datasets: FlickrLogo-32 and our new TopLogo-10.

    Adult Content Recognition from Images Using a Mixture of Convolutional Neural Networks

    Mundher Al-Shabi, Tee Connie, Andrew Beng Jin Teoh
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    With rapid development of the Internet, the web contents become huge. Most of
    the websites are publicly available and anyone can access the contents
    everywhere such as workplace, home and even schools. Nev-ertheless, not all the
    web contents are appropriate for all users, especially children. An example of
    these contents is pornography images which should be restricted to certain age
    group. Besides, these images are not safe for work (NSFW) in which employees
    should not be seen accessing such contents. Recently, convolutional neural
    networks have been successfully applied to many computer vision problems.
    Inspired by these successes, we propose a mixture of convolutional neural
    networks for adult content recognition. Unlike other works, our method is
    formulated on a weighted sum of multiple deep neural network models. The
    weights of each CNN models are expressed as a linear regression problem learnt
    using Ordinary Least Squares (OLS). Experimental results demonstrate that the
    proposed model outperforms both single CNN model and the average sum of CNN
    models in adult content recognition.

    Automatic labeling of molecular biomarkers of whole slide immunohistochemistry images using fully convolutional networks

    Fahime Sheikhzadeh, Martial Guillaud, Rabab K. Ward
    Subjects: Tissues and Organs (q-bio.TO); Computer Vision and Pattern Recognition (cs.CV)

    This paper addresses the problem of quantifying biomarkers in multi-stained
    tissues, based on color and spatial information. A deep learning based method
    that can automatically localize and quantify the cells expressing biomarker(s)
    in a whole slide image is proposed. The deep learning network is a fully
    convolutional network (FCN) whose input is the true RGB color image of a tissue
    and output is a map of the different biomarkers. The FCN relies on a
    convolutional neural network (CNN) that classifies each cell separately
    according to the biomarker it expresses. In this study, images of
    immunohistochemistry (IHC) stained slides were collected and used. More than
    4,500 RGB images of cells were manually labeled based on the expressing
    biomarkers. The labeled cell images were used to train the CNN (obtaining an
    accuracy of 92% in a test set). The trained CNN is then extended to an FCN that
    generates a map of all biomarkers in the whole slide image acquired by the
    scanner (instead of classifying every cell image). To evaluate our method, we
    manually labeled all nuclei expressing different biomarkers in two whole slide
    images and used theses as the ground truth. Our proposed method for
    immunohistochemical analysis compares well with the manual labeling by humans
    (average F-score of 0.96).


    Artificial Intelligence

    Fuzzy Constraints Linear Discriminant Analysis

    Hamid Reza Hassanzadeh, Hadi Sadoghi Yazdi, Abedin Vahedian
    Journal-ref: 3rd Iranian Joint Congress on Intelligent Systems and Fuzzy
    Systems, 2009
    Subjects: Artificial Intelligence (cs.AI)

    In this paper we introduce a fuzzy constraint linear discriminant analysis
    (FC-LDA). The FC-LDA tries to minimize misclassification error based on
    modified perceptron criterion that benefits handling the uncertainty near the
    decision boundary by means of a fuzzy linear programming approach with fuzzy
    resources. The method proposed has low computational complexity because of its
    linear characteristics and the ability to deal with noisy data with different
    degrees of tolerance. Obtained results verify the success of the algorithm when
    dealing with different problems. Comparing FC-LDA and LDA shows superiority in
    classification task.

    PrASP Report

    Matthias Nickles
    Comments: Technical Report
    Subjects: Artificial Intelligence (cs.AI)

    This technical report describes the usage, syntax, semantics and core
    algorithms of the probabilistic inductive logic programming framework PrASP.
    PrASP is a research software which integrates non-monotonic reasoning based on
    Answer Set Programming (ASP), probabilistic inference and parameter learning.
    In contrast to traditional approaches to Probabilistic (Inductive) Logic
    Programming, our framework imposes only little restrictions on probabilistic
    logic programs. In particular, PrASP allows for ASP as well as First-Order
    Logic syntax, and for the annotation of formulas with point probabilities as
    well as interval probabilities. A range of widely configurable inference
    algorithms can be combined in a pipeline-like fashion, in order to cover a
    variety of use cases.

    Curiosity-Aware Bargaining

    Cédric Buron (LIP6), Sylvain Ductor (LIP6), Zahia Guessoum (LIP6, CRESTIC)
    Journal-ref: STAIRS 2016, Aug 2016, The Hague, Netherlands. IOS Press, 284, pp.
    27 — 38, Frontiers in Artificial Intelligence and Applications
    Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

    Opponent modeling consists in modeling the strategy or preferences of an
    agent thanks to the data it provides. In the context of automated negotiation
    and with machine learning, it can result in an advantage so overwhelming that
    it may restrain some casual agents to be part of the bargaining process. We
    qualify as “curious” an agent driven by the desire of negotiating in order to
    collect information and improve its opponent model. However, neither
    curiosity-based rational-ity nor curiosity-robust protocol have been studied in
    automatic negotiation. In this paper, we rely on mechanism design to propose
    three extensions of the standard bargaining protocol that limit information
    leak. Those extensions are supported by an enhanced rationality model, that
    considers the exchanged information. Also, they are theoretically analyzed and
    experimentally evaluated.

    When the map is better than the territory

    Erik P Hoel
    Comments: 15 pages, 6 figures
    Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI)

    The causal structure of any system can be analyzed at a multitude of spatial
    and temporal scales. It has long been thought that while higher scale (macro)
    descriptions of causal structure may be useful to observers, they are at best a
    compressed description and at worse leave out critical information. However,
    recent research applying information theory to causal analysis has shown that
    the causal structure of some systems can actually come into focus (be more
    informative) at a macroscale (Hoel et al. 2013). That is, a macro model of a
    system (a map) can be more informative than a fully detailed model of the
    system (the territory). This has been called causal emergence. While causal
    emergence may at first glance seem counterintuitive, this paper grounds the
    phenomenon in a classic concept from information theory: Shannon’s discovery of
    the channel capacity. I argue that systems have a particular causal capacity,
    and that different causal models of those systems take advantage of that
    capacity to various degrees. For some systems, only macroscale causal models
    use the full causal capacity. Such macroscale causal models can either be
    coarse-grains, or may leave variables and states out of the model (exogenous)
    in various ways, which can improve the model’s efficacy and its informativeness
    via the same mathematical principles of how error-correcting codes take
    advantage of an information channel’s capacity. As model choice increase, the
    causal capacity of a system approaches the channel capacity. Ultimately, this
    provides a general framework for understanding how the causal structure of some
    systems cannot be fully captured by even the most detailed microscopic model.

    A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

    Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
    Comments: 11 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Referring expressions are natural language constructions used to identify
    particular objects within a scene. In this paper, we propose a unified
    framework for the tasks of referring expression comprehension and generation.
    Our model is composed of three modules: speaker, listener, and reinforcer. The
    speaker generates referring expressions, the listener comprehends referring
    expressions, and the reinforcer introduces a reward function to guide sampling
    of more discriminative expressions. The listener-speaker modules are trained
    jointly in an end-to-end learning framework, allowing the modules to be aware
    of one another during learning while also benefiting from the discriminative
    reinforcer’s feedback. We demonstrate that this unified framework and training
    achieves state-of-the-art results for both comprehension and generation on
    three referring expression datasets. Project and demo page:
    this https URL

    Adaptive Lambda Least-Squares Temporal Difference Learning

    Timothy A. Mann, Hugo Penedones, Shie Mannor, Todd Hester
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Temporal Difference learning or TD((lambda)) is a fundamental algorithm in
    the field of reinforcement learning. However, setting TD’s (lambda) parameter,
    which controls the timescale of TD updates, is generally left up to the
    practitioner. We formalize the (lambda) selection problem as a bias-variance
    trade-off where the solution is the value of (lambda) that leads to the
    smallest Mean Squared Value Error (MSVE). To solve this trade-off we suggest
    applying Leave-One-Trajectory-Out Cross-Validation (LOTO-CV) to search the
    space of (lambda) values. Unfortunately, this approach is too computationally
    expensive for most practical applications. For Least Squares TD (LSTD) we show
    that LOTO-CV can be implemented efficiently to automatically tune (lambda) and
    apply function optimization methods to efficiently search the space of
    (lambda) values. The resulting algorithm, ALLSTD, is parameter free and our
    experiments demonstrate that ALLSTD is significantly computationally faster
    than the na”{i}ve LOTO-CV implementation while achieving similar performance.

    Intelligent information extraction based on artificial neural network

    Ahlam Ansari, Moonish Maknojia, Altamash Shaikh
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Question Answering System (QAS) is used for information retrieval and natural
    language processing (NLP) to reduce human effort. There are numerous QAS based
    on the user documents present today, but they all are limited to providing
    objective answers and process simple questions only. Complex questions cannot
    be answered by the existing QAS, as they require interpretation of the current
    and old data as well as the question asked by the user. The above limitations
    can be overcome by using deep cases and neural network. Hence we propose a
    modified QAS in which we create a deep artificial neural network with
    associative memory from text documents. The modified QAS processes the contents
    of the text document provided to it and find the answer to even complex
    questions in the documents.


    Information Retrieval

    Automatic Data Deformation Analysis on Evolving Folksonomy Driven Environment

    Massimiliano Dal Mas
    Comments: 8 pages, 3 figures; 2 tables; for details see: this http URL
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)

    The Folksodriven framework makes it possible for data scientists to define an
    ontology environment where searching for buried patterns that have some kind of
    predictive power to build predictive models more effectively. It accomplishes
    this through an abstractions that isolate parameters of the predictive modeling
    process searching for patterns and designing the feature set, too. To reflect
    the evolving knowledge, this paper considers ontologies based on folksonomies
    according to a new concept structure called “Folksodriven” to represent
    folksonomies. So, the studies on the transformational regulation of the
    Folksodriven tags are regarded to be important for adaptive folksonomies
    classifications in an evolving environment used by Intelligent Systems to
    represent the knowledge sharing. Folksodriven tags are used to categorize
    salient data points so they can be fed to a machine-learning system and
    “featurizing” the data.

    PAMPO: using pattern matching and pos-tagging for effective Named Entities recognition in Portuguese

    Conceição Rocha, Alípio Jorge, Roberta Sionara, Paula Brito, Carlos Pimenta, Solange Rezende
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    This paper deals with the entity extraction task (named entity recognition)
    of a text mining process that aims at unveiling non-trivial semantic
    structures, such as relationships and interaction between entities or
    communities. In this paper we present a simple and efficient named entity
    extraction algorithm. The method, named PAMPO (PAttern Matching and POs tagging
    based algorithm for NER), relies on flexible pattern matching, part-of-speech
    tagging and lexical-based rules. It was developed to process texts written in
    Portuguese, however it is potentially applicable to other languages as well.

    We compare our approach with current alternatives that support Named Entity
    Recognition (NER) for content written in Portuguese. These are Alchemy, Zemanta
    and Rembrandt. Evaluation of the efficacy of the entity extraction method on
    several texts written in Portuguese indicates a considerable improvement on
    (recall) and (F_1) measures.

    FolksoDrivenCloud: an annotation and process application for social collaborative networking

    Massimiliano Dal Mas
    Comments: 9 pages, 3 figures; for details see: this http URL
    Subjects: Computers and Society (cs.CY); Information Retrieval (cs.IR); Software Engineering (cs.SE); Social and Information Networks (cs.SI)

    In this paper we present the FolksoDriven Cloud (FDC) built on Cloud and on
    Semantic technologies. Cloud computing has emerged in these recent years as the
    new paradigm for the provision of on-demand distributed computing resources.
    Semantic Web can be used for relationship between different data and
    descriptions of services to annotate provenance of repositories on ontologies.
    The FDC service is composed of a back-end which submits and monitors the
    documents, and a user front-end which allows users to schedule on-demand
    operations and to watch the progress of running processes. The impact of the
    proposed method is illustrated on a user since its inception.


    Computation and Language

    Intelligent information extraction based on artificial neural network

    Ahlam Ansari, Moonish Maknojia, Altamash Shaikh
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Question Answering System (QAS) is used for information retrieval and natural
    language processing (NLP) to reduce human effort. There are numerous QAS based
    on the user documents present today, but they all are limited to providing
    objective answers and process simple questions only. Complex questions cannot
    be answered by the existing QAS, as they require interpretation of the current
    and old data as well as the question asked by the user. The above limitations
    can be overcome by using deep cases and neural network. Hence we propose a
    modified QAS in which we create a deep artificial neural network with
    associative memory from text documents. The modified QAS processes the contents
    of the text document provided to it and find the answer to even complex
    questions in the documents.

    Automatic Data Deformation Analysis on Evolving Folksonomy Driven Environment

    Massimiliano Dal Mas
    Comments: 8 pages, 3 figures; 2 tables; for details see: this http URL
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)

    The Folksodriven framework makes it possible for data scientists to define an
    ontology environment where searching for buried patterns that have some kind of
    predictive power to build predictive models more effectively. It accomplishes
    this through an abstractions that isolate parameters of the predictive modeling
    process searching for patterns and designing the feature set, too. To reflect
    the evolving knowledge, this paper considers ontologies based on folksonomies
    according to a new concept structure called “Folksodriven” to represent
    folksonomies. So, the studies on the transformational regulation of the
    Folksodriven tags are regarded to be important for adaptive folksonomies
    classifications in an evolving environment used by Intelligent Systems to
    represent the knowledge sharing. Folksodriven tags are used to categorize
    salient data points so they can be fed to a machine-learning system and
    “featurizing” the data.

    A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

    Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
    Comments: 11 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Referring expressions are natural language constructions used to identify
    particular objects within a scene. In this paper, we propose a unified
    framework for the tasks of referring expression comprehension and generation.
    Our model is composed of three modules: speaker, listener, and reinforcer. The
    speaker generates referring expressions, the listener comprehends referring
    expressions, and the reinforcer introduces a reward function to guide sampling
    of more discriminative expressions. The listener-speaker modules are trained
    jointly in an end-to-end learning framework, allowing the modules to be aware
    of one another during learning while also benefiting from the discriminative
    reinforcer’s feedback. We demonstrate that this unified framework and training
    achieves state-of-the-art results for both comprehension and generation on
    three referring expression datasets. Project and demo page:
    this https URL

    PAMPO: using pattern matching and pos-tagging for effective Named Entities recognition in Portuguese

    Conceição Rocha, Alípio Jorge, Roberta Sionara, Paula Brito, Carlos Pimenta, Solange Rezende
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    This paper deals with the entity extraction task (named entity recognition)
    of a text mining process that aims at unveiling non-trivial semantic
    structures, such as relationships and interaction between entities or
    communities. In this paper we present a simple and efficient named entity
    extraction algorithm. The method, named PAMPO (PAttern Matching and POs tagging
    based algorithm for NER), relies on flexible pattern matching, part-of-speech
    tagging and lexical-based rules. It was developed to process texts written in
    Portuguese, however it is potentially applicable to other languages as well.

    We compare our approach with current alternatives that support Named Entity
    Recognition (NER) for content written in Portuguese. These are Alchemy, Zemanta
    and Rembrandt. Evaluation of the efficacy of the entity extraction method on
    several texts written in Portuguese indicates a considerable improvement on
    (recall) and (F_1) measures.


    Distributed, Parallel, and Cluster Computing

    Correctness of Hierarchical MCS Locks with Timeout

    Milind Chabbi, Abdelhalim Amer, Shasha Wen, Xu Liu
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    This manuscript serves as a correctness proof of the Hierarchical MCS locks
    with Timeout (HMCS-T) described in our paper titled “An Efficient
    Abortable-locking Protocol for Multi-level NUMA Systems” appearing in the
    proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of
    Parallel Programming.

    HMCS-T is a very involved protocol. The system is stateful; the values of
    prior acquisition efforts affect the subsequent acquisition efforts. Also, the
    status of successors, predecessors, ancestors, and descendants affect steps
    followed by the protocol. The ability to make the protocol fully non-blocking
    leads to modifications to the exttt{next} field, which causes deviation from
    the original MCS lock protocol both in acquisition and release. At several
    places, unconditional field updates are replaced with SWAP or CAS operations.

    We follow a multi-step approach to prove the correctness of HMCS-T. To
    demonstrate the correctness of the HMCS-T lock, we use the Spin model checking.
    Model checking causes a combinatorial explosion even to simulate a handful of
    threads. First, we understand the minimal, sufficient configurations necessary
    to prove safety properties of a single level of lock in the tree. We construct
    HMCS-T locks that represent these configurations. We model check these
    configurations, which proves the correctness of components of an HMCS-T lock.
    Finally, building upon these facts, we argue logically for the correctness of
    HMCS-T<n>.

    The Balance Attack Against Proof-Of-Work Blockchains: The R3 Testbed as an Example

    Christopher Natoli, Vincent Gramoli
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)

    In this paper, we identify a new form of attack, called the Balance attack,
    against proof-of-work blockchain systems. The novelty of this attack consists
    of delaying network communications between multiple subgroups of nodes with
    balanced mining power. Our theoretical analysis captures the precise tradeoff
    between the network delay and the mining power of the attacker needed to double
    spend in Ethereum with high probability.

    We quantify our probabilistic analysis with statistics taken from the R3
    consortium, and show that a single machine needs 20 minutes to attack the
    consortium. Finally, we run an Ethereum private chain in a distributed system
    with similar settings as R3 to demonstrate the feasibility of the approach, and
    discuss the application of the Balance attack to Bitcoin. Our results clearly
    confirm that main proof-of-work blockchain protocols can be badly suited for
    consortium blockchains.


    Learning

    Linking the Neural Machine Translation and the Prediction of Organic Chemistry Reactions

    Juno Nam, Jurae Kim
    Comments: 19 pages, 5 figures
    Subjects: Learning (cs.LG)

    Finding the main product of a chemical reaction is one of the important
    problems of organic chemistry. This paper describes a method of applying a
    neural machine translation model to the prediction of organic chemical
    reactions. In order to translate ‘reactants and reagents’ to ‘products’, a
    gated recurrent unit based sequence-to-sequence model and a parser to generate
    input tokens for model from reaction SMILES strings were built. Training sets
    are composed of reactions from the patent databases, and reactions manually
    generated applying the elementary reactions in an organic chemistry textbook of
    Wade. The trained models were tested by examples and problems in the textbook.
    The prediction process does not need manual encoding of rules (e.g., SMARTS
    transformations) to predict products, hence it only needs sufficient training
    reaction sets to learn new types of reactions.

    Adaptive Lambda Least-Squares Temporal Difference Learning

    Timothy A. Mann, Hugo Penedones, Shie Mannor, Todd Hester
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Temporal Difference learning or TD((lambda)) is a fundamental algorithm in
    the field of reinforcement learning. However, setting TD’s (lambda) parameter,
    which controls the timescale of TD updates, is generally left up to the
    practitioner. We formalize the (lambda) selection problem as a bias-variance
    trade-off where the solution is the value of (lambda) that leads to the
    smallest Mean Squared Value Error (MSVE). To solve this trade-off we suggest
    applying Leave-One-Trajectory-Out Cross-Validation (LOTO-CV) to search the
    space of (lambda) values. Unfortunately, this approach is too computationally
    expensive for most practical applications. For Least Squares TD (LSTD) we show
    that LOTO-CV can be implemented efficiently to automatically tune (lambda) and
    apply function optimization methods to efficiently search the space of
    (lambda) values. The resulting algorithm, ALLSTD, is parameter free and our
    experiments demonstrate that ALLSTD is significantly computationally faster
    than the na”{i}ve LOTO-CV implementation while achieving similar performance.

    Automatic Discoveries of Physical and Semantic Concepts via Association Priors of Neuron Groups

    Shuai Li, Kui Jia, Xiaogang Wang
    Subjects: Learning (cs.LG)

    The recent successful deep neural networks are largely trained in a
    supervised manner. It {it associates} complex patterns of input samples with
    neurons in the last layer, which form representations of {it concepts}. In
    spite of their successes, the properties of complex patterns associated a
    learned concept remain elusive. In this work, by analyzing how neurons are
    associated with concepts in supervised networks, we hypothesize that with
    proper priors to regulate learning, neural networks can automatically associate
    neurons in the intermediate layers with concepts that are aligned with real
    world concepts, when trained only with labels that associate concepts with top
    level neurons, which is a plausible way for unsupervised learning. We develop a
    prior to verify the hypothesis and experimentally find the proposed priors help
    neural networks automatically learn both basic physical concepts at the lower
    layers, e.g., rotation of filters, and highly semantic concepts at the higher
    layers, e.g., fine-grained categories of an entry-level category.

    The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process

    Hongyuan Mei, Jason Eisner
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Many events occur in the world. Some event types are stochastically excited
    or inhibited—in the sense of having their probabilities elevated or
    decreased—by patterns in the sequence of previous events. Discovering such
    patterns can help us predict which type of event will happen next and when.
    Learning such structure should benefit various applications, including medical
    prognosis, consumer behavior, and social media activity prediction. We propose
    to model streams of discrete events in continuous time, by constructing a
    neurally self-modulating multivariate point process. This generative model
    allows past events to influence the future in complex ways, by conditioning
    future event intensities on the hidden state of a recurrent neural network that
    has consumed the stream of past events. We evaluate our model on multiple
    datasets and show that it significantly outperforms other strong baselines.

    Counterfactual Prediction with Deep Instrumental Variables Networks

    Jason Hartford, Greg Lewis, Kevin Leyton-Brown, Matt Taddy
    Subjects: Applications (stat.AP); Learning (cs.LG); Machine Learning (stat.ML)

    We are in the middle of a remarkable rise in the use and capability of
    artificial intelligence. Much of this growth has been fueled by the success of
    deep learning architectures: models that map from observables to outputs via
    multiple layers of latent representations. These deep learning algorithms are
    effective tools for unstructured prediction, and they can be combined in AI
    systems to solve complex automated reasoning problems. This paper provides a
    recipe for combining ML algorithms to solve for causal effects in the presence
    of instrumental variables — sources of treatment randomization that are
    conditionally independent from the response. We show that a flexible IV
    specification resolves into two prediction tasks that can be solved with deep
    neural nets: a first-stage network for treatment prediction and a second-stage
    network whose loss function involves integration over the conditional treatment
    distribution. This Deep IV framework imposes some specific structure on the
    stochastic gradient descent routine used for training, but it is general enough
    that we can take advantage of off-the-shelf ML capabilities and avoid extensive
    algorithm customization. We outline how to obtain out-of-sample causal
    validation in order to avoid over-fit. We also introduce schemes for both
    Bayesian and frequentist inference: the former via a novel adaptation of
    dropout training, and the latter via a data splitting routine.

    Data driven estimation of Laplace-Beltrami operator

    Frédéric Chazal (DATASHAPE), Ilaria Giulini (DATASHAPE), Bertrand Michel (LSTA)
    Journal-ref: 30th Conference on Neural Information Processing Systems (NIPS
    2016), Dec 2016, Barcelona, Spain. 30th Conference on Neural Information
    Processing Systems (NIPS 2016), Barcelona, Spain., 2016
    Subjects: Computational Geometry (cs.CG); Learning (cs.LG); Statistics Theory (math.ST)

    Approximations of Laplace-Beltrami operators on manifolds through graph
    Lapla-cians have become popular tools in data analysis and machine learning.
    These discretized operators usually depend on bandwidth parameters whose tuning
    remains a theoretical and practical problem. In this paper, we address this
    problem for the unnormalized graph Laplacian by establishing an oracle
    inequality that opens the door to a well-founded data-driven procedure for the
    bandwidth selection. Our approach relies on recent results by Lacour and
    Massart [LM15] on the so-called Lepski’s method.


    Information Theory

    When the map is better than the territory

    Erik P Hoel
    Comments: 15 pages, 6 figures
    Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI)

    The causal structure of any system can be analyzed at a multitude of spatial
    and temporal scales. It has long been thought that while higher scale (macro)
    descriptions of causal structure may be useful to observers, they are at best a
    compressed description and at worse leave out critical information. However,
    recent research applying information theory to causal analysis has shown that
    the causal structure of some systems can actually come into focus (be more
    informative) at a macroscale (Hoel et al. 2013). That is, a macro model of a
    system (a map) can be more informative than a fully detailed model of the
    system (the territory). This has been called causal emergence. While causal
    emergence may at first glance seem counterintuitive, this paper grounds the
    phenomenon in a classic concept from information theory: Shannon’s discovery of
    the channel capacity. I argue that systems have a particular causal capacity,
    and that different causal models of those systems take advantage of that
    capacity to various degrees. For some systems, only macroscale causal models
    use the full causal capacity. Such macroscale causal models can either be
    coarse-grains, or may leave variables and states out of the model (exogenous)
    in various ways, which can improve the model’s efficacy and its informativeness
    via the same mathematical principles of how error-correcting codes take
    advantage of an information channel’s capacity. As model choice increase, the
    causal capacity of a system approaches the channel capacity. Ultimately, this
    provides a general framework for understanding how the causal structure of some
    systems cannot be fully captured by even the most detailed microscopic model.

    Unified Theory for Recovery of Sparse Signals in a General Transform Domain

    Kiryung Lee, Yanjun Li, Kyong Hwan Jin, Jong Chul Ye
    Comments: 41 pages, 3 figures
    Subjects: Information Theory (cs.IT)

    Compressed sensing provided a new sampling paradigm for sparse signals.
    Remarkably, it has been shown that practical algorithms provide robust recovery
    from noisy linear measurements acquired at a near optimal sample rate. In
    real-world applications, a signal of interest is typically sparse not in the
    canonical basis but in a certain transform domain, such as the wavelet or the
    finite difference domain. The theory of compressed sensing was extended to this
    analysis sparsity model. However, existing theoretic results for the analysis
    sparsity model have limitations in the following senses: the derivation relies
    on particular choices of sensing matrix and sparsifying transform; sample
    complexity is significantly suboptimal compared to the analogous result in
    compressed sensing with the canonical sparsity model. In this paper, we propose
    a novel unified theory for robust recovery of sparse signals in a general
    transform domain using convex relaxation methods that overcomes the
    aforementioned limitations. Our results do not rely on any specific choice of
    sensing matrix and sparsifying transform and provide near optimal sample
    complexity when the transforms satisfy certain conditions. In particular, our
    theory extends existing near optimal sample complexity results to a wider class
    of sensing matrix and sparsifying transform. We also provide extensions of our
    results to the scenarios where the atoms in the transform has varying
    incoherence parameters and the unknown signal exhibits a structured sparsity
    pattern. Numerical results show that the proposed variable density sampling
    provides superior recovery performance over previous works, which is aligned
    with the presented theory.

    A Scalable Framework for CSI Feedback in FDD Massive MIMO via DL Path Aligning

    Xiliang Luo, Penghao Cai, Xiaoyu Zhang, Die Hu, Cong Shen
    Comments: 13 pages, 8 figures
    Subjects: Information Theory (cs.IT)

    Unlike the time-division duplexing (TDD) systems, the downlink (DL) and
    uplink (UL) channels are not reciprocal anymore in the case of
    frequency-division duplexing (FDD). However, some long-term parameters, e.g.
    the time delays and angles of arrival (AoAs) of the channel paths, still enjoy
    reciprocity. In this paper, by efficiently exploiting the aforementioned
    limited reciprocity, we address the DL channel state information (CSI) feedback
    in a practical wideband massive multiple-input multiple-output (MIMO) system
    operating in the FDD mode. With orthogonal frequency-division multiplexing
    (OFDM) waveform and assuming frequency-selective fading channels, we propose a
    scalable framework for the DL pilots design, DL CSI acquisition, and the
    corresponding CSI feedback in the UL. In particular, the base station (BS) can
    transmit the FFT-based pilots with the carefully-selected phase shifts. Then
    the user can rely on the so-called time-domain aggregate channel (TAC) to
    derive the feedback of reduced imensionality according to either its own
    knowledge about the statistics of the DL channels or the instruction from the
    serving BS. We demonstrate that each user can just feed back one scalar number
    per DL channel path for the BS to recover the DL CSIs. Comprehensive numerical
    results further corroborate our designs.

    Optimal Transmission Policies for Multi-hop Energy Harvesting Systems

    Milad Rezaee, Mahtab Mirmohseni, Vaneet Aggarwal (Senior Member, IEEE), Mohammad Reza Aref
    Comments: paper has been submitted to IEEE Transactions on Wireless Communications
    Subjects: Information Theory (cs.IT)

    In this paper, we consider a multi-hop energy harvesting (EH) communication
    system in a full-duplex mode, where arrival data and harvested energy curves in
    the source and the relays are modeled as general functions. This model includes
    the EH system with discrete arrival processes as a special case. We investigate
    the throughput maximization problem considering minimum utilized energy in the
    source and relays and find the optimal offline algorithm. We show that the
    optimal solution of the two-hop transmission problem have three main steps: (i)
    Solving a point-to-point throughput maximization problem at the source; (ii)
    Solving a point-to-point throughput maximization problem at the relay (after
    applying the solution of first step as the input of this second problem); (iii)
    Minimizing utilized energy in the source. In addition, we show that how the
    optimal algorithm for the completion time minimization problem can be derived
    from the proposed algorithm for throughput maximization problem. Also, for the
    throughput maximization problem, we propose an online algorithm and show that
    it is more efficient than the benchmark one (which is a direct application of
    an existing point-to-point online algorithm to the multi-hop system).

    The ((n,m,k,λ))-Strong External Difference Family with (m geq 5) Exists

    Jiejing Wen, Minghui Yang, Keqin Feng
    Comments: 6 pages
    Subjects: Information Theory (cs.IT)

    The notion of strong external difference family (SEDF) in a finite abelian
    group ((G,+)) is raised by M. B. Paterson and D. R. Stinson [5] in 2016 and
    motivated by its application in communication theory to construct (R)-optimal
    regular algebraic manipulation detection code. A series of
    ((n,m,k,lambda))-SEDF’s have been constructed in [5, 4, 2, 1] with (m=2). In
    this note we present an example of (243, 11, 22, 20)-SEDF in finite field
    (mathbb{F}_q) ((q=3^5=243).) This is an answer for the following problem
    raised in [5] and continuously asked in [4, 2, 1]: if there exists an
    ((n,m,k,lambda))-SEDF for (mgeq 5).

    Generalized Hamming weights of three classes of linear codes

    Gaopeng Jian
    Subjects: Information Theory (cs.IT)

    The generalized Hamming weights of a linear code have been extensively
    studied since Wei first use them to characterize the cryptography performance
    of a linear code over the wire-tap channel of type II. In this paper, we
    investigate the generalized Hamming weights of three classes of linear codes
    constructed through defining sets and determine them partly for some cases.
    Particularly, in the semiprimitive case we solve an problem left in Yang et al.
    (IEEE Trans. Inf. Theory 61(9): 4905–4913, 2015).

    Graph Information Ratio

    Lele Wang, Ofer Shayevitz
    Subjects: Information Theory (cs.IT); Discrete Mathematics (cs.DM); Combinatorics (math.CO)

    We introduce the notion of information ratio ( ext{Ir}(H/G)) between two
    (simple, undirected) graphs (G) and (H), defined as the supremum of ratios
    (k/n) such that there exists a mapping between the strong products (G^k) to
    (H^n) that preserves non-adjacency. Operationally speaking, the information
    ratio is the maximal number of source symbols per channel use that can be
    reliably sent over a channel with a confusion graph (H), where reliability is
    measured w.r.t. a source confusion graph (G). Various results are provided,
    including in particular lower and upper bounds on ( ext{Ir}(H/G)) in terms of
    different graph properties, inequalities and identities for behavior under
    strong product and disjoint union, relations to graph cores, and notions of
    graph criticality. Informally speaking, ( ext{Ir}(H/G)) can be interpreted as
    a measure of similarity between (G) and (H). We make this notion precise by
    introducing the concept of information equivalence between graphs, a more
    quantitative version of homomorphic equivalence. We then describe a natural
    partial ordering over the space of information equivalence classes, and endow
    it with a suitable metric structure that is contractive under the strong
    product. Various examples and open problems are discussed.

    A Brief Introduction on Shannon's Information Theory

    Ricky X. F. Chen
    Comments: A brief introduction to Shannon’s information theory in a combinatorial flavor, completed for quite a while. Comments are welcome
    Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

    This is an introduction to Shannon’s Information Theory. It is more like a
    long note so that it is by no means a complete survey or completely
    mathematically rigorous. It covers two main topics: entropy and channel
    capacity. All related concepts will be developed in a totally combinatorial
    flavor. Some issues usually not addressed in the literature will be discussed
    here as well. Hopefully, it will be interesting to those interested in
    Information Theory.

    Channel Measurements and Models for High-Speed Train Wireless Communication Systems in Tunnel Scenarios: A Survey

    Yu Liu, Ammar Ghazal, Cheng-Xiang Wang, Xiaohu Ge, Yang Yang, Yapei Zhang
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    The rapid developments of high-speed trains (HSTs) introduce new challenges
    to HST wireless communication systems. Realistic HST channel models play a
    critical role in designing and evaluating HST communication systems. Due to the
    length limitation, bounding of tunnel itself, and waveguide effect, channel
    characteristics in tunnel scenarios are very different from those in other HST
    scenarios. Therefore, accurate tunnel channel models considering both
    large-scale and small-scale fading characteristics are essential for HST
    communication systems. Moreover, certain characteristics of tunnel channels
    have not been investigated sufficiently. This article provides a comprehensive
    review of the measurement campaigns in tunnels and presents some tunnel channel
    models using various modeling methods. Finally, future directions in HST tunnel
    channel measurements and modeling are discussed.




沪ICP备19023445号-2号
友情链接