IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Fri, 21 Apr 2017

    我爱机器学习(52ml.net)发表于 2017-04-21 00:00:00
    love 0

    Neural and Evolutionary Computing

    Genetic Algorithm Based Floor Planning System

    Hamide Ozlem Dalgic, Erkan Bostanci, Mehmet Serdar Guzel
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Genetic Algorithms are widely used in many different optimization problems
    including layout design. The layout of the shelves play an important role in
    the total sales metrics for superstores since this affects the customers’
    shopping behaviour. This paper employed a genetic algorithm based approach to
    design shelf layout of superstores. The layout design problem was tackled by
    using a novel chromosome representation which takes many different parameters
    to prevent dead-ends and improve shelf visibility into consideration. Results
    show that the approach can produce reasonably good layout designs in very short
    amounts of time.

    Solving the Uncapacitated Single Allocation p-Hub Median Problem on GPU

    Abdelhamid Benaini, Achraf Berrajaa, Jaouad Boukachour, Mustapha Oudani
    Subjects: Discrete Mathematics (cs.DM); Neural and Evolutionary Computing (cs.NE)

    A parallel genetic algorithm (GA) implemented on GPU clusters is proposed to
    solve the Uncapacitated Single Allocation p-Hub Median problem. The GA uses
    binary and integer encoding and genetic operators adapted to this problem. Our
    GA is improved by generated initial solution with hubs located at middle nodes.
    The obtained experimental results are compared with the best known solutions on
    all benchmarks on instances up to 1000 nodes. Furthermore, we solve our own
    randomly generated instances up to 6000 nodes. Our approach outperforms most
    well-known heuristics in terms of solution quality and time execution and it
    allows hitherto unsolved problems to be solved.

    Improved Neural Relation Detection for Knowledge Base Question Answering

    Mo Yu, Wenpeng Yin, Kazi Saidul Hasan, Cicero dos Santos, Bing Xiang, Bowen Zhou
    Comments: Accepted by ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Relation detection is a core component for many NLP applications including
    Knowledge Base Question Answering (KBQA). In this paper, we propose a
    hierarchical recurrent neural network enhanced by residual learning that
    detects KB relations given an input question. Our method uses deep residual
    bidirectional LSTMs to compare questions and relation names via different
    hierarchies of abstraction. Additionally, we propose a simple KBQA system that
    integrates entity linking and our proposed relation detector to enable one
    enhance another. Experimental results evidence that our approach achieves not
    only outstanding relation detection performance, but more importantly, it helps
    our KBQA system to achieve state-of-the-art accuracy for both single-relation
    (SimpleQuestions) and multi-relation (WebQSP) QA benchmarks.

    Softmax GAN

    Min Lin
    Comments: NIPS 2017 submission
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Softmax GAN is a novel variant of Generative Adversarial Network (GAN). The
    key idea of Softmax GAN is to replace the classification loss in the original
    GAN with a softmax cross-entropy loss in the sample space of one single batch.
    In the adversarial learning of (N) real training samples and (M) generated
    samples, the target of discriminator training is to distribute all the
    probability mass to the real samples, each with probability (frac{1}{M}), and
    distribute zero probability to generated data. In the generator training phase,
    the target is to assign equal probability to all data points in the batch, each
    with probability (frac{1}{M+N}). While the original GAN is closely related to
    Noise Contrastive Estimation (NCE), we show that Softmax GAN is the Importance
    Sampling version of GAN. We futher demonstrate with experiments that this
    simple change stabilizes GAN training.

    End-to-End Multi-View Networks for Text Classification

    Hongyu Guo, Colin Cherry, Jiang Su
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We propose a multi-view network for text classification. Our method
    automatically creates various views of its input text, each taking the form of
    soft attention weights that distribute the classifier’s focus among a set of
    base features. For a bag-of-words representation, each view focuses on a
    different subset of the text’s words. Aggregating many such views results in a
    more discriminative and robust representation. Through a novel architecture
    that both stacks and concatenates views, we produce a network that emphasizes
    both depth and width, allowing training to converge quickly. Using our
    multi-view architecture, we establish new state-of-the-art accuracies on two
    benchmark tasks.


    Computer Vision and Pattern Recognition

    Multi-view Supervision for Single-view Reconstruction via Differentiable Ray Consistency

    Shubham Tulsiani, Tinghui Zhou, Alexei A. Efros, Jitendra Malik
    Comments: To appear at CVPR 2017. Project webpage : this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We study the notion of consistency between a 3D shape and a 2D observation
    and propose a differentiable formulation which allows computing gradients of
    the 3D shape given an observation from an arbitrary view. We do so by
    reformulating view consistency using a differentiable ray consistency (DRC)
    term. We show that this formulation can be incorporated in a learning framework
    to leverage different types of multi-view observations e.g. foreground masks,
    depth, color images, semantics etc. as supervision for learning single-view 3D
    prediction. We present empirical analysis of our technique in a controlled
    setting. We also show that this approach allows us to improve over existing
    techniques for single-view reconstruction of objects from the PASCAL VOC
    dataset.

    Towards Large-Pose Face Frontalization in the Wild

    Xi Yin, Xiang Yu, Kihyuk Sohn, Xiaoming Liu, Manmohan Chandraker
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Despite recent advances in face recognition using deep learning, severe
    accuracy drops are observed for large pose variations in unconstrained
    environments. Learning pose-invariant features is one solution, but needs
    expensively labeled large scale data and carefully designed feature learning
    algorithms. In this work, we focus on frontalizing faces in the wild under
    various head poses, including extreme profile views. We propose a novel deep 3D
    Morphable Model (3DMM) conditioned Face Frontalization Generative Adversarial
    Network (GAN), termed as FF-GAN, to generate neutral head pose face images. Our
    framework differs from both traditional GANs and 3DMM based modeling.
    Incorporating 3DMM into the GAN structure provides shape and appearance priors
    for fast convergence with less training data, while also supporting end-to-end
    training. The 3DMM conditioned GAN employs not only the discriminator and
    generator loss but also a new masked symmetry loss to retain visual quality
    under occlusions, besides an identity loss to recover high frequency
    information. Experiments on face recognition, landmark localization and 3D
    reconstruction consistently show the advantage of our frontalization method on
    faces in the wild datasets.Detailed results can be refered to:
    this http URL

    Temporal Action Detection with Structured Segment Networks

    Yue Zhao, Yuanjun Xiong, Limin Wang, Zhirong Wu, Dahua Lin, Xiaoou Tang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Detecting activities in untrimmed videos is an important yet challenging
    task. In this paper, we tackle the difficulties of effectively locating the
    start and the end of a long complex action, which are often met by existing
    methods. Our key contribution is the structured segment network, a novel
    framework for temporal action detection, which models the temporal structure of
    each activity instance via a structured temporal pyramid. On top of the
    pyramid, we further introduce a decomposed discriminative model, which
    comprises two classifiers, respectively for classifying activities and
    determining completeness. This allows the framework to effectively distinguish
    positive proposals from background or incomplete ones, thus leading to both
    accurate recognition and localization. These components are integrated into a
    unified network that can be efficiently trained in an end-to-end fashion. We
    also propose a simple yet effective temporal action proposal scheme that can
    generate proposals of considerably higher qualities. On two challenging
    benchmarks, THUMOS14 and ActivityNet, our method remarkably outperforms
    existing state-of-the-art methods by over ( 10\% ) absolute average mAP,
    demonstrating superior accuracy and strong adaptivity in handling activities
    with various temporal structures.

    A Nuclear-norm Model for Multi-Frame Super-Resolution Reconstruction from Video Clips

    Rui Zhao, Raymond H. Chan
    Comments: 12 pages, 7 numberical examples, 12 figure groups, 2 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a variational approach to obtain super-resolution images from
    multiple low-resolution frames extracted from video clips. First the
    displacement between the low-resolution frames and the reference frame are
    computed by an optical flow algorithm. Then a low-rank model is used to
    construct the reference frame in high-resolution by incorporating the
    information of the low-resolution frames. The model has two terms: a 2-norm
    data fidelity term and a nuclear-norm regularization term. Alternating
    direction method of multipliers is used to solve the model. Comparison of our
    methods with other models on synthetic and real video clips show that our
    resulting images are more accurate with less artifacts. It also provides much
    finer and discernable details.

    Training object class detectors with click supervision

    Dim P. Papadopoulos, Jasper R. R. Uijlings, Frank Keller, Vittorio Ferrari
    Comments: CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Training object class detectors typically requires a large set of images with
    objects annotated by bounding boxes. However, manually drawing bounding boxes
    is very time consuming. In this paper we greatly reduce annotation time by
    proposing center-click annotations: we ask annotators to click on the center of
    an imaginary bounding box which tightly encloses the object instance. We then
    incorporate these clicks into existing Multiple Instance Learning techniques
    for weakly supervised object localization, to jointly localize object bounding
    boxes over all training images. Extensive experiments on PASCAL VOC 2007 and MS
    COCO show that: (1) our scheme delivers high-quality detectors, performing
    substantially better than those produced by weakly supervised techniques, with
    a modest extra annotation effort; (2) these detectors in fact perform in a
    range close to those trained from manually drawn bounding boxes; (3) as the
    center-click task is very fast, our scheme reduces total annotation time by 9x
    to 18x.

    Exploring epoch-dependent stochastic residual networks

    Fabio Carrara, Andrea Esuli, Fabrizio Falchi, Alejandro Moreo Fernández
    Comments: Preliminary report
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The recently proposed stochastic residual networks selectively activate or
    bypass the layers during training, based on independent stochastic choices,
    each of which following a probability distribution that is fixed in advance. In
    this paper we present a first exploration on the use of an epoch-dependent
    distribution, starting with a higher probability of bypassing deeper layers and
    then activating them more frequently as training progresses. Preliminary
    results are mixed, yet they show some potential of adding an epoch-dependent
    management of distributions, worth of further investigation.

    Segmentation of the Proximal Femur from MR Images using Deep Convolutional Neural Networks

    Cem M. Deniz, Spencer Hallyburton, Arakua Welbeck, Stephen Honig, Kyunghyun Cho, Gregory Chang
    Comments: 7 pages, 5 figures, and one table
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Osteoporosis is a public health problem characterized by increased fracture
    risk secondary to low bone mass and microarchitectural deterioration of bone
    tissue. Almost all fractures of the hip require hospitalization and major
    surgery. Early diagnosis of osteoporosis plays an important role in preventing
    osteoporotic fracture. Magnetic resonance imaging (MRI) has been successfully
    performed to image trabecular bone architecture in vivo proving itself as the
    practical imaging modality for bone quality assessment. However, segmentation
    of the whole proximal femur is required to measure bone quality and assess
    fracture risk precisely. Manual segmentation of the proximal femur is
    time-intensive, limiting the use of MRI measurements in the clinical practice.
    To overcome this bottleneck, robust automatic proximal femur segmentation
    method is required. In this paper, we propose to use deep convolutional neural
    networks (CNNs) for an automatic proximal femur segmentation using structural
    MR images. We constructed a dataset with 62 volumetric MR scans that are
    manually-segmented for proximal femur. We performed experiments using two
    different CNN architectures and achieved a high dice similarity score of 0.95.

    End-to-End Unsupervised Deformable Image Registration with a Convolutional Neural Network

    Bob D. de Vos, Floris F. Berendsen, Max A. Viergever, Marius Staring, Ivana Išgum
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this work we propose a deep learning network for deformable image
    registration (DIRNet). The DIRNet consists of a convolutional neural network
    (ConvNet) regressor, a spatial transformer, and a resampler. The ConvNet
    analyzes a pair of fixed and moving images and outputs parameters for the
    spatial transformer, which generates the displacement vector field that enables
    the resampler to warp the moving image to the fixed image. The DIRNet is
    trained end-to-end by unsupervised optimization of a similarity metric between
    input image pairs. A trained DIRNet can be applied to perform registration on
    unseen image pairs in one pass, thus non-iteratively. Evaluation was performed
    with registration of images of handwritten digits (MNIST) and cardiac cine MR
    scans (Sunnybrook Cardiac Data). The results demonstrate that registration with
    DIRNet is as accurate as a conventional deformable image registration method
    with substantially shorter execution times.

    Understanding the Mechanisms of Deep Transfer Learning for Medical Images

    Hariharan Ravishankar, Prasad Sudhakar, Rahul Venkataramani, Sheshadri Thiruvenkadam, Pavan Annangi, Narayanan Babu, Vivek Vaidya
    Comments: Published in MICCAI Workshop on Deep Learning in Medical Image Analysis, 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The ability to automatically learn task specific feature representations has
    led to a huge success of deep learning methods. When large training data is
    scarce, such as in medical imaging problems, transfer learning has been very
    effective. In this paper, we systematically investigate the process of
    transferring a Convolutional Neural Network, trained on ImageNet images to
    perform image classification, to kidney detection problem in ultrasound images.
    We study how the detection performance depends on the extent of transfer. We
    show that a transferred and tuned CNN can outperform a state-of-the-art feature
    engineered pipeline and a hybridization of these two techniques achieves 20\%
    higher performance. We also investigate how the evolution of intermediate
    response images from our network. Finally, we compare these responses to
    state-of-the-art image processing filters in order to gain greater insight into
    how transfer learning is able to effectively manage widely varying imaging
    regimes.

    End-to-end representation learning for Correlation Filter based tracking

    Jack Valmadre, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr
    Comments: To appear at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    The Correlation Filter is an algorithm that trains a linear template to
    discriminate between images and their translations. It is well suited to object
    tracking because its formulation in the Fourier domain provides a fast
    solution, enabling the detector to be re-trained once per frame. Previous works
    that use the Correlation Filter, however, have adopted features that were
    either manually designed or trained for a different task. This work is the
    first to overcome this limitation by interpreting the Correlation Filter
    learner, which has a closed-form solution, as a differentiable layer in a deep
    neural network. This enables learning deep features that are tightly coupled to
    the Correlation Filter. Experiments illustrate that our method has the
    important practical benefit of allowing lightweight architectures to achieve
    state-of-the-art performance at high framerates.

    Predicting Cognitive Decline with Deep Learning of Brain Metabolism and Amyloid Imaging

    Hongyoon Choi, Kyong Hwan Jin
    Comments: 24 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    For effective treatment of Alzheimer disease (AD), it is important to
    identify subjects who are most likely to exhibit rapid cognitive decline.
    Herein, we developed a novel framework based on a deep convolutional neural
    network which can predict future cognitive decline in mild cognitive impairment
    (MCI) patients using flurodeoxyglucose and florbetapir positron emission
    tomography (PET). The architecture of the network only relies on baseline PET
    studies of AD and normal subjects as the training dataset. Feature extraction
    and complicated image preprocessing including nonlinear warping are unnecessary
    for our approach. Accuracy of prediction (84.2%) for conversion to AD in MCI
    patients outperformed conventional feature-based quantification approaches. ROC
    analyses revealed that performance of CNN-based approach was significantly
    higher than that of the conventional quantification methods (p < 0.05). Output
    scores of the network were strongly correlated with the longitudinal change in
    cognitive measurements. These results show the feasibility of deep learning as
    a tool for predicting disease outcome using brain images.

    Enhancing Person Re-identification in a Self-trained Subspace

    Xun Yang, Meng Wang, Richang Hong, Qi Tian, Yong Rui
    Comments: Accepted by ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Despite the promising progress made in recent years, person re-identification
    (re-ID) remains a challenging task due to the complex variations in human
    appearances from different camera views. For this challenging problem, a large
    variety of algorithms have been developed in the fully-supervised setting,
    requiring access to a large amount of labeled training data. However, the main
    bottleneck for fully-supervised re-ID is the limited availability of labeled
    training samples. To address this problem, in this paper, we propose a
    self-trained subspace learning paradigm for person re-ID which effectively
    utilizes both labeled and unlabeled data to learn a discriminative subspace
    where person images across disjoint camera views can be easily matched. The
    proposed approach first constructs pseudo pairwise relationships among
    unlabeled persons using the k-nearest neighbors algorithm. Then, with the
    pseudo pairwise relationships, the unlabeled samples can be easily combined
    with the labeled samples to learn a discriminative projection by solving an
    eigenvalue problem. In addition, we refine the pseudo pairwise relationships
    iteratively, which further improves the learning performance. A multi-kernel
    embedding strategy is also incorporated into the proposed approach to cope with
    the non-linearity in person’s appearance and explore the complementation of
    multiple kernels. In this way, the performance of person re-ID can be greatly
    enhanced when training data are insufficient. Experimental results on six
    widely-used datasets demonstrate the effectiveness of our approach and its
    performance can be comparable to the reported results of most state-of-the-art
    fully-supervised methods while using much fewer labeled data.

    A Fuzzy Brute Force Matching Method for Binary Image Features

    Erkan Bostanci, Nadia Kanwal, Betul Bostanci, Mehmet Serdar Guzel
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Matching of binary image features is an important step in many different
    computer vision applications. Conventionally, an arbitrary threshold is used to
    identify a correct match from incorrect matches using Hamming distance which
    may improve or degrade the matching results for different input images. This is
    mainly due to the image content which is affected by the scene, lighting and
    imaging conditions. This paper presents a fuzzy logic based approach for brute
    force matching of image features to overcome this situation. The method was
    tested using a well-known image database with known ground truth. The approach
    is shown to produce a higher number of correct matches when compared against
    constant distance thresholds. The nature of fuzzy logic which allows the
    vagueness of information and tolerance to errors has been successfully
    exploited in an image processing context. The uncertainty arising from the
    imaging conditions has been overcome with the use of compact fuzzy matching
    membership functions.

    BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections

    Karim Ahmed, Lorenzo Torresani
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce an architecture for large-scale image categorization that
    enables the end-to-end learning of separate visual features for the different
    classes to distinguish. The proposed model consists of a deep CNN shaped like a
    tree. The stem of the tree includes a sequence of convolutional layers common
    to all classes. The stem then splits into multiple branches implementing
    parallel feature extractors, which are ultimately connected to the final
    classification layer via learned gated connections. These learned gates
    determine for each individual class the subset of features to use. Such a
    scheme naturally encourages the learning of a heterogeneous set of specialized
    features through the separate branches and it allows each class to use the
    subset of features that are optimal for its recognition. We show the generality
    of our proposed method by reshaping several popular CNNs from the literature
    into our proposed architecture. Our experiments on the CIFAR100, CIFAR10, and
    Synth datasets show that in each case our resulting model yields a substantial
    improvement in accuracy over the original CNN. Our empirical analysis also
    suggests that our scheme acts as a form of beneficial regularization improving
    generalization performance.

    SLAM with Objects using a Nonparametric Pose Graph

    Beipeng Mu, Shih-Yuan Liu, Liam Paull, John Leonard, Jonathan How
    Comments: published at IROS 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    Mapping and self-localization in unknown environments are fundamental
    capabilities in many robotic applications. These tasks typically involve the
    identification of objects as unique features or landmarks, which requires the
    objects both to be detected and then assigned a unique identifier that can be
    maintained when viewed from different perspectives and in different images. The
    extit{data association} and extit{simultaneous localization and mapping}
    (SLAM) problems are, individually, well-studied in the literature. But these
    two problems are inherently tightly coupled, and that has not been
    well-addressed. Without accurate SLAM, possible data associations are
    combinatorial and become intractable easily. Without accurate data association,
    the error of SLAM algorithms diverge easily. This paper proposes a novel
    nonparametric pose graph that models data association and SLAM in a single
    framework. An algorithm is further introduced to alternate between inferring
    data association and performing SLAM. Experimental results show that our
    approach has the new capability of associating object detections and localizing
    objects at the same time, leading to significantly better performance on both
    the data association and SLAM problems than achieved by considering only one
    and ignoring imperfections in the other.

    Unassisted Quantitative Evaluation Of Despeckling Filters

    Luis Gomez, Raydonal Ospina, Alejandro C. Frery
    Comments: Accepted for publication in Remote Sensing – Open Access Journal
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    SAR (Synthetic Aperture Radar) imaging plays a central role in Remote Sensing
    due to, among other important features, its ability to provide high-resolution,
    day-and-night and almost weather-independent images. SAR images are affected
    from a granular contamination, speckle, that can be described by a
    multiplicative model. Many despeckling techniques have been proposed in the
    literature, as well as measures of the quality of the results they provide.
    Assuming the multiplicative model, the observed image (Z) is the product of two
    independent fields: the backscatter (X) and the speckle (Y). The result of any
    speckle filter is (widehat X), an estimator of the backscatter (X), based
    solely on the observed data (Z). An ideal estimator would be the one for which
    the ratio of the observed image to the filtered one (I=Z/widehat X) is only
    speckle: a collection of independent identically distributed samples from Gamma
    variates. We, then, assess the quality of a filter by the closeness of (I) to
    the hypothesis that it is adherent to the statistical properties of pure
    speckle. We analyze filters through the ratio image they produce with regards
    to first- and second-order statistics: the former check marginal properties,
    while the latter verifies lack of structure. A new quantitative image-quality
    index is then defined, and applied to state-of-the-art despeckling filters.
    This new measure provides consistent results with commonly used quality
    measures (equivalent number of looks, PSNR, MSSIM, (eta) edge correlation,
    and preservation of the mean), and ranks the filters results also in agreement
    with their visual analysis. We conclude our study showing that the proposed
    measure can be successfully used to optimize the (often many) parameters that
    define a speckle filter.

    HPatches: A benchmark and evaluation of handcrafted and learned local descriptors

    Vassileios Balntas, Karel Lenc, Andrea Vedaldi, Krystian Mikolajczyk
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose a novel benchmark for evaluating local image
    descriptors. We demonstrate that the existing datasets and evaluation protocols
    do not specify unambiguously all aspects of evaluation, leading to ambiguities
    and inconsistencies in results reported in the literature. Furthermore, these
    datasets are nearly saturated due to the recent improvements in local
    descriptors obtained by learning them from large annotated datasets. Therefore,
    we introduce a new large dataset suitable for training and testing modern
    descriptors, together with strictly defined evaluation protocols in several
    tasks such as matching, retrieval and classification. This allows for more
    realistic, and thus more reliable comparisons in different application
    scenarios. We evaluate the performance of several state-of-the-art descriptors
    and analyse their properties. We show that a simple normalisation of
    traditional hand-crafted descriptors can boost their performance to the level
    of deep learning based descriptors within a realistic benchmarks evaluation.

    Using Mise-En-Scène Visual Features based on MPEG-7 and Deep Learning for Movie Recommendation

    Yashar Deldjoo, Massimo Quadrana, Mehdi Elahi, Paolo Cremonesi
    Comments: 8 pages, 3 figures
    Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

    Item features play an important role in movie recommender systems, where
    recommendations can be generated by using explicit or implicit preferences of
    users on traditional features (attributes) such as tag, genre, and cast.
    Typically, movie features are human-generated, either editorially (e.g., genre
    and cast) or by leveraging the wisdom of the crowd (e.g., tag), and as such,
    they are prone to noise and are expensive to collect. Moreover, these features
    are often rare or absent for new items, making it difficult or even impossible
    to provide good quality recommendations.

    In this paper, we show that user’s preferences on movies can be better
    described in terms of the mise-en-sc`ene features, i.e., the visual aspects of
    a movie that characterize design, aesthetics and style (e.g., colors,
    textures). We use both MPEG-7 visual descriptors and Deep Learning hidden
    layers as example of mise-en-sc`ene features that can visually describe
    movies. Interestingly, mise-en-sc`ene features can be computed automatically
    from video files or even from trailers, offering more flexibility in handling
    new items, avoiding the need for costly and error-prone human-based tagging,
    and providing good scalability.

    We have conducted a set of experiments on a large catalogue of 4K movies.
    Results show that recommendations based on mise-en-sc`ene features
    consistently provide the best performance with respect to richer sets of more
    traditional features, such as genre and tag.

    Fast Generation for Convolutional Autoregressive Models

    Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang
    Comments: Accepted at ICLR 2017 Workshop
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Convolutional autoregressive models have recently demonstrated
    state-of-the-art performance on a number of generation tasks. While fast,
    parallel training methods have been crucial for their success, generation is
    typically implemented in a na”{i}ve fashion where redundant computations are
    unnecessarily repeated. This results in slow generation, making such models
    infeasible for production environments. In this work, we describe a method to
    speed up generation in convolutional autoregressive models. The key idea is to
    cache hidden states to avoid redundant computation. We apply our fast
    generation method to the Wavenet and PixelCNN++ models and achieve up to
    (21 imes) and (183 imes) speedups respectively.


    Artificial Intelligence

    Learning to Acquire Information

    Yewen Pu, Leslie P Kaelbling, Armando Solar-Lezama
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We consider the problem of diagnosis where a set of simple observations are
    used to infer a potentially complex hidden hypothesis. Finding the optimal
    subset of observations is intractable in general, thus we focus on the problem
    of active diagnosis, where the agent selects the next most-informative
    observation based on the results of previous observations. We show that under
    the assumption of uniform observation entropy, one can build an implication
    model which directly predicts the outcome of the potential next observation
    conditioned on the results of past observations, and selects the observation
    with the maximum entropy. This approach enjoys reduced computation complexity
    by bypassing the complicated hypothesis space, and can be trained on
    observation data alone, learning how to query without knowledge of the hidden
    hypothesis.

    The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback

    Amos Korman (GANG, IRIF), Yoav Rodeh
    Subjects: Artificial Intelligence (cs.AI)

    We introduce the dependent doors problem as an abstraction for situations in
    which one must perform a sequence of possibly dependent decisions, without
    receiving feedback information on the effectiveness of previously made actions.
    Informally, the problem considers a set of (d) doors that are initially closed,
    and the aim is to open all of them as fast as possible. To open a door, the
    algorithm knocks on it and it might open or not according to some probability
    distribution. This distribution may depend on which other doors are currently
    open, as well as on which other doors were open during each of the previous
    knocks on that door. The algorithm aims to minimize the expected time until all
    doors open. Crucially, it must act at any time without knowing whether or which
    other doors have already opened. In this work, we focus on scenarios where
    dependencies between doors are both positively correlated and acyclic.The
    fundamental distribution of a door describes the probability it opens in the
    best of conditions (with respect to other doors being open or closed). We show
    that if in two configurations of (d) doors corresponding doors share the same
    fundamental distribution, then these configurations have the same optimal
    running time up to a universal constant, no matter what are the dependencies
    between doors and what are the distributions. We also identify algorithms that
    are optimal up to a universal constant factor. For the case in which all doors
    share the same fundamental distribution we additionally provide a simpler
    algorithm, and a formula to calculate its running time. We furthermore analyse
    the price of lacking feedback for several configurations governed by standard
    fundamental distributions. In particular, we show that the price is logarithmic
    in (d) for memoryless doors, but can potentially grow to be linear in (d) for
    other distributions.We then turn our attention to investigate precise bounds.
    Even for the case of two doors, identifying the optimal sequence is an
    intriguing combinatorial question. Here, we study the case of two cascading
    memoryless doors. That is, the first door opens on each knock independently
    with probability (p\_1). The second door can only open if the first door is
    open, in which case it will open on each knock independently with probability
    (p\_2). We solve this problem almost completely by identifying algorithms that
    are optimal up to an additive term of 1.

    Knowledge Fusion via Embeddings from Text, Knowledge Graphs, and Images

    Steffen Thoma, Achim Rettinger, Fabian Both
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We present a baseline approach for cross-modal knowledge fusion. Different
    basic fusion methods are evaluated on existing embedding approaches to show the
    potential of joining knowledge about certain concepts across modalities in a
    fused concept representation.

    Realizing an optimization approach inspired from Piagets theory on cognitive development

    Utku Kose, Ahmet Arslan
    Comments: 8 pages, 1 figure, 2 tables
    Journal-ref: Broad Research in Artificial Intelligence and Neuroscience,
    6(1-4), 2015, 15-22
    Subjects: Artificial Intelligence (cs.AI); Optimization and Control (math.OC)

    The objective of this paper is to introduce an artificial intelligence based
    optimization approach, which is inspired from Piagets theory on cognitive
    development. The approach has been designed according to essential processes
    that an individual may experience while learning something new or improving his
    / her knowledge. These processes are associated with the Piagets ideas on an
    individuals cognitive development. The approach expressed in this paper is a
    simple algorithm employing swarm intelligence oriented tasks in order to
    overcome single-objective optimization problems. For evaluating effectiveness
    of this early version of the algorithm, test operations have been done via some
    benchmark functions. The obtained results show that the approach / algorithm
    can be an alternative to the literature in terms of single-objective
    optimization. The authors have suggested the name: Cognitive Development
    Optimization Algorithm (CoDOA) for the related intelligent optimization
    approach.

    On Singleton Arc Consistency for Natural CSPs Defined by Forbidden Patterns

    Clement Carbonnel, David A. Cohen, Martin C. Cooper, Stanislav Zivny
    Subjects: Computational Complexity (cs.CC); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM)

    Singleton arc consistency is an important type of local consistency which has
    been recently shown to solve all constraint satisfaction problems (CSPs) over
    constraint languages of bounded width. We aim to characterise all classes of
    CSPs defined by a forbidden pattern that are solved by singleton arc
    consistency and closed under removing constraints. We identify five new
    patterns whose absence ensures solvability by singleton arc consistency, four
    of which are provably maximal and three of which generalise 2-SAT. Combined
    with simple counter-examples for other patterns, we make significant progress
    towards a complete classification.

    Improved Neural Relation Detection for Knowledge Base Question Answering

    Mo Yu, Wenpeng Yin, Kazi Saidul Hasan, Cicero dos Santos, Bing Xiang, Bowen Zhou
    Comments: Accepted by ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Relation detection is a core component for many NLP applications including
    Knowledge Base Question Answering (KBQA). In this paper, we propose a
    hierarchical recurrent neural network enhanced by residual learning that
    detects KB relations given an input question. Our method uses deep residual
    bidirectional LSTMs to compare questions and relation names via different
    hierarchies of abstraction. Additionally, we propose a simple KBQA system that
    integrates entity linking and our proposed relation detector to enable one
    enhance another. Experimental results evidence that our approach achieves not
    only outstanding relation detection performance, but more importantly, it helps
    our KBQA system to achieve state-of-the-art accuracy for both single-relation
    (SimpleQuestions) and multi-relation (WebQSP) QA benchmarks.

    Intrusion Prevention and Detection in Grid Computing – The ALICE Case

    Andres Gomez, Camilo Lara, Udo Kebschull (for the ALICE Collaboration)
    Comments: Journal of Physics: Conference Series, Volume 664
    Journal-ref: J. Phys.: Conf. Ser. 664 062017 (2015)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); High Energy Physics – Experiment (hep-ex)

    Grids allow users flexible on-demand usage of computing resources through
    remote communication networks. A remarkable example of a Grid in High Energy
    Physics (HEP) research is used in the ALICE experiment at European Organization
    for Nuclear Research CERN. Physicists can submit jobs used to process the huge
    amount of particle collision data produced by the Large Hadron Collider (LHC).
    Grids face complex security challenges. They are interesting targets for
    attackers seeking for huge computational resources. Since users can execute
    arbitrary code in the worker nodes on the Grid sites, special care should be
    put in this environment. Automatic tools to harden and monitor this scenario
    are required. Currently, there is no integrated solution for such requirement.
    This paper describes a new security framework to allow execution of job
    payloads in a sandboxed context. It also allows process behavior monitoring to
    detect intrusions, even when new attack methods or zero day vulnerabilities are
    exploited, by a Machine Learning approach. We plan to implement the proposed
    framework as a software prototype that will be tested as a component of the
    ALICE Grid middleware.

    Predicting Cognitive Decline with Deep Learning of Brain Metabolism and Amyloid Imaging

    Hongyoon Choi, Kyong Hwan Jin
    Comments: 24 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    For effective treatment of Alzheimer disease (AD), it is important to
    identify subjects who are most likely to exhibit rapid cognitive decline.
    Herein, we developed a novel framework based on a deep convolutional neural
    network which can predict future cognitive decline in mild cognitive impairment
    (MCI) patients using flurodeoxyglucose and florbetapir positron emission
    tomography (PET). The architecture of the network only relies on baseline PET
    studies of AD and normal subjects as the training dataset. Feature extraction
    and complicated image preprocessing including nonlinear warping are unnecessary
    for our approach. Accuracy of prediction (84.2%) for conversion to AD in MCI
    patients outperformed conventional feature-based quantification approaches. ROC
    analyses revealed that performance of CNN-based approach was significantly
    higher than that of the conventional quantification methods (p < 0.05). Output
    scores of the network were strongly correlated with the longitudinal change in
    cognitive measurements. These results show the feasibility of deep learning as
    a tool for predicting disease outcome using brain images.

    SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours

    Leon Derczynski, Kalina Bontcheva, Maria Liakata, Rob Procter, Geraldine Wong Sak Hoi, Arkaitz Zubiaga
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Media is full of false claims. Even Oxford Dictionaries named “post-truth” as
    the word of 2016. This makes it more important than ever to build systems that
    can identify the veracity of a story, and the kind of discourse there is around
    it. RumourEval is a SemEval shared task that aims to identify and handle
    rumours and reactions to them, in text. We present an annotation scheme, a
    large dataset covering multiple topics – each having their own families of
    claims and replies – and use these to pose two concrete challenges as well as
    the results achieved by participants on these challenges.

    Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds

    Daniel R. Jiang, Lina Al-Kanj, Warren B. Powell
    Comments: 33 pages, 6 figures
    Subjects: Optimization and Control (math.OC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Monte Carlo Tree Search (MCTS), most famously used in game-play artificial
    intelligence (e.g., the game of Go), is a well-known strategy for constructing
    approximate solutions to sequential decision problems. Its primary innovation
    is the use of a heuristic, known as a default policy, to obtain Monte Carlo
    estimates of downstream values for states in a decision tree. This information
    is used to iteratively expand the tree towards regions of states and actions
    that an optimal policy might visit. However, to guarantee convergence to the
    optimal action, MCTS requires the entire tree to be expanded asymptotically. In
    this paper, we propose a new technique called Primal-Dual MCTS that utilizes
    sampled information relaxation upper bounds on potential actions, creating the
    possibility of “ignoring” parts of the tree that stem from highly suboptimal
    choices. This allows us to prove that despite converging to a partial decision
    tree in the limit, the recommended action from Primal-Dual MCTS is optimal. The
    new approach shows significant promise when used to optimize the behavior of a
    single driver navigating a graph while operating on a ride-sharing platform.
    Numerical experiments on a real dataset of 7,000 trips in New Jersey suggest
    that Primal-Dual MCTS improves upon standard MCTS by producing deeper decision
    trees and exhibits a reduced sensitivity to the size of the action space.

    User-driven Intelligent Interface on the Basis of Multimodal Augmented Reality and Brain-Computer Interaction for People with Functional Disabilities

    S. Stirenko, Yu. Gordienko, T. Shemsedinov, O. Alienin, Yu. Kochura, N. Gordienko, A. Rojbi, J.R. López Benito, E. Artetxe González
    Comments: 6 pages, 3 figures, 1 table, IEEE First Ukraine Conference on Electrical and Computer Engineering (UKRCON-2017)
    Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC)

    The analysis of the current integration attempts of some modes and use cases
    of user-machine interaction is presented. The new concept of the user-driven
    intelligent interface is proposed on the basis of multimodal augmented reality
    and brain-computer interaction for various applications: in disabilities
    studies, education, home care, health care, etc. The several use cases of
    multimodal augmentation are presented. The perspectives of the better human
    comprehension by the immediate feedback through neurophysical channels by means
    of brain-computer interaction are outlined. It is shown that brain-computer
    interface (BCI) technology provides new strategies to overcome limits of the
    currently available user interfaces, especially for people with functional
    disabilities. The results of the previous studies of the low end consumer and
    open-source BCI-devices allow us to conclude that combination of machine
    learning (ML), multimodal interactions (visual, sound, tactile) with BCI will
    profit from the immediate feedback from the actual neurophysical reactions
    classified by ML methods. In general, BCI in combination with other modes of AR
    interaction can deliver much more information than these types of interaction
    themselves. Even in the current state the combined AR-BCI interfaces could
    provide the highly adaptable and personal services, especially for people with
    functional disabilities.

    An Interpretable Knowledge Transfer Model for Knowledge Base Completion

    Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Knowledge bases are important resources for a variety of natural language
    processing tasks but suffer from incompleteness. We propose a novel embedding
    model, emph{ITransF}, to perform knowledge base completion. Equipped with a
    sparse attention mechanism, ITransF discovers hidden concepts of relations and
    transfer statistical strength through the sharing of concepts. Moreover, the
    learned associations between relations and concepts, which are represented by
    sparse attention vectors, can be interpreted easily. We evaluate ITransF on two
    benchmark datasets—WN18 and FB15k for knowledge base completion and obtains
    improvements on both the mean rank and Hits@10 metrics, over all baselines that
    do not use additional information.


    Information Retrieval

    Using Mise-En-Scène Visual Features based on MPEG-7 and Deep Learning for Movie Recommendation

    Yashar Deldjoo, Massimo Quadrana, Mehdi Elahi, Paolo Cremonesi
    Comments: 8 pages, 3 figures
    Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

    Item features play an important role in movie recommender systems, where
    recommendations can be generated by using explicit or implicit preferences of
    users on traditional features (attributes) such as tag, genre, and cast.
    Typically, movie features are human-generated, either editorially (e.g., genre
    and cast) or by leveraging the wisdom of the crowd (e.g., tag), and as such,
    they are prone to noise and are expensive to collect. Moreover, these features
    are often rare or absent for new items, making it difficult or even impossible
    to provide good quality recommendations.

    In this paper, we show that user’s preferences on movies can be better
    described in terms of the mise-en-sc`ene features, i.e., the visual aspects of
    a movie that characterize design, aesthetics and style (e.g., colors,
    textures). We use both MPEG-7 visual descriptors and Deep Learning hidden
    layers as example of mise-en-sc`ene features that can visually describe
    movies. Interestingly, mise-en-sc`ene features can be computed automatically
    from video files or even from trailers, offering more flexibility in handling
    new items, avoiding the need for costly and error-prone human-based tagging,
    and providing good scalability.

    We have conducted a set of experiments on a large catalogue of 4K movies.
    Results show that recommendations based on mise-en-sc`ene features
    consistently provide the best performance with respect to richer sets of more
    traditional features, such as genre and tag.

    The Magic Barrier Revisited: Accessing Natural Limitations of Recommender Assessment

    Kevin Jasberg, Sergej Sizov
    Subjects: Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)

    Recommender systems nowadays have many applications and are of great economic
    benefit. Hence, it is imperative for success-oriented companies to compare
    different of such systems and select the better one for their purposes. To this
    end, various metrics of predictive accuracy are commonly used, such as the Root
    Mean Square Error (RMSE), or precision and recall. All these metrics more or
    less measure how well a recommender system can predict human behaviour.

    Unfortunately, human behaviour is always associated with some degree of
    uncertainty, making the evaluation difficult, since it is not clear whether a
    deviation is system-induced or just originates from the natural variability of
    human decision making. At this point, some authors speculated that we may be
    reaching some Magic Barrier where this variability prevents us from getting
    much more accurate.

    In this article, we will extend the existing theory of the Magic Barrier into
    a new probabilistic but a yet pragmatic model. In particular, we will use
    methods from metrology and physics to develop easy-to-handle quantities for
    computation to describe the Magic Barrier for different accuracy metrics and
    provide suggestions for common application. This discussion is substantiated by
    comprehensive experiments with real users and large-scale simulations on a
    high-performance cluster.


    Computation and Language

    Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads

    Ji He, Mari Ostendorf, Xiaodong He
    Subjects: Computation and Language (cs.CL)

    This paper addresses the problem of predicting popularity of comments in an
    online discussion forum using reinforcement learning, particularly addressing
    two challenges that arise from having natural language state and action spaces.
    First, the state representation, which characterizes the history of comments
    tracked in a discussion at a particular point, is augmented to incorporate the
    global context represented by discussions on world events available in an
    external knowledge source. Second, a two-stage Q-learning framework is
    introduced, making it feasible to search the combinatorial action space while
    also accounting for redundancy among sub-actions. We experiment with five
    Reddit communities, showing that the two methods improve over previous reported
    results on this task.

    Improved Neural Relation Detection for Knowledge Base Question Answering

    Mo Yu, Wenpeng Yin, Kazi Saidul Hasan, Cicero dos Santos, Bing Xiang, Bowen Zhou
    Comments: Accepted by ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Relation detection is a core component for many NLP applications including
    Knowledge Base Question Answering (KBQA). In this paper, we propose a
    hierarchical recurrent neural network enhanced by residual learning that
    detects KB relations given an input question. Our method uses deep residual
    bidirectional LSTMs to compare questions and relation names via different
    hierarchies of abstraction. Additionally, we propose a simple KBQA system that
    integrates entity linking and our proposed relation detector to enable one
    enhance another. Experimental results evidence that our approach achieves not
    only outstanding relation detection performance, but more importantly, it helps
    our KBQA system to achieve state-of-the-art accuracy for both single-relation
    (SimpleQuestions) and multi-relation (WebQSP) QA benchmarks.

    BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs

    Mathieu Cliche
    Comments: Published in Proceedings of SemEval-2017, 8 pages
    Subjects: Computation and Language (cs.CL); Machine Learning (stat.ML)

    In this paper we describe our attempt at producing a state-of-the-art Twitter
    sentiment classifier using Convolutional Neural Networks (CNNs) and Long Short
    Term Memory (LSTMs) networks. Our system leverages a large amount of unlabeled
    data to pre-train word embeddings. We then use a subset of the unlabeled data
    to fine tune the embeddings using distant supervision. The final CNNs and LSTMs
    are trained on the SemEval-2017 Twitter dataset where the embeddings are fined
    tuned again. To boost performances we ensemble several CNNs and LSTMs together.
    Our approach achieved first rank on all of the five English subtasks amongst 40
    teams.

    Neural End-to-End Learning for Computational Argumentation Mining

    Steffen Eger, Johannes Daxenberger, Iryna Gurevych
    Comments: To be published at ACL 2017
    Subjects: Computation and Language (cs.CL)

    We investigate neural techniques for end-to-end computational argumentation
    mining. We frame the problem as a token-based dependency parsing as well as a
    token-based sequence tagging model, including a multi-task learning setup.
    Contrary to models that operate on the argument component level, we find that
    framing the problem as dependency parsing leads to subpar performance results.
    In contrast, less complex (local) tagging models based on BiLSTMs perform
    robustly across classification scenarios, being able to catch long-range
    dependencies inherent to the argumentation mining problem. Moreover, we find
    that jointly learning ‘natural’ subtasks, in a multi-task learning setup,
    improves performance.

    Cross-domain Semantic Parsing via Paraphrasing

    Yu Su, Xifeng Yan
    Subjects: Computation and Language (cs.CL)

    Existing studies on semantic parsing mainly focus on the in-domain setting.
    We formulate cross-domain semantic parsing as a domain adaptation problem:
    train a semantic parser on some source domains and then adapt it to the target
    domain. Due to the diversity of logical forms in different domains, this
    problem presents unique and intriguing challenges. By converting logical forms
    into canonical utterances in natural language, we reduce semantic parsing to
    paraphrasing, and develop an attentive sequence-to-sequence paraphrase model
    that is general and flexible to adapt to different domains. We discover two
    problems, small micro variance and large macro variance, of pre-trained word
    embeddings that hurdle their direct use in neural networks, and propose
    standardization techniques as a remedy. On the popular Overnight dataset, which
    contains eight domains, we show that both cross-domain training and
    standardized pre-trained word embeddings can bring significant improvement.

    Call Attention to Rumors: Deep Attention Based Recurrent Neural Networks for Early Rumor Detection

    Tong Chen, Lin Wu, Xue Li, Jun Zhang, Hongzhi Yin, Yang Wang
    Comments: 9 pages
    Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    The proliferation of social media in communication and information
    dissemination has made it an ideal platform for spreading rumors. Automatically
    debunking rumors at their stage of diffusion is known as extit{early rumor
    detection}, which refers to dealing with sequential posts regarding disputed
    factual claims with certain variations and highly textual duplication over
    time. Thus, identifying trending rumors demands an efficient yet flexible model
    that is able to capture long-range dependencies among postings and produce
    distinct representations for the accurate early detection. However, it is a
    challenging task to apply conventional classification algorithms to rumor
    detection in earliness since they rely on hand-crafted features which require
    intensive manual efforts in the case of large amount of posts. This paper
    presents a deep attention model on the basis of recurrent neural networks (RNN)
    to learn extit{selectively} temporal hidden representations of sequential
    posts for identifying rumors. The proposed model delves soft-attention into the
    recurrence to simultaneously pool out distinct features with particular focus
    and produce hidden representations that capture contextual variations of
    relevant posts over time. Extensive experiments on real datasets collected from
    social media websites demonstrate that (1) the deep attention based RNN model
    outperforms state-of-the-arts that rely on hand-crafted features; (2) the
    introduction of soft attention mechanism can effectively distill relevant parts
    to rumors from original posts in advance; (3) the proposed method detects
    rumors more quickly and accurately than competitors.

    SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours

    Leon Derczynski, Kalina Bontcheva, Maria Liakata, Rob Procter, Geraldine Wong Sak Hoi, Arkaitz Zubiaga
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Media is full of false claims. Even Oxford Dictionaries named “post-truth” as
    the word of 2016. This makes it more important than ever to build systems that
    can identify the veracity of a story, and the kind of discourse there is around
    it. RumourEval is a SemEval shared task that aims to identify and handle
    rumours and reactions to them, in text. We present an annotation scheme, a
    large dataset covering multiple topics – each having their own families of
    claims and replies – and use these to pose two concrete challenges as well as
    the results achieved by participants on these challenges.

    Global Relation Embedding for Relation Extraction

    Yu Su, Honglei Liu, Semih Yavuz, Izzeddin Gur, Huan Sun, Xifeng Yan
    Subjects: Computation and Language (cs.CL)

    Recent studies have shown that embedding textual relations using deep neural
    networks greatly helps relation extraction. However, many existing studies rely
    on supervised learning; their performance is dramatically limited by the
    availability of training data. In this work, we generalize textual relation
    embedding to the distant supervision setting, where much larger-scale but noisy
    training data is available. We propose leveraging global statistics of
    relations, i.e., the co-occurrence statistics of textual and knowledge base
    relations collected from the entire corpus, to embed textual relations. This
    approach turns out to be more robust to the training noise introduced by
    distant supervision. On a popular relation extraction dataset, we show that the
    learned textual relation embeddings can be used to augment existing relation
    extraction models and significantly improve their performance. Most remarkably,
    for the top 1,000 relational facts discovered by the best existing model, the
    precision can be improved from 83.9% to 89.3%.

    An Interpretable Knowledge Transfer Model for Knowledge Base Completion

    Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Knowledge bases are important resources for a variety of natural language
    processing tasks but suffer from incompleteness. We propose a novel embedding
    model, emph{ITransF}, to perform knowledge base completion. Equipped with a
    sparse attention mechanism, ITransF discovers hidden concepts of relations and
    transfer statistical strength through the sharing of concepts. Moreover, the
    learned associations between relations and concepts, which are represented by
    sparse attention vectors, can be interpreted easily. We evaluate ITransF on two
    benchmark datasets—WN18 and FB15k for knowledge base completion and obtains
    improvements on both the mean rank and Hits@10 metrics, over all baselines that
    do not use additional information.

    End-to-End Multi-View Networks for Text Classification

    Hongyu Guo, Colin Cherry, Jiang Su
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We propose a multi-view network for text classification. Our method
    automatically creates various views of its input text, each taking the form of
    soft attention weights that distribute the classifier’s focus among a set of
    base features. For a bag-of-words representation, each view focuses on a
    different subset of the text’s words. Aggregating many such views results in a
    more discriminative and robust representation. Through a novel architecture
    that both stacks and concatenates views, we produce a network that emphasizes
    both depth and width, allowing training to converge quickly. Using our
    multi-view architecture, we establish new state-of-the-art accuracies on two
    benchmark tasks.


    Distributed, Parallel, and Cluster Computing

    Intrusion Prevention and Detection in Grid Computing – The ALICE Case

    Andres Gomez, Camilo Lara, Udo Kebschull (for the ALICE Collaboration)
    Comments: Journal of Physics: Conference Series, Volume 664
    Journal-ref: J. Phys.: Conf. Ser. 664 062017 (2015)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); High Energy Physics – Experiment (hep-ex)

    Grids allow users flexible on-demand usage of computing resources through
    remote communication networks. A remarkable example of a Grid in High Energy
    Physics (HEP) research is used in the ALICE experiment at European Organization
    for Nuclear Research CERN. Physicists can submit jobs used to process the huge
    amount of particle collision data produced by the Large Hadron Collider (LHC).
    Grids face complex security challenges. They are interesting targets for
    attackers seeking for huge computational resources. Since users can execute
    arbitrary code in the worker nodes on the Grid sites, special care should be
    put in this environment. Automatic tools to harden and monitor this scenario
    are required. Currently, there is no integrated solution for such requirement.
    This paper describes a new security framework to allow execution of job
    payloads in a sandboxed context. It also allows process behavior monitoring to
    detect intrusions, even when new attack methods or zero day vulnerabilities are
    exploited, by a Machine Learning approach. We plan to implement the proposed
    framework as a software prototype that will be tested as a component of the
    ALICE Grid middleware.

    How Bandwidth Affects the (CONGEST) Model

    Dennis Olivetti
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The (CONGEST) model for distributed network computing is well suited for
    analyzing the impact of limiting the throughput of a network on its capacity to
    solve tasks efficiently. For many “global” problems there exists a lower bound
    of (Omega(D + sqrt{n/B})), where (B) is the amount of bits that can be
    exchanged between two nodes in one round of communication, (n) is the number of
    nodes and (D) is the diameter of the graph. Typically, upper bounds are given
    only for the case (B=O(log n)), or for the case (B = +infty). For (B=O(log
    n)), the Minimum Spanning Tree (MST) construction problem can be solved in (O(D
    + sqrt{n}log^* n)) rounds, and the Single Source Shortest Path (SSSP) problem
    can be ((1+epsilon))-approximated in (widetilde{O}(epsilon^{-O(1)}
    (D+sqrt{n}) )) rounds. We extend these results by providing algorithms with a
    complexity parametric on (B). We show that, for any (B=Omega(log n)), there
    exists an algorithm that constructs a MST in (widetilde{O}(D + sqrt{n/B}))
    rounds, and an algorithm that ((1+epsilon))-approximate the SSSP problem in
    (widetilde{O}(epsilon^{-O(1)} (D+sqrt{n/B}) )) rounds. We also show that
    there exist problems that are bandwidth insensitive.

    Name Independent Fault Tolerant Routing Scheme

    Alkida Balliu, Dennis Olivetti
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We consider the problem of routing in presence of faults in undirected
    weighted graphs. More specifically, we focus on the design of compact
    name-independent fault-tolerant routing schemes, where the designer of the
    scheme is not allowed to assign names to nodes, i.e., the name of a node is
    just its identifier. Given a set (F) of faulty (or forbidden) edges, the goal
    is to route from a source node (s) to a target (t) avoiding the forbidden edges
    in (F).

    Given any name-dependent fault-tolerant routing scheme and any
    name-independent routing scheme, we show how to use them as a black box to
    construct a name-independent fault-tolerant routing scheme. In particular, we
    present a name-independent routing scheme able to handle any set (F) of
    forbidden edges. This has stretch (O(k^2,|F|^3(|F|+log^2 n)log D)), where
    (D) is the diameter of the graph. It uses tables of size ( widetilde{O}(k,
    n^{1/k}(k + deg(v)))) bits at every node (v), where (deg(v)) is the degree of
    node (v). In the context of networks that suffer only from occasional failures,
    we present a name-independent routing scheme that handles only (1) fault at a
    time, and another routing scheme that handles at most (2) faults at a time. The
    former uses (widetilde{O}(k^2, n^{1/k} + k,deg(v))) bits of memory per node,
    with stretch (O(k^3log D)). The latter consumes in average ( widetilde{O}(k^2
    ,n^{1/k} + deg(v))) bits of memory per node, with stretch (O(k^2log D)).

    Certification of Compact Low-Stretch Routing Schemes

    Alkida Balliu, Pierre Fraigniaud
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    On the one hand, the correctness of routing protocols in networks is an issue
    of utmost importance for guaranteeing the delivery of messages from any source
    to any target. On the other hand, a large collection of routing schemes have
    been proposed during the last two decades, with the objective of transmitting
    messages along short routes, while keeping the routing tables small.
    Regrettably, all these schemes share the property that an adversary may modify
    the content of the routing tables with the objective of, e.g., blocking the
    delivery of messages between some pairs of nodes, without being detected by any
    node.

    In this paper, we present a simple certification mechanism which enables the
    nodes to locally detect any alteration of their routing tables. In particular,
    we show how to locally verify the stretch-3 routing scheme by Thorup and Zwick
    [SPAA 2001] by adding certificates of (widetilde{O}(sqrt{n})) bits at each
    node in (n)-node networks, that is, by keeping the memory size of the same
    order of magnitude as the original routing tables. We also propose a new
    name-independent routing scheme using routing tables of size
    (widetilde{O}(sqrt{n})) bits. This new routing scheme can be locally verified
    using certificates on (widetilde{O}(sqrt{n})) bits. Its stretch is3 if using
    handshaking, and 5 otherwise.

    The Design, Implementation, and Deployment of a System to Transparently Compress Hundreds of Petabytes of Image Files for a File-Storage Service

    Daniel Reiter Horn, Ken Elkabany, Chris Lesniewski-Laas, Keith Winstein
    Comments: 12 pages
    Journal-ref: Proc. NSDI 2017, Boston. p1-15
    Subjects: Multimedia (cs.MM); Distributed, Parallel, and Cluster Computing (cs.DC); Graphics (cs.GR); Networking and Internet Architecture (cs.NI)

    We report the design, implementation, and deployment of Lepton, a
    fault-tolerant system that losslessly compresses JPEG images to 77% of their
    original size on average. Lepton replaces the lowest layer of baseline JPEG
    compression-a Huffman code-with a parallelized arithmetic code, so that the
    exact bytes of the original JPEG file can be recovered quickly. Lepton matches
    the compression efficiency of the best prior work, while decoding more than
    nine times faster and in a streaming manner. Lepton has been released as
    open-source software and has been deployed for a year on the Dropbox
    file-storage backend. As of February 2017, it had compressed more than 203 PiB
    of user JPEG files, saving more than 46 PiB.

    User-driven Intelligent Interface on the Basis of Multimodal Augmented Reality and Brain-Computer Interaction for People with Functional Disabilities

    S. Stirenko, Yu. Gordienko, T. Shemsedinov, O. Alienin, Yu. Kochura, N. Gordienko, A. Rojbi, J.R. López Benito, E. Artetxe González
    Comments: 6 pages, 3 figures, 1 table, IEEE First Ukraine Conference on Electrical and Computer Engineering (UKRCON-2017)
    Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC)

    The analysis of the current integration attempts of some modes and use cases
    of user-machine interaction is presented. The new concept of the user-driven
    intelligent interface is proposed on the basis of multimodal augmented reality
    and brain-computer interaction for various applications: in disabilities
    studies, education, home care, health care, etc. The several use cases of
    multimodal augmentation are presented. The perspectives of the better human
    comprehension by the immediate feedback through neurophysical channels by means
    of brain-computer interaction are outlined. It is shown that brain-computer
    interface (BCI) technology provides new strategies to overcome limits of the
    currently available user interfaces, especially for people with functional
    disabilities. The results of the previous studies of the low end consumer and
    open-source BCI-devices allow us to conclude that combination of machine
    learning (ML), multimodal interactions (visual, sound, tactile) with BCI will
    profit from the immediate feedback from the actual neurophysical reactions
    classified by ML methods. In general, BCI in combination with other modes of AR
    interaction can deliver much more information than these types of interaction
    themselves. Even in the current state the combined AR-BCI interfaces could
    provide the highly adaptable and personal services, especially for people with
    functional disabilities.


    Learning

    Dynamic Graph Convolutional Networks

    Franco Manessi, Alessandro Rozza, Mario Manzo
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Many different classification tasks need to manage structured data, which are
    usually modeled as graphs. Moreover, these graphs can be dynamic, meaning that
    the vertices/edges of each graph may change during time. Our goal is to jointly
    exploit structured data and temporal information through the use of a neural
    network model. To the best of our knowledge, this task has not been addressed
    using these kind of architectures. For this reason, we propose two novel
    approaches, which combine Long Short-Term Memory networks and Graph
    Convolutional Networks to learn long short-term dependencies together with
    graph structure. The quality of our methods is confirmed by the promising
    results achieved.

    Softmax GAN

    Min Lin
    Comments: NIPS 2017 submission
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Softmax GAN is a novel variant of Generative Adversarial Network (GAN). The
    key idea of Softmax GAN is to replace the classification loss in the original
    GAN with a softmax cross-entropy loss in the sample space of one single batch.
    In the adversarial learning of (N) real training samples and (M) generated
    samples, the target of discriminator training is to distribute all the
    probability mass to the real samples, each with probability (frac{1}{M}), and
    distribute zero probability to generated data. In the generator training phase,
    the target is to assign equal probability to all data points in the batch, each
    with probability (frac{1}{M+N}). While the original GAN is closely related to
    Noise Contrastive Estimation (NCE), we show that Softmax GAN is the Importance
    Sampling version of GAN. We futher demonstrate with experiments that this
    simple change stabilizes GAN training.

    Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss

    Yehezkel S. Resheff, Amit Mandelbaum, Daphna Weinshall
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Deep learning has become the method of choice in many application domains of
    machine learning in recent years, especially for multi-class classification
    tasks. The most common loss function used in this context is the cross-entropy
    loss, which reduces to the log loss in the typical case when there is a single
    correct response label. While this loss is insensitive to the identity of the
    assigned class in the case of misclassification, in practice it is often the
    case that some errors may be more detrimental than others. Here we present the
    bilinear-loss (and related log-bilinear-loss) which differentially penalizes
    the different wrong assignments of the model. We thoroughly test this method
    using standard models and benchmark image datasets. As one application, we show
    the ability of this method to better contain error within the correct
    super-class, in the hierarchically labeled CIFAR100 dataset, without affecting
    the overall performance of the classifier.

    Multi-view Probability Linear Discrimination Analysis for Multi-view Vector Based Text Dependent Speaker Verification

    Ziqiang Shi, Rujie Liu
    Subjects: Learning (cs.LG)

    In this paper, we explicitly extract and model jointly multi-view information
    from short utterances of the individuals, such as speaker identity and text
    contents. During the development stage, a deep neural network (DNN) that will
    be used to extract j-vector, is initialized and trained with the speech frames
    as input and the actual side information of the utterance as flat output
    block-wise one-hot labels. In the case of text dependent speaker verification,
    since there is no one-one mapping between input frames and text content labels,
    a syllable aware DNN is trained to provide compact lexical representation, the
    s-vector of the utterance. These two vectors (j-vector and s-vector) will be
    combined together to become a multi-view vector representation of the utterance
    during the enrollment and evaluation stages. In order to better describe such
    multi-view vectors, we propose a multi-view probability linear discriminant
    analysis (PLDA) model which incorporates both within-speaker/text and
    between-speaker/text variation. In verification we calculate the likelihood
    that the two multi-view vectors belonging to the same speaker and text or not,
    and the likelihood will be used in decision-making. Large scale experiments for
    the open-set condition showed that our approach leads to 0.26\% EER, 2.7\% EER,
    and 1.34\% EER for impost wrong, impostor correct, and target wrong
    respectively.

    Fast Generation for Convolutional Autoregressive Models

    Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang
    Comments: Accepted at ICLR 2017 Workshop
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Convolutional autoregressive models have recently demonstrated
    state-of-the-art performance on a number of generation tasks. While fast,
    parallel training methods have been crucial for their success, generation is
    typically implemented in a na”{i}ve fashion where redundant computations are
    unnecessarily repeated. This results in slow generation, making such models
    infeasible for production environments. In this work, we describe a method to
    speed up generation in convolutional autoregressive models. The key idea is to
    cache hidden states to avoid redundant computation. We apply our fast
    generation method to the Wavenet and PixelCNN++ models and achieve up to
    (21 imes) and (183 imes) speedups respectively.

    SAFS: A Deep Feature Selection Approach for Precision Medicine

    Milad Zafar Nezhad, Dongxiao Zhu, Xiangrui Li, Kai Yang, Phillip Levy
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose a new deep feature selection method based on deep
    architecture. Our method uses stacked auto-encoders for feature representation
    in higher-level abstraction. We developed and applied a novel feature learning
    approach to a specific precision medicine problem, which focuses on assessing
    and prioritizing risk factors for hypertension (HTN) in a vulnerable
    demographic subgroup (African-American). Our approach is to use deep learning
    to identify significant risk factors affecting left ventricular mass indexed to
    body surface area (LVMI) as an indicator of heart damage risk. The results show
    that our feature learning and representation approach leads to better results
    in comparison with others.

    Robust Wirtinger Flow for Phase Retrieval with Arbitrary Corruption

    Jinghui Chen, Lingxiao Wang, Xiao Zhang, Quanquan Gu
    Comments: 32 pages, 5 figures, 2 tables
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We consider the phase retrieval problem of recovering the unknown signal from
    the magnitude-only measurements, where the measurements can be contaminated by
    both sparse arbitrary corruption and bounded random noise. We propose a new
    nonconvex algorithm for robust phase retrieval, namely Robust Wirtinger Flow,
    to jointly estimate the unknown signal and the sparse corruption. We show that
    our proposed algorithm is guaranteed to converge linearly to the unknown true
    signal up to a minimax optimal statistical precision in such a challenging
    setting. Compared with existing robust phase retrieval methods, we improved the
    statistical error rate by a factor of (sqrt{n/m}) where (n) is the dimension
    of the signal and (m) is the sample size, provided a refined characterization
    of the corruption fraction requirement, and relaxed the lower bound condition
    on the number of corruption. In the noise-free case, our algorithm converges to
    the unknown signal at a linear rate and achieves optimal sample complexity up
    to a logarithm factor. Thorough experiments on both synthetic and real datasets
    corroborate our theory.

    Segmentation of the Proximal Femur from MR Images using Deep Convolutional Neural Networks

    Cem M. Deniz, Spencer Hallyburton, Arakua Welbeck, Stephen Honig, Kyunghyun Cho, Gregory Chang
    Comments: 7 pages, 5 figures, and one table
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Osteoporosis is a public health problem characterized by increased fracture
    risk secondary to low bone mass and microarchitectural deterioration of bone
    tissue. Almost all fractures of the hip require hospitalization and major
    surgery. Early diagnosis of osteoporosis plays an important role in preventing
    osteoporotic fracture. Magnetic resonance imaging (MRI) has been successfully
    performed to image trabecular bone architecture in vivo proving itself as the
    practical imaging modality for bone quality assessment. However, segmentation
    of the whole proximal femur is required to measure bone quality and assess
    fracture risk precisely. Manual segmentation of the proximal femur is
    time-intensive, limiting the use of MRI measurements in the clinical practice.
    To overcome this bottleneck, robust automatic proximal femur segmentation
    method is required. In this paper, we propose to use deep convolutional neural
    networks (CNNs) for an automatic proximal femur segmentation using structural
    MR images. We constructed a dataset with 62 volumetric MR scans that are
    manually-segmented for proximal femur. We performed experiments using two
    different CNN architectures and achieved a high dice similarity score of 0.95.

    Learning to Acquire Information

    Yewen Pu, Leslie P Kaelbling, Armando Solar-Lezama
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We consider the problem of diagnosis where a set of simple observations are
    used to infer a potentially complex hidden hypothesis. Finding the optimal
    subset of observations is intractable in general, thus we focus on the problem
    of active diagnosis, where the agent selects the next most-informative
    observation based on the results of previous observations. We show that under
    the assumption of uniform observation entropy, one can build an implication
    model which directly predicts the outcome of the potential next observation
    conditioned on the results of past observations, and selects the observation
    with the maximum entropy. This approach enjoys reduced computation complexity
    by bypassing the complicated hypothesis space, and can be trained on
    observation data alone, learning how to query without knowledge of the hidden
    hypothesis.

    Knowledge Fusion via Embeddings from Text, Knowledge Graphs, and Images

    Steffen Thoma, Achim Rettinger, Fabian Both
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We present a baseline approach for cross-modal knowledge fusion. Different
    basic fusion methods are evaluated on existing embedding approaches to show the
    potential of joining knowledge about certain concepts across modalities in a
    fused concept representation.

    End-to-end representation learning for Correlation Filter based tracking

    Jack Valmadre, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr
    Comments: To appear at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    The Correlation Filter is an algorithm that trains a linear template to
    discriminate between images and their translations. It is well suited to object
    tracking because its formulation in the Fourier domain provides a fast
    solution, enabling the detector to be re-trained once per frame. Previous works
    that use the Correlation Filter, however, have adopted features that were
    either manually designed or trained for a different task. This work is the
    first to overcome this limitation by interpreting the Correlation Filter
    learner, which has a closed-form solution, as a differentiable layer in a deep
    neural network. This enables learning deep features that are tightly coupled to
    the Correlation Filter. Experiments illustrate that our method has the
    important practical benefit of allowing lightweight architectures to achieve
    state-of-the-art performance at high framerates.

    Retrospective Higher-Order Markov Processes for User Trails

    Tao Wu, David Gleich
    Subjects: Social and Information Networks (cs.SI); Learning (cs.LG); Machine Learning (stat.ML)

    Users form information trails as they browse the web, checkin with a
    geolocation, rate items, or consume media. A common problem is to predict what
    a user might do next for the purposes of guidance, recommendation, or
    prefetching. First-order and higher-order Markov chains have been widely used
    methods to study such sequences of data. First-order Markov chains are easy to
    estimate, but lack accuracy when history matters. Higher-order Markov chains,
    in contrast, have too many parameters and suffer from overfitting the training
    data. Fitting these parameters with regularization and smoothing only offers
    mild improvements. In this paper we propose the retrospective higher-order
    Markov process (RHOMP) as a low-parameter model for such sequences. This model
    is a special case of a higher-order Markov chain where the transitions depend
    retrospectively on a single history state instead of an arbitrary combination
    of history states. There are two immediate computational advantages: the number
    of parameters is linear in the order of the Markov chain and the model can be
    fit to large state spaces. Furthermore, by providing a specific structure to
    the higher-order chain, RHOMPs improve the model accuracy by efficiently
    utilizing history states without risks of overfitting the data. We demonstrate
    how to estimate a RHOMP from data and we demonstrate the effectiveness of our
    method on various real application datasets spanning geolocation data, review
    sequences, and business locations. The RHOMP model uniformly outperforms
    higher-order Markov chains, Kneser-Ney regularization, and tensor
    factorizations in terms of prediction accuracy.

    Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds

    Daniel R. Jiang, Lina Al-Kanj, Warren B. Powell
    Comments: 33 pages, 6 figures
    Subjects: Optimization and Control (math.OC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Monte Carlo Tree Search (MCTS), most famously used in game-play artificial
    intelligence (e.g., the game of Go), is a well-known strategy for constructing
    approximate solutions to sequential decision problems. Its primary innovation
    is the use of a heuristic, known as a default policy, to obtain Monte Carlo
    estimates of downstream values for states in a decision tree. This information
    is used to iteratively expand the tree towards regions of states and actions
    that an optimal policy might visit. However, to guarantee convergence to the
    optimal action, MCTS requires the entire tree to be expanded asymptotically. In
    this paper, we propose a new technique called Primal-Dual MCTS that utilizes
    sampled information relaxation upper bounds on potential actions, creating the
    possibility of “ignoring” parts of the tree that stem from highly suboptimal
    choices. This allows us to prove that despite converging to a partial decision
    tree in the limit, the recommended action from Primal-Dual MCTS is optimal. The
    new approach shows significant promise when used to optimize the behavior of a
    single driver navigating a graph while operating on a ride-sharing platform.
    Numerical experiments on a real dataset of 7,000 trips in New Jersey suggest
    that Primal-Dual MCTS improves upon standard MCTS by producing deeper decision
    trees and exhibits a reduced sensitivity to the size of the action space.

    Semi-supervised classification for dynamic Android malware detection

    Li Chen, Mingwei Zhang, Chih-Yuan Yang, Ravi Sahita
    Subjects: Cryptography and Security (cs.CR); Learning (cs.LG); Machine Learning (stat.ML)

    A growing number of threats to Android phones creates challenges for malware
    detection. Manually labeling the samples into benign or different malicious
    families requires tremendous human efforts, while it is comparably easy and
    cheap to obtain a large amount of unlabeled APKs from various sources.
    Moreover, the fast-paced evolution of Android malware continuously generates
    derivative malware families. These families often contain new signatures, which
    can escape detection when using static analysis. These practical challenges can
    also cause traditional supervised machine learning algorithms to degrade in
    performance.

    In this paper, we propose a framework that uses model-based semi-supervised
    (MBSS) classification scheme on the dynamic Android API call logs. The
    semi-supervised approach efficiently uses the labeled and unlabeled APKs to
    estimate a finite mixture model of Gaussian distributions via conditional
    expectation-maximization and efficiently detects malwares during out-of-sample
    testing. We compare MBSS with the popular malware detection classifiers such as
    support vector machine (SVM), (k)-nearest neighbor (kNN) and linear
    discriminant analysis (LDA). Under the ideal classification setting, MBSS has
    competitive performance with 98\% accuracy and very low false positive rate for
    in-sample classification. For out-of-sample testing, the out-of-sample test
    data exhibit similar behavior of retrieving phone information and sending to
    the network, compared with in-sample training set. When this similarity is
    strong, MBSS and SVM with linear kernel maintain 90\% detection rate while
    (k)NN and LDA suffer great performance degradation. When this similarity is
    slightly weaker, all classifiers degrade in performance, but MBSS still
    performs significantly better than other classifiers.

    An Interpretable Knowledge Transfer Model for Knowledge Base Completion

    Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Knowledge bases are important resources for a variety of natural language
    processing tasks but suffer from incompleteness. We propose a novel embedding
    model, emph{ITransF}, to perform knowledge base completion. Equipped with a
    sparse attention mechanism, ITransF discovers hidden concepts of relations and
    transfer statistical strength through the sharing of concepts. Moreover, the
    learned associations between relations and concepts, which are represented by
    sparse attention vectors, can be interpreted easily. We evaluate ITransF on two
    benchmark datasets—WN18 and FB15k for knowledge base completion and obtains
    improvements on both the mean rank and Hits@10 metrics, over all baselines that
    do not use additional information.

    End-to-End Multi-View Networks for Text Classification

    Hongyu Guo, Colin Cherry, Jiang Su
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We propose a multi-view network for text classification. Our method
    automatically creates various views of its input text, each taking the form of
    soft attention weights that distribute the classifier’s focus among a set of
    base features. For a bag-of-words representation, each view focuses on a
    different subset of the text’s words. Aggregating many such views results in a
    more discriminative and robust representation. Through a novel architecture
    that both stacks and concatenates views, we produce a network that emphasizes
    both depth and width, allowing training to converge quickly. Using our
    multi-view architecture, we establish new state-of-the-art accuracies on two
    benchmark tasks.


    Information Theory

    On the DoF of Parallel MISO BCs with Partial CSIT: Total Order and Separability

    Hamdi Joudeh, Bruno Clerckx
    Subjects: Information Theory (cs.IT)

    We study the degrees of freedom (DoF) of a (K)-user parallel MISO broadcast
    channel with arbitrary levels of partial CSIT over each subchannel. We derive a
    sum-DoF upperbound which depends on the average CSIT quality of each user. This
    upperbound is shown to be tight under total order, i.e. when the order of users
    with respect to their CSIT qualities is preserved over all subchannels. In this
    case, it is shown that separate coding over each subchannel is optimum in a
    sum-DoF sense.

    A Counterexample to the Vector Generalization of Costa's EPI, and Partial Resolution

    Thomas A. Courtade, Guangyue Han, Yaochen Wu
    Comments: 3 pages
    Subjects: Information Theory (cs.IT)

    We give a counterexample to the vector generalization of Costa’s entropy
    power inequality (EPI) due to Liu, Liu, Poor and Shamai. In particular, the
    claimed inequality can fail if the matix-valued parameter in the convex
    combination does not commute with the covariance of the additive Gaussian
    noise. Conversely, the inequality holds if these two matrices commute.

    An Achievable Rate for an Optical Channel with Finite Memory

    K Gautam Shenoy, Vinod Sharma
    Comments: 7 pages, 5 figures. Submitted to IEEE Globecom 2017 conference
    Subjects: Information Theory (cs.IT)

    A fiber optic channel is modeled in a variety of ways; from the simple
    additive white complex Gaussian noise model, to models that incorporate memory
    in the channel. Because of Kerr nonlinearity, a simple model is not a good
    approximation to an optical fiber. Hence we study a fiber optic channel with
    finite memory and provide an achievable bound on channel capacity that improves
    upon a previously known bound.

    Graph-based Joint Signal / Power Restoration for Energy Harvesting Wireless Sensor Networks

    Megumi Kaneko, Gene Cheung, Weng-tai Su, Chia-Wen Lin
    Subjects: Information Theory (cs.IT)

    The design of energy and spectrally efficient Wireless Sensor Networks (WSN)
    is crucial to support the upcoming expansion of IoT/M2M mobile data traffic. In
    this work, we consider an energy harvesting WSN where sensor data are
    periodically reported to a Fusion Center (FC) by a sparse set of active
    sensors. Unlike most existing works, the transmit power levels of each sensor
    are assumed to be unknown at the FC in this distributed setting. We address the
    inverse problem of joint signal / power restoration at the FC- a challenging
    under-determined separation problem. To regularize the ill-posed problem, we
    assume both a graph-signal smoothness prior (signal is smooth with respect to a
    graph modeling spatial correlation among sensors) and a sparsity power prior
    for the two unknown variables. We design an efficient algorithm by alternately
    fixing one variable and solving for the other until convergence. Specifically,
    when the signal is fixed, we solve for the power vector using Simplex pivoting
    in linear programming (LP) to iteratively identify sparse feasible solutions,
    locally minimizing an objective. Simulation results show that our proposal can
    achieve very low reconstruction errors and outperform conventional schemes
    under reasonable assumptions

    On the Success Probability of the Box-Constrained Rounding and Babai Detectors

    Xiao-Wen Chang, Jinming Wen, Chintha Tellambura
    Comments: to appear in ISIT 2017
    Subjects: Information Theory (cs.IT)

    In communications, one frequently needs to detect a parameter vector (hbx)
    in a box from a linear model. The box-constrained rounding detector (x^sBR)
    and Babai detector (x^sBB) are often used to detect (hbx) due to their high
    probability of correct detection, which is referred to as success probability,
    and their high efficiency of implimentation. It is generally believed that the
    success probability (P^sBR) of (x^sBR) is not larger than the success
    probability (P^sBB) of (x^sBB). In this paper, we first present formulas for
    (P^sBR) and (P^sBB) for two different situations: (hbx) is deterministic and
    (hbx) is uniformly distributed over the constraint box. Then, we give a simple
    example to show that (P^sBR) may be strictly larger than (P^sBB) if (hbx) is
    deterministic, while we rigorously show that (P^sBRleq P^sBB) always holds
    if (hbx) is uniformly distributed over the constraint box.

    Piggybacking Codes for Network Coding: The High/Low SNR Regime

    Samah A. M. Ghanem
    Subjects: Information Theory (cs.IT)

    We propose a piggybacking scheme for network coding where strong source
    inputs piggyback the weaker ones, a scheme necessary and sufficient to achieve
    the cut-set upper bound at high/low-snr regime, a new asymptotically optimal
    operational regime for the multihop Amplify and Forward (AF) networks.




沪ICP备19023445号-2号
友情链接