IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Mon, 28 Nov 2016

    我爱机器学习(52ml.net)发表于 2016-11-28 00:00:00
    love 0

    Neural and Evolutionary Computing

    Learning Python Code Suggestion with a Sparse Pointer Network

    Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Software Engineering (cs.SE)

    To enhance developer productivity, all modern integrated development
    environments (IDEs) include code suggestion functionality that proposes likely
    next tokens at the cursor. While current IDEs work well for statically-typed
    languages, their reliance on type annotations means that they do not provide
    the same level of support for dynamic programming languages as for
    statically-typed languages. Moreover, suggestion engines in modern IDEs do not
    propose expressions or multi-statement idiomatic code. Recent work has shown
    that language models can improve code suggestion systems by learning from
    software repositories. This paper introduces a neural language model with a
    sparse pointer network aimed at capturing very long-range dependencies. We
    release a large-scale code suggestion corpus of 41M lines of Python code
    crawled from GitHub. On this corpus, we found standard neural language models
    to perform well at suggesting local phenomena, but struggle to refer to
    identifiers that are introduced many tokens in the past. By augmenting a neural
    language model with a pointer network specialized in referring to predefined
    classes of identifiers, we obtain a much lower perplexity and a 5 percentage
    points increase in accuracy for code suggestion compared to an LSTM baseline.
    In fact, this increase in code suggestion accuracy is due to a 13 times more
    accurate prediction of identifiers. Furthermore, a qualitative analysis shows
    this model indeed captures interesting long-range dependencies, like referring
    to a class member defined over 60 tokens in the past.

    Survey of Expressivity in Deep Neural Networks

    Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We survey results on neural network expressivity described in “On the
    Expressive Power of Deep Neural Networks”. The paper motivates and develops
    three natural measures of expressiveness, which all display an exponential
    dependence on the depth of the network. In fact, all of these measures are
    related to a fourth quantity, trajectory length. This quantity grows
    exponentially in the depth of the network, and is responsible for the depth
    sensitivity observed. These results translate to consequences for networks
    during and after training. They suggest that parameters earlier in a network
    have greater influence on its expressive power — in particular, given a layer,
    its influence on expressivity is determined by the remaining depth of the
    network after that layer. This is verified with experiments on MNIST and
    CIFAR-10. We also explore the effect of training on the input-output map, and
    find that it trades off between the stability and expressivity.


    Computer Vision and Pattern Recognition

    PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

    Sanghoon Hong, Byungseok Roh, Kye-Hyeon Kim, Yeongjae Cheon, Minje Park
    Comments: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of arXiv:1608.08021
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In object detection, reducing computational cost is as important as improving
    accuracy for most practical usages. This paper proposes a novel network
    structure, which is an order of magnitude lighter than other state-of-the-art
    networks while maintaining the accuracy. Based on the basic principle of more
    layers with less channels, this new deep neural network minimizes its
    redundancy by adopting recent innovations including C.ReLU and Inception
    structure. We also show that this network can be trained efficiently to achieve
    solid results on well-known object detection benchmarks: 84.9% and 84.2% mAP on
    VOC2007 and VOC2012 while the required compute is less than 10% of the recent
    ResNet-101.

    Learning from Maps: Visual Common Sense for Autonomous Driving

    Ari Seff, Jianxiong Xiao
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Today’s autonomous vehicles rely extensively on high-definition 3D maps to
    navigate the environment. While this approach works well when these maps are
    completely up-to-date, safe autonomous vehicles must be able to corroborate the
    map’s information via a real time sensor-based system. Our goal in this work is
    to develop a model for road layout inference given imagery from on-board
    cameras, without any reliance on high-definition maps. However, no sufficient
    dataset for training such a model exists. Here, we leverage the availability of
    standard navigation maps and corresponding street view images to construct an
    automatically labeled, large-scale dataset for this complex scene understanding
    problem. By matching road vectors and metadata from navigation maps with Google
    Street View images, we can assign ground truth road layout attributes (e.g.,
    distance to an intersection, one-way vs. two-way street) to the images. We then
    train deep convolutional networks to predict these road layout attributes given
    a single monocular RGB image. Experimental evaluation demonstrates that our
    model learns to correctly infer the road attributes using only panoramas
    captured by car-mounted cameras as input. Additionally, our results indicate
    that this method may be suitable to the novel application of recommending
    safety improvements to infrastructure (e.g., suggesting an alternative speed
    limit for a street).

    Online Real time Multiple Spatiotemporal Action Localisation and Prediction on a Single Platform

    Gurkirt Singh, Suman Saha, Fabio Cuzzolin
    Comments: Submitted to CVPR 2017 for review
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a method for multiple spatiotemporal (S/T) action localisation,
    classification and early prediction based on a single deep learning platform,
    able to perform in an online fashion and in real time. Our online action
    detection pipeline comprises of three main steps. In step one, two end-to-end
    trainable SSD (Single Shot MultiBox Detector) convolutional neural networks are
    employed to regress and classify detection boxes in each video frame
    potentially containing an action of interest, one from optical flow and one
    from RBG images. In step two, appearance and motion cues are combined by
    merging the detection boxes and classification scores generated by the two
    networks. In step three, sequences of detection boxes most likely to be
    associated with a single action instance, called `action tubes’, are
    constructed incrementally and efficiently in an online fashion, allowing the
    system to perform early action class prediction and spatial localisation in
    real time. Empirical results on the challenging UCF101 and J-HMDB-21 datasets
    demonstrate new state-of-the-art results in S/T action localisation on UCF101,
    even when compared to offline competitors, and superior results on early action
    label prediction across the board. To the best of our knowledge, ours is the
    first platform able to perform online spatial and temporal action localisation.
    The system works in real time (30fps) when computing optical flow
    incrementally, perform better than real time (52fps) when processing only RBG
    images.

    Clickstream analysis for crowd-based object segmentation with confidence

    Eric Heim, Alexander Seitel, Christian Stock, Fabian Isensee, Lena Maier-Hein
    Comments: 13 pages, 12 figures, submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    With the rapidly increasing interest in machine learning based solutions for
    automatic image annotation, the availability of reference image annotations for
    algorithm training is one of the major bottlenecks in the field. Crowdsourcing
    has evolved as a valuable option for low-cost and large-scale data annotation;
    however, quality control remains a major issue which needs to be addressed. To
    our knowledge, we are the first to investigate the analysis of clickstreams to
    improve crowd-based image segmentation. Our method involves training a
    regressor to estimate a worker’s segmentation performance from the clickstream
    data, where the performance is represented by the DICE similarity coefficient
    (DSC) comparing the worker’s annotation and the reference segmentation. The DSC
    prediction can be used to identify spammers and weight individual workers by
    their (estimated) skill when merging multiple segmentations of one image. Using
    a total of 20,000 crowd annotations performed on publicly available data of
    cars and cats, we show that (1) our method is highly accurate in predicting the
    DSC based on clickstream data and thus outperforms state-of-the-art methods for
    merging multiple annotations and that (2) the regressor does not need to be
    trained on the object class (e.g. cars) that it is applied to (e.g. cats). In
    contrast to previously proposed methods, our approach does not rely on
    additional sanity tasks and can thus be regarded as a low-cost option for spam
    control and confidence analysis in the context of crowd-based image annotation.

    Person Re-Identification by Unsupervised Video Matching

    Xiaolong Ma, Xiatian Zhu, Shaogang Gong, Xudong Xie, Jianming Hu, Kin-Man Lam, Yisheng Zhong
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Most existing person re-identification (ReID) methods rely only on the
    spatial appearance information from either one or multiple person images,
    whilst ignore the space-time cues readily available in video or image-sequence
    data. Moreover, they often assume the availability of exhaustively labelled
    cross-view pairwise data for every camera pair, making them non-scalable to
    ReID applications in real-world large scale camera networks. In this work, we
    introduce a novel video based person ReID method capable of accurately matching
    people across views from arbitrary unaligned image-sequences without any
    labelled pairwise data. Specifically, we introduce a new space-time person
    representation by encoding multiple granularities of spatio-temporal dynamics
    in form of time series. Moreover, a Time Shift Dynamic Time Warping (TS-DTW)
    model is derived for performing automatically alignment whilst achieving data
    selection and matching between inherently inaccurate and incomplete sequences
    in a unified way. We further extend the TS-DTW model for accommodating multiple
    feature-sequences of an image-sequence in order to fuse information from
    different descriptions. Crucially, this model does not require pairwise
    labelled training data (i.e. unsupervised) therefore readily scalable to large
    scale camera networks of arbitrary camera pairs without the need for exhaustive
    data annotation for every camera pair. We show the effectiveness and advantages
    of the proposed method by extensive comparisons with related state-of-the-art
    approaches using two benchmarking ReID datasets, PRID2011 and iLIDS-VID.

    Multimodal Latent Variable Analysis

    Vardan Papyan, Ronen Talmon
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Consider a set of multiple, multimodal sensors capturing a complex system or
    a physical phenomenon of interest. Our primary goal is to distinguish the
    underlying sources of variability manifested in the measured data. The first
    step in our analysis is to find the common source of variability present in all
    sensor measurements. We base our work on a recent paper, which tackles this
    problem with alternating diffusion (AD). In this work, we suggest to further
    the analysis by extracting the sensor-specific variables in addition to the
    common source. We propose an algorithm, which we analyze theoretically, and
    then demonstrate on three different applications: a synthetic example, a toy
    problem, and the task of fetal ECG extraction.

    Discriminative Correlation Filter with Channel and Spatial Reliability

    Alan Lukežič, Tomáš Vojíř, Luka Čehovin, Jiří Matas, Matej Kristan
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Short-term tracking is an open and challenging problem for which
    discriminative correlation filters (DCF) have shown excellent performance. We
    introduce the channel and spatial reliability concepts to DCF tracking and
    provide a novel learning algorithm for its efficient and seamless integration
    in the filter update and the tracking process. The spatial reliability map
    adjusts the filter support to the part of the object suitable for tracking.
    This both allows to enlarge the search region and improves tracking of
    non-rectangular objects. Reliability scores reflect channel-wise quality of the
    learned filters and are used as feature weighting coefficients in localization.
    Experimentally, with only two simple standard features, HoGs and Colornames,
    the novel CSR-DCF method — DCF with Channel and Spatial Reliability —
    achieves state-of-the-art results on VOT 2016, VOT 2015 and OTB100. The CSR-DCF
    runs in real-time on a CPU.

    Semantic Segmentation using Adversarial Networks

    Pauline Luc, Camille Couprie, Soumith Chintala, Jakob Verbeek
    Journal-ref: NIPS Workshop on Adversarial Training, Dec 2016, Barcelona, Spain
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Adversarial training has been shown to produce state of the art results for
    generative image modeling. In this paper we propose an adversarial training
    approach to train semantic segmentation models. We train a convolutional
    semantic segmentation network along with an adversarial network that
    discriminates segmentation maps coming either from the ground truth or from the
    segmentation network. The motivation for our approach is that it can detect and
    correct higher-order inconsistencies between ground truth segmentation maps and
    the ones produced by the segmentation net. Our experiments show that our
    adversarial training approach leads to improved accuracy on the Stanford
    Background and PASCAL VOC 2012 datasets.

    Geometric deep learning on graphs and manifolds using mixture model CNNs

    Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Michael M. Bronstein
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep learning has achieved a remarkable performance breakthrough in several
    fields, most notably in speech recognition, natural language processing, and
    computer vision. In particular, convolutional neural network (CNN)
    architectures currently produce state-of-the-art performance on a variety of
    image analysis tasks such as object detection and recognition. Most of deep
    learning research has so far focused on dealing with 1D, 2D, or 3D
    Euclidean-structured data such as acoustic signals, images, or videos.
    Recently, there has been an increasing interest in geometric deep learning,
    attempting to generalize deep learning methods to non-Euclidean structured data
    such as graphs and manifolds, with a variety of applications from the domains
    of network analysis, computational social science, or computer graphics. In
    this paper, we propose a unified framework allowing to generalize CNN
    architectures to non-Euclidean domains (graphs and manifolds) and learn local,
    stationary, and compositional task-specific features. We show that various
    non-Euclidean CNN methods previously proposed in the literature can be
    considered as particular instances of our framework. We test the proposed
    method on standard tasks from the realms of image-, graph- and 3D shape
    analysis and show that it consistently outperforms previous approaches.

    Color Constancy with Derivative Colors

    Huan Lei, Guang Jiang, Long Quan
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Information about the illuminant color is well contained in both achromatic
    regions and the specular components of highlight regions. In this paper, we
    propose a novel way to achieve color constancy by exploiting such clues. The
    key to our approach lies in the use of suitably extracted derivative colors,
    which are able to compute the illuminant color robustly with kernel density
    estimation. While extracting derivative colors from achromatic regions to
    approximate the illuminant color well is basically straightforward, the success
    of our extraction in highlight regions is attributed to the different rates of
    variation of the diffuse and specular magnitudes in the dichromatic reflection
    model. The proposed approach requires no training phase and is simple to
    implement. More significantly, it performs quite satisfactorily under
    inter-database parameter settings. Our experiments on three standard databases
    demonstrate its effectiveness and fine performance in comparison to
    state-of-the-art methods.

    Deep Video Deblurring

    Shuochen Su, Mauricio Delbracio, Jue Wang, Guillermo Sapiro, Wolfgang Heidrich, Oliver Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Motion blur from camera shake is a major problem in videos captured by
    hand-held devices. Unlike single-image deblurring, video-based approaches can
    take advantage of the abundant information that exists across neighboring
    frames. As a result the best performing methods rely on aligning nearby frames.
    However, aligning images is a computationally expensive and fragile procedure,
    and methods that aggregate information must therefore be able to identify which
    regions have been accurately aligned and which have not, a task which requires
    high level scene understanding. In this work, we introduce a deep learning
    solution to video deblurring, where a CNN is trained end-to-end to learn how to
    accumulate information across frames. To train this network, we collected a
    dataset of real videos recorded with a high framerate camera, which we use to
    generate synthetic motion blur for supervision. We show that the features
    learned from this dataset extend to deblurring motion blur that arises due to
    camera shake in a wide range of videos, and compare the quality of results to a
    number of other baselines.

    Learning an Invariant Hilbert Space for Domain Adaptation

    Samitha Herath, Mehrtash Harandi, Fatih Porikli
    Comments: 23 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper introduces a learning scheme to construct a Hilbert space (i.e., a
    vector space along its inner product) to address both unsupervised and
    semi-supervised domain adaptation problems. This is achieved by learning
    projections from each domain to a latent space along the Mahalanobis metric of
    the latent space to simultaneously minimizing a notion of domain variance while
    maximizing a measure of discriminatory power. In particular, we make use of the
    Riemannian optimization techniques to match statistical properties (e.g., first
    and second order statistics) between samples projected into the latent space
    from different domains. Upon availability of class labels, we further deem
    samples sharing the same label to form more compact clusters while pulling away
    samples coming from different classes.We extensively evaluate and contrast our
    proposal against state-of-the-art methods for the task of visual domain
    adaptation using both handcrafted and deep-net features. Our experiments show
    that even with a simple nearest neighbor classifier, the proposed method can
    outperform several state-of-the-art methods benefitting from more involved
    classification schemes.

    Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes

    Tobias Pohlen, Alexander Hermans, Markus Mathias, Bastian Leibe
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Semantic image segmentation is an essential component of modern autonomous
    driving systems, as an accurate understanding of the surrounding scene is
    crucial to navigation and action planning. Current state-of-the-art approaches
    in semantic image segmentation rely on pre-trained networks that were initially
    developed for classifying images as a whole. While these networks exhibit
    outstanding recognition performance (i.e., what is visible?), they lack
    localization accuracy (i.e., where precisely is something located?). Therefore,
    additional processing steps have to be performed in order to obtain
    pixel-accurate segmentation masks at the full image resolution. To alleviate
    this problem we propose a novel ResNet-like architecture that exhibits strong
    localization and recognition performance. We combine multi-scale context with
    pixel-level accuracy by using two processing streams within our network: One
    stream carries information at the full image resolution, enabling precise
    adherence to segment boundaries. The other stream undergoes a sequence of
    pooling operations to obtain robust features for recognition. The two streams
    are coupled at the full image resolution using residuals. Without additional
    processing steps and without pre-training, our approach achieves an
    intersection-over-union score of 71.8% on the Cityscapes dataset.

    Deep Watershed Transform for Instance Segmentation

    Min Bai, Raquel Urtasun
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Most contemporary approaches to instance segmentation use complex pipelines
    involving conditional random fields, recurrent neural networks, object
    proposals, or template matching schemes. In our paper, we present a simple yet
    powerful end-to-end convolutional neural network to tackle this task. Our
    approach combines intuitions from the classical watershed transform and modern
    deep learning to produce an energy map of the image where object instances are
    unambiguously represented as basins in the energy map. We then perform a cut at
    a single energy level to directly yield connected components corresponding to
    object instances. Our model achieves an increase of 75% in performance on the
    challenging Cityscapes Instance Level Segmentation task over the
    state-of-the-art.

    InstanceCut: from Edges to Instances with MultiCut

    Alexander Kirillov, Evgeny Levinkov, Bjoern Andres, Bogdan Savchynskyy, Carsten Rother
    Comments: The code would be released at this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This work addresses the task of instance-aware semantic segmentation. Our key
    motivation is to design a simple method with a new modelling-paradigm, which
    therefore has a different trade-off between advantages and disadvantages
    compared to known approaches. Our approach, we term InstanceCut, represents the
    problem by two output modalities: (i) an instance-agnostic semantic
    segmentation and (ii) all instance-boundaries. The former is computed from a
    standard convolutional neural network for semantic segmentation, and the latter
    is derived from a new instance-aware edge detection model. To reason globally
    about the optimal partitioning of an image into instances, we combine these two
    modalities into a novel MultiCut formulation. We evaluate our approach on the
    challenging CityScapes dataset. Despite the conceptual simplicity of our
    approach, we achieve the best result among all published methods, and perform
    particularly well for rare object classes.

    Weakly Supervised Cascaded Convolutional Networks

    Ali Diba, Vivek Sharma, Ali Pazandeh, Hamed Pirsiavash, Luc Van Gool
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Object detection is a challenging task in visual understanding domain, and
    even more so if the supervision is to be weak. Recently, few efforts to handle
    the task without expensive human annotations is established by promising deep
    neural network. A new architecture of cascaded networks is proposed to learn a
    convolutional neural network (CNN) under such conditions. We introduce two such
    architectures, with either two cascade stages or three which are trained in an
    end-to-end pipeline. The first stage of both architectures extracts best
    candidate of class specific region proposals by training a fully convolutional
    network. In the case of the three stage architecture, the middle stage provides
    object segmentation, using the output of the activation maps of first stage.
    The final stage of both architectures is a part of a convolutional neural
    network that performs multiple instance learning on proposals extracted in the
    previous stage(s). Our experiments on the PASCAL VOC 2007, 2010, 2012 and large
    scale object datasets, ILSVRC 2013, 2014 datasets show improvements in the
    areas of weakly-supervised object detection, classification and localization.

    AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos

    Amlan Kar, Nishant Rai, Karan Sikka, Gaurav Sharma
    Comments: In submission at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a novel method for temporally pooling frames in a video for the
    task of human action recognition. The method is motivated by the observation
    that there are only a small number of frames which, together, contain
    sufficient information to discriminate an action class present in a video, from
    the rest. The proposed method learns to pool such discriminative and
    informative frames, while discarding a majority of the non-informative frames
    in a single temporal scan of the video. Our algorithm does so by continuously
    predicting the discriminative importance of each video frame and subsequently
    pooling them in a deep learning framework. We show the effectiveness of our
    proposed pooling method on standard benchmarks where it consistently improves
    on baseline pooling methods, with both RGB and optical flow based Convolutional
    networks. Further, in combination with complementary video representations, we
    show results that are competitive with respect to the state-of-the-art results
    on two challenging and publicly available benchmark datasets.

    Where Should You Attend While Driving?

    Andrea Palazzi, Francesco Solera, Simone Calderara, Stefano Alletto, Rita Cucchiara
    Comments: submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)

    Despite the advent of autonomous cars, it’s likely – at least in the near
    future – that human attention will still maintain a central role as a guarantee
    in terms of legal responsibility during the driving task. In this paper we
    study the dynamics of the driver’s gaze and use it as a proxy to understand
    related attentional mechanisms. First, we build our analysis upon two
    questions: where and what the driver is looking at? Second, we model the
    driver’s gaze by training a coarse-to-fine convolutional network on short
    sequences extracted from the DR(eye)VE dataset. Experimental comparison against
    different baselines reveal that the driver’s gaze can indeed be learnt to some
    extent, despite i) being highly subjective and ii) having only one driver’s
    gaze available for each sequence due to the irreproducibility of the scene.
    Eventually, we advocate for a new assisted driving paradigm which suggests to
    the driver, with no intervention, where she should focus her attention.

    Texture Synthesis with Spatial Generative Adversarial Networks

    Nikolay Jetchev, Urs Bergmann, Roland Vollgraf
    Comments: submitted at NIPS 2016 adversarial learning workshop
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Generative adversarial networks (GANs) are a recent approach to train
    generative models of data, which have been shown to work particularly well on
    image data. In the current paper we introduce a new model for texture synthesis
    based on GAN learning. By extending the input noise distribution space from a
    single vector to a whole spatial tensor, we create an architecture with
    properties well suited to the task of texture synthesis, which we call spatial
    GAN (SGAN). To our knowledge, this is the first successful completely
    data-driven texture synthesis method based on GANs.

    Our method has the following features which make it a state of the art
    algorithm for texture synthesis: high image quality of the generated textures,
    very high scalability w.r.t. the output texture size, fast real-time forward
    generation, the ability to fuse multiple diverse source images in complex
    textures. To illustrate these capabilities we present multiple experiments with
    different classes of texture images and use cases. We also discuss some
    limitations of our method with respect to the types of texture images it can
    synthesize, and compare it to other neural techniques for texture generation.

    Domain Adaptation by Mixture of Alignments of Second- or Higher-Order Scatter Tensors

    Piotr Koniusz, Yusuf Tas, Fatih Porikli
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose an approach to the domain adaptation, dubbed
    Second- or Higher-order Transfer of Knowledge (So-HoT), based on the mixture of
    alignments of second- or higher-order scatter statistics between the source and
    target domains. The human ability to learn from few labeled samples is a
    recurring motivation in the literature for domain adaptation. Towards this end,
    we investigate the supervised target scenario for which few labeled target
    training samples per category exist. Specifically, we utilize two CNN streams:
    the source and target networks fused at the classifier level. Features from the
    fully connected layers fc7 of each network are used to compute second- or even
    higher-order scatter tensors; one per network stream per class. As the source
    and target distributions are somewhat different despite being related, we align
    the scatters of the two network streams of the same class (within-class
    scatters) to a desired degree with our bespoke loss while maintaining good
    separation of the between-class scatters. We train the entire network in
    end-to-end fashion. We provide evaluations on the standard Office benchmark
    (visual domains), RGB-D combined with Caltech256 (depth-to-rgb transfer) and
    Pascal VOC2007 combined with the TU Berlin dataset (image-to-sketch transfer).
    We attain state-of-the-art results.

    Interferences in match kernels

    Naila Murray, Hervé Jégou, Florent Perronnin, Andrew Zisserman
    Comments: Accepted as regular paper in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We consider the design of an image representation that embeds and aggregates
    a set of local descriptors into a single vector. Popular representations of
    this kind include the bag-of-visual-words, the Fisher vector and the VLAD. When
    two such image representations are compared with the dot-product, the
    image-to-image similarity can be interpreted as a match kernel. In match
    kernels, one has to deal with interference, i.e. with the fact that even if two
    descriptors are unrelated, their matching score may contribute to the overall
    similarity.

    We formalise this problem and propose two related solutions, both aimed at
    equalising the individual contributions of the local descriptors in the final
    representation. These methods modify the aggregation stage by including a set
    of per-descriptor weights. They differ by the objective function that is
    optimised to compute those weights. The first is a “democratisation” strategy
    that aims at equalising the relative importance of each descriptor in the set
    comparison metric. The second one involves equalising the match of a single
    descriptor to the aggregated vector.

    These concurrent methods give a substantial performance boost over the state
    of the art in image search with short or mid-size vectors, as demonstrated by
    our experiments on standard public image retrieval benchmarks.

    Comparative study of histogram distance measures for re-identification

    Pedro A. Marín-Reyes, Javier Lorenzo-Navarro, Modesto Castrillón-Santana
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Color based re-identification methods usually rely on a distance function to
    measure the similarity between individuals. In this paper we study the behavior
    of several histogram distance measures in different color spaces. We wonder
    whether there is a particular histogram distance measure better than others,
    likewise also, if there is a color space that present better discrimination
    features. Several experiments are designed and evaluated in several images to
    obtain measures against various color spaces. We test in several image
    databases. A measure ranking is generated to calculate the area under the CMC,
    this area is the indicator used to evaluate which distance measure and color
    space present the best performance for the considered databases. Also, other
    parameters such as the image division in horizontal stripes and number of
    histogram bins, have been studied.

    Extraction of airway trees using multiple hypothesis tracking and template matching

    Raghavendra Selvan, Jens Petersen, Jesper H. Pedersen, Marleen de Bruijne
    Comments: 12 pages. Presented at the MICCAI Pulmonary Image Analysis Workshop, Athens, Greece, 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Knowledge of airway tree morphology has important clinical applications in
    diagnosis of chronic obstructive pulmonary disease. We present an automatic
    tree extraction method based on multiple hypothesis tracking and template
    matching for this purpose and evaluate its performance on chest CT images. The
    method is adapted from a semi-automatic method devised for vessel segmentation.
    Idealized tubular templates are constructed that match airway probability
    obtained from a trained classifier and ranked based on their relative
    significance. Several such regularly spaced templates form the local hypotheses
    used in constructing a multiple hypothesis tree, which is then traversed to
    reach decisions. The proposed modifications remove the need for local
    thresholding of hypotheses as decisions are made entirely based on statistical
    comparisons involving the hypothesis tree. The results show improvements in
    performance when compared to the original method and region growing on
    intensity images. We also compare the method with region growing on the
    probability images, where the presented method does not show substantial
    improvement, but we expect it to be less sensitive to local anomalies in the
    data.

    Automatically Building Face Datasets of New Domains from Weakly Labeled Data with Pretrained Models

    Shengyong Ding, Junyu Wu, Wei Xu, Hongyang Chao
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Training data are critical in face recognition systems. However, labeling a
    large scale face data for a particular domain is very tedious. In this paper,
    we propose a method to automatically and incrementally construct datasets from
    massive weakly labeled data of the target domain which are readily available on
    the Internet under the help of a pretrained face model. More specifically,
    given a large scale weakly labeled dataset in which each face image is
    associated with a label, i.e. the name of an identity, we create a graph for
    each identity with edges linking matched faces verified by the existing model
    under a tight threshold. Then we use the maximal subgraph as the cleaned data
    for that identity. With the cleaned dataset, we update the existing face model
    and use the new model to filter the original dataset to get a larger cleaned
    dataset. We collect a large weakly labeled dataset containing 530,560 Asian
    face images of 7,962 identities from the Internet, which will be published for
    the study of face recognition. By running the filtering process, we obtain a
    cleaned datasets (99.7+% purity) of size 223,767 (recall 70.9%). On our testing
    dataset of Asian faces, the model trained by the cleaned dataset achieves
    recognition rate 93.1%, which obviously outperforms the model trained by the
    public dataset CASIA whose recognition rate is 85.9%.

    Geometric deep learning: going beyond Euclidean data

    Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, Pierre Vandergheynst
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Many signal processing problems involve data whose underlying structure is
    non-Euclidean, but may be modeled as a manifold or (combinatorial) graph. For
    instance, in social networks, the characteristics of users can be modeled as
    signals on the vertices of the social graph. Sensor networks are graph models
    of distributed interconnected sensors, whose readings are modelled as
    time-dependent signals on the vertices. In genetics, gene expression data are
    modeled as signals defined on the regulatory network. In neuroscience, graph
    models are used to represent anatomical and functional structures of the brain.
    Modeling data given as points in a high-dimensional Euclidean space using
    nearest neighbor graphs is an increasingly popular trend in data science,
    allowing practitioners access to the intrinsic structure of the data. In
    computer graphics and vision, 3D objects are modeled as Riemannian manifolds
    (surfaces) endowed with properties such as color texture. Even more complex
    examples include networks of operators, e.g., functional correspondences or
    difference operators in a collection of 3D shapes, or orientations of
    overlapping cameras in multi-view vision (“structure from motion”) problems.
    The complexity of geometric data and the availability of very large datasets
    (in the case of social networks, on the scale of billions) suggest the use of
    machine learning techniques. In particular, deep learning has recently proven
    to be a powerful tool for problems with large datasets with underlying
    Euclidean structure. The purpose of this paper is to overview the problems
    arising in relation to geometric deep learning and present solutions existing
    today for this class of problems, as well as key difficulties and future
    research directions.

    Deep Joint Face Hallucination and Recognition

    Junyu Wu, Shengyong Ding, Wei Xu, Hongyang Chao
    Comments: 10 pages, 2 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep models have achieved impressive performance for face hallucination
    tasks. However, we observe that directly feeding the hallucinated facial images
    into recog- nition models can even degrade the recognition performance despite
    the much better visualization quality. In this paper, we address this problem
    by jointly learning a deep model for two tasks, i.e. face hallucination and
    recognition. In particular, we design an end-to-end deep convolution network
    with hallucination sub-network cascaded by recognition sub-network. The
    recognition sub- network are responsible for producing discriminative feature
    representations using the hallucinated images as inputs generated by
    hallucination sub-network. During training, we feed LR facial images into the
    network and optimize the parameters by minimizing two loss items, i.e. 1) face
    hallucination loss measured by the pixel wise difference between the ground
    truth HR images and network-generated images; and 2) verification loss which is
    measured by the classification error and intra-class distance. We extensively
    evaluate our method on LFW and YTF datasets. The experimental results show that
    our method can achieve recognition accuracy 97.95% on 4x down-sampled LFW
    testing set, outperforming the accuracy 96.35% of conventional face recognition
    model. And on the more challenging YTF dataset, we achieve recognition accuracy
    90.65%, a margin over the recognition accuracy 89.45% obtained by conventional
    face recognition model on the 4x down-sampled version.

    3D Fully Convolutional Network for Vehicle Detection in Point Cloud

    Bo Li
    Comments: Submitted to ICRA 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    2D fully convolutional network has been recently successfully applied to
    object detection from images. In this paper, we extend the fully convolutional
    network based detection techniques to 3D and apply it to point cloud data. The
    proposed approach is verified on the task of vehicle detection from lidar point
    cloud for autonomous driving. Experiments on the KITTI dataset shows a
    significant performance improvement over the previous point cloud based
    detection approaches.

    Recalling Holistic Information for Semantic Segmentation

    Hexiang Hu, Zhiwei Deng, Guang-tong Zhou, Fei Sha, Greg Mori
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Semantic segmentation requires a detailed labeling of image pixels by object
    category. Information derived from local image patches is necessary to describe
    the detailed shape of individual objects. However, this information is
    ambiguous and can result in noisy labels. Global inference of image content can
    instead capture the general semantic concepts present. We advocate that
    high-recall holistic inference of image concepts provides valuable information
    for detailed pixel labeling. We build a two-stream neural network architecture
    that facilitates information flow from holistic information to local pixels,
    while keeping common image features shared among the low-level layers of both
    the holistic analysis and segmentation branches. We empirically evaluate our
    network on four standard semantic segmentation datasets. Our network obtains
    state-of-the-art performance on PASCAL-Context and NYUDv2, and ablation studies
    verify its effectiveness on ADE20K and SIFT-Flow.

    Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

    Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh
    Comments: Submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a realtime approach for multi-person 2D pose estimation that
    predicts vector fields, which we refer to as Part Affinity Fields (PAFs), that
    directly expose the association between anatomical parts in an image. The
    architecture is designed to jointly learn part locations and their association,
    via two branches of the same sequential prediction process. The sequential
    prediction enables the part confidence maps and the association fields to
    encode global context, while allowing an efficient bottom-up parsing step that
    maintains tractable runtime complexity. Our method has set the state-of-the-art
    performance on the inaugural MSCOCO 2016 keypoints challenge, and significantly
    exceeds the previous state-of-the-art result on the MPII Multi-Person
    benchmark, both in performance and efficiency.

    Semantic Compositional Networks for Visual Captioning

    Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    A Semantic Compositional Network (SCN) is developed for image captioning, in
    which semantic concepts (i.e., tags) are detected from the image, and the
    probability of each tag is used to compose the parameters in a long short-term
    memory (LSTM) network. The SCN extends each weight matrix of the LSTM to an
    ensemble of tag-dependent weight matrices. The degree to which each member of
    the ensemble is used to generate an image caption is tied to the
    image-dependent probability of the corresponding tag. In addition to captioning
    images, we also extend the SCN to generate captions for video clips. We
    qualitatively analyze semantic composition in SCNs, and quantitatively evaluate
    the algorithm on three benchmark datasets: COCO, Flickr30k, and Youtube2Text.
    Experimental results show that the proposed method significantly outperforms
    prior state-of-the-art approaches, across multiple evaluation metrics.

    GuessWhat?! Visual object discovery through multi-modal dialogue

    Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville
    Comments: 23 pages
    Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    We introduce GuessWhat?!, a two-player guessing game as a testbed for
    research on the interplay of computer vision and dialogue systems. The goal of
    the game is to locate an unknown object in a rich image scene by asking a
    sequence of questions. Higher-level image understanding, like spatial reasoning
    and language grounding, is required to solve the proposed task. Our key
    contribution is the collection of a large-scale dataset consisting of 150K
    human-played games with a total of 800K visual question-answer pairs on 66K
    images. We explain our design decisions in collecting the dataset and introduce
    the oracle and questioner tasks that are associated with the two players of the
    game. We prototyped deep learning models to establish initial baselines of the
    introduced tasks.

    Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images

    Junhua Mao, Jiajing Xu, Yushi Jing, Alan Yuille
    Comments: Appears in NIPS 2016. The datasets introduced in this work will be gradually released on the project page
    Subjects: Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we focus on training and evaluating effective word embeddings
    with both text and visual information. More specifically, we introduce a
    large-scale dataset with 300 million sentences describing over 40 million
    images crawled and downloaded from publicly available Pins (i.e. an image with
    sentence descriptions uploaded by users) on Pinterest. This dataset is more
    than 200 times larger than MS COCO, the standard large-scale image dataset with
    sentence descriptions. In addition, we construct an evaluation dataset to
    directly assess the effectiveness of word embeddings in terms of finding
    semantically similar or related words and phrases. The word/phrase pairs in
    this evaluation dataset are collected from the click data with millions of
    users in an image search system, thus contain rich semantic relationships.
    Based on these datasets, we propose and compare several Recurrent Neural
    Networks (RNNs) based multimodal (text and image) models. Experiments show that
    our model benefits from incorporating the visual information into the word
    embeddings, and a weight sharing strategy is crucial for learning such
    multimodal embeddings. The project page is:
    this http URL

    Two-Level Structural Sparsity Regularization for Finding Lattice Locations and Defects in Noisy Image Data

    Xin Li, Alex Belianinov, Stephen Jesse, Chiwoo Park
    Comments: 26 Pages, 12 Figures, 2 Tables
    Subjects: Applications (stat.AP); Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a regularized regression model with two-level structural
    sparsity penalties and applies it for locating individual atoms in a noisy
    electron microscope image. For crystalline materials, the locations of atoms
    have spatial symmetries, forming a few regular lattice groups. Therefore, by
    simply estimating the underlying lattice groups seen in the image, one can
    locate most atoms in the image accurately. Identifying the few underlying
    lattice groups is formulated as a sparse group selection problem. On the other
    hand, some positions on the lattice groups can be vacant due to atomic defects,
    so simply finding the lattice groups may result in many false detections on the
    vacant positions. To minimize such false detections, the proposed model
    includes an individual sparsity regularization in addition to the group
    sparsity for a within-group selection, which results in a regularization
    regression model with two-level sparsities. We propose a modification of the
    group orthogonal matching pursuit (gOMP) algorithm with a thresholding step to
    solve the problem. The convergence analysis and statistical analysis of the
    proposed algorithm are presented. The proposed algorithm is also evaluated
    through numerical experiments with two simulated images and three real images.

    Robotic Grasp Detection using Deep Convolutional Neural Networks

    Sulabh Kumra, Christopher Kanan
    Comments: Submitted to CVPR 2017
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    Deep learning has significantly advanced computer vision and natural language
    processing. While there have been some successes in robotics using deep
    learning, deep learning has not been widely adopted. In this paper, we present
    a novel robotic grasp detection system that predicts the best grasping pose of
    a parallel-plate robotic gripper for novel objects using the RGB-D image of the
    scene. The proposed model uses a deep convolutional neural network to extract
    features from the scene and then uses a shallow convolutional neural network to
    predict the graspability of the object of interest for a specific position and
    orientation. Our multi-modal model achieved an accuracy of 88.96% and runs at
    real-time speeds. This redefines the state-ofthe- art for robotic grasp
    detection.


    Artificial Intelligence

    Bipolar Weighted Argumentation Graphs

    Till Mossakowski, Fabian Neuhaus
    Subjects: Artificial Intelligence (cs.AI)

    This paper discusses the semantics of weighted argumentation graphs that are
    biplor, i.e. contain both attacks and support graphs. The work builds on
    previous work by Amgoud, Ben-Naim et. al., which presents and compares several
    semantics for argumentation graphs that contain only supports or only attacks
    relationships, respectively.

    New Trends in Neutrosophic Theory and Applications

    Florentin Smarandache, Surapati Pramanik (Editors)
    Comments: 424 pages
    Journal-ref: Pons asbl, Brussels, 2016
    Subjects: Artificial Intelligence (cs.AI)

    Neutrosophic theory and applications have been expanding in all directions at
    an astonishing rate especially after the introduction the journal entitled
    Neutrosophic Sets and Systems. New theories, techniques, algorithms have been
    rapidly developed. One of the most striking trends in the neutrosophic theory
    is the hybridization of neutrosophic set with other potential sets such as
    rough set, bipolar set, soft set, hesitant fuzzy set, etc. The different hybrid
    structure such as rough neutrosophic set, single valued neutrosophic rough set,
    bipolar neutrosophic set, single valued neutrosophic hesitant fuzzy set, etc.
    are proposed in the literature in a short period of time. Neutrosophic set has
    been a very important tool in all various areas of data mining, decision
    making, e-learning, engineering, medicine, social science, and some more. The
    book New Trends in Neutrosophic Theories and Applications focuses on theories,
    methods, algorithms for decision making and also applications involving
    neutrosophic information. Some topics deal with data mining, decision making,
    e-learning, graph theory, medical diagnosis, probability theory, topology, and
    some more.

    An Analysis of Tournament Structure

    Nhien Pham Hoang Bao, Hiroyuki Iida
    Comments: 10 pages
    Subjects: Artificial Intelligence (cs.AI)

    This paper explores a novel way for analyzing the tournament structures to
    find a best suitable one for the tournament under consideration. It concerns
    about three aspects such as tournament conducting cost, competitiveness
    development and ranking precision. It then proposes a new method using progress
    tree to detect potential throwaway matches. The analysis performed using the
    proposed method reveals the strengths and weaknesses of tournament structures.
    As a conclusion, single elimination is best if we want to qualify one winner
    only, all matches conducted are exciting in term of competitiveness. Double
    elimination with proper seeding system is a better choice if we want to qualify
    more winners. A reasonable number of extra matches need to be conducted in
    exchange of being able to qualify top four winners. Round-robin gives reliable
    ranking precision for all participants. However, its conduction cost is very
    high, and it fails to maintain competitiveness development.

    GuessWhat?! Visual object discovery through multi-modal dialogue

    Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville
    Comments: 23 pages
    Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    We introduce GuessWhat?!, a two-player guessing game as a testbed for
    research on the interplay of computer vision and dialogue systems. The goal of
    the game is to locate an unknown object in a rich image scene by asking a
    sequence of questions. Higher-level image understanding, like spatial reasoning
    and language grounding, is required to solve the proposed task. Our key
    contribution is the collection of a large-scale dataset consisting of 150K
    human-played games with a total of 800K visual question-answer pairs on 66K
    images. We explain our design decisions in collecting the dataset and introduce
    the oracle and questioner tasks that are associated with the two players of the
    game. We prototyped deep learning models to establish initial baselines of the
    introduced tasks.

    Decision Support Systems in Fisheries and Aquaculture: A systematic review

    Bjørn Magnus Mathisen, Peter Haro, Bård Hanssen, Sara Björk, Ståle Walderhaug
    Subjects: Artificial Intelligence (cs.AI)

    Decision support systems help decision makers make better decisions in the
    face of complex decision problems (e.g. investment or policy decisions).
    Fisheries and Aquaculture is a domain where decision makers face such decisions
    since they involve factors from many different scientific fields. No systematic
    overview of literature describing decision support systems and their
    application in fisheries and aquaculture has been conducted. This paper
    summarizes scientific literature that describes decision support systems
    applied to the domain of Fisheries and Aquaculture. We use an established
    systematic mapping survey method to conduct our literature mapping. Our
    research questions are: What decision support systems for fisheries and
    aquaculture exists? What are the most investigated fishery and aquaculture
    decision support systems topics and how have these changed over time? Do any
    current DSS for fisheries provide real- time analytics? Do DSSes in Fisheries
    and Aquaculture build their models using machine learning done on captured and
    grounded data? The paper then detail how we employ the systematic mapping
    method in answering these questions. This results in 27 papers being identified
    as relevant and gives an exposition on the primary methods concluded in the
    study for designing a decision support system. We provide an analysis of the
    research done in the studies collected. We discovered that most literature does
    not consider multiple aspects for multiple stakeholders in their work. In
    addition we observed that little or no work has been done with real-time
    analysis in these decision support systems.

    The Off-Switch Game

    Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart Russell
    Subjects: Artificial Intelligence (cs.AI)

    It is clear that one of the primary tools we can use to mitigate the
    potential risk from a misbehaving AI system is the ability to turn the system
    off. As the capabilities of AI systems improve, it is important to ensure that
    such systems do not adopt subgoals that prevent a human from switching them
    off. This is a challenge because many formulations of rational agents create
    strong incentives for self-preservation. This is not caused by a built-in
    instinct, but because a rational agent will maximize expected utility and
    cannot achieve whatever objective it has been given if it is dead. Our goal is
    to study the incentives an agent has to allow itself to be switched off. We
    analyze a simple game between a human H and a robot R, where H can press R’s
    off switch but R can disable the off switch. A traditional agent takes its
    reward function for granted: we show that such agents have an incentive to
    disable the off switch, except in the special case where H is perfectly
    rational. Our key insight is that for R to want to preserve its off switch, it
    needs to be uncertain about the utility associated with the outcome, and to
    treat H’s actions as important observations about that utility. (R also has no
    incentive to switch itself off in this setting.) We conclude that giving
    machines an appropriate level of uncertainty about their objectives leads to
    safer designs, and we argue that this setting is a useful generalization of the
    classical AI paradigm of rational agents.

    Dynamic Key-Value Memory Network for Knowledge Tracing

    Jiani Zhang, Xingjian Shi, Irwin King, Dit-Yan Yeung
    Comments: Part of the paper is under review in WWW 2017
    Subjects: Artificial Intelligence (cs.AI)

    The goal of knowledge tracing is to model students’ mastering levels of
    underlying knowledge concepts, termed knowledge state, based on students’
    exercise performance data. However, existing methods, such as Bayesian
    Knowledge Tracing (BKT) or Deep Knowledge Tracing (DKT), either require costly
    human-labeled concept annotations or fail to exactly pinpoint which concepts a
    student is good at or unfamiliar with. To solve these problems, in this paper
    we introduce a new model called Dynamic Key-Value Memory Network (DKVMN) that
    can learn representations using nonlinear transformations and directly output a
    student’s mastering level of each concept. Unlike standard Memory-Augmented
    Neural Networks (MANNs) that facilitate a single memory matrix or two static
    memory matrices, our model has one static matrix called key that stores the
    knowledge concepts and the other dynamic matrix called value that stores and
    updates corresponding concepts’ mastery levels. Experiments show that our DKVMN
    model, which is trained end-to-end, consistently outperforms the
    state-of-the-art model on a range of knowledge tracing data-sets. We also
    illustrate that the learned DKVMN can automatically discover underlying
    concepts of the exercises which are typically performed by human annotations,
    and depict a student’s changing knowledge state.

    Double-quantitative (γ^{ast}-)fuzzy coverings approximation operators

    Guangming Lang
    Comments: It enriches the fuzzy covering rough set theory
    Subjects: Artificial Intelligence (cs.AI)

    In digital-based information boom, the fuzzy covering rough set model is an
    important mathematical tool for artificial intelligence, and how to build the
    bridge between the fuzzy covering rough set theory and Pawlak’s model is
    becoming a hot research topic. In this paper, we first present the
    (gamma-)fuzzy covering based probabilistic and grade approximation operators
    and double-quantitative approximation operators. We also study the
    relationships among the three types of (gamma-)fuzzy covering based
    approximation operators. Second, we propose the (gamma^{ast}-)fuzzy coverings
    based multi-granulation probabilistic and grade lower and upper approximation
    operators and multi-granulation double-quantitative lower and upper
    approximation operators. We also investigate the relationships among these
    types of (gamma-)fuzzy coverings based approximation operators. Finally, we
    employ several examples to illustrate how to construct the lower and upper
    approximations of fuzzy sets with the absolute and relative quantitative
    information.

    A Spatio-Temporal Representation for the Orienteering Problem with Time-Varying Profits

    Zhibei Ma, Kai Yin, Lantao Liu, Gaurav S. Sukhatme
    Subjects: Artificial Intelligence (cs.AI)

    We consider an orienteering problem (OP) where an agent needs to visit a
    series (possibly a subset) of depots, from which the maximal accumulated
    profits are desired within given limited time budget. Different from most
    existing works where the profits are assumed to be static, in this work we
    investigate a variant that has time-dependent profits. Specifically, the
    profits to be collected change over time and they follow different (e.g.,
    independent) time-varying functions. The problem is essentially NP-hard. To
    tackle the challenge, we present a simple and effective framework that
    incorporates time-variations into the fundamental planning process.
    Specifically, we propose a deterministic spatio-temporal representation where
    both spatial description and temporal logic are unified into one routing
    topology. By employing existing basic sorting and searching algorithms, the
    routing solutions can be computed in an extremely efficient way. The proposed
    method is easy to implement and extensive numerical results show that our
    approach is time efficient and generates near-optimal solutions.

    Local Discriminant Hyperalignment for multi-subject fMRI data alignment

    Muhammad Yousefnezhad, Daoqiang Zhang
    Comments: Published in the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Multivariate Pattern (MVP) classification can map different cognitive states
    to the brain tasks. One of the main challenges in MVP analysis is validating
    the generated results across subjects. However, analyzing multi-subject fMRI
    data requires accurate functional alignments between neuronal activities of
    different subjects, which can rapidly increase the performance and robustness
    of the final results. Hyperalignment (HA) is one of the most effective
    functional alignment methods, which can be mathematically formulated by the
    Canonical Correlation Analysis (CCA) methods. Since HA mostly uses the
    unsupervised CCA techniques, its solution may not be optimized for MVP
    analysis. By incorporating the idea of Local Discriminant Analysis (LDA) into
    CCA, this paper proposes Local Discriminant Hyperalignment (LDHA) as a novel
    supervised HA method, which can provide better functional alignment for MVP
    analysis. Indeed, the locality is defined based on the stimuli categories in
    the train-set, where the correlation between all stimuli in the same category
    will be maximized and the correlation between distinct categories of stimuli
    approaches to near zero. Experimental studies on multi-subject MVP analysis
    confirm that the LDHA method achieves superior performance to other
    state-of-the-art HA algorithms.

    Learning Python Code Suggestion with a Sparse Pointer Network

    Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Software Engineering (cs.SE)

    To enhance developer productivity, all modern integrated development
    environments (IDEs) include code suggestion functionality that proposes likely
    next tokens at the cursor. While current IDEs work well for statically-typed
    languages, their reliance on type annotations means that they do not provide
    the same level of support for dynamic programming languages as for
    statically-typed languages. Moreover, suggestion engines in modern IDEs do not
    propose expressions or multi-statement idiomatic code. Recent work has shown
    that language models can improve code suggestion systems by learning from
    software repositories. This paper introduces a neural language model with a
    sparse pointer network aimed at capturing very long-range dependencies. We
    release a large-scale code suggestion corpus of 41M lines of Python code
    crawled from GitHub. On this corpus, we found standard neural language models
    to perform well at suggesting local phenomena, but struggle to refer to
    identifiers that are introduced many tokens in the past. By augmenting a neural
    language model with a pointer network specialized in referring to predefined
    classes of identifiers, we obtain a much lower perplexity and a 5 percentage
    points increase in accuracy for code suggestion compared to an LSTM baseline.
    In fact, this increase in code suggestion accuracy is due to a 13 times more
    accurate prediction of identifiers. Furthermore, a qualitative analysis shows
    this model indeed captures interesting long-range dependencies, like referring
    to a class member defined over 60 tokens in the past.

    Quantum Enhanced Inference in Markov Logic Networks

    Peter Wittek, Christian Gogolin
    Comments: 8 pages, 1 figure
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Quantum Physics (quant-ph)

    Markov logic networks (MLNs) reconcile two opposing schools in machine
    learning and artificial intelligence: causal networks, which account for
    uncertainty extremely well, and first-order logic, which allows for formal
    deduction. An MLN is essentially a first-order logic template to generate
    Markov networks. Inference in MLNs is probabilistic and it is often performed
    by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling.
    An MLN has many regular, symmetric structures that can be exploited at both
    first-order level and in the generated Markov network. We analyze the graph
    structures that are produced by various lifting methods and investigate the
    extent to which quantum protocols can be used to speed up Gibbs sampling with
    state preparation and measurement schemes. We review different such approaches,
    discuss their advantages, theoretical limitations, and their appeal to
    implementations. We find that a straightforward application of a recent result
    yields exponential speedup compared to classical heuristics in approximate
    probabilistic inference, thereby demonstrating another example where advanced
    quantum resources can potentially prove useful in machine learning.

    Multiscale Inverse Reinforcement Learning using Diffusion Wavelets

    Jung-Su Ha, Han-Lim Choi
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    This work presents a multiscale framework to solve an inverse reinforcement
    learning (IRL) problem for continuous-time/state stochastic systems. We take
    advantage of a diffusion wavelet representation of the associated Markov chain
    to abstract the state space. This not only allows for effectively handling the
    large (and geometrically complex) decision space but also provides more
    interpretable representations of the demonstrated state trajectories and also
    of the resulting policy of IRL. In the proposed framework, the problem is
    divided into the global and local IRL, where the global approximation of the
    optimal value functions are obtained using coarse features and the local
    details are quantified using fine local features. An illustrative numerical
    example on robot path control in a complex environment is presented to verify
    the proposed method.


    Information Retrieval

    Question Retrieval for Community-based Question Answering via Heterogeneous Network Integration Learning

    Zheqian Chen, Chi Zhang, Zhou Zhao, Deng Cai
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Community based question answering platforms have attracted substantial users
    to share knowledge and learn from each other. As the rapid enlargement of CQA
    platforms, quantities of overlapped questions emerge, which makes users
    confounded to select a proper reference. It is urgent for us to take effective
    automated algorithms to reuse historical questions with corresponding answers.
    In this paper we focus on the problem with question retrieval, which aims to
    match historical questions that are relevant or semantically equivalent to
    resolve one s query directly. The challenges in this task are the lexical gaps
    between questions for the word ambiguity and word mismatch problem.
    Furthermore, limited words in queried sentences cause sparsity of word
    features. To alleviate these challenges, we propose a novel framework named
    HNIL which encodes not only the question contents but also the askers social
    interactions to enhance the question embedding performance. More specifically,
    we apply random walk based learning method with recurrent neural network to
    match the similarities between askers question and historical questions
    proposed by other users. Extensive experiments on a large scale dataset from a
    real world CQA site show that employing the heterogeneous social network
    information outperforms the other state of the art solutions in this task.

    User Personalized Satisfaction Prediction via Multiple Instance Deep Learning

    Zheqian Chen, Ben Gao, Huimin Zhang, Zhou Zhao, Deng Cai
    Comments: draft for www
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Community based question answering services have arisen as a popular
    knowledge sharing pattern for netizens. With abundant interactions among users,
    individuals are capable of obtaining satisfactory information. However, it is
    not effective for users to attain answers within minutes. Users have to check
    the progress over time until the satisfying answers submitted. We address this
    problem as a user personalized satisfaction prediction task. Existing methods
    usually exploit manual feature selection. It is not desirable as it requires
    careful design and is labor intensive. In this paper, we settle this issue by
    developing a new multiple instance deep learning framework. Specifically, in
    our settings, each question follows a weakly supervised learning multiple
    instance learning assumption, where its obtained answers can be regarded as
    instance sets and we define the question resolved with at least one
    satisfactory answer. We thus design an efficient framework exploiting multiple
    instance learning property with deep learning to model the question answer
    pairs. Extensive experiments on large scale datasets from Stack Exchange
    demonstrate the feasibility of our proposed framework in predicting askers
    personalized satisfaction. Our framework can be extended to numerous
    applications such as UI satisfaction Prediction, multi armed bandit problem,
    expert finding and so on.


    Computation and Language

    A Simple, Fast Diverse Decoding Algorithm for Neural Generation

    Jiwei Li, Will Monroe, Dan Jurafsky
    Subjects: Computation and Language (cs.CL)

    In this paper, we propose a simple, fast decoding algorithm that fosters
    diversity in neural generation. The algorithm modifies the standard beam search
    algorithm by adding an inter-sibling ranking penalty, favoring choosing
    hypotheses from diverse parents. We evaluate the proposed model on the tasks of
    dialogue response generation, abstractive summarization and machine
    translation. We find that diverse decoding helps across all tasks, especially
    those for which reranking is needed.

    We further propose a variation that is capable of automatically adjusting its
    diversity decoding rates for different inputs using reinforcement learning
    (RL). We observe a further performance boost from this RL technique. This paper
    includes material from the unpublished script “Mutual Information and Diverse
    Decoding Improve Neural Machine Translation” (Li and Jurafsky, 2016).

    Neural Machine Translation with Latent Semantic of Image and Text

    Joji Toyama, Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo
    Subjects: Computation and Language (cs.CL)

    Although attention-based Neural Machine Translation have achieved great
    success, attention-mechanism cannot capture the entire meaning of the source
    sentence because the attention mechanism generates a target word depending
    heavily on the relevant parts of the source sentence. The report of earlier
    studies has introduced a latent variable to capture the entire meaning of
    sentence and achieved improvement on attention-based Neural Machine
    Translation. We follow this approach and we believe that the capturing meaning
    of sentence benefits from image information because human beings understand the
    meaning of language not only from textual information but also from perceptual
    information such as that gained from vision. As described herein, we propose a
    neural machine translation model that introduces a continuous latent variable
    containing an underlying semantic extracted from texts and images. Our model,
    which can be trained end-to-end, requires image information only when training.
    Experiments conducted with an English–German translation task show that our
    model outperforms over the baseline.

    Kannada Spell Checker with Sandhi Splitter

    A N Akshatha, Chandana G Upadhyaya, Rajashekara S Murthy
    Comments: 7 pages, 10 figures
    Subjects: Computation and Language (cs.CL)

    Spelling errors are introduced in text either during typing, or when the user
    does not know the correct phoneme or grapheme. If a language contains complex
    words like sandhi where two or more morphemes join based on some rules, spell
    checking becomes very tedious. In such situations, having a spell checker with
    sandhi splitter which alerts the user by flagging the errors and providing
    suggestions is very useful. A novel algorithm of sandhi splitting is proposed
    in this paper. The sandhi splitter can split about 7000 most common sandhi
    words in Kannada language used as test samples. The sandhi splitter was
    integrated with a Kannada spell checker and a mechanism for generating
    suggestions was added. A comprehensive, platform independent, standalone spell
    checker with sandhi splitter application software was thus developed and tested
    extensively for its efficiency and correctness. A comparative analysis of this
    spell checker with sandhi splitter was made and results concluded that the
    Kannada spell checker with sandhi splitter has an improved performance. It is
    twice as fast, 200 times more space efficient, and it is 90% accurate in case
    of complex nouns and 50% accurate for complex verbs. Such a spell checker with
    sandhi splitter will be of foremost significance in machine translation
    systems, voice processing, etc. This is the first sandhi splitter in Kannada
    and the advantage of the novel algorithm is that, it can be extended to all
    Indian languages.

    Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling

    Zhe Gan, Chunyuan Li, Changyou Chen, Yunchen Pu, Qinliang Su, Lawrence Carin
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Recurrent neural networks (RNNs) have shown promising performance for
    language modeling. However, traditional training of RNNs using back-propagation
    through time often suffers from overfitting. One reason for this is that
    stochastic optimization (used for large training sets) does not provide good
    estimates of model uncertainty. This paper leverages recent advances in
    stochastic gradient Markov Chain Monte Carlo (also appropriate for large
    training sets) to learn weight uncertainty in RNNs. It yields a principled
    Bayesian learning algorithm, adding gradient noise during training (enhancing
    exploration of the model-parameter space) and model averaging when testing.
    Extensive experiments on various RNN models and across a broad range of
    applications demonstrate the superiority of the proposed approach over
    stochastic optimization.

    Bidirectional LSTM-CRF for Clinical Concept Extraction

    Raghavendra Chalapathy, Ehsan Zare Borzeshi, Massimo Piccardi
    Comments: This paper “Bidirectional LSTM-CRF for Clinical Concept Extraction” is accepted for short paper presentation at Clinical Natural Language Processing Workshop at COLING 2016 Osaka, Japan. December 11, 2016
    Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)

    Automated extraction of concepts from patient clinical records is an
    essential facilitator of clinical research. For this reason, the 2010 i2b2/VA
    Natural Language Processing Challenges for Clinical Records introduced a
    concept extraction task aimed at identifying and classifying concepts into
    predefined categories (i.e., treatments, tests and problems). State-of-the-art
    concept extraction approaches heavily rely on handcrafted features and
    domain-specific resources which are hard to collect and define. For this
    reason, this paper proposes an alternative, streamlined approach: a recurrent
    neural network (the bidirectional LSTM with CRF decoding) initialized with
    general-purpose, off-the-shelf word embeddings. The experimental results
    achieved on the 2010 i2b2/VA reference corpora using the proposed framework
    outperform all recent methods and ranks closely to the best submission from the
    original 2010 i2b2/VA challenge.

    Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images

    Junhua Mao, Jiajing Xu, Yushi Jing, Alan Yuille
    Comments: Appears in NIPS 2016. The datasets introduced in this work will be gradually released on the project page
    Subjects: Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we focus on training and evaluating effective word embeddings
    with both text and visual information. More specifically, we introduce a
    large-scale dataset with 300 million sentences describing over 40 million
    images crawled and downloaded from publicly available Pins (i.e. an image with
    sentence descriptions uploaded by users) on Pinterest. This dataset is more
    than 200 times larger than MS COCO, the standard large-scale image dataset with
    sentence descriptions. In addition, we construct an evaluation dataset to
    directly assess the effectiveness of word embeddings in terms of finding
    semantically similar or related words and phrases. The word/phrase pairs in
    this evaluation dataset are collected from the click data with millions of
    users in an image search system, thus contain rich semantic relationships.
    Based on these datasets, we propose and compare several Recurrent Neural
    Networks (RNNs) based multimodal (text and image) models. Experiments show that
    our model benefits from incorporating the visual information into the word
    embeddings, and a weight sharing strategy is crucial for learning such
    multimodal embeddings. The project page is:
    this http URL

    Learning Python Code Suggestion with a Sparse Pointer Network

    Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Software Engineering (cs.SE)

    To enhance developer productivity, all modern integrated development
    environments (IDEs) include code suggestion functionality that proposes likely
    next tokens at the cursor. While current IDEs work well for statically-typed
    languages, their reliance on type annotations means that they do not provide
    the same level of support for dynamic programming languages as for
    statically-typed languages. Moreover, suggestion engines in modern IDEs do not
    propose expressions or multi-statement idiomatic code. Recent work has shown
    that language models can improve code suggestion systems by learning from
    software repositories. This paper introduces a neural language model with a
    sparse pointer network aimed at capturing very long-range dependencies. We
    release a large-scale code suggestion corpus of 41M lines of Python code
    crawled from GitHub. On this corpus, we found standard neural language models
    to perform well at suggesting local phenomena, but struggle to refer to
    identifiers that are introduced many tokens in the past. By augmenting a neural
    language model with a pointer network specialized in referring to predefined
    classes of identifiers, we obtain a much lower perplexity and a 5 percentage
    points increase in accuracy for code suggestion compared to an LSTM baseline.
    In fact, this increase in code suggestion accuracy is due to a 13 times more
    accurate prediction of identifiers. Furthermore, a qualitative analysis shows
    this model indeed captures interesting long-range dependencies, like referring
    to a class member defined over 60 tokens in the past.

    Question Retrieval for Community-based Question Answering via Heterogeneous Network Integration Learning

    Zheqian Chen, Chi Zhang, Zhou Zhao, Deng Cai
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Community based question answering platforms have attracted substantial users
    to share knowledge and learn from each other. As the rapid enlargement of CQA
    platforms, quantities of overlapped questions emerge, which makes users
    confounded to select a proper reference. It is urgent for us to take effective
    automated algorithms to reuse historical questions with corresponding answers.
    In this paper we focus on the problem with question retrieval, which aims to
    match historical questions that are relevant or semantically equivalent to
    resolve one s query directly. The challenges in this task are the lexical gaps
    between questions for the word ambiguity and word mismatch problem.
    Furthermore, limited words in queried sentences cause sparsity of word
    features. To alleviate these challenges, we propose a novel framework named
    HNIL which encodes not only the question contents but also the askers social
    interactions to enhance the question embedding performance. More specifically,
    we apply random walk based learning method with recurrent neural network to
    match the similarities between askers question and historical questions
    proposed by other users. Extensive experiments on a large scale dataset from a
    real world CQA site show that employing the heterogeneous social network
    information outperforms the other state of the art solutions in this task.

    User Personalized Satisfaction Prediction via Multiple Instance Deep Learning

    Zheqian Chen, Ben Gao, Huimin Zhang, Zhou Zhao, Deng Cai
    Comments: draft for www
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Community based question answering services have arisen as a popular
    knowledge sharing pattern for netizens. With abundant interactions among users,
    individuals are capable of obtaining satisfactory information. However, it is
    not effective for users to attain answers within minutes. Users have to check
    the progress over time until the satisfying answers submitted. We address this
    problem as a user personalized satisfaction prediction task. Existing methods
    usually exploit manual feature selection. It is not desirable as it requires
    careful design and is labor intensive. In this paper, we settle this issue by
    developing a new multiple instance deep learning framework. Specifically, in
    our settings, each question follows a weakly supervised learning multiple
    instance learning assumption, where its obtained answers can be regarded as
    instance sets and we define the question resolved with at least one
    satisfactory answer. We thus design an efficient framework exploiting multiple
    instance learning property with deep learning to model the question answer
    pairs. Extensive experiments on large scale datasets from Stack Exchange
    demonstrate the feasibility of our proposed framework in predicting askers
    personalized satisfaction. Our framework can be extended to numerous
    applications such as UI satisfaction Prediction, multi armed bandit problem,
    expert finding and so on.

    Semantic Compositional Networks for Visual Captioning

    Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    A Semantic Compositional Network (SCN) is developed for image captioning, in
    which semantic concepts (i.e., tags) are detected from the image, and the
    probability of each tag is used to compose the parameters in a long short-term
    memory (LSTM) network. The SCN extends each weight matrix of the LSTM to an
    ensemble of tag-dependent weight matrices. The degree to which each member of
    the ensemble is used to generate an image caption is tied to the
    image-dependent probability of the corresponding tag. In addition to captioning
    images, we also extend the SCN to generate captions for video clips. We
    qualitatively analyze semantic composition in SCNs, and quantitatively evaluate
    the algorithm on three benchmark datasets: COCO, Flickr30k, and Youtube2Text.
    Experimental results show that the proposed method significantly outperforms
    prior state-of-the-art approaches, across multiple evaluation metrics.


    Distributed, Parallel, and Cluster Computing

    The Marriage of Incremental and Approximate Computing

    Dhanya R Krishnan
    Comments: this http URL
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Most data analytics systems that require low-latency execution and efficient
    utilization of computing resources, increasingly adopt two computational
    paradigms, namely, incremental and approximate computing. Incremental
    computation updates the output incrementally instead of re-computing everything
    from scratch for successive runs of a job with input changes. Approximate
    computation returns an approximate output for a job instead of the exact
    output.

    Both paradigms rely on computing over a subset of data items instead of
    computing over the entire dataset, but they differ in their means for skipping
    parts of the computation. Incremental computing relies on the memoization of
    intermediate results of sub-computations, and reusing these memoized results
    across jobs for sub-computations that are unaffected by the changed input.
    Approximate computing relies on representative sampling of the entire dataset
    to compute over a subset of data items.

    In this thesis, we make the observation that these two computing paradigms
    are complementary, and can be married together! The high level idea is to:
    design a sampling algorithm that biases the sample selection to the memoized
    data items from previous runs. To concretize this idea, we designed an online
    stratified sampling algorithm that uses self-adjusting computation to produce
    an incrementally updated approximate output with bounded error. We implemented
    our algorithm in a data analytics system called IncAppox based on Apache Spark
    Streaming. Our evaluation of the system shows that IncApprox achieves the
    benefits of both incremental and approximate computing.

    Investigating Low Level Protocols for Wireless Body Sensor Networks

    Nadine Boudargham, Jacques Bou Abdo, Jacques Demerjian, Christophe Guyeux, Abdallah Makhoul
    Comments: Accepted to AICCSA 2016
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    The rapid development of medical sensors has increased the interest in
    Wireless Body Area Network (WBAN) applications where physiological data from
    the human body and its environment is gathered, monitored, and analyzed to take
    the proper measures. In WBANs, it is essential to design MAC protocols that
    ensure adequate Quality of Service (QoS) such as low delay and high
    scalability. This paper investigates Medium Access Control (MAC) protocols used
    in WBAN, and compares their performance in a high traffic environment. Such
    scenario can be induced in case of emergency for example, where physiological
    data collected from all sensors on human body should be sent simultaneously to
    take appropriate action. This study can also be extended to cover collaborative
    WBAN systems where information from different bodies is sent simultaneously
    leading to high traffic. OPNET simulations are performed to compare the delay
    and scalability performance of the different MAC protocols under the same
    experimental conditions and to draw conclusions about the best protocol to be
    used in a high traffic environment.

    Asynchronous Distributed Automata: A Characterization of the Modal Mu-Fragment

    Fabian Reiter
    Comments: 9 pages
    Subjects: Formal Languages and Automata Theory (cs.FL); Distributed, Parallel, and Cluster Computing (cs.DC); Logic in Computer Science (cs.LO)

    We establish the equivalence on finite directed graphs between a class of
    asynchronous distributed automata and a small fragment of least fixpoint logic.
    More specifically, the logic we consider is (a variant of) the fragment of
    modal (mu)-calculus that allows least fixpoints but forbids greatest
    fixpoints. The corresponding automaton model uses a network of identical
    finite-state machines that communicate in an asynchronous manner and whose
    state diagram must be acyclic except for self-loops. Exploiting the connection
    with logic, we also prove that the expressive power of these machines is
    independent of whether or not messages can be lost.

    Search on a Line by Byzantine Robots

    Jurek Czyzowicz, Konstantinos Georgiou, Evangelos Kranakis, Danny Krizanc, Lata Narayanan, Jaroslav Opatrny, Sunil Shende
    Comments: 14 pages
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)

    We consider the problem of fault-tolerant parallel search on an infinite line
    by (n) robots. Starting from the origin, the robots are required to find a
    target at an unknown location. The robots can move with maximum speed (1) and
    can communicate in wireless mode among themselves. However, among the (n)
    robots, there are (f) robots that exhibit {em byzantine faults}. A faulty
    robot can fail to report the target even after reaching it, or it can make
    malicious claims about having found the target when in fact it has not. Given
    the presence of such faulty robots, the search for the target can only be
    concluded when the non-faulty robots have sufficient verification that the
    target has been found. We aim to design algorithms that minimize the value of
    (S_d(n,f)), the time to find a target at a distance (d) from the origin by (n)
    robots among which (f) are faulty. We give several different algorithms whose
    running time depends on the ratio (f/n), the density of faulty robots, and also
    prove lower bounds. Our algorithms are optimal for some densities of faulty
    robots.


    Learning

    An Overview on Data Representation Learning: From Traditional Feature Learning to Recent Deep Learning

    Guoqiang Zhong, Li-Na Wang, Junyu Dong
    Comments: About 20 pages. Submitted to Journal of Finance and Data Science as an invited paper
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Since about 100 years ago, to learn the intrinsic structure of data, many
    representation learning approaches have been proposed, including both linear
    ones and nonlinear ones, supervised ones and unsupervised ones. Particularly,
    deep architectures are widely applied for representation learning in recent
    years, and have delivered top results in many tasks, such as image
    classification, object detection and speech recognition. In this paper, we
    review the development of data representation learning methods. Specifically,
    we investigate both traditional feature learning algorithms and
    state-of-the-art deep learning models. The history of data representation
    learning is introduced, while available resources (e.g. online course, tutorial
    and book information) and toolboxes are provided. Finally, we conclude this
    paper with remarks and some interesting research directions on data
    representation learning.

    Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images

    Junhua Mao, Jiajing Xu, Yushi Jing, Alan Yuille
    Comments: Appears in NIPS 2016. The datasets introduced in this work will be gradually released on the project page
    Subjects: Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we focus on training and evaluating effective word embeddings
    with both text and visual information. More specifically, we introduce a
    large-scale dataset with 300 million sentences describing over 40 million
    images crawled and downloaded from publicly available Pins (i.e. an image with
    sentence descriptions uploaded by users) on Pinterest. This dataset is more
    than 200 times larger than MS COCO, the standard large-scale image dataset with
    sentence descriptions. In addition, we construct an evaluation dataset to
    directly assess the effectiveness of word embeddings in terms of finding
    semantically similar or related words and phrases. The word/phrase pairs in
    this evaluation dataset are collected from the click data with millions of
    users in an image search system, thus contain rich semantic relationships.
    Based on these datasets, we propose and compare several Recurrent Neural
    Networks (RNNs) based multimodal (text and image) models. Experiments show that
    our model benefits from incorporating the visual information into the word
    embeddings, and a weight sharing strategy is crucial for learning such
    multimodal embeddings. The project page is:
    this http URL

    On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems

    Besmira Nushi, Ece Kamar, Eric Horvitz, Donald Kossmann
    Comments: 11 pages, Thirty-First AAAI conference on Artificial Intelligence
    Subjects: Learning (cs.LG)

    We study the problem of troubleshooting machine learning systems that rely on
    analytical pipelines of distinct components. Understanding and fixing errors
    that arise in such integrative systems is difficult as failures can occur at
    multiple points in the execution workflow. Moreover, errors can propagate,
    become amplified or be suppressed, making blame assignment difficult. We
    propose a human-in-the-loop methodology which leverages human intellect for
    troubleshooting system failures. The approach simulates potential component
    fixes through human computation tasks and measures the expected improvements in
    the holistic behavior of the system. The method provides guidance to designers
    about how they can best improve the system. We demonstrate the effectiveness of
    the approach on an automated image captioning system that has been pressed into
    real-world use.

    Learning Fast Sparsifying Transforms

    Cristian Rusu, John Thompson
    Subjects: Learning (cs.LG)

    Given a dataset, the task of learning a transform that allows sparse
    representations of the data bears the name of dictionary learning. In many
    applications, these learned dictionaries represent the data much better than
    the static well-known transforms (Fourier, Hadamard etc.). The main downside of
    learned transforms is that they lack structure and therefore they are not
    computationally efficient, unlike their classical counterparts. This posses
    several difficulties especially when using power limited hardware such as
    mobile devices, therefore discouraging the application of sparsity techniques
    in such scenarios. In this paper we construct orthonormal and non-orthonormal
    dictionaries that are factorized as a product of a few basic transformations.
    In the orthonormal case, we solve exactly the dictionary update problem for one
    basic transformation, which can be viewed as a generalized Givens rotation, and
    then propose to construct orthonormal dictionaries that are a product of these
    transformations, guaranteeing their fast manipulation. We also propose a method
    to construct fast square but non-orthonormal dictionaries that are factorized
    as a product of few transforms that can be viewed as a further generalization
    of Givens rotations to the non-orthonormal setting. We show how the proposed
    transforms can balance very well data representation performance and
    computational complexity. We also compare with classical fast and learned
    general and orthonormal transforms.

    Fast Orthonormal Sparsifying Transforms Based on Householder Reflectors

    Cristian Rusu, Nuria Gonzalez-Prelcic, Robert Heath
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Dictionary learning is the task of determining a data-dependent transform
    that yields a sparse representation of some observed data. The dictionary
    learning problem is non-convex, and usually solved via computationally complex
    iterative algorithms. Furthermore, the resulting transforms obtained generally
    lack structure that permits their fast application to data. To address this
    issue, this paper develops a framework for learning orthonormal dictionaries
    which are built from products of a few Householder reflectors. Two algorithms
    are proposed to learn the reflector coefficients: one that considers a
    sequential update of the reflectors and one with a simultaneous update of all
    reflectors that imposes an additional internal orthogonal constraint. The
    proposed methods have low computational complexity and are shown to converge to
    local minimum points which can be described in terms of the spectral properties
    of the matrices involved. The resulting dictionaries balance between the
    computational complexity and the quality of the sparse representations by
    controlling the number of Householder reflectors in their product. Simulations
    of the proposed algorithms are shown in the image processing setting where
    well-known fast transforms are available for comparisons. The proposed
    algorithms have favorable reconstruction error and the advantage of a fast
    implementation relative to the classical, unstructured, dictionaries.

    Multiscale Inverse Reinforcement Learning using Diffusion Wavelets

    Jung-Su Ha, Han-Lim Choi
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    This work presents a multiscale framework to solve an inverse reinforcement
    learning (IRL) problem for continuous-time/state stochastic systems. We take
    advantage of a diffusion wavelet representation of the associated Markov chain
    to abstract the state space. This not only allows for effectively handling the
    large (and geometrically complex) decision space but also provides more
    interpretable representations of the demonstrated state trajectories and also
    of the resulting policy of IRL. In the proposed framework, the problem is
    divided into the global and local IRL, where the global approximation of the
    optimal value functions are obtained using coarse features and the local
    details are quantified using fine local features. An illustrative numerical
    example on robot path control in a complex environment is presented to verify
    the proposed method.

    EEGNet: A Compact Convolutional Network for EEG-based Brain-Computer Interfaces

    Vernon J. Lawhern, Amelia J. Solon, Nicholas R. Waytowich, Stephen M. Gordon, Chou P. Hung, Brent J. Lance
    Comments: Under review at IEEE Transactions on Biomedical Engineering
    Subjects: Learning (cs.LG); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)

    Objective: Brain-Computer Interface technologies (BCI) enable the direct
    communication between humans and computers by analyzing brain measurements,
    such as electroencephalography (EEG). These technologies have been applied to a
    variety of domains, including neuroprosthetic control and the monitoring of
    epileptic seizures. Existing BCI systems primarily use a priori knowledge of
    EEG features of interest to build machine learning models. Recently,
    convolutional networks have been used for automatic feature extraction of large
    image databases, where they have obtained state-of-the-art results. In this
    work we introduce EEGNet, a compact fully convolutional network for EEG-based
    BCIs developed using Deep Learning approaches. Methods: EEGNet is a 4-layer
    convolutional network that uses filter factorization for learning a compact
    representation of EEG time series. EEGNet is one of the smallest convolutional
    networks to date, having less than 2200 parameters for a binary classification.
    Results: We show state-of-the-art classification performance across four
    different BCI paradigms: P300 event-related potential, error-related
    negativity, movement-related cortical potential, and sensory motor rhythm, with
    as few as 500 EEG trials. We also show that adding more trials reduces the
    error variance of prediction rather than improving classification performance.
    Conclusion: We provide preliminary evidence suggesting that our model can be
    used with small EEG databases while improving upon the state-of-the-art
    performance across several tasks and across subjects. Significance: The EEGNet
    neural network architecture provides state-of-the-art performance across
    several tasks and across subjects, challenging the notion that large datasets
    are required to obtain optimal performance.

    Bottleneck Conditional Density Estimation

    Rui Shu, Hung H. Bui, Mohammad Ghavamzadeh
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We propose a neural network framework for high-dimensional conditional
    density estimation. The Bottleneck Conditional Density Estimator (BCDE) is a
    variant of the conditional variational autoencoder (CVAE) that employs layer(s)
    of stochastic variables as the bottleneck between the input x and target y,
    where both are high-dimensional. The key to effectively train BCDEs is the
    hybrid blending of the conditional generative model with a joint generative
    model that leverages unlabeled data to regularize the conditional model. We
    show that the BCDE significantly outperforms the CVAE in MNIST quadrant
    prediction benchmarks in the fully supervised case and establishes new
    benchmarks in the semi-supervised case.

    On the Exponentially Weighted Aggregate with the Laplace Prior

    Arnak S. Dalalyan, Edwin Grappin, Quentin Paris
    Comments: 30 pages, 2 figures
    Subjects: Statistics Theory (math.ST); Learning (cs.LG)

    In this paper, we study the statistical behaviour of the Exponentially
    Weighted Aggregate (EWA) in the problem of high-dimensional regression with
    fixed design. Under the assumption that the underlying regression vector is
    sparse, it is reasonable to use the Laplace distribution as a prior. The
    resulting estimator and, specifically, a particular instance of it referred to
    as the Bayesian lasso, was already used in the statistical literature because
    of its computational convenience, even though no thorough mathematical analysis
    of its statistical properties was carried out. The present work fills this gap
    by establishing sharp oracle inequalities for the EWA with the Laplace prior.
    These inequalities show that if the temperature parameter is small, the EWA
    with the Laplace prior satisfies the same type of oracle inequality as the
    lasso estimator does, as long as the quality of estimation is measured by the
    prediction loss. Extensions of the proposed methodology to the problem of
    prediction with low-rank matrices are considered.

    Distributed Optimization of Multi-Class SVMs

    Maximilian Alber, Julian Zimmert, Urun Dogan, Marius Kloft
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Training of one-vs.-rest SVMs can be parallelized over the number of classes
    in a straight forward way. Given enough computational resources, one-vs.-rest
    SVMs can thus be trained on data involving a large number of classes. The same
    cannot be stated, however, for the so-called all-in-one SVMs, which require
    solving a quadratic program of size quadratically in the number of classes. We
    develop distributed algorithms for two all-in-one SVM formulations (Lee et al.
    and Weston and Watkins) that parallelize the computation evenly over the number
    of classes. This allows us to compare these models to one-vs.-rest SVMs on
    unprecedented scale. The results indicate superior accuracy on text
    classification data.

    Bidirectional LSTM-CRF for Clinical Concept Extraction

    Raghavendra Chalapathy, Ehsan Zare Borzeshi, Massimo Piccardi
    Comments: This paper “Bidirectional LSTM-CRF for Clinical Concept Extraction” is accepted for short paper presentation at Clinical Natural Language Processing Workshop at COLING 2016 Osaka, Japan. December 11, 2016
    Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)

    Automated extraction of concepts from patient clinical records is an
    essential facilitator of clinical research. For this reason, the 2010 i2b2/VA
    Natural Language Processing Challenges for Clinical Records introduced a
    concept extraction task aimed at identifying and classifying concepts into
    predefined categories (i.e., treatments, tests and problems). State-of-the-art
    concept extraction approaches heavily rely on handcrafted features and
    domain-specific resources which are hard to collect and define. For this
    reason, this paper proposes an alternative, streamlined approach: a recurrent
    neural network (the bidirectional LSTM with CRF decoding) initialized with
    general-purpose, off-the-shelf word embeddings. The experimental results
    achieved on the 2010 i2b2/VA reference corpora using the proposed framework
    outperform all recent methods and ranks closely to the best submission from the
    original 2010 i2b2/VA challenge.

    A Unified Convex Surrogate for the Schatten-(p) Norm

    Chen Xu, Zhouchen Lin, Hongbin Zha
    Comments: The paper is accepted by AAAI-17. We show that multi-factor matrix factorization enjoys superiority over the traditional two-factor case
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Numerical Analysis (math.NA); Optimization and Control (math.OC)

    The Schatten-(p) norm ((0<p<1)) has been widely used to replace the nuclear
    norm for better approximating the rank function. However, existing methods are
    either 1) not scalable for large scale problems due to relying on singular
    value decomposition (SVD) in every iteration, or 2) specific to some (p)
    values, e.g., (1/2), and (2/3). In this paper, we show that for any (p), (p_1),
    and (p_2 >0) satisfying (1/p=1/p_1+1/p_2), there is an equivalence between the
    Schatten-(p) norm of one matrix and the Schatten-(p_1) and the Schatten-(p_2)
    norms of its two factor matrices. We further extend the equivalence to multiple
    factor matrices and show that all the factor norms can be convex and smooth for
    any (p>0). In contrast, the original Schatten-(p) norm for (0<p<1) is
    non-convex and non-smooth. As an example we conduct experiments on matrix
    completion. To utilize the convexity of the factor matrix norms, we adopt the
    accelerated proximal alternating linearized minimization algorithm and
    establish its sequence convergence. Experiments on both synthetic and real
    datasets exhibit its superior performance over the state-of-the-art methods.
    Its speed is also highly competitive.

    Local Discriminant Hyperalignment for multi-subject fMRI data alignment

    Muhammad Yousefnezhad, Daoqiang Zhang
    Comments: Published in the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Multivariate Pattern (MVP) classification can map different cognitive states
    to the brain tasks. One of the main challenges in MVP analysis is validating
    the generated results across subjects. However, analyzing multi-subject fMRI
    data requires accurate functional alignments between neuronal activities of
    different subjects, which can rapidly increase the performance and robustness
    of the final results. Hyperalignment (HA) is one of the most effective
    functional alignment methods, which can be mathematically formulated by the
    Canonical Correlation Analysis (CCA) methods. Since HA mostly uses the
    unsupervised CCA techniques, its solution may not be optimized for MVP
    analysis. By incorporating the idea of Local Discriminant Analysis (LDA) into
    CCA, this paper proposes Local Discriminant Hyperalignment (LDHA) as a novel
    supervised HA method, which can provide better functional alignment for MVP
    analysis. Indeed, the locality is defined based on the stimuli categories in
    the train-set, where the correlation between all stimuli in the same category
    will be maximized and the correlation between distinct categories of stimuli
    approaches to near zero. Experimental studies on multi-subject MVP analysis
    confirm that the LDHA method achieves superior performance to other
    state-of-the-art HA algorithms.

    Identifying Significant Predictive Bias in Classifiers

    Zhe Zhang, Daniel B. Neill
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We present a novel subset scan method to detect if a probabilistic binary
    classifier has statistically significant bias — over or under predicting the
    risk — for some subgroup, and identify the characteristics of this subgroup.
    This form of model checking and goodness-of-fit test provides a way to
    interpretably detect the presence of classifier bias and poor classifier fit,
    not just in one or two dimensions of features of a priori interest, but in the
    space of all possible feature subgroups. We use subset scan and parametric
    bootstrap methods to efficiently address the difficulty of assessing the
    exponentially many possible subgroups. We also suggest several useful
    extensions of this method for increasing interpretability of predictive models
    and prediction performance.

    Interpreting the Predictions of Complex ML Models by Layer-wise Relevance Propagation

    Wojciech Samek, Grégoire Montavon, Alexander Binder, Sebastian Lapuschkin, Klaus-Robert Müller
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Complex nonlinear models such as deep neural network (DNNs) have become an
    important tool for image classification, speech recognition, natural language
    processing, and many other fields of application. These models however lack
    transparency due to their complex nonlinear structure and to the complex data
    distributions to which they typically apply. As a result, it is difficult to
    fully characterize what makes these models reach a particular decision for a
    given input. This lack of transparency can be a drawback, especially in the
    context of sensitive applications such as medical analysis or security. In this
    short paper, we summarize a recent technique introduced by Bach et al. [1] that
    explains predictions by decomposing the classification decision of DNN models
    in terms of input variables.

    Survey of Expressivity in Deep Neural Networks

    Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We survey results on neural network expressivity described in “On the
    Expressive Power of Deep Neural Networks”. The paper motivates and develops
    three natural measures of expressiveness, which all display an exponential
    dependence on the depth of the network. In fact, all of these measures are
    related to a fourth quantity, trajectory length. This quantity grows
    exponentially in the depth of the network, and is responsible for the depth
    sensitivity observed. These results translate to consequences for networks
    during and after training. They suggest that parameters earlier in a network
    have greater influence on its expressive power — in particular, given a layer,
    its influence on expressivity is determined by the remaining depth of the
    network after that layer. This is verified with experiments on MNIST and
    CIFAR-10. We also explore the effect of training on the input-output map, and
    find that it trades off between the stability and expressivity.

    Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling

    Zhe Gan, Chunyuan Li, Changyou Chen, Yunchen Pu, Qinliang Su, Lawrence Carin
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Recurrent neural networks (RNNs) have shown promising performance for
    language modeling. However, traditional training of RNNs using back-propagation
    through time often suffers from overfitting. One reason for this is that
    stochastic optimization (used for large training sets) does not provide good
    estimates of model uncertainty. This paper leverages recent advances in
    stochastic gradient Markov Chain Monte Carlo (also appropriate for large
    training sets) to learn weight uncertainty in RNNs. It yields a principled
    Bayesian learning algorithm, adding gradient noise during training (enhancing
    exploration of the model-parameter space) and model averaging when testing.
    Extensive experiments on various RNN models and across a broad range of
    applications demonstrate the superiority of the proposed approach over
    stochastic optimization.

    Semantic Compositional Networks for Visual Captioning

    Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    A Semantic Compositional Network (SCN) is developed for image captioning, in
    which semantic concepts (i.e., tags) are detected from the image, and the
    probability of each tag is used to compose the parameters in a long short-term
    memory (LSTM) network. The SCN extends each weight matrix of the LSTM to an
    ensemble of tag-dependent weight matrices. The degree to which each member of
    the ensemble is used to generate an image caption is tied to the
    image-dependent probability of the corresponding tag. In addition to captioning
    images, we also extend the SCN to generate captions for video clips. We
    qualitatively analyze semantic composition in SCNs, and quantitatively evaluate
    the algorithm on three benchmark datasets: COCO, Flickr30k, and Youtube2Text.
    Experimental results show that the proposed method significantly outperforms
    prior state-of-the-art approaches, across multiple evaluation metrics.


    Information Theory

    A Bound on Rate of Codes with Locality with Sequential Recovery from Multiple Erasures

    S.B.Balaji, G.R.Kini, P. Vijay Kumar
    Comments: To be submitted to ISIT 2017
    Subjects: Information Theory (cs.IT)

    In this paper, we derive an upper bound on rate of a code with locality with
    sequential recovery from multiple erasures for any (r geq 3) and any (t>0) .
    We also give a construction of codes achieving our rate bound for any (r) and
    (t in 2mathbb{Z_{+}}).

    User Point Processes in Cellular Networks

    Martin Haenggi
    Comments: 8 pages, 4 figures
    Subjects: Information Theory (cs.IT); Discrete Mathematics (cs.DM); Networking and Internet Architecture (cs.NI); Probability (math.PR)

    The point process of concurrent users is critical for the analysis of
    cellular networks, in particular for the uplink and for full-duplex
    communication. We analyze the properties of two popular models. For the first
    one, we provide an accurate characterization of the pair correlation functions
    from the user and the base station point of view, which are applied to
    approximate the user process by Poisson and Ginibre point processes. For the
    second model, which includes the first model asymptotically, we study the cell
    vacancy probability, the mean area of vacant and occupied cells, the user-base
    station distance, and the pair correlation function in lightly and heavily
    loaded regimes.

    Reliability of k-out-of-n Data Storage System with Deterministic Parallel and Serial Repair

    Vaneet Aggarwal
    Subjects: Information Theory (cs.IT)

    In this paper, we find the Laplace Stieltjes transform of the probability of
    data loss for the k-out-of-n distributed storage system with deterministic
    repair times. We consider two repair models, namely the serial and parallel
    repair. We show that for failure rate much lower than the repair rate, mean
    time of data loss for the two models is the same unlike the case for
    exponential repair models.

    Demonstration of Fully Nonlinear Spectrum Modulated System in the Highly Nonlinear Optical Transmission Regime

    Vahid Aref, Son T. Le, Henning Buelow
    Comments: The paper was presented in European Conference on Optical Communication (ECOC) 2016, Sept. 2016
    Subjects: Information Theory (cs.IT)

    We report a 3 dB increase in the nonlinear threshold of a 64*0.5Gbaud 16-QAM
    continuous-nonlinear-spectrum modulated signal by nonlinear multiplexing with
    QPSK modulated multi-solitons, showing the first ever fully nonlinear-spectrum
    modulated system in the highly nonlinear regime.

    Communication for Omniscience

    Ni Ding, Chung Chan, Qiaoqiao Zhou, Rodney A. Kennedy, Parastoo Sadeghi
    Comments: 26 pages, 14 figures
    Subjects: Information Theory (cs.IT)

    This paper considers the communication for omniscience (CO) problem: A set of
    users observe a discrete memoryless multiple source and want to recover the
    entire multiple source via noise-free broadcast communications. We study the
    problem of how to attain omniscience with the minimum sum-rate, the total
    number of communications, and determine a corresponding optimal rate vector.
    The results cover both asymptotic and non-asymptotic models where the
    transmission rates are real and integral, respectively. Based on the concepts
    of submodularity and Dilworth truncation, we formulate a maximization problem.
    The maximum is the highest Slepian-Wolf constraint over all multi-way cuts of
    the user set, which determines the minimum sum-rate. For solving this
    maximization problem and searching for an optimal rate vector, we propose a
    modified decomposition algorithm (MDA) and a sum-rate increment algorithm (SIA)
    for asymptotic and non-asymptotic models, respectively, both of which complete
    in polynomial time. For solving the Dilworth truncation problem as the
    subroutine in both algorithms, we propose a fusion method to implement the
    existing coordinate saturation capacity (CoordSatCap) algorithm, where the
    submodular function minimization (SFM) is done over a merged user set. We show
    by experimental results that this fusion method contributes to a reduction in
    computation complexity as compared to the original CoordSatCap algorithm.

    Massive MIMO Pilot Retransmission Strategies for Robustification against Jamming

    Tan Tai Do, Hien Quoc Ngo, Trung Q. Duong, Tobias J. Oechtering, Mikael Skoglund
    Comments: Accepted for publication in the IEEE Wireless Communications Letters
    Subjects: Information Theory (cs.IT)

    This letter proposes anti-jamming strategies based on pilot retransmission
    for a single user uplink massive MIMO under jamming attack. A jammer is assumed
    to attack the system both in the training and data transmission phases. We
    first derive an achievable rate which enables us to analyze the effect of
    jamming attacks on the system performance. Counter-attack strategies are then
    proposed to mitigate this effect under two different scenarios: random and
    deterministic jamming attacks. Numerical results illustrate our analysis and
    benefit of the proposed schemes.

    Spatially Coupled LDPC Ensembles on Burst Erasure Channels

    Vahid Aref, Narayanan Rengaswamy, Laurent Schmalen
    Subjects: Information Theory (cs.IT)

    Spatially-Coupled LDPC (SC-LDPC) ensembles have gained significant interest
    since they were shown to universally achieve the capacity of binary memoryless
    channels under low-complexity belief-propagation decoding. In this work, we
    focus on the performance of these ensembles over binary erasure channels
    affected additionally by bursts. We assume that the burst can erase either a
    complete spatial position, modeling node failures in distributed storage, or
    can span across multiple spatial positions. We study the expected performance
    of random regular SC-LDPC ensembles on single-burst-erasure channels and
    provide tight lower bounds for the block erasure probability ((P_B)) at finite
    block length and bounds on the coupling parameter for being asymptotically able
    to recover the burst. Subsequently, we show the effect of introducing
    additional random erasures to these channels. Finally, we show that expurgation
    can bring substantial improvements in the block error rate by analyzing the
    minimal stopping sets of some expurgated code ensembles. Our study shows that
    for a fixed asymptotic code rate, the combination of increasing the variable
    node degree and expurgating the ensemble can improve the block error
    probability by several orders of magnitude. All the results are verified using
    Monte-Carlo simulations.

    Neyman-Pearson Test for Zero-Rate Multiterminal Hypothesis Testing

    Shun Watanabe
    Comments: 30 pages, 8 figures
    Subjects: Information Theory (cs.IT)

    The problem of zero-rate multiterminal hypothesis testing is revisited. A
    Neyman-Pearson-like test is proposed and its non-asymptotic performance is
    clarified; for short blocklength, it is numerically examined that the proposed
    test is superior to a previously known Hoeffding-like test proposed by
    Han-Kobayashi. For the large deviation regime, it is shown that our proposed
    test achieves the optimal trade-off between the type I and type II exponents
    shown by Han-Kobayashi. Among the class of symmetric (type based) testing
    schemes, when the type I error probability is non-vanishing, the proposed test
    is optimal up to the second-order term of the type II error exponent; the
    second-order term is characterized in terms of the variance of the projected
    relative entropy density. The information geometry method plays an important
    role in the analysis as well as the construction of the test.

    Optimal Codes from Fibonacci Polynomials and Secret Sharing Schemes

    Mehmet E. Koroglu, Ibrahim Ozbek, Irfan Siap
    Subjects: Information Theory (cs.IT)

    In this work, we study cyclic codes that have generators as Fibonacci
    polynomials over finite fields. We show that these cyclic codes in most cases
    produce families of maximum distance separable and optimal codes with
    interesting properties. We explore these relations and present some examples.
    Also, we present applications of these codes to secret sharing schemes.

    Heterogeneous Cellular Networks with Spatio-Temporal Traffic: Delay Analysis and Scheduling

    Yi Zhong, Tony Q.S. Quek, Xiaohu Ge
    Comments: 30 pages, 8 figures
    Subjects: Information Theory (cs.IT)

    Emergence of new types of services has led to various traffic and diverse
    delay requirements in fifth generation (5G) wireless networks. Meeting diverse
    delay requirements is one of the most critical goals for the design of 5G
    wireless networks. Though the delay of point-to-point communications has been
    well investigated, the delay of multi-point to multi-point communications has
    not been thoroughly studied since it is a complicated function of all links in
    the network. In this work, we propose a novel tractable approach to analyze the
    delay in the heterogenous cellular networks with spatio-temporal random arrival
    of traffic. Specifically, we propose the notion of delay outage and evaluated
    the effect of different scheduling policies on the delay performance. Our
    numerical analysis reveals that offloading policy based on cell range expansion
    greatly reduces the macrocell traffic while bringing a small amount of growth
    for the picocell traffic. Our results also show that the delay performance of
    round-robin scheduling outperforms first in first out scheduling for heavy
    traffic, and it is reversed for light traffic. In summary, this analytical
    framework provides an understanding and a rule-of-thumb for the practical
    deployment of 5G systems where delay requirement is increasingly becoming a key
    concern.

    Joint Throughput Maximization and Power Control in Poisson CoopMAC Networks

    Masoumeh Sadeghi, Homa Nikbakht, Amir Masoud Rabiei, Vahid Shah-Mansouri
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    A cooperative medium access control (CoopMAC) network with randomly
    distributed helpers is considered. We introduce a new helper selection scheme
    which maximizes the average throughput while maintaining a low power
    consumption profile in the network. To this end, all transmissions are assumed
    to be performed using power control. We assume that each node can estimate the
    channel between itself and a receiving node. Then, it evaluates the minimum
    transmission power needed to achieve the desired signal to noise ratio (SNR).
    If the required transmission power is less than the maximum transmission power
    of the node, the communication is regarded as successful. Otherwise, the
    transmission is canceled. %Also, we classify the helpers into six groups
    according to their transmission rates IEEE 802.11b Standard. In order to
    increase the average throughput, we assume that the cooperative link with the
    highest transmission rate is chosen from those that can successfully forward
    the source signal to destination. Also, when there are several links with the
    same rates, the one with minimum required power is given the highest priority.
    Assuming that the helpers are distributed as a Poisson point process with fixed
    intensity, we derive exact expressions for the average throughput and the power
    consumption of the network. Simulation results show that our scheme is able to
    significantly increase the throughput in comparison to the conventional CoopMAC
    network. It is also able to reduce the power consumption compared to a network
    with no power control approach.

    Investigating Low Level Protocols for Wireless Body Sensor Networks

    Nadine Boudargham, Jacques Bou Abdo, Jacques Demerjian, Christophe Guyeux, Abdallah Makhoul
    Comments: Accepted to AICCSA 2016
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    The rapid development of medical sensors has increased the interest in
    Wireless Body Area Network (WBAN) applications where physiological data from
    the human body and its environment is gathered, monitored, and analyzed to take
    the proper measures. In WBANs, it is essential to design MAC protocols that
    ensure adequate Quality of Service (QoS) such as low delay and high
    scalability. This paper investigates Medium Access Control (MAC) protocols used
    in WBAN, and compares their performance in a high traffic environment. Such
    scenario can be induced in case of emergency for example, where physiological
    data collected from all sensors on human body should be sent simultaneously to
    take appropriate action. This study can also be extended to cover collaborative
    WBAN systems where information from different bodies is sent simultaneously
    leading to high traffic. OPNET simulations are performed to compare the delay
    and scalability performance of the different MAC protocols under the same
    experimental conditions and to draw conclusions about the best protocol to be
    used in a high traffic environment.

    Interference alignment for downlink cellular networks: Joint scheduling and precoding

    Yasser Fadlallah (SOCRATE), Paul Ferrand, Leonardo Cardoso (SOCRATE), Jean-Marie Gorce (SOCRATE)
    Journal-ref: IEEE 17th International Workshop onSignal Processing Advances in
    Wireless Communications (SPAWC), Jul 2016, Edinburgh, United Kingdom. pp.1 –
    5, 2016
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    Interference Alignment (IA) is technique that, in a large sense, makes use of
    the increasing signal dimensions available in the system through MIMO and OFDM
    technologies in order to globally reduce the interference suffered by users in
    a network. In this paper, we address the problem of downlink cellular networks,
    the so-called interfering broadcast channels, where mobile users at cell edges
    may suffer from high interference and thus, poor performance. Starting from the
    downlink IA scheme proposed by Suh et al., a new approach is proposed where
    each user feeds back multiple selected received signal directions with high
    signal-to-interference gain. A exhaustive search based scheduler selects a
    subset of users to be served simultaneously, balancing between sum-rate
    performance and fairness, but becomes untractable in dense network scenarios
    where many users send simultaneous requests. Therefore, we develop a
    sub-optimal scheduler that greatly decreases the complexity while preserving a
    near-optimal data rate gain. More interestingly, our simulations show that the
    IA scheme becomes valuable only in correlated channels, whereas the matched
    filtering based scheme performs the best in the uncorrelated scenarios.




沪ICP备19023445号-2号
友情链接