
    arXiv Paper Daily: Wed, 12 Apr 2017

    我爱机器学习 (52ml.net), published 2017-04-12 00:00:00

    Neural and Evolutionary Computing

    Automated Curriculum Learning for Neural Networks

    Alex Graves, Marc G. Bellemare, Jacob Menick, Remi Munos, Koray Kavukcuoglu
    Subjects: Neural and Evolutionary Computing (cs.NE)

    We introduce a method for automatically selecting the path, or syllabus, that
    a neural network follows through a curriculum so as to maximise learning
    efficiency. A measure of the amount that the network learns from each data
    sample is provided as a reward signal to a nonstationary multi-armed bandit
    algorithm, which then determines a stochastic syllabus. We consider a range of
    signals derived from two distinct indicators of learning progress: rate of
    increase in prediction accuracy, and rate of increase in network complexity.
    Experimental results for LSTM networks on three curricula demonstrate that our
    approach can significantly accelerate learning, in some cases halving the time
    required to attain a satisfactory performance level.
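
    For readers who want a concrete picture of the bandit component, the sketch
    below implements a simplified Exp3-style nonstationary bandit over curriculum
    tasks, with learning progress (e.g., the drop in training loss after a step on
    the sampled task) as the reward. The class and hyperparameter names are
    illustrative assumptions, not the paper's.

    import numpy as np

    # A minimal sketch, assuming the reward is a rescaled learning-progress
    # signal in roughly [-1, 1]. Exp3-style exponential weights with uniform
    # mixing keep every task explorable under nonstationary rewards.
    class CurriculumBandit:
        def __init__(self, n_tasks, eta=0.1, eps=0.05):
            self.w = np.zeros(n_tasks)   # log-weights per task
            self.eta, self.eps, self.n = eta, eps, n_tasks

        def policy(self):
            p = np.exp(self.w - self.w.max())
            p /= p.sum()
            return (1 - self.eps) * p + self.eps / self.n

        def sample(self):
            return np.random.choice(self.n, p=self.policy())

        def update(self, task, reward):
            # Importance-weighted update for the sampled arm only.
            self.w[task] += self.eta * reward / self.policy()[task]

    In a training loop one would sample a task, train on a batch from it, measure
    the progress signal, and call update; the bandit's policy is the stochastic
    syllabus.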

    WRPN: Training and Inference using Wide Reduced-Precision Networks

    Asit Mishra, Jeffrey J Cook, Eriko Nurvitadhi, Debbie Marr
    Comments: Under submission to CVPR Workshop
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    For computer vision applications, prior works have shown the efficacy of
    reducing the numeric precision of model parameters (network weights) in deep
    neural networks but also that reducing the precision of activations hurts model
    accuracy much more than reducing the precision of model parameters. We study
    schemes to train networks from scratch using reduced-precision activations
    without hurting the model accuracy. We reduce the precision of activation maps
    (along with model parameters) using a novel quantization scheme and increase
    the number of filter maps in a layer, and find that this scheme matches or
    surpasses the accuracy of the baseline full-precision network. As a result, one
    can significantly reduce the dynamic memory footprint, memory bandwidth and
    computational energy, and speed up training and inference with appropriate
    hardware support. We call our scheme WRPN – wide reduced-precision
    networks. We report results using our proposed schemes and show that they
    improve on previously reported accuracies on the ILSVRC-12 dataset while being
    computationally less expensive than previously reported reduced-precision
    networks.
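
    As a rough illustration of the recipe (quantize activations, widen layers),
    here is a minimal PyTorch sketch. The uniform quantizer with a straight-through
    gradient and the specific bit-width/widening factor are assumptions for
    illustration; the paper's exact quantization scheme may differ.

    import torch
    import torch.nn as nn

    class ActQuant(nn.Module):
        """k-bit uniform quantization of bounded activations."""
        def __init__(self, bits=4):
            super().__init__()
            self.levels = 2 ** bits - 1

        def forward(self, x):
            x = x.clamp(0, 1)                      # bounded activation
            q = torch.round(x * self.levels) / self.levels
            return x + (q - x).detach()            # straight-through gradient

    def widened_conv(in_ch, out_ch, widen=2, bits=4):
        # Widening the filter maps compensates for the accuracy lost to
        # reduced-precision activations.
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch * widen, kernel_size=3, padding=1),
            ActQuant(bits),
        )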

    Stochastic Neural Networks for Hierarchical Reinforcement Learning

    Carlos Florensa, Yan Duan, Pieter Abbeel
    Comments: Published as a conference paper at ICLR 2017
    Journal-ref: International Conference on Learning Representations 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)

    Deep reinforcement learning has achieved many impressive results in recent
    years. However, tasks with sparse rewards or long horizons continue to pose
    significant challenges. To tackle these important problems, we propose a
    general framework that first learns useful skills in a pre-training
    environment, and then leverages the acquired skills for learning faster in
    downstream tasks. Our approach brings together some of the strengths of
    intrinsic motivation and hierarchical methods: the learning of useful skills is
    guided by a single proxy reward, the design of which requires very minimal
    domain knowledge about the downstream tasks. A high-level policy is then
    trained on top of these skills, significantly improving exploration and
    allowing the agent to tackle sparse rewards in the downstream tasks. To
    efficiently pre-train a large span of skills, we use Stochastic Neural Networks
    combined with an information-theoretic regularizer. Our experiments show that
    this combination is effective in learning a wide span of interpretable skills
    in a sample-efficient way, and can significantly boost the learning performance
    uniformly across a wide range of downstream tasks.


    Computer Vision and Pattern Recognition

    Forecasting Human Dynamics from Static Images

    Yu-Wei Chao, Jimei Yang, Brian Price, Scott Cohen, Jia Deng
    Comments: Accepted in CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents the first study on forecasting human dynamics from static
    images. The problem is to input a single RGB image and generate a sequence of
    upcoming human body poses in 3D. To address the problem, we propose the 3D Pose
    Forecasting Network (3D-PFNet). Our 3D-PFNet integrates recent advances on
    single-image human pose estimation and sequence prediction, and converts the 2D
    predictions into 3D space. We train our 3D-PFNet using a three-step training
    strategy to leverage diverse sources of training data, including image- and
    video-based human pose datasets and 3D motion capture (MoCap) data. We
    demonstrate competitive performance of our 3D-PFNet on 2D pose forecasting and
    3D pose recovery through quantitative and qualitative results.

    A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

    Xiaolong Wang, Abhinav Shrivastava, Abhinav Gupta
    Comments: CVPR 2017 Camera Ready
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    How do we learn an object detector that is invariant to occlusions and
    deformations? Our current solution is to use a data-driven strategy — collect
    large-scale datasets which have object instances under different conditions.
    The hope is that the final classifier can use these examples to learn
    invariances. But is it really possible to see all the occlusions in a dataset?
    We argue that, like categories, occlusions and object deformations also follow
    a long-tailed distribution. Some occlusions and deformations are so rare that
    they hardly ever occur; yet we want to learn a model invariant to them. In this
    paper, we propose an alternative solution. We propose to learn an adversarial
    network that generates examples with occlusions and deformations. The goal of
    the adversary is to generate examples that are difficult for the object
    detector to classify. In our framework, both the original detector and the
    adversary are learned jointly. Our experimental results indicate a 2.3% mAP
    boost on VOC07 and a 2.6% mAP boost on the VOC2012 object detection challenge
    compared to the Fast-RCNN pipeline. We also release the code for this paper.
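
    To make the adversarial idea concrete, the sketch below shows one plausible
    form of the occlusion adversary: a small network that predicts a drop mask over
    the detector's feature map. The architecture and shapes are illustrative
    assumptions; since hard masking is not differentiable, the actual adversary
    must be trained with a surrogate objective (e.g., imitating masks that
    empirically raise the detector loss).

    import torch
    import torch.nn as nn

    class MaskAdversary(nn.Module):
        """Predicts per-location occlusion probabilities on detector features."""
        def __init__(self, channels):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels, channels // 2, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels // 2, 1, kernel_size=1), nn.Sigmoid(),
            )

        def forward(self, feats):
            drop_prob = self.net(feats)                        # (N, 1, H, W)
            mask = (torch.rand_like(drop_prob) > drop_prob).float()
            return feats * mask                                # occluded features

    # Joint training alternates: the detector minimizes its loss on masked
    # features, while the adversary is updated to produce masks that raise it.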

    Deep Learning for Multi-Task Medical Image Segmentation in Multiple Modalities

    Pim Moeskops, Jelmer M. Wolterink, Bas H.M. van der Velden, Kenneth G.A. Gilhuijs, Tim Leiner, Max A. Viergever, Ivana Išgum
    Journal-ref: Moeskops, P., Wolterink, J.M., van der Velden, B.H.M., Gilhuijs,
    K.G.A., Leiner, T., Viergever, M.A., Išgum, I. Deep learning for
    multi-task medical image segmentation in multiple modalities. In: MICCAI
    2016, pp. 478-486
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Automatic segmentation of medical images is an important task for many
    clinical applications. In practice, a wide range of anatomical structures are
    visualised using different imaging modalities. In this paper, we investigate
    whether a single convolutional neural network (CNN) can be trained to perform
    different segmentation tasks.

    A single CNN is trained to segment six tissues in MR brain images, the
    pectoral muscle in MR breast images, and the coronary arteries in cardiac CTA.
    The CNN therefore learns to identify the imaging modality, the visualised
    anatomical structures, and the tissue classes.

    For each of the three tasks (brain MRI, breast MRI and cardiac CTA), this
    combined training procedure resulted in a segmentation performance equivalent
    to that of a CNN trained specifically for that task, demonstrating the high
    capacity of CNN architectures. Hence, a single system could be used in clinical
    practice to automatically perform diverse segmentation tasks without
    task-specific training.

    Reconstruction of 3-D Rigid Smooth Curves Moving Free when Two Traceable Points Only are Available

    Mieczysław A. Kłopotek
    Journal-ref: Preliminary version of the paper M.A. Kłopotek: Reconstruction
    of 3-D rigid smooth curves moving free when two traceable points only are
    available. Machine Graphics & Vision 1(1992)1-2, pp. 392-405
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    This paper extends previous research in the sense that, for orthogonal
    projections of rigid smooth (truly 3-D) curves moving totally freely, it
    reduces the number of required traceable points to only two (the best results
    known so far to the author are three points for free motion, two for motion
    restricted to rotation around a fixed direction, and two for motion restricted
    to the influence of a homogeneous force field). The method exploits information
    from tangential projections. The paper also discusses the possibility of
    simplifying the reconstruction of flat curves moving freely under perspective
    projections.

    Quality Aware Network for Set to Set Recognition

    Yu Liu, Junjie Yan, Wanli Ouyang
    Comments: Accepted at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    This paper targets the problem of set-to-set recognition, which learns the
    metric between two image sets. Images in each set belong to the same identity.
    Since images in a set can be complementary, they hopefully lead to higher
    accuracy in practical applications. However, the quality of each sample cannot
    be guaranteed, and samples of poor quality will hurt the metric. In this
    paper, the quality aware network (QAN) is proposed to confront this problem,
    where the quality of each sample can be automatically learned even though such
    information is not explicitly provided in the training stage. The network has
    two branches: the first branch extracts an appearance feature embedding for
    each sample and the other branch predicts a quality score for each sample.
    Features and quality scores of all samples in a set are then aggregated to
    generate the final feature embedding. We show that the two branches can be
    trained in an end-to-end manner given only the set-level identity annotation.
    Analysis of the gradient spread of this mechanism indicates that the quality
    learned by the network is beneficial to set-to-set recognition and simplifies
    the distribution that the network needs to fit. Experiments on both face
    verification and person re-identification show advantages of the proposed QAN.
    The source code and network structure can be downloaded at
    this https URL
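
    The core aggregation step is easy to state in code. Below is a minimal sketch
    of quality-weighted set pooling under assumed shapes (a shared backbone that
    returns one feature vector per image); the layer sizes are illustrative.

    import torch
    import torch.nn as nn

    class QualityAwareAggregation(nn.Module):
        def __init__(self, backbone, feat_dim, emb_dim=256):
            super().__init__()
            self.backbone = backbone                    # shared feature extractor
            self.embed = nn.Linear(feat_dim, emb_dim)   # appearance branch
            self.quality = nn.Linear(feat_dim, 1)       # quality branch

        def forward(self, images):                 # images: (set_size, C, H, W)
            feats = self.backbone(images)          # (set_size, feat_dim)
            emb = self.embed(feats)                # per-sample embeddings
            q = torch.sigmoid(self.quality(feats)) # per-sample quality in (0, 1)
            w = q / q.sum(dim=0, keepdim=True)     # normalize over the set
            return (w * emb).sum(dim=0)            # set-level embedding

    Both branches receive gradients from a set-level identity loss, which is how
    the quality scores can be learned without explicit quality labels.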

    Interpretable Explanations of Black Boxes by Meaningful Perturbation

    Ruth Fong, Andrea Vedaldi
    Comments: 9 pages, 10 figures, submitted to ICCV 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    As machine learning algorithms are increasingly applied to high impact yet
    high risk tasks, e.g. problems in health, it is critical that researchers can
    explain how such algorithms arrived at their predictions. In recent years, a
    number of image saliency methods have been developed to summarize where highly
    complex neural networks “look” in an image for evidence for their predictions.
    However, these techniques are limited by their heuristic nature and
    architectural constraints.

    In this paper, we make two main contributions: First, we propose a general
    framework for learning different kinds of explanations for any black box
    algorithm. Second, we introduce a paradigm that learns the minimally salient
    part of an image by directly editing it and learning from the corresponding
    changes to its output. Unlike previous works, our method is model-agnostic and
    testable because it is grounded in replicable image perturbations.
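
    A minimal sketch of the perturbation paradigm, under assumed details
    (average pooling as the blur, an L1 mask regularizer, illustrative step counts
    and weights):

    import torch
    import torch.nn.functional as F

    def explain(model, image, target_class, steps=300, lam=1e-2):
        # Blurred copy serves as the "deleted" reference content.
        blurred = F.avg_pool2d(image, 11, stride=1, padding=5)
        mask = torch.zeros(1, 1, *image.shape[-2:], requires_grad=True)
        opt = torch.optim.Adam([mask], lr=0.1)
        for _ in range(steps):
            m = torch.sigmoid(mask)
            perturbed = image * (1 - m) + blurred * m       # delete where m ~ 1
            score = F.softmax(model(perturbed), dim=1)[0, target_class]
            loss = score + lam * m.abs().mean()             # small, salient mask
            opt.zero_grad()
            loss.backward()
            opt.step()
        return torch.sigmoid(mask).detach()                 # saliency mask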

    Automatic segmentation of MR brain images with a convolutional neural network

    Pim Moeskops, Max A. Viergever, Adriënne M. Mendrik, Linda S. de Vries, Manon J.N.L. Benders, Ivana Išgum
    Journal-ref: IEEE Transactions on Medical Imaging, 35(5), 1252-1261 (2016)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Automatic segmentation in MR brain images is important for quantitative
    analysis in large-scale studies with images acquired at all ages.

    This paper presents a method for the automatic segmentation of MR brain
    images into a number of tissue classes using a convolutional neural network. To
    ensure that the method obtains accurate segmentation details as well as spatial
    consistency, the network uses multiple patch sizes and multiple convolution
    kernel sizes to acquire multi-scale information about each voxel. The method is
    not dependent on explicit features, but learns to recognise the information
    that is important for the classification based on training data. The method
    requires a single anatomical MR image only.

    The segmentation method is applied to five different data sets: coronal
    T2-weighted images of preterm infants acquired at 30 weeks postmenstrual age
    (PMA) and 40 weeks PMA, axial T2-weighted images of preterm infants acquired
    at 40 weeks PMA, axial T1-weighted images of ageing adults acquired at an
    average age of 70 years, and T1-weighted images of young adults acquired at an
    average age of 23 years. The method obtained the following average Dice
    coefficients over all segmented tissue classes for each data set, respectively:
    0.87, 0.82, 0.84, 0.86 and 0.91.

    The results demonstrate that the method obtains accurate segmentations in all
    five sets, and hence demonstrates its robustness to differences in age and
    acquisition protocol.
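
    For reference, the reported numbers are average Dice coefficients; a small
    sketch of the metric over label maps:

    import numpy as np

    def dice(pred, ref, label):
        # Dice = 2 |P ∩ R| / (|P| + |R|) for one tissue class.
        p, r = (pred == label), (ref == label)
        denom = p.sum() + r.sum()
        return 2.0 * np.logical_and(p, r).sum() / denom if denom else 1.0

    def mean_dice(pred, ref, labels):
        # Average over all segmented tissue classes, as reported above.
        return float(np.mean([dice(pred, ref, l) for l in labels]))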

    Online Video Deblurring via Dynamic Temporal Blending Network

    Tae Hyun Kim, Kyoung Mu Lee, Bernhard Schölkopf, Michael Hirsch
    Comments: 10 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    State-of-the-art video deblurring methods are capable of removing non-uniform
    blur caused by unwanted camera shake and/or object motion in dynamic scenes.
    However, most existing methods are based on batch processing and thus need
    access to all recorded frames, rendering them computationally demanding and
    time-consuming, which limits their practical use. In contrast, we propose
    an online (sequential) video deblurring method based on a spatio-temporal
    recurrent network that allows for real-time performance. In particular, we
    introduce a novel architecture which extends the receptive field while keeping
    the overall size of the network small to enable fast execution. In doing so,
    our network is able to remove even large blur caused by strong camera shake
    and/or fast moving objects. Furthermore, we propose a novel network layer that
    enforces temporal consistency between consecutive frames by dynamic temporal
    blending which compares and adaptively (at test time) shares features obtained
    at different time steps. We show the superiority of the proposed method in an
    extensive experimental evaluation.

    Simultaneous Stereo Video Deblurring and Scene Flow Estimation

    Liyuan Pan, Yuchao Dai, Miaomiao Liu, Fatih Porikli
    Comments: Accepted to IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Videos of outdoor scenes often show unpleasant blur effects due to the large
    relative motion between the camera and dynamic objects, together with large
    depth variations. Existing works typically focus on monocular video
    deblurring. In this
    paper, we propose a novel approach to deblurring from stereo videos. In
    particular, we exploit the piece-wise planar assumption about the scene and
    leverage the scene flow information to deblur the image. Unlike the existing
    approach [31] which used a pre-computed scene flow, we propose a single
    framework to jointly estimate the scene flow and deblur the image, where the
    motion cues from scene flow estimation and the blur information can reinforce
    each other and produce results superior to conventional scene flow
    estimation or stereo deblurring methods. We evaluate our method extensively on
    two available datasets and achieve significant improvement in flow estimation
    and removing the blur effect over the state-of-the-art methods.

    Learning Deep CNN Denoiser Prior for Image Restoration

    Kai Zhang, Wangmeng Zuo, Shuhang Gu, Lei Zhang
    Comments: Accepted to CVPR 2017. Code: this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Model-based optimization methods and discriminative learning methods have
    been the two dominant strategies for solving various inverse problems in
    low-level vision. Typically, those two kinds of methods have their respective
    merits and drawbacks, e.g., model-based optimization methods are flexible for
    handling different inverse problems but are usually time-consuming when
    sophisticated priors are used for the sake of good performance; meanwhile,
    discriminative learning methods have fast testing speed but their application
    range is greatly restricted by the specialized task. Recent works have revealed
    that, with the aid of variable splitting techniques, denoiser prior can be
    plugged in as a modular part of model-based optimization methods to solve other
    inverse problems (e.g., deblurring). Such an integration induces considerable
    advantage when the denoiser is obtained via discriminative learning. However,
    the study of integration with fast discriminative denoiser prior is still
    lacking. To this end, this paper aims to train a set of fast and effective CNN
    (convolutional neural network) denoisers and integrate them into model-based
    optimization methods to solve other inverse problems. Experimental results
    demonstrate that the learned set of denoisers not only achieves promising
    Gaussian denoising results but can also be used as a prior to deliver good
    performance in various low-level vision applications.
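
    The plug-and-play pattern the paper builds on can be sketched in a few lines.
    Below, a generic half-quadratic-splitting loop for deblurring alternates a
    data-fidelity step with a call to a learned denoiser standing in for the prior;
    the gradient-based data step and all constants are illustrative
    simplifications (closed-form FFT solutions are also common).

    import numpy as np

    def plug_and_play_deblur(y, blur, blur_adj, denoiser, mu=0.5, steps=30, lr=0.2):
        # y: blurry observation; blur/blur_adj: forward operator and adjoint;
        # denoiser: any pretrained denoiser acting as the learned prior.
        x, z = y.copy(), y.copy()
        for _ in range(steps):
            # Data step: approximately minimize ||blur(x) - y||^2 + mu ||x - z||^2.
            for _ in range(5):
                x = x - lr * (blur_adj(blur(x) - y) + mu * (x - z))
            # Prior step: the denoiser plays the role of the proximal operator
            # of the learned image prior.
            z = denoiser(x)
        return x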

    Reconstruction of three-dimensional porous media using generative adversarial neural networks

    Lukas Mosser, Olivier Dubrule, Martin J. Blunt
    Comments: 21 pages, 20 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Materials Science (cond-mat.mtrl-sci); Fluid Dynamics (physics.flu-dyn); Geophysics (physics.geo-ph)

    To evaluate the variability of multi-phase flow properties of porous media at
    the pore scale, it is necessary to acquire a number of representative samples
    of the void-solid structure. While modern x-ray computed tomography has made it
    possible to extract three-dimensional images of the pore space, assessment of
    the variability in the inherent material properties is often experimentally not
    feasible. We present a novel method to reconstruct the solid-void structure of
    porous media by applying a generative neural network that allows an implicit
    description of the probability distribution represented by three-dimensional
    image datasets. We show, by using an adversarial learning approach for neural
    networks, that this method of unsupervised learning is able to generate
    representative samples of porous media that honor their statistics. We
    successfully compare measures of pore morphology, such as the Euler
    characteristic, two-point statistics and directional single-phase permeability
    of synthetic realizations with the calculated properties of a bead pack, Berea
    sandstone, and Ketton limestone. Results show that GANs can be used to
    reconstruct high-resolution three-dimensional images of porous media at
    different scales that are representative of the morphology of the images used
    to train the neural network. The fully convolutional nature of the trained
    neural network allows the generation of large samples while maintaining
    computational efficiency. Compared to classical stochastic methods of image
    reconstruction, the implicit representation of the learned data distribution
    can be stored and reused to generate multiple realizations of the pore
    structure very rapidly.
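
    One of the comparison measures mentioned above, the two-point probability
    function of a binary void/solid volume, can be estimated via FFT
    autocorrelation; a hedged sketch (the radial binning scheme is an assumption):

    import numpy as np

    def two_point_probability(phase, nbins=64):
        # phase: binary array (1 = void, 0 = solid), any dimensionality.
        f = np.fft.fftn(phase.astype(float))
        corr = np.fft.ifftn(f * np.conj(f)).real / phase.size  # autocorrelation
        corr = np.fft.fftshift(corr)
        # Radially average around the volume centre.
        idx = np.indices(phase.shape)
        centre = np.array([s // 2 for s in phase.shape]).reshape(
            -1, *([1] * phase.ndim))
        r = np.sqrt(((idx - centre) ** 2).sum(axis=0)).ravel()
        bins = np.linspace(0, r.max(), nbins + 1)
        which = np.digitize(r, bins) - 1
        s2 = (np.bincount(which, weights=corr.ravel(), minlength=nbins + 1)
              / np.maximum(np.bincount(which, minlength=nbins + 1), 1))
        return s2[:nbins]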

    Pyramidal Gradient Matching for Optical Flow Estimation

    Yuanwei Li
    Comments: This work was finished in August 2016, submitted to IEEE PAMI on August 17, 2016, and submitted to IEEE TIP on April 9, 2017 after revision
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Initializing optical flow field by either sparse descriptor matching or dense
    patch matches has been proved to be particularly useful for capturing large
    displacements. In this paper, we present a pyramidal gradient matching approach
    that can provide dense matches for highly accurate and efficient optical flow
    estimation. A novel contribution of our method is that image gradients are used
    to describe image patches and are shown to produce robust matching.
    Therefore, our method is more efficient than methods that adopt special
    features (like SIFT) or patch distance metrics. Moreover, we find that image
    gradients are scalable for optical flow estimation, which means we can use
    different levels of gradient features (for example, full gradients or only the
    direction information of gradients) to obtain different complexity without
    dramatic changes in accuracy. Another contribution is that we uncover the
    secrets of limited PatchMatch through a thorough analysis and design a
    pyramidal matching framework based on these secrets. Our pyramidal matching
    framework is aimed at robust gradient matching and is effective at growing
    inliers and rejecting outliers. In this framework, we present some special enhancements
    for outlier filtering in gradient matching. By initializing EpicFlow with our
    matches, experimental results show that our method is efficient and robust
    (ranking 1st on both clean pass and final pass of MPI Sintel dataset among
    published methods).

    Mining Object Parts from CNNs via Active Question-Answering

    Quanshi Zhang, Ruiming Cao, Ying Nian Wu, Song-Chun Zhu
    Comments: Published in CVPR 2017
    Journal-ref: Quanshi Zhang, Ruiming Cao, Ying Nian Wu, and Song-Chun Zhu,
    “Mining Object Parts from CNNs via Active Question-Answering” in CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Given a convolutional neural network (CNN) that is pre-trained for object
    classification, this paper proposes to use active question-answering to
    semanticize neural patterns in conv-layers of the CNN and mine part concepts.
    For each part concept, we mine neural patterns in the pre-trained CNN, which
    are related to the target part, and use these patterns to construct an And-Or
    graph (AOG) to represent a four-layer semantic hierarchy of the part. As an
    interpretable model, the AOG associates different CNN units with different
    explicit object parts. We use active human-computer communication to
    incrementally grow such an AOG on the pre-trained CNN as follows. We allow the
    computer to actively identify objects whose neural patterns cannot be
    explained by the current AOG. Then, the computer asks a human about the
    unexplained objects, and uses the answers to automatically discover certain CNN
    patterns corresponding to the missing knowledge. We incrementally grow the AOG
    to encode new knowledge discovered during the active-learning process. In
    experiments, our method exhibits high learning efficiency. Our method uses
    about 1/6-1/3 of the part annotations for training, but achieves similar or
    better part-localization performance than fast-RCNN methods.

    Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering

    Vahid Kazemi, Ali Elqursh
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a new baseline for the visual question answering task.
    Given an image and a question in natural language, our model produces accurate
    answers according to the content of the image. Our model, while being
    architecturally simple and relatively small in terms of trainable parameters,
    sets a new state of the art on both unbalanced and balanced VQA benchmarks. On
    the VQA 1.0 open-ended challenge, our model achieves 64.6% accuracy on the
    test-standard set without using additional data, an improvement of 0.4% over
    the state of the art, and on the newly released VQA 2.0, our model scores 59.7%
    on the validation set, outperforming the best previously reported results by
    4%. The results presented in this paper are especially interesting because very
    similar models have been tried before but significantly lower performance was
    reported. In light of the new results we hope to see more meaningful research
    on visual question answering in the future.

    EAST: An Efficient and Accurate Scene Text Detector

    Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, Weiran He, Jiajun Liang
    Comments: Accepted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Previous approaches for scene text detection have already achieved promising
    performances across various benchmarks. However, they usually fall short when
    dealing with challenging scenarios, even when equipped with deep neural network
    models, because the overall performance is determined by the interplay of
    multiple stages and components in the pipelines. In this work, we propose a
    simple yet powerful pipeline that yields fast and accurate text detection in
    natural scenes. The pipeline directly predicts words or text lines of arbitrary
    orientations and quadrilateral shapes in full images, eliminating unnecessary
    intermediate steps (e.g., candidate aggregation and word partitioning), with a
    single neural network. The simplicity of our pipeline allows concentrating
    efforts on designing loss functions and neural network architecture.
    Experiments on standard datasets including ICDAR 2015, COCO-Text and MSRA-TD500
    demonstrate that the proposed algorithm significantly outperforms
    state-of-the-art methods in terms of both accuracy and efficiency. On the ICDAR
    2015 dataset, the proposed algorithm achieves an F-score of 0.7820 at 13.2 fps
    at 720p resolution.

    Deep Multimodal Representation Learning from Temporal Data

    Xitong Yang, Palghat Ramesh, Radha Chitta, Sriganesh Madhvanath, Edgar A. Bernal, Jiebo Luo
    Comments: To appear in CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In recent years, Deep Learning has been successfully applied to multimodal
    learning problems, with the aim of learning useful joint representations in
    data fusion applications. When the available modalities consist of time series
    data such as video, audio and sensor signals, it becomes imperative to consider
    their temporal structure during the fusion process. In this paper, we propose
    the Correlational Recurrent Neural Network (CorrRNN), a novel temporal fusion
    model for fusing multiple input modalities that are inherently temporal in
    nature. Key features of our proposed model include: (i) simultaneous learning
    of the joint representation and temporal dependencies between modalities, (ii)
    use of multiple loss terms in the objective function, including a maximum
    correlation loss term to enhance learning of cross-modal information, and (iii)
    the use of an attention model to dynamically adjust the contribution of
    different input modalities to the joint representation. We validate our model
    via experimentation on two different tasks: video- and sensor-based activity
    classification, and audio-visual speech recognition. We empirically analyze the
    contributions of different components of the proposed CorrRNN model, and
    demonstrate its robustness, effectiveness and state-of-the-art performance on
    multiple datasets.
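
    As an illustration of ingredient (ii), a maximum-correlation loss term can be
    written as the negative mean per-dimension Pearson correlation between the two
    modalities' hidden states over a batch; the formulation below is one natural
    choice, not necessarily the paper's exact term, and variable names are
    illustrative.

    import torch

    def correlation_loss(h_a, h_b, eps=1e-8):
        # h_a, h_b: (batch, dim) hidden states from the two modality encoders.
        a = h_a - h_a.mean(dim=0, keepdim=True)
        b = h_b - h_b.mean(dim=0, keepdim=True)
        corr = (a * b).sum(dim=0) / (
            a.pow(2).sum(dim=0).sqrt() * b.pow(2).sum(dim=0).sqrt() + eps)
        return -corr.mean()   # minimizing maximizes cross-modal correlation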

    Restoration of Atmospheric Turbulence-distorted Images via RPCA and Quasiconformal Maps

    Chun Pong Lau, Yu Hin Lai, Lok Ming Lui
    Comments: 21 pages, 24 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG); Graphics (cs.GR)

    We address the problem of restoring a high-quality image from an observed
    image sequence strongly distorted by atmospheric turbulence. A novel algorithm
    is proposed in this paper to reduce geometric distortion as well as
    space-and-time-varying blur due to strong turbulence. By considering a suitable
    energy functional, our algorithm first obtains a sharp reference image and a
    subsampled image sequence containing sharp and mildly distorted image frames
    with respect to the reference image. The subsampled image sequence is then
    stabilized by applying the Robust Principal Component Analysis (RPCA) on the
    deformation fields between image frames and warping the image frames by a
    quasiconformal map associated with the low-rank part of the deformation matrix.
    After the image frames are registered to the reference image, their low-rank
    part is deblurred via blind deconvolution, and the deblurred frames are then
    fused with the enhanced sparse part. Experiments have been carried out on both
    synthetic and real turbulence-distorted video. Results demonstrate that our
    method is effective in alleviating distortions and blur, restoring image
    details and enhancing visual quality.
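
    The RPCA step decomposes the stacked deformation fields M into a low-rank part
    L (stable geometry) plus a sparse part S (turbulent outliers). Below is a crude
    alternating-shrinkage sketch of that decomposition; a production solver would
    use inexact ALM, and the thresholds here are illustrative.

    import numpy as np

    def rpca(M, lam=None, iters=100):
        # Decompose M into L (low-rank) + S (sparse) by alternating shrinkage.
        lam = lam or 1.0 / np.sqrt(max(M.shape))
        S = np.zeros_like(M)
        for _ in range(iters):
            # Low-rank update: singular value thresholding of M - S.
            U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
            L = (U * np.maximum(s - 1.0, 0)) @ Vt
            # Sparse update: entrywise soft-thresholding of the residual.
            R = M - L
            S = np.sign(R) * np.maximum(np.abs(R) - lam, 0)
        return L, S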

    Improving Pairwise Ranking for Multi-label Image Classification

    Yuncheng Li, Yale Song, Jiebo Luo
    Comments: CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Learning to rank has recently emerged as an attractive technique to train
    deep convolutional neural networks for various computer vision tasks. Pairwise
    ranking, in particular, has been successful in multi-label image
    classification, achieving state-of-the-art results on various benchmarks.
    However, most existing approaches use the hinge loss to train their models,
    which is non-smooth and thus is difficult to optimize especially with deep
    networks. Furthermore, they employ simple heuristics, such as top-k or
    thresholding, to determine which labels to include in the output from a ranked
    list of labels, which limits their use in real-world settings. In this work,
    we propose two techniques to improve pairwise-ranking-based multi-label image
    classification: (1) we propose a novel loss function for pairwise ranking,
    which is smooth everywhere and thus is easier to optimize; and (2) we
    incorporate a label decision module into the model, estimating the optimal
    confidence thresholds for each visual concept. We provide theoretical analyses
    of our loss function in the Bayes consistency and risk minimization framework,
    and show its benefit over existing pairwise ranking formulations. We
    demonstrate the effectiveness of our approach on three large-scale datasets,
    VOC2007, NUS-WIDE and MS-COCO, achieving the best reported results in the
    literature.
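
    To illustrate contribution (1), a pairwise ranking loss can be made smooth by
    replacing the hinge on (negative minus positive) score differences with a
    smooth surrogate; softplus is used below as one natural choice, not the paper's
    exact loss.

    import torch
    import torch.nn.functional as F

    def smooth_pairwise_rank_loss(scores, labels):
        # scores: (batch, n_labels) raw outputs; labels: same shape in {0, 1}.
        losses = []
        for s, p in zip(scores, labels.bool()):
            if p.any() and (~p).any():
                diff = s[~p].unsqueeze(0) - s[p].unsqueeze(1)  # negative - positive
                losses.append(F.softplus(diff).mean())         # smooth everywhere
        return torch.stack(losses).mean()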

    DOPE: Distributed Optimization for Pairwise Energies

    Jose Dolz, Ismail Ben Ayed, Christian Desrosiers
    Comments: Accepted at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We formulate an Alternating Direction Method of Multipliers (ADMM) that
    systematically distributes the computations of any technique for optimizing
    pairwise functions, including non-submodular potentials. Such discrete
    functions are very useful in segmentation and a breadth of other vision
    problems. Our method decomposes the problem into a large set of small
    sub-problems, each involving a sub-region of the image domain, which can be
    solved in parallel. We achieve consistency between the sub-problems through a
    novel constraint that can be used for a large class of pairwise functions. We
    give an iterative numerical solution that alternates between solving the
    sub-problems and updating consistency variables, until convergence. We report
    comprehensive experiments, which demonstrate the benefit of our general
    distributed solution in the case of the popular serial algorithm of Boykov and
    Kolmogorov (BK algorithm) and, also, in the context of non-submodular
    functions.
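
    The structure of the method can be sketched with a generic consensus ADMM
    loop: local sub-problems over (possibly overlapping) sub-regions are solved
    independently, and agreement on shared variables is enforced through consensus
    and dual updates. Here local_solve stands for any solver of the pairwise
    energy restricted to one region; everything else is an illustrative assumption.

    import numpy as np

    def consensus_admm(regions, local_solve, n_vars, rho=1.0, iters=50):
        # regions: list of tuples of variable indices (sub-regions, may overlap).
        x = {r: np.zeros(len(r)) for r in regions}   # local solutions
        u = {r: np.zeros(len(r)) for r in regions}   # scaled dual variables
        z = np.zeros(n_vars)                         # global consensus variable
        for _ in range(iters):
            for r in regions:                        # parallelizable across regions
                x[r] = local_solve(r, z[list(r)] - u[r], rho)
            # Consensus step: average all local copies of each shared variable.
            num, den = np.zeros(n_vars), np.zeros(n_vars)
            for r in regions:
                num[list(r)] += x[r] + u[r]
                den[list(r)] += 1
            z = num / np.maximum(den, 1)
            for r in regions:                        # dual ascent on disagreement
                u[r] += x[r] - z[list(r)]
        return z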

    Detecting Visual Relationships with Deep Relational Networks

    Bo Dai, Yuqi Zhang, Dahua Lin
    Comments: To appear in CVPR 2017 as an oral paper
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Relationships among objects play a crucial role in image understanding.
    Despite the great success of deep learning techniques in recognizing individual
    objects, reasoning about the relationships among objects remains a challenging
    task. Previous methods often treat this as a classification problem,
    considering each type of relationship (e.g. “ride”) or each distinct visual
    phrase (e.g. “person-ride-horse”) as a category. Such approaches face
    significant difficulties caused by the high diversity of visual appearance for
    each kind of relationship and the large number of distinct visual phrases. We
    propose an integrated framework to tackle this problem. At the heart of this
    framework is the Deep Relational Network, a novel formulation designed
    specifically for exploiting the statistical dependencies between objects and
    their relationships. On two large datasets, the proposed method achieves
    substantial improvement over state-of-the-art.

    A semidiscrete version of the Petitot model as a plausible model for anthropomorphic image reconstruction and pattern recognition

    Dario Prandi, Jean-Paul Gauthier
    Comments: 118 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Analysis of PDEs (math.AP); Representation Theory (math.RT)

    In his beautiful book [54], Jean Petitot proposes a subriemannian model for
    the primary visual cortex of mammals. This model is neurophysiologically
    justified. Further developments of this theory lead to efficient algorithms for
    image reconstruction, based upon the consideration of an associated
    hypoelliptic diffusion. The subriemannian model of Petitot (or certain of its
    improvements) is a left-invariant structure over the group SE(2) of
    rototranslations of the plane. Here, we propose a semi-discrete version of this
    theory, leading to a left-invariant structure over the group SE(2,N),
    restricting to a finite number of rotations. This apparently very simple group
    is in fact quite atypical: it is maximally almost periodic, which leads to much
    simpler harmonic analysis compared to SE(2). Based upon this semi-discrete
    model, we improve on the image-reconstruction algorithms and we develop a
    pattern-recognition theory that leads also to very efficient algorithms in
    practice.

    Action Unit Detection with Region Adaptation, Multi-labeling Learning and Optimal Temporal Fusing

    Wei Li, Farnaz Abitahi, Zhigang Zhu
    Comments: The paper is accepted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Action Unit (AU) detection is essential for facial analysis. Many
    proposed approaches face challenging problems in dealing with the alignments of
    different face regions, in the effective fusion of temporal information, and in
    training a model for multiple AU labels. To better address these problems, we
    propose a deep learning framework for AU detection with region of interest
    (ROI) adaptation, integrated multi-label learning, and optimal LSTM-based
    temporal fusing. First, ROI cropping nets (ROI Nets) are designed to ensure
    that specific regions of interest of the face are learned independently; each
    sub-region has a local convolutional neural network (CNN) – an ROI Net, whose
    convolutional filters will only be trained for the corresponding region.
    Second, multi-label learning is employed to integrate the outputs of those
    individual ROI cropping nets, which learns the inter-relationships of various
    AUs and acquires global features across sub-regions for AU detection. Finally,
    multiple LSTM layers are optimally selected to form the best LSTM Net for
    fusing temporal features, making AU prediction as accurate as possible. The
    proposed approach is evaluated on two popular AU
    detection datasets, BP4D and DISFA, outperforming the state of the art
    significantly, with average improvements of around 13% on BP4D and 25% on
    DISFA.

    CERN: Confidence-Energy Recurrent Network for Group Activity Recognition

    Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu
    Comments: Accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    This work is about recognizing human activities occurring in videos at
    distinct semantic levels, including individual actions, interactions, and group
    activities. The recognition is realized using a two-level hierarchy of Long
    Short-Term Memory (LSTM) networks, forming a feed-forward deep architecture,
    which can be trained end-to-end. In comparison with existing LSTM
    architectures, we make two key contributions, which give our approach its name,
    Confidence-Energy Recurrent Network (CERN). First, instead of using the common
    softmax layer for prediction, we specify a novel energy layer (EL) for
    estimating the energy of our predictions. Second, rather than finding the
    common minimum-energy class assignment, which may be numerically unstable under
    uncertainty, we specify that the EL additionally computes the p-values of the
    solutions, and in this way estimates the most confident energy minimum. The
    evaluation on the Collective Activity and Volleyball datasets demonstrates: (i)
    advantages of our two contributions relative to the common softmax and
    energy-minimization formulations and (ii) a superior performance relative to
    the state-of-the-art approaches.

    DRAW: Deep networks for Recognizing styles of Artists Who illustrate children's books

    Samet Hicsonmez, Nermin Samet, Fadime Sener, Pinar Duygulu
    Comments: ACM ICMR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper is motivated by a young boy’s ability to recognize an
    illustrator’s style in a totally different context. In the book “We are All
    Born Free” [1], composed of selected rights from the Universal Declaration of
    Human Rights interpreted by different illustrators, the boy was surprised to
    see a picture similar to the ones in the “Winnie the Witch” series drawn by
    Korky Paul (Figure 1). The style was noticeable in other characters of the same
    illustrator in different books as well. The capability of a child to easily
    spot the style was shown to be valid for other illustrators such as Axel
    Scheffler and Debi Gliori. The boy’s enthusiasm led us to start a journey to
    explore the capability of machines to recognize the style of illustrators.

    We collected pages from children’s books to construct a new illustrations
    dataset consisting of about 6500 pages from 24 artists. We exploited deep
    networks for categorizing illustrators and, with around 94% classification
    performance, our method outperformed traditional methods by more than 10%.
    Going beyond categorization we explored transferring style. The classification
    performance on the transferred images has shown the ability of our system to
    capture the style. Furthermore, we discovered representative illustrations and
    discriminative stylistic elements.

    Semantically Consistent Regularization for Zero-Shot Recognition

    Pedro Morgado, Nuno Vasconcelos
    Comments: Accepted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The role of semantics in zero-shot learning is considered. The effectiveness
    of previous approaches is analyzed according to the form of supervision
    provided. While some learn semantics independently, others only supervise the
    semantic subspace explained by training classes. Thus, the former is able to
    constrain the whole space but lacks the ability to model semantic correlations.
    The latter addresses this issue but leaves part of the semantic space
    unsupervised. This complementarity is exploited in a new convolutional neural
    network (CNN) framework, which proposes the use of semantics as constraints for
    recognition. Although a CNN trained for classification has no transfer ability,
    this can be encouraged by learning a hidden semantic layer together with a
    semantic code for classification. Two forms of semantic constraints are then
    introduced. The first is a loss-based regularizer that introduces a
    generalization constraint on each semantic predictor. The second is a codeword
    regularizer that favors semantic-to-class mappings consistent with prior
    semantic knowledge while allowing these to be learned from data. Significant
    improvements over the state-of-the-art are achieved on several datasets.

    Weakly-Supervised Spatial Context Networks

    Zuxuan Wu, Larry S. Davis, Leonid Sigal
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We explore the power of spatial context as a self-supervisory signal for
    learning visual representations. In particular, we propose spatial context
    networks that learn to predict a representation of one image patch from another
    image patch, within the same image, conditioned on their real-valued relative
    spatial offset. Unlike auto-encoders, which aim to encode and reconstruct
    original image patches, our network aims to encode and reconstruct intermediate
    representations of the spatially offset patches. As such, the network learns a
    spatially conditioned contextual representation. By testing performance with
    various patch selection mechanisms, we show that focusing on object-centric
    patches is important, and that using object proposals as the patch selection
    mechanism leads to the highest improvement in performance. Further, unlike
    auto-encoders, context encoders [21], or other forms of unsupervised feature
    learning, we illustrate that contextual supervision (with pre-trained model
    initialization) can improve on existing pre-trained model performance. We build
    our spatial context networks on top of standard VGG_19 and CNN_M architectures
    and, among other things, show that we can achieve improvements (with no
    additional explicit supervision) over the original ImageNet pre-trained VGG_19
    and CNN_M models in object categorization and detection on VOC2007.

    Solving the L1 regularized least square problem via a box-constrained smooth minimization

    Majid Mohammadi, Wout Hofman, Yaohua Tan, S. Hamid Mousavi
    Comments: 5 pages
    Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV)

    In this paper, an equivalent smooth minimization for the L1 regularized least
    square problem is proposed. The proposed problem is a convex box-constrained
    smooth minimization which allows fast optimization methods to be applied to
    find its solution. Further, it is shown that the property “the dual of the dual
    is the primal” holds for the L1 regularized least square problem. A solver for
    the smooth problem is proposed, and its affinity to the proximal gradient
    method is shown. Finally, experiments on L1 and total variation regularized
    problems are
    performed, and the corresponding results are reported.
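
    One standard smooth, box-constrained reformulation (possibly different in
    detail from the paper's) splits x = u - v with u, v >= 0, turning
    min ||Ax - b||^2 + lam*||x||_1 into a smooth objective under nonnegativity
    bounds that an off-the-shelf solver handles:

    import numpy as np
    from scipy.optimize import minimize

    def l1_ls(A, b, lam):
        # Solve min ||A(u - v) - b||^2 + lam * (sum(u) + sum(v)), u, v >= 0.
        m, n = A.shape

        def f(uv):
            u, v = uv[:n], uv[n:]
            r = A @ (u - v) - b
            return r @ r + lam * uv.sum()

        def grad(uv):
            u, v = uv[:n], uv[n:]
            g = 2 * A.T @ (A @ (u - v) - b)
            return np.concatenate([g + lam, -g + lam])

        res = minimize(f, np.zeros(2 * n), jac=grad, method="L-BFGS-B",
                       bounds=[(0, None)] * (2 * n))
        return res.x[:n] - res.x[n:]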

    WRPN: Training and Inference using Wide Reduced-Precision Networks

    Asit Mishra, Jeffrey J Cook, Eriko Nurvitadhi, Debbie Marr
    Comments: Under submission to CVPR Workshop
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    For computer vision applications, prior works have shown the efficacy of
    reducing the numeric precision of model parameters (network weights) in deep
    neural networks but also that reducing the precision of activations hurts model
    accuracy much more than reducing the precision of model parameters. We study
    schemes to train networks from scratch using reduced-precision activations
    without hurting the model accuracy. We reduce the precision of activation maps
    (along with model parameters) using a novel quantization scheme and increase
    the number of filter maps in a layer, and find that this scheme matches or
    surpasses the accuracy of the baseline full-precision network. As a result, one
    can significantly reduce the dynamic memory footprint, memory bandwidth and
    computational energy, and speed up training and inference with appropriate
    hardware support. We call our scheme WRPN – wide reduced-precision
    networks. We report results using our proposed schemes and show that they
    improve on previously reported accuracies on the ILSVRC-12 dataset while being
    computationally less expensive than previously reported reduced-precision
    networks.


    Artificial Intelligence

    Efficient Large Scale Clustering based on data partitioning

    Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi
    Subjects: Artificial Intelligence (cs.AI)

    Clustering techniques are very attractive for extracting and identifying
    patterns in datasets. However, their application to very large spatial datasets
    presents numerous challenges such as high-dimensionality data, heterogeneity,
    and the high complexity of some algorithms. For instance, some algorithms may
    have linear complexity but require domain knowledge to determine
    their input parameters. Distributed clustering techniques constitute a very
    good alternative for the big data challenges (e.g., Volume, Variety, Veracity,
    and Velocity). Usually these techniques consist of two phases. The first phase
    generates local models or patterns and the second one tends to aggregate the
    local results to obtain global models. While the first phase can be executed in
    parallel on each site and, therefore, efficient, the aggregation phase is
    complex, time consuming and may produce incorrect and ambiguous global clusters
    and therefore incorrect models. In this paper we propose a new distributed
    clustering approach to deal efficiently with both phases; generation of local
    results and generation of global models by aggregation. For the first phase,
    our approach is capable of analysing the datasets located in each site using
    different clustering techniques. The aggregation phase is designed in such a
    way that the final clusters are compact and accurate while the overall process
    is efficient in time and memory allocation. For the evaluation, we use two
    well-known clustering algorithms, K-Means and DBSCAN. One key property of
    this distributed clustering technique is that the number of global clusters is
    dynamic and need not be fixed in advance. Experimental results show that the
    approach is scalable and produces high quality results.
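
    The two-phase pattern is easy to sketch: cluster each partition locally, then
    merge only the local representatives. The concrete choices below (K-Means
    locally, DBSCAN for aggregation so the number of global clusters stays dynamic)
    mirror the algorithms named above, but the parameters and the simple
    centroid-merging step are illustrative; the paper's aggregation is more
    elaborate.

    import numpy as np
    from sklearn.cluster import DBSCAN, KMeans

    def distributed_clustering(partitions, k_local=10, eps=0.5):
        # Phase 1: local clustering on each site (embarrassingly parallel).
        local_centroids = [
            KMeans(n_clusters=k_local, n_init=10).fit(part).cluster_centers_
            for part in partitions
        ]
        # Phase 2: aggregate only the local representatives; DBSCAN leaves the
        # number of global clusters dynamic rather than fixed in advance.
        reps = np.vstack(local_centroids)
        labels = DBSCAN(eps=eps, min_samples=2).fit_predict(reps)
        return reps, labels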

    What we really want to find by Sentiment Analysis: The Relationship between Computational Models and Psychological State

    Hwiyeol Jo, Soo-Min Kim, Jeong Ryu
    Comments: Rejected Paper in CogSci2017. I’m sure there is no place for integrated research. arXiv admin note: text overlap with arXiv:1607.03707
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    As a first step toward modeling the emotional state of a person, we build
    sentiment analysis models with existing deep neural network algorithms and
    compare the models with psychological measurements to elucidate the
    relationship. In the
    experiments, we first examined psychological state of 64 participants and asked
    them to summarize the story of a book, Chronicle of a Death Foretold (Marquez,
    1981). Secondly, we trained models on 365,802 crawled movie reviews;
    then we evaluated participants’ summaries using the pretrained model, as a
    form of transfer learning. Given that emotion affects
    memories, we investigated the relationship between the evaluation scores of the
    summaries from computational models and the examined psychological
    measurements. The result shows that although CNN performed the best among other
    deep neural network algorithms (LSTM, GRU), its results are not related to the
    psychological state. Rather, GRU shows more explainable results depending on
    the psychological state. The contribution of this paper can be summarized as
    follows: (1) we enlighten the relationship between computational models and
    psychological measurements. (2) we suggest this framework as objective methods
    to evaluate the emotion; the real sentiment analysis of a person.

    Next Generation Business Intelligence and Analytics: A Survey

    Quoc Duy Vo, Jaya Thomas, Shinyoung Cho, Pradipta De, Bong Jun Choi, Lee Sael
    Comments: 11 pages, 4 figures
    Subjects: Artificial Intelligence (cs.AI)

    Business Intelligence and Analytics (BI&A) is the process of extracting and
    predicting business-critical insights from data. Traditional BI focused on data
    collection, extraction, and organization to enable efficient query processing
    for deriving insights from historical data. With the rise of big data and cloud
    computing, there are many challenges and opportunities for BI. Especially
    with the growing number of data sources, traditional BI&A is evolving to
    provide intelligence at different scales and perspectives – operational BI,
    situational BI, self-service BI. In this survey, we review the evolution of
    business intelligence systems at full scale, from back-end architectures to
    front-end applications. We focus on the changes in the back-end architecture
    that deals with the collection and organization of the data. We also review the
    changes in the front-end applications, where analytic services and
    visualization are the core components. Using a use case from BI in healthcare,
    which is one of the most complex enterprise domains, we show how BI&A will play
    an
    important role beyond the traditional usage. The survey provides a holistic
    view of Business Intelligence and Analytics for anyone interested in getting a
    complete picture of the different pieces in the emerging next generation BI&A
    solutions.

    Source-Sensitive Belief Change

    Shahab Ebrahimi
    Comments: 13 pages
    Journal-ref: International Journal of Artificial Intelligence and Applications
    (IJAIA), Vol.8, No.2, March 2017
    Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)

    The AGM model is the most remarkable framework for modeling belief revision.
    However, it is not perfect in all aspects. Paraconsistent belief revision,
    multi-agent belief revision and non-prioritized belief revision are three
    different extensions of AGM that address three important criticisms leveled at
    it. In this article, we propose a framework based on AGM that takes a position
    in each of these categories. Also, we discuss some features of our framework
    and study the satisfiability of AGM postulates in this new context.

    Beliefs and Probability in Bacchus' l.p. Logic: A 3-Valued Logic Solution to Apparent Counter-intuition

    Mieczysław A. Kłopotek
    Comments: Draft for the conference M.A. Kłopotek: Beliefs and Probability in Bacchus’ l.p. Logic: A 3-Valued Logic Solution to Apparent Counter-intuition. [in:] R. Trappl Ed.: Cybernetics and Systems Research. Proc. 11 European Meeting on Cybernetics and System Research EMCSR’92, Wien, Österreich, 20. April 1992. World Scientific Singapore, New Jersey, London, Hong Kong, Vol. 1, pp. 519-526
    Subjects: Artificial Intelligence (cs.AI)

    A fundamental discrepancy between first-order logic and statistical inference
    (global versus local properties of the universe) is shown to be the obstacle to
    the integration of logic and probability in Bacchus' L.p. logic. To overcome
    the counterintuitive behaviour of L.p., a 3-valued logic is proposed.

    The MATLAB Toolbox SciXMiner: User's Manual and Programmer's Guide

    Ralf Mikut, Andreas Bartschat, Wolfgang Doneit, Jorge Ángel González Ordiano, Benjamin Schott, Johannes Stegmaier, Simon Waczowicz, Markus Reischl
    Subjects: Artificial Intelligence (cs.AI)

    The Matlab toolbox SciXMiner is designed for the visualization and analysis
    of time series and features with a special focus on classification problems. It
    was developed at the Institute of Applied Computer Science of the Karlsruhe
    Institute of Technology (KIT), a member of the Helmholtz Association of German
    Research Centres in Germany. The aim was to provide an open platform for the
    development and improvement of data mining methods and its applications to
    various medical and technical problems. SciXMiner is based on Matlab (tested
    with version 2017a). Many functions do not require additional standard
    toolboxes, but some parts of the Signal, Statistics and Wavelet toolboxes are
    used for special cases. The decision in favor of a Matlab-based solution was
    made to exploit the wide mathematical functionality of the package provided by
    The MathWorks Inc.
    SciXMiner is controlled by a graphical user interface (GUI) with menu items and
    control elements like popup lists, checkboxes and edit elements. This makes it
    easier for inexperienced users to work with SciXMiner. Furthermore,
    automation and batch standardization of analyses are possible using macros.
    The standard Matlab style using the command line is also available. SciXMiner
    is open source software. The download page is
    this http URL. It is licensed under the conditions
    of the GNU General Public License (GNU-GPL) of The Free Software Foundation.

    Matching Media Contents with User Profiles by means of the Dempster-Shafer Theory

    Luigi Troiano, Irene Díaz, Ciro Gaglione
    Comments: FUZZ-IEEE 2017. 6 pages, 3 figures, 4 tables
    Subjects: Artificial Intelligence (cs.AI)

    The media industry is increasingly personalizing its content offering in an
    attempt to better target the audience. This requires analyzing the
    relationship established between users and the content they enjoy,
    looking on one side at the content characteristics and on the other at the user
    profile, in order to find the best match between the two. In this paper we
    suggest building that relationship using the Dempster-Shafer Theory of
    Evidence, proposing a reference model and illustrating its properties by means
    of a toy example. Finally, we suggest possible applications of the model for
    tasks that are common in the modern media industry.
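
    The core operation of the theory is Dempster's rule of combination; a small
    self-contained sketch follows (the toy masses are invented for illustration):

    # Combine two mass functions over subsets of a frame of discernment,
    # renormalizing by the conflict. Frozensets represent subsets.
    def dempster_combine(m1, m2):
        combined, conflict = {}, 0.0
        for a, w1 in m1.items():
            for b, w2 in m2.items():
                inter = frozenset(a) & frozenset(b)
                if inter:
                    combined[inter] = combined.get(inter, 0.0) + w1 * w2
                else:
                    conflict += w1 * w2
        if conflict >= 1.0:
            raise ValueError("total conflict: sources cannot be combined")
        return {s: w / (1.0 - conflict) for s, w in combined.items()}

    # Example: two sources of evidence about a user's preferred genre.
    m_user = {frozenset({"drama", "comedy"}): 0.6,
              frozenset({"drama", "comedy", "action"}): 0.4}
    m_item = {frozenset({"drama"}): 0.7,
              frozenset({"drama", "comedy", "action"}): 0.3}
    print(dempster_combine(m_user, m_item))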

    Stochastic Neural Networks for Hierarchical Reinforcement Learning

    Carlos Florensa, Yan Duan, Pieter Abbeel
    Comments: Published as a conference paper at ICLR 2017
    Journal-ref: International Conference on Learning Representations 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)

    Deep reinforcement learning has achieved many impressive results in recent
    years. However, tasks with sparse rewards or long horizons continue to pose
    significant challenges. To tackle these important problems, we propose a
    general framework that first learns useful skills in a pre-training
    environment, and then leverages the acquired skills for learning faster in
    downstream tasks. Our approach brings together some of the strengths of
    intrinsic motivation and hierarchical methods: the learning of useful skills is
    guided by a single proxy reward, the design of which requires very minimal
    domain knowledge about the downstream tasks. A high-level policy is then
    trained on top of these skills, significantly improving exploration and
    making it possible to tackle sparse rewards in the downstream tasks. To
    efficiently pre-train a large span of skills, we use Stochastic Neural Networks
    combined with an information-theoretic regularizer. Our experiments show that
    this combination is effective in learning a wide span of interpretable skills
    in a sample-efficient way, and can significantly boost the learning performance
    uniformly across a wide range of downstream tasks.
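
    The information-theoretic regularizer can be approximated empirically. Below is
    a minimal Python sketch, assuming a tabular setting with discretized state bins;
    the `alpha` coefficient and count-based estimator are illustrative constructions,
    not the paper's exact method. Skills earn a bonus in states that identify them,
    which is one simple way to encourage a diverse span of skills.

        import numpy as np

        num_skills, num_bins = 6, 50
        counts = np.ones((num_skills, num_bins))  # Laplace-smoothed visitation counts

        def mi_regularized_reward(proxy_reward, skill, state_bin, alpha=0.1):
            """Proxy reward plus an empirical mutual-information bonus:
            log of the estimated probability that this state was visited
            by this skill, pushing different skills toward different states."""
            counts[skill, state_bin] += 1
            p_skill_given_state = counts[skill, state_bin] / counts[:, state_bin].sum()
            return proxy_reward + alpha * np.log(p_skill_given_state)

        print(mi_regularized_reward(1.0, skill=2, state_bin=7))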

    Quality Aware Network for Set to Set Recognition

    Yu Liu, Junjie Yan, Wanli Ouyang
    Comments: Accepted at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    This paper targets the problem of set-to-set recognition, which learns the
    metric between two image sets. Images in each set belong to the same identity.
    Since images in a set can be complementary, they are expected to lead to higher
    accuracy in practical applications. However, the quality of each sample cannot
    be guaranteed, and samples of poor quality will hurt the metric. In this
    paper, the quality aware network (QAN) is proposed to confront this problem,
    where the quality of each sample can be automatically learned although such
    information is not explicitly provided in the training stage. The network has
    two branches: the first branch extracts an appearance feature embedding for
    each sample and the other branch predicts a quality score for each sample.
    Features and quality scores of all samples in a set are then aggregated to
    generate the final feature embedding. We show that the two branches can be
    trained in an end-to-end manner given only the set-level identity annotation.
    Analysis on gradient spread of this mechanism indicates that the quality
    learned by the network is beneficial to set-to-set recognition and simplifies
    the distribution that the network needs to fit. Experiments on both face
    verification and person re-identification show advantages of the proposed QAN.
    The source code and network structure can be downloaded at
    this https URL
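
    A minimal NumPy sketch of the aggregation step, assuming the two branches have
    already produced per-sample features and quality logits; the softmax
    normalization of the quality scores is an assumption made for the sketch:

        import numpy as np

        def qan_aggregate(features, quality_logits):
            """Aggregate per-image features of a set into one embedding,
            weighting each sample by its normalized predicted quality."""
            q = np.exp(quality_logits - quality_logits.max())
            q /= q.sum()
            return (q[:, None] * features).sum(axis=0)

        feats = np.random.randn(5, 128)                  # 5 images, 128-d embeddings
        logits = np.array([2.0, 0.1, -1.0, 0.5, 1.5])    # quality-branch outputs
        set_embedding = qan_aggregate(feats, logits)     # one embedding for the set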

    Interpretable Explanations of Black Boxes by Meaningful Perturbation

    Ruth Fong, Andrea Vedaldi
    Comments: 9 pages, 10 figures, submitted to ICCV 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    As machine learning algorithms are increasingly applied to high impact yet
    high risk tasks, e.g. problems in health, it is critical that researchers can
    explain how such algorithms arrived at their predictions. In recent years, a
    number of image saliency methods have been developed to summarize where highly
    complex neural networks “look” in an image for evidence for their predictions.
    However, these techniques are limited by their heuristic nature and
    architectural constraints.

    In this paper, we make two main contributions: First, we propose a general
    framework for learning different kinds of explanations for any black box
    algorithm. Second, we introduce a paradigm that learns the minimally salient
    part of an image by directly editing it and learning from the corresponding
    changes to its output. Unlike previous works, our method is model-agnostic and
    testable because it is grounded in replicable image perturbations.
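
    A minimal PyTorch sketch of the mask-learning idea, assuming a differentiable
    classifier `model` and using blur as the perturbation operator; the step count,
    blur kernel and regularization weight are illustrative, and the paper's full
    objective includes additional terms:

        import torch

        def learn_saliency_mask(model, image, target, steps=200, lam=0.05, lr=0.1):
            """Learn a mask m in [0,1] (m=1 preserves, m=0 deletes) that blends the
            image with a blurred copy so the target class score is suppressed while
            deleting as little of the image as possible."""
            blurred = torch.nn.functional.avg_pool2d(image, 11, stride=1, padding=5)
            mask = torch.full(image.shape[-2:], 0.5, requires_grad=True)
            opt = torch.optim.Adam([mask], lr=lr)
            for _ in range(steps):
                m = mask.clamp(0, 1)
                perturbed = m * image + (1 - m) * blurred
                score = torch.softmax(model(perturbed), dim=1)[0, target]
                loss = score + lam * (1 - m).abs().mean()  # penalize deletion area
                opt.zero_grad(); loss.backward(); opt.step()
            return mask.detach().clamp(0, 1)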

    Scavenger 0.1: A Theorem Prover Based on Conflict Resolution

    Daniyar Itegulov, John Slaney, Bruno Woltzenlogel Paleo
    Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI); Formal Languages and Automata Theory (cs.FL)

    This paper introduces Scavenger, the first theorem prover for pure
    first-order logic without equality based on the new conflict resolution
    calculus. Conflict resolution has a restricted resolution inference rule that
    resembles (a first-order generalization of) unit propagation as well as a rule
    for assuming decision literals and a rule for deriving new clauses by (a
    first-order generalization of) conflict-driven clause learning.

    Minkowski Operations of Sets with Application to Robot Localization

    Benoit Desrochers (DGA-TN), Luc Jaulin (Ensta Bretagne, Lab-Sticc)
    Comments: In Proceedings SNR 2017, arXiv:1704.02421
    Journal-ref: EPTCS 247, 2017, pp. 34-45
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computational Geometry (cs.CG); Systems and Control (cs.SY)

    This paper shows that using separators, i.e., pairs of two complementary
    contractors, we can easily and efficiently solve the localization problem of a
    robot with sonar measurements in an unstructured environment. We introduce
    separators associated with the Minkowski sum and the Minkowski difference in
    order to facilitate the resolution. A test case is given in order to illustrate
    the principle of the approach.
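
    For the special case of axis-aligned boxes, the Minkowski sum reduces to
    interval addition in each dimension, which conveys the flavor of the operation;
    the paper handles general sets through separators and contractors. A small
    NumPy sketch with invented box coordinates:

        import numpy as np

        def minkowski_sum_box(x_lo, x_hi, y_lo, y_hi):
            """Minkowski sum X + Y of two axis-aligned boxes, computed
            dimension-wise as [x_lo + y_lo, x_hi + y_hi]."""
            return np.asarray(x_lo) + np.asarray(y_lo), np.asarray(x_hi) + np.asarray(y_hi)

        # Robot position box inflated by a sonar uncertainty box.
        lo, hi = minkowski_sum_box([0.0, 1.0], [2.0, 3.0], [-0.5, -0.5], [0.5, 0.5])
        print(lo, hi)   # [-0.5  0.5] [2.5 3.5]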

    Composite Task-Completion Dialogue System via Hierarchical Deep Reinforcement Learning

    Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong
    Comments: 13 pages, 7 figures
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    In a composite-domain task-completion dialogue system, a conversation agent
    often switches among multiple sub-domains before it successfully completes the
    task. In such a scenario, a standard deep reinforcement learning based
    dialogue agent may struggle to find a good policy due to issues such as
    increased state and action spaces, high sample-complexity demands, and sparse
    rewards over long horizons. In this paper, we propose a hierarchical
    deep reinforcement learning approach which can operate at different temporal
    scales and is intrinsically motivated to attack these problems. Our
    hierarchical network consists of two levels: a top-level meta-controller for
    subgoal selection and a low-level controller for dialogue policy learning.
    Subgoals selected by the meta-controller, together with intrinsic rewards, guide
    the controller to explore the state-action space effectively and mitigate the
    sparse reward and long horizon problems. Experiments on both simulations and
    human evaluation show that our model significantly outperforms flat deep
    reinforcement learning agents in terms of success rate, rewards and user
    rating.

    WRPN: Training and Inference using Wide Reduced-Precision Networks

    Asit Mishra, Jeffrey J Cook, Eriko Nurvitadhi, Debbie Marr
    Comments: Under submission to CVPR Workshop
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    For computer vision applications, prior works have shown the efficacy of
    reducing the numeric precision of model parameters (network weights) in deep
    neural networks, but also that reducing the precision of activations hurts model
    accuracy much more than reducing the precision of model parameters. We study
    schemes to train networks from scratch using reduced-precision activations
    without hurting model accuracy. We reduce the precision of activation maps
    (along with model parameters) using a novel quantization scheme and increase
    the number of filter maps in a layer, and find that this scheme matches or
    surpasses the accuracy of the baseline full-precision network. As a result, one
    can significantly reduce the dynamic memory footprint, memory bandwidth and
    computational energy, and speed up the training and inference process with
    appropriate hardware support. We call our scheme WRPN – wide reduced-precision
    networks. We report results using our proposed schemes and show that they are
    better than previously reported accuracies on the ILSVRC-12 dataset
    while being computationally less expensive than previously reported
    reduced-precision networks.
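
    A generic k-bit uniform quantizer in NumPy gives the flavor of the
    activation-precision reduction; the clipping range and rounding here are
    illustrative assumptions, and the paper's exact quantization scheme may differ:

        import numpy as np

        def quantize(x, k_bits, max_val):
            """Uniform k-bit quantization of activations clipped to [0, max_val]:
            2**k - 1 evenly spaced levels, snapped by rounding."""
            levels = 2 ** k_bits - 1
            x = np.clip(x, 0.0, max_val)
            return np.round(x / max_val * levels) / levels * max_val

        acts = np.random.rand(4, 16) * 6.0
        acts_q4 = quantize(acts, k_bits=4, max_val=6.0)
        # WRPN then compensates for the precision loss by widening the layer,
        # e.g. increasing the number of filter maps.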

    Semantically Consistent Regularization for Zero-Shot Recognition

    Pedro Morgado, Nuno Vasconcelos
    Comments: Accepted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The role of semantics in zero-shot learning is considered. The effectiveness
    of previous approaches is analyzed according to the form of supervision
    provided. While some learn semantics independently, others only supervise the
    semantic subspace explained by training classes. Thus, the former is able to
    constrain the whole space but lacks the ability to model semantic correlations.
    The latter addresses this issue but leaves part of the semantic space
    unsupervised. This complementarity is exploited in a new convolutional neural
    network (CNN) framework, which proposes the use of semantics as constraints for
    recognition. Although a CNN trained for classification has no transfer ability,
    this can be encouraged by learning a hidden semantic layer together with a
    semantic code for classification. Two forms of semantic constraints are then
    introduced. The first is a loss-based regularizer that introduces a
    generalization constraint on each semantic predictor. The second is a codeword
    regularizer that favors semantic-to-class mappings consistent with prior
    semantic knowledge while allowing these to be learned from data. Significant
    improvements over the state-of-the-art are achieved on several datasets.

    Prediction and Control with Temporal Segment Models

    Nikhil Mishra, Pieter Abbeel, Igor Mordatch
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)

    We introduce a method for learning the dynamics of complex nonlinear systems
    based on deep generative models over temporal segments of states and actions.
    Unlike dynamics models that operate over individual discrete timesteps, we
    learn the distribution over future state trajectories conditioned on past
    state, past action, and planned future action trajectories, as well as a latent
    prior over action trajectories. Our approach is based on convolutional
    autoregressive models and variational autoencoders. It makes stable and
    accurate predictions over long horizons for complex, stochastic systems,
    effectively expressing uncertainty and modeling the effects of collisions,
    sensory noise, and action delays. The learned dynamics model and action prior
    can be used for end-to-end, fully differentiable trajectory optimization and
    model-based policy optimization, which we use to evaluate the performance and
    sample-efficiency of our method.


    Information Retrieval

    Impact Of Content Features For Automatic Online Abuse Detection

    Etienne Papegnies (LIA), Vincent Labatut (LIA), Richard Dufour (LIA), Georges Linares (LIA)
    Journal-ref: 18th International Conference on Computational Linguistics and
    Intelligent Text Processing, Apr 2017, Budapest, Hungary
    Subjects: Information Retrieval (cs.IR); Social and Information Networks (cs.SI)

    Online communities have gained considerable importance in recent years due to
    the increasing number of people connected to the Internet. Moderating user
    content in online communities is mainly performed manually, and reducing the
    workload through automatic methods is of great financial interest for community
    maintainers. Often, the industry uses basic approaches such as bad words
    filtering and regular expression matching to assist the moderators. In this
    article, we consider the task of automatically determining if a message is
    abusive. This task is complex since messages are written in a non-standardized
    way, including spelling errors, abbreviations, community-specific codes…
    First, we evaluate the system that we propose using standard features of online
    messages. Then, we evaluate the impact of the addition of pre-processing
    strategies, as well as original specific features developed for the community
    of an online in-browser strategy game. We finally propose to analyze the
    usefulness of this wide range of features using feature selection. This work
    can lead to two possible applications: 1) automatically flagging potentially
    abusive messages to draw the moderator’s attention to a narrow subset of
    messages; and 2) fully automating the moderation process by deciding whether a
    message is abusive without any human intervention.


    Computation and Language

    Unfolding and Shrinking Neural Machine Translation Ensembles

    Felix Stahlberg, Bill Byrne
    Comments: Submitted to EMNLP 2017
    Subjects: Computation and Language (cs.CL)

    Ensembling is a well-known technique in neural machine translation (NMT).
    Instead of a single neural net, multiple neural nets with the same topology are
    trained separately, and the decoder generates predictions by averaging over the
    individual models. Ensembling often improves the quality of the generated
    translations drastically. However, it is not suitable for production systems
    because it is cumbersome and slow. This work aims to reduce the runtime to be
    on par with a single system without compromising the translation quality.
    First, we show that the ensemble can be unfolded into a single large neural
    network which imitates the output of the ensemble system. We show that
    unfolding can already improve the runtime in practice since more work can be
    done on the GPU. We proceed by describing a set of techniques to shrink the
    unfolded network by reducing the dimensionality of layers. On Japanese-English
    we report that the resulting network has the size and decoding speed of a
    single NMT network but performs on the level of a 3-ensemble system.
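
    For a toy two-layer network, unfolding can be verified directly: stacking the
    hidden layers of the ensemble members and averaging at the output layer
    reproduces the ensemble prediction exactly. A NumPy sketch under these
    simplified (non-NMT) assumptions:

        import numpy as np

        # Two ensemble members with the same topology: y = W2 @ tanh(W1 @ x).
        d_in, d_h, d_out = 8, 16, 5
        W1a, W1b = np.random.randn(d_h, d_in), np.random.randn(d_h, d_in)
        W2a, W2b = np.random.randn(d_out, d_h), np.random.randn(d_out, d_h)

        # Unfold: stack the hidden layers; bake the averaging into the output layer.
        W1 = np.vstack([W1a, W1b])           # (2*d_h, d_in)
        W2 = np.hstack([W2a, W2b]) / 2.0     # (d_out, 2*d_h)

        x = np.random.randn(d_in)
        ensemble = (W2a @ np.tanh(W1a @ x) + W2b @ np.tanh(W1b @ x)) / 2.0
        unfolded = W2 @ np.tanh(W1 @ x)
        assert np.allclose(ensemble, unfolded)
        # Shrinking then reduces the dimensionality of the stacked hidden layer.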

    Automatic Keyword Extraction for Text Summarization: A Survey

    Santosh Kumar Bharti, Korra Sathya Babu
    Comments: 12 pages, 4 figures
    Subjects: Computation and Language (cs.CL)

    In recent times, data has been growing rapidly in every domain, such as news,
    social media, banking, and education. Due to this excess of data, there is a
    need for an automatic summarizer capable of summarizing the data, especially
    textual data, in the original document without losing any critical information.
    Text summarization has emerged as an important research area in the recent
    past. In this regard, a review of existing work on the text summarization
    process is useful for carrying out further research. In this paper, recent
    literature on automatic keyword extraction and text summarization is presented,
    since the text summarization process depends heavily on keyword extraction.
    This literature includes a discussion of the different methodologies used for
    keyword extraction and text summarization. It also discusses the different
    databases used for text summarization in several domains, along with evaluation
    metrics. Finally, it briefly discusses the issues and research challenges faced
    by researchers, along with future directions.
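
    As one classical baseline that such surveys typically cover, keywords can be
    ranked by TF-IDF weight. A short scikit-learn sketch with invented documents:

        from sklearn.feature_extraction.text import TfidfVectorizer

        docs = [
            "Text summarization depends heavily on keyword extraction.",
            "Keyword extraction identifies the most informative terms in a document.",
            "News and social media generate large volumes of textual data every day.",
        ]
        vec = TfidfVectorizer(stop_words="english")
        tfidf = vec.fit_transform(docs)
        terms = vec.get_feature_names_out()

        # Top-3 keywords of the first document by TF-IDF weight.
        row = tfidf[0].toarray().ravel()
        print([terms[i] for i in row.argsort()[::-1][:3]])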

    Persian Wordnet Construction using Supervised Learning

    Zahra Mousavi, Heshaam Faili
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

    This paper presents an automated supervised method for Persian wordnet
    construction. Using a Persian corpus and a bi-lingual dictionary, the initial
    links between Persian words and Princeton WordNet synsets have been generated.
    These links are later discriminated as correct or incorrect by employing
    seven features in a trained classification system. The whole method is thus a
    classification system, trained on a training set containing FarsNet
    as a set of correct instances. State-of-the-art results on the automatically
    derived Persian wordnet are achieved. The resulting wordnet, with a precision of
    91.18%, includes more than 16,000 words and 22,000 synsets.

    Later-stage Minimum Bayes-Risk Decoding for Neural Machine Translation

    Raphael Shu, Hideki Nakayama
    Subjects: Computation and Language (cs.CL)

    Sequence generation models have long relied on the beam search
    algorithm to generate output sequences. However, the correctness of beam search
    degrades when a model is over-confident about a suboptimal prediction. In
    this paper, we propose to perform minimum Bayes-risk (MBR) decoding for some
    extra steps at a later stage. In order to speed up MBR decoding, we compute the
    Bayes risks on GPU in batch mode. In our experiments, we found that MBR
    reranking works with a large beam size. Later-stage MBR decoding is shown to
    outperform simple MBR reranking in machine translation tasks.
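
    A minimal sketch of MBR reranking over an n-best list, using a crude
    unigram-overlap similarity in place of the BLEU-style gain typically used in
    machine translation; the hypotheses and probabilities are invented:

        def unigram_overlap(h, r):
            """Crude sentence similarity standing in for (1 - risk)."""
            h_set, r_set = set(h.split()), set(r.split())
            return len(h_set & r_set) / max(len(h_set | r_set), 1)

        def mbr_rerank(hypotheses, probs):
            """Pick the hypothesis with maximum expected similarity (minimum
            expected risk) against all candidates, weighted by model probability."""
            def expected_gain(h):
                return sum(p * unigram_overlap(h, r) for r, p in zip(hypotheses, probs))
            return max(hypotheses, key=expected_gain)

        hyps = ["the cat sat on the mat", "a cat sat on a mat", "the dog sat on the mat"]
        print(mbr_rerank(hyps, probs=[0.4, 0.35, 0.25]))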

    Composite Task-Completion Dialogue System via Hierarchical Deep Reinforcement Learning

    Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong
    Comments: 13 pages, 7 figures
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    In a composite-domain task-completion dialogue system, a conversation agent
    often switches among multiple sub-domains before it successfully completes the
    task. In such a scenario, a standard deep reinforcement learning based
    dialogue agent may struggle to find a good policy due to issues such as
    increased state and action spaces, high sample-complexity demands, and sparse
    rewards over long horizons. In this paper, we propose a hierarchical
    deep reinforcement learning approach which can operate at different temporal
    scales and is intrinsically motivated to attack these problems. Our
    hierarchical network consists of two levels: a top-level meta-controller for
    subgoal selection and a low-level controller for dialogue policy learning.
    Subgoals selected by the meta-controller, together with intrinsic rewards, guide
    the controller to explore the state-action space effectively and mitigate the
    sparse reward and long horizon problems. Experiments on both simulations and
    human evaluation show that our model significantly outperforms flat deep
    reinforcement learning agents in terms of success rate, rewards and user
    rating.

    Automatic semantic role labeling on non-revised syntactic trees of journalistic texts

    Nathan Siegle Hartmann, Magali Sanches Duran, Sandra Maria Aluísio
    Comments: PROPOR International Conference on the Computational Processing of Portuguese, 2016, 8 pages
    Journal-ref: PROPOR 2016. Springer. Lecture Notes in Computer Science volume
    9727 (2016) pgs. 202-212
    Subjects: Computation and Language (cs.CL)

    Semantic Role Labeling (SRL) is a Natural Language Processing task that
    enables the detection of events described in sentences and the participants of
    these events. For Brazilian Portuguese (BP), two recently concluded studies
    perform SRL on journalistic texts. [1] obtained F1-measure
    scores of 79.6 using the PropBank.Br corpus, which has manually revised
    syntactic trees; [8], without using a treebank for training, obtained
    F1-measure scores of 68.0 for the same corpus. However, the use of manually
    revised syntactic trees for this task does not represent a real scenario of
    application. The goal of this paper is to evaluate the performance of SRL on
    revised and non-revised syntactic trees using a larger and balanced corpus of
    BP journalistic texts. First, we have shown that [1]’s system also performs
    better than [8]’s system on the larger corpus. Second, the SRL system trained
    on non-revised syntactic trees performs better over non-revised trees than a
    system trained on gold-standard data.

    Automatic Classification of the Complexity of Nonfiction Texts in Portuguese for Early School Years

    Nathan Siegle Hartmann, Livia Cucatto, Danielle Brants, Sandra Aluísio
    Comments: PROPOR International Conference on the Computational Processing of Portuguese, 2016, 9 pages
    Journal-ref: Hartmann N., Cucatto L., Brants D., Alu’isio S. (2016) Automatic
    Classification of the Complexity of Nonfiction Texts in Portuguese for Early
    School Years. In: Computational Processing of the Portuguese Language. PROPOR
    2016. Springer
    Subjects: Computation and Language (cs.CL)

    Recent research shows that most Brazilian students have serious problems
    regarding their reading skills. The full development of this skill is key for
    the academic and professional future of every citizen. Tools for classifying
    the complexity of reading materials for children aim to improve the quality of
    the model of teaching reading and text comprehension. For English, Feng's work
    [11] is considered the state of the art in grade-level prediction; it achieved
    74% accuracy in automatically classifying 4 levels of textual complexity for
    close school grades. There are no classifiers for nonfiction texts for close
    grades in Portuguese. In this article, we propose a scheme for manual
    annotation of texts in 5 grade levels, which will be used for customized
    reading to avoid both the lack of interest of students who are more advanced in
    reading and the blocking of those that still need to make further progress. We
    obtained 52% accuracy in classifying texts into 5 levels and 74% in 3
    levels. The results prove to be promising when compared to the state-of-the-art
    work.

    What we really want to find by Sentiment Analysis: The Relationship between Computational Models and Psychological State

    Hwiyeol Jo, Soo-Min Kim, Jeong Ryu
    Comments: Rejected Paper in CogSci2017. I’m sure there is no place for integrated research. arXiv admin note: text overlap with arXiv:1607.03707
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    As a first step towards modeling the emotional state of a person, we build
    sentiment analysis models with existing deep neural network algorithms and
    compare the models with psychological measurements to elucidate the
    relationship. In the experiments, we first examined the psychological state of
    64 participants and asked them to summarize the story of a book, Chronicle of a
    Death Foretold (Marquez, 1981). Secondly, we trained models using 365,802
    crawled movie reviews; we then evaluated the participants’ summaries using the
    pretrained model, as a form of transfer learning. Based on the background that
    emotion affects memory, we investigated the relationship between the evaluation
    scores of the summaries from the computational models and the examined
    psychological measurements. The results show that although CNN performed best
    among the deep neural network algorithms (LSTM, GRU), its results are not
    related to the psychological state. Rather, GRU shows more explainable results
    depending on the psychological state. The contribution of this paper can be
    summarized as follows: (1) we elucidate the relationship between computational
    models and psychological measurements, and (2) we suggest this framework as an
    objective method to evaluate emotion: the real sentiment analysis of a person.


    Distributed, Parallel, and Cluster Computing

    Portable, high-performance containers for HPC

    Lucas Benedicic, Felipe A. Cruz, Alberto Madonna, Kean Mariotti
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Building and deploying software on high-end computing systems is a
    challenging task. High performance applications have to reliably run across
    multiple platforms and environments, and make use of site-specific resources
    while resolving complicated software-stack dependencies. Containers are a type
    of lightweight virtualization technology that attempt to solve this problem by
    packaging applications and their environments into standard units of software
    that are portable and easy to build and deploy, and that have a small footprint
    and low runtime overhead. In this work we present an extension to the container
    runtime of Shifter that provides containerized applications with a mechanism to
    access GPU accelerators and specialized networking from the host system,
    effectively enabling performance portability of containers across HPC resources.
    The presented extension makes it possible to rapidly deploy high-performance software
    on supercomputers from containerized applications that have been developed,
    built, and tested in non-HPC commodity hardware, e.g. the laptop or workstation
    of a researcher.

    A Domain Specific Language for Performance Portable Molecular Dynamics Algorithms

    William R. Saunders, James Grant, Eike H. Müller
    Comments: 20 pages, 12 figures, 6 tables, submitted to Computer Physics Communications
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Software Engineering (cs.SE); Computational Physics (physics.comp-ph)

    Developers of Molecular Dynamics (MD) codes face significant challenges when
    adapting existing simulation packages to new hardware. In a continuously
    diversifying hardware landscape it becomes increasingly difficult for
    scientists to be experts both in their own domain (physics/chemistry/biology)
    and specialists in the low level parallelisation and optimisation of their
    codes. To address this challenge, we describe a “Separation of Concerns”
    approach for the development of parallel and optimised MD codes: the science
    specialist writes code at a high abstraction level in a domain specific
    language (DSL), which is then translated into efficient computer code by a
    scientific programmer. In a related context, an abstraction for the solution of
    partial differential equations with grid based methods has recently been
    implemented in the (Py)OP2 library. Inspired by this approach, we develop a
    Python code generation system for molecular dynamics simulations on different
    parallel architectures, including massively parallel distributed memory systems
    and GPUs. We demonstrate the efficiency of the auto-generated code by studying
    its performance and scalability on different hardware and compare it to other
    state-of-the-art simulation packages. With growing data volumes the extraction
    of physically meaningful information from the simulation becomes increasingly
    challenging and requires equally efficient implementations. A particular
    advantage of our approach is the easy expression of such analysis algorithms.
    We consider two popular methods for deducing the crystalline structure of a
    material from the local environment of each atom, show how they can be
    expressed in our abstraction and implement them in the code generation
    framework.

    Speeding up Consensus by Chasing Fast Decisions

    Balaji Arun, Sebastiano Peluso, Roberto Palmieri, Giuliano Losa, Binoy Ravindran
    Comments: To be published in the 47th IEEE/IFIP International Conference on Dependable Systems and Networks
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    This paper proposes CAESAR, a novel multi-leader Generalized Consensus
    protocol for geographically replicated sites. The main goal of CAESAR is to
    overcome one of the major limitations of existing approaches, which is the
    significant performance degradation when application workload produces
    conflicting requests. CAESAR does that by changing the way a fast decision is
    taken: its ordering protocol does not reject a fast decision for a client
    request if a quorum of nodes reply with different dependency sets for that
    request. The effectiveness of CAESAR is demonstrated through an evaluation
    study performed on Amazon’s EC2 infrastructure using 5 geo-replicated sites.
    CAESAR outperforms other multi-leader (e.g., EPaxos) competitors by as much as
    1.7x in the presence of 30% conflicting requests, and single-leader (e.g.,
    Multi-Paxos) by up to 3.5x.

    Field of Groves: An Energy-Efficient Random Forest

    Zafar Takhirov, Joseph Wang, Marcia S. Louis, Venkatesh Saligrama, Ajay Joshi
    Comments: Submitted as Work in Progress to DAC’17
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (stat.ML)

    Machine Learning (ML) algorithms, like Convolutional Neural Networks (CNN),
    Support Vector Machines (SVM), etc. have become widespread and can achieve high
    statistical performance. However, their accuracy decreases significantly in the
    energy-constrained mobile and embedded systems space, where all computations
    need to be completed under a tight energy budget. In this work, we present a
    field of groves (FoG) implementation of random forests (RF) that achieves an
    accuracy comparable to CNNs and SVMs under tight energy budgets. Evaluation of
    the FoG shows that at comparable accuracy it consumes ~1.48x, ~24x, ~2.5x, and
    ~34.7x lower energy per classification compared to conventional RF, SVM_RBF,
    MLP, and CNN, respectively. FoG is ~6.5x less energy efficient than SVM_LR, but
    achieves 18% higher accuracy on average across all considered datasets.

    Gang-GC: Locality-aware Parallel Data Placement Optimizations for Key-Value Storages

    Duarte Patrício, José Simão, Luís Veiga
    Subjects: Programming Languages (cs.PL); Distributed, Parallel, and Cluster Computing (cs.DC)

    Many cloud applications rely on fast and non-relational storage to aid in the
    processing of large amounts of data. Managed runtimes are now widely used to
    support the execution of several storage solutions of the NoSQL movement,
    particularly when dealing with big data key-value store-driven applications.
    The benefits of these runtimes can however be limited by modern parallel
    throughput-oriented GC algorithms, where related objects have the potential to
    be dispersed in memory, either in the same or different generations. In the
    long run this causes more page faults and degradation of locality on
    system-level memory caches.

    We propose Gang-GC, an extension to modern heap layouts and to a parallel GC
    algorithm to promote locality between groups of related objects. This is done
    without extensive profiling of the applications and in a way that is
    transparent to the programmer, without the need to use specialized data
    structures. The heap layout and algorithmic extensions were implemented over
    the Parallel Scavenge garbage collector of the HotSpot JVM.

    Using microbenchmarks that capture the architecture of several key-value
    store databases, we show negligible overhead in frequent operations such as
    the allocation of new objects, and improvements to the access speed of data,
    supported by fewer misses in system-level memory caches. Overall, we show a 6%
    improvement in the average time of read and update operations and an average
    decrease of 12.4% in page faults.


    Learning

    ENWalk: Learning Network Features for Spam Detection in Twitter

    K C Santosh, Suman Kalyan Maity, Arjun Mukherjee
    Subjects: Learning (cs.LG); Social and Information Networks (cs.SI)

    Social media platforms are increasing their influence, with vast public
    information leading to their active use for marketing by companies and
    organizations. Such marketing promotions are difficult to identify, unlike those
    in traditional media such as TV and newspapers. It is therefore important to
    identify the promoters in social media. Although there is active ongoing
    research, existing approaches are far from solving the problem. To identify such
    imposters, it is important to understand their strategies of social
    circle creation and their dynamics of content posting. Are there any specific
    spammer types? How successful is each type? We analyze these questions in the
    light of social relationships on Twitter. Our analyses discover two types of
    spammers and their relationships with the dynamics of content posts. Our results
    reveal novel dynamics of spamming which are intuitive and arguable. We
    propose ENWalk, a framework to detect spammers by learning feature
    representations of the users in the social media. We learn the feature
    representations using random walks biased on the spam dynamics.
    Experimental results on a large-scale Twitter network and the corresponding
    tweets show the effectiveness of our approach, which outperforms existing
    approaches.
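
    A minimal sketch of a weight-biased random walk in Python; the graph, the edge
    weights standing in for spam-dynamics scores, and the function name are invented
    for illustration:

        import random

        def biased_random_walk(graph, weights, start, length):
            """One random walk whose next-hop choice is biased by edge weights,
            e.g. weights derived from spam-dynamics scores of neighboring users."""
            walk = [start]
            for _ in range(length - 1):
                nbrs = graph[walk[-1]]
                if not nbrs:
                    break
                w = [weights.get((walk[-1], n), 1.0) for n in nbrs]
                walk.append(random.choices(nbrs, weights=w, k=1)[0])
            return walk

        graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
        weights = {("a", "b"): 3.0, ("a", "c"): 1.0}
        print(biased_random_walk(graph, weights, "a", 5))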

    Simplified Stochastic Feedforward Neural Networks

    Kimin Lee, Jaehyung Kim, Song Chong, Jinwoo Shin
    Comments: 22 pages, 6 figures
    Subjects: Learning (cs.LG)

    It has been believed that stochastic feedforward neural networks (SFNNs) have
    several advantages over deterministic deep neural networks (DNNs): they have
    more expressive power, allowing multi-modal mappings, and regularize better due
    to their stochastic nature. However, training large-scale SFNNs is notoriously
    hard. In this paper, we aim at developing efficient training methods for
    SFNNs, in particular using known architectures and pre-trained parameters of
    DNNs. To this end, we propose a new intermediate stochastic model, called
    Simplified-SFNN, which can be built upon any baseline DNN and approximates
    certain SFNNs by simplifying the upper latent units above stochastic ones. The
    main novelty of our approach is in establishing the connection between three
    models, i.e., DNN -> Simplified-SFNN -> SFNN, which naturally leads to an
    efficient training procedure for the stochastic models utilizing pre-trained
    parameters of DNNs. Using several popular DNNs, we show how they can be
    effectively transferred to the corresponding stochastic models for both
    multi-modal and classification tasks on the MNIST, TFD, CASIA, CIFAR-10,
    CIFAR-100 and SVHN datasets. In particular, we train a stochastic model of 28
    layers and 36 million parameters, where training such a large-scale stochastic
    network would be significantly challenging without Simplified-SFNN.

    Federated Tensor Factorization for Computational Phenotyping

    Yejin Kim, Jimeng Sun, Hwanjo Yu, Xiaoqian Jiang
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Tensor factorization models offer an effective approach to convert massive
    electronic health records into meaningful clinical concepts (phenotypes) for
    data analysis. These models need a large amount of diverse samples to avoid
    population bias. An open challenge is how to derive phenotypes jointly across
    multiple hospitals, in which direct patient-level data sharing is not possible
    (e.g., due to institutional policies). In this paper, we developed a novel
    solution to enable federated tensor factorization for computational phenotyping
    without sharing patient-level data. We developed secure data harmonization and
    federated computation procedures based on alternating direction method of
    multipliers (ADMM). Using this method, the multiple hospitals iteratively
    update tensors and transfer secure summarized information to a central server,
    and the server aggregates the information to generate phenotypes. We
    demonstrated with real medical datasets that our method resembles the
    centralized training model (based on combined datasets) in terms of accuracy
    and phenotype discovery while respecting privacy.
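
    The consensus-ADMM pattern at the heart of the method can be illustrated on a
    simpler problem. The NumPy sketch below solves a least-squares problem jointly
    across sites that never share raw data, only local iterates; the paper applies
    a similar pattern to tensor-factor updates, and all names and sizes here are
    invented:

        import numpy as np

        def federated_admm(As, bs, rho=1.0, iters=50):
            """Consensus ADMM: each site i minimizes ||A_i x - b_i||^2 locally and
            only shares its iterate x_i; the server averages into the consensus z."""
            d = As[0].shape[1]
            z = np.zeros(d)
            xs = [np.zeros(d) for _ in As]
            us = [np.zeros(d) for _ in As]
            for _ in range(iters):
                for i, (A, b) in enumerate(zip(As, bs)):   # local updates, on-site
                    lhs = 2 * A.T @ A + rho * np.eye(d)
                    rhs = 2 * A.T @ b + rho * (z - us[i])
                    xs[i] = np.linalg.solve(lhs, rhs)
                z = np.mean([x + u for x, u in zip(xs, us)], axis=0)  # aggregation
                for i in range(len(As)):
                    us[i] += xs[i] - z                     # dual updates
            return z

        rng = np.random.default_rng(0)
        As = [rng.normal(size=(30, 4)) for _ in range(3)]  # three "hospitals"
        x_true = rng.normal(size=4)
        bs = [A @ x_true + 0.01 * rng.normal(size=30) for A in As]
        print(federated_admm(As, bs))                      # close to x_true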

    WRPN: Training and Inference using Wide Reduced-Precision Networks

    Asit Mishra, Jeffrey J Cook, Eriko Nurvitadhi, Debbie Marr
    Comments: Under submission to CVPR Workshop
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    For computer vision applications, prior works have shown the efficacy of
    reducing the numeric precision of model parameters (network weights) in deep
    neural networks, but also that reducing the precision of activations hurts model
    accuracy much more than reducing the precision of model parameters. We study
    schemes to train networks from scratch using reduced-precision activations
    without hurting model accuracy. We reduce the precision of activation maps
    (along with model parameters) using a novel quantization scheme and increase
    the number of filter maps in a layer, and find that this scheme matches or
    surpasses the accuracy of the baseline full-precision network. As a result, one
    can significantly reduce the dynamic memory footprint, memory bandwidth and
    computational energy, and speed up the training and inference process with
    appropriate hardware support. We call our scheme WRPN – wide reduced-precision
    networks. We report results using our proposed schemes and show that they are
    better than previously reported accuracies on the ILSVRC-12 dataset
    while being computationally less expensive than previously reported
    reduced-precision networks.

    Data-efficient Deep Reinforcement Learning for Dexterous Manipulation

    Ivaylo Popov, Nicolas Heess, Timothy Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerik, Thomas Lampe, Yuval Tassa, Tom Erez, Martin Riedmiller
    Comments: 12 pages, 5 Figures
    Subjects: Learning (cs.LG); Robotics (cs.RO)

    Deep learning and reinforcement learning methods have recently been used to
    solve a variety of problems in continuous control domains. An obvious
    application of these techniques is dexterous manipulation tasks in robotics
    which are difficult to solve using traditional control theory or
    hand-engineered approaches. One example of such a task is to grasp an object
    and precisely stack it on another. Solving this difficult and practically
    relevant problem in the real world is an important long-term goal for the field
    of robotics. Here we take a step towards this goal by examining the problem in
    simulation and providing models and techniques aimed at solving it. We
    introduce two extensions to the Deep Deterministic Policy Gradient algorithm
    (DDPG), a model-free Q-learning based method, which make it significantly more
    data-efficient and scalable. Our results show that by making extensive use of
    off-policy data and replay, it is possible to find control policies that
    robustly grasp objects and stack them. Further, our results hint that it may
    soon be feasible to train successful stacking policies by collecting
    interactions on real robots.

    Learning from Multi-View Structural Data via Structural Factorization Machines

    Chun-Ta Lu, Lifang He, Hao Ding, Philip S. Yu
    Comments: 9 pages
    Subjects: Learning (cs.LG)

    Real-world relations among entities can often be observed and determined by
    different perspectives/views. For example, the decision made by a user on
    whether to adopt an item relies on multiple aspects such as the contextual
    information of the decision, the item’s attributes, the user’s profile and the
    reviews given by other users. Different views may exhibit multi-way
    interactions among entities and provide complementary information. In this
    paper, we introduce a multi-tensor-based approach that can preserve the
    underlying structure of multi-view data in a generic predictive model.
    Specifically, we propose structural factorization machines (SFMs) that learn
    the common latent spaces shared by multi-view tensors and automatically adjust
    the importance of each view in the predictive model. Furthermore, the
    complexity of SFMs is linear in the number of parameters, which makes SFMs
    suitable for large-scale problems. Extensive experiments on real-world datasets
    demonstrate that the proposed SFMs outperform several state-of-the-art methods
    in terms of prediction accuracy and computational cost.

    A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction

    Yao Qin, Dongjin Song, Haifeng Cheng, Wei Cheng, Guofei Jiang, Garrison Cottrell
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    The nonlinear autoregressive exogenous (NARX) model, which predicts the
    current value of a time series based upon its previous values as well as the
    current and past values of multiple driving (exogenous) series, has been
    studied for decades. Despite the fact that various NARX models have been
    developed, few of them can capture the long-term temporal dependencies
    appropriately and select the relevant driving series to make predictions. In
    this paper, we propose a dual-stage attention-based recurrent neural network
    (DA-RNN) to address these two issues. In the first stage, we introduce an input
    attention mechanism to adaptively extract the relevant driving series (a.k.a.
    input features) at each time step by referring to the previous encoder hidden
    state. In the second stage, we use a temporal attention mechanism to select
    relevant encoder hidden states across all time steps. With this dual-stage
    attention scheme, our model can not only make predictions effectively, but can
    also be easily interpreted. Thorough empirical studies based upon the SML 2010
    dataset and the NASDAQ 100 Stock dataset demonstrate that DA-RNN can outperform
    state-of-the-art methods for time series prediction.
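
    A minimal NumPy sketch of the first-stage input attention: score each driving
    series from the previous hidden state, softmax-normalize, and reweight the
    inputs. The scoring parameterization and shapes here are simplified assumptions
    (the paper also conditions on the cell state and each series' history):

        import numpy as np

        def input_attention(drivers_t, h_prev, W, U, v):
            """Shapes: drivers_t (n,), h_prev (m,), W (p, m), U (p, 1), v (p,).
            Returns the attention-reweighted inputs and the attention weights."""
            scores = np.array([v @ np.tanh(W @ h_prev + U @ np.array([x]))
                               for x in drivers_t])
            alpha = np.exp(scores - scores.max())
            alpha /= alpha.sum()
            return alpha * drivers_t, alpha

        n, m, p = 4, 8, 6
        rng = np.random.default_rng(1)
        weighted, alpha = input_attention(
            rng.normal(size=n), rng.normal(size=m),
            rng.normal(size=(p, m)), rng.normal(size=(p, 1)), rng.normal(size=p))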

    The Space of Transferable Adversarial Examples

    Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel
    Comments: 16 pages, 7 figures
    Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Learning (cs.LG)

    Adversarial examples are maliciously perturbed inputs designed to mislead
    machine learning (ML) models at test-time. Adversarial examples are known to
    transfer across models: the same perturbed input is often misclassified by
    different models despite being generated to mislead a specific architecture.
    This phenomenon enables simple yet powerful black-box attacks against deployed
    ML systems.

    In this work, we propose novel methods for estimating the previously unknown
    dimensionality of the space of adversarial inputs. We find that adversarial
    examples span a contiguous subspace of large dimensionality and that a
    significant fraction of this space is shared between different models, thus
    enabling transferability.

    The dimensionality of the transferred adversarial subspace implies that the
    decision boundaries learned by different models are eerily close in the input
    domain, when moving away from data points in adversarial directions. A first
    quantitative analysis of the similarity of different models’ decision
    boundaries reveals that these boundaries are actually close in arbitrary
    directions, whether adversarial or benign.

    We conclude with a formal study of the limits of transferability. We show (1)
    sufficient conditions on the data distribution that imply transferability for
    simple model classes and (2) examples of tasks for which transferability fails
    to hold. This suggests the existence of defenses making models robust to
    transferability attacks—even when the model is not robust to its own
    adversarial examples.

    Sublinear Time Low-Rank Approximation of Positive Semidefinite Matrices

    Cameron Musco, David P. Woodruff
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Numerical Analysis (math.NA)

    We show how to compute a relative-error low-rank approximation to any
    positive semidefinite (PSD) matrix in sublinear time, i.e., for any (n \times n)
    PSD matrix (A), in (\tilde{O}(n \cdot \mathrm{poly}(k/\epsilon))) time we output a
    rank-(k) matrix (B), in factored form, for which (\|A-B\|_F^2 \leq
    (1+\epsilon)\|A-A_k\|_F^2), where (A_k) is the best rank-(k) approximation to
    (A). When (k) and (1/\epsilon) are not too large compared to the sparsity of
    (A), our algorithm does not need to read all entries of the matrix. Hence, we
    significantly improve upon previous (\mathrm{nnz}(A)) time algorithms based on
    oblivious subspace embeddings, and bypass an (\mathrm{nnz}(A)) time lower bound
    for general matrices (where (\mathrm{nnz}(A)) denotes the number of non-zero
    entries in the matrix).
    We prove time lower bounds for low-rank approximation of PSD matrices, showing
    that our algorithm is close to optimal. Finally, we extend our techniques to
    give sublinear time algorithms for low-rank approximation of (A) in the (often
    stronger) spectral norm metric (\|A-B\|_2^2) and for ridge regression on PSD
    matrices.
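
    A related (though not identical) sampling-based approach, the Nyström method,
    also reads only a subset of a PSD matrix's columns; a NumPy sketch for
    intuition, not a reproduction of the paper's algorithm:

        import numpy as np

        def nystrom_psd(A, k, seed=0):
            """Low-rank PSD approximation from k sampled columns:
            A ~ C @ pinv(W) @ C.T, reading only O(n*k) entries of A."""
            n = A.shape[0]
            idx = np.random.default_rng(seed).choice(n, size=k, replace=False)
            C = A[:, idx]          # sampled columns
            W = C[idx, :]          # intersection block
            return C @ np.linalg.pinv(W) @ C.T

        X = np.random.randn(200, 20)
        A = X @ X.T                # rank-20 PSD matrix
        B = nystrom_psd(A, k=40)
        print(np.linalg.norm(A - B) / np.linalg.norm(A))   # near zero here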

    Interpretable Explanations of Black Boxes by Meaningful Perturbation

    Ruth Fong, Andrea Vedaldi
    Comments: 9 pages, 10 figures, submitted to ICCV 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    As machine learning algorithms are increasingly applied to high impact yet
    high risk tasks, e.g. problems in health, it is critical that researchers can
    explain how such algorithms arrived at their predictions. In recent years, a
    number of image saliency methods have been developed to summarize where highly
    complex neural networks “look” in an image for evidence for their predictions.
    However, these techniques are limited by their heuristic nature and
    architectural constraints.

    In this paper, we make two main contributions: First, we propose a general
    framework for learning different kinds of explanations for any black box
    algorithm. Second, we introduce a paradigm that learns the minimally salient
    part of an image by directly editing it and learning from the corresponding
    changes to its output. Unlike previous works, our method is model-agnostic and
    testable because it is grounded in replicable image perturbations.

    Persian Wordnet Construction using Supervised Learning

    Zahra Mousavi, Heshaam Faili
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

    This paper presents an automated supervised method for Persian wordnet
    construction. Using a Persian corpus and a bi-lingual dictionary, the initial
    links between Persian words and Princeton WordNet synsets have been generated.
    These links are later discriminated as correct or incorrect by employing
    seven features in a trained classification system. The whole method is thus a
    classification system, trained on a training set containing FarsNet
    as a set of correct instances. State-of-the-art results on the automatically
    derived Persian wordnet are achieved. The resulting wordnet, with a precision of
    91.18%, includes more than 16,000 words and 22,000 synsets.

    On Feature Reduction using Deep Learning for Trend Prediction in Finance

    Luigi Troiano, Elena Mejuto, Pravesh Kriplani
    Comments: 6 pages, 6 figures, 5 tables
    Subjects: Trading and Market Microstructure (q-fin.TR); Learning (cs.LG)

    One of the major advantages of using Deep Learning in finance is the ability to
    embed a large collection of information into investment decisions. A way to do
    that is by means of compression, which leads us to consider a smaller feature
    space. Several studies have shown that non-linear feature reduction performed by
    Deep Learning tools is effective in price trend prediction. The focus has been
    put mainly on Restricted Boltzmann Machines (RBM) and on the output obtained by
    them. Little attention has been paid to Auto-Encoders (AE) as an alternative
    means of performing feature reduction. In this paper we investigate the
    application of both RBMs and AEs in more general terms, attempting to outline
    how architectural and input space characteristics can affect the quality of
    prediction.
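
    A minimal PyTorch auto-encoder sketch of the kind of non-linear feature
    reduction considered, with invented layer sizes and synthetic data standing in
    for normalized market features:

        import torch
        import torch.nn as nn

        # Compress 50 raw indicators to an 8-d code for a downstream trend model.
        ae = nn.Sequential(
            nn.Linear(50, 8), nn.Tanh(),   # encoder -> compressed feature space
            nn.Linear(8, 50),              # decoder reconstructs the input
        )
        opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
        x = torch.randn(256, 50)           # stand-in for normalized market features

        for _ in range(100):
            loss = nn.functional.mse_loss(ae(x), x)   # reconstruction objective
            opt.zero_grad(); loss.backward(); opt.step()

        with torch.no_grad():
            codes = ae[:2](x)              # encoder output: the reduced features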

    struc2vec: Learning Node Representations from Structural Identity

    Daniel R. Figueiredo, Leonardo F. R. Ribeiro, Pedro H. P. Saverese
    Subjects: Social and Information Networks (cs.SI); Learning (cs.LG); Machine Learning (stat.ML)

    Structural identity is a concept of symmetry in which network nodes are
    identified according to the network structure and their relationship to other
    nodes. Structural identity has been studied in theory and practice over the
    past decades, but has only recently been addressed with techniques from
    representational learning. This work presents struc2vec, a novel and flexible
    framework for learning latent representations of nodes' structural identity.
    struc2vec assesses structural similarity without using node or edge attributes,
    uses a hierarchy to measure similarity at different scales, and constructs a
    multilayer graph to encode the structural similarities and generate structural
    context for nodes. Numerical experiments indicate that state-of-the-art
    techniques for learning node representations fail in capturing stronger notions
    of structural identity, while struc2vec exhibits much superior performance in
    this task, as it overcomes limitations of prior techniques.

    Parametric Gaussian Process Regression for Big Data

    Maziar Raissi
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    This work introduces the concept of parametric Gaussian processes (PGPs),
    which is built upon the seemingly self-contradictory idea of making Gaussian
    processes parametric. Parametric Gaussian processes, by construction, are
    designed to operate in “big data” regimes where one is interested in
    quantifying the uncertainty associated with noisy data. The proposed
    methodology circumvents the well-established need for stochastic variational
    inference, a scalable algorithm for approximating posterior distributions. The
    effectiveness of the proposed approach is demonstrated using an illustrative
    example with simulated data and a benchmark dataset in the airline industry
    with approximately (6) million records.

    Composite Task-Completion Dialogue System via Hierarchical Deep Reinforcement Learning

    Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong
    Comments: 13 pages, 7 figures
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    In a composite-domain task-completion dialogue system, a conversation agent
    often switches among multiple sub-domains before it successfully completes the
    task. In such a scenario, a standard deep reinforcement learning based
    dialogue agent may struggle to find a good policy due to issues such as
    increased state and action spaces, high sample-complexity demands, and sparse
    rewards over long horizons. In this paper, we propose a hierarchical
    deep reinforcement learning approach which can operate at different temporal
    scales and is intrinsically motivated to attack these problems. Our
    hierarchical network consists of two levels: a top-level meta-controller for
    subgoal selection and a low-level controller for dialogue policy learning.
    Subgoals selected by the meta-controller, together with intrinsic rewards, guide
    the controller to explore the state-action space effectively and mitigate the
    sparse reward and long horizon problems. Experiments on both simulations and
    human evaluation show that our model significantly outperforms flat deep
    reinforcement learning agents in terms of success rate, rewards and user
    rating.

    CERN: Confidence-Energy Recurrent Network for Group Activity Recognition

    Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu
    Comments: Accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    This work is about recognizing human activities occurring in videos at
    distinct semantic levels, including individual actions, interactions, and group
    activities. The recognition is realized using a two-level hierarchy of Long
    Short-Term Memory (LSTM) networks, forming a feed-forward deep architecture,
    which can be trained end-to-end. In comparison with existing architectures of
    LSTMs, we make two key contributions that give our approach its name:
    Confidence-Energy Recurrent Network (CERN). First, instead of using the common
    softmax layer for prediction, we specify a novel energy layer (EL) for
    estimating the energy of our predictions. Second, rather than finding the
    common minimum-energy class assignment, which may be numerically unstable under
    uncertainty, we specify that the EL additionally computes the p-values of the
    solutions, and in this way estimates the most confident energy minimum. The
    evaluation on the Collective Activity and Volleyball datasets demonstrates: (i)
    advantages of our two contributions relative to the common softmax and
    energy-minimization formulations and (ii) a superior performance relative to
    the state-of-the-art approaches.

    Semantically Consistent Regularization for Zero-Shot Recognition

    Pedro Morgado, Nuno Vasconcelos
    Comments: Accepted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The role of semantics in zero-shot learning is considered. The effectiveness
    of previous approaches is analyzed according to the form of supervision
    provided. While some learn semantics independently, others only supervise the
    semantic subspace explained by training classes. Thus, the former is able to
    constrain the whole space but lacks the ability to model semantic correlations.
    The latter addresses this issue but leaves part of the semantic space
    unsupervised. This complementarity is exploited in a new convolutional neural
    network (CNN) framework, which proposes the use of semantics as constraints for
    recognition. Although a CNN trained for classification has no transfer ability,
    this can be encouraged by learning a hidden semantic layer together with a
    semantic code for classification. Two forms of semantic constraints are then
    introduced. The first is a loss-based regularizer that introduces a
    generalization constraint on each semantic predictor. The second is a codeword
    regularizer that favors semantic-to-class mappings consistent with prior
    semantic knowledge while allowing these to be learned from data. Significant
    improvements over the state-of-the-art are achieved on several datasets.

    A probabilistic data-driven model for planar pushing

    Maria Bauza, Alberto Rodriguez
    Comments: 8 pages, 11 figures
    Journal-ref: ICRA 2017
    Subjects: Robotics (cs.RO); Learning (cs.LG); Machine Learning (stat.ML)

    This paper presents a data-driven approach to model planar pushing
    interaction to predict both the most likely outcome of a push and its expected
    variability. The learned models rely on a variation of Gaussian processes with
    input-dependent noise called Variational Heteroscedastic Gaussian processes
    (VHGP) that capture the mean and variance of a stochastic function. We show
    that we can learn accurate models that outperform analytical models after less
    than 100 samples and saturate in performance with less than 1000 samples. We
    validate the results against a collected dataset of repeated trajectories, and
    use the learned models to study questions such as the nature of the variability
    in pushing, and the validity of the quasi-static assumption.

    Stochastic Neural Networks for Hierarchical Reinforcement Learning

    Carlos Florensa, Yan Duan, Pieter Abbeel
    Comments: Published as a conference paper at ICLR 2017
    Journal-ref: International Conference on Learning Representations 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)

    Deep reinforcement learning has achieved many impressive results in recent
    years. However, tasks with sparse rewards or long horizons continue to pose
    significant challenges. To tackle these important problems, we propose a
    general framework that first learns useful skills in a pre-training
    environment, and then leverages the acquired skills for learning faster in
    downstream tasks. Our approach brings together some of the strengths of
    intrinsic motivation and hierarchical methods: the learning of useful skills is
    guided by a single proxy reward, the design of which requires very minimal
    domain knowledge about the downstream tasks. Then a high-level policy is
    trained on top of these skills, significantly improving exploration and
    making it possible to tackle sparse rewards in the downstream tasks. To
    efficiently pre-train a large span of skills, we use Stochastic Neural Networks
    combined with an information-theoretic regularizer. Our experiments show that
    this combination is effective in learning a wide span of interpretable skills
    in a sample-efficient way, and can significantly boost the learning performance
    uniformly across a wide range of downstream tasks.
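
    A common way to implement such an information-theoretic regularizer is a
    mutual-information bonus that rewards states from which the active skill can
    be inferred. The sketch below is one generic instantiation (bonus equals
    log q(z|s) minus the log of a uniform prior over skills); the paper's exact
    regularizer may differ.

    ```python
    import numpy as np

    def mi_bonus(log_q_z_given_s, num_skills):
        """Generic MI-style bonus: positive when the visited state s is
        informative about the latent skill z under a learned classifier q,
        relative to the uniform prior log(1/num_skills)."""
        return log_q_z_given_s + np.log(num_skills)
    ```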


    Information Theory

    Directivity-Beamwidth Tradeoff of Massive MIMO Uplink Beamforming for High Speed Train Communication

    Xuhong Chen, Jiaxun Lu, Tao Li, Pingyi Fan, Khaled Ben Letaief
    Comments: This paper has been accepted for future publication in IEEE ACCESS. arXiv admin note: substantial text overlap with arXiv:1702.02121
    Subjects: Information Theory (cs.IT)

    High-mobility adaption and massive Multiple-input Multiple-output (MIMO)
    application are two primary evolving objectives for the next generation high
    speed train (HST) wireless communication system. In this paper, we consider how
    to design location-aware beamforming for massive MIMO systems in high traffic
    density HST networks. We first analyze the tradeoff between beam directivity
    and beamwidth, based on which we present a sensitivity analysis of positioning
    accuracy. Then, in order to guarantee highly efficient transmission, we
    formulate an optimization problem that maximizes the beam directivity under
    the restriction of diverse positioning accuracies. After that, we present
    a low-complexity beamforming design by utilizing location information, which
    requires neither eigen-decomposing (ED) the uplink channel covariance matrix
    (CCM) nor ED the downlink CCM (DCCM). Finally, we study the beamforming scheme
    in future high traffic density HST networks, with emphasis on the scenario in
    which two HSTs encounter each other. By utilizing real-time location
    information, we propose an optimal adaptive beamforming scheme to maximize the
    achievable rate region under a limited channel resource constraint. Numerical
    simulations indicate that a massive MIMO system whose positioning error stays
    below a certain threshold can guarantee the required performance with
    satisfactory transmission efficiency in the high traffic density HST scenario,
    and that the achievable rate region when two HSTs encounter each other is
    greatly improved as well.
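
    The directivity-beamwidth tradeoff can be checked numerically with a standard
    uniform linear array model: as the number of antennas grows, the main lobe
    narrows roughly as 1/N while directivity grows. This is a generic
    illustration, not the paper's HST channel model.

    ```python
    import numpy as np

    def array_factor(N, theta, d=0.5, theta0=np.pi / 2):
        """Normalized array factor of an N-element uniform linear array with
        spacing d (in wavelengths), steered to broadside theta0."""
        psi = 2 * np.pi * d * (np.cos(theta) - np.cos(theta0))
        return np.abs(np.exp(1j * np.outer(psi, np.arange(N))).sum(1)) / N

    theta = np.linspace(1e-6, np.pi - 1e-6, 200001)
    for N in (16, 64, 256):
        af = array_factor(N, theta)
        hpbw = np.ptp(theta[af >= np.sqrt(0.5)])   # half-power beamwidth (rad)
        print(N, round(np.degrees(hpbw), 3))       # beamwidth shrinks ~ 1/N
    ```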

    Enhancement of Physical Layer Security Using Destination Artificial Noise Based on Outage Probability

    Ali Rahmanpour, Vahid T. Vakili, S. Mohammad Razavizadeh
    Comments: Accepted for publication in Wireless Personal Communications (Springer)
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)

    In this paper, we study the use of Destination Artificial Noise (DAN) in
    addition to Source Artificial Noise (SAN) to enhance physical layer secrecy with an outage
    probability based approach. It is assumed that all nodes in the network (i.e.
    source, destination and eavesdropper) are equipped with multiple antennas. In
    addition, the eavesdropper is passive and its channel state and location are
    unknown at the source and destination. In our proposed scheme, by optimized
    allocation of power to the SAN, DAN and data signal, a minimum value for the
    outage probability is guaranteed at the eavesdropper, and at the same time a
    certain level of signal to noise ratio (SNR) at the destination is ensured. Our
    simulation results show that using DAN along with SAN yields a significant
    reduction in power consumption compared to methods that adopt SAN alone to
    achieve the same outage probability at the eavesdropper.
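
    For background, artificial noise is typically shaped so that it does not
    disturb the legitimate receiver. The numpy sketch below shows the standard
    construction for SAN: project the noise onto the null space of the legitimate
    channel. The paper's DAN (emitted by the destination) and its optimized power
    allocation are beyond this sketch.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    Nt, Nr = 4, 2   # source transmit antennas, destination receive antennas
    H = (rng.normal(size=(Nr, Nt)) + 1j * rng.normal(size=(Nr, Nt))) / np.sqrt(2)

    # Null-space basis of H: directions the destination cannot hear
    _, _, Vh = np.linalg.svd(H)
    null_basis = Vh.conj().T[:, Nr:]           # Nt x (Nt - Nr)

    w = rng.normal(size=Nt - Nr) + 1j * rng.normal(size=Nt - Nr)
    san = null_basis @ w                       # artificial noise vector
    print(np.linalg.norm(H @ san))             # ~1e-16: invisible to destination
    ```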

    (b)-symbol distance distribution of repeated-root cyclic codes

    Hojjat Mostafanasab, Esra Sengelen Sevim
    Comments: 6 pages
    Subjects: Information Theory (cs.IT)

    Symbol-pair codes, introduced by Cassuto and Blaum [1], were proposed for
    symbol-pair read channels. This idea is motivated by the limitations of the
    reading process in high-density data storage technologies. Yaakobi et al. [8]
    introduced codes for (b)-symbol read channels, where the read operation is
    performed as a consecutive sequence of (b>2) symbols. In this paper, we
    present a method to compute the (b)-symbol distance between two (n)-tuples,
    where (n) is a positive integer. We also determine the (b)-symbol distances
    of certain cyclic codes of length (p^e) over (mathbb{F}_{p^m}).
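
    To make the distance notion concrete, here is a minimal Python sketch of the
    (b)-symbol distance between two (n)-tuples, with cyclic reads as appropriate
    for cyclic codes: compare the length-(b) windows starting at each of the (n)
    positions.

    ```python
    def b_symbol_distance(x, y, b):
        """b-symbol Hamming distance between two n-tuples (cyclic reads):
        count positions whose length-b read windows differ."""
        n = len(x)
        assert len(y) == n
        def windows(v):
            return [tuple(v[(i + j) % n] for j in range(b)) for i in range(n)]
        return sum(wx != wy for wx, wy in zip(windows(x), windows(y)))

    # Example: pair distance (b = 2) of two binary 4-tuples differing in one symbol
    print(b_symbol_distance([0, 1, 0, 1], [0, 1, 1, 1], b=2))  # -> 2
    ```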

    Energy Efficiency in Cell-Free Massive MIMO with Zero-Forcing Precoding Design

    L. D. Nguyen, T. Q. Duong, H. Q. Ngo, K. Tourki
    Comments: Accepted for publication on IEEE Communications Letters
    Journal-ref: IEEE Communications Letters, 2017
    Subjects: Information Theory (cs.IT)

    We consider the downlink of a cell-free massive multiple-input
    multiple-output (MIMO) network where numerous distributed access points (APs)
    serve a smaller number of users under time division duplex operation. An
    important issue in deploying cell-free networks is high power consumption,
    which is proportional to the number of APs. This issue has raised the question
    as to their suitability for green communications in terms of the total energy
    efficiency (bits/Joule). To tackle this, we develop a novel low-complexity
    power control technique with zero-forcing precoding design to maximize the
    energy efficiency of cell-free massive MIMO taking into account the backhaul
    power consumption and the imperfect channel state information.
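
    As a rough numerical illustration of the objective, the numpy sketch below
    builds a zero-forcing precoder and evaluates energy efficiency as sum rate
    divided by total consumed power. The channel model, power figures, and noise
    level are assumptions; the paper's actual contribution is the power-control
    optimization, which is not reproduced here.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    M, K = 32, 8                          # single-antenna APs and users (assumed)
    H = (rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))) / np.sqrt(2)

    W = H.conj().T @ np.linalg.inv(H @ H.conj().T)   # zero-forcing precoder
    W /= np.linalg.norm(W, axis=0)                    # unit-power columns

    p = np.full(K, 0.1)                  # per-user transmit powers in W (assumed)
    G = np.abs(H @ W) ** 2               # effective gains; off-diagonals ~ 0 (ZF)
    sig = p * np.diag(G)
    sinr = sig / ((G @ p - sig) + 1e-3)  # noise power = 1e-3 (assumed)
    rate = np.log2(1 + sinr).sum()       # sum spectral efficiency (bits/s/Hz)

    P_circuit, P_backhaul = 1.0, 0.5     # fixed power terms (assumed)
    print(rate / (p.sum() + P_circuit + P_backhaul))  # EE in bits/Joule/Hz
    ```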

    Uplink Multiuser Massive MIMO Systems with Low-Resolution ADCs: A Coding-Theoretic Approach

    Song-Nam Hong, Seonho Kim, Namyoon Lee
    Comments: Submitted to IEEE TWC
    Subjects: Information Theory (cs.IT)

    This paper considers an uplink multiuser massive
    multiple-input-multiple-output (MIMO) system with low-resolution
    analog-to-digital converters (ADCs), in which K single-antenna users
    communicate with one base station (BS) equipped with Nr antennas. In this system, we
    present a novel multiuser MIMO detection framework that is inspired by coding
    theory. The key idea of the proposed framework is to create a code C of length
    2Nr over a spatial domain. This code is constructed by a so-called
    auto-encoding function that is not designable but is completely described by a
    channel transformation followed by a quantization function of the ADCs. From
    this point of view, we convert a multiuser MIMO detection problem into an
    equivalent channel coding problem, in which a codeword of C corresponding to
    users’ messages is sent over 2Nr parallel channels, each with different channel
    reliability. For the resulting problem, we propose a novel weighted minimum
    distance decoding (wMDD) that appropriately exploits the unequal channel
    reliabilities. It is shown that the proposed wMDD yields a non-trivial gain
    over the conventional minimum distance decoding (MDD). From a coding-theoretic
    viewpoint, we show that the bit-error-rate (BER) decreases exponentially with
    the minimum distance of the code C, which plays a role similar to that of the condition
    number in conventional MIMO systems. Furthermore, we develop the communication
    method that uses the wMDD for practical scenarios where the BS has no knowledge
    of channel state information. Finally, numerical results are provided to verify
    the superiority of the proposed method.
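
    The decoding rule itself is simple to state: weight each coded position by
    its reliability and pick the codeword with the smallest weighted mismatch. A
    toy Python sketch follows; the codebook and weights are hypothetical, and the
    paper derives the weights from the channel.

    ```python
    import numpy as np

    def wmdd(r, codebook, weights):
        """Weighted minimum distance decoding sketch: each coded position has
        its own reliability weight; return the message whose codeword has the
        smallest weighted mismatch with the observed word r."""
        best, best_cost = None, np.inf
        for msg, c in codebook.items():
            cost = np.sum(weights * (np.asarray(c) != np.asarray(r)))
            if cost < best_cost:
                best, best_cost = msg, cost
        return best

    # Toy example (higher weight = more reliable position); plain MDD would tie
    codebook = {0: [0, 0, 0, 0], 1: [1, 1, 1, 1]}
    weights = np.array([0.9, 0.2, 0.2, 0.9])
    print(wmdd([1, 0, 0, 1], codebook, weights))   # -> 1
    ```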

    Phase Retrieval via Sparse Wirtinger Flow

    Ziyang Yuan, Qi Wang, Hongxia Wang
    Subjects: Information Theory (cs.IT)

    The phase retrieval (PR) problem is an ill-conditioned inverse problem that
    arises in a variety of applications. Utilizing the sparsity prior, an
    algorithm called SWF (Sparse Wirtinger Flow) is proposed in this paper to
    solve the sparse PR problem, based on the Wirtinger flow method. SWF first
    recovers the support of the signal and then updates the estimate by a hard
    thresholding method with an elaborate initialization. Theoretical analyses
    show that SWF converges geometrically for any (k)-sparse signal of length (n)
    with sampling complexity (mathcal{O}(k^2mathrm{log}n)). To reach (varepsilon)
    accuracy, the computational complexity of SWF is
    (mathcal{O}(k^3nmathrm{log}nmathrm{log}frac{1}{varepsilon})).

    Numerical tests also demonstrate that SWF performs better than
    state-of-the-art methods, especially when there is no prior knowledge about
    the sparsity (k). Moreover, SWF is also robust to noise.
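
    For intuition, one hard-thresholded Wirtinger-flow iteration for real-valued
    sparse phase retrieval from measurements (y_i=|a_i^Tx|^2) might look as
    follows. The step size is illustrative, and the paper's support-recovery
    stage and elaborate initialization are omitted.

    ```python
    import numpy as np

    def swf_step(z, A, y, k, mu=0.1):
        """One illustrative sparse Wirtinger-flow step: gradient descent on
        the least-squares loss (up to constant factors), then keep only the
        k largest-magnitude entries (hard thresholding)."""
        Az = A @ z
        grad = A.T @ ((Az**2 - y) * Az) / len(y)
        z = z - mu * grad
        z[np.argsort(np.abs(z))[:-k]] = 0.0
        return z
    ```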

    Error Bounds for Uplink and Downlink 3D Localization in 5G mmWave Systems

    Zohair Abu-Shaban, Xiangyun Zhou, Thushara Abhayapala, Gonzalo Seco-Granados, Henk Wymeersch
    Comments: This work has been submitted to the IEEE Transactions on Wireless Communications
    Subjects: Information Theory (cs.IT)

    Location-aware communication systems are expected to play a pivotal part in
    the next generation of mobile communication networks. Therefore, there is a
    need to understand the localization limits in these networks, particularly,
    using millimeter-wave technology (mmWave). Towards that, we address the uplink
    and downlink localization limits in terms of 3D position and orientation error
    bounds for mmWave multipath channels. We also carry out a detailed analysis of
    the dependence of the bounds on different system parameters. Our key findings
    indicate that the uplink and downlink behave differently in two distinct ways.
    First of all, the error bounds have different scaling factors with respect to
    the number of antennas in the uplink and downlink. Secondly, uplink
    localization is sensitive to the orientation angle of the user equipment (UE),
    whereas downlink is not. Moreover, in the considered outdoor scenarios, the
    non-line-of-sight paths generally improve localization when a line-of-sight
    path exists. Finally, our numerical results show that mmWave systems are
    capable of localizing a UE with sub-meter position error, and sub-degree
    orientation error.

    Error Vector Magnitude Analysis in Generalized Fading with Co-Channel Interference

    Sudharsan Parthasarathy, Suman Kumar, Radha Krishna Ganti, Sheetal Kalyani, K. Giridhar
    Subjects: Information Theory (cs.IT)

    In this paper, we derive the data-aided Error Vector Magnitude (EVM) in an
    interference limited system when both the desired signal and interferers
    experience independent and non identically distributed (kappa)-(mu) shadowed
    fading. Then it is analytically shown that the EVM is equal to the square root
    of the number of interferers when the desired signal and interferers do not
    experience fading. Further, EVM is derived in the presence of interference and
    noise, when the desired signal experiences (kappa)-(mu) shadowed fading and
    the interferers experience independent and identical Nakagami fading. Moreover,
    using the properties of the special functions, the derived EVM expressions are
    also simplified for various special cases.
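
    For reference, the data-aided EVM is the RMS error between received and known
    reference symbols, normalized by the RMS reference magnitude (normalization
    conventions vary across standards):

    ```python
    import numpy as np

    def data_aided_evm(received, reference):
        """RMS error vector magnitude against known (data-aided) reference
        symbols, normalized by the RMS reference magnitude."""
        r, s = np.asarray(received), np.asarray(reference)
        return np.sqrt(np.mean(np.abs(r - s) ** 2) / np.mean(np.abs(s) ** 2))

    # Example: QPSK symbols with a small additive perturbation
    s = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
    print(data_aided_evm(s + 0.05, s))
    ```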

    Resolution-Adaptive Hybrid MIMO Architectures for Millimeter Wave Communications

    Jinseok Choi, Brian L. Evans, Alan Gatherer
    Comments: Submitted to IEEE Transactions on Signal Processing
    Subjects: Information Theory (cs.IT)

    Hybrid analog-digital beamforming architectures with low-resolution
    analog-to-digital converters (ADCs) reduce hardware cost and power consumption
    in multiple-input multiple-output (MIMO) millimeter wave (mmWave) communication
    systems. In this paper, we propose a hybrid architecture with
    resolution-adaptive ADCs for mmWave receivers with large antenna arrays. We
    adopt array response vectors for the analog combiners and derive ADC bit
    allocation (BA) algorithms. The two proposed BA algorithms minimize the mean
    square quantization error of received analog signals under a total ADC power
    constraint. It is beneficial to assign more bits to the ADC with a larger
    channel gain on the corresponding radio frequency (RF) chain, and the optimal
    number of ADC bits is logarithmically proportional to the RF chain’s
    signal-to-noise ratio raised to the 1/3 power. Contributions of this paper
    include 1) an ADC bit allocation algorithm to improve communication performance
    of a hybrid MIMO receiver, 2) a revised ADC bit allocation algorithm that is
    robust to additive noise, and 3) a worst-case analysis of the ergodic rate of
    the proposed MIMO receiver that quantifies system tradeoffs and serves as a
    lower bound. Simulation results validate the ergodic rate formula and
    demonstrate that the proposed BA algorithms outperform a fixed-ADC approach in
    both spectral and energy efficiency. For a power constraint equivalent to that
    of fixed 4-bit ADCs, the revised BA algorithm makes the quantization error
    negligible while achieving 22% better energy efficiency. Having negligible
    quantization error allows existing state-of-the-art digital beamforming
    techniques to be readily applied to the proposed system.
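
    The reported scaling suggests a simple allocation heuristic: give each RF
    chain bits proportional to log2 of its SNR raised to the 1/3 power, then
    shift to meet a total budget. The sketch below is only that heuristic; the
    paper's algorithms minimize quantization error under a total ADC power
    constraint, which is a different (power, not bit-sum) budget.

    ```python
    import numpy as np

    def allocate_bits(snr_per_rf, total_bits):
        """Heuristic ADC bit allocation following the log2(SNR^(1/3)) scaling:
        stronger RF chains get finer quantization."""
        base = np.log2(snr_per_rf) / 3.0
        shift = (total_bits - base.sum()) / len(base)
        return np.maximum(np.round(base + shift), 0).astype(int)

    print(allocate_bits(np.array([100.0, 10.0, 1.0, 0.1]), total_bits=16))
    # -> e.g. [6 5 3 2]
    ```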

    Network Information Science

    Henrique F. de Arruda, Filipi N. Silva, Cesar H. Comin, Diego R. Amancio, Luciano da F. Costa
    Subjects: Information Theory (cs.IT); Social and Information Networks (cs.SI); Data Analysis, Statistics and Probability (physics.data-an); Physics and Society (physics.soc-ph)

    A framework integrating information theory and network science is proposed,
    giving rise to a potentially new area of network information science. By
    incorporating and integrating concepts such as complexity, coding, topological
    projections and network dynamics, the proposed network-based framework paves
    the way not only to extending traditional information science, but also to
    modeling, characterizing and analyzing a broad class of real-world problems,
    from language communication to DNA coding. Basically, an original network is
    to be transmitted, with or without compaction, through a time series
    obtained by sampling its topology by some network dynamics, such as random
    walks. We show that the degree of compression is ultimately related to the
    ability to predict the frequency of symbols based on the topology of the
    original network and the adopted dynamics. The potential of the proposed
    approach is illustrated with respect to the efficiency of transmitting
    topologies of several network models by using a variety of random walks.
    Several interesting results are obtained, including the behavior of the BA
    model oscillating between high and low performance depending on the considered
    dynamics, and the distinct performances obtained for two analogous geographical
    models.
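
    A toy experiment in this spirit: sample a network with a random walk, emit
    the visited nodes' degrees as symbols, and estimate the entropy of the
    resulting stream, a proxy for how well it compresses. The graph model, walk
    length, and choice of degree as the symbol are illustrative assumptions.

    ```python
    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(2)
    G = nx.barabasi_albert_graph(500, 3, seed=2)     # BA model

    node, symbols = 0, []
    for _ in range(20000):                           # uniform random walk
        node = rng.choice(list(G.neighbors(node)))
        symbols.append(G.degree[node])

    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    print(-(p * np.log2(p)).sum())                   # empirical bits/symbol
    ```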

    Constant Modulus Beamforming via Convex Optimization

    Amir Adler, Mati Wax
    Subjects: Information Theory (cs.IT)

    We present novel convex-optimization-based solutions to the problem of blind
    beamforming of constant modulus signals, and to the related problem of linearly
    constrained blind beamforming of constant modulus signals. These solutions
    ensure global optimality and are parameter-free, namely, they contain no
    tunable parameters and require no a priori parameter settings. The
    performance of these solutions, as demonstrated by simulated data, is superior
    to existing methods.

    Optimized Data Pre-Processing for Discrimination Prevention

    Flavio P. Calmon, Dennis Wei, Karthikeyan Natesan Ramamurthy, Kush R. Varshney
    Subjects: Machine Learning (stat.ML); Computers and Society (cs.CY); Information Theory (cs.IT)

    Non-discrimination is a recognized objective in algorithmic decision making.
    In this paper, we introduce a novel probabilistic formulation of data
    pre-processing for reducing discrimination. We propose a convex optimization
    for learning a data transformation with three goals: controlling
    discrimination, limiting distortion in individual data samples, and preserving
    utility. We characterize the impact of limited sample size in accomplishing
    this objective, and apply two instances of the proposed optimization to
    datasets, including one on real-world criminal recidivism. The results
    demonstrate that all three criteria can be simultaneously achieved and also
    reveal interesting patterns of bias in American society.

    A Bell state in a Penning Trap as a quantum simulator of the factorization problem

    Jose Luis Rosales, Vicente Martin
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)

    The recently introduced equivalent formulation of the integer factorization
    problem for (N=xy), obtaining a function of the primes (E(x)) within the
    factorization ensemble, is reviewed. Here we demonstrate that this formulation
    can be readily translated to the physics of a quantum device in which the
    quantities (E_k) are the eigenvalues of a bounded Hamiltonian. The spectrum is
    solved for (x_k=o(sqrt N)), leading to a super-efficient probabilistic quantum
    factoring algorithm which only requires (o((log sqrt N)^3)) input
    calculations. The state of the quantum simulator can be identified as that of a
    two-electron (mathbf{P}) wave in a Penning trap. We consider the possibility
    of building the simulator experimentally in order to obtain, from the measured
    magnetron trap frequencies, the probabilistic quantum sieve for the possible
    factors of (N). This approach is suited for large (N) and (x=o(sqrt{N})).



