
    arXiv Paper Daily: Thu, 16 Mar 2017

    Published by 我爱机器学习 (52ml.net) on 2017-03-16 00:00:00

    Neural and Evolutionary Computing

    Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG

    Robin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer, Martin Glasstetter, Katharina Eggensperger, Michael Tangermann, Frank Hutter, Wolfram Burgard, Tonio Ball
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Deep learning with convolutional neural networks (deep ConvNets) has
    revolutionized computer vision through end-to-end learning, i.e. learning from
    the raw data. Now, there is increasing interest in using deep ConvNets for
    end-to-end EEG analysis. However, little is known about many important aspects
    of how to design and train ConvNets for end-to-end EEG decoding, and there is
    still a lack of techniques to visualize the informative EEG features the
    ConvNets learn.

    Here, we studied deep ConvNets with a range of different architectures,
    designed for decoding imagined or executed movements from raw EEG. Our results
    show that recent advances from the machine learning field, including batch
    normalization and exponential linear units, together with a cropped training
    strategy, boosted the deep ConvNets’ decoding performance, reaching or
    surpassing that of the widely-used filter bank common spatial patterns (FBCSP)
    decoding algorithm. While FBCSP is designed to use spectral power modulations,
    the features used by ConvNets are not fixed a priori. Our novel methods for
    visualizing the learned features demonstrated that ConvNets indeed learned to
    use spectral power modulations in the alpha, beta and high gamma frequencies.
    These methods also proved useful as a technique for spatially mapping the
    learned features, revealing the topography of the causal contributions of
    features in different frequency bands to decoding the movement classes.

    Our study thus shows how to design and train ConvNets to decode
    movement-related information from the raw EEG without handcrafted features and
    highlights the potential of deep ConvNets combined with advanced visualization
    techniques for EEG-based brain mapping.
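
    As a concrete illustration, the following is a minimal PyTorch sketch in
    the spirit of the shallower end of the architectures studied here,
    combining a temporal and a spatial convolution with batch normalization
    and exponential linear units. All layer sizes, the electrode count and
    the cropping scheme are illustrative assumptions, not the authors' exact
    design.

    ```python
    # Minimal sketch of a shallow ConvNet for raw-EEG decoding; channel
    # counts, kernel sizes and the 4-class setup are illustrative only.
    import torch
    import torch.nn as nn

    class ShallowEEGNet(nn.Module):
        def __init__(self, n_electrodes=44, n_classes=4):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 40, kernel_size=(1, 25)),             # temporal conv
                nn.Conv2d(40, 40, kernel_size=(n_electrodes, 1)),  # spatial conv
                nn.BatchNorm2d(40),                                # batch normalization
                nn.ELU(),                                          # exponential linear unit
                nn.AvgPool2d(kernel_size=(1, 75), stride=(1, 15)),
            )
            self.classifier = nn.LazyLinear(n_classes)

        def forward(self, x):          # x: (batch, 1, electrodes, time)
            return self.classifier(self.features(x).flatten(1))

    # "Cropped training": slide fixed-length crops over each trial and treat
    # every crop as a training example carrying the trial's label.
    net = ShallowEEGNet()
    logits = net(torch.randn(8, 1, 44, 500))   # 8 crops of 500 time samples
    ```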

    Neural Programming by Example

    Chengxun Shu, Hongyu Zhang
    Comments: 7 pages, Association for the Advancement of Artificial Intelligence (AAAI)
    Journal-ref: AAAI-2017
    Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Software Engineering (cs.SE)

    Programming by Example (PBE) aims to automatically infer a computer
    program for accomplishing a certain task from sample inputs and outputs. In
    this paper, we propose a deep neural network (DNN) based PBE model called
    Neural Programming by Example (NPBE), which can learn from input-output
    strings and induce programs that solve string manipulation problems. Our
    NPBE model has
    four neural network based components: a string encoder, an input-output
    analyzer, a program generator, and a symbol selector. We demonstrate the
    effectiveness of NPBE by training it end-to-end to solve some common string
    manipulation problems in spreadsheet systems. The results show that our model
    can induce string manipulation programs effectively. Our work is one step
    towards teaching DNNs to generate computer programs.

    Neural Graph Machines: Learning Neural Networks Using Graphs

    Thang D. Bui, Sujith Ravi, Vivek Ramavajjala
    Comments: 9 pages
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Label propagation is a powerful and flexible semi-supervised learning
    technique on graphs. Neural networks, on the other hand, have proven track
    records in many supervised learning tasks. In this work, we propose a training
    framework with a graph-regularised objective, namely “Neural Graph Machines”,
    that can combine the power of neural networks and label propagation. This work
    generalises previous literature on graph-augmented training of neural networks,
    enabling it to be applied to multiple neural architectures (Feed-forward NNs,
    CNNs and LSTM RNNs) and a wide range of graphs. The new objective allows the
    neural networks to harness both labeled and unlabeled data by: (a) allowing the
    network to train using labeled data as in the supervised setting, (b) biasing
    the network to learn similar hidden representations for neighboring nodes on a
    graph, in the same vein as label propagation. Such architectures with the
    proposed objective can be trained efficiently using stochastic gradient descent
    and scaled to large graphs, with a runtime that is linear in the number of
    edges. The proposed joint training approach convincingly outperforms many
    existing methods on a wide range of tasks (multi-label classification on social
    graphs, news categorization, document classification and semantic intent
    classification), with multiple forms of graph inputs (including graphs with and
    without node-level features) and using different types of neural networks.
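
    The graph-regularised objective admits a compact sketch: a supervised
    loss on labelled nodes plus a penalty that pulls the hidden
    representations of neighbouring nodes together, summed over edges (hence
    the runtime linear in the number of edges). The squared-L2 distance and
    the single weight alpha below are simplifying assumptions.

    ```python
    # Sketch of a graph-regularised objective in the spirit of Neural Graph
    # Machines: supervised cross-entropy plus an edge term that biases
    # neighbouring nodes toward similar hidden representations.
    import torch
    import torch.nn.functional as F

    def graph_regularised_loss(logits, labels, labeled_mask,
                               hidden, edges, edge_weights, alpha=0.1):
        # supervised part: only labelled nodes contribute
        sup = F.cross_entropy(logits[labeled_mask], labels[labeled_mask])
        # graph part: one term per edge, so the cost is linear in |E|
        u, v = edges                        # index tensors, shape (num_edges,)
        graph = (edge_weights * (hidden[u] - hidden[v]).pow(2).sum(1)).mean()
        return sup + alpha * graph
    ```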

    FastQA: A Simple and Efficient Neural Architecture for Question Answering

    Dirk Weissenborn, Georg Wiese, Laura Seiffe
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    The recent development of large-scale question answering (QA) datasets has
    triggered a substantial amount of research into end-to-end neural
    architectures for QA.
    Increasingly complex systems have been conceived without comparison to a
    simpler neural baseline system that would justify their complexity. In this
    work, we propose a simple heuristic that guided the development of FastQA, an
    efficient end-to-end neural model for question answering that is very
    competitive with existing models. We further demonstrate that an extended
    version (FastQAExt) achieves state-of-the-art results on recent benchmark
    datasets, namely SQuAD, NewsQA and MsMARCO, outperforming most existing models.
    However, we show that increasing the complexity of FastQA to FastQAExt does not
    yield any systematic improvements. We argue that the same holds true for most
    existing systems that are similar to FastQAExt. A manual analysis reveals that
    our proposed heuristic explains most predictions of our model, which indicates
    that modeling a simple heuristic is enough to achieve strong performance on
    extractive QA datasets. The overall strong performance of FastQA puts results
    of existing, more complex models into perspective.

    Learned Optimizers that Scale and Generalize

    Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Learning to learn has emerged as an important direction for achieving
    artificial intelligence. Two of the primary barriers to its adoption are an
    inability to scale to larger problems and a limited ability to generalize to
    new tasks. We introduce a learned gradient descent optimizer that generalizes
    well to new tasks, and which has significantly reduced memory and computation
    overhead. We achieve this by introducing a novel hierarchical RNN architecture,
    with minimal per-parameter overhead, augmented with additional architectural
    features that mirror the known structure of optimization tasks. We also develop
    a meta-training ensemble of small, diverse, optimization tasks capturing common
    properties of loss landscapes. The optimizer learns to outperform RMSProp/ADAM
    on problems in this corpus. More importantly, it performs comparably or better
    when applied to small convolutional neural networks, despite seeing no neural
    networks in its meta-training set. Finally, it generalizes to train Inception
    V3 and ResNet V2 architectures on the ImageNet dataset, optimization problems
    that are of a vastly different scale than those it was trained on.

    Tree Memory Networks for Modelling Long-term Temporal Dependencies

    Tharindu Fernando, Simon Denman, Aaron McFadyen, Sridha Sridharan, Clinton Fookes
    Comments: This paper has been submitted to JMLR
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    In the domain of sequence modelling, Recurrent Neural Networks (RNN) have
    been capable of achieving impressive results in a variety of application areas
    including visual question answering, part-of-speech tagging and machine
    translation. However, this success in modelling short term dependencies has not
    successfully transitioned to application areas such as trajectory prediction,
    which require capturing both short term and long term relationships. In this
    paper, we propose a Tree Memory Network (TMN) for modelling long term and short
    term relationships in sequence-to-sequence mapping problems. The proposed
    network architecture is composed of an input module, controller and a memory
    module. In contrast to related literature, which models the memory as a
    sequence of historical states, we model the memory as a recursive tree
    structure. This structure more effectively captures temporal dependencies
    across both short term and long term sequences using its hierarchical
    structure. We demonstrate the effectiveness and flexibility of the proposed TMN
    in two practical problems, aircraft trajectory modelling and pedestrian
    trajectory modelling in a surveillance setting, and in both cases we outperform
    the current state-of-the-art. Furthermore, we perform an in depth analysis on
    the evolution of the memory module content over time and provide visual
    evidence on how the proposed TMN is able to map both long term and short term
    relationships efficiently via a hierarchical structure.

    Drone Squadron Optimization: a Self-adaptive Algorithm for Global Numerical Optimization

    Vinícius Veloso de Melo, Wolfgang Banzhaf
    Comments: Short version – Full version published by Springer Neural Computing and Applications
    Journal-ref: Neural Computing and Applications, 2017, pp 1-28
    Subjects: Optimization and Control (math.OC); Neural and Evolutionary Computing (cs.NE)

    This paper proposes Drone Squadron Optimization (DSO), a new self-adaptive
    metaheuristic for global numerical optimization which is updated online by a
    hyper-heuristic. DSO is an artifact-inspired technique, as opposed to many
    algorithms used nowadays, which are nature-inspired. DSO is very flexible
    because it is not related to behaviors or natural phenomena. DSO has two core
    parts: the semi-autonomous drones that fly over a landscape to explore, and the
    Command Center that processes the retrieved data and updates the drones’
    firmware whenever necessary. The self-adaptive aspect of DSO in this work is
    the perturbation/movement scheme, which is the procedure used to generate
    target coordinates. This procedure is evolved by the Command Center during the
    global optimization process in order to adapt DSO to the search landscape. DSO
    was evaluated on a set of widely employed benchmark functions. The statistical
    analysis of the results shows that the proposed method is competitive with the
    other methods in the comparison, the performance is promising, but several
    future improvements are planned.

    Sensor Fusion for Robot Control through Deep Reinforcement Learning

    Steven Bohez, Tim Verbelen, Elias De Coninck, Bert Vankeirsbilck, Pieter Simoens, Bart Dhoedt
    Comments: 6 pages, 6 figures, submitted to IROS 2017
    Subjects: Robotics (cs.RO); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Systems and Control (cs.SY)

    Deep reinforcement learning is becoming increasingly popular for robot
    control algorithms, with the aim of having a robot self-learn useful feature
    representations from unstructured sensory input that lead to the optimal
    actuation policy. In addition to sensors mounted on the robot, sensors might
    also be deployed in the environment, although these might need to be accessed
    via an unreliable wireless connection. In this paper, we demonstrate deep
    neural network architectures that are able to fuse information coming from
    multiple sensors and are robust to sensor failures at runtime. We evaluate our
    method on a search and pick task for a robot both in simulation and the real
    world.

    A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation

    Chunpeng Wu, Wei Wen, Tariq Afzal, Yongmei Zhang, Yiran Chen, Hai Li
    Comments: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’17)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Recently, DNN model compression based on network architecture design, e.g.,
    SqueezeNet, has attracted a lot of attention. No accuracy drop on image
    classification is observed on these extremely compact networks, compared to
    well-known models. An emerging question, however, is whether these model
    compression techniques hurt a DNN’s learning ability beyond classifying
    images on a single dataset.
    Our preliminary experiment shows that these compression methods could degrade
    domain adaptation (DA) ability, though the classification performance is
    preserved. Therefore, we propose a new compact network architecture and
    unsupervised DA method in this paper. The DNN is built on a new basic module
    Conv-M which provides more diverse feature extractors without significantly
    increasing parameters. The unified framework of our DA method will
    simultaneously learn invariance across domains, reduce divergence of feature
    representations, and adapt label prediction. Our DNN has 4.1M parameters,
    only 6.7% as many as AlexNet or 59% as many as GoogLeNet. Experiments show
    that our DNN
    obtains GoogLeNet-level accuracy both on classification and DA, and our DA
    method slightly outperforms previous competitive ones. Put together, our DA
    strategy based on our DNN achieves state-of-the-art results on sixteen of
    the eighteen DA tasks on the popular Office-31 and Office-Caltech datasets.


    Computer Vision and Pattern Recognition

    A clever elimination strategy for efficient minimal solvers

    Zuzana Kukelova, Joe Kileel, Bernd Sturmfels, Tomas Pajdla
    Comments: 13 pages, 7 figures
    Journal-ref: IEEE Conference on Computer Vision and Pattern Recognition 2017
    (CVPR 2017)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Symbolic Computation (cs.SC)

    We present a new insight into the systematic generation of minimal solvers in
    computer vision, which leads to smaller and faster solvers. Many minimal
    problem formulations are coupled sets of linear and polynomial equations where
    image measurements enter the linear equations only. We show that it is useful
    to solve such systems by first eliminating all the unknowns that do not appear
    in the linear equations and then extending solutions to the remaining unknowns.
    This can be generalized to fully non-linear systems by linearization via
    lifting. We demonstrate that this approach leads to more efficient solvers in
    three problems of partially calibrated relative camera pose computation with
    unknown focal length and/or radial distortion. Our approach also generates new
    interesting constraints on the fundamental matrices of partially calibrated
    cameras, which were not known before.

    Hybrid Supervised-unsupervised Image Topic Visualization with Convolutional Neural Network and LDA

    Kai Zhen, Mridul Birla, David Crandall, Bingjing Zhang, Judy Qiu
    Comments: 6 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The arXiv submission system generated “Bad character(s) in field Abstract”
    errors for this entry, so no abstract is shown here; please refer to the
    manuscript for the full abstract.

    Transfer Learning for Melanoma Detection: Participation in ISIC 2017 Skin Lesion Classification Challenge

    Dennis H. Murphree, Che Ngufor
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This manuscript describes our participation in the International Skin Imaging
    Collaboration’s 2017 Skin Lesion Analysis Towards Melanoma Detection
    competition. We participated in Part 3: Lesion Classification. The two stated
    goals of this binary image classification challenge were to distinguish between
    (a) melanoma and (b) nevus and seborrheic keratosis, followed by distinguishing
    between (a) seborrheic keratosis and (b) nevus and melanoma. We chose a deep
    neural network approach with a transfer learning strategy, using a pre-trained
    Inception V3 network as both a feature extractor to provide input for a
    multi-layer perceptron as well as fine-tuning an augmented Inception network.
    This approach yielded validation set AUCs of 0.84 on the second task and 0.76
    on the first task, for an average AUC of 0.80. We unfortunately joined the
    competition late, and we look forward to improving on these results.
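
    The transfer-learning pattern described above is easy to sketch: a
    pretrained Inception V3 serves as a fixed feature extractor feeding a
    small multi-layer perceptron. The layer choice and MLP size below are
    assumptions, not the authors' exact configuration.

    ```python
    # Sketch of the pattern: pretrained Inception V3 as a fixed feature
    # extractor, with a small MLP head for binary lesion classification.
    import torch
    import torch.nn as nn
    from torchvision import models

    backbone = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
    backbone.fc = nn.Identity()            # expose the 2048-d pooled features
    backbone.eval()

    mlp = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Linear(256, 1))

    x = torch.randn(4, 3, 299, 299)        # Inception V3 expects 299x299 inputs
    with torch.no_grad():
        feats = backbone(x)                # (4, 2048) feature vectors
    melanoma_logit = mlp(feats)            # train only the MLP head
    ```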

    Texture segmentation with Fully Convolutional Networks

    Vincent Andrearczyk, Paul F. Whelan
    Comments: 13 pages, 4 figures, 3 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In the last decade, deep learning has contributed to advances in a wide
    range of computer vision tasks, including texture analysis. This paper
    explores a new
    approach for texture segmentation using deep convolutional neural networks,
    sharing important ideas with classic filter bank based texture segmentation
    methods. Several methods are developed to train Fully Convolutional Networks to
    segment textures in various applications. We show in particular that these
    networks can learn to recognize and segment a type of texture, e.g. wood or
    grass, from texture recognition datasets (with no segmentation ground truth
    used in training). We
    demonstrate that Fully Convolutional Networks can learn from repetitive
    patterns to segment a particular texture from a single image or even a part of
    an image. We take advantage of these findings to develop a method that is
    evaluated in a series of supervised and unsupervised experiments and
    improves the state of the art on the Prague texture segmentation datasets.

    Learning to Discover Cross-Domain Relations with Generative Adversarial Networks

    Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jungkwon Lee, Jiwon Kim
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    While humans easily recognize relations between data from different domains
    without any supervision, learning to automatically discover them is in general
    very challenging and needs many ground-truth pairs that illustrate the
    relations. To avoid costly pairing, we address the task of discovering
    cross-domain relations given unpaired data. We propose a method based on
    generative adversarial networks that learns to discover relations between
    different domains (DiscoGAN). Using the discovered relations, our proposed
    network successfully transfers style from one domain to another while
    preserving key attributes such as orientation and face identity.
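
    A rough sketch of the generator objective: two generators map between
    domains A and B, reconstruction losses force the mappings to be
    approximate inverses of each other (which is what lets relations emerge
    without paired data), and each translated sample must also fool the
    target domain's discriminator. The particular loss forms below are common
    illustrative choices, not necessarily the paper's exact ones.

    ```python
    # Sketch of a DiscoGAN-style generator objective: cycle reconstruction
    # between two domains plus adversarial terms.
    import torch
    import torch.nn.functional as F

    def generator_loss(G_AB, G_BA, D_A, D_B, x_A, x_B):
        fake_B, fake_A = G_AB(x_A), G_BA(x_B)
        # A -> B -> A and B -> A -> B should return the original input
        recon = F.mse_loss(G_BA(fake_B), x_A) + F.mse_loss(G_AB(fake_A), x_B)
        # each translated sample must fool the target-domain discriminator
        adv = (F.binary_cross_entropy(D_B(fake_B), torch.ones_like(D_B(fake_B)))
               + F.binary_cross_entropy(D_A(fake_A), torch.ones_like(D_A(fake_A))))
        return recon + adv
    ```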

    Automatic skin lesion segmentation with fully convolutional-deconvolutional networks

    Yading Yuan, Ming Chao, Yeh-Chi Lo
    Comments: ISIC2017 challenge, 4 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper summarizes our method and validation results for the ISBI
    Challenge 2017 – Skin Lesion Analysis Towards Melanoma Detection – Part I:
    Lesion Segmentation.

    Real-Time Panoramic Tracking for Event Cameras

    Christian Reinbacher, Gottfried Munda, Thomas Pock
    Comments: Accepted to International Conference on Computational Photography 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Event cameras are a paradigm shift in camera technology. Instead of full
    frames, the sensor captures a sparse set of events caused by intensity
    changes. Since only the changes are transferred, those cameras are able to
    capture quick movements of objects in the scene or of the camera itself. In
    this work we propose a novel method to perform camera tracking of event cameras
    in a panoramic setting with three degrees of freedom. We propose a direct
    camera tracking formulation, similar to state-of-the-art in visual odometry. We
    show that the minimal information needed for simultaneous tracking and mapping
    is the spatial position of events, without using the appearance of the imaged
    scene point. We verify the robustness to fast camera movements and dynamic
    objects in the scene on a recently proposed dataset [Mueggler2016] and
    self-recorded sequences.

    Random Forests and VGG-NET: An Algorithm for the ISIC 2017 Skin Lesion Classification Challenge

    Songtao Guo, Yixin Luo, Yanzhi Song
    Comments: ISIC2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This manuscript briefly describes an algorithm developed for the ISIC 2017
    Skin Lesion Classification Competition. In this task, participants are asked to
    complete two independent binary image classification tasks that involve three
    unique diagnoses of skin lesions (melanoma, nevus, and seborrheic keratosis).
    In the first binary classification task, participants are asked to distinguish
    between (a) melanoma and (b) nevus and seborrheic keratosis. In the second
    binary classification task, participants are asked to distinguish between (a)
    seborrheic keratosis and (b) nevus and melanoma. The other phases of the
    competition are not considered. Our proposed algorithm consists of three steps:
    preprocessing, classification using VGG-NET and Random Forests, and calculation
    of a final score.
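
    The combination is straightforward to sketch: a pretrained VGG network
    acts as a fixed feature extractor and a Random Forest is fit on the
    pooled features. The chosen layer and the global average pooling below
    are assumptions, not the authors' exact pipeline.

    ```python
    # Sketch of the CNN-features-plus-Random-Forest pattern.
    import numpy as np
    import torch
    from torchvision import models
    from sklearn.ensemble import RandomForestClassifier

    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

    def deep_features(batch):              # batch: (N, 3, 224, 224), normalized
        with torch.no_grad():
            fmap = vgg.features(batch)     # convolutional feature maps
            return fmap.mean(dim=(2, 3)).numpy()   # global average pooling

    X = deep_features(torch.randn(16, 3, 224, 224))  # stand-in lesion images
    y = np.random.randint(0, 2, 16)                  # stand-in binary labels
    forest = RandomForestClassifier(n_estimators=200).fit(X, y)
    ```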

    Block Compressive Sensing of Image and Video with Nonlocal Lagrangian Multiplier and Patch-based Sparse Representation

    Trinh Van Chien, Khanh Quoc Dinh, Byeungwoo Jeon, Martin Burger
    Comments: 30 pages, 13 figures, 2 tables; Accepted by Elsevier Signal Processing:Image Communication
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Although block compressive sensing (BCS) makes it tractable to sense
    large-sized images and video, its recovery performance has yet to be
    significantly improved because its recovered images or video usually suffer
    from blurred edges, loss of details, and high-frequency oscillatory artifacts,
    especially at a low subrate. This paper addresses these problems by designing a
    modified total variation technique that employs multi-block gradient
    processing, a denoised Lagrangian multiplier, and patch-based sparse
    representation. In the case of video, the proposed recovery method is able to
    exploit both spatial and temporal similarities. Simulation results confirm the
    improved performance of the proposed method for compressive sensing of images
    and video in terms of both objective and subjective qualities.

    A Data Driven Approach for Compound Figure Separation Using Convolutional Neural Networks

    Satoshi Tsutsui, David Crandall
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A key problem in automatic analysis and understanding of scientific papers is
    to extract semantic information from non-textual paper components like figures,
    diagrams, tables, etc. This research always requires an essential first
    preprocessing step: decomposing compound multi-part figures into individual
    subfigures.
    Previous work in compound figure separation has been based on manually designed
    features and separation rules, which often fail for less common figure types
    and layouts. Moreover, no implementation for compound figure decomposition is
    publicly available.

    This paper proposes a data driven approach to separate compound figures using
    modern deep Convolutional Neural Networks (CNNs) to train the separator in an
    end-to-end manner. CNNs eliminate the need for manually designing features and
    separation rules, but require a large amount of annotated training data. We
    overcome this challenge using transfer learning as well as automatically
    synthesizing training exemplars. We evaluate our technique on the ImageCLEF
    Medical dataset, achieving 85.9% accuracy and outperforming manually engineered
    previous techniques. We made the resulting approach available as an easy-to-use
    Python library, aiming to promote further research in scientific figure mining.

    Joint Epipolar Tracking (JET): Simultaneous optimization of epipolar geometry and feature correspondences

    Henry Bradler, Matthias Ochs, Rudolf Mester
    Comments: Accepted at IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, USA, March 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Traditionally, pose estimation is considered as a two step problem. First,
    feature correspondences are determined by direct comparison of image patches,
    or by associating feature descriptors. In a second step, the relative pose and
    the coordinates of corresponding points are estimated, most often by minimizing
    the reprojection error (RPE). RPE optimization is based on a loss function that
    is merely aware of the feature pixel positions but not of the underlying image
    intensities. In this paper, we propose a sparse direct method which
    introduces a loss function that allows the unscaled relative pose and the
    set of feature correspondences to be optimized simultaneously, directly
    taking the image intensity values into account. Furthermore, we show how to
    integrate statistical prior
    information on the motion into the optimization process. This constructive
    inclusion of a Bayesian bias term is particularly efficient in application
    cases with a strongly predictable (short term) dynamic, e.g. in a driving
    scenario. In our experiments, we demonstrate that the JET algorithm we propose
    outperforms the classical reprojection error optimization on two synthetic
    datasets and on the KITTI dataset. The JET algorithm runs in real-time on a
    single CPU thread.

    Learning Rank Reduced Interpolation with Principal Component Analysis

    Matthias Ochs, Henry Bradler, Rudolf Mester
    Comments: Accepted at Intelligent Vehicles Symposium (IV), Los Angeles, USA, June 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In computer vision most iterative optimization algorithms, both sparse and
    dense, rely on a coarse and reliable dense initialization to bootstrap their
    optimization procedure. For example, dense optical flow algorithms profit
    massively in speed and robustness if they are initialized well in the basin of
    convergence of the used loss function. The same holds true for methods as
    sparse feature tracking when initial flow or depth information for new features
    at arbitrary positions is needed. This makes it extremely important to have
    techniques at hand that can obtain, from only very few available
    measurements, a dense but still approximate sketch of a desired 2D structure
    (e.g. depth maps, optical flow, disparity maps, etc.). The 2D map is regarded
    as sample from a 2D random process. The method presented here exploits the
    complete information given by the principal component analysis (PCA) of that
    process, the principal basis and its prior distribution. The method is able to
    determine a dense reconstruction from sparse measurements. When facing
    situations with only very sparse measurements, typically the number of
    principal components is further reduced which results in a loss of
    expressiveness of the basis. We overcome this problem and inject prior
    knowledge in a maximum a posteriori (MAP) approach. We test our approach on the
    KITTI and the virtual KITTI datasets and focus on the interpolation of depth
    maps for driving scenes. The results show good agreement with the ground
    truth and are clearly better than those of nearest-neighbor interpolation,
    which disregards statistical information.
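
    The core computation can be sketched in a few lines of numpy: the dense
    map is modeled as mean plus principal components, x = mu + B a, with a
    Gaussian prior on the coefficients, and the MAP coefficients follow from
    regularized normal equations on the sparse samples. The unit noise
    variance below is a simplifying assumption; mu, B and s would come from
    training maps.

    ```python
    # Sketch of MAP interpolation in a PCA basis: a dense 2D map is modeled
    # as x = mu + B a with a Gaussian prior a_i ~ N(0, s_i^2); coefficients
    # are recovered from very sparse samples.
    import numpy as np

    def map_interpolate(mu, B, s, idx, values):
        """mu: (D,) mean map; B: (D, K) principal basis; s: (K,) prior stds;
        idx: measured pixel indices; values: measured values."""
        A = B[idx]                              # basis rows at sampled pixels
        # normal equations with a Tikhonov term from the Gaussian prior (MAP)
        H = A.T @ A + np.diag(1.0 / s**2)       # unit noise variance assumed
        a = np.linalg.solve(H, A.T @ (values - mu[idx]))
        return mu + B @ a                       # dense reconstruction

    rng = np.random.default_rng(0)
    D, K = 64 * 64, 20
    mu, B, s = rng.normal(size=D), rng.normal(size=(D, K)), np.ones(K)
    idx = rng.choice(D, size=50, replace=False) # very sparse measurements
    dense = map_interpolate(mu, B, s, idx, rng.normal(size=50))
    ```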

    Large Margin Object Tracking with Circulant Feature Maps

    Mengmeng Wang, Yong Liu, Zeyi Huang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Structured output support vector machine (SVM) based tracking algorithms have
    shown favorable performance recently. Nonetheless, the time-consuming candidate
    sampling and complex optimization limit their real-time applications. In this
    paper, we propose a novel large margin object tracking method which absorbs
    the strong discriminative ability of structured output SVMs and is
    significantly sped up by the correlation filter algorithm. In addition, a
    multimodal target
    detection technique is proposed to improve the target localization precision
    and prevent model drift introduced by similar objects or background noise.
    Furthermore, we exploit the feedback from high-confidence tracking results to avoid
    the model corruption problem. We implement two versions of the proposed tracker
    with the representations from both conventional hand-crafted and deep
    convolution neural networks (CNNs) based features to validate the strong
    compatibility of the algorithm. The experimental results demonstrate that the
    proposed tracker performs favorably against several state-of-the-art
    algorithms on challenging benchmark sequences while running at speeds in
    excess of 80 frames per second. The source code and experimental results will be made
    publicly available.

    Zero-Shot Recognition using Dual Visual-Semantic Mapping Paths

    Yanan Li, Donghui Wang, Huanhang Hu, Yuetan Lin, Yueting Zhuang
    Comments: Accepted as a full paper in IEEE Computer Vision and Pattern Recognition (CVPR) 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Zero-shot recognition aims to accurately recognize objects of unseen classes
    by using a shared visual-semantic mapping between the image feature space and
    the semantic embedding space. This mapping is learned on training data of seen
    classes and is expected to have transfer ability to unseen classes. In this
    paper, we tackle this problem by exploiting the intrinsic relationship between
    the semantic space manifold and the transfer ability of visual-semantic
    mapping. We formalize their connection and cast zero-shot recognition as a
    joint optimization problem. Motivated by this, we propose a novel framework for
    zero-shot recognition, which contains dual visual-semantic mapping paths. Our
    analysis shows this framework can not only apply prior semantic knowledge to
    infer underlying semantic manifold in the image feature space, but also
    generate optimized semantic embedding space, which can enhance the transfer
    ability of the visual-semantic mapping to unseen classes. The proposed method
    is evaluated for zero-shot recognition on four benchmark datasets, achieving
    outstanding results.

    Label Stability in Multiple Instance Learning

    Veronika Cheplygina, Lauge Sørensen, David M. J. Tax, Marleen de Bruijne, Marco Loog
    Comments: Published at MICCAI 2015
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    We address the problem of instance label stability in multiple
    instance learning (MIL) classifiers. These classifiers are trained only on
    globally annotated images (bags), but often can provide fine-grained
    annotations for image pixels or patches (instances). This is interesting for
    computer aided diagnosis (CAD) and other medical image analysis tasks for which
    only a coarse labeling is provided. Unfortunately, the instance labels may be
    unstable. This means that a slight change in training data could potentially
    lead to abnormalities being detected in different parts of the image, which is
    undesirable from a CAD point of view. Despite MIL gaining popularity in the CAD
    literature, this issue has not yet been addressed. We investigate the stability
    of instance labels provided by several MIL classifiers on 5 different datasets,
    of which 3 are medical image datasets (breast histopathology, diabetic
    retinopathy and computed tomography lung images). We propose an unsupervised
    measure to evaluate instance stability, and demonstrate that a
    performance-stability trade-off can be made when comparing MIL classifiers.

    Transfer Learning by Asymmetric Image Weighting for Segmentation across Scanners

    Veronika Cheplygina, Annegreet van Opbroek, M. Arfan Ikram, Meike W. Vernooij, Marleen de Bruijne
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Supervised learning has been very successful for automatic segmentation of
    images from a single scanner. However, several papers report deteriorated
    performance when using classifiers trained on images from one scanner to
    segment images from other scanners. We propose a transfer learning classifier
    that adapts to differences between training and test images. This method uses a
    weighted ensemble of classifiers trained on individual images. The weight of
    each classifier is determined by the similarity between its training image and
    the test image.

    We examine three unsupervised similarity measures, which can be used in
    scenarios where no labeled data from a newly introduced scanner or scanning
    protocol is available. The measures are based on a divergence, a bag distance,
    and on estimating the labels with a clustering procedure. These measures are
    asymmetric. We study whether the asymmetry can improve classification. Out of
    the three similarity measures, the bag similarity measure is the most robust
    across different studies and achieves excellent results on four brain tissue
    segmentation datasets and three white matter lesion segmentation datasets,
    acquired at different centers and with different scanners and scanning
    protocols. We show that the asymmetry can indeed be informative, and that
    computing the similarity from the test image to the training images is more
    appropriate than the opposite direction.
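
    The ensemble itself is simple to sketch: one classifier per training
    image, with weights computed from the (asymmetric) similarity of the test
    image to each training image. The exponential weighting of divergences
    below is an illustrative assumption.

    ```python
    # Sketch of the weighted ensemble: per-training-image classifiers,
    # weighted by similarity computed from the test image to each training
    # image.
    import numpy as np

    def weighted_ensemble_proba(classifiers, divergences, X_test):
        """classifiers: fitted sklearn-style models, one per training image;
        divergences: div(test -> train_i); smaller means more similar."""
        w = np.exp(-np.asarray(divergences))
        w /= w.sum()
        probs = np.stack([c.predict_proba(X_test) for c in classifiers])
        return np.tensordot(w, probs, axes=1)   # weighted average of probabilities
    ```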

    Classification of COPD with Multiple Instance Learning

    Veronika Cheplygina, Lauge Sørensen, David M. J. Tax, Jesper Holst Pedersen, Marco Loog, Marleen de Bruijne
    Comments: Published at International Conference on Pattern Recognition (ICPR) 2014
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Chronic obstructive pulmonary disease (COPD) is a lung disease where early
    detection benefits the survival rate. COPD can be quantified by classifying
    patches of computed tomography images, and combining patch labels into an
    overall diagnosis for the image. As labeled patches are often not available,
    image labels are propagated to the patches, incorrectly labeling healthy
    patches in COPD patients as being affected by the disease. We approach
    quantification of COPD from lung images as a multiple instance learning (MIL)
    problem, which is more suitable for such weakly labeled data. We investigate
    various MIL assumptions in the context of COPD and show that although a concept
    region with COPD-related disease patterns is present, considering the whole
    distribution of lung tissue patches improves the performance. The best method
    is based on averaging instances and obtains an AUC of 0.742, which is higher
    than the previously reported best of 0.713 on the same dataset. Using the full
    training set further increases performance to 0.776, which is significantly
    higher (DeLong test) than previous results.
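
    The best-performing method reported here, averaging instances, amounts to
    very little code: each image (bag) is represented by the mean of its
    patch (instance) feature vectors, after which any ordinary supervised
    classifier can be trained on image-level labels. The logistic regression
    and the feature sizes below are stand-ins.

    ```python
    # Sketch of MIL-by-averaging: represent each image (bag) by the mean of
    # its patch (instance) features, then classify bags with image labels.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def bag_features(bags):
        """bags: list of (n_patches_i, n_features) arrays, one per image."""
        return np.stack([b.mean(axis=0) for b in bags])

    rng = np.random.default_rng(0)
    bags = [rng.normal(size=(rng.integers(20, 50), 16)) for _ in range(30)]
    y = rng.integers(0, 2, size=30)            # image-level COPD labels
    clf = LogisticRegression(max_iter=1000).fit(bag_features(bags), y)
    ```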

    What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?

    Alex Kendall, Yarin Gal
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    There are two major types of uncertainty one can model. Aleatoric uncertainty
    captures noise inherent in the observations. On the other hand, epistemic
    uncertainty accounts for uncertainty in the model — uncertainty which can be
    explained away given enough data. Traditionally it has been difficult to model
    epistemic uncertainty in computer vision, but with new Bayesian deep learning
    tools this is now possible. We study the benefits of modeling epistemic vs.
    aleatoric uncertainty in Bayesian deep learning models for vision tasks. For
    this we present a Bayesian deep learning framework combining input-dependent
    aleatoric uncertainty together with epistemic uncertainty. We study models
    under the framework with per-pixel semantic segmentation and depth regression
    tasks. Further, our explicit uncertainty formulation leads to new loss
    functions for these tasks, which can be interpreted as learned attenuation.
    This makes the loss more robust to noisy data, also giving new state-of-the-art
    results on segmentation and depth regression benchmarks.
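
    For regression tasks, the learned attenuation described here takes a
    compact form: the network predicts a log-variance alongside each output,
    and the loss down-weights residuals where the predicted noise is high,
    while a log term keeps the variance from growing without bound. A minimal
    PyTorch sketch:

    ```python
    # Sketch of learned loss attenuation for regression: the network predicts
    # a per-pixel log-variance, so noisy observations are down-weighted
    # (aleatoric uncertainty).
    import torch

    def heteroscedastic_loss(pred, log_var, target):
        precision = torch.exp(-log_var)
        return (0.5 * precision * (pred - target) ** 2 + 0.5 * log_var).mean()

    pred, log_var = torch.randn(4, 1, 8, 8), torch.zeros(4, 1, 8, 8)
    loss = heteroscedastic_loss(pred, log_var, torch.randn(4, 1, 8, 8))
    ```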

    Comparison of the Deep-Learning-Based Automated Segmentation Methods for the Head Sectioned Images of the Virtual Korean Human Project

    Mohammad Eshghi, Holger R. Roth, Masahiro Oda, Min Suk Chung, Kensaku Mori
    Comments: Accepted for presentation at the 15th IAPR Conference on Machine Vision Applications (MVA2017), Nagoya, Japan
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents an end-to-end pixelwise fully automated segmentation of
    the head sectioned images of the Visible Korean Human (VKH) project based on
    Deep Convolutional Neural Networks (DCNNs). By converting classification
    networks into Fully Convolutional Networks (FCNs), a coarse prediction map,
    with smaller size than the original input image, can be created for
    segmentation purposes. To refine this map and to obtain a dense pixel-wise
    output, standard FCNs use deconvolution layers to upsample the coarse map.
    However, upsampling based on deconvolution increases the number of network
    parameters and causes loss of detail because of interpolation. On the other
    hand, dilated convolution is a recently introduced technique that attempts
    to capture multi-scale contextual information without increasing the number
    of network parameters, while keeping the resolution of the prediction maps
    high. We used
    both a standard FCN and a dilated convolution based FCN for semantic
    segmentation of the head sectioned images of the VKH dataset. Quantitative
    results showed approximately 20% improvement in the segmentation accuracy when
    using FCNs with dilated convolutions.
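
    The parameter argument is easy to verify directly: a 3x3 convolution with
    dilation 2 covers a 5x5 receptive field with the same nine weights per
    filter, and appropriate padding keeps the prediction map at full
    resolution. A small PyTorch check:

    ```python
    # A 3x3 kernel with dilation=2 spans a 5x5 window using the same nine
    # weights per filter; padding=2 keeps the spatial resolution unchanged.
    import torch
    import torch.nn as nn

    x = torch.randn(1, 64, 128, 128)
    dilated = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)
    plain = nn.Conv2d(64, 64, kernel_size=3, padding=1, dilation=1)

    assert dilated(x).shape == x.shape          # full-resolution output
    n_params = lambda m: sum(p.numel() for p in m.parameters())
    assert n_params(dilated) == n_params(plain) # no extra parameters
    ```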

    End-to-end Binary Representation Learning via Direct Binary Embedding

    Liu Liu, Alireza Rahimpour, Ali Taalimi, Hairong Qi
    Comments: Submitted to ICIP’17
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    Learning binary representation is essential to large-scale computer vision
    tasks. Most existing algorithms require a separate quantization constraint to
    learn effective hashing functions. In this work, we present Direct Binary
    Embedding (DBE), a simple yet very effective algorithm to learn binary
    representation in an end-to-end fashion. By appending an ingeniously designed
    DBE layer to the deep convolutional neural network (DCNN), DBE learns binary
    code directly from the continuous DBE layer activation without quantization
    error. By employing the deep residual network (ResNet) as the DCNN component,
    DBE captures rich semantics from images. Furthermore, in an effort to handle
    multilabel images, we design a joint cross entropy loss that includes both
    softmax cross entropy and weighted binary cross entropy in consideration of the
    correlation and independence of labels, respectively. Extensive experiments
    demonstrate the significant superiority of DBE over state-of-the-art methods on
    tasks of natural object recognition, image retrieval and image annotation.
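
    The joint loss for multilabel images can be sketched as a softmax term
    against the normalized label distribution (capturing label correlation)
    plus a weighted binary term treating labels independently. The exact
    weighting in the paper may differ, so the scheme below is an assumption.

    ```python
    # Sketch of a joint cross entropy for multilabel images: softmax CE over
    # the normalized label distribution plus weighted binary CE.
    import torch
    import torch.nn.functional as F

    def joint_cross_entropy(logits, targets, pos_weight, beta=1.0):
        """logits: (N, L); targets: (N, L) multi-hot, at least one label each."""
        label_dist = targets / targets.sum(dim=1, keepdim=True)
        softmax_ce = -(F.log_softmax(logits, dim=1) * label_dist).sum(1).mean()
        bce = F.binary_cross_entropy_with_logits(
            logits, targets.float(), pos_weight=pos_weight)
        return softmax_ce + beta * bce
    ```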

    Robust Non-Rigid Registration With Reweighted Dual Sparsities

    Jingyu Yang, Kun Li, Yu-Kun Lai, Daoliang Guo
    Comments: 12 pages, submitted to IEEE Transactions on Visualization and Computer Graphics
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG); Graphics (cs.GR)

    Non-rigid registration is challenging because it is ill-posed with high
    degrees of freedom and is thus sensitive to noise and outliers. We propose a
    robust non-rigid registration method using reweighted sparsities on position
    and transformation to estimate the deformations between 3-D shapes. We
    formulate the energy function with dual sparsities on both the data term and
    the smoothness term, and define the smoothness constraint using local rigidity.
    The dual-sparsity based non-rigid registration model is enhanced with a
    reweighting scheme, and solved by transferring the model into some alternating
    optimized subproblems which have exact solutions and guaranteed convergence.
    Experimental results on both public datasets and real scanned datasets show
    that our method outperforms the state-of-the-art methods and is more robust to
    noise and outliers than conventional non-rigid registration methods.

    Source Camera Identification Based On Content-Adaptive Fusion Network

    Pengpeng Yang, Wei Zhao, Rongrong Ni, Yao Zhao
    Comments: This article has been submitted to the 2017 IEEE International Conference on Image Processing
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Source camera identification is still a hard task in the forensics
    community, especially for the case of small query image sizes. In this
    paper, we propose a solution to identify the source camera of small-size
    images: a content-adaptive fusion network. In order to learn better feature
    representations from the input data, content-adaptive convolutional neural
    networks (CA-CNNs) are constructed, with an additional convolutional layer
    in the preprocessing stage. Moreover, with the purpose of capturing more
    comprehensive information, we run three CA-CNNs in parallel (CA3-CNN,
    CA5-CNN, and CA7-CNN) to form the content-adaptive fusion network; the three
    CA-CNNs differ in the convolutional kernel size of the preprocessing layer.
    The experimental results show that the proposed method is practicable and
    satisfactory.

    Face Recognition using Multi-Modal Low-Rank Dictionary Learning

    Homa Foroughi, Moein Shakeri, Nilanjan Ray, Hong Zhang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Face recognition has been widely studied due to its importance in different
    applications; however, most of the proposed methods fail when face images are
    occluded or captured under illumination and pose variations. Recently several
    low-rank dictionary learning methods have been proposed and achieved promising
    results for noisy observations. While these methods are mostly developed for
    single-modality scenarios, recent studies demonstrated the advantages of
    feature fusion from multiple inputs. We propose a multi-modal structured
    low-rank dictionary learning method for robust face recognition, using raw
    pixels of face images and their illumination invariant representation. The
    proposed method learns robust and discriminative representations from
    contaminated face images, even if there are few training samples with large
    intra-class variations. Extensive experiments on different datasets validate
    the superior performance and robustness of our method to severe illumination
    variations and occlusion.

    Skin lesion segmentation based on preprocessing, thresholding and neural networks

    Juana M. Gutiérrez-Arriola, Marta Gómez-Álvarez, Victor Osma-Ruiz, Nicolás Sáenz-Lechón, Rubén Fraile
    Comments: 4 pages, 4 figures, abstract submitted to participate in the challenge ISIC 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This abstract describes the segmentation system used to participate in the
    challenge ISIC 2017: Skin Lesion Analysis Towards Melanoma Detection. Several
    preprocessing techniques have been tested for three color representations (RGB,
    YCbCr and HSV) of 392 images. The results have been used to choose the best
    preprocessing for each channel. In each case a neural network is trained to
    predict the Jaccard Index based on object characteristics. The system includes
    black frame and reference circle detection algorithms, but no special
    treatment is done for hair removal. Segmentation is performed in two steps:
    first, the best channel to be segmented is chosen by selecting the best
    neural network output. If this output does not predict a Jaccard Index over
    0.5, a more aggressive preprocessing is performed using open and close
    morphological operations, and the segmentation of the channel that obtains
    the best output from the neural networks is selected as the lesion.

    A Proximity-Aware Hierarchical Clustering of Faces

    Wei-An Lin, Jun-Cheng Chen, Rama Chellappa
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose an unsupervised face clustering algorithm called
    “Proximity-Aware Hierarchical Clustering” (PAHC) that exploits the local
    structure of deep representations. In the proposed method, a similarity measure
    between deep features is computed by evaluating linear SVM margins. SVMs are
    trained using nearest neighbors of sample data, and thus do not require any
    external training data. Clusters are then formed by thresholding the similarity
    scores. We evaluate the clustering performance using three challenging
    unconstrained face datasets, including Celebrity in Frontal-Profile (CFP),
    IARPA JANUS Benchmark A (IJB-A), and JANUS Challenge Set 3 (JANUS CS3)
    datasets. Experimental results demonstrate that the proposed approach can
    achieve significant improvements over state-of-the-art methods. Moreover, we
    also show that the proposed clustering algorithm can be applied to curate a
    large-scale, noisy training dataset while maintaining a sufficient number of
    images and their variations due to nuisance factors. The face verification
    performance on JANUS CS3 improves significantly by finetuning a DCNN model with
    the curated MS-Celeb-1M dataset which contains over three million face images.

    In Search of a Dataset for Handwritten Optical Music Recognition: Introducing MUSCIMA++

    Jan Hajič jr., Pavel Pecina
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Optical Music Recognition (OMR) has long been without an adequate dataset and
    ground truth for evaluating OMR systems, which has been a major problem for
    establishing a state of the art in the field. Furthermore, machine learning
    methods require training data. We analyze how the OMR processing pipeline can
    be expressed in terms of gradually more complex ground truth, and based on this
    analysis, we design the MUSCIMA++ dataset of handwritten music notation that
    addresses musical symbol recognition and notation reconstruction. The MUSCIMA++
    dataset version 0.9 consists of 140 pages of handwritten music, with 91255
    manually annotated notation symbols and 82261 explicitly marked relationships
    between symbol pairs. The dataset allows training and evaluating models for
    symbol classification, symbol localization, and notation graph assembly, both
    in isolation and jointly. Open-source tools are provided for manipulating the
    dataset, visualizing the data and further annotation, and the dataset itself is
    made available under an open license.

    RECOD Titans at ISIC Challenge 2017

    Afonso Menegola, Julia Tavares, Michel Fornaciali, Lin Tzy Li, Sandra Avila, Eduardo Valle
    Comments: 5 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This extended abstract describes the participation of RECOD Titans in parts 1
    and 3 of the ISIC Challenge 2017 “Skin Lesion Analysis Towards Melanoma
    Detection” (ISBI 2017). Although our team has long experience with melanoma
    classification, the ISIC Challenge 2017 was the very first time we worked on
    skin-lesion segmentation. For part 1 (segmentation), our final submission used
    four of our models: two trained with all 2000 samples, without a validation
    split, for 250 and for 500 epochs respectively; and the other two trained and
    validated with two different 1600/400 splits, for 220 epochs. Those four
    models, individually, achieved official validation scores between 0.780 and
    0.783. Our final submission, which averaged the outputs of those four
    models, achieved a score of 0.793. For part 3 (classification), the
    submitted test run as well as our last official validation run were the
    result of a meta-model that
    assembled seven base deep-learning models: three based on Inception-V4 trained
    on our largest dataset; three based on Inception trained on our smallest
    dataset; and one based on ResNet-101 trained on our smaller dataset. The
    results of those component models were stacked in a meta-learning layer based
    on an SVM trained on the validation set of our largest dataset.
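
    The meta-learning layer is classic stacking, sketched below: the
    per-image outputs of the seven base models become a seven-dimensional
    feature vector, and an SVM is fit on those vectors over a held-out
    validation split. The base-model outputs here are random stand-ins.

    ```python
    # Sketch of the stacked meta-model: base deep-model outputs become the
    # feature vector for an SVM fitted on a held-out validation split.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    n_val, n_models = 200, 7
    base_val = rng.uniform(size=(n_val, n_models))   # base scores on validation
    y_val = rng.integers(0, 2, size=n_val)           # validation labels
    meta = SVC(probability=True).fit(base_val, y_val)

    base_test = rng.uniform(size=(10, n_models))     # base scores on test data
    final_scores = meta.predict_proba(base_test)[:, 1]
    ```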

    Discriminate-and-Rectify Encoders: Learning from Image Transformation Sets

    Andrea Tacchetti, Stephen Voinea, Georgios Evangelopoulos
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    The complexity of a learning task is increased by transformations in the
    input space that preserve class identity. Visual object recognition, for
    example, is affected by changes in viewpoint, scale, illumination or planar
    transformations. While drastically altering the visual appearance, these
    changes are orthogonal to recognition and should not be reflected in the
    representation or feature encoding used for learning. We introduce a framework
    for weakly supervised learning of image embeddings that are robust to
    transformations and selective to the class distribution, using sets of
    transforming examples (orbit sets), deep parametrizations and a novel
    orbit-based loss. The proposed loss combines a discriminative, contrastive part
    for orbits with a reconstruction error that learns to rectify orbit
    transformations. The learned embeddings are evaluated in distance metric-based
    tasks, such as one-shot classification under geometric transformations, as well
    as face verification and retrieval under more realistic visual variability. Our
    results suggest that orbit sets, suitably computed or observed, can be used for
    efficient, weakly-supervised learning of semantically relevant image
    embeddings.

    Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction

    Benoît Massé, Silèye Ba, Radu Horaud
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The visual focus of attention (VFOA) has been recognized as a prominent
    conversational cue. We are interested in the VFOA tracking of a group of people
    involved in social interaction. We note that in this case the participants look
    either at each other or at an object of interest; therefore they don’t always
    face a camera and, consequently, their gazes (and their VFOAs) cannot be
    inferred from eye detection and tracking. We propose a method that exploits the
    correlation between gaze direction and head orientation. Both VFOA and gaze are
    modeled as latent variables in a Bayesian switching linear dynamic model. The
    proposed formulation leads to a tractable learning procedure and to an
    efficient gaze-and-VFOA tracking algorithm. The method is tested and
    benchmarked using a publicly available dataset that contains typical
    multi-party human-robot interaction scenarios, and that was recorded with both
    a motion capture system, and with a camera mounted onto a robot head.

    A fully end-to-end deep learning approach for real-time simultaneous 3D reconstruction and material recognition

    Cheng Zhao, Li Sun, Rustam Stolkin
    Comments: 8 pages, 7 figures, 4 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper addresses the problem of simultaneous 3D reconstruction and
    material recognition and segmentation. Enabling robots to recognise different
    materials (concrete, metal etc.) in a scene is important for many tasks, e.g.
    robotic interventions in nuclear decommissioning. Previous work on 3D semantic
    reconstruction has predominantly focused on recognition of everyday domestic
    objects (tables, chairs etc.), whereas previous work on material recognition
    has largely been confined to single 2D images without any 3D reconstruction.
    Meanwhile, most 3D semantic reconstruction methods rely on computationally
    expensive post-processing, using Fully-Connected Conditional Random Fields
    (CRFs), to achieve consistent segmentations. In contrast, we propose a deep
    learning method which performs 3D reconstruction while simultaneously
    recognising different types of materials and labelling them at the pixel level.
    Unlike previous methods, we propose a fully end-to-end approach, which does not
    require hand-crafted features or CRF post-processing. Instead, we use only
    learned features, and the CRF segmentation constraints are incorporated inside
    the fully end-to-end learned system. We present the results of experiments
    in which we trained our system to perform real-time 3D semantic reconstruction for
    23 different materials in a real-world application. The run-time performance of
    the system can be boosted to around 10Hz, using a conventional GPU, which is
    enough to achieve real-time semantic reconstruction using a 30fps RGB-D camera.
    To the best of our knowledge, this work is the first real-time end-to-end
    system for simultaneous 3D reconstruction and material recognition.

    6-DoF Object Pose from Semantic Keypoints

    Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G. Derpanis, Kostas Daniilidis
    Comments: IEEE International Conference on Robotics and Automation (ICRA), 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    This paper presents a novel approach to estimating the continuous six degree
    of freedom (6-DoF) pose (3D translation and rotation) of an object from a
    single RGB image. The approach combines semantic keypoints predicted by a
    convolutional network (convnet) with a deformable shape model. Unlike prior
    work, we are agnostic to whether the object is textured or textureless, as the
    convnet learns the optimal representation from the available training image
    data. Furthermore, the approach can be applied to instance- and class-based
    pose recovery. Empirically, we show that the proposed approach can accurately
    recover the 6-DoF object pose for both instance- and class-based scenarios with
    a cluttered background. For class-based object pose estimation,
    state-of-the-art accuracy is shown on the large-scale PASCAL3D+ dataset.
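
    The geometric step is easy to illustrate in its rigid special case: given
    the 2D semantic keypoints predicted by the convnet and the corresponding
    3D points of a (here rigid, not deformable) shape model, the 6-DoF pose
    follows from a PnP solve. The intrinsics and points below are stand-ins,
    and the paper itself fits a deformable shape model rather than calling
    plain PnP.

    ```python
    # Sketch of pose recovery from predicted keypoints, assuming a rigid
    # model: 2D-3D correspondences plus camera intrinsics give the 6-DoF
    # pose via PnP.
    import numpy as np
    import cv2

    model_pts = np.random.rand(8, 3).astype(np.float32)        # 3D model keypoints
    image_pts = (np.random.rand(8, 2) * 400).astype(np.float32)  # 2D detections
    K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)

    ok, rvec, tvec = cv2.solvePnP(model_pts, image_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)     # rotation matrix; with tvec: the 6-DoF pose
    ```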

    A Framework for Dynamic Image Sampling Based on Supervised Learning (SLADS)

    G. M. Dilshan P. Godaliyadda, Dong Hye Ye, Michael D.Uchic, Michael A. Groeber, Gregery T. Buzzard, Charles A. Bouman
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Sparse sampling schemes have the potential to dramatically reduce image
    acquisition time while simultaneously reducing radiation damage to samples.
    However, for a sparse sampling scheme to be useful it is important that we are
    able to reconstruct the underlying object with sufficient clarity using the
    sparse measurements. In dynamic sampling, each new measurement location is
    selected based on information obtained from previous measurements. Therefore,
    dynamic sampling schemes have the potential to dramatically reduce the number
    of measurements needed for high fidelity reconstructions. However, most
    existing dynamic sampling methods for point-wise measurement acquisition tend
    to be computationally expensive and are therefore too slow for practical
    applications.

    In this paper, we present a framework for dynamic sampling based on machine
    learning techniques, which we call a supervised learning approach for dynamic
    sampling (SLADS). In each step of SLADS, the objective is to find the pixel
    that maximizes the expected reduction in distortion (ERD) given previous
    measurements. SLADS is fast because we use a simple regression function to
    compute the ERD, and it is accurate because the regression function is trained
    using data sets that are representative of the specific application. In
    addition, we introduce a method to terminate dynamic sampling at a desired
    level of distortion, and we extend the SLADS methodology to sample groups of
    pixels at each step. Finally, we present results on computationally-generated
    synthetic data and experimentally-collected data to demonstrate a dramatic
    improvement over state-of-the-art static sampling methods.
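
    A rough sketch of the greedy loop (not the authors' implementation: the
    two per-pixel features and the fixed regressor weights below are
    illustrative stand-ins for the trained SLADS regression function):

        import numpy as np
        from scipy import ndimage

        mask = np.zeros((16, 16), dtype=bool)   # sampling mask
        mask[::4, ::4] = True                   # initial coarse measurements

        # Hypothetical weights of an offline-trained ERD regressor; SLADS fits
        # the regressor on representative training images, and its features
        # also depend on the measured values, omitted here.
        w = np.array([0.05, 0.5])

        for step in range(50):                  # measurement budget
            dist = ndimage.distance_transform_edt(~mask)  # gap to nearest sample
            sparsity = 1.0 - ndimage.uniform_filter(mask.astype(float), size=5)
            erd = w[0] * dist + w[1] * sparsity           # predicted ERD per pixel
            erd[mask] = -np.inf                           # never re-measure
            y, x = np.unravel_index(np.argmax(erd), erd.shape)
            mask[y, x] = True                   # acquire the new measurement

        print(f"measured {mask.sum()} of {mask.size} pixels")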

    A PatchMatch-based Dense-field Algorithm for Video Copy-Move Detection and Localization

    Luca D'Amiano, Davide Cozzolino, Giovanni Poggi, Luisa Verdoliva
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a new algorithm for the reliable detection and localization of
    video copy-move forgeries. Discovering well-crafted video copy-moves may be
    very difficult, especially when some uniform background is copied to occlude
    foreground objects. To reliably detect both additive and occlusive copy-moves
    we use a dense-field approach, with invariant features that guarantee
    robustness to several post-processing operations. To limit complexity, a
    suitable video-oriented version of PatchMatch is used, with a multiresolution
    search strategy, and a focus on volumes of interest. Performance assessment
    relies on a new dataset, designed ad hoc, with realistic copy-moves and a wide
    variety of challenging situations. Experimental results show the proposed
    method to detect and localize video copy-moves with good accuracy even in
    adverse conditions.
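
    For readers unfamiliar with PatchMatch, the sketch below shows the generic
    single-scale 2D algorithm (random initialisation, scan-order propagation,
    exponentially shrinking random search); the paper's video-oriented variant
    adds invariant features, a multiresolution search and volumes of interest,
    none of which are reproduced here.

        import numpy as np

        rng = np.random.default_rng(1)

        def ssd(a, b, ay, ax, by, bx, p):
            """Sum of squared differences between two p x p patches."""
            return float(((a[ay:ay+p, ax:ax+p] - b[by:by+p, bx:bx+p]) ** 2).sum())

        def patchmatch(a, b, p=5, iters=4):
            """Single-scale PatchMatch computing a dense nearest-neighbour field."""
            h, w = a.shape[0] - p + 1, a.shape[1] - p + 1
            ny = rng.integers(0, h, (h, w))
            nx = rng.integers(0, w, (h, w))
            cost = np.array([[ssd(a, b, y, x, ny[y, x], nx[y, x], p)
                              for x in range(w)] for y in range(h)])
            for it in range(iters):
                rev = it % 2 == 1
                ys = range(h - 1, -1, -1) if rev else range(h)
                xs = range(w - 1, -1, -1) if rev else range(w)
                d = 1 if rev else -1      # look back at already-updated pixels
                for y in ys:
                    for x in xs:
                        # Propagation: adopt a neighbour's match, suitably shifted.
                        for cy, cx in ((y + d, x), (y, x + d)):
                            if 0 <= cy < h and 0 <= cx < w:
                                by = int(np.clip(ny[cy, cx] - (cy - y), 0, h - 1))
                                bx = int(np.clip(nx[cy, cx] - (cx - x), 0, w - 1))
                                c = ssd(a, b, y, x, by, bx, p)
                                if c < cost[y, x]:
                                    ny[y, x], nx[y, x], cost[y, x] = by, bx, c
                        # Random search around the current best match.
                        r = max(h, w)
                        while r >= 1:
                            by = int(np.clip(ny[y, x] + rng.integers(-r, r + 1), 0, h - 1))
                            bx = int(np.clip(nx[y, x] + rng.integers(-r, r + 1), 0, w - 1))
                            c = ssd(a, b, y, x, by, bx, p)
                            if c < cost[y, x]:
                                ny[y, x], nx[y, x], cost[y, x] = by, bx, c
                            r //= 2
            return ny, nx, cost

        img = rng.random((32, 32))
        # Matching an image against itself, as in copy-move detection; real
        # detectors exclude near-identity offsets when searching for clones.
        ny, nx, cost = patchmatch(img, img)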

    Recasting Residual-based Local Descriptors as Convolutional Neural Networks: an Application to Image Forgery Detection

    Davide Cozzolino, Giovanni Poggi, Luisa Verdoliva
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Local descriptors based on the image noise residual have proven extremely
    effective for a number of forensic applications, like forgery detection and
    localization. Nonetheless, motivated by promising results in computer vision,
    the focus of the research community is now shifting to deep learning. In this
    paper we show that a class of residual-based descriptors can actually be
    regarded as a simple constrained convolutional neural network (CNN). Then, by
    relaxing the constraints and fine-tuning the net on a relatively small
    training set, we obtain a significant performance improvement over the
    conventional detector.
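
    As a minimal sketch of the idea, the first layer of such a network can be
    a convolution initialised with a fixed high-pass residual kernel (a
    classic SRM-style filter); "relaxing the constraints" then amounts to
    making that kernel trainable before fine-tuning. This is an illustration
    in PyTorch, not the authors' exact architecture.

        import torch
        import torch.nn as nn

        # A standard 3x3 second-order high-pass kernel used by residual-based
        # forensic descriptors.
        srm = torch.tensor([[-1.,  2., -1.],
                            [ 2., -4.,  2.],
                            [-1.,  2., -1.]]) / 4.0

        conv = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
        with torch.no_grad():
            conv.weight.copy_(srm.view(1, 1, 3, 3))

        conv.weight.requires_grad = False   # constrained: fixed residual filter
        conv.weight.requires_grad = True    # relaxed: fine-tune on a small set

        x = torch.rand(1, 1, 8, 8)          # toy grayscale image
        residual = conv(x)                  # noise residual fed to later layers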

    Subspace Learning in The Presence of Sparse Structured Outliers and Noise

    Shervin Minaee, Yao Wang
    Journal-ref: IEEE International Symposium on Circuits and Systems, 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Subspace learning is an important problem, which has many applications in
    image and video processing. It can be used to find a low-dimensional
    representation of signals and images. But in many applications, the desired
    signal is heavily distorted by outliers and noise, which negatively affect the
    learned subspace. In this work, we present a novel algorithm for learning a
    subspace for signal representation, in the presence of structured outliers and
    noise. The proposed algorithm tries to jointly detect the outliers and learn
    the subspace for images. We present an alternating optimization algorithm for
    solving this problem, which iterates between learning the subspace and finding
    the outliers. The algorithm is trained on a large number of image
    patches, and the learned subspace is used for image segmentation; it is shown
    to achieve better segmentation results than prior methods, including least
    absolute deviation fitting, k-means clustering based segmentation in DjVu, and
    the shape primitive extraction and coding algorithm.
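
    A generic alternating scheme of this kind (the paper's actual formulation
    models structured sparse outliers; the rank, threshold rule and data below
    are assumptions) can be sketched as:

        import numpy as np

        rng = np.random.default_rng(0)
        # Synthetic data: a 2-D subspace in 10-D plus a few corrupted columns.
        basis = rng.standard_normal((10, 2))
        X = basis @ rng.standard_normal((2, 200))
        X[:, :15] += 5.0 * rng.standard_normal((10, 15))

        inlier = np.ones(X.shape[1], dtype=bool)
        for _ in range(10):
            # Step 1: learn a rank-2 subspace from the current inliers.
            U, _, _ = np.linalg.svd(X[:, inlier], full_matrices=False)
            P = U[:, :2] @ U[:, :2].T                 # projector onto subspace
            # Step 2: re-detect outliers from residual energy.
            resid = np.linalg.norm(X - P @ X, axis=0)
            inlier = resid < 3.0 * np.median(resid)   # assumed threshold rule

        print("flagged outliers:", int(np.sum(~inlier)))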

    Learning Background-Aware Correlation Filters for Visual Tracking

    Hamed Kiani Galoogahi, Ashton Fagg, Simon Lucey
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Correlation Filters (CFs) have recently demonstrated excellent performance in
    terms of rapidly tracking objects under challenging photometric and geometric
    variations. The strength of the approach comes from its ability to efficiently
    learn, “on the fly”, how the object is changing over time. A fundamental
    drawback of CFs, however, is that the background of the object is not
    modelled over time, which can lead to suboptimal results. In this paper we
    propose a Background-Aware CF that can model how both the foreground and
    background of the object vary over time. Our approach, like conventional CFs,
    is extremely computationally efficient, and extensive experiments over
    multiple tracking benchmarks demonstrate the superior accuracy and real-time
    performance of our method compared to state-of-the-art trackers, including
    those based on a deep learning paradigm.
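
    To make the mechanics concrete, here is a closed-form single-channel
    correlation filter in the Fourier domain (MOSSE-style); the proposed
    Background-Aware CF additionally brings real background patches into the
    learning objective, which this sketch does not do.

        import numpy as np

        def train_cf(patch, target, lam=1e-2):
            """Closed-form Fourier-domain correlation filter."""
            F = np.fft.fft2(patch)
            G = np.fft.fft2(target)
            return (G * np.conj(F)) / (F * np.conj(F) + lam)

        def respond(H, patch):
            return np.real(np.fft.ifft2(H * np.fft.fft2(patch)))

        size = 64
        y, x = np.mgrid[:size, :size]
        # Desired response: a Gaussian peak centred on the object.
        g = np.exp(-((y - size // 2) ** 2 + (x - size // 2) ** 2) / (2 * 3.0 ** 2))

        rng = np.random.default_rng(0)
        frame = rng.random((size, size))
        H = train_cf(frame, g)
        peak = np.unravel_index(np.argmax(respond(H, frame)), (size, size))
        print("response peak at", peak)    # ~ the centre, by construction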

    Fully Convolutional Networks to Detect Clinical Dermoscopic Features

    Jeremy Kawahara, Ghassan Hamarneh
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We use a pretrained fully convolutional neural network to detect clinical
    dermoscopic features from dermoscopy skin lesion images. We reformulate the
    superpixel classification task as an image segmentation problem, and extend a
    neural network architecture originally designed for image classification to
    detect dermoscopic features. Specifically, we interpolate the feature maps from
    several layers in the network to match the size of the input, concatenate the
    resized feature maps, and train the network to minimize a smoothed negative F1
    score. On the public validation leaderboard of the 2017 ISIC/ISBI Lesion
    Dermoscopic Feature Extraction Challenge, our approach achieves 89.3% AUROC,
    the highest averaged score among the three entries. Results on
    the private test leaderboard are still to be announced.
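
    A differentiable "smoothed" F1 can be written by replacing counts with
    sums of predicted probabilities; the following PyTorch sketch illustrates
    the general recipe (the paper's exact smoothing may differ):

        import torch

        def soft_f1_loss(probs, targets, eps=1e-7):
            """Negative smoothed F1: probabilities stand in for counts,
            so the loss is differentiable end-to-end."""
            tp = (probs * targets).sum()
            fp = (probs * (1 - targets)).sum()
            fn = ((1 - probs) * targets).sum()
            return 1 - 2 * tp / (2 * tp + fp + fn + eps)

        probs = torch.rand(4, 1, 32, 32, requires_grad=True)  # network outputs
        targets = (torch.rand(4, 1, 32, 32) > 0.9).float()    # sparse masks
        loss = soft_f1_loss(probs, targets)
        loss.backward()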

    Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow

    Francesco Giannini, Vincenzo Laveglia, Alessandro Rossi, Dario Zanca, Andrea Zugarini
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Mathematical Software (cs.MS); Machine Learning (stat.ML)

    This report provides an introduction to some Machine Learning tools within
    the most common development environments. It mainly focuses on practical
    problems, skipping any theoretical introduction. It is oriented to both
    students trying to approach Machine Learning and experts looking for new
    frameworks.

    Real-time 3D Human Tracking for Mobile Robots with Multisensors

    Mengmeng Wang, Daobilige Su, Lei Shi, Yong Liu, Jaime Valls Miro
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    Acquiring the accurate 3-D position of a target person around a robot
    provides fundamental and valuable information that is applicable to a wide
    range of robotic tasks, including home service, navigation and entertainment.
    This paper presents a real-time robotic 3-D human tracking system which
    combines a monocular camera with an ultrasonic sensor by the extended Kalman
    filter (EKF). The proposed system consists of three sub-modules: monocular
    camera sensor tracking model, ultrasonic sensor tracking model and multi-sensor
    fusion. An improved visual tracking algorithm is presented to provide partial
    location estimation (2-D). The algorithm is designed to overcome severe
    occlusions, scale variation, target missing and achieve robust re-detection.
    The scale accuracy is further enhanced by the estimated 3-D information. An
    ultrasonic sensor array is employed to provide the range information from the
    target person to the robot and Gaussian Process Regression is used for partial
    location estimation (2-D). EKF is adopted to sequentially process multiple,
    heterogeneous measurements arriving in an asynchronous order from the vision
    sensor and the ultrasonic sensor separately. In the experiments, the proposed
    tracking system is tested on both a simulation platform and an actual mobile
    robot in various indoor and outdoor scenes. The experimental results show the
    superior performance of the 3-D tracking system in terms of both accuracy
    and robustness.
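
    The fusion step can be illustrated with a generic EKF that applies
    whichever measurement arrives next; the planar random-walk state and the
    bearing/range models below are simplifying assumptions, not the paper's
    exact models.

        import numpy as np

        x = np.array([1.0, 0.5])        # target position relative to the robot
        P = np.eye(2)                   # state covariance
        Q = 0.01 * np.eye(2)            # random-walk process noise

        def ekf_update(x, P, z, h, H, R):
            """Generic EKF measurement update."""
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ (z - h(x))
            P = (np.eye(len(x)) - K @ H) @ P
            return x, P

        # Camera: bearing to the target (partial 2-D information).
        def h_cam(x): return np.array([np.arctan2(x[1], x[0])])
        def H_cam(x):
            r2 = x[0] ** 2 + x[1] ** 2
            return np.array([[-x[1] / r2, x[0] / r2]])

        # Ultrasonic array: range to the target.
        def h_us(x): return np.array([np.hypot(x[0], x[1])])
        def H_us(x):
            r = np.hypot(x[0], x[1])
            return np.array([[x[0] / r, x[1] / r]])

        # Asynchronous measurements: predict, then apply whichever arrived.
        for sensor, z, R in [("cam", np.array([0.50]), 0.01 * np.eye(1)),
                             ("us", np.array([1.20]), 0.05 * np.eye(1))]:
            P = P + Q
            h, H = (h_cam, H_cam(x)) if sensor == "cam" else (h_us, H_us(x))
            x, P = ekf_update(x, P, z, h, H, R)

        print("fused position estimate:", x)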

    Tree Memory Networks for Modelling Long-term Temporal Dependencies

    Tharindu Fernando, Simon Denman, Aaron McFadyen, Sridha Sridharan, Clinton Fookes
    Comments: This paper has been submitted to JMLR
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    In the domain of sequence modelling, Recurrent Neural Networks (RNN) have
    been capable of achieving impressive results in a variety of application areas
    including visual question answering, part-of-speech tagging and machine
    translation. However, this success in modelling short term dependencies has not
    successfully transitioned to application areas such as trajectory prediction,
    which require capturing both short term and long term relationships. In this
    paper, we propose a Tree Memory Network (TMN) for modelling long term and short
    term relationships in sequence-to-sequence mapping problems. The proposed
    network architecture is composed of an input module, controller and a memory
    module. In contrast to related literature, which models the memory as a
    sequence of historical states, we model the memory as a recursive tree
    structure. This structure more effectively captures temporal dependencies
    across both short term and long term sequences using its hierarchical
    structure. We demonstrate the effectiveness and flexibility of the proposed TMN
    in two practical problems, aircraft trajectory modelling and pedestrian
    trajectory modelling in a surveillance setting, and in both cases we outperform
    the current state-of-the-art. Furthermore, we perform an in-depth analysis of
    the evolution of the memory module content over time and provide visual
    evidence of how the proposed TMN is able to map both long term and short term
    relationships efficiently via a hierarchical structure.

    Geometry-Based Region Proposals for Real-Time Robot Detection of Tabletop Objects

    Alexander Broad, Brenna Argall
    Comments: Update based on work presented at RSS 2016 Deep Learning Workshop
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    We present a novel object detection pipeline for localization and recognition
    in three dimensional environments. Our approach makes use of an RGB-D sensor
    and combines state-of-the-art techniques from the robotics and computer vision
    communities to create a robust, real-time detection system. We focus
    specifically on solving the object detection problem for tabletop scenes, a
    common environment for assistive manipulators. Our detection pipeline first
    locates object clusters in a point cloud representation of the scene. These
    clusters are subsequently used to compute a bounding box around each object in
    the RGB
    space. Each defined patch is then fed into a Convolutional Neural Network (CNN)
    for object recognition. We also demonstrate that our region proposal method can
    be used to develop novel datasets that are both large and diverse enough to
    train deep learning models, and easy enough to collect that end-users can
    develop their own datasets. Lastly, we validate the resulting system through an
    extensive analysis of the accuracy and run-time of the full pipeline.


    Artificial Intelligence

    On Inconsistency Indices and Inconsistency Axioms in Pairwise Comparisons

    Jiri Mazurek
    Comments: 13 pages, 2 figures
    Subjects: Artificial Intelligence (cs.AI)

    Pairwise comparisons are an important tool of modern (multiple criteria)
    decision making. Since human judgments are often inconsistent, many studies
    have focused on ways to express and measure this inconsistency, and several
    inconsistency indices have been proposed as alternatives to Saaty's
    inconsistency index and inconsistency ratio for reciprocal pairwise comparison
    matrices. This paper aims, firstly, to introduce a new measure of inconsistency
    of pairwise comparisons and to prove its basic properties; secondly, to
    postulate an additional axiom, an upper boundary axiom, for an existing set of
    axioms; and lastly, to prove that selected inconsistency indices satisfy this
    additional axiom and to compare them numerically.
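
    For reference, the Saaty index that such alternatives are measured against
    is straightforward to compute (the matrix below is an arbitrary example):

        import numpy as np

        # A reciprocal pairwise comparison matrix (a_ji = 1 / a_ij).
        A = np.array([[1.0, 3.0, 0.5],
                      [1 / 3.0, 1.0, 2.0],
                      [2.0, 0.5, 1.0]])

        # Saaty's consistency index CI = (lambda_max - n) / (n - 1)
        # and consistency ratio CR = CI / RI.
        n = A.shape[0]
        lam_max = np.max(np.real(np.linalg.eigvals(A)))
        CI = (lam_max - n) / (n - 1)
        RI = 0.58                       # Saaty's random index for n = 3
        print("CI =", CI, " CR =", CI / RI)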

    Fuzzy Rankings: Properties and Applications

    Jiří Mazurek
    Comments: 11 pages, 11 figures
    Subjects: Artificial Intelligence (cs.AI)

    In practice, a ranking of objects with respect to given set of criteria is of
    considerable importance. However, due to lack of knowledge, information of time
    pressure, decision makers might not be able to provide a (crisp) ranking of
    objects from the top to the bottom. Instead, some objects might be ranked
    equally, or better than other objects only to some degree. In such cases, a
    generalization of crisp rankings to fuzzy rankings can be more useful. The aim
    of the article is to introduce the notion of a fuzzy ranking and to discuss its
    several properties, namely orderings, similarity and indecisiveness. The
    proposed approach can be used both for group decision making or multiple
    criteria decision making when uncertainty is involved.

    Neural Programming by Example

    Chengxun Shu, Hongyu Zhang
    Comments: 7 pages, Association for the Advancement of Artificial Intelligence (AAAI)
    Journal-ref: AAAI-2017
    Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Software Engineering (cs.SE)

    Programming by Example (PBE) targets at automatically inferring a computer
    program for accomplishing a certain task from sample input and output. In this
    paper, we propose a deep neural networks (DNN) based PBE model called Neural
    Programming by Example (NPBE), which can learn from input-output strings and
    induce programs that solve the string manipulation problems. Our NPBE model has
    four neural network based components: a string encoder, an input-output
    analyzer, a program generator, and a symbol selector. We demonstrate the
    effectiveness of NPBE by training it end-to-end to solve some common string
    manipulation problems in spreadsheet systems. The results show that our model
    can induce string manipulation programs effectively. Our work is one step
    towards teaching DNN to generate computer programs.

    Syntax-Preserving Belief Change Operators for Logic Programs

    Sebastian Binnewies (1), Zhiqiang Zhuang (1), Kewen Wang (1), Bela Stantic (1) ((1) School of Information and Communication Technology, Griffith University, Australia)
    Comments: 44 pages, submitted to ACM Transactions on Computational Logic
    Subjects: Artificial Intelligence (cs.AI)

    Recent methods have adapted the well-established AGM and belief base
    frameworks for belief change to cover belief revision in logic programs. In
    this study, we present two new sets of belief change operators for logic
    programs. They focus on preserving the explicit relationships expressed in the
    rules of a program, a feature that is missing in purely semantic approaches
    that consider programs only in their entirety. In particular, operators of the
    latter class fail to satisfy preservation and support, two important properties
    for belief change in logic programs required to ensure intuitive results.

    We address this shortcoming of existing approaches by introducing partial
    meet and ensconcement constructions for logic program belief change, which
    allow us to define syntax-preserving operators that satisfy preservation and
    support. Our work is novel in that our constructions not only preserve more
    information from a logic program during a change operation than existing ones,
    but they also facilitate natural definitions of contraction operators, the
    first in the field to the best of our knowledge.

    In order to evaluate the rationality of our operators, we translate the
    revision and contraction postulates from the AGM and belief base frameworks to
    the logic programming setting. We show that our operators fully comply with the
    belief base framework and formally state the interdefinability between our
    operators. We further propose an algorithm that is based on modularising a
    logic program to reduce partial meet and ensconcement revisions or contractions
    to performing the operation only on the relevant modules of that program.
    Finally, we compare our approach to two state-of-the-art logic program revision
    methods and demonstrate that our operators address the shortcomings of one and
    generalise the other method.

    Emergence of Grounded Compositional Language in Multi-Agent Populations

    Igor Mordatch, Pieter Abbeel
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    By capturing statistical patterns in large corpora, machine learning has
    enabled significant advances in natural language processing, including in
    machine translation, question answering, and sentiment analysis. However, for
    agents to intelligently interact with humans, simply capturing the statistical
    patterns is insufficient. In this paper we investigate if, and how, grounded
    compositional language can emerge as a means to achieve goals in multi-agent
    populations. Towards this end, we propose a multi-agent learning environment
    and learning methods that bring about emergence of a basic compositional
    language. This language is represented as streams of abstract discrete symbols
    uttered by agents over time, but nonetheless has a coherent structure that
    possesses a defined vocabulary and syntax. We also observe emergence of
    non-verbal communication such as pointing and guiding when language
    communication is unavailable.

    Exploring the Combination Rules of D Numbers From a Perspective of Conflict Redistribution

    Xinyang Deng, Wen Jiang
    Comments: 6 pages, 4 figures
    Subjects: Artificial Intelligence (cs.AI)

    Dempster-Shafer theory of evidence is widely applied to uncertainty modelling
    and knowledge reasoning because of its advantages in dealing with uncertain
    information. But some conditions or requirements, such as the exclusiveness
    hypothesis and the completeness constraint, limit the development and
    application of that theory to a large extent. To overcome these shortcomings
    and enhance its capability of representing uncertainty, a novel model, called D
    numbers, has been proposed recently. However, many key issues, for example how
    to implement the combination of D numbers, remain unsolved. In this paper, we
    explore the combination of D numbers from a perspective of conflict
    redistribution, and propose two combination rules, suitable for different
    situations, for the fusion of two D numbers. The proposed combination rules
    reduce to the classical Dempster's rule of Dempster-Shafer theory under certain
    conditions. Numerical examples and a discussion of the proposed rules are also
    given in the paper.
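
    For context, the classical rule that the proposed rules reduce to can be
    written in a few lines (mass functions over frozenset focal elements; the
    numbers are arbitrary):

        from itertools import product

        def dempster(m1, m2):
            """Classical Dempster's rule: combine intersecting focal
            elements and renormalise by the total non-conflicting mass."""
            combined, conflict = {}, 0.0
            for (a, x), (b, y) in product(m1.items(), m2.items()):
                inter = a & b
                if inter:
                    combined[inter] = combined.get(inter, 0.0) + x * y
                else:
                    conflict += x * y
            return {k: v / (1.0 - conflict) for k, v in combined.items()}

        theta = frozenset("ab")
        m1 = {frozenset("a"): 0.6, theta: 0.4}
        m2 = {frozenset("b"): 0.3, theta: 0.7}
        print(dempster(m1, m2))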

    Towards Moral Autonomous Systems

    Vicky Charisi, Louise Dennis, Michael Fisher, Robert Lieck, Andreas Matthias, Marija Slavkovik, Janina Sombetzki, Alan F. T. Winfield, Roman Yampolskiy
    Subjects: Artificial Intelligence (cs.AI)

    Both the ethics of autonomous systems and the problems of their technical
    implementation have by now been studied in some detail. Less attention has been
    given to the areas in which these two separate concerns meet. This paper,
    written by both philosophers and engineers of autonomous systems, addresses a
    number of issues in machine ethics that are located at precisely the
    intersection between ethics and engineering. We first discuss different
    approaches towards the conceptual design of autonomous systems and their
    implications for implementing ethics in such systems. Then we examine
    problematic areas regarding the specification and verification of ethical
    behavior in autonomous systems, particularly with a view towards the
    requirements of future legislation. We discuss transparency and accountability
    issues that will be crucial for any future wide deployment of autonomous
    systems in society. Finally, we consider the often overlooked possibility of
    intentional misuse of AI systems and the possible dangers arising out of
    deliberately unethical design, implementation, and use of autonomous robots.

    Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making

    Qi Zhang, Satinder Singh, Edmund Durfee
    Subjects: Artificial Intelligence (cs.AI)

    In cooperative multiagent planning, it can often be beneficial for an agent
    to make commitments about aspects of its behavior to others, allowing them in
    turn to plan their own behaviors without taking the agent’s detailed behavior
    into account. Extending previous work in the Bayesian setting, we consider
    instead a worst-case setting in which the agent has a set of possible
    environments (MDPs) it could be in, and develop a commitment semantics that
    allows for probabilistic guarantees on the agent’s behavior in any of the
    environments it could end up facing. Crucially, an agent receives observations
    (of reward and state transitions) that allow it to potentially eliminate
    possible environments and thus obtain higher utility by adapting its policy to
    the history of observations. We develop algorithms and provide theory and some
    preliminary empirical results showing that they ensure an agent meets its
    commitments with history-dependent policies while minimizing maximum regret
    over the possible environments.

    InScript: Narrative texts annotated with script information

    Ashutosh Modi, Tatjana Anikina, Simon Ostermann, Manfred Pinkal
    Comments: Paper accepted at LREC 2016, 9 pages, The corpus can be downloaded at: this http URL
    Journal-ref: LREC 2016
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper presents the InScript corpus (Narrative Texts Instantiating Script
    structure). InScript is a corpus of 1,000 stories centered around 10 different
    scenarios. Verbs and noun phrases are annotated with event and participant
    types, respectively. Additionally, the text is annotated with coreference
    information. The corpus shows rich lexical variation and will serve as a unique
    resource for the study of the role of script knowledge in natural language
    processing.

    Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers

    Jacob Steinhardt, Moses Charikar, Gregory Valiant
    Comments: 18 pages, pre-print
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC); Cryptography and Security (cs.CR); Machine Learning (stat.ML)

    We introduce a criterion, resilience, which allows properties of a dataset
    (such as its mean or best low rank approximation) to be robustly computed, even
    in the presence of a large fraction of arbitrary additional data. Resilience is
    a weaker condition than most other properties considered so far in the
    literature, and yet enables robust estimation in a broader variety of settings,
    including the previously unstudied problem of robust mean estimation in
    $\ell_p$-norms.
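
    The setting can be illustrated with the simplest case, mean estimation
    under a constant fraction of arbitrary contamination; the trimmed mean
    below is only a classical baseline, not the paper's resilience-based
    estimator.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        n, eps = 1000, 0.1
        clean = rng.normal(0.0, 1.0, size=int(n * (1 - eps)))
        bad = rng.normal(8.0, 0.1, size=int(n * eps))   # adversarial additions
        data = np.concatenate([clean, bad])

        print("plain mean:  ", data.mean())                 # pulled off by bad
        print("trimmed mean:", stats.trim_mean(data, eps))  # robust baseline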

    FastQA: A Simple and Efficient Neural Architecture for Question Answering

    Dirk Weissenborn, Georg Wiese, Laura Seiffe
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    The recent development of large-scale question answering (QA) datasets triggered
    a substantial amount of research into end-to-end neural architectures for QA.
    Increasingly complex systems have been conceived without comparison to a
    simpler neural baseline system that would justify their complexity. In this
    work, we propose a simple heuristic that guided the development of FastQA, an
    efficient end-to-end neural model for question answering that is very
    competitive with existing models. We further demonstrate that an extended
    version (FastQAExt) achieves state-of-the-art results on recent benchmark
    datasets, namely SQuAD, NewsQA and MsMARCO, outperforming most existing models.
    However, we show that increasing the complexity of FastQA to FastQAExt does not
    yield any systematic improvements. We argue that the same holds true for most
    existing systems that are similar to FastQAExt. A manual analysis reveals that
    our proposed heuristic explains most predictions of our model, which indicates
    that modeling a simple heuristic is enough to achieve strong performance on
    extractive QA datasets. The overall strong performance of FastQA puts results
    of existing, more complex models into perspective.

    Weighted Voting Via No-Regret Learning

    Nika Haghtalab, Ritesh Noothigattu, Ariel D. Procaccia
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Learning (cs.LG); Multiagent Systems (cs.MA)

    Voting systems typically treat all voters equally. We argue that perhaps they
    should not: Voters who have supported good choices in the past should be given
    higher weight than voters who have supported bad ones. To develop a formal
    framework for desirable weighting schemes, we draw on no-regret learning.
    Specifically, given a voting rule, we wish to design a weighting scheme such
    that applying the voting rule, with voters weighted by the scheme, leads to
    choices that are almost as good as those endorsed by the best voter in
    hindsight. We derive possibility and impossibility results for the existence of
    such weighting schemes, depending on whether the voting rule and the weighting
    scheme are deterministic or randomized, as well as on the social choice axioms
    satisfied by the voting rule.
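
    A toy version of the idea, using a generic multiplicative-weights
    (no-regret) update with binary alternatives (the paper treats general
    voting rules and gives possibility/impossibility results, which this
    sketch does not capture):

        import numpy as np

        rng = np.random.default_rng(0)
        n_voters, rounds, eta = 5, 100, 0.3
        w = np.ones(n_voters)
        skill = np.linspace(0.55, 0.9, n_voters)  # hypothetical accuracies

        correct = 0
        for t in range(rounds):
            truth = rng.integers(0, 2)            # the "good" choice this round
            votes = np.where(rng.random(n_voters) < skill, truth, 1 - truth)
            choice = int(np.round(np.average(votes, weights=w)))  # weighted vote
            correct += int(choice == truth)
            w *= np.exp(-eta * (votes != truth))  # penalise mistaken voters

        print("weighted-vote accuracy:", correct / rounds)
        print("final weights:", np.round(w / w.sum(), 3))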

    Understanding Black-box Predictions via Influence Functions

    Pang Wei Koh, Percy Liang
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    How can we explain the predictions of a black-box model? In this paper, we
    use influence functions — a classic technique from robust statistics — to
    trace a model’s prediction through the learning algorithm and back to its
    training data, identifying the points most responsible for a given prediction.
    Applying ideas from second-order optimization, we scale up influence functions
    to modern machine learning settings and show that they can be applied to
    high-dimensional black-box models, even in non-convex and non-differentiable
    settings. We give a simple, efficient implementation that requires only oracle
    access to gradients and Hessian-vector products. On linear models and
    convolutional neural networks, we demonstrate that influence functions are
    useful for many different purposes: to understand model behavior, debug models
    and detect dataset errors, and even identify and exploit vulnerabilities to
    adversarial training-set attacks.
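
    The core computation is easy to write down for a model whose Hessian is
    explicit; the ridge-regression sketch below follows the up-weighting
    formulation (at scale, the paper replaces the explicit solve with
    Hessian-vector products):

        import numpy as np

        rng = np.random.default_rng(0)
        n, d, lam = 200, 5, 1e-2
        X = rng.standard_normal((n, d))
        y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

        # Ridge regression fitted in closed form; H is the loss Hessian.
        H = X.T @ X / n + lam * np.eye(d)
        theta = np.linalg.solve(H, X.T @ y / n)

        # Influence of up-weighting training point i on the test loss:
        # I(i) = -grad_test^T H^{-1} grad_i  (positive = harmful).
        x_test, y_test = rng.standard_normal(d), 0.0
        g_test = (x_test @ theta - y_test) * x_test
        G = (X @ theta - y)[:, None] * X          # per-example gradients
        influence = -G @ np.linalg.solve(H, g_test)
        print("most harmful training point:", int(np.argmax(influence)))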

    A computational investigation of sources of variability in sentence comprehension difficulty in aphasia

    Paul Mätzig, Shravan Vasishth, Felix Engelmann, David Caplan
    Comments: 6 pages, 2 figures, 2 tables, submitted to MathPsych/ICCM 2017, Warwick, UK
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    We present a computational evaluation of three hypotheses about sources of
    deficit in sentence comprehension in aphasia: slowed processing, intermittent
    deficiency, and resource reduction. The ACT-R based Lewis & Vasishth 2005 model
    is used to implement these three proposals. Slowed processing is implemented as
    slowed default production-rule firing time; intermittent deficiency as
    increased random noise in activation of chunks in memory; and resource
    reduction as reduced goal activation. As data, we considered subject vs. object
    relatives presented in a self-paced listening modality to 56 individuals with
    aphasia (IWA) and 46 matched controls. The participants heard the sentences and
    carried out a picture verification task to decide on an interpretation of the
    sentence. These response accuracies are used to identify the best parameters
    (for each participant) that correspond to the three hypotheses mentioned above.
    We show that controls have more tightly clustered (less variable) parameter
    values than IWA; specifically, compared to controls, among IWA there are more
    individuals with low goal activations, high noise, and slow default action
    times. This suggests that (i) individual patients show differential amounts of
    deficit along the three dimensions of slowed processing, intermittent
    deficiency, and resource reduction, (ii) overall, there is evidence for all
    three sources of deficit playing a role, and (iii) IWA have a more variable
    range of parameter values than controls. In sum, this study contributes a proof
    of concept of a quantitative implementation of, and evidence for, these three
    accounts of comprehension deficits in aphasia.

    Learning best K analogies from data distribution for case-based software effort estimation

    Mohammad Azzeh, Yousef Elsheikh
    Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)

    Case-Based Reasoning (CBR) has been widely used to generate good software
    effort estimates. The predictive performance of CBR is dataset dependent and
    subject to an extremely large space of configuration possibilities. Regardless
    of the type of adaptation technique, deciding on the optimal number of similar
    cases to be used before applying CBR is a key challenge. In this paper we
    propose a new technique based on the Bisecting k-medoids clustering algorithm
    to better understand the structure of a dataset and discover the …

    Fuzzy Model Tree For Early Effort Estimation

    Mohammad Azzeh, Ali Bou Nassif
    Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)

    Use Case Points (UCP) is a well-known method to estimate project size, based
    on the Use Case diagram, at early phases of software development. Although the
    Use Case diagram is widely accepted as a de facto model for analyzing
    object-oriented software requirements around the world, the UCP method has not
    received a sufficient amount of attention because, as yet, there is no
    consensus on how to produce software effort from UCP. This paper aims to study
    the potential of using a Fuzzy Model Tree to derive effort estimates …

    A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation

    Chunpeng Wu, Wei Wen, Tariq Afzal, Yongmei Zhang, Yiran Chen, Hai Li
    Comments: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’17)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Recently, DNN model compression based on network architecture design, e.g.,
    SqueezeNet, has attracted a lot of attention. Compared to well-known models, no
    accuracy drop on image classification is observed for these extremely compact
    networks. An emerging question, however, is whether these model compression
    techniques hurt a DNN's learning abilities beyond classifying images on a
    single dataset.
    Our preliminary experiment shows that these compression methods could degrade
    domain adaptation (DA) ability, though the classification performance is
    preserved. Therefore, we propose a new compact network architecture and
    unsupervised DA method in this paper. The DNN is built on a new basic module
    Conv-M which provides more diverse feature extractors without significantly
    increasing parameters. The unified framework of our DA method will
    simultaneously learn invariance across domains, reduce divergence of feature
    representations, and adapt label prediction. Our DNN has 4.1M parameters, which
    is only 6.7% of AlexNet or 59% of GoogLeNet. Experiments show that our DNN
    obtains GoogLeNet-level accuracy both on classification and DA, and our DA
    method slightly outperforms previous competitive ones. Altogether, our DA
    strategy based on our DNN achieves state-of-the-art results on sixteen of the
    eighteen DA tasks on the popular Office-31 and Office-Caltech datasets.


    Information Retrieval

    Character-based Neural Embeddings for Tweet Clustering

    Svitlana Vakulenko, Lyndon Nixon, Mihai Lupu
    Comments: Accepted at the SocialNLP 2017 workshop held in conjunction with EACL 2017, April 3, 2017, Valencia, Spain
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    In this paper we show how the performance of tweet clustering can be improved
    by leveraging character-based neural networks. The proposed approach overcomes
    the limitations related to vocabulary explosion in word-based models
    and allows for the seamless processing of multilingual content. Our
    evaluation results and code are available online at
    this https URL clustering

    Distributed-Representation Based Hybrid Recommender System with Short Item Descriptions

    Junhua He, Hankz Hankui Zhuo, Jarvan Law
    Comments: 10 pages, 5 figures
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Collaborative filtering (CF) aims to build a model from users’ past behaviors
    and/or similar decisions made by other users, and use the model to recommend
    items for users. Despite the success of previous collaborative filtering
    approaches, they are all based on the assumption that there are sufficient
    rating scores available for building high-quality recommendation models. In
    real world applications, however, it is often difficult to collect sufficient
    rating scores, especially when new items are introduced into the system, which
    makes the recommendation task challenging. We find that there are often “short”
    texts describing features of items, based on which we can approximate the
    similarity of items and make recommendations together with rating scores. In
    this paper we “borrow” the idea of vector representation of words to capture
    the information of short texts and embed it into a matrix factorization
    framework. We empirically show that our approach is effective by comparing it
    with state-of-the-art approaches.

    End-to-end Binary Representation Learning via Direct Binary Embedding

    Liu Liu, Alireza Rahimpour, Ali Taalimi, Hairong Qi
    Comments: Submitted to ICIP’17
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    Learning binary representation is essential to large-scale computer vision
    tasks. Most existing algorithms require a separate quantization constraint to
    learn effective hashing functions. In this work, we present Direct Binary
    Embedding (DBE), a simple yet very effective algorithm to learn binary
    representation in an end-to-end fashion. By appending an ingeniously designed
    DBE layer to the deep convolutional neural network (DCNN), DBE learns binary
    code directly from the continuous DBE layer activation without quantization
    error. By employing the deep residual network (ResNet) as the DCNN component,
    DBE captures rich semantics from images. Furthermore, in an effort to handle
    multilabel images, we design a joint cross entropy loss that includes both
    softmax cross entropy and weighted binary cross entropy, in consideration of
    the
    correlation and independence of labels, respectively. Extensive experiments
    demonstrate the significant superiority of DBE over state-of-the-art methods on
    tasks of natural object recognition, image retrieval and image annotation.
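
    The gist of the approach can be caricatured in a few lines: drive a
    real-valued embedding towards binary values during training and read codes
    off by thresholding, with no separate quantisation step. The exact DBE
    layer design in the paper differs; this is only a schematic PyTorch sketch.

        import torch
        import torch.nn as nn

        class BinaryEmbeddingHead(nn.Module):
            """Schematic DBE-style head on top of DCNN features."""
            def __init__(self, in_dim=512, bits=64):
                super().__init__()
                self.fc = nn.Linear(in_dim, bits)

            def forward(self, feats):
                return torch.sigmoid(self.fc(feats))  # continuous, near-binary

        head = BinaryEmbeddingHead()
        feats = torch.randn(8, 512)            # e.g. ResNet pooled features
        codes = (head(feats) > 0.5).int()      # binary codes, no quantizer
        print(codes.shape)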

    Ensemble of Neural Classifiers for Scoring Knowledge Base Triples

    Ikuya Yamada, Motoki Sato, Hiroyuki Shindo
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    This paper describes our approach for the triple scoring task at WSDM Cup
    2017. The task aims to assign a relevance score for each pair of entities and
    their types in a knowledge base in order to enhance the ranking results in
    entity retrieval tasks. We propose an approach wherein the outputs of multiple
    neural network classifiers are combined using a supervised machine learning
    model. The experimental results show that our proposed method achieves the best
    performance in one out of three measures, and performs competitively in the
    other two measures.


    Computation and Language

    InScript: Narrative texts annotated with script information

    Ashutosh Modi, Tatjana Anikina, Simon Ostermann, Manfred Pinkal
    Comments: Paper accepted at LREC 2016, 9 pages, The corpus can be downloaded at: this http URL
    Journal-ref: LREC 2016
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper presents the InScript corpus (Narrative Texts Instantiating Script
    structure). InScript is a corpus of 1,000 stories centered around 10 different
    scenarios. Verbs and noun phrases are annotated with event and participant
    types, respectively. Additionally, the text is annotated with coreference
    information. The corpus shows rich lexical variation and will serve as a unique
    resource for the study of the role of script knowledge in natural language
    processing.

    Is this word borrowed? An automatic approach to quantify the likeliness of borrowing in social media

    Jasabanta Patro, Bidisha Samanta, Saurabh Singh, Prithwish Mukherjee, Monojit Choudhury, Animesh Mukherjee
    Comments: 11 pages, 3 Figures
    Subjects: Computation and Language (cs.CL)

    Code-mixing or code-switching are the effortless phenomena of natural
    switching between two or more languages in a single conversation. Use of a
    foreign word in a language; however, does not necessarily mean that the speaker
    is code-switching because often languages borrow lexical items from other
    languages. If a word is borrowed, it becomes a part of the lexicon of a
    language; whereas, during code-switching, the speaker is aware that the
    conversation involves foreign words or phrases. Identifying whether a foreign
    word used by a bilingual speaker is due to borrowing or code-switching is of
    fundamental importance to theories of multilingualism, and an essential
    prerequisite for the development of language and speech technologies for
    multilingual communities. In this paper, we present a series of novel
    computational methods to identify the borrowed likeliness of a word, based on
    social media signals. We first propose a context-based clustering method to
    sample a set of candidate words from the social media data. Next, we propose
    three novel and similar metrics based on the usage of these words by the users
    in different tweets; these metrics were used to score and rank the candidate
    words indicating their borrowed likeliness. We compare these rankings with a
    ground truth ranking constructed through a human judgment experiment. The
    Spearman’s rank correlation between the two rankings (nearly 0.62 for all the
    three metric variants) is more than double the value (0.26) of the most
    competitive existing baseline reported in the literature. Some other striking
    observations are: (i) the correlation is higher for the ground truth data
    elicited from the younger participants (age less than 30) than for that from
    the older participants, and (ii) those participants who use mixed language for
    tweeting the least provide the best signals of borrowing.

    SyntaxNet Models for the CoNLL 2017 Shared Task

    Chris Alberti, Daniel Andor, Ivan Bogatyy, Michael Collins, Dan Gillick, Lingpeng Kong, Terry Koo, Ji Ma, Mark Omernick, Slav Petrov, Chayut Thanapirom, Zora Tung, David Weiss
    Comments: Tech report
    Subjects: Computation and Language (cs.CL)

    We describe a baseline dependency parsing system for the CoNLL 2017 Shared
    Task. This system, which we call “ParseySaurus,” uses the DRAGNN framework
    [Kong et al, 2017] to combine transition-based recurrent parsing and tagging
    with character-based word representations. On the v1.3 Universal Dependencies
    Treebanks, the new system outperforms the publicly available, state-of-the-art
    “Parsey’s Cousins” models by 3.47% absolute Labeled Accuracy Score (LAS) across
    52 treebanks.

    Ensemble of Neural Classifiers for Scoring Knowledge Base Triples

    Ikuya Yamada, Motoki Sato, Hiroyuki Shindo
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    This paper describes our approach for the triple scoring task at WSDM Cup
    2017. The task aims to assign a relevance score for each pair of entities and
    their types in a knowledge base in order to enhance the ranking results in
    entity retrieval tasks. We propose an approach wherein the outputs of multiple
    neural network classifiers are combined using a supervised machine learning
    model. The experimental results show that our proposed method achieves the best
    performance in one out of three measures, and performs competitively in the
    other two measures.

    Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets

    Zhen Yang, Wei Chen, Feng Wang, Bo Xu
    Comments: 9 pages , 3 figures, 3 tables
    Subjects: Computation and Language (cs.CL)

    This paper proposes a new route for applying generative adversarial nets
    (GANs) to NLP tasks, taking neural machine translation as an instance, and
    shows that the widespread perspective that GANs cannot work well in the NLP
    area is unfounded. In this work, we build a conditional sequence generative
    adversarial net which comprises two adversarial sub-models: a generative
    model (generator) which translates the source sentence into the target
    sentence, as traditional NMT models do, and a discriminative model
    (discriminator)
    which discriminates the machine-translated target sentence from the
    human-translated sentence. From the perspective of the Turing test, the
    proposed model aims to generate translations that are indistinguishable from
    human-translated ones. Experiments show that the proposed model achieves
    significant improvements over the traditional NMT model. In Chinese-English
    translation tasks, we obtain up to +2.0 BLEU points of improvement. To the best
    of our knowledge, this is the first time that quantitative results on the
    application of GANs to a traditional NLP task are reported. Meanwhile, we
    present detailed strategies for GAN training. In addition, we find that the
    discriminator of the proposed model shows great capability in data cleaning.

    Sparse Named Entity Classification using Factorization Machines

    Ai Hirata, Mamoru Komachi
    Comments: 4+1 pages
    Subjects: Computation and Language (cs.CL)

    Named entity classification is the task of classifying text-based elements
    into various categories, including places, names, dates, times, and monetary
    values. A bottleneck in named entity classification, however, is data
    sparseness: new named entities continually emerge, making it rather difficult
    to maintain a dictionary for named entity classification.
    Thus, in this paper, we address the problem of named entity classification
    using matrix factorization to overcome the problem of feature sparsity.
    Experimental results show that our proposed model, with fewer features and a
    smaller size, achieves accuracy competitive with state-of-the-art models.
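
    For readers new to factorization machines, the second-order model is
    y(x) = w0 + <w, x> + sum_{i<j} <v_i, v_j> x_i x_j, computable in O(kd)
    with a standard identity; a minimal sketch (random parameters, purely
    illustrative):

        import numpy as np

        def fm_score(x, w0, w, V):
            """Second-order factorization machine score, using the identity
            sum_{i<j} <v_i,v_j> x_i x_j
            = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ]."""
            s = V.T @ x
            pairwise = 0.5 * (s @ s - ((V ** 2).T @ (x ** 2)).sum())
            return w0 + w @ x + pairwise

        rng = np.random.default_rng(0)
        d, k = 10, 4                    # features, latent dimensions
        x = rng.random(d)
        print(fm_score(x, 0.1, rng.standard_normal(d), rng.standard_normal((d, k))))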

    Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling

    Diego Marcheggiani, Ivan Titov
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Semantic role labeling (SRL) is the task of identifying the
    predicate-argument structure of a sentence. It is typically regarded as an
    important step in the standard natural language processing pipeline, providing
    information to downstream tasks such as information extraction and question
    answering. As the semantic representations are closely related to syntactic
    ones, we exploit syntactic information in our model. We propose a version of
    graph convolutional networks (GCNs), a recent class of multilayer neural
    networks operating on graphs, suited to modeling syntactic dependency graphs.
    GCNs over syntactic dependency trees are used as sentence encoders, producing
    latent feature representations of words in a sentence and capturing information
    relevant to predicting the semantic representations. We observe that GCN layers
    are complementary to LSTM ones: when we stack both GCN and LSTM layers, we
    obtain a substantial improvement over an already state-of-the-art LSTM SRL
    model, resulting in the best reported scores on the standard benchmark
    (CoNLL-2009) both for Chinese and English.
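
    The basic layer is compact: each word's representation is updated from its
    syntactic neighbours. The sketch below uses a plain undirected, unlabelled
    adjacency with self-loops; the paper's GCN additionally uses directed,
    labelled edges and gates.

        import numpy as np

        def gcn_layer(A, H, W):
            """One graph-convolutional layer: degree-normalised neighbour
            aggregation followed by a ReLU."""
            deg = A.sum(axis=1, keepdims=True)
            return np.maximum(0.0, (A / deg) @ H @ W)

        # Toy 4-word sentence with dependency edges 0-1, 1-2, 1-3.
        A = np.eye(4)
        for i, j in [(0, 1), (1, 2), (1, 3)]:
            A[i, j] = A[j, i] = 1.0

        rng = np.random.default_rng(0)
        H = rng.standard_normal((4, 16))   # word states (e.g. from an LSTM)
        W = rng.standard_normal((16, 16))
        print(gcn_layer(A, H, W).shape)    # (4, 16): one vector per word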

    FastQA: A Simple and Efficient Neural Architecture for Question Answering

    Dirk Weissenborn, Georg Wiese, Laura Seiffe
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    The recent development of large-scale question answering (QA) datasets triggered
    a substantial amount of research into end-to-end neural architectures for QA.
    Increasingly complex systems have been conceived without comparison to a
    simpler neural baseline system that would justify their complexity. In this
    work, we propose a simple heuristic that guided the development of FastQA, an
    efficient end-to-end neural model for question answering that is very
    competitive with existing models. We further demonstrate that an extended
    version (FastQAExt) achieves state-of-the-art results on recent benchmark
    datasets, namely SQuAD, NewsQA and MsMARCO, outperforming most existing models.
    However, we show that increasing the complexity of FastQA to FastQAExt does not
    yield any systematic improvements. We argue that the same holds true for most
    existing systems that are similar to FastQAExt. A manual analysis reveals that
    our proposed heuristic explains most predictions of our model, which indicates
    that modeling a simple heuristic is enough to achieve strong performance on
    extractive QA datasets. The overall strong performance of FastQA puts results
    of existing, more complex models into perspective.

    Extending Automatic Discourse Segmentation for Texts in Spanish to Catalan

    Iria da Cunha, Eric SanJuan, Juan-Manuel Torres-Moreno, Irene Castellón
    Journal-ref: Proceedings of the First Workshop on Modeling, Learning and Mining
    for Cross/Multilinguality (MultiLingMine 2016), 38th European Conference on
    Information Retrieval (ECIR 2016)
    Subjects: Computation and Language (cs.CL)

    At present, automatic discourse analysis is a relevant research topic in the
    field of NLP. However, discourse is one of the most difficult phenomena to
    process. Although discourse parsers have already been developed for several
    languages, no such tool exists for Catalan. In order to implement this
    kind of parser, the first step is to develop a discourse segmenter. In this
    article we present the first discourse segmenter for texts in Catalan. This
    segmenter is based on Rhetorical Structure Theory (RST) for Spanish, and uses
    lexical and syntactic information to translate rules valid for Spanish into
    rules for Catalan. We have evaluated the system using a gold standard corpus
    of manually segmented texts, and the results are promising.

    A computational investigation of sources of variability in sentence comprehension difficulty in aphasia

    Paul Mätzig, Shravan Vasishth, Felix Engelmann, David Caplan
    Comments: 6 pages, 2 figures, 2 tables, submitted to MathPsych/ICCM 2017, Warwick, UK
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    We present a computational evaluation of three hypotheses about sources of
    deficit in sentence comprehension in aphasia: slowed processing, intermittent
    deficiency, and resource reduction. The ACT-R based Lewis & Vasishth 2005 model
    is used to implement these three proposals. Slowed processing is implemented as
    slowed default production-rule firing time; intermittent deficiency as
    increased random noise in activation of chunks in memory; and resource
    reduction as reduced goal activation. As data, we considered subject vs. object
    relatives presented in a self-paced listening modality to 56 individuals with
    aphasia (IWA) and 46 matched controls. The participants heard the sentences and
    carried out a picture verification task to decide on an interpretation of the
    sentence. These response accuracies are used to identify the best parameters
    (for each participant) that correspond to the three hypotheses mentioned above.
    We show that controls have more tightly clustered (less variable) parameter
    values than IWA; specifically, compared to controls, among IWA there are more
    individuals with low goal activations, high noise, and slow default action
    times. This suggests that (i) individual patients show differential amounts of
    deficit along the three dimensions of slowed processing, intermittent
    deficiency, and resource reduction, (ii) overall, there is evidence for all
    three sources of deficit playing a role, and (iii) IWA have a more variable
    range of parameter values than controls. In sum, this study contributes a proof
    of concept of a quantitative implementation of, and evidence for, these three
    accounts of comprehension deficits in aphasia.

    Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks

    Vardaan Pahuja, Anirban Laha, Shachar Mirkin, Vikas Raykar, Lili Kotlerman, Guy Lev
    Comments: Submitted to Interspeech 2017
    Subjects: Computation and Language (cs.CL)

    The stream of words produced by Automatic Speech Recognition (ASR) systems is
    devoid of punctuation and formatting. Most natural language processing
    applications expect segmented and well-formatted texts as input, which
    is not available in ASR output. This paper proposes a novel technique for
    jointly modelling multiple correlated tasks such as punctuation and
    capitalization using bidirectional recurrent neural networks, which leads to
    improved performance for each of these tasks. This method can be extended for
    joint modelling of any other correlated multiple sequence labelling tasks.
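
    A schematic of such a jointly trained model, with a shared bidirectional
    LSTM encoder and one output head per task (dimensions, vocabulary and tag
    sets below are placeholders):

        import torch
        import torch.nn as nn

        class JointTagger(nn.Module):
            """Shared BiLSTM with two heads: punctuation and capitalization."""
            def __init__(self, vocab=1000, dim=64, punct_tags=4, cap_tags=2):
                super().__init__()
                self.emb = nn.Embedding(vocab, dim)
                self.rnn = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
                self.punct = nn.Linear(2 * dim, punct_tags)
                self.cap = nn.Linear(2 * dim, cap_tags)

            def forward(self, tokens):
                h, _ = self.rnn(self.emb(tokens))
                return self.punct(h), self.cap(h)

        model = JointTagger()
        tokens = torch.randint(0, 1000, (2, 12))   # batch of ASR word ids
        punct_logits, cap_logits = model(tokens)
        # Joint training sums the per-task cross-entropies over the sequence.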

    Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering

    Junbei Zhang, Xiaodan Zhu, Qian Chen, Lirong Dai, Hui Jiang
    Subjects: Computation and Language (cs.CL)

    The last several years have seen intensive interest in exploring
    neural-network-based models for machine comprehension (MC) and question
    answering (QA). In this paper, we approach the problems by closely modelling
    questions in a neural network framework. We first introduce syntactic
    information to help encode questions. We then view and model different types of
    questions and the information shared among them as an adaptation task and
    propose adaptation models for them. On the Stanford Question Answering Dataset
    (SQuAD), we show that these approaches can help attain better results over a
    competitive baseline.

    Character-based Neural Embeddings for Tweet Clustering

    Svitlana Vakulenko, Lyndon Nixon, Mihai Lupu
    Comments: Accepted at the SocialNLP 2017 workshop held in conjunction with EACL 2017, April 3, 2017, Valencia, Spain
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    In this paper we show how the performance of tweet clustering can be improved
    by leveraging character-based neural networks. The proposed approach overcomes
    the limitations related to vocabulary explosion in word-based models
    and allows for the seamless processing of multilingual content. Our
    evaluation results and code are available online at
    this https URL clustering

    Emergence of Grounded Compositional Language in Multi-Agent Populations

    Igor Mordatch, Pieter Abbeel
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    By capturing statistical patterns in large corpora, machine learning has
    enabled significant advances in natural language processing, including in
    machine translation, question answering, and sentiment analysis. However, for
    agents to intelligently interact with humans, simply capturing the statistical
    patterns is insufficient. In this paper we investigate if, and how, grounded
    compositional language can emerge as a means to achieve goals in multi-agent
    populations. Towards this end, we propose a multi-agent learning environment
    and learning methods that bring about emergence of a basic compositional
    language. This language is represented as streams of abstract discrete symbols
    uttered by agents over time, but nonetheless has a coherent structure that
    possesses a defined vocabulary and syntax. We also observe emergence of
    non-verbal communication such as pointing and guiding when language
    communication is unavailable.

    Distributed-Representation Based Hybrid Recommender System with Short Item Descriptions

    Junhua He, Hankz Hankui Zhuo, Jarvan Law
    Comments: 10 pages, 5 figures
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Collaborative filtering (CF) aims to build a model from users’ past behaviors
    and/or similar decisions made by other users, and use the model to recommend
    items for users. Despite the success of previous collaborative filtering
    approaches, they are all based on the assumption that there are sufficient
    rating scores available for building high-quality recommendation models. In
    real world applications, however, it is often difficult to collect sufficient
    rating scores, especially when new items are introduced into the system, which
    makes the recommendation task challenging. We find that there are often “short”
    texts describing features of items, based on which we can approximate the
    similarity of items and make recommendations together with rating scores. In
    this paper we “borrow” the idea of vector representation of words to capture
    the information of short texts and embed it into a matrix factorization
    framework. We empirically show that our approach is effective by comparing it
    with state-of-the-art approaches.

    Multichannel End-to-end Speech Recognition

    Tsubasa Ochiai, Shinji Watanabe, Takaaki Hori, John R. Hershey
    Subjects: Sound (cs.SD); Computation and Language (cs.CL)

    The field of speech recognition is in the midst of a paradigm shift:
    end-to-end neural networks are challenging the dominance of hidden Markov
    models as a core technology. Using an attention mechanism in a recurrent
    encoder-decoder architecture solves the dynamic time alignment problem,
    allowing joint end-to-end training of the acoustic and language modeling
    components. In this paper we extend the end-to-end framework to encompass
    microphone array signal processing for noise suppression and speech enhancement
    within the acoustic encoding network. This allows the beamforming components to
    be optimized jointly within the recognition architecture to improve the
    end-to-end speech recognition objective. Experiments on the noisy speech
    benchmarks (CHiME-4 and AMI) show that our multichannel end-to-end system
    outperformed the attention-based baseline with input from a conventional
    adaptive beamformer.


    Distributed, Parallel, and Cluster Computing

    How to Stop Consensus Algorithms, locally?

    Pei Xie, Keyou You, Cheng Wu
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    This paper studies the problem of locally stopping distributed consensus
    algorithms over networks, where each node updates its state by interacting
    with its neighbors and decides by itself whether a certain level of agreement
    has been achieved among the nodes. Since an individual node cannot access the
    states of nodes beyond its neighbors, this problem is challenging. In this
    work, we first define the stopping problem for generic distributed
    algorithms. Then, a distributed algorithm is explicitly provided for each
    node to stop consensus updating, obtained by exploring the relationship
    between the so-called local and global consensus. Finally, we show both in
    theory and in simulation that its effectiveness depends on both the network
    size and the network structure.
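
    For intuition only, here is a toy version of the setting: average consensus
    in which each node monitors disagreement with its own neighbors and halts
    after it has stayed below eps for patience rounds. The paper's actual
    stopping rule, which relates local to global consensus, is more subtle; the
    fixed patience window standing in for its network-dependent waiting time is
    an assumption of this sketch.

        import numpy as np

        def consensus_with_local_stop(x0, adj, eps=1e-3, patience=5, max_iter=1000):
            x = np.array(x0, dtype=float)
            n = len(x)
            calm = np.zeros(n, dtype=int)    # rounds of sustained local agreement
            active = np.ones(n, dtype=bool)
            for _ in range(max_iter):
                x_new = x.copy()
                for i in range(n):
                    if not active[i]:
                        continue
                    nbrs = np.where(adj[i])[0]
                    x_new[i] = x[i] + 0.5 * np.mean(x[nbrs] - x[i])  # local averaging
                    gap = np.max(np.abs(x[nbrs] - x[i]))
                    calm[i] = calm[i] + 1 if gap < eps else 0
                    if calm[i] >= patience:
                        active[i] = False    # the node decides, by itself, to stop
                x = x_new
                if not active.any():
                    break
            return x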

    Network Constrained Distributed Dual Coordinate Ascent for Machine Learning

    Myung Cho, Lifeng Lai, Weiyu Xu
    Comments: 5 pages, 6 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    With the explosion of data sizes and the limited storage space at any single
    location, data are often distributed across different locations. We thus face
    the challenge of performing large-scale machine learning on these distributed
    data through communication networks. In this paper, we study how network
    communication constraints impact the convergence speed of distributed machine
    learning optimization algorithms. In particular, we give a convergence rate
    analysis of distributed dual coordinate ascent in a general tree-structured
    network. Furthermore, by considering network communication delays, we
    optimize the network-constrained dual coordinate ascent algorithm to maximize
    its convergence speed. Our results show that under different network
    communication delays, to achieve the maximum convergence speed, one needs to
    adopt delay-dependent numbers of local and global iterations for distributed
    dual coordinate ascent.

    Locality and Singularity for Store-Atomic Memory Models

    Egor Derevenetc, Roland Meyer, Sebastian Schweizer
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Robustness is a correctness notion for concurrent programs running under
    relaxed consistency models. The task is to check that the relaxed behavior
    coincides (up to traces) with sequential consistency (SC). Although
    computationally simple on paper (robustness has been shown to be
    PSPACE-complete for TSO, PGAS, and Power), building a practical robustness
    checker remains a challenge. The problem is that the various relaxations lead
    to a dramatic number of computations, only few of which violate robustness.

    In the present paper, we set out to reduce the search space for robustness
    checkers. We focus on store-atomic consistency models and establish two
    completeness results. The first result, called locality, states that a
    non-robust program always contains a violating computation where only one
    thread delays commands. The second result, called singularity, is even stronger
    but restricted to programs without lightweight fences. It states that there is
    a violating computation where a single store is delayed.

    As an application of the results, we derive a linear-size source-to-source
    translation of robustness to SC-reachability. It applies to general programs,
    regardless of the data domain and potentially with an unbounded number of
    threads and with unbounded buffers. We have implemented the translation and
    verified, for the first time, PGAS algorithms in a fully automated fashion. For
    TSO, our analysis outperforms existing tools.

    Optimization of Lattice Boltzmann Simulations on Heterogeneous Computers

    E. Calore, A. Gabbana, S. F. Schifano, R. Tripiccione
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    High-performance computing systems are more and more often based on
    accelerators. Computing applications targeting those systems often follow a
    host-driven approach in which hosts offload almost all compute-intensive
    sections of the code onto accelerators; this approach only marginally exploits
    the computational resources available on the host CPUs, limiting performance
    and energy efficiency. The obvious step forward is to run compute-intensive
    kernels in a concurrent and balanced way on both hosts and accelerators. In
    this paper we consider exactly this problem for a class of applications based
    on Lattice Boltzmann Methods, widely used in computational fluid-dynamics. Our
    goal is to develop just one program, portable and able to run efficiently on
    several different combinations of hosts and accelerators. To reach this goal,
    we define common data layouts enabling the code to exploit efficiently the
    different parallel and vector options of the various accelerators, and matching
    the possibly different requirements of the compute-bound and memory-bound
    kernels of the application. We also define models and metrics that predict the
    best partitioning of workloads among host and accelerator, and the optimally
    achievable overall performance level. We test the performance of our codes and
    their scaling properties using as testbeds HPC clusters incorporating different
    accelerators: Intel Xeon-Phi many-core processors, NVIDIA GPUs and AMD GPUs.
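
    At its simplest, the partitioning model amounts to splitting lattice sites in
    proportion to measured kernel throughputs so that host and accelerator finish
    each step together. The sketch below illustrates that reasoning; the MLUPS
    (million lattice updates per second) figures are placeholders, not the
    paper's measurements.

        def balanced_split(sites, host_mlups, accel_mlups):
            """Return (host_sites, accel_sites) equalizing execution time."""
            f = host_mlups / (host_mlups + accel_mlups)  # host share of the lattice
            host_sites = int(round(f * sites))
            return host_sites, sites - host_sites

        host, accel = balanced_split(sites=2048 * 2048,
                                     host_mlups=150.0,   # host CPU, assumed
                                     accel_mlups=900.0)  # accelerator, assumed
        # With these numbers the host takes 1/7 of the lattice.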

    Bandwidth-efficient Storage Services for Mitigating Side Channel Attack

    Pengfei Zuo, Yu Hua, Cong Wang, Wen Xia, Shunde Cao, Yukun Zhou, Yuanyuan Sun
    Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

    Data deduplication is able to effectively identify and eliminate redundant
    data and only maintain a single copy of files and chunks. Hence, it is widely
    used in cloud storage systems to save storage space and network bandwidth.
    However, the occurrence of deduplication can be easily identified by monitoring
    and analyzing network traffic, which leads to the risk of user privacy leakage.
    The attacker can carry out a very dangerous side channel attack, i.e.,
    learn-the-remaining-information (LRI) attack, to reveal users’ privacy
    information by exploiting the side channel of network traffic in deduplication.
    Existing work addresses the LRI attack at the cost of the high bandwidth
    efficiency of deduplication. In order to address this problem, we propose a
    simple yet effective scheme, called randomized redundant chunk scheme (RRCS),
    to significantly mitigate the risk of the LRI attack while maintaining the high
    bandwidth efficiency of deduplication. The basic idea behind RRCS is to add
    randomized redundant chunks to mix up the real deduplication states of files
    used for the LRI attack, which effectively obfuscates the view of the attacker,
    who attempts to exploit the side channel of network traffic for the LRI attack.
    Our security analysis shows that RRCS could significantly mitigate the risk of
    the LRI attack. We implement the RRCS prototype and evaluate it by using three
    large-scale real-world datasets. Experimental results demonstrate the
    efficiency and efficacy of RRCS.

    Friend or Foe? Population Protocols can perform Community Detection

    Luca Becchetti, Andrea Clementi, Emanuele Natale, Francesco Pasquale, Prasad Raghavendra, Luca Trevisan
    Comments: 26 pages
    Subjects: Discrete Mathematics (cs.DM); Distributed, Parallel, and Cluster Computing (cs.DC); Social and Information Networks (cs.SI)

    We present a simple distributed algorithm that, given a regular graph
    consisting of two communities (or clusters), each inducing a good expander
    and such that the cut between them has sparsity \(1/\mathrm{polylog}(n)\),
    recovers the two communities.

    More precisely, upon running the protocol, every node assigns itself a binary
    label of \(m = \Theta(\log n)\) bits, so that with high probability, for all
    but a small number of outliers, nodes within the same community are assigned
    labels with Hamming distance \(o(m)\), while nodes belonging to different
    communities receive labels with Hamming distance at least \(m/2 - o(m)\). We
    refer to such an outcome as a “community sensitive labeling” of the graph.

    Our algorithm uses \(\Theta(\log^2 n)\) local memory and computes the
    community sensitive labeling after each node performs \(\Theta(\log^2 n)\)
    steps of local work.

    Our algorithm and its analysis work in the “(random) population protocol”
    model, in which anonymous nodes do not share any global clock (the model is
    asynchronous) and communication occurs over a single (random) edge per round.
    We believe this is the first provably effective protocol for community
    detection that works in this model.

    Distributed Optimal Vehicle Grid Integration Strategy with User Behavior Prediction

    Yingqi Xiong, Bin Wang, Chi-cheng Chu, Rajit Gadh
    Comments: IEEE PES General Meeting 2017
    Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC)

    With the increasing adoption of electric vehicles (EVs) in recent years, the
    impact of EV charging activities on the power grid has become more and more
    significant. In this article, an optimal scheduling algorithm that combines
    smart EV charging and V2G grid service is developed to integrate EVs into the
    power grid as distributed energy resources, with improved system cost
    performance. Specifically, an optimization problem is formulated and solved
    at each EV charging station according to the control signal from an
    aggregated control center and to user charging behavior predictions obtained
    by mean estimation and linear regression. The control center periodically
    collects the distributed optimization results and updates the control signal.
    The iteration continues until it converges to the optimal schedule.
    Experimental results show that this algorithm helps fill the valley and shave
    the peak in the electric load profiles within a microgrid, while satisfying
    the energy demand of individual drivers.


    Learning

    Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow

    Francesco Giannini, Vincenzo Laveglia, Alessandro Rossi, Dario Zanca, Andrea Zugarini
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Mathematical Software (cs.MS); Machine Learning (stat.ML)

    This report provides an introduction to some Machine Learning tools within
    the most common development environments. It mainly focuses on practical
    problems, skipping any theoretical introduction. It is oriented to both
    students trying to approach Machine Learning and experts looking for new
    frameworks.

    Deep Embedding Forest: Forest-based Serving with Deep Embedding Features

    Jie Zhu, Ying Shan, JC Mao, Dong Yu, Holakou Rahmanian, Yi Zhang
    Comments: 14 pages, 3 figures, 5 tables
    Subjects: Learning (cs.LG)

    Deep Neural Networks (DNN) have demonstrated superior ability to extract high
    level embedding vectors from low level features. Despite the success, the
    serving time is still the bottleneck due to expensive run-time computation of
    multiple layers of dense matrices. GPGPU, FPGA, or ASIC-based serving systems
    require additional hardware that is not in the mainstream design of most
    commercial applications. In contrast, tree or forest-based models are widely
    adopted because of low serving cost, but heavily depend on carefully engineered
    features. This work proposes a Deep Embedding Forest model that benefits from
    the best of both worlds. The model consists of a number of embedding layers and
    a forest/tree layer. The former maps high dimensional (hundreds of thousands to
    millions) and heterogeneous low-level features to the lower dimensional
    (thousands) vectors, and the latter ensures fast serving.

    Built on top of a representative DNN model called Deep Crossing, and two
    forest/tree-based models including XGBoost and LightGBM, a two-step Deep
    Embedding Forest algorithm is demonstrated to achieve on-par or slightly better
    performance as compared with the DNN counterpart, with only a fraction of
    serving time on conventional hardware. After comparing with a joint
    optimization algorithm called partial fuzzification, also proposed in this
    paper, it is concluded that the two-step Deep Embedding Forest has achieved
    near-optimal performance. Experiments based on large-scale data sets (up to 1
    billion samples) from a major sponsored search engine prove the efficacy of
    the proposed model.
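
    A generic re-creation of the two-step recipe on toy data, with a small
    feed-forward network and scikit-learn's gradient boosting standing in for
    Deep Crossing and XGBoost/LightGBM: train the embedding layers with a neural
    head, then freeze them and fit a forest on the embeddings, so that serving
    needs only one cheap forward pass plus tree traversals.

        import torch
        import torch.nn as nn
        from sklearn.ensemble import GradientBoostingClassifier

        X = torch.randn(1000, 64)                  # toy stand-in features
        y = (X[:, 0] * X[:, 1] > 0).long()         # toy labels

        embed = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
        head = nn.Linear(16, 2)
        opt = torch.optim.Adam(list(embed.parameters()) + list(head.parameters()))
        for _ in range(200):                       # step 1: learn the embedding
            opt.zero_grad()
            loss = nn.functional.cross_entropy(head(embed(X)), y)
            loss.backward()
            opt.step()

        with torch.no_grad():                      # step 2: forest on embeddings
            Z = embed(X).numpy()
        forest = GradientBoostingClassifier(n_estimators=100).fit(Z, y.numpy())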

    Prototypical Networks for Few-shot Learning

    Jake Snell, Kevin Swersky, Richard S. Zemel
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We propose prototypical networks for the problem of few-shot classification,
    where a classifier must generalize to new classes not seen in the training set,
    given only a small number of examples of each new class. Prototypical networks
    learn a metric space in which classification can be performed by computing
    Euclidean distances to prototype representations of each class. Compared to
    recent approaches for few-shot learning, they reflect a simpler inductive bias
    that is beneficial in this limited-data regime, and achieve state-of-the-art
    results. We provide an analysis showing that some simple design decisions can
    yield substantial improvements over recent approaches involving complicated
    architectural choices and meta-learning. We further extend prototypical
    networks to the case of zero-shot learning and achieve state-of-the-art
    zero-shot results on the CU-Birds dataset.
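
    The core computation is compact enough to state directly: prototypes are mean
    support embeddings, and queries score classes by negative squared Euclidean
    distance. The embedding network and episode below are placeholders for
    illustration.

        import torch
        import torch.nn as nn

        embed = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 64))

        def proto_logits(support_x, support_y, query_x, n_classes):
            z_s, z_q = embed(support_x), embed(query_x)
            protos = torch.stack([z_s[support_y == c].mean(0)
                                  for c in range(n_classes)])
            # logits = -||z_q - proto_c||^2, trained per episode with cross-entropy
            return -torch.cdist(z_q, protos).pow(2)

        # 5-way 5-shot toy episode with random data
        sx = torch.randn(25, 784)
        sy = torch.arange(5).repeat_interleave(5)
        qx = torch.randn(15, 784)
        print(proto_logits(sx, sy, qx, n_classes=5).argmax(dim=1))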

    Online Learning for Distribution-Free Prediction

    Dave Zachariah, Petre Stoica, Thomas B. Schön
    Subjects: Learning (cs.LG); Computation (stat.CO); Machine Learning (stat.ML)

    We develop an online learning method for prediction, which is important in
    problems with large and/or streaming data sets. We formulate the learning
    approach using a covariance-fitting methodology, and show that the resulting
    predictor has desirable computational and distribution-free properties: It is
    implemented online with a runtime that scales linearly in the number of
    samples; has a constant memory requirement; avoids local minima problems; and
    prunes away redundant feature dimensions without relying on restrictive
    assumptions on the data distribution. In conjunction with the split conformal
    approach, it also produces distribution-free prediction confidence intervals in
    a computationally efficient manner. The method is demonstrated on both real and
    synthetic datasets.

    Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG

    Robin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer, Martin Glasstetter, Katharina Eggensperger, Michael Tangermann, Frank Hutter, Wolfram Burgard, Tonio Ball
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Deep learning with convolutional neural networks (deep ConvNets) has
    revolutionized computer vision through end-to-end learning, i.e. learning from
    the raw data. Now, there is increasing interest in using deep ConvNets for
    end-to-end EEG analysis. However, little is known about many important aspects
    of how to design and train ConvNets for end-to-end EEG decoding, and there is
    still a lack of techniques to visualize the informative EEG features the
    ConvNets learn.

    Here, we studied deep ConvNets with a range of different architectures,
    designed for decoding imagined or executed movements from raw EEG. Our results
    show that recent advances from the machine learning field, including batch
    normalization and exponential linear units, together with a cropped training
    strategy, boosted the deep ConvNets decoding performance, reaching or
    surpassing that of the widely-used filter bank common spatial patterns (FBCSP)
    decoding algorithm. While FBCSP is designed to use spectral power modulations,
    the features used by ConvNets are not fixed a priori. Our novel methods for
    visualizing the learned features demonstrated that ConvNets indeed learned to
    use spectral power modulations in the alpha, beta and high gamma frequencies.
    These methods also proved useful as a technique for spatially mapping the
    learned features, revealing the topography of the causal contributions of
    features in different frequency bands to decoding the movement classes.

    Our study thus shows how to design and train ConvNets to decode
    movement-related information from the raw EEG without handcrafted features and
    highlights the potential of deep ConvNets combined with advanced visualization
    techniques for EEG-based brain mapping.

    Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers

    Jacob Steinhardt, Moses Charikar, Gregory Valiant
    Comments: 18 pages, pre-print
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC); Cryptography and Security (cs.CR); Machine Learning (stat.ML)

    We introduce a criterion, resilience, which allows properties of a dataset
    (such as its mean or best low rank approximation) to be robustly computed, even
    in the presence of a large fraction of arbitrary additional data. Resilience is
    a weaker condition than most other properties considered so far in the
    literature, and yet enables robust estimation in a broader variety of settings,
    including the previously unstudied problem of robust mean estimation in
    \(\ell_p\)-norms.

    Sharp Minima Can Generalize For Deep Nets

    Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio
    Comments: 8.5 pages of main content, 2.5 of bibliography and 1 page of appendix
    Subjects: Learning (cs.LG)

    Despite their overwhelming capacity to overfit, deep learning architectures
    tend to generalize relatively well to unseen data, allowing them to be deployed
    in practice. However, explaining why this is the case is still an open area of
    research. One standing hypothesis that is gaining popularity, e.g. Hochreiter &
    Schmidhuber (1997); Keskar et al. (2017), is that the flatness of minima of the
    loss function found by stochastic gradient based methods results in good
    generalization. This paper argues that most notions of flatness are
    problematic for deep models and cannot be directly applied to explain
    generalization. Specifically, when focusing on deep networks with rectifier
    units, we can exploit the particular geometry of parameter space induced by
    the inherent symmetries that these architectures exhibit to build equivalent
    models corresponding to arbitrarily sharper minima. Furthermore, if we are
    allowed to reparametrize a function, the geometry of its parameters can
    change drastically without affecting its generalization properties.
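
    The symmetry the argument exploits can be checked numerically in a few lines:
    for a pair of ReLU layers, rescaling one by a > 0 and the other by 1/a leaves
    the function unchanged while arbitrarily reshaping the weights, and hence any
    weight-space notion of sharpness around the minimum.

        import numpy as np

        rng = np.random.default_rng(0)
        W1 = rng.normal(size=(8, 4))
        W2 = rng.normal(size=(2, 8))
        x = rng.normal(size=4)
        relu = lambda t: np.maximum(t, 0.0)

        a = 100.0                                  # any a > 0 works
        f = W2 @ relu(W1 @ x)
        f_scaled = (W2 / a) @ relu((a * W1) @ x)   # relu(a*z) = a*relu(z) for a > 0
        assert np.allclose(f, f_scaled)            # same function, different weights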

    Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis

    Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra
    Subjects: Learning (cs.LG); Numerical Analysis (cs.NA); Numerical Analysis (math.NA); Optimization and Control (math.OC); Machine Learning (stat.ML)

    Stochastic variance reduction algorithms have recently become popular for
    minimizing the average of a large, but finite number of loss functions. The
    present paper proposes a Riemannian stochastic quasi-Newton algorithm with
    variance reduction (R-SQN-VR). The key challenges of averaging, adding, and
    subtracting multiple gradients are addressed with notions of retraction and
    vector transport. We present a global convergence analysis and a local
    convergence rate analysis of R-SQN-VR under some natural assumptions. The
    proposed algorithm is applied to the Karcher mean computation on the symmetric
    positive-definite manifold and low-rank matrix completion on the Grassmann
    manifold. In all cases, the proposed algorithm outperforms the Riemannian
    stochastic gradient descent and the Riemannian stochastic variance reduction
    algorithms.

    Towards Optimal Sparse Inverse Covariance Selection through Non-Convex Optimization

    Sidhant Misra, Marc Vuffray, Andrey Y. Lokhov, Michael Chertkov
    Subjects: Learning (cs.LG); Information Theory (cs.IT); Statistics Theory (math.ST)

    We study the problem of reconstructing the graph of a sparse Gaussian
    Graphical Model from independent observations, which is equivalent to finding
    non-zero elements of an inverse covariance matrix. For a model of size \(p\)
    and maximum degree \(d\), information theoretic lower bounds established in
    prior works require that the number of samples needed for recovering the
    graph perfectly is at least \(d \log p/\kappa^2\), where \(\kappa\) is the
    minimum normalized non-zero entry of the inverse covariance matrix. Existing
    algorithms require additional assumptions to guarantee perfect graph
    reconstruction, and consequently, their sample complexity is dependent on
    parameters that are not present in the information theoretic lower bound. We
    propose an estimator, called SLICE, that consists of a cardinality
    constrained least-squares regression followed by a thresholding procedure.
    Without any additional assumptions we show that SLICE attains a sample
    complexity of \(\frac{64}{\kappa^4}d \log p\), which differs from the lower
    bound by only a factor proportional to \(1/\kappa^2\) and depends only on
    parameters present in the lower bound.

    Budgeted Batch Bayesian Optimization With Unknown Batch Sizes

    Vu Nguyen, Santu Rana, Sunil Gupta, Cheng Li, Svetha Venkatesh
    Comments: 22 pages
    Subjects: Learning (cs.LG)

    Parameter settings profoundly impact the performance of machine learning
    algorithms and laboratory experiments. Classical grid search or
    trial-and-error methods are exponentially expensive in large parameter
    spaces, and Bayesian optimization (BO) offers an elegant alternative for the
    global optimization of black-box functions. In situations where the black-box
    function can be evaluated at multiple points simultaneously, batch Bayesian
    optimization is used. Current batch BO approaches are restrictive in that
    they fix the number of evaluations per batch, which can be wasteful when the
    number of specified evaluations is larger than the number of real maxima in
    the underlying acquisition function. We present Budgeted Batch Bayesian
    Optimization (B3O) for hyper-parameter tuning and experimental design,
    identifying the appropriate batch size for each iteration in an elegant way.
    To keep the batch size flexible, we use the infinite Gaussian mixture model
    (IGMM) to automatically identify the number of peaks in the underlying
    acquisition functions. We address the intractability of estimating the IGMM
    directly from the acquisition function by formulating batch generalized slice
    sampling to efficiently draw samples from the acquisition function. We
    perform extensive experiments on both synthetic functions and two real-world
    applications, machine learning hyper-parameter tuning and experimental design
    for alloy hardening. We show empirically that the proposed B3O outperforms
    the existing fixed-batch BO approaches in finding the optimum while requiring
    fewer evaluations, thus saving cost and time.
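
    A rough illustration of the batch-size selection idea, not the paper's
    algorithm: draw samples whose density follows the acquisition function, fit a
    Dirichlet-process mixture, and let the number of well-supported components
    set the batch size. scikit-learn's BayesianGaussianMixture stands in for the
    IGMM, and simple grid-based sampling for batch generalized slice sampling.

        import numpy as np
        from sklearn.mixture import BayesianGaussianMixture

        def budgeted_batch(acq, bounds, n_samples=500, max_batch=10):
            grid = np.linspace(bounds[0], bounds[1], 2000)
            vals = np.clip(acq(grid), 0.0, None)
            samples = np.random.choice(grid, size=n_samples,
                                       p=vals / vals.sum())[:, None]
            igmm = BayesianGaussianMixture(
                n_components=max_batch,
                weight_concentration_prior_type="dirichlet_process").fit(samples)
            active = igmm.weights_ > 0.05        # components with real support
            return igmm.means_[active].ravel()   # one candidate per peak

        acq = lambda x: np.exp(-(x - 1) ** 2) + 0.8 * np.exp(-(x + 2) ** 2 / 0.1)
        print(budgeted_batch(acq, bounds=(-4, 4)))  # expect roughly two candidates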

    Neural Graph Machines: Learning Neural Networks Using Graphs

    Thang D. Bui, Sujith Ravi, Vivek Ramavajjala
    Comments: 9 pages
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Label propagation is a powerful and flexible semi-supervised learning
    technique on graphs. Neural networks, on the other hand, have proven track
    records in many supervised learning tasks. In this work, we propose a training
    framework with a graph-regularised objective, namely “Neural Graph Machines”,
    that can combine the power of neural networks and label propagation. This work
    generalises previous literature on graph-augmented training of neural networks,
    enabling it to be applied to multiple neural architectures (Feed-forward NNs,
    CNNs and LSTM RNNs) and a wide range of graphs. The new objective allows the
    neural networks to harness both labeled and unlabeled data by: (a) allowing the
    network to train using labeled data as in the supervised setting, (b) biasing
    the network to learn similar hidden representations for neighboring nodes on a
    graph, in the same vein as label propagation. Such architectures with the
    proposed objective can be trained efficiently using stochastic gradient descent
    and scaled to large graphs, with a runtime that is linear in the number of
    edges. The proposed joint training approach convincingly outperforms many
    existing methods on a wide range of tasks (multi-label classification on social
    graphs, news categorization, document classification and semantic intent
    classification), with multiple forms of graph inputs (including graphs with and
    without node-level features) and using different types of neural networks.
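
    The graph-regularised objective can be written down directly. The PyTorch
    sketch below assumes a model exposing hidden representations and a
    classification head, plus an invented mixing weight alpha; it is a schematic
    rendering of the objective, not the authors' code.

        import torch
        import torch.nn.functional as F

        def ngm_loss(model, x, y, labelled, edges, edge_w, alpha=0.1):
            h = model.hidden(x)           # assumed: model exposes hidden reps
            logits = model.head(h)        # assumed: classification head
            sup = F.cross_entropy(logits[labelled], y[labelled])
            u, v = edges                  # edge endpoints as index tensors
            graph = (edge_w * (h[u] - h[v]).pow(2).sum(dim=1)).mean()
            return sup + alpha * graph    # supervised + neighbour-similarity terms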

    Learned Optimizers that Scale and Generalize

    Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Learning to learn has emerged as an important direction for achieving
    artificial intelligence. Two of the primary barriers to its adoption are an
    inability to scale to larger problems and a limited ability to generalize to
    new tasks. We introduce a learned gradient descent optimizer that generalizes
    well to new tasks, and which has significantly reduced memory and computation
    overhead. We achieve this by introducing a novel hierarchical RNN architecture,
    with minimal per-parameter overhead, augmented with additional architectural
    features that mirror the known structure of optimization tasks. We also develop
    a meta-training ensemble of small, diverse, optimization tasks capturing common
    properties of loss landscapes. The optimizer learns to out-perform RMSProp/ADAM
    on problems in this corpus. More importantly, it performs comparably or better
    when applied to small convolutional neural networks, despite seeing no neural
    networks in its meta-training set. Finally, it generalizes to train Inception
    V3 and ResNet V2 architectures on the ImageNet dataset, optimization problems
    that are of a vastly different scale than those it was trained on.

    Online Learning Rate Adaptation with Hypergradient Descent

    Atilim Gunes Baydin, Robert Cornish, David Martinez Rubio, Mark Schmidt, Frank Wood
    Comments: 9 pages, 4 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We introduce a general method for improving the convergence rate of
    gradient-based optimizers that is easy to implement and works well in practice.
    We analyze the effectiveness of the method by applying it to stochastic
    gradient descent, stochastic gradient descent with Nesterov momentum, and Adam,
    showing that it improves upon these commonly used algorithms on a range of
    optimization problems; in particular the kinds of objective functions that
    arise frequently in deep neural network training. Our method works by
    dynamically updating the learning rate during optimization using the gradient
    with respect to the learning rate of the update rule itself. Computing this
    “hypergradient” needs little additional computation, requires only one extra
    copy of the original gradient to be stored in memory, and relies upon nothing
    more than what is provided by reverse-mode automatic differentiation.
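
    For plain SGD the hypergradient reduces to the dot product of the current and
    previous gradients, so the whole method fits in a few lines. A minimal sketch
    on a one-dimensional quadratic (the step sizes here are illustrative):

        import numpy as np

        def sgd_hd(grad, w, alpha=0.01, beta=1e-4, steps=200):
            g_prev = np.zeros_like(w)
            for _ in range(steps):
                g = grad(w)
                alpha += beta * (g @ g_prev)  # hypergradient update of the step size
                w = w - alpha * g
                g_prev = g
            return w, alpha

        grad = lambda w: 2.0 * (w - 3.0)      # d/dw of (w - 3)^2
        w, a = sgd_hd(grad, np.array([0.0]))
        print(w, a)                           # w approaches 3, alpha self-tunes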

    Convergence of Deep Neural Networks to a Hierarchical Covariance Matrix Decomposition

    Nima Dehmamy, Neda Rohani, Aggelos Katsaggelos
    Subjects: Learning (cs.LG); Data Analysis, Statistics and Probability (physics.data-an); Machine Learning (stat.ML)

    We show that in a deep neural network trained with ReLU, the low-lying layers
    should be replaceable with truncated linearly activated layers. We derive the
    gradient descent equations in this truncated linear model and demonstrate that
    –if the distribution of the training data is stationary during training– the
    optimal choice for weights in these low-lying layers is the eigenvectors of the
    covariance matrix of the data. If the training data is random and uniform
    enough, these eigenvectors can be found using a small fraction of the training
    data, thus reducing the computational complexity of training. We show how this
    can be done recursively to form successive, trained layers. At least in the
    first layer, our tests show that this approach improves classification of
    images while reducing network size.
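
    The prescription translates to a few lines of numpy: estimate the data
    covariance from a small subsample and fix the first layer's weights to its
    top eigenvectors. Sizes and data below are placeholders.

        import numpy as np

        def pca_layer(X_sub, n_units):
            Xc = X_sub - X_sub.mean(axis=0)
            cov = Xc.T @ Xc / len(Xc)
            eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
            return eigvecs[:, -n_units:].T          # top-n_units eigenvectors

        X = np.random.randn(10000, 256)
        W = pca_layer(X[:500], n_units=32)          # a small fraction of the data
        h = np.maximum(X @ W.T, 0.0)                # ReLU on the fixed first layer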

    Tree Memory Networks for Modelling Long-term Temporal Dependencies

    Tharindu Fernando, Simon Denman, Aaron McFadyen, Sridha Sridharan, Clinton Fookes
    Comments: This paper has been submitted to JMLR
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    In the domain of sequence modelling, Recurrent Neural Networks (RNN) have
    been capable of achieving impressive results in a variety of application areas
    including visual question answering, part-of-speech tagging and machine
    translation. However, this success in modelling short term dependencies has not
    successfully transitioned to application areas such as trajectory prediction,
    which require capturing both short term and long term relationships. In this
    paper, we propose a Tree Memory Network (TMN) for modelling long term and short
    term relationships in sequence-to-sequence mapping problems. The proposed
    network architecture is composed of an input module, controller and a memory
    module. In contrast to related literature, which models the memory as a
    sequence of historical states, we model the memory as a recursive tree
    structure. This structure more effectively captures temporal dependencies
    across both short term and long term sequences using its hierarchical
    structure. We demonstrate the effectiveness and flexibility of the proposed TMN
    in two practical problems, aircraft trajectory modelling and pedestrian
    trajectory modelling in a surveillance setting, and in both cases we outperform
    the current state-of-the-art. Furthermore, we perform an in-depth analysis of
    the evolution of the memory module content over time and provide visual
    evidence on how the proposed TMN is able to map both long term and short term
    relationships efficiently via a hierarchical structure.

    A New Unbiased and Efficient Class of LSH-Based Samplers and Estimators for Partition Function Computation in Log-Linear Models

    Ryan Spring, Anshumali Shrivastava
    Subjects: Machine Learning (stat.ML); Databases (cs.DB); Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    Log-linear models are arguably the most successful class of graphical models
    for large-scale applications because of their simplicity and tractability.
    Learning and inference with these models require calculating the partition
    function, which is a major bottleneck and intractable for large state spaces.
    Importance Sampling (IS) and MCMC-based approaches are appealing. However, the
    condition of having a “good” proposal distribution is often not satisfied in
    practice.

    In this paper, we add a new dimension to efficient estimation via sampling.
    We propose a new sampling scheme and an unbiased estimator that estimates the
    partition function accurately in sub-linear time. Our samples are generated in
    near-constant time using locality sensitive hashing (LSH), and so are
    correlated and unnormalized. We demonstrate the effectiveness of our proposed
    approach by comparing the accuracy and speed of estimating the partition
    function against other state-of-the-art estimation techniques including IS and
    the efficient variant of Gumbel-Max sampling. With our efficient sampling
    scheme, we accurately train real-world language models using only 1-2% of
    computations.

    Selective Harvesting over Networks

    Fabricio Murai, Diogo Rennó, Bruno Ribeiro, Gisele L. Pappa, Don Towsley, Krista Gile
    Comments: 28 pages, 9 figures
    Subjects: Social and Information Networks (cs.SI); Learning (cs.LG); Machine Learning (stat.ML)

    Active search (AS) on graphs focuses on collecting certain labeled nodes
    (targets) given global knowledge of the network topology and its edge weights
    under a query budget. However, in most networks, nodes, topology and edge
    weights are all initially unknown. We introduce selective harvesting, a variant
    of AS where the next node to be queried must be chosen among the neighbors of
    the current queried node set; the available training data for deciding which
    node to query is restricted to the subgraph induced by the queried set (and
    their node attributes) and their neighbors (without any node or edge
    attributes). Therefore, selective harvesting is a sequential decision problem,
    where we must decide which node to query at each step. A classifier trained in
    this scenario suffers from a tunnel vision effect: without recourse to
    independent sampling, the urge to query promising nodes forces classifiers to
    gather increasingly biased training data, which we show significantly hurts the
    performance of AS methods and standard classifiers. We find that it is possible
    to collect a much larger set of targets by using multiple classifiers, not by
    combining their predictions as an ensemble, but switching between classifiers
    used at each step, as a way to ease the tunnel vision effect. We discover that
    switching classifiers collects more targets by (a) diversifying the training
    data and (b) broadening the choices of nodes that can be queried next. This
    highlights an exploration, exploitation, and diversification trade-off in our
    problem that goes beyond the exploration and exploitation duality found in
    classic sequential decision problems. From these observations we propose D3TS,
    a method based on multi-armed bandits for non-stationary stochastic processes
    that enforces classifier diversity, matching or exceeding the performance of
    competing methods on seven real network datasets in our evaluation.

    Matched bipartite block model with covariates

    Zahra S. Razaee, Arash A. Amini, Jingyi Jessica Li
    Subjects: Social and Information Networks (cs.SI); Learning (cs.LG); Machine Learning (stat.ML)

    Community detection or clustering is a fundamental task in the analysis of
    network data. Many real networks have a bipartite structure which makes
    community detection challenging. In this paper, we consider a model which
    allows for matched communities in the bipartite setting, in addition to node
    covariates with information about the matching. We derive a simple fast
    algorithm for fitting the model based on variational inference ideas and show
    its effectiveness on both simulated and real data. A variation of the model to
    allow for degree-correction is also considered, in addition to a novel approach
    to fitting such degree-corrected models.

    Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling

    Diego Marcheggiani, Ivan Titov
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Semantic role labeling (SRL) is the task of identifying the
    predicate-argument structure of a sentence. It is typically regarded as an
    important step in the standard natural language processing pipeline, providing
    information to downstream tasks such as information extraction and question
    answering. As the semantic representations are closely related to syntactic
    ones, we exploit syntactic information in our model. We propose a version of
    graph convolutional networks (GCNs), a recent class of multilayer neural
    networks operating on graphs, suited to modeling syntactic dependency graphs.
    GCNs over syntactic dependency trees are used as sentence encoders, producing
    latent feature representations of words in a sentence and capturing information
    relevant to predicting the semantic representations. We observe that GCN layers
    are complementary to LSTM ones: when we stack both GCN and LSTM layers, we
    obtain a substantial improvement over an already state-of-the-art LSTM SRL
    model, resulting in the best reported scores on the standard benchmark
    (CoNLL-2009) both for Chinese and English.
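
    Schematically, one GCN layer updates each word from its dependency
    neighbours. The sketch below is a generic, degree-normalised GCN layer rather
    than the paper's gated, direction- and label-aware variant.

        import torch
        import torch.nn as nn

        class GCNLayer(nn.Module):
            def __init__(self, dim):
                super().__init__()
                self.lin = nn.Linear(dim, dim)

            def forward(self, h, adj):
                # adj: (batch, n, n) dependency adjacency including self-loops
                deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
                return torch.relu(self.lin(adj @ h) / deg)

        h = torch.randn(2, 10, 128)            # a batch of 10-word sentences
        adj = torch.eye(10).expand(2, 10, 10)  # toy graph: self-loops only
        out = GCNLayer(128)(h, adj)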

    Classification in biological networks with hypergraphlet kernels

    Jose Lugo-Martinez, Predrag Radivojac
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Biological and cellular systems are often modeled as graphs in which vertices
    represent objects of interest (genes, proteins, drugs) and edges represent
    relational ties among these objects (binds-to, interacts-with, regulates). This
    approach has been highly successful owing to the theory, methodology and
    software that support analysis and learning on graphs. Graphs, however, often
    suffer from information loss when modeling physical systems due to their
    inability to accurately represent multiobject relationships. Hypergraphs, a
    generalization of graphs, provide a framework to mitigate information loss and
    unify disparate graph-based methodologies. In this paper, we present a
    hypergraph-based approach for modeling physical systems and formulate vertex
    classification, edge classification and link prediction problems on
    (hyper)graphs as instances of vertex classification on (extended, dual)
    hypergraphs in a semi-supervised setting. We introduce a novel kernel method on
    vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The
    method is based on exact and inexact (via hypergraph edit distances)
    enumeration of small simple hypergraphs, referred to as hypergraphlets, rooted
    at a vertex of interest. We extensively evaluate this method and show its
    potential use in a positive-unlabeled setting to estimate the number of missing
    and false positive links in protein-protein interaction networks.

    Network Constrained Distributed Dual Coordinate Ascent for Machine Learning

    Myung Cho, Lifeng Lai, Weiyu Xu
    Comments: 5 pages, 6 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    With the explosion of data sizes and the limited storage space at any single
    location, data are often distributed across different locations. We thus face
    the challenge of performing large-scale machine learning on these distributed
    data through communication networks. In this paper, we study how network
    communication constraints impact the convergence speed of distributed machine
    learning optimization algorithms. In particular, we give a convergence rate
    analysis of distributed dual coordinate ascent in a general tree-structured
    network. Furthermore, by considering network communication delays, we
    optimize the network-constrained dual coordinate ascent algorithm to maximize
    its convergence speed. Our results show that under different network
    communication delays, to achieve the maximum convergence speed, one needs to
    adopt delay-dependent numbers of local and global iterations for distributed
    dual coordinate ascent.

    Discriminate-and-Rectify Encoders: Learning from Image Transformation Sets

    Andrea Tacchetti, Stephen Voinea, Georgios Evangelopoulos
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    The complexity of a learning task is increased by transformations in the
    input space that preserve class identity. Visual object recognition for example
    is affected by changes in viewpoint, scale, illumination or planar
    transformations. While drastically altering the visual appearance, these
    changes are orthogonal to recognition and should not be reflected in the
    representation or feature encoding used for learning. We introduce a framework
    for weakly supervised learning of image embeddings that are robust to
    transformations and selective to the class distribution, using sets of
    transforming examples (orbit sets), deep parametrizations and a novel
    orbit-based loss. The proposed loss combines a discriminative, contrastive part
    for orbits with a reconstruction error that learns to rectify orbit
    transformations. The learned embeddings are evaluated in distance metric-based
    tasks, such as one-shot classification under geometric transformations, as well
    as face verification and retrieval under more realistic visual variability. Our
    results suggest that orbit sets, suitably computed or observed, can be used for
    efficient, weakly-supervised learning of semantically relevant image
    embeddings.

    Audio Scene Classification with Deep Recurrent Neural Networks

    Huy Phan, Philipp Koch, Fabrice Katzberg, Marco Maass, Radoslaw Mazur, Alfred Mertins
    Comments: submitted for Interspeech 2017
    Subjects: Sound (cs.SD); Learning (cs.LG)

    We introduce in this work an efficient approach for audio scene
    classification using deep recurrent neural networks. A scene audio signal is
    firstly transformed into a sequence of high-level label tree embedding feature
    vectors. The vector sequence is then divided into multiple subsequences on
    which a deep GRU-based recurrent neural network is trained for
    sequence-to-label classification. The global predicted label for the entire
    sequence is finally obtained via aggregation of subsequence classification
    outputs. We show that our approach obtains an F1-score of 97.7% on the
    LITIS Rouen dataset, which is the largest dataset publicly available for the
    task. Compared to the best previously reported result on the dataset, our
    approach is able to reduce the relative classification error by 35.3%.

    Weighted Voting Via No-Regret Learning

    Nika Haghtalab, Ritesh Noothigattu, Ariel D. Procaccia
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Learning (cs.LG); Multiagent Systems (cs.MA)

    Voting systems typically treat all voters equally. We argue that perhaps they
    should not: Voters who have supported good choices in the past should be given
    higher weight than voters who have supported bad ones. To develop a formal
    framework for desirable weighting schemes, we draw on no-regret learning.
    Specifically, given a voting rule, we wish to design a weighting scheme such
    that applying the voting rule, with voters weighted by the scheme, leads to
    choices that are almost as good as those endorsed by the best voter in
    hindsight. We derive possibility and impossibility results for the existence of
    such weighting schemes, depending on whether the voting rule and the weighting
    scheme are deterministic or randomized, as well as on the social choice axioms
    satisfied by the voting rule.
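
    To give the flavour of the construction, here is one standard no-regret
    learner, multiplicative weights, driving a weighted plurality rule; which
    voting rules admit such schemes is exactly what the paper characterizes.
    Losses and the learning rate eta are illustrative.

        import numpy as np

        def update_weights(weights, losses, eta=0.1):
            """losses[i] in [0, 1]: how badly voter i's supported choice fared."""
            w = weights * np.exp(-eta * losses)
            return w / w.sum()

        def weighted_plurality(weights, votes, n_alternatives):
            scores = np.zeros(n_alternatives)
            for w, v in zip(weights, votes):
                scores[v] += w
            return int(scores.argmax())

        w = np.ones(5) / 5                     # start with equal voters
        winner = weighted_plurality(w, votes=[0, 0, 1, 2, 1], n_alternatives=3)
        w = update_weights(w, losses=np.array([0.0, 0.0, 1.0, 0.5, 1.0]))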

    Understanding Black-box Predictions via Influence Functions

    Pang Wei Koh, Percy Liang
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    How can we explain the predictions of a black-box model? In this paper, we
    use influence functions — a classic technique from robust statistics — to
    trace a model’s prediction through the learning algorithm and back to its
    training data, identifying the points most responsible for a given prediction.
    Applying ideas from second-order optimization, we scale up influence functions
    to modern machine learning settings and show that they can be applied to
    high-dimensional black-box models, even in non-convex and non-differentiable
    settings. We give a simple, efficient implementation that requires only oracle
    access to gradients and Hessian-vector products. On linear models and
    convolutional neural networks, we demonstrate that influence functions are
    useful for many different purposes: to understand model behavior, debug models
    and detect dataset errors, and even identify and exploit vulnerabilities to
    adversarial training-set attacks.
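
    The quantity being computed is compact: the influence of a training point on
    the loss at a test point is -g_test^T H^{-1} g_train. The sketch below
    inverts the empirical Hessian directly, which is feasible only for tiny
    models; the paper's implementation instead approximates H^{-1} v with
    iterative Hessian-vector products.

        import torch
        from torch.autograd.functional import hessian

        def influence(loss_fn, theta, train_set, z_train, z_test):
            """theta: 1-D parameter tensor with requires_grad=True;
            loss_fn(theta, z) returns a scalar loss for one example z."""
            g_tr = torch.autograd.grad(loss_fn(theta, z_train), theta)[0]
            g_te = torch.autograd.grad(loss_fn(theta, z_test), theta)[0]
            H = hessian(lambda t: sum(loss_fn(t, z) for z in train_set)
                        / len(train_set), theta)   # empirical-risk Hessian
            return -(g_te @ torch.linalg.solve(H, g_tr)).item()

        theta = torch.tensor([0.5], requires_grad=True)
        loss = lambda t, z: (t[0] * z[0] - z[1]) ** 2  # toy 1-D regression loss
        data = [torch.tensor([1.0, 2.0]), torch.tensor([2.0, 1.0])]
        print(influence(loss, theta, data, data[0], data[1]))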

    On the benefits of output sparsity for multi-label classification

    Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Joseph Salmon
    Subjects: Statistics Theory (math.ST); Learning (cs.LG); Machine Learning (stat.ML)

    The multi-label classification framework, where each observation can be
    associated with a set of labels, has generated a tremendous amount of attention
    over recent years. Modern multi-label problems are typically large-scale in
    terms of the number of observations, features and labels, and the number of
    labels can even be comparable to the number of observations. In this context,
    different remedies have been proposed to overcome the curse of dimensionality.
    In this work, we aim at exploiting the output sparsity by introducing a new
    loss, called the sparse weighted Hamming loss. This proposed loss can be seen
    as a weighted version of classical ones, where active and inactive labels are
    weighted separately. Leveraging the influence of sparsity in the loss function,
    we provide improved generalization bounds for the empirical risk minimizer, a
    suitable property for large-scale problems. For this new loss, we derive rates
    of convergence linear in the underlying output-sparsity rather than linear in
    the number of labels. In practice, minimizing the associated risk can be
    performed efficiently by using convex surrogates and modern convex optimization
    algorithms. We provide experiments on various real-world datasets demonstrating
    the pertinence of our approach when compared to non-weighted techniques.

    Optimal Densification for Fast and Accurate Minwise Hashing

    Anshumali Shrivastava
    Comments: Fast Minwise Hashing
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    Minwise hashing is fundamental and one of the most successful hashing
    algorithms in the literature. Recent advances based on the idea of
    densification \cite{Proc:OneHashLSH_ICML14,Proc:Shrivastava_UAI14} have shown
    that it is possible to compute \(k\) minwise hashes of a vector with \(d\)
    nonzeros in mere \((d + k)\) computations, a significant improvement over the
    classical \(O(dk)\). These advances have led to an algorithmic improvement in the
    query complexity of traditional indexing algorithms based on minwise hashing.
    Unfortunately, the variance of current densification techniques is
    unnecessarily high, which leads to significantly poorer accuracy than
    vanilla minwise hashing, especially when the data is sparse. In this paper, we
    provide a novel densification scheme which relies on carefully tailored
    2-universal hashes. We show that the proposed scheme is variance-optimal, and
    without losing the runtime efficiency, it is significantly more accurate than
    existing densification techniques. As a result, we obtain a significantly
    efficient hashing scheme which has the same variance and collision probability
    as minwise hashing. Experimental evaluations on real sparse and
    high-dimensional datasets validate our claims. We believe that given the
    significant advantages, our method will replace minwise hashing implementations
    in practice.

    Sensor Fusion for Robot Control through Deep Reinforcement Learning

    Steven Bohez, Tim Verbelen, Elias De Coninck, Bert Vankeirsbilck, Pieter Simoens, Bart Dhoedt
    Comments: 6 pages, 6 figures, submitted to IROS 2017
    Subjects: Robotics (cs.RO); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Systems and Control (cs.SY)

    Deep reinforcement learning is becoming increasingly popular for robot
    control algorithms, the aim being for a robot to self-learn useful feature
    representations from unstructured sensory input that lead to the optimal
    actuation policy. In addition to sensors mounted on the robot, sensors might
    also be deployed in the environment, although these might need to be accessed
    via an unreliable wireless connection. In this paper, we demonstrate deep
    neural network architectures that are able to fuse information coming from
    multiple sensors and are robust to sensor failures at runtime. We evaluate our
    method on a search and pick task for a robot both in simulation and the real
    world.


    Information Theory

    A Real-Time Millimeter-Wave Phased Array MIMO Channel Sounder

    C. Umit Bas, Rui Wang, Dimitris Psychoudakis, Thomas Henige, Robert Monroe, Jeongho Park, Jianzhong Zhang, Andreas F. Molisch
    Comments: 6 pages, 15 figures, conference paper
    Subjects: Information Theory (cs.IT)

    In this paper, we present a novel real-time MIMO channel sounder for 28 GHz.
    Until now, the common practice to investigate the directional characteristics
    of millimeter-wave channels has been using a rotating horn antenna. The sounder
    presented here is capable of performing horizontal and vertical beam steering
    with the help of phased arrays. Thanks to fast beam-switching capability, the
    proposed sounder can perform measurements that are directionally resolved both
    at the transmitter (TX) and receiver (RX) in as little as 1.44 milliseconds,
    compared to the minutes or even hours required by rotating horn antenna
    sounders. This not only enables us to measure more points for better
    statistical inference, but also allows us to perform directional analysis in
    dynamic environments.
    Equally importantly, the short measurement time combined with the high phase
    stability of our setup limits the phase drift between TX and RX, enabling
    phase-coherent sounding of all beam pairs even when TX and RX are physically
    separated and have no cabled connection for synchronization. This ensures that
    the measurement data is suitable for high-resolution parameter extraction
    algorithms. Along with the system design and specifications, this paper also
    discusses the measurements performed for verification of the sounder.
    Furthermore, we present sample measurements from a channel sounding campaign
    performed on a residential street.

    Throughput Analysis for Wavelet OFDM in Broadband Power Line Communications

    Freddy A. Pinto-Benel, Manuel Blanco-Velasco, Fernando Cruz-Roldán
    Subjects: Information Theory (cs.IT)

    Windowed orthogonal frequency-division multiplexing (OFDM) and wavelet OFDM
    have been proposed as medium access techniques for broadband communications
    over the power line network by the standard IEEE 1901. Windowed OFDM has been
    extensively researched and employed in different fields of communication, while
    wavelet OFDM, which has been recently recommended for the first time in a
    standard, has received less attention. This work aims to show that wavelet
    OFDM, which is essentially an Extended Lapped Transform-based multicarrier
    modulation (ELT-MCM), is a viable and attractive alternative for data
    transmission in hostile scenarios, such as in-home PLC. To this end, we obtain
    theoretical expressions for ELT-MCM of: 1) the useful signal power, 2) the
    inter-symbol interference (ISI) power, 3) the inter-carrier interference (ICI)
    power, and 4) the noise power at the receiver side. The system capacity and the
    achievable throughput are derived from these. This study includes several
    computer simulations that show that ELT-MCM is an efficient alternative to
    improve data rates in PLC networks.
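
    The step from the four power terms to throughput is the standard one: form a
    per-subcarrier SINR and sum the Shannon rates. The sketch below shows that
    computation with placeholder power values, not figures from the paper.

        import numpy as np

        def achievable_rate(P_sig, P_isi, P_ici, P_noise, subcarrier_bw_hz):
            sinr = P_sig / (P_isi + P_ici + P_noise)               # per subcarrier
            return subcarrier_bw_hz * np.sum(np.log2(1.0 + sinr))  # bits/second

        M = 512                                         # subcarriers, assumed
        rate = achievable_rate(P_sig=np.full(M, 1.0),
                               P_isi=np.full(M, 0.01),
                               P_ici=np.full(M, 0.01),
                               P_noise=np.full(M, 0.05),
                               subcarrier_bw_hz=24.4e3)  # assumed spacing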

    Steerable Discrete Fourier Transform

    Giulia Fracastoro, Enrico Magli
    Subjects: Information Theory (cs.IT)

    Directional transforms have recently attracted a lot of interest thanks to their
    numerous applications in signal compression and analysis. In this letter, we
    introduce a generalization of the discrete Fourier transform, called steerable
    DFT (SDFT). Since the DFT is used in numerous fields, it may be of interest in
    a wide range of applications. Moreover, we also show that the SDFT is highly
    related to other well-known transforms, such as the Fourier sine and cosine
    transforms and the Hilbert transforms.

    Statistical Multiplexing of Computations in C-RAN with Tradeoffs in Latency and Energy

    Anders E. Kalør, Mauricio I.Agurto, Nuno K. Pratas, Jimmy J. Nielsen, Petar Popovski
    Comments: Accepted for presentation at the 3rd International Workshop on 5G RAN Design (ICC 2017)
    Subjects: Information Theory (cs.IT)

    In the Cloud Radio Access Network (C-RAN) architecture, the baseband signals
    from multiple remote radio heads are processed in a centralized baseband unit
    (BBU) pool. This architecture allows network operators to adapt the BBU’s
    computational resources to the aggregate access load experienced at the BBU,
    which can change in every air-interface access frame. The degree of savings
    that can be achieved by adapting the resources is a tradeoff between savings,
    adaptation frequency, and increased queuing time. If the time scale for
    adaptation of the resource multiplexing is greater than the access frame
    duration, then this may result in additional access latency and limit the
    energy savings. In this paper we investigate the tradeoff by considering two
    extreme time-scales for the resource multiplexing: (i) long-term, where the
    computational resources are adapted over periods much larger than the access
    frame durations; (ii) short-term, where the adaptation is below the access frame
    duration. We develop a general C-RAN queuing model that describes the access
    latency and show, for Poisson arrivals, that long-term multiplexing achieves
    savings comparable to short-term multiplexing, while offering low
    implementation complexity.

    Several Classes of Trace Codes With Either Optimal Two Weights or a Few Weights over \(\mathbb{F}_{q}+u\mathbb{F}_{q}\)

    Hongwei Liu, Youcef Maouche
    Subjects: Information Theory (cs.IT)

    Let \(p\) be a prime number and \(q=p^s\) for a positive integer \(s\). For
    any positive divisor \(e\) of \(q-1\), we construct infinite families of
    codes of dimension \(2m\) with few Lee weights. These codes are defined as
    trace codes over the ring \(R=\mathbb{F}_q + u\mathbb{F}_q\), \(u^2 = 0\).
    Using Gauss sums, their Lee weight distribution is provided. When
    \(\gcd(e,m)=1\), we obtain an infinite family of two-weight codes which meet
    the Griesmer bound. Moreover, if \(\gcd(e,m)=2, 3\) or \(4\), we construct
    new infinite families with at most five weights. These codes can be used in
    authentication codes and secret sharing schemes.

    Fairness Comparison of Uplink NOMA and OMA

    Zhiqiang Wei, Jiajia Guo, Derrick Wing Kwan Ng, Jinhong Yuan
    Comments: 6 pages, accepted for publication, VTC 2017, Spring, Sydney
    Subjects: Information Theory (cs.IT)

    In this paper, we compare the resource allocation fairness of uplink
    communications between non-orthogonal multiple access (NOMA) schemes and
    orthogonal multiple access (OMA) schemes. Through characterizing the
    contribution of the individual user data rate to the system sum rate, we
    analyze the fundamental reasons why NOMA offers a fairer resource allocation
    than OMA in asymmetric channels. Furthermore, a fairness indicator metric based
    on Jain's index is proposed to measure the asymmetry of multiuser channels.
    More importantly, the proposed metric provides a selection criterion for
    choosing between NOMA and OMA for fair resource allocation. Based on this, we
    propose a hybrid NOMA-OMA scheme to further enhance user fairness. Simulation
    results confirm the accuracy of the proposed metric
    and demonstrate the fairness enhancement of the proposed hybrid NOMA-OMA scheme
    compared to the conventional OMA and NOMA schemes.
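
    As background, Jain's index itself is simple to compute; the sketch below (our
    own snippet, not the paper's code) evaluates it for a fair and an asymmetric
    rate allocation. The paper's channel-asymmetry metric builds on this index but
    is not reproduced here.

```python
# Jain's fairness index, the quantity underlying the paper's proposed
# asymmetry metric (the NOMA/OMA selection rule itself is not reproduced).
import numpy as np

def jains_index(rates):
    """J = (sum r)^2 / (n * sum r^2); ranges from 1/n (unfair) to 1 (fair)."""
    r = np.asarray(rates, dtype=float)
    return r.sum() ** 2 / (len(r) * (r ** 2).sum())

print(jains_index([1.0, 1.0, 1.0]))   # 1.0 -> perfectly fair allocation
print(jains_index([3.0, 1.0, 0.1]))   # well below 1 -> asymmetric allocation
```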

    D2D-Aware Device Caching in MmWave-Cellular Networks

    Nikolaos Giatsoglou, Konstantinos Ntontin, Elli Kartsakli, Angelos Antonopoulos, Christos Verikoukis
    Subjects: Information Theory (cs.IT)

    In this paper, we propose a novel device caching policy that facilitates
    popular content exchange through high-rate device-to-device (D2D)
    millimeter-wave (mmWave) communication. The proposed policy, which is termed
    D2D-aware caching (DAC) policy, splits the cacheable content into two content
    groups and distributes it evenly at the paired devices. By exploiting the
    high-bandwidth availability and the directionality of mmWaves, we ensure high
    rates for the D2D transmissions, while mitigating the co-channel interference
    that limits the potential of D2D communication in the sub-6 GHz bands.
    Furthermore, based on a stochastic-geometry approach for the modeling of the
    location and number of the base stations (BSs) and users (UEs) in the network,
    we analytically derive the offloading gain that the proposed policy achieves
    and the distribution of the content retrieval delay for both half- and
    full-duplex D2D communication. The accuracy of the proposed analytical
    framework is validated against Monte-Carlo simulations. In addition, the
    results show that, for a wide range of values of the content-popularity
    indicator, the proposed policy achieves higher offloading gains and smaller
    content-retrieval delays than a widely considered state-of-the-art policy that
    caches the same content at every UE.
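
    The offloading intuition behind a split cache can be illustrated with a short
    Monte-Carlo sketch (Zipf popularity and the catalog and cache sizes are our
    own assumptions, not the paper's stochastic-geometry model): paired devices
    cache disjoint halves of the most popular files, so a request is offloaded
    whenever either device of the pair holds it.

```python
# Monte-Carlo sketch of the split-cache offloading idea: paired devices cache
# disjoint halves of the most popular files, so a request is offloaded if
# either device of the pair holds it. Zipf popularity, catalog and cache
# sizes are illustrative assumptions, not the paper's model.
import numpy as np

rng = np.random.default_rng(1)
N_FILES, CACHE, GAMMA = 1000, 20, 0.8    # catalog, per-device cache, Zipf exponent

pop = np.arange(1, N_FILES + 1, dtype=float) ** -GAMMA
pop /= pop.sum()                         # file i has popularity pop[i] (sorted)

pair_cache = np.arange(2 * CACHE)        # union of the two disjoint halves
solo_cache = np.arange(CACHE)            # baseline: same files at every UE

req = rng.choice(N_FILES, size=200_000, p=pop)
print("DAC-style pair hit rate :", np.isin(req, pair_cache).mean())
print("Identical-cache hit rate:", np.isin(req, solo_cache).mean())
```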

    On the Support Recovery of Jointly Sparse Gaussian Sources using Sparse Bayesian Learning

    Saurabh Khanna, Chandra R. Murthy
    Comments: 14 pages
    Subjects: Information Theory (cs.IT)

    In this work, we provide non-asymptotic, probabilistic guarantees for
    successful sparse support recovery by the Multiple Sparse Bayesian Learning
    (M-SBL) algorithm in the multiple measurement vector (MMV) framework. For joint
    sparse Gaussian sources, we show that M-SBL can perfectly recover their common
    nonzero support with arbitrarily high probability using only finitely many
    MMVs. In fact, the support error probability decays exponentially fast with the
    number of MMVs, with the decay rate depending on the restricted isometry
    constant of the self Khatri-Rao product of the measurement matrix. Our analysis
    theoretically confirms that M-SBL can indeed recover supports of size up to
    \(\mathcal{O}(m^{2})\), where \(m\) is the number of measurements per MMV. This
    is in contrast to popular MMV algorithms in compressed sensing such as
    simultaneous orthogonal matching pursuit and group LASSO, which have been shown
    to succeed only when the sparsity is at most \(\mathcal{O}(m)\). Next, in the
    presence of measurement noise, we show that M-SBL can recover the true
    \(k\)-sparse support of \(n\)-dimensional joint sparse Gaussian sources using
    at most \(\mathcal{O}(k \log n)\) MMVs, while only a single MMV suffices in the
    noiseless case. Unlike existing support recovery guarantees for M-SBL, our
    sufficient conditions are truly non-asymptotic and do not require the
    orthogonality of the nonzero rows of the joint sparse signals.
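
    The object driving the \(\mathcal{O}(m^2)\) scaling is the column-wise self
    Khatri-Rao product of the measurement matrix; the sketch below (illustrative
    only) forms it explicitly and shows that its rank can exceed the number of
    measurements \(m\).

```python
# The column-wise self Khatri-Rao product behind the O(m^2) result: an m x n
# matrix maps to an m^2 x n matrix whose columns are the flattened outer
# products a_k a_k^T, so its rank can exceed m (illustrative sketch).
import numpy as np

def self_khatri_rao(A):
    """Column-wise Kronecker product of A with itself: shape (m*m, n)."""
    m, n = A.shape
    return np.einsum('ik,jk->ijk', A, A).reshape(m * m, n)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 10))        # m = 4 measurements, n = 10 columns
K = self_khatri_rao(A)
print(K.shape)                           # (16, 10)
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(K))  # 4 vs. (typically) 10
```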

    Greedy-Merge Degrading has Optimal Power-Law

    Assaf Kartowsky, Ido Tal
    Comments: 17 pages, 4 figures, 1 table, to be submitted to IEEE Transactions on Information Theory
    Subjects: Information Theory (cs.IT)

    Consider a channel with a given input alphabet size and a given input
    distribution. Our aim is to degrade or upgrade it to a channel with at most L
    output letters.

    The paper contains four main results. The first result, from which the paper's
    title is derived, deals with the so-called “greedy-merge” algorithm. We derive
    an upper bound on the reduction in mutual information between input and output.
    This upper bound is within a constant factor of an algorithm-independent lower
    bound. Thus, we establish that greedy-merge is optimal in the power-law sense.

    The other main results deal with upgrading. The second result shows that a
    certain sequence of channels that was previously shown to be “hard” for
    degrading, displays the same hardness in the context of upgrading. That is,
    suppose we are given such a channel and a corresponding input distribution. If
    we upgrade (degrade) to a new channel with L output letters, we incur an
    increase (decrease) in mutual information between input and output. We show
    that a previously derived bound on the decrease in mutual information for the
    degrading case is also a lower bound on the increase for the upgrading case.

    The third result is an efficient algorithm for optimal upgrading, in the
    binary-input case. That is, we are given a channel and an input distribution.
    We must find an upgraded channel with L output letters, for which the increase
    in mutual information is minimal. We give a simple characterization of such a
    channel, which implies an efficient algorithm.

    The fourth result is an analog of the first result for the upgrading case,
    when the input is binary. That is, we first present a sub-optimal algorithm for
    the setting considered in the third result. By analyzing the algorithm, we show
    that the increase incurred in mutual information is within a constant factor of
    the lower bound derived in the second result.
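
    For readers unfamiliar with greedy-merge, the following minimal sketch
    implements its textbook form for degrading: repeatedly merge the pair of
    output letters whose merger loses the least mutual information, until at most
    L letters remain. It makes no attempt at the efficient data structures an
    optimized implementation would use.

```python
# Minimal greedy-merge degrader: merge the pair of output letters whose
# merger loses the least mutual information, until at most L letters remain.
import numpy as np
from itertools import combinations

def mutual_info(p_x, W):
    """I(X;Y) in bits for input distribution p_x and channel matrix W[x, y]."""
    p_xy = p_x[:, None] * W
    p_y = p_xy.sum(axis=0)
    mask = p_xy > 0
    return (p_xy[mask] * np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask])).sum()

def greedy_merge(p_x, W, L):
    W = W.copy()
    while W.shape[1] > L:
        best = None
        for i, j in combinations(range(W.shape[1]), 2):
            merged = np.delete(W, j, axis=1)       # i < j, so column i keeps its index
            merged[:, i] = W[:, i] + W[:, j]
            loss = mutual_info(p_x, W) - mutual_info(p_x, merged)
            if best is None or loss < best[0]:
                best = (loss, merged)
        W = best[1]
    return W

p_x = np.array([0.5, 0.5])
W = np.array([[0.5, 0.3, 0.15, 0.05],
              [0.05, 0.15, 0.3, 0.5]])
print(mutual_info(p_x, W), mutual_info(p_x, greedy_merge(p_x, W, 2)))
```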

    Low-complexity Location-aware Multi-user Massive MIMO Beamforming for High Speed Train Communications

    Xuhong Chen, Pingyi Fan
    Comments: This paper has been accepted for future publication by VTC2017-Spring
    Subjects: Information Theory (cs.IT)

    Massive multiple-input multiple-output (MIMO) adoption is one of the primary
    objectives for next-generation high speed train (HST) communication systems. In
    this paper, we consider how to design efficient low-complexity location-aware
    beamforming for the multi-user (MU) massive MIMO system in the HST scenario. We
    first put forward a low-complexity beamforming scheme based on location
    information, where multiple users are considered. Then, without considering
    inter-beam interference, a closed-form solution that maximizes the total
    service competence of the base station (BS) is proposed for this MU HST
    scenario. Finally, we present a location-aided, search-based suboptimal
    solution that eliminates the inter-beam interference and maximizes the BS
    service competence. Simulations exhibit the advantages of the proposed massive
    MIMO beamforming method.
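
    A generic illustration of the location-aware idea (a hedged sketch with
    assumed geometry and parameters, not the paper's MU design): the BS maps a
    known train position to an angle of departure and steers a matched uniform
    linear array (ULA) beam at it.

```python
# Location-aware conjugate beamforming for a uniform linear array (ULA):
# the BS converts a known train position into an angle of departure and
# steers a matched beam. Geometry and parameters are assumptions.
import numpy as np

def steering_vector(n_ant, theta, spacing=0.5):
    """Unit-norm ULA steering vector; theta in radians, spacing in wavelengths."""
    k = np.arange(n_ant)
    return np.exp(-2j * np.pi * spacing * k * np.sin(theta)) / np.sqrt(n_ant)

def beam_gain(w, theta, n_ant):
    """Array gain of beamformer w toward direction theta."""
    return n_ant * np.abs(w.conj() @ steering_vector(n_ant, theta)) ** 2

N = 64
bs_pos = np.array([0.0, 20.0])          # BS beside the track
train_pos = np.array([150.0, 0.0])      # train position from the location service
dx, dy = train_pos - bs_pos
theta_hat = np.arctan2(dy, dx)          # location -> angle of departure
w = steering_vector(N, theta_hat)       # conjugate (matched) beam

print(beam_gain(w, theta_hat, N))       # ~N: full array gain on the train
print(beam_gain(w, theta_hat + 0.3, N)) # far smaller away from the beam
```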

    Revisiting Frequency Reuse towards Supporting Ultra-Reliable Ubiquitous-Rate Communication

    Jihong Park, Dong Min Kim, Petar Popovski, Seong-Lyun Kim
    Comments: 7 pages, 6 figures, to appear in proc. IEEE WiOpt Wksp. SpaSWiN 2017
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    One of the goals of 5G wireless systems stated by the NGMN alliance is to
    provide moderate rates (50+ Mbps) everywhere and with very high reliability. We
    term this service Ultra-Reliable Ubiquitous-Rate Communication (UR2C). This
    paper investigates the role of frequency reuse in supporting UR2C in the
    downlink. To this end, two frequency reuse schemes are considered:
    user-specific frequency reuse (FRu) and BS-specific frequency reuse (FRb). For
    a given unit frequency channel, FRu reduces the number of served user
    equipments (UEs), whereas FRb directly decreases the number of interfering base
    stations (BSs). Either way, the distance to the interfering BSs increases, so
    the signal-to-interference ratio (SIR) can attain ultra-reliability, e.g. 99%
    SIR coverage at a randomly picked UE. This ultra-reliability is, however,
    achieved at the cost of a reduced frequency allocation, which may degrade the
    overall downlink rate. To fairly capture this reliability-rate tradeoff, we
    propose
    ubiquitous rate defined as the maximum downlink rate whose required SIR can be
    achieved with ultra-reliability. By using stochastic geometry, we derive
    closed-form ubiquitous rate as well as the optimal frequency reuse rules for
    UR2C.
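
    A rough Monte-Carlo sketch of the reliability-rate tradeoff (our own toy
    model, not the paper's analysis: Poisson BSs on a disc, nearest-BS
    association, and a reuse factor K that thins both the interferer set and the
    usable bandwidth by 1/K; all parameters are assumptions):

```python
# Toy Monte-Carlo of the reliability-rate tradeoff under frequency reuse.
import numpy as np

rng = np.random.default_rng(2)
LAM, RADIUS, ALPHA, TRIALS = 1e-4, 2000.0, 4.0, 10_000  # BS/m^2, m, path loss

def sir_samples(reuse):
    out = np.empty(TRIALS)
    for t in range(TRIALS):
        n = max(rng.poisson(LAM * np.pi * RADIUS**2), 1)
        d = np.sort(RADIUS * np.sqrt(rng.random(n)))   # BS distances to the UE
        signal = d[0] ** -ALPHA                        # nearest BS serves
        keep = rng.random(n - 1) < 1.0 / reuse         # surviving co-channel BSs
        interference = (d[1:][keep] ** -ALPHA).sum()
        out[t] = signal / max(interference, 1e-12)
    return out

for K in (1, 3, 7):
    sir99 = np.quantile(sir_samples(K), 0.01)          # SIR met 99% of the time
    rate = np.log2(1.0 + sir99) / K                    # rate on a 1/K bandwidth slice
    print(f"K={K}: 99%-reliable SIR={sir99:.3f}, ubiquitous rate~{rate:.4f} b/s/Hz")
```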

    Temporal Analysis of Measured LOS Massive MIMO Channels with Mobility

    Paul Harris, Steffen Malkowsky, Joao Vieira, Fredrik Tufvesson, Wael Boukley Hasan, Liang Liu, Mark Beach, Simon Armour, Ove Edfors
    Comments: Accepted for presentation at the 85th IEEE Vehicular Technology Conference in Sydney. 5 Pages. arXiv admin note: substantial text overlap with arXiv:1701.08818
    Subjects: Information Theory (cs.IT)

    The first measured results for massive multiple-input, multiple-output (MIMO)
    performance in a line-of-sight (LOS) scenario with moderate mobility are
    presented, with 8 users served by a 100-antenna base station (BS) at 3.7 GHz.
    When such a large number of channels dynamically change, the inherent
    propagation and processing delay has a critical relationship with the rate of
    change, as the use of outdated channel information can result in severe
    detection and precoding inaccuracies. For the downlink (DL) in particular, a
    time division duplex (TDD) configuration synonymous with massive MIMO
    deployments could mean only the uplink (UL) is usable in extreme cases.
    Therefore, it is of great interest to investigate the impact of mobility on
    massive MIMO performance and consider ways to combat the potential limitations.
    In a mobile scenario with moving cars and pedestrians, the correlation of the
    MIMO channel vector over time is inspected for vehicles moving up to 29 km/h.
    For a 100-antenna system, it is found that the required channel state
    information (CSI) update rate may increase by a factor of 7 compared to an
    8-antenna system, whilst the power-control update rate could be decreased by
    at least a factor of 5 relative to a single-antenna system.
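
    The correlation metric in question can be illustrated on synthetic data; the
    sketch below uses an AR(1) channel-aging model (with an assumed aging factor)
    as a stand-in for the measured channels, which obviously cannot be reproduced
    here.

```python
# Channel-aging sketch: normalized correlation between a user's channel
# vector and its value tau frames later, on synthetic AR(1) Rayleigh
# channels. rho controls the aging speed (assumed, not measured).
import numpy as np

rng = np.random.default_rng(3)
M, T, rho = 100, 200, 0.995              # antennas, snapshots, aging factor

def cn(size):                             # circularly-symmetric complex Gaussian
    return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2)

h = np.zeros((T, M), dtype=complex)
h[0] = cn(M)
for t in range(1, T):                     # AR(1): h_t = rho*h_{t-1} + innovation
    h[t] = rho * h[t - 1] + np.sqrt(1 - rho**2) * cn(M)

def temporal_corr(h, tau):
    num = np.abs(np.sum(h[:-tau].conj() * h[tau:], axis=1))
    den = np.linalg.norm(h[:-tau], axis=1) * np.linalg.norm(h[tau:], axis=1)
    return (num / den).mean()

for tau in (1, 10, 50):
    print(f"lag {tau:3d}: mean |correlation| = {temporal_corr(h, tau):.3f}")
```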

    Beamforming and Power Splitting Designs for AN-aided Secure Multi-user MIMO SWIPT Systems

    Zhengyu Zhu, Zheng Chu, Ning Wang, Sai Huang, Zhongyong Wang, Inkyu Lee
    Comments: 12 pages, 6 figures, submitted for possible publication
    Subjects: Information Theory (cs.IT)

    In this paper, an energy harvesting scheme for a multi-user
    multiple-input-multiple-output (MIMO) secrecy channel with artificial noise
    (AN) transmission is investigated. Joint optimization of the transmit
    beamforming matrix, the AN covariance matrix, and the power splitting ratio is
    conducted to minimize the transmit power under the target secrecy rate, the
    total transmit power, and the harvested energy constraints. The original
    problem is shown to be non-convex, which is tackled by a two-layer
    decomposition approach. The inner layer problem is solved through semi-definite
    relaxation, while the outer problem is shown to be a single-variable
    optimization that can be solved by a one-dimensional (1-D) line search. To
    reduce the computational complexity, a sequential parametric convex
    approximation (SPCA) method is proposed to find a near-optimal solution. The
    work is then extended to the imperfect channel state information case with
    norm-bounded channel errors. Furthermore, the tightness of the relaxation for
    the proposed schemes is validated by showing that the optimal solution of the
    relaxed problem is rank-one. Simulation results demonstrate that the proposed
    SPCA method achieves the same performance as the 1-D line-search scheme but
    with much lower complexity.
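
    The semidefinite-relaxation step can be illustrated on a stripped-down
    power-minimization problem (a hedged sketch using the cvxpy modeling package;
    the paper's secrecy, AN, and harvesting constraints are all omitted): drop the
    rank constraint on W = w w^H, solve the resulting SDP, and check whether the
    solution is rank-one.

```python
# Stripped-down SDR instance: minimize transmit power subject to a received-
# power constraint, with the rank-one constraint on W = w w^H dropped.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
n, gamma, sigma2 = 4, 2.0, 1.0                  # antennas, SINR-style target, noise
h = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
H = np.outer(h, h.conj())                        # rank-one channel matrix

W = cp.Variable((n, n), hermitian=True)
cons = [W >> 0, cp.real(cp.trace(H @ W)) >= gamma * sigma2]
prob = cp.Problem(cp.Minimize(cp.real(cp.trace(W))), cons)
prob.solve()

eigs = np.linalg.eigvalsh(W.value)
print("power:", prob.value, " eigenvalues:", np.round(eigs, 6))
# If one eigenvalue dominates, the relaxation is (numerically) rank-one and
# the beamformer is recovered as the principal eigenvector of W.
```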

    Tuning Free Orthogonal Matching Pursuit

    Sreejith Kallummil, Sheetal Kalyani
    Comments: 13 pages. 9 figures
    Subjects: Machine Learning (stat.ML); Information Theory (cs.IT)

    Orthogonal matching pursuit (OMP) is a widely used compressive sensing (CS)
    algorithm for recovering sparse signals in noisy linear regression models. The
    performance of OMP depends on its stopping criterion (SC). SCs for OMP
    discussed in the literature typically assume knowledge of either the sparsity
    \(k_0\) of the signal to be estimated or the noise variance \(\sigma^2\), both
    of which are unavailable in many practical applications. In this article we
    develop a modified version of OMP called tuning-free OMP (TF-OMP) which does
    not require an SC. TF-OMP is proved to accomplish successful sparse recovery
    under the usual assumptions on the restricted isometry constants (RIC) and
    mutual coherence of the design matrix. TF-OMP is numerically shown to deliver a
    highly competitive performance in comparison with OMP having a priori knowledge
    of \(k_0\) or \(\sigma^2\). The greedy algorithm for robust de-noising (GARD)
    is an OMP-like algorithm proposed for efficient estimation in classical
    overdetermined linear regression models corrupted by sparse outliers. However,
    GARD requires knowledge of the inlier noise variance, which is difficult to
    estimate. We also develop a tuning-free algorithm (TF-GARD) for efficient
    estimation in the presence of sparse outliers by extending the operating
    principle of TF-OMP to GARD. TF-GARD is numerically shown to achieve
    performance comparable to that of the existing implementation of GARD.
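
    For context, plain OMP with the conventional stopping rules looks as follows
    (our own sketch; TF-OMP's tuning-free statistic is precisely what replaces
    the \(k_0\)/\(\sigma\) knowledge used below and is not reproduced here):

```python
# Vanilla OMP with the usual stopping criteria: a known sparsity k0 or a
# residual-norm test using a known noise level sigma.
import numpy as np

def omp(A, y, k0=None, sigma=None):
    """Recover sparse x from y = Ax + noise; stop on sparsity or residual norm."""
    assert k0 is not None or sigma is not None, "need one stopping criterion"
    m, n = A.shape
    support, residual = [], y.copy()
    while True:
        j = int(np.argmax(np.abs(A.T @ residual)))       # best-matching column
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s
        if k0 is not None and len(support) >= k0:
            break
        if sigma is not None and np.linalg.norm(residual) <= sigma * np.sqrt(m):
            break
    x = np.zeros(n)
    x[support] = x_s
    return x

rng = np.random.default_rng(5)
m, n, k = 40, 100, 4
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n); x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x0 + 0.01 * rng.standard_normal(m)
print(np.flatnonzero(x0), np.flatnonzero(np.round(omp(A, y, k0=k), 3)))
```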

    Tail asymptotics of signal-to-interference ratio distribution in spatial cellular network models

    Naoto Miyoshi, Tomoyuki Shirai
    Comments: Dedicated to Tomasz Rolski on the occasion of his 70th birthday
    Subjects: Probability (math.PR); Information Theory (cs.IT)

    We consider a spatial stochastic model of wireless cellular networks, where
    the base stations (BSs) are deployed according to a simple and stationary point
    process on \(\mathbb{R}^d\), \(d\ge2\). In this model, we investigate tail
    asymptotics of the distribution of signal-to-interference ratio (SIR), which is
    a key quantity in wireless communications. In the case where the path-loss
    function representing signal attenuation is unbounded at the origin, we derive
    the exact tail asymptotics of the SIR distribution under an appropriate
    sufficient condition. While we show that widely-used models based on a Poisson
    point process and on a determinantal point process meet the sufficient
    condition, we also give a counterexample violating it. In the case of bounded
    path-loss functions, we derive a logarithmically asymptotic upper bound on the
    SIR tail distribution for the Poisson-based and \(\alpha\)-Ginibre-based models.
    A logarithmically asymptotic lower bound with the same order as the upper bound
    is also obtained for the Poisson-based model.

    Holevo Superadditivity Bounds for Heralded Channels

    Li Gao, Marius Junge, Nicholas LaRacuente
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)

    We show that for a particular class of quantum channels, which we call
    heralded channels, monogamy of squashed entanglement limits the superadditivity
    of Holevo capacity. Heralded channels provide a means to understand the quantum
    erasure channel composed with an arbitrary other quantum channel, as well as
    common situations in experimental quantum information that involve frequent
    loss of qubits or failure of trials. We also show how entanglement monogamy
    applies to non-classicality in quantum games, and we consider how any faithful,
    monogamous entanglement measure may bound the achievable value of
    entanglement-dependent quantities in many-party scenarios.

    Towards Optimal Sparse Inverse Covariance Selection through Non-Convex Optimization

    Sidhant Misra, Marc Vuffray, Andrey Y. Lokhov, Michael Chertkov
    Subjects: Learning (cs.LG); Information Theory (cs.IT); Statistics Theory (math.ST)

    We study the problem of reconstructing the graph of a sparse Gaussian
    Graphical Model from independent observations, which is equivalent to finding
    non-zero elements of an inverse covariance matrix. For a model of size \(p\)
    and maximum degree \(d\), information-theoretic lower bounds established in
    prior works require that the number of samples needed for recovering the graph
    perfectly be at least \(d \log p/\kappa^2\), where \(\kappa\) is the minimum
    normalized non-zero entry of the inverse covariance matrix. Existing algorithms
    require additional assumptions to guarantee perfect graph reconstruction, and
    consequently their sample complexity depends on parameters that are not present
    in the information-theoretic lower bound. We propose an estimator, called
    SLICE, that consists of a cardinality-constrained least-squares regression
    followed by a thresholding procedure. Without any additional assumptions, we
    show that SLICE attains a sample complexity of \(\frac{64}{\kappa^4}d \log p\),
    which differs from the lower bound only by a factor proportional to
    \(1/\kappa^2\) and depends only on parameters present in the lower bound.

    Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models

    Marc Vuffray, Sidhant Misra, Andrey Y. Lokhov, Michael Chertkov
    Comments: To be published in Advances in Neural Information Processing Systems 30
    Subjects: Learning (cs.LG); Statistical Mechanics (cond-mat.stat-mech); Information Theory (cs.IT); Statistics Theory (math.ST); Machine Learning (stat.ML)

    We consider the problem of learning the underlying graph of an unknown Ising
    model on p spins from a collection of i.i.d. samples generated from the model.
    We suggest a new estimator that is computationally efficient and requires a
    number of samples that is near-optimal with respect to the previously
    established information-theoretic lower bound. Our statistical estimator has a
    physical
    interpretation in terms of “interaction screening”. The estimator is consistent
    and is efficiently implemented using convex optimization. We prove that with
    appropriate regularization, the estimator recovers the underlying graph using a
    number of samples that is logarithmic in the system size p and exponential in
    the maximum coupling intensity and the maximum node degree.
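
    A minimal sketch of the interaction-screening idea for a single spin (our own
    toy construction: Gibbs-sampled chain data, a plain ISTA loop in place of the
    paper's convex solver, and illustrative penalty and step-size choices) reads
    off a node's neighbors from the nonzero entries of the minimizer:

```python
# Interaction screening for one node u: minimize the l1-regularized
# Interaction Screening Objective S(theta) = mean_m exp(-s_u * <theta, s>)
# over couplings theta (theta[u] pinned to zero), via a plain ISTA loop.
import numpy as np

rng = np.random.default_rng(6)

# Toy data: a 4-spin Ising chain with uniform coupling J = 0.7, sampled with
# Gibbs sweeps (successive sweeps are correlated; good enough for a sketch).
p, M, J = 4, 20_000, 0.7
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
s = rng.choice([-1, 1], size=p)

def sweep(s):
    for i in range(p):
        field = J * sum(s[j] for j in adj[i])
        s[i] = 1 if rng.random() < 1.0 / (1.0 + np.exp(-2.0 * field)) else -1

for _ in range(1000):                      # burn-in
    sweep(s)
S = np.empty((M, p))
for m in range(M):
    sweep(s)
    S[m] = s

def iso_grad(theta, u, S):
    """Gradient of the Interaction Screening Objective for node u."""
    z = np.exp(-S[:, u] * (S @ theta))     # theta[u] is kept at zero
    return (-(S[:, u] * z)[:, None] * S).mean(axis=0)

u, lam, eta, theta = 0, 0.02, 0.1, np.zeros(p)
for _ in range(500):                       # ISTA: gradient step + soft threshold
    theta -= eta * iso_grad(theta, u, S)
    theta = np.sign(theta) * np.maximum(np.abs(theta) - eta * lam, 0.0)
    theta[u] = 0.0
print(np.round(theta, 2))                  # a clearly nonzero entry only at j = 1
```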



