
    arXiv Paper Daily: Thu, 30 Mar 2017

    Published by 我爱机器学习 (52ml.net) on 2017-03-30 00:00:00

    Neural and Evolutionary Computing

    Hierarchical Surrogate Modeling for Illumination Algorithms

    Alexander Hagg
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Evolutionary illumination is a recent technique that produces many
    diverse, optimal solutions in a map of manually defined features. To support
    the large number of objective function evaluations, surrogate model assistance
    was recently introduced. Illumination models need to represent many more
    diverse optimal regions than classical surrogate models. In this PhD thesis,
    we propose to decompose the sample set, decreasing model complexity, by
    hierarchically segmenting the training set according to the samples'
    coordinates in feature space. An ensemble of diverse models can then be
    trained to serve as a surrogate to illumination.
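
    A minimal sketch of the proposed decomposition, assuming median splits in
    feature space and Gaussian-process leaf models (the thesis proposal fixes
    neither choice):

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def segment(feats, idx, depth):
        """Recursively split sample indices by the median along alternating feature axes."""
        if depth == 0 or len(idx) < 8:
            return [idx]
        axis = depth % feats.shape[1]
        med = np.median(feats[idx, axis])
        left, right = idx[feats[idx, axis] <= med], idx[feats[idx, axis] > med]
        if len(left) == 0 or len(right) == 0:
            return [idx]
        return segment(feats, left, depth - 1) + segment(feats, right, depth - 1)

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(200, 3))              # candidate solutions (genomes)
    feats = X[:, :2]                            # manually defined feature coordinates
    y = np.sin(3 * X).sum(axis=1)               # toy objective values

    leaves = segment(feats, np.arange(len(X)), depth=4)
    models = [(idx, GaussianProcessRegressor().fit(X[idx], y[idx])) for idx in leaves]

    def predict(x, feat):
        """Route a query to the leaf model whose samples are nearest in feature space."""
        cents = [feats[idx].mean(axis=0) for idx, _ in models]
        k = int(np.argmin([np.linalg.norm(feat - c) for c in cents]))
        return models[k][1].predict(x.reshape(1, -1))[0]

    print(predict(X[0], feats[0]))
    ```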

    Experience-based Optimization: A Coevolutionary Approach

    Shengcai Liu, Ke Tang, Xin Yao
    Subjects: Neural and Evolutionary Computing (cs.NE)

    This paper studies improving solvers based on their past solving experiences,
    and focuses on improving solvers by offline training. Specifically, the key
    issues of offline training methods are discussed, and research belonging to
    this category but from different areas are reviewed in a unified framework.
    Existing training methods generally adopt a two-stage strategy in which
    selecting the training instances and training the solver are treated as two
    independent phases. This paper proposes a new training method, dubbed LiangYi,
    which addresses these two issues simultaneously. LiangYi includes a training
    module for a population-based solver and an instance sampling module for
    updating the training instances. The idea behind LiangYi is to promote the
    population-based solver by training it (with the training module) to improve
    its performance on those instances (discovered by the sampling module) on which
    it performs badly, while keeping the good performances obtained by it on
    previous instances. An instantiation of LiangYi on the Travelling Salesman
    Problem is also proposed. Empirical results on a huge testing set containing
    10000 instances showed that LiangYi could train solvers that perform
    significantly better than solvers trained by other state-of-the-art training
    methods.
    Moreover, empirical investigation of the behaviours of LiangYi confirmed it was
    able to continuously improve the solver through training.
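
    A toy skeleton of the two-module loop, with stand-in train, perf and
    sample_hard functions (the paper's concrete TSP operators are not reproduced
    here):

    ```python
    import random

    def train(solver, insts):
        """Toy training module: move the solver toward the mean of its training instances."""
        target = sum(insts) / len(insts)
        return solver + 0.5 * (target - solver)

    def perf(solver, inst):
        return -abs(solver - inst)          # higher is better

    def sample_hard(solver, pool, k=3):
        """Toy instance sampling module: pick the instances the solver does worst on."""
        return sorted(pool, key=lambda i: perf(solver, i))[:k]

    def liangyi(solver, insts, pool, rounds=10):
        for _ in range(rounds):
            solver = train(solver, insts)                   # training module
            hard = sample_hard(solver, pool)                # instance sampling module
            insts = hard + insts[: len(insts) - len(hard)]  # keep some previous instances
        return solver

    print(liangyi(0.0, [random.uniform(0, 10) for _ in range(6)],
                  [random.uniform(0, 10) for _ in range(50)]))
    ```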

    Time Series Forecasting using RNNs: an Extended Attention Mechanism to Model Periods and Handle Missing Values

    Yagmur G. Cinar, Hamid Mirisaee, Parantapa Goswami, Eric Gaussier, Ali Ait-Bachir, Vadim Strijov
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    In this paper, we study the use of recurrent neural networks (RNNs) for
    modeling and forecasting time series. We first illustrate that standard
    sequence-to-sequence RNNs neither capture periods in time series well nor
    handle missing values well, even though many real-life time series are
    periodic and contain missing values. We then propose an extended attention
    mechanism that can be deployed on top of any RNN and that is designed to
    capture periods and make the RNN more robust to missing values. We show the
    effectiveness of this novel model through extensive experiments with multiple
    univariate and multivariate datasets.
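
    A minimal sketch of attention over past hidden states, assuming dot-product
    scores (the paper's extension additionally models period positions and
    missing values, which is not reproduced here):

    ```python
    import numpy as np

    def attend(history, query):
        """history: (T, d) past RNN states; query: (d,) current state."""
        scores = history @ query
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # softmax over past time steps
        return weights @ history                 # context vector used for the forecast

    H = np.random.default_rng(0).normal(size=(24, 8))
    print(attend(H, H[-1]).shape)                # (8,)
    ```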

    Exploring Heritability of Functional Brain Networks with Inexact Graph Matching

    Sofia Ira Ktena, Salim Arslan, Sarah Parisot, Daniel Rueckert
    Comments: accepted at ISBI 2017: International Symposium on Biomedical Imaging, Apr 2017, Melbourne, Australia
    Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE)

    Data-driven brain parcellations aim to provide a more accurate representation
    of an individual’s functional connectivity, since they are able to capture
    individual variability that arises due to development or disease. This renders
    comparisons between the emerging brain connectivity networks more challenging,
    since correspondences between their elements are not preserved. Unveiling these
    correspondences is of major importance to keep track of local functional
    connectivity changes. We propose a novel method based on graph edit distance
    for the comparison of brain graphs directly in their domain, that can
    accurately reflect similarities between individual networks while providing the
    network element correspondences. This method is validated on a dataset of 116
    twin subjects provided by the Human Connectome Project.
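
    For illustration only, networkx ships a generic (exact, hence slow) graph
    edit distance; the paper solves an inexact matching tailored to functional
    brain graphs rather than this routine:

    ```python
    import networkx as nx

    G1 = nx.gnm_random_graph(8, 12, seed=1)
    G2 = nx.gnm_random_graph(8, 12, seed=2)
    # Exact GED; feasible only for small graphs.
    print(nx.graph_edit_distance(G1, G2))
    ```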

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Albert Gatt, Emiel Krahmer
    Comments: 111 pages, 8 figures, 2 tables
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    This paper surveys the current state of the art in Natural Language
    Generation (NLG), defined as the task of generating text or speech from
    non-linguistic input. A survey of NLG is timely in view of the changes that the
    field has undergone over the past decade or so, especially in relation to new
    (usually data-driven) methods, as well as new applications of NLG technology.
    This survey therefore aims to (a) give an up-to-date synthesis of research on
    the core tasks in NLG and the architectures adopted to organise such tasks;
    (b) highlight a number of relatively recent research topics that
    have arisen partly as a result of growing synergies between NLG and other areas
    of artificial intelligence; (c) draw attention to the challenges in NLG
    evaluation, relating them to similar challenges faced in other areas of Natural
    Language Processing, with an emphasis on different evaluation methods and the
    relationships between them.

    Theory II: Landscape of the Empirical Risk in Deep Learning

    Tomaso Poggio, Qianli Liao
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Previous theoretical work on deep learning and neural network optimization
    tends to focus on avoiding saddle points and local minima. However, the
    practical observation is that, at least for the most successful Deep
    Convolutional Neural Networks (DCNNs) for visual processing, practitioners can
    always increase the network size to fit the training data (an extreme example
    would be [1]). The most successful DCNNs such as VGG and ResNets are best used
    with a small degree of “overparametrization”. In this work, we characterize
    with a mix of theory and experiments, the landscape of the empirical risk of
    overparametrized DCNNs. We first prove the existence of a large number of
    degenerate global minimizers with zero empirical error (modulo inconsistent
    equations). The zero-minimizers, in the case of classification, have a
    non-zero margin. The same minimizers are degenerate and thus very likely to be
    found by SGD, which will furthermore select with higher probability the
    zero-minimizer with larger margin, as discussed in Theory III (to be released).
    We further experimentally explored and visualized the landscape of empirical
    risk of a DCNN on CIFAR-10 during the entire training process and especially
    the global minima. Finally, based on our theoretical and experimental results,
    we propose an intuitive model of the landscape of DCNN’s empirical loss
    surface, which might not be as complicated as people commonly believe.


    Computer Vision and Pattern Recognition

    CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training

    Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present variational generative adversarial networks, a general learning
    framework that combines a variational auto-encoder with a generative
    adversarial network, for synthesizing images of fine-grained categories, such
    as faces of a specific person or objects in a category. Our approach models an
    image as a composition of label and latent attributes in a probabilistic model.
    By varying the fine-grained category label fed to the resulting generative
    model, we can generate images in a specific category from randomly drawn
    values of a latent attribute vector. The novelty of our approach comes from
    two aspects. Firstly, we propose to adopt a cross-entropy loss for the
    discriminative and classifier networks, but a mean discrepancy objective for the
    generative network. This kind of asymmetric loss function makes the training of
    the GAN more stable. Secondly, we adopt an encoder network to learn the
    relationship between the latent space and the real image space, and use
    pairwise feature matching to keep the structure of generated images. We
    experiment with natural images of faces, flowers, and birds, and demonstrate
    that the proposed models are capable of generating realistic and diverse
    samples with fine-grained category labels. We further show that our models can
    be applied to other tasks, such as image inpainting, super-resolution, and data
    augmentation for training better face recognition models.
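
    The asymmetric objectives can be sketched as below, with cross entropy for
    the real/fake discriminator and mean feature matching for the generator
    (architectures, the classifier term, and loss weighting are omitted; names
    are illustrative):

    ```python
    import torch
    import torch.nn.functional as F

    def d_loss(real_logits, fake_logits):
        """Discriminator: standard cross entropy on real vs. fake."""
        return (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
                + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))

    def g_loss(real_feats, fake_feats):
        """Generator: match mean feature statistics instead of fooling D directly."""
        return (real_feats.mean(dim=0) - fake_feats.mean(dim=0)).pow(2).sum()

    print(d_loss(torch.randn(8, 1), torch.randn(8, 1)),
          g_loss(torch.randn(8, 64), torch.randn(8, 64)))
    ```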

    Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

    Matan Sela, Elad Richardson, Ron Kimmel
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    It has been recently shown that neural networks can recover the geometric
    structure of a face from a single given image. A common denominator of most
    existing face geometry reconstruction methods is the restriction of the
    solution space to some low-dimensional subspace. While such a model
    significantly simplifies the reconstruction problem, it is inherently limited
    in its expressiveness. As an alternative, we propose an Image-to-Image
    translation network that maps the input image to a depth image and a facial
    correspondence map. This explicit pixel-based mapping can then be utilized to
    provide high quality reconstructions of diverse faces under extreme
    expressions. In the spirit of recent approaches, the network is trained only
    with synthetic data, and is then evaluated on “in-the-wild” facial images. Both
    qualitative and quantitative analyses demonstrate the accuracy and the
    robustness of our approach. As an additional analysis of the proposed network,
    we show that it can be used as a geometric constraint for facial image
    translation tasks.

    Google Map Aided Visual Navigation for UAVs in GPS-denied Environment

    Mo Shan, Fei Wang, Feng Lin, Zhi Gao, Ya Z. Tang, Ben M. Chen
    Comments: Published in ROBIO 2015, Zhuhai, China. Fixed a typo
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a framework for Google Map aided UAV navigation in GPS-denied
    environment. Geo-referenced navigation provides drift-free localization and
    does not require loop closures. The UAV position is initialized via
    correlation, which is simple and efficient. We then use optical flow to predict
    its position in subsequent frames. During pose tracking, we obtain inter-frame
    translation either by motion field or homography decomposition, and we use HOG
    features for registration on Google Map. We employ a particle filter to
    conduct a coarse-to-fine search to localize the UAV. Offline tests using
    aerial images collected by our quadrotor platform show promising results, as
    our approach eliminates the drift in dead-reckoning, and the small
    localization error
    indicates the superiority of our approach as a supplement to GPS.
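
    A stripped-down version of the particle-filter step, with a stand-in
    similarity function in place of the HOG-vs-map scoring used in the paper:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def similarity(pos):
        """Stand-in for HOG registration against the map; peaks at the true position."""
        return np.exp(-np.linalg.norm(pos - np.array([50.0, 50.0])) / 10.0)

    particles = rng.uniform(0, 100, size=(500, 2))    # candidate UAV map positions
    for _ in range(10):
        particles += rng.normal(0, 1.0, particles.shape)  # motion prediction (optical flow)
        w = np.array([similarity(p) for p in particles])
        w /= w.sum()
        particles = particles[rng.choice(len(particles), len(particles), p=w)]  # resample
    print(particles.mean(axis=0))                     # localization estimate
    ```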

    Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks

    Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, Troy Chinen, Sung Jin Hwang, Joel Shor, George Toderici
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a method for lossy image compression based on recurrent,
    convolutional neural networks that outperforms BPG (4:2:0), WebP, JPEG2000,
    and JPEG as measured by MS-SSIM. We introduce three improvements over previous
    research that lead to this state-of-the-art result. First, we show that
    training with a pixel-wise loss weighted by SSIM increases reconstruction
    quality according to several metrics. Second, we modify the recurrent
    architecture to improve spatial diffusion, which allows the network to more
    effectively capture and propagate image information through the network’s
    hidden state. Finally, in addition to lossless entropy coding, we use a
    spatially adaptive bit allocation algorithm to more efficiently use the limited
    number of bits to encode visually complex image regions. We evaluate our method
    on the Kodak and Tecnick image sets and compare against standard codecs as
    well as recently published methods based on deep neural networks.

    Pose-conditioned Spatio-Temporal Attention for Human Action Recognition

    Fabien Baradel, Christian Wolf, Julien Mille
    Comments: 10 pages, project page: this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We address human action recognition from multi-modal video data involving
    articulated pose and RGB frames and propose a two-stream approach. The pose
    stream is processed with a convolutional model taking as input a 3D tensor
    holding data from a sub-sequence. A specific joint ordering, which respects the
    topology of the human body, ensures that different convolutional layers
    correspond to meaningful levels of abstraction. The raw RGB stream is handled
    by a spatio-temporal soft-attention mechanism conditioned on features from the
    pose network. An LSTM network receives input from a set of image locations at
    each instant. A trainable glimpse sensor extracts features on a set of
    predefined locations specified by the pose stream, namely the 4 hands of the
    two people involved in the activity. Appearance features give important cues on
    hand motion and on objects held in each hand. We show that it is of high
    interest to shift the attention to different hands at different time steps
    depending on the activity itself. Finally, a temporal attention mechanism learns
    how to fuse LSTM features over time. We evaluate the method on 3 datasets.
    State-of-the-art results are achieved on the largest dataset for human activity
    recognition, namely NTU-RGB+D, as well as on the SBU Kinect Interaction
    dataset. Performance close to state-of-the-art is achieved on the smaller MSR
    Daily Activity 3D dataset.

    Flow-Guided Feature Aggregation for Video Object Detection

    Xizhou Zhu, Yujie Wang, Jifeng Dai, Lu Yuan, Yichen Wei
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Extending state-of-the-art object detectors from image to video is
    challenging. The accuracy of detection suffers from degenerated object
    appearances in videos, e.g., motion blur, video defocus, rare poses, etc.
    Existing work attempts to exploit temporal information at the box level, but
    such methods are not trained end-to-end. We present flow-guided feature
    aggregation, an accurate and end-to-end learning framework for video object
    detection. It leverages temporal coherence at the feature level instead. It
    improves the
    per-frame features by aggregation of nearby features along the motion paths,
    and thus improves the video recognition accuracy. Our method significantly
    improves upon strong single-frame baselines in ImageNet VID, especially for
    the more challenging fast-moving objects. Our framework is principled, and on
    par with the best engineered systems that won the ImageNet VID challenge in
    2016, without additional bells and whistles. The code will be released.
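
    The feature-level aggregation can be sketched as a per-pixel,
    similarity-weighted average of neighbor features already warped to the
    reference frame (the flow-based warping is assumed done; the paper's
    weighting network differs):

    ```python
    import numpy as np

    def aggregate(ref, neighbors, eps=1e-8):
        """ref: (C, H, W); neighbors: list of (C, H, W) features warped to the ref frame."""
        feats = [ref] + neighbors
        sims = []
        for f in feats:
            num = (f * ref).sum(axis=0)
            den = np.linalg.norm(f, axis=0) * np.linalg.norm(ref, axis=0) + eps
            sims.append(num / den)                    # per-pixel cosine similarity
        w = np.exp(sims) / np.exp(sims).sum(axis=0)   # softmax over frames
        return sum(wi * fi for wi, fi in zip(w, feats))

    rng = np.random.default_rng(0)
    ref = rng.normal(size=(16, 8, 8))
    out = aggregate(ref, [ref + rng.normal(0, 0.1, ref.shape) for _ in range(2)])
    print(out.shape)
    ```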

    Iterative Object and Part Transfer for Fine-Grained Recognition

    Zhiqiang Shen, Yu-Gang Jiang, Dequan Wang, Xiangyang Xue
    Comments: To appear in ICME 2017 as an oral paper
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The aim of fine-grained recognition is to identify sub-ordinate categories in
    images like different species of birds. Existing works have confirmed that, in
    order to capture the subtle differences across the categories, automatic
    localization of objects and parts is critical. Most approaches for object and
    part localization relied on the bottom-up pipeline, where thousands of region
    proposals are generated and then filtered by pre-trained object/part models.
    This is computationally expensive and not scalable once the number of
    objects/parts becomes large. In this paper, we propose a nonparametric
    data-driven method for object and part localization. Given an unlabeled test
    image, our approach transfers annotations from a few similar images retrieved
    in the training set. In particular, we propose an iterative transfer strategy
    that gradually refines the predicted bounding boxes. Based on the located
    objects and parts, deep convolutional features are extracted for recognition.
    We evaluate our approach on the widely-used CUB200-2011 dataset and a new and
    large dataset called Birdsnap. On both datasets, we achieve better results than
    many state-of-the-art approaches, including a few using oracle (manually
    annotated) bounding boxes in the test images.

    A Geometric Framework for Stochastic Shape Analysis

    Alexis Arnaudon, Darryl D. Holm, Stefan Sommer
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Dynamical Systems (math.DS); Numerical Analysis (math.NA)

    We introduce a stochastic model of diffeomorphisms, whose action on a variety
    of data types descends to stochastic models of shapes, images and landmarks.
    The stochasticity is introduced in the vector field which transports the data
    in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework for
    shape analysis and image registration. The stochasticity thereby models errors
    or uncertainties of the flow in following the prescribed deformation velocity.
    The approach is illustrated in the example of finite dimensional landmark
    manifolds, whose stochastic evolution is studied both via the Fokker-Planck
    equation and by numerical simulations. We derive two approaches for inferring
    parameters of the stochastic model from landmark configurations observed at
    discrete time points. The first of the two approaches matches moments of the
    Fokker-Planck equation to sample moments of the data, while the second approach
    employs an Expectation-Maximisation based algorithm using a Monte Carlo bridge
    sampling scheme to optimise the data likelihood. We derive and numerically test
    the ability of the two approaches to infer the spatial correlation length of
    the underlying noise.
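
    As an illustrative reduction, the perturbed landmark flow can be integrated
    with an Euler-Maruyama scheme under a Gaussian kernel, with momenta held
    fixed for brevity (the paper evolves the full Hamiltonian dynamics):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    q = np.linspace(0, 1, 10)[:, None] * np.array([[1.0, 0.0]])  # landmark positions
    p = np.tile([[0.0, 0.2]], (10, 1))                           # momenta (held fixed here)
    sigma, noise, dt = 0.3, 0.05, 0.01

    def K(qa, qb):
        """Gaussian kernel matrix between landmark sets."""
        d = np.linalg.norm(qa[:, None] - qb[None], axis=-1)
        return np.exp(-d**2 / (2 * sigma**2))

    for _ in range(100):
        v = K(q, q) @ p                                          # deterministic velocity
        q = q + v * dt + noise * np.sqrt(dt) * rng.normal(size=q.shape)  # noisy step
    print(q.round(2))
    ```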

    Image Restoration using Autoencoding Priors

    Siavash Arjomand Bigdeli, Matthias Zwicker
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    We propose to leverage denoising autoencoder networks as priors to address
    image restoration problems. We build on the key observation that the output of
    an optimal denoising autoencoder is a local mean of the true data density, and
    the autoencoder error (the difference between the output and input of the
    trained autoencoder) is a mean shift vector. We use the magnitude of this mean
    shift vector, that is, the distance to the local mean, as the negative log
    likelihood of our natural image prior. For image restoration, we maximize the
    likelihood using gradient descent by backpropagating the autoencoder error. A
    key advantage of our approach is that we do not need to train separate networks
    for different image restoration tasks, such as non-blind deconvolution with
    different kernels, or super-resolution at different magnification factors. We
    demonstrate state of the art results for non-blind deconvolution and
    super-resolution using the same autoencoding prior.
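
    The restoration iteration can be sketched as gradient descent combining a
    data term with the mean-shift (autoencoder-error) prior gradient; the
    trained DAE is replaced here by a Gaussian blur purely as a stand-in:

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def dae(x):
        return gaussian_filter(x, sigma=1.0)  # stand-in for a trained denoising autoencoder

    def restore(y, steps=50, lam=0.5, lr=0.5):
        x = y.copy()
        for _ in range(steps):
            prior_grad = x - dae(x)           # mean-shift vector: distance to the local mean
            data_grad = x - y                 # simple denoising data term
            x = x - lr * (data_grad + lam * prior_grad)
        return x

    y = np.random.default_rng(0).normal(size=(32, 32))
    print(restore(y).std())
    ```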

    Sentiment Recognition in Egocentric Photostreams

    Estefania Talavera, Nicola Strisciuglio, Nicolai Petkov, Petia Radeva
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Lifelogging is a process of collecting a rich source of information about
    the daily life of people. In this paper, we introduce the problem of
    sentiment analysis in egocentric events, focusing on whether the moments
    captured in the images recall positive, neutral or negative feelings to the
    observer. We propose a method for
    the classification of the sentiments in egocentric pictures based on global and
    semantic image features extracted by Convolutional Neural Networks. We carried
    out experiments on an egocentric dataset, which we organized into 3 classes
    on the basis of the sentiment recalled to the user (positive, negative or
    neutral).

    Bundle Optimization for Multi-aspect Embedding

    Qiong Zeng, Wenzheng Chen, Zhuo Han, Mingyi Shi, Yanir Kleiman, Daniel Cohen-Or, Baoquan Chen, Yangyan Li
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Understanding semantic similarity among images is the core of a wide range of
    computer vision applications. An important step towards this goal is to collect
    and learn human perceptions. Interestingly, the semantic context of images is
    often ambiguous as images can be perceived with emphasis on different aspects,
    which may be contradictory to each other.

    In this paper, we present a method for learning the semantic similarity among
    images, inferring their latent aspects and embedding them into multi-spaces
    corresponding to their semantic aspects.

    We consider the multi-embedding problem as an optimization function that
    evaluates the embedded distances with respect to the qualitative clustering
    queries. The key idea of our approach is to collect and embed qualitative
    measures that share the same aspects in bundles. To ensure similarity aspect
    sharing among multiple measures, image classification queries are presented to,
    and solved by users. The collected image clusters are then converted into
    bundles of tuples, which are fed into our bundle optimization algorithm that
    jointly infers the aspect similarity and multi-aspect embedding. Extensive
    experimental results show that our approach significantly outperforms
    state-of-the-art multi-embedding approaches on various datasets, and scales
    well for large multi-aspect similarity measures.

    Towards thinner convolutional neural networks through Gradually Global Pruning

    Zhengtao Wang, Ce Zhu, Zhiqiang Xia, Qi Guo, Yipeng Liu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep network pruning is an effective method to reduce the storage and
    computation cost of deep neural networks when applying them to resource-limited
    devices. Among many pruning granularities, neuron-level pruning removes
    redundant neurons and filters from the model, resulting in thinner networks.
    In this paper, we propose a gradually global pruning scheme for neuron-level
    pruning. In each pruning step, a small percentage of neurons is selected and
    dropped across all layers in the model. We also propose a simple method to
    eliminate the biases in evaluating the importance of neurons, making the
    scheme feasible. Compared with layer-wise pruning schemes, our scheme avoids
    the difficulty of determining the redundancy in each layer and is more
    effective for deep networks. Our scheme automatically finds a thinner
    sub-network of the original network under a given performance requirement.
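
    One pruning step might look as follows, with per-layer normalization as one
    simple way to remove the cross-layer bias mentioned above (the paper's exact
    correction may differ):

    ```python
    import numpy as np

    def prune_step(importances, frac=0.02):
        """importances: one 1-D array per layer; returns a keep-mask per layer."""
        normed = [(imp - imp.mean()) / (imp.std() + 1e-8) for imp in importances]
        thresh = np.quantile(np.concatenate(normed), frac)  # global threshold
        return [n > thresh for n in normed]                 # drop the weakest frac

    rng = np.random.default_rng(0)
    layers = [np.abs(rng.normal(size=s)) for s in (64, 128, 256)]
    masks = prune_step(layers)
    print([int(m.sum()) for m in masks])       # neurons kept per layer
    ```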

    Who's Better, Who's Best: Skill Determination in Video using Deep Ranking

    Hazel Doughty, Dima Damen, Walterio Mayol-Cuevas
    Comments: 10 pages, 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a method for assessing skill of performance from video,
    for a variety of tasks, ranging from drawing to surgery and rolling dough. We
    formulate the problem as pairwise and overall ranking of video collections, and
    propose a supervised deep ranking model to learn discriminative features
    between pairs of videos exhibiting different amounts of skill. We utilise a
    two-stream Temporal Segment Network to capture both the type and quality of
    motions and the evolving task state. Results demonstrate our method is
    applicable to a variety of tasks, with the percentage of correctly ordered
    pairs of videos ranging from 70% to 82% for four datasets. We demonstrate the
    robustness of our approach via sensitivity analysis of its parameters.

    We see this work as an effort toward the automated and objective organisation
    of how-to videos and, overall, generic skill determination in video.
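
    The pairwise part of the objective is essentially a margin ranking loss over
    pairs of video scores, sketched below (feature extraction by the two-stream
    network is abstracted away):

    ```python
    import torch

    loss_fn = torch.nn.MarginRankingLoss(margin=1.0)
    s_better = torch.randn(8, requires_grad=True)   # scores for higher-skill videos
    s_worse = torch.randn(8, requires_grad=True)    # scores for lower-skill videos
    target = torch.ones(8)                          # "first input should rank higher"
    loss = loss_fn(s_better, s_worse, target)
    loss.backward()
    print(float(loss))
    ```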

    One Network to Solve Them All — Solving Linear Inverse Problems using Deep Projection Models

    J. H. Rick Chang, Chun-Liang Li, Barnabas Poczos, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    While deep learning methods have achieved state-of-the-art performance in
    many challenging inverse problems like image inpainting and super-resolution,
    they invariably involve problem-specific training of the networks. Under this
    approach, different problems require different networks. In scenarios where we
    need to solve a wide variety of problems, e.g., on a mobile camera, it is
    inefficient and costly to use these specially-trained networks. On the other
    hand, traditional methods using signal priors can be used in all linear inverse
    problems but often have worse performance on challenging tasks. In this work,
    we provide a middle ground between the two kinds of methods — we propose a
    general framework to train a single deep neural network that solves arbitrary
    linear inverse problems. The proposed network acts as a proximal operator for
    an optimization algorithm and projects non-image signals onto the set of
    natural images defined by the decision boundary of a classifier. In our
    experiments, the proposed framework demonstrates superior performance over
    traditional methods using a wavelet sparsity prior and achieves performance
    comparable to that of specially-trained networks on tasks including
    compressive sensing and pixel-wise inpainting.
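
    The iteration can be sketched as proximal gradient descent where a learned
    projector replaces the problem-specific prior; P below is a toy stand-in
    for the network:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(30, 100))                 # e.g. a compressive sensing operator
    x_true = np.clip(rng.normal(size=100), 0, None)
    y = A @ x_true

    def P(x):
        """Stand-in for the learned projection onto the set of natural images."""
        return np.clip(x, 0.0, 1.0)

    x, lr = np.zeros(100), 1e-3
    for _ in range(200):
        x = P(x - lr * A.T @ (A @ x - y))          # data-term gradient step, then project
    print(np.linalg.norm(A @ x - y))
    ```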

    Learning with Privileged Information for Multi-Label Classification

    Shiyu Chen, Shangfei Wang, Tanfang Chen, Xiaoxiao Shi
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose a novel approach for learning multi-label
    classifiers with the help of privileged information. Specifically, we use
    similarity constraints to capture the relationship between available
    information and privileged information, and use ranking constraints to capture
    the dependencies among multiple labels. By integrating similarity constraints
    and ranking constraints into the learning process of classifiers, the
    privileged information and the dependencies among multiple labels are exploited
    to construct better classifiers during training. A maximum margin classifier is
    adopted, and an efficient learning algorithm of the proposed method is also
    developed. We evaluate the proposed method on two applications: multiple object
    recognition from images with the help of implicit information about object
    importance conveyed by the list of manually annotated image tags; and multiple
    facial action unit detection from low-resolution images augmented by
    high-resolution images. Experimental results demonstrate that the proposed
    method can effectively take full advantage of privileged information and
    dependencies among multiple labels for better object recognition and better
    facial action unit detection.

    LabelBank: Revisiting Global Perspectives for Semantic Segmentation

    Hexiang Hu, Zhiwei Deng, Guang-Tong Zhou, Fei Sha, Greg Mori
    Comments: Pre-prints
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Semantic segmentation requires a detailed labeling of image pixels by object
    category. Information derived from local image patches is necessary to describe
    the detailed shape of individual objects. However, this information is
    ambiguous and can result in noisy labels. Global inference of image content can
    instead capture the general semantic concepts present. We advocate that
    holistic inference of image concepts provides valuable information for detailed
    pixel labeling. We propose a generic framework to leverage holistic information
    in the form of a LabelBank for pixel-level segmentation.

    We show the ability of our framework to improve semantic segmentation
    performance in a variety of settings. We learn models for extracting a holistic
    LabelBank from visual cues, attributes, and/or textual descriptions. We
    demonstrate improvements in semantic segmentation accuracy on standard datasets
    across a range of state-of-the-art segmentation architectures and holistic
    inference approaches.
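
    One simple way to use such holistic information is multiplicative gating of
    per-pixel class scores by the image-level LabelBank, sketched below (the
    paper's fusion may differ):

    ```python
    import numpy as np

    scores = np.random.default_rng(0).uniform(size=(21, 64, 64))  # per-class pixel scores
    label_bank = np.zeros(21)
    label_bank[[0, 8, 15]] = 1.0               # classes the holistic model deems present
    pred = (scores * label_bank[:, None, None]).argmax(axis=0)
    print(np.unique(pred))                     # only the allowed classes survive
    ```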

    Novel Structured Low-rank algorithm to recover spatially smooth exponential image time series

    Arvind Balachandrasekaran, Mathews Jacob
    Comments: 4 pages, 3 figures, accepted at ISBI 2017, Melbourne, Australia
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a structured low-rank matrix completion algorithm to recover a
    time series of images, consisting of a linear combination of exponential
    parameters at every pixel, from under-sampled Fourier measurements. The spatial
    smoothness of these parameters is exploited along with the exponential
    structure of the time series at every pixel, to derive an annihilation relation
    in the (k-t) domain. This annihilation relation translates into a structured
    low-rank matrix formed from the (k-t) samples. We demonstrate the algorithm
    in the parameter mapping setting and show significant improvement over
    state-of-the-art methods.
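
    The underlying principle, in one dimension: an exponential time series
    yields a rank-one Hankel-structured matrix, which low-rank completion can
    exploit. A tiny illustration:

    ```python
    import numpy as np

    t = np.arange(64)
    x = 0.9**t * np.exp(1j * 0.3 * t)                    # single damped exponential
    H = np.lib.stride_tricks.sliding_window_view(x, 32)  # Hankel-like matrix, shape (33, 32)
    print(np.linalg.matrix_rank(H))                      # 1: one exponential -> rank one
    ```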

    Click Here: Human-Localized Keypoints as Guidance for Viewpoint Estimation

    Ryan Szeto, Jason J. Corso
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We motivate and address a human-in-the-loop variant of the monocular
    viewpoint estimation task in which the location and class of one semantic
    object keypoint is available at test time. In order to leverage the keypoint
    information, we devise a Convolutional Neural Network called Click-Here CNN
    (CH-CNN) that integrates the keypoint information with activations from the
    layers that process the image. It transforms the keypoint information into a 2D
    map that can be used to weigh features from certain parts of the image more
    heavily. The weighted sum of these spatial features is combined with global
    image features to provide relevant information to the prediction layers. To
    train our network, we collect a novel dataset of 3D keypoint annotations on
    thousands of CAD models, and synthetically render millions of images with 2D
    keypoint information. On test instances from PASCAL 3D+, our model achieves a
    mean class accuracy of 90.7%, whereas the state-of-the-art baseline only
    obtains 85.7% accuracy, justifying our argument for human-in-the-loop
    inference.
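
    The keypoint-to-map step can be sketched as placing a Gaussian bump at the
    clicked location and using it to weight spatial features (kernel width and
    shapes are illustrative):

    ```python
    import numpy as np

    def keypoint_map(h, w, ky, kx, sigma=2.0):
        yy, xx = np.mgrid[0:h, 0:w]
        return np.exp(-((yy - ky)**2 + (xx - kx)**2) / (2 * sigma**2))

    feat = np.random.default_rng(0).uniform(size=(256, 14, 14))  # conv activations
    m = keypoint_map(14, 14, ky=3, kx=10)
    weighted = (feat * m).sum(axis=(1, 2))     # weighted sum of spatial features
    print(weighted.shape)                      # (256,) -- combined with global features
    ```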

    Automatic Detection of Knee Joints and Quantification of Knee Osteoarthritis Severity using Convolutional Neural Networks

    Joseph Antony, Kevin McGuinness, Kieran Moran, Noel E O'Connor
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper introduces a new approach to automatically quantify the severity
    of knee OA using X-ray images. Automatically quantifying knee OA severity
    involves two steps: first, automatically localizing the knee joints; next,
    classifying the localized knee joint images. We introduce a new approach to
    automatically detect the knee joints using a fully convolutional neural network
    (FCN). We train convolutional neural networks (CNN) from scratch to
    automatically quantify the knee OA severity optimizing a weighted ratio of two
    loss functions: categorical cross-entropy and mean-squared loss. This joint
    training further improves the overall quantification of knee OA severity, with
    the added benefit of naturally producing simultaneous multi-class
    classification and regression outputs. Two public datasets are used to evaluate
    our approach, the Osteoarthritis Initiative (OAI) and the Multicenter
    Osteoarthritis Study (MOST), with extremely promising results that outperform
    existing approaches.
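
    The joint objective can be sketched as a weighted combination of a
    cross-entropy (grade as a class) and a mean-squared (grade as a number)
    term; the weight below is illustrative:

    ```python
    import torch
    import torch.nn.functional as F

    def joint_loss(logits, grade_pred, grade_true, alpha=0.5):
        ce = F.cross_entropy(logits, grade_true)                      # classification view
        mse = F.mse_loss(grade_pred.squeeze(-1), grade_true.float())  # regression view
        return alpha * ce + (1 - alpha) * mse

    logits = torch.randn(4, 5, requires_grad=True)   # 5 severity grades
    reg = torch.randn(4, 1, requires_grad=True)
    y = torch.randint(0, 5, (4,))
    joint_loss(logits, reg, y).backward()
    ```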

    Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos

    Hossein Hosseini, Baicen Xiao, Radha Poovendran
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Despite the rapid progress of the techniques for image classification, video
    annotation has remained a challenging task. Automated video annotation would be
    a breakthrough technology, enabling users to search within the videos.
    Recently, Google introduced the Cloud Video Intelligence API for video
    analysis. As per the website, the system “separates signal from noise, by
    retrieving relevant information at the video, shot or per frame.” A
    demonstration website has been also launched, which allows anyone to select a
    video for annotation. The API then detects the video labels (objects within the
    video) as well as shot labels (description of the video events over time). In
    this paper, we examine the usability of Google's Cloud Video Intelligence
    API in adversarial environments. In particular, we investigate whether an
    adversary can manipulate a video in such a way that the API will return only
    the adversary-desired labels. For this, we select an image that is different
    from the content of the video and insert it, periodically and at a very low
    rate, into the video. We found that if we insert one image every two seconds,
    the API is deceived into annotating the entire video as if it only contains the
    inserted image. Note that the modification to the video is hardly noticeable
    as, for instance, for a typical frame rate of 25, we insert only one image per
    50 video frames. We also found that, by inserting one image per second, all the
    shot labels returned by the API are related to the inserted image. We perform
    the experiments on the sample videos provided by the API demonstration website
    and show that our attack is successful with different videos and images.
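
    The insertion pattern described above is easy to reproduce with OpenCV: one
    adversary image every 50 frames, i.e. every 2 seconds at 25 fps (file paths
    are placeholders):

    ```python
    import cv2

    cap = cv2.VideoCapture("input.mp4")                       # placeholder path
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter("attacked.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    adv = cv2.resize(cv2.imread("adversary.jpg"), (w, h))     # placeholder image

    i = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(adv if i % 50 == 0 else frame)   # replace one frame every 2 seconds
        i += 1
    cap.release(); out.release()
    ```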

    ProcNets: Learning to Segment Procedures in Untrimmed and Unconstrained Videos

    Luowei Zhou, Chenliang Xu, Jason J. Corso
    Comments: 15 pages including Appendix
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a temporal segmentation and procedure learning model for long
    untrimmed and unconstrained videos, e.g., videos from YouTube. The proposed
    model segments a video into segments that constitute a procedure and learns the
    underlying temporal dependency among the procedure segments. The output
    procedure segments can be applied for other tasks, such as video description
    generation or activity recognition. Two aspects distinguish our work from the
    existing literature. First, we introduce the problem of learning long-range
    temporal structure for procedure segments within a video, in contrast to the
    majority of efforts that focus on understanding short-range temporal structure.
    Second, the proposed model segments an unseen video with only visual evidence
    and can automatically determine the number of segments to predict. For
    evaluation, there is no large-scale dataset with annotated procedure steps
    available. Hence, we collect a new cooking video dataset, named YouCookII, with
    the procedure steps localized and described. Our ProcNets model achieves
    state-of-the-art performance in procedure segmentation.

    Perception Driven Texture Generation

    Yanhai Gan, Huifang Chi, Ying Gao, Jun Liu, Guoqiang Zhong, Junyu Dong
    Comments: 7 pages, 4 figures, icme2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    This paper investigates a novel task of generating texture images from
    perceptual descriptions. Previous work on texture generation focused on either
    synthesis from examples or generation from procedural models. Generating
    textures from perceptual attributes has not been well studied yet. Meanwhile,
    perceptual attributes, such as directionality, regularity and roughness, are
    important factors for human observers to describe a texture. In this paper, we
    propose a joint deep network model that combines adversarial training and
    perceptual feature regression for texture generation, while only random noise
    and user-defined perceptual attributes are required as input. In this model, a
    preliminarily trained convolutional neural network is essentially integrated with
    the adversarial framework, which can drive the generated textures to possess
    given perceptual attributes. An important aspect of the proposed model is that,
    if we change one of the input perceptual features, the corresponding appearance
    of the generated textures will also be changed. We design several experiments
    to validate the effectiveness of the proposed method. The results show that the
    proposed method can produce high quality texture images with desired perceptual
    properties.

    Two-Stream RNN/CNN for Action Recognition in 3D Videos

    Rui Zhao, Haider Ali, Patrick van der Smagt
    Comments: 8 pages, 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    The recognition of actions from video sequences has many applications in
    health monitoring, assisted living, surveillance, and smart homes. Despite
    advances in sensing, in particular related to 3D video, the methodologies to
    process the data are still subject to research. We demonstrate superior results
    by a system which combines recurrent neural networks with convolutional neural
    networks in a voting approach. The gated-recurrent-unit-based neural networks
    are particularly well-suited to distinguish actions based on long-term
    information from optical tracking data; the 3D-CNNs focus more on detailed,
    recent information from video data. The resulting features are merged in an SVM
    which then classifies the movement. With this architecture, our method
    improves upon the recognition rates of state-of-the-art methods by 14% on
    standard data sets.
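
    The late fusion amounts to concatenating the two feature streams and
    classifying with an SVM, as in this sketch (feature extraction is abstracted
    into random placeholders):

    ```python
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    rnn_feats = rng.normal(size=(100, 128))   # long-term motion features (GRU stream)
    cnn_feats = rng.normal(size=(100, 256))   # recent appearance features (3D-CNN stream)
    y = rng.integers(0, 10, 100)              # action labels

    X = np.hstack([rnn_feats, cnn_feats])     # fusion by concatenation
    clf = SVC().fit(X, y)
    print(clf.score(X, y))
    ```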

    A Holistic Approach for Optimizing DSP Block Utilization of a CNN implementation on FPGA

    Kamel Abdelouahab, Cedric Bourrasset, Maxime Pelcat, François Berry, Jean-Charles Quinton, Jocelyn Serot
    Comments: 8 pages, 6 figures
    Journal-ref: Proceedings of the 10th International Conference on Distributed
    Smart Camera (ICDSC) 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep Neural Networks are becoming the de-facto standard models for image
    understanding, and more generally for computer vision tasks. As they involve
    highly parallelizable computations, CNNs are well suited to current
    fine-grained programmable logic devices. Thus, multiple CNN accelerators have
    been successfully implemented on FPGAs. Unfortunately, FPGA resources such as
    logic elements or DSP units remain limited. This work presents a holistic
    method
    relying on approximate computing and design space exploration to optimize the
    DSP block utilization of a CNN implementation on an FPGA. This method was
    tested when implementing a reconfigurable OCR convolutional neural network on
    an Altera Stratix V device and varying both data representation and CNN
    topology in order to find the best combination in terms of DSP block
    utilization and classification accuracy. This exploration generated dataflow
    architectures for 76 CNN topologies with 5 different fixed-point
    representations. The most efficient implementation performs 883
    classifications/sec at 256 x 256 resolution using 8% of the available DSP
    blocks.

    INTEL-TUT Dataset for Camera Invariant Color Constancy Research

    Caglar Aytekin, Jarno Nikkanen, Moncef Gabbouj
    Comments: Download Link for the Dataset: this https URL Submission Info: Submitted to IEEE TIP
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we provide a novel dataset designed for camera invariant color
    constancy research. Camera invariance corresponds to the robustness of an
    algorithm’s performance when run on images of the same scene taken by different
    cameras. Accordingly, images in the database correspond to several lab and
    field scenes, each of which is captured by three different cameras with
    minimal registration errors. The lab scenes are also captured under five
    different
    illuminations. The spectral responses of cameras and the spectral power
    distributions of the lab light sources are also provided, as they may prove
    beneficial for training future algorithms to achieve color constancy. For a
    fair evaluation of future methods, we provide guidelines for supervised methods
    with indicated training, validation and testing partitions. Accordingly, we
    evaluate a recently proposed convolutional neural network based color constancy
    algorithm as a baseline for future research. As a side contribution, this
    dataset also includes images taken by a mobile camera with color shading
    corrected and uncorrected results. This allows research on the effect of color
    shading as well.

    Deep 6-DOF Tracking

    Mathieu Garon, Jean-François Lalonde
    Comments: 8 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a temporal 6-DOF tracking method which leverages deep learning to
    achieve state-of-the-art performance on challenging datasets of real world
    capture. Our method is both more accurate and more robust to occlusions than
    the existing best performing approaches while maintaining real-time
    performance. To assess its efficacy, we evaluate our approach on several
    challenging RGBD sequences of real objects in a variety of conditions. Notably,
    we systematically evaluate robustness to occlusions through a series of
    sequences where the object to be tracked is increasingly occluded. Finally, our
    approach is purely data-driven and does not require any hand-designed features:
    robust tracking is automatically learned from data.

    Coordinating Filters for Faster Deep Neural Networks

    Wei Wen, Cong Xu, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Very large-scale Deep Neural Networks (DNNs) have achieved remarkable
    successes in a large variety of computer vision tasks. However, the high
    computation intensity of DNNs makes it challenging to deploy these models on
    resource-limited systems. Some studies used low-rank approaches that
    approximate the filters with a low-rank basis to accelerate testing. Those
    works directly decomposed the pre-trained DNNs by Low-Rank Approximation
    (LRA). How to train DNNs toward a lower-rank space for more efficient DNNs,
    however, remains an open area. To address this issue, in this work, we propose
    Force Regularization, which uses attractive forces to enforce filters so as to
    coordinate more weight information into lower-rank space. We mathematically and
    empirically prove that after applying our technique, standard LRA methods can
    reconstruct filters using a basis of much lower rank and thus result in faster
    DNNs. The
    effectiveness of our approach is comprehensively evaluated in ResNets, AlexNet,
    and GoogLeNet. In AlexNet, for example, Force Regularization gains a 2x
    speedup on a modern GPU without accuracy loss and a 4.05x speedup on CPU at
    the cost of a small accuracy degradation. Moreover, Force Regularization
    better initializes the
    low-rank DNNs such that the fine-tuning can converge faster toward higher
    accuracy. The obtained lower-rank DNNs can be further sparsified, proving that
    Force Regularization can be integrated with state-of-the-art sparsity-based
    acceleration methods.
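
    The attractive-force idea can be sketched as an extra gradient term pulling
    each filter toward the others; the unit-vector force below is one simple
    choice, not necessarily the paper's exact force function:

    ```python
    import numpy as np

    def force_term(W, lam=1e-3):
        """W: (num_filters, filter_dim). Returns the attraction applied to each filter."""
        diffs = W[None, :, :] - W[:, None, :]                 # diffs[i, j] = W[j] - W[i]
        norms = np.linalg.norm(diffs, axis=-1, keepdims=True) + 1e-8
        return lam * (diffs / norms).sum(axis=1)              # sum of unit attractions

    W = np.random.default_rng(0).normal(size=(16, 27))        # e.g. 16 filters of 3x3x3
    W = W + 0.1 * force_term(W)       # applied alongside the ordinary task gradient
    print(W.shape)
    ```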

    Feature Analysis and Selection for Training an End-to-End Autonomous Vehicle Controller Using the Deep Learning Approach

    Shun Yang, Wenshuo Wang, Chang Liu, Kevin Deng, J. Karl Hedrick
    Comments: 6 pages, 11 figures, 3 tables, accepted by 2017 IEEE Intelligent Vehicles Symposium
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Systems and Control (cs.SY)

    Deep learning-based approaches have been widely used for training controllers
    for autonomous vehicles due to their powerful ability to approximate nonlinear
    functions or policies. However, the training process usually requires large
    labeled data sets and takes a lot of time. In this paper, we analyze the
    influence of features on the performance of controllers trained using
    convolutional neural networks (CNNs), which provides a guideline for feature
    selection to reduce computation cost. We collect a large set of data using The
    Open Racing Car Simulator (TORCS) and classify the image features into three
    categories (sky-related, roadside-related, and road-related features). We then
    design two experimental frameworks to investigate the importance of each
    single feature for training a CNN controller. The first framework uses the
    training
    data with all three features included to train a controller, which is then
    tested with data that has one feature removed to evaluate the feature’s
    effects. The second framework is trained with the data that has one feature
    excluded, while all three features are included in the test data. Different
    driving scenarios are selected to test and analyze the trained controllers
    using the two experimental frameworks. The experimental results show that (1)
    the road-related features are indispensable for training the controller, (2)
    the roadside-related features are useful for improving the generalizability
    of the controller to scenarios with complicated roadside information, and (3)
    the sky-related features contribute little to training an end-to-end
    autonomous vehicle controller.

    An Epipolar Line from a Single Pixel

    Tavi Halperin, Michael Werman
    Comments: Submitted to ICCV
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We exploit the following observation to directly find epipolar lines. For a
    pixel p in Image A all pixels corresponding to p in Image B are on the same
    epipolar line, or equivalently the image of the line spanning A’s center and p
    is an epipolar line in B. Computing the epipolar geometry from feature points
    between cameras with very different viewpoints is often error prone as an
    object’s appearance can vary greatly between images. This paper extends earlier
    work based on the dynamics of the scene which was successful in these cases.
    The algorithms introduced here for finding corresponding epipolar lines
    accelerate and robustify previous methods for computing the epipolar geometry
    in dynamic scenes.
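
    The observation can be written compactly with the fundamental matrix F
    between the two views:

    ```latex
    % A pixel p in image A maps to the epipolar line l' = F p in image B, and any
    % correspondence p <-> p' satisfies the epipolar constraint.
    \[
      l' = F\,p, \qquad p'^{\top} F\, p = 0 .
    \]
    ```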

    Theory II: Landscape of the Empirical Risk in Deep Learning

    Tomaso Poggio, Qianli Liao
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Previous theoretical work on deep learning and neural network optimization
    tends to focus on avoiding saddle points and local minima. However, the
    practical observation is that, at least for the most successful Deep
    Convolutional Neural Networks (DCNNs) for visual processing, practitioners can
    always increase the network size to fit the training data (an extreme example
    would be [1]). The most successful DCNNs such as VGG and ResNets are best used
    with a small degree of “overparametrization”. In this work, we characterize
    with a mix of theory and experiments, the landscape of the empirical risk of
    overparametrized DCNNs. We first prove the existence of a large number of
    degenerate global minimizers with zero empirical error (modulo inconsistent
    equations). The zero-minimizers, in the case of classification, have a
    non-zero margin. The same minimizers are degenerate and thus very likely to be
    found by SGD, which will furthermore select with higher probability the
    zero-minimizer with larger margin, as discussed in Theory III (to be released).
    We further experimentally explored and visualized the landscape of empirical
    risk of a DCNN on CIFAR-10 during the entire training process and especially
    the global minima. Finally, based on our theoretical and experimental results,
    we propose an intuitive model of the landscape of DCNN’s empirical loss
    surface, which might not be as complicated as people commonly believe.


    Artificial Intelligence

    Rational Choice and Artificial Intelligence

    Tshilidzi Marwala
    Subjects: Artificial Intelligence (cs.AI)

    The theory of rational choice assumes that when people make decisions they do
    so in order to maximize their utility. To achieve this goal, they ought to
    use all the information available and consider all the available choices in
    order to select an optimal one. This paper investigates what happens when
    decisions are made by artificially intelligent machines in the market rather
    than by human beings. Firstly, the expectations of the future are more
    consistent if they are made by an artificially intelligent machine, and the
    decisions are more rational, and thus the marketplace becomes more rational.
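
    Formally, the rational-choice assumption discussed above is expected-utility
    maximization over the available choice set A given beliefs P(s) about states
    s:

    ```latex
    \[
      a^{*} \;=\; \arg\max_{a \in A} \; \sum_{s} P(s)\, u(a, s)
    \]
    ```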

    Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games

    Peng Peng (1), Quan Yuan (1), Ying Wen (2), Yaodong Yang (2), Zhenkun Tang (1), Haitao Long (1), Jun Wang (2) ((1) Alibaba Group, (2) University College London)
    Comments: 13 pages, 10 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Real-world artificial intelligence (AI) applications often require multiple
    agents to work in a collaborative effort. Efficient learning for intra-agent
    communication and coordination is an indispensable step towards general AI. In
    this paper, we take StarCraft combat game as the test scenario, where the task
    is to coordinate multiple agents as a team to defeat their enemies. To maintain
    a scalable yet effective communication protocol, we introduce a multiagent
    bidirectionally-coordinated network (BiCNet [‘bIknet]) with a vectorised
    extension of actor-critic formulation. We show that BiCNet can handle different
    types of combats under diverse terrains with arbitrary numbers of AI agents for
    both sides. Our analysis demonstrates that without any supervision such as
    human demonstrations or labelled data, BiCNet can learn various types of
    coordination strategies similar to those of experienced game players.
    Moreover, BiCNet is easily adaptable to tasks with heterogeneous agents. In
    our experiments, we evaluate our approach against multiple baselines under
    different scenarios; it shows state-of-the-art performance, and possesses
    potential value for large-scale real-world applications.
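
    The coordination backbone can be sketched as a bidirectional RNN running
    over the agent dimension, so each agent's action depends on its teammates in
    both directions (shapes are illustrative; the actor-critic heads are
    omitted):

    ```python
    import torch

    agents, feat_dim, hidden = 5, 32, 64
    obs = torch.randn(1, agents, feat_dim)     # (batch, agents, per-agent features)
    rnn = torch.nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
    head = torch.nn.Linear(2 * hidden, 3)      # per-agent action logits

    coord, _ = rnn(obs)                        # information flows across agents both ways
    print(head(coord).shape)                   # torch.Size([1, 5, 3])
    ```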

    Spaceprint: a Mobility-based Fingerprinting Scheme for Public Spaces

    Mitra Baratchi, Geert Heijenk, Maarten van Steen
    Subjects: Artificial Intelligence (cs.AI)

    In this paper, we address the problem of how automated situation-awareness
    can be achieved by learning real-world situations from ubiquitously generated
    mobility data. Without semantic input about the time and space where
    situations take place, this turns out to be a fundamentally challenging
    problem. Uncertainties also introduce technical challenges when data is
    generated at irregular time intervals and is mixed with noise and errors.
    Relying purely on the temporal patterns observable in mobility data, we
    propose
    Spaceprint, a fully automated algorithm for finding the repetitive pattern of
    similar situations in spaces. We evaluate this technique by showing how the
    latent variables describing the category and the actual identity of a space
    can be discovered from the extracted situation patterns. In doing so, we use
    different real-world mobility datasets with data about the presence of mobile
    entities in a variety of spaces. We also evaluate the performance of this
    technique by showing its robustness against uncertainties.

    On Convergence Property of Implicit Self-paced Objective

    Zilu Ma, Shiqi Liu, Deyu Meng
    Comments: 9 pages, 0 figures
    Subjects: Artificial Intelligence (cs.AI)

    Self-paced learning (SPL) is a new methodology that simulates the learning
    principle of humans/animals: start by learning the easier aspects of a task,
    and then gradually take more complex examples into training. This learning
    regime has been empirically substantiated to be effective in various computer
    vision and pattern recognition tasks. Recently, it has been found that the
    SPL regime has a close relationship to an implicit self-paced objective
    function. While this implicit objective could provide helpful interpretations
    of the effectiveness, especially the robustness, of the SPL paradigm, no
    theoretical results have yet been strictly proved to verify this
    relationship. To address this issue, in this paper, we provide some
    convergence results on this implicit objective of SPL. Specifically, we prove
    that the learning process of SPL always converges to critical points of this
    implicit objective under some mild conditions. This result verifies the
    intrinsic relationship between SPL and this implicit objective, and makes the
    previous robustness analysis on SPL complete and theoretically rational.
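
    In the standard SPL form, the alternating objective and the implicit
    objective obtained by minimizing out the sample weights v are:

    ```latex
    % Standard SPL objective over model parameters w and sample weights v, with
    % self-paced regularizer g(v; lambda); minimizing out v yields the implicit
    % objective F_lambda whose critical points the convergence result targets.
    \[
      \min_{w,\; v \in [0,1]^{n}} \sum_{i=1}^{n} v_{i}\, \ell_{i}(w) + g(v; \lambda),
      \qquad
      F_{\lambda}(w) = \min_{v \in [0,1]^{n}} \sum_{i=1}^{n} v_{i}\, \ell_{i}(w) + g(v; \lambda).
    \]
    ```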

    The Top 10 Topics in Machine Learning Revisited: A Quantitative Meta-Study

    Patrick Glauner, Manxing Du, Victor Paraschiv, Andrey Boytsov, Isabel Lopez Andrade, Jorge Meira, Petko Valtchev, Radu State
    Journal-ref: Proceedings of the 25th European Symposium on Artificial Neural
    Networks, Computational Intelligence and Machine Learning (ESANN 2017)
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Which topics of machine learning are most commonly addressed in research?
    This question was initially answered in 2007 by doing a qualitative survey
    among distinguished researchers. In our study, we revisit this question from a
    quantitative perspective. Concretely, we collect 54K abstracts of papers
    published between 2007 and 2016 in leading machine learning journals and
    conferences. We then use machine learning in order to determine the top 10
    topics in machine learning. We not only include models, but provide a holistic
    view across optimization, data, features, etc. This quantitative approach
    reduces the bias of qualitative surveys. It reveals new and up-to-date
    insights into what the 10 most prolific topics in machine learning research
    are. This
    allows researchers to identify popular topics as well as new and rising topics
    for their research.
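
    In the same spirit (the authors' exact pipeline is not specified here), a
    minimal topic-model pass over abstracts could look like:

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    abstracts = ["deep networks for image classification",
                 "support vector machines and kernels",
                 "convolutional networks for vision tasks"]   # stand-in corpus
    vec = CountVectorizer(stop_words="english").fit(abstracts)
    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(vec.transform(abstracts))
    terms = vec.get_feature_names_out()
    for topic in lda.components_:
        print([terms[i] for i in topic.argsort()[-3:]])       # top words per topic
    ```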

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Albert Gatt, Emiel Krahmer
    Comments: 111 pages, 8 figures, 2 tables
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    This paper surveys the current state of the art in Natural Language
    Generation (NLG), defined as the task of generating text or speech from
    non-linguistic input. A survey of NLG is timely in view of the changes that the
    field has undergone over the past decade or so, especially in relation to new
    (usually data-driven) methods, as well as new applications of NLG technology.
    This survey therefore aims to (a) give an up-to-date synthesis of research
    on the core tasks in NLG and the architectures in which such tasks are
    organised; (b) highlight a number of relatively recent research topics that
    have arisen partly as a result of growing synergies between NLG and other areas
    of artificial intelligence; (c) draw attention to the challenges in NLG
    evaluation, relating them to similar challenges faced in other areas of Natural
    Language Processing, with an emphasis on different evaluation methods and the
    relationships between them.

    LabelBank: Revisiting Global Perspectives for Semantic Segmentation

    Hexiang Hu, Zhiwei Deng, Guang-Tong Zhou, Fei Sha, Greg Mori
    Comments: Preprint
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Semantic segmentation requires a detailed labeling of image pixels by object
    category. Information derived from local image patches is necessary to describe
    the detailed shape of individual objects. However, this information is
    ambiguous and can result in noisy labels. Global inference of image content can
    instead capture the general semantic concepts present. We advocate that
    holistic inference of image concepts provides valuable information for detailed
    pixel labeling. We propose a generic framework to leverage holistic information
    in the form of a LabelBank for pixel-level segmentation.

    We show the ability of our framework to improve semantic segmentation
    performance in a variety of settings. We learn models for extracting a holistic
    LabelBank from visual cues, attributes, and/or textual descriptions. We
    demonstrate improvements in semantic segmentation accuracy on standard datasets
    across a range of state-of-the-art segmentation architectures and holistic
    inference approaches.
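
    The holistic-filtering idea can be pictured with a few lines of numpy: an
    image-level LabelBank (probabilities that each category appears anywhere
    in the image) suppresses pixel-level scores of categories deemed absent.
    The shapes and the multiplicative combination rule below are illustrative
    assumptions, not the paper's exact formulation.

        import numpy as np

        n_classes, H, W = 5, 4, 4
        rng = np.random.default_rng(0)
        pixel_scores = rng.random((n_classes, H, W))       # per-pixel scores
        label_bank = np.array([0.9, 0.1, 0.8, 0.05, 0.2])  # holistic presence

        # Modulate local evidence by global presence, then take per-pixel argmax.
        filtered = pixel_scores * label_bank[:, None, None]
        segmentation = filtered.argmax(axis=0)
        print(segmentation)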

    Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary

    Krishnaram Kenthapadi, Stuart Ambler, Liang Zhang, Deepak Agarwal
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

    The recently launched LinkedIn Salary product has been designed to realize
    the vision of helping the world’s professionals optimize their earning
    potential through salary transparency. We describe the overall design and
    architecture of the salary modeling system underlying this product. We focus on
    the unique data mining challenges in designing and implementing the system, and
    describe the modeling components such as outlier detection and Bayesian
    hierarchical smoothing that help to compute and present robust compensation
    insights to users. We report on extensive evaluation with nearly one year of
    anonymized compensation data collected from over one million LinkedIn users,
    thereby demonstrating the efficacy of the statistical models. We also highlight
    the lessons learned through the deployment of our system at LinkedIn.
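
    The Bayesian hierarchical smoothing component can be illustrated with a
    conjugate-normal sketch in which a sparse cohort (e.g., title x region x
    company) is shrunk toward its parent cohort (e.g., title x region); the
    normal model and all numbers below are illustrative assumptions rather
    than LinkedIn's production model.

        import numpy as np

        def smoothed_mean(values, parent_mean, obs_var=1.0, prior_var=0.25):
            """Posterior mean of a cohort average under a normal prior centered
            at the parent cohort's mean (more data -> less shrinkage)."""
            n = len(values)
            precision = n / obs_var + 1.0 / prior_var
            return (np.sum(values) / obs_var + parent_mean / prior_var) / precision

        parent = 100.0                            # parent cohort mean salary (k$)
        small_cohort = np.array([140.0, 150.0])   # only two reports: noisy
        print(smoothed_mean(small_cohort, parent))  # 115.0, pulled toward 100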

    Probabilistic Models for Computerized Adaptive Testing

    Martin Plajner
    Comments: Study for Dissertation Thesis. Supervisor: Jiří Vomlel
    Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI)

    In this paper we follow our previous research in the area of Computerized
    Adaptive Testing (CAT). We present three different methods for CAT. One of
    them, the item response theory, is a well established method, while the other
    two, Bayesian and neural networks, are new in the area of educational testing.
    In the first part of this paper, we present the concept of CAT and its
    advantages and disadvantages. We collected data from paper tests performed
    with grammar school students; a summary of the data used for our
    experiments is provided in the second part. Next, we present three
    different model types
    for CAT. They are based on the item response theory, Bayesian networks, and
    neural networks. The general theory associated with each type is briefly
    explained and the utilization of these models for CAT is analyzed. Future
    research is outlined in the concluding part of the paper. It shows many
    interesting research paths that are important not only for CAT but also for
    other areas of artificial intelligence.
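
    As a concrete illustration of how the item response theory model drives
    adaptive testing, the sketch below computes two-parameter logistic (2PL)
    response probabilities and picks the next item by maximal Fisher
    information at the current ability estimate; the item parameters are
    illustrative assumptions.

        import numpy as np

        def p_correct(theta, a, b):
            """2PL probability of a correct response for ability theta."""
            return 1.0 / (1.0 + np.exp(-a * (theta - b)))

        def item_information(theta, a, b):
            p = p_correct(theta, a, b)
            return a**2 * p * (1.0 - p)   # Fisher information of a 2PL item

        a = np.array([1.0, 1.5, 0.8, 2.0])   # discrimination parameters
        b = np.array([-1.0, 0.0, 0.5, 1.2])  # difficulty parameters
        theta_hat = 0.3                      # current ability estimate

        info = item_information(theta_hat, a, b)
        print("next item to administer:", int(np.argmax(info)))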

    Perception Driven Texture Generation

    Yanhai Gan, Huifang Chi, Ying Gao, Jun Liu, Guoqiang Zhong, Junyu Dong
    Comments: 7 pages, 4 figures, icme2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    This paper investigates a novel task of generating texture images from
    perceptual descriptions. Previous work on texture generation focused on either
    synthesis from examples or generation from procedural models. Generating
    textures from perceptual attributes has not been well studied yet, even
    though perceptual attributes such as directionality, regularity and
    roughness are important factors for human observers when describing a
    texture. In this paper, we
    propose a joint deep network model that combines adversarial training and
    perceptual feature regression for texture generation, while only random noise
    and user-defined perceptual attributes are required as input. In this model, a
    pre-trained convolutional neural network is integrated into the
    adversarial framework, driving the generated textures to possess
    given perceptual attributes. An important aspect of the proposed model is that,
    if we change one of the input perceptual features, the corresponding appearance
    of the generated textures will also be changed. We design several experiments
    to validate the effectiveness of the proposed method. The results show that the
    proposed method can produce high quality texture images with desired perceptual
    properties.

    Inverse Reinforcement Learning from Summary Data

    Antti Kangasrääsiö, Samuel Kaski
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Inverse reinforcement learning (IRL) aims to explain observed complex
    behavior by fitting reinforcement learning models to behavioral data. However,
    traditional IRL methods are only applicable when the observations are in the
    form of state-action paths. This is a problem in many real-world modelling
    settings, where only more limited observations are easily available. To address
    this issue, we extend the traditional IRL problem formulation. We call this new
    formulation the inverse reinforcement learning from summary data (IRL-SD)
    problem, where instead of state-action paths, only summaries of the paths are
    observed. We propose exact and approximate methods for both maximum likelihood
    and full posterior estimation for IRL-SD problems. Through case studies we
    compare these methods, demonstrating that the approximate methods can be used
    to solve moderate-sized IRL-SD problems in reasonable time.


    Information Retrieval

    Is Climate Change Controversial? Modeling Controversy as Contention Within Populations

    Shiri Dori-Hacohen, Myungha Jang, James Allan
    Subjects: Information Retrieval (cs.IR); Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

    A growing body of research focuses on computationally detecting controversial
    topics and understanding the stances people hold on them. Yet gaps remain in
    our theoretical and practical understanding of how to define controversy, how
    it manifests, and how to measure it. In this paper, we introduce a novel
    measure we call “contention”, defined with respect to a topic and a population.
    We model contention from a mathematical standpoint. We validate our model by
    examining a diverse set of sources: real-world polling data sets, actual voter
    data, and Twitter coverage on several topics. In our publicly-released Twitter
    data set of nearly 100M tweets, we examine several topics such as Brexit, the
    2016 U.S. Elections, and “The Dress”, and cross-reference them with other
    sources. We demonstrate that the contention measure holds explanatory power for
    a wide variety of observed phenomena, such as controversies over climate change
    and other topics that are well within scientific consensus. Finally, we
    re-examine the notion of controversy, and present a theoretical framework that
    defines it in terms of population. We present preliminary evidence suggesting
    that contention is one dimension of controversy, along with others, such as
    “importance”. Our new contention measure, along with the hypothesized model of
    controversy, suggest several avenues for future work in this emerging
    interdisciplinary research area.

    Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary

    Krishnaram Kenthapadi, Stuart Ambler, Liang Zhang, Deepak Agarwal
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

    The recently launched LinkedIn Salary product has been designed to realize
    the vision of helping the world’s professionals optimize their earning
    potential through salary transparency. We describe the overall design and
    architecture of the salary modeling system underlying this product. We focus on
    the unique data mining challenges in designing and implementing the system, and
    describe the modeling components such as outlier detection and Bayesian
    hierarchical smoothing that help to compute and present robust compensation
    insights to users. We report on extensive evaluation with nearly one year of
    anonymized compensation data collected from over one million LinkedIn users,
    thereby demonstrating the efficacy of the statistical models. We also highlight
    the lessons learned through the deployment of our system at LinkedIn.


    Computation and Language

    Automatic Argumentative-Zoning Using Word2vec

    Haixia Liu
    Comments: 13 pages; 5 tables
    Subjects: Computation and Language (cs.CL)

    Compared with document summarization of articles from social media and
    newswire, argumentative zoning (AZ) is an important task in scientific
    paper analysis. Traditional approaches to this task rely on feature
    engineering at different levels. In this paper, three models for generating
    sentence vectors for the task of sentence classification were explored and
    compared. The proposed approach builds sentence representations using
    embeddings learned with neural networks. The learned word embeddings form
    a feature space to which each examined sentence is mapped. These features
    are fed into classifiers for supervised classification. Using a 10-fold
    cross-validation scheme, evaluation was conducted on Argumentative-Zoning
    (AZ) annotated articles. The results showed that simply averaging the word
    vectors in a sentence works better than the paragraph-to-vector algorithm,
    and that integrating specific cue words into the loss function of the
    neural network can improve classification performance. Compared with
    hand-crafted features, the word2vec method won for most of the categories;
    however, the hand-crafted features showed their strength in classifying
    some of the categories.
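
    The best-performing representation reported here (averaged word vectors
    fed to a supervised classifier) fits in a short sketch; the toy
    4-dimensional embeddings and labels are illustrative assumptions standing
    in for trained word2vec vectors and AZ categories.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        emb = {  # word -> vector, as would come from a trained word2vec model
            "we": np.array([0.1, 0.0, 0.2, 0.1]),
            "propose": np.array([0.9, 0.1, 0.0, 0.3]),
            "results": np.array([0.0, 0.8, 0.1, 0.2]),
            "show": np.array([0.1, 0.7, 0.0, 0.1]),
        }

        def sentence_vector(tokens, dim=4):
            vecs = [emb[t] for t in tokens if t in emb]
            return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

        sentences = [["we", "propose"], ["results", "show"],
                     ["we", "propose", "show"], ["results"]]
        labels = ["METHOD", "RESULT", "METHOD", "RESULT"]

        X = np.stack([sentence_vector(s) for s in sentences])
        clf = LogisticRegression().fit(X, labels)
        print(clf.predict([sentence_vector(["results", "show"])]))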

    Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

    Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous
    Comments: Submitted to Interspeech 2017
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    A text-to-speech synthesis system typically consists of multiple stages, such
    as a text analysis frontend, an acoustic model and an audio synthesis module.
    Building these components often requires extensive domain expertise and may
    contain brittle design choices. In this paper, we present Tacotron, an
    end-to-end generative text-to-speech model that synthesizes speech directly
    from characters. Given <text, audio> pairs, the model can be trained completely
    from scratch with random initialization. We present several key techniques to
    make the sequence-to-sequence framework perform well for this challenging task.
    Tacotron achieves a 3.82 subjective 5-scale mean opinion score on US English,
    outperforming a production parametric system in terms of naturalness. In
    addition, since Tacotron generates speech at the frame level, it’s
    substantially faster than sample-level autoregressive methods.

    A Short Review of Ethical Challenges in Clinical Natural Language Processing

    Simon Šuster, Stéphan Tulkens, Walter Daelemans
    Comments: First Workshop on Ethics in Natural Language Processing (EACL’17)
    Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY)

    Clinical NLP has immense potential to contribute to how clinical practice
    will be revolutionized by the advent of large-scale processing of clinical
    records. However, this potential has remained largely untapped due to slow
    progress, primarily caused by strict data access policies for researchers.
    In this paper, we discuss the concern for privacy and the measures it entails.
    We also suggest sources of less sensitive data. Finally, we draw attention to
    biases that can compromise the validity of empirical research and lead to
    socially harmful applications.

    Hierarchical Classification for Spoken Arabic Dialect Identification using Prosody: Case of Algerian Dialects

    Soumia Bougrine, Hadda Cherroun, Djelloul Ziadi
    Comments: 33 pages, 7 figures
    Subjects: Computation and Language (cs.CL)

    In daily communications, Arabs use local dialects which are hard to
    identify automatically using conventional classification methods. The
    dialect identification task becomes even more challenging when dealing
    with under-resourced dialects from the same country or region. In this
    paper, we start by statistically analyzing Algerian dialects in order to
    capture their specificities related to prosody information, which is
    extracted at the utterance level after a coarse-grained consonant/vowel
    segmentation. Based on these findings, we propose a Hierarchical
    classification approach for spoken Arabic algerian Dialect IDentification
    (HADID). It takes advantage of the fact that dialects are naturally
    structured into a hierarchy. Within HADID, a top-down hierarchical
    classification is applied, in which Deep Neural Networks (DNNs) are used
    to build a local classifier for every parent node in the dialect
    hierarchy; the hierarchy itself is deduced from historical and linguistic
    knowledge. Our framework is implemented and evaluated on an Algerian
    Arabic dialect corpus. The results reveal that within HADID, DNNs
    outperform Support Vector Machines as local classifiers. In addition,
    compared with a baseline flat classification system, HADID gives an
    improvement of 63.5% in terms of precision. Furthermore, the overall
    results evidence the suitability of our prosody-based HADID for
    speaker-independent dialect identification while requiring less than 6s
    of test utterance.
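
    Top-down hierarchical classification of this kind reduces to training one
    local classifier per parent node and routing each utterance down the
    tree; the sketch below shows the control flow with a two-level toy
    hierarchy and logistic regression standing in for the paper's DNNs (all
    names and feature values are illustrative assumptions).

        from sklearn.linear_model import LogisticRegression

        hierarchy = {"root": ["west", "east"],
                     "west": ["dialect_A", "dialect_B"],
                     "east": ["dialect_C", "dialect_D"]}

        # Toy prosodic feature vectors, each labelled with its path in the tree.
        X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
        paths = [["west", "dialect_A"], ["west", "dialect_B"],
                 ["east", "dialect_C"], ["east", "dialect_D"]]

        # Train a local classifier at each parent node, on the samples routed
        # through that node, predicting which child they descend to.
        local = {}
        for node in hierarchy:
            if node == "root":
                Xs, ys = X, [p[0] for p in paths]
            else:
                idx = [i for i, p in enumerate(paths) if p[0] == node]
                Xs, ys = [X[i] for i in idx], [paths[i][1] for i in idx]
            local[node] = LogisticRegression().fit(Xs, ys)

        def predict(x):
            node = "root"
            while node in hierarchy:          # descend until reaching a leaf
                node = local[node].predict([x])[0]
            return node

        print(predict([0.15, 0.15]))          # routed west, then to a dialect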

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Albert Gatt, Emiel Krahmer
    Comments: 111 pages, 8 figures, 2 tables
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    This paper surveys the current state of the art in Natural Language
    Generation (NLG), defined as the task of generating text or speech from
    non-linguistic input. A survey of NLG is timely in view of the changes that the
    field has undergone over the past decade or so, especially in relation to new
    (usually data-driven) methods, as well as new applications of NLG technology.
    This survey therefore aims to (a) give an up-to-date synthesis of research
    on the core tasks in NLG and the architectures in which such tasks are
    organised; (b) highlight a number of relatively recent research topics that
    have arisen partly as a result of growing synergies between NLG and other areas
    of artificial intelligence; (c) draw attention to the challenges in NLG
    evaluation, relating them to similar challenges faced in other areas of Natural
    Language Processing, with an emphasis on different evaluation methods and the
    relationships between them.

    A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment

    Haonan Yu, Haichao Zhang, Wei Xu
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    We tackle a task where an agent learns to navigate in a 2D maze-like
    environment called XWORLD. In each session, the agent perceives a sequence of
    raw-pixel frames, a natural language command issued by a teacher, and a set of
    rewards. The agent learns the teacher’s language from scratch in a grounded and
    compositional manner, such that after training it is able to correctly execute
    zero-shot commands: 1) the combination of words in the command never appeared
    before, and/or 2) the command contains new object concepts that are learned
    from another task but never learned from navigation. Our deep framework for the
    agent is trained end to end: it learns simultaneously the visual
    representations of the environment, the syntax and semantics of the language,
    and the action module that outputs actions. The zero-shot learning capability
    of our framework results from its compositionality and modularity with
    parameter tying. We visualize the intermediate outputs of the framework,
    demonstrating that the agent truly understands how to solve the problem. We
    believe that our results provide some preliminary insights on how to train an
    agent with similar abilities in a 3D environment.

    Semi-Supervised Affective Meaning Lexicon Expansion Using Semantic and Distributed Word Representations

    Areej Alhothali, Jesse Hoey
    Subjects: Computation and Language (cs.CL)

    In this paper, we propose an extension to graph-based sentiment lexicon
    induction methods by incorporating distributed and semantic word
    representations in building the similarity graph to expand a three-dimensional
    sentiment lexicon. We also implemented and evaluated the label propagation
    using four different word representations and similarity metrics. Our
    comprehensive evaluation of the four approaches was performed on a single data
    set, demonstrating that all four methods can generate a significant number of
    new sentiment assignments with high accuracy. The highest correlations
    (tau=0.51) and the lowest error (mean absolute error < 1.1%), obtained by
    combining both the semantic and the distributional features, outperformed
    the distributional-based and semantic-based label-propagation models and
    approached the performance of a supervised algorithm.
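
    The label-propagation step can be sketched directly: seed words carry
    known scores, and scores diffuse to neighbours through a row-normalized
    similarity graph with the seeds clamped each iteration. The toy graph and
    the single affective dimension are illustrative assumptions (the paper
    propagates three-dimensional ratings over graphs built from word
    representations).

        import numpy as np

        words = ["good", "great", "bad", "awful", "okay"]
        W = np.array([            # pairwise similarities, e.g. from embeddings
            [0., .9, .1, .0, .4],
            [.9, 0., .0, .1, .3],
            [.1, .0, 0., .8, .3],
            [.0, .1, .8, 0., .2],
            [.4, .3, .3, .2, 0.],
        ])
        P = W / W.sum(axis=1, keepdims=True)  # row-normalized transitions

        seeds = {0: 1.0, 3: -1.0}             # "good" = +1, "awful" = -1
        scores = np.zeros(len(words))
        for i, s in seeds.items():
            scores[i] = s

        for _ in range(50):                   # propagate until near convergence
            scores = P @ scores
            for i, s in seeds.items():        # clamp the labelled seeds
                scores[i] = s

        print(dict(zip(words, np.round(scores, 2))))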

    Learning Similarity Function for Pronunciation Variations

    Einat Naaman, Yossi Adi, Joseph Keshet
    Subjects: Computation and Language (cs.CL)

    A significant source of errors in Automatic Speech Recognition (ASR)
    systems is the pronunciation variation which occurs in spontaneous and
    conversational speech. ASR systems usually use a finite lexicon that
    provides one or more pronunciations for each word. In this paper, we focus
    on learning a similarity function between two pronunciations: the
    canonical and surface pronunciations of the same word, or the surface
    pronunciations of two different words. This task generalizes problems such
    as lexical access (the problem of learning the mapping between words and their
    possible pronunciations), and defining word neighborhoods. It can also be used
    to dynamically increase the size of the pronunciation lexicon, or in predicting
    ASR errors. We propose two methods, which are based on recurrent neural
    networks, to learn the similarity function. The first is based on binary
    classification, and the second is based on learning the ranking of the
    pronunciations. We demonstrate the efficiency of our approach on the task of
    lexical access using a subset from the Switchboard conversational speech
    corpus. Results suggest that our method is superior to previous methods which
    are based on graphical Bayesian methods.

    Developpement de Methodes Automatiques pour la Reutilisation des Composants Logiciels

    Kouakou Ive Arsene Koffi, Konan Marcellin Brou, Souleymane Oumtanaga
    Comments: in French
    Subjects: Software Engineering (cs.SE); Computation and Language (cs.CL); Databases (cs.DB)

    The large amount of information and the increasing complexity of
    applications push developers towards stand-alone, reusable components from
    libraries and component markets. Our approach consists in developing
    methods to evaluate the quality of the software components in these
    libraries, and to optimize the financial cost and adaptation time of the
    selected components. Our objective function defines a metric that
    maximizes software component quality while minimizing financial cost and
    maintenance time. This model should make it possible to score the
    components and rank them in order to choose the most suitable one.

    MOTS-CLÉS : développement de méthode, réutilisation, composants logiciels,
    qualité de composant

    KEYWORDS: method development, reuse, software components, component
    quality.


    Distributed, Parallel, and Cluster Computing

    Exploiting Data Reduction Principles in Cloud-Based Data Management for Cryo-Image Data

    Kashish Ara Shakil, Ari Ora, Mansaf Alam, Shabih Shakeel
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Cloud computing is a cost-effective way for start-up life sciences
    laboratories to store and manage their data. However, in many instances the
    data stored over the cloud could be redundant which makes cloud-based data
    management inefficient and costly because one has to pay for every byte of data
    stored over the cloud. Here, we tested efficient management of data
    generated by an electron cryo-microscopy (cryoEM) lab in a cloud-based
    environment. The test data was obtained from the cryoEM repository EMPIAR.
    All the images were subjected to an in-house parallelized version of
    principal component analysis; an efficient cloud-based MapReduce modality
    was used for parallelization. We showed that large data on the order of
    terabytes could be efficiently reduced to its minimal essential self in a
    cost-effective, scalable manner. Furthermore, spot instances on Amazon EC2
    were shown to reduce costs by a margin of about 27 percent. This approach
    could be scaled to data of any volume and type.
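
    The core of a MapReduce-parallelized PCA is that mappers only need to emit
    per-chunk counts, sums and second-moment matrices, which a reducer
    combines into a covariance matrix for eigendecomposition. The sketch below
    simulates this locally; chunk layout, image size and the number of
    retained components are illustrative assumptions, not the authors'
    implementation.

        import numpy as np

        def map_chunk(chunk):
            """Mapper: sufficient statistics of one shard of flattened images."""
            return chunk.shape[0], chunk.sum(axis=0), chunk.T @ chunk

        def reduce_stats(stats):
            """Reducer: combine shard statistics into mean and covariance."""
            n = sum(s[0] for s in stats)
            total = sum(s[1] for s in stats)
            second = sum(s[2] for s in stats)
            mean = total / n
            cov = second / n - np.outer(mean, mean)   # E[xx^T] - mu mu^T
            return mean, cov

        rng = np.random.default_rng(0)
        images = rng.normal(size=(1000, 64))    # stand-in for flattened images
        chunks = np.array_split(images, 4)      # stand-in for distributed shards

        mean, cov = reduce_stats([map_chunk(c) for c in chunks])
        eigvals, eigvecs = np.linalg.eigh(cov)
        components = eigvecs[:, ::-1][:, :10]   # top-10 principal directions
        reduced = (images - mean) @ components  # the "minimal essential self"
        print(reduced.shape)                    # (1000, 10)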

    Admire framework: Distributed data mining on data grid platforms

    Nhien-An Le-Khac, M-Tahar Kechadi, Joe Carthy
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper, we present the ADMIRE architecture; a new framework for
    developing novel and innovative data mining techniques to deal with very large
    and distributed heterogeneous datasets in both commercial and academic
    applications. The main ADMIRE components are detailed as well as its interfaces
    allowing the user to efficiently develop and implement their data mining
    applications techniques on a Grid platform such as Globus ToolKit, DGET, etc.

    Accelerating gravitational microlensing simulations using the Xeon Phi coprocessor

    Bin Chen, Ronald Kantowski, Xinyu Dai, Eddie Baron, Paul Van der Mark
    Comments: 18 pages, 3 figures, accepted by Astronomy & Computing
    Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); High Energy Astrophysical Phenomena (astro-ph.HE); Distributed, Parallel, and Cluster Computing (cs.DC)

    Recently Graphics Processing Units (GPUs) have been used to speed up very
    CPU-intensive gravitational microlensing simulations. In this work, we use the
    Xeon Phi coprocessor to accelerate such simulations and compare its performance
    on a microlensing code with that of NVIDIA’s GPUs. For the selected set of
    parameters evaluated in our experiment, we find that the speedup by Intel’s
    Knights Corner coprocessor is comparable to that by NVIDIA’s Fermi family of
    GPUs with compute capability 2.0, but less significant than GPUs with higher
    compute capabilities such as the Kepler. However, the very recently released
    second generation Xeon Phi, Knights Landing, is about 5.8 times faster than the
    Knights Corner, and about 2.9 times faster than the Kepler GPU used in our
    simulations. We conclude that the Xeon Phi is a very promising alternative to
    GPUs for modern high performance microlensing simulations.


    Learning

    The Top 10 Topics in Machine Learning Revisited: A Quantitative Meta-Study

    Patrick Glauner, Manxing Du, Victor Paraschiv, Andrey Boytsov, Isabel Lopez Andrade, Jorge Meira, Petko Valtchev, Radu State
    Journal-ref: Proceedings of the 25th European Symposium on Artificial Neural
    Networks, Computational Intelligence and Machine Learning (ESANN 2017)
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Which topics of machine learning are most commonly addressed in research?
    This question was initially answered in 2007 by doing a qualitative survey
    among distinguished researchers. In our study, we revisit this question from a
    quantitative perspective. Concretely, we collect 54K abstracts of papers
    published between 2007 and 2016 in leading machine learning journals and
    conferences. We then use machine learning in order to determine the top 10
    topics in machine learning. We not only include models, but provide a holistic
    view across optimization, data, features, etc. This quantitative approach
    reduces the bias inherent in surveys and reveals new, up-to-date insights
    into what the 10 most prolific topics in machine learning research are,
    allowing researchers to identify popular topics as well as new and rising
    topics for their research.

    Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets

    Junyu Luo
    Comments: 9 pages, 6 figures
    Subjects: Learning (cs.LG)

    Generative Adversarial Nets have shown a great ability to generate
    samples. The inverse mapping of the generator is also of great value. Some
    works have been developed to construct the inverse function of the
    generator. However, the existing ways of training the inverse model of
    GANs have many shortcomings. In this paper, we propose a new approach to
    training the inverse model of a generator by regarding a pre-trained
    generator as the decoder part of an autoencoder network. This model does
    not directly minimize the difference between the original input and the
    inverse output; instead, it minimizes the difference between the data
    generated from the original input and from the inverse output. This
    strategy overcomes the difficulty of training an inverse model of a non
    one-to-one function, and the learned inverse mapping can be directly used
    in image searching and processing.
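
    The training scheme reads naturally in a few lines of PyTorch: freeze a
    pre-trained generator G, use it as the decoder of an autoencoder, and
    train the encoder E to minimize the distance between a generated sample
    and its re-generation from the inverted code. The tiny MLPs and dimensions
    below are illustrative assumptions.

        import torch
        import torch.nn as nn

        z_dim, x_dim = 8, 32
        G = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))
        E = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))

        for p in G.parameters():          # the pre-trained generator stays fixed
            p.requires_grad_(False)

        opt = torch.optim.Adam(E.parameters(), lr=1e-3)
        for step in range(200):
            z = torch.randn(128, z_dim)
            x = G(z)                              # generated samples
            loss = ((G(E(x)) - x) ** 2).mean()    # regenerated vs. generated
            opt.zero_grad()
            loss.backward()
            opt.step()

        print("final reconstruction loss:", float(loss))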

    Time Series Forecasting using RNNs: an Extended Attention Mechanism to Model Periods and Handle Missing Values

    Yagmur G. Cinar, Hamid Mirisaee, Parantapa Goswami, Eric Gaussier, Ali Ait-Bachir, Vadim Strijov
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    In this paper, we study the use of recurrent neural networks (RNNs) for
    modeling and forecasting time series. We first illustrate the fact that
    standard sequence-to-sequence RNNs neither capture well periods in time series
    nor handle well missing values, even though many real-life time series are
    periodic and contain missing values. We then propose an extended attention
    mechanism that can be deployed on top of any RNN and that is designed to
    capture periods and make the RNN more robust to missing values. We show the
    effectiveness of this novel model through extensive experiments with multiple
    univariate and multivariate datasets.

    Cohesion-based Online Actor-Critic Reinforcement Learning for mHealth Intervention

    Feiyun Zhu, Peng Liao, Xinliang Zhu, Yaowen Yao, Junzhou Huang
    Subjects: Learning (cs.LG)

    In the wake of the vast population of smart device users worldwide, mobile
    health (mHealth) technologies are expected to have a positive and wide
    influence on people’s health, as they can provide flexible, affordable and
    portable health guidance to device users. Current online decision-making
    methods for mHealth assume that users are completely heterogeneous: they
    share no information among users and learn a separate policy for each
    user. However, the data for each user is very limited in size to support
    separate online learning, leading to unstable policies with large
    variance. Moreover, a user may be similar to some, but not all, users, and
    connected users tend to have similar behaviors. In this paper, we propose
    a network cohesion constrained (actor-critic) Reinforcement Learning (RL)
    method for mHealth. The goal is to explore how to share information among
    similar users to better convert the limited user information into sharper
    learned policies. To the best of our knowledge, this is the first online
    actor-critic RL for mHealth and the first network cohesion constrained
    (actor-critic) RL method in any application. The network cohesion is
    important for deriving effective policies. We come up with a novel method
    to learn the network using the warm-start trajectory, which directly
    reflects the users’ properties. The optimization of our model is difficult
    and very different from general supervised learning due to the indirect
    observation of values. As a contribution, we propose two algorithms for
    the proposed online RLs. Apart from mHealth, the proposed methods can be
    easily applied or adapted to other health-related tasks. Extensive
    experimental results on the HeartSteps dataset demonstrate that, in a
    variety of parameter settings, the proposed two methods obtain clear
    improvements over the state-of-the-art methods.
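
    The cohesion constraint itself is easy to visualize: each user’s
    parameters are fitted to that user’s (limited) data while a graph penalty
    pulls connected users’ parameters together. The quadratic per-user loss
    and all constants below are illustrative assumptions, not the paper’s
    actor-critic objective.

        import numpy as np

        rng = np.random.default_rng(0)
        n_users, dim, lam = 4, 3, 0.5
        A = np.array([[0, 1, 1, 0],       # adjacency: users 0-2 are connected
                      [1, 0, 1, 0],
                      [1, 1, 0, 0],
                      [0, 0, 0, 0]], dtype=float)
        targets = rng.normal(size=(n_users, dim))  # per-user data-fit targets
        theta = np.zeros((n_users, dim))

        for _ in range(200):
            grad = 2 * (theta - targets)           # gradient of the data fit
            for i in range(n_users):               # gradient of the cohesion term
                for j in range(n_users):
                    grad[i] += 2 * lam * A[i, j] * (theta[i] - theta[j])
            theta -= 0.05 * grad

        print(np.round(theta, 2))  # users 0-2 end up with similar parameters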

    Probabilistic Line Searches for Stochastic Optimization

    Maren Mahsereci, Philipp Hennig
    Comments: Extended version of the NIPS ’15 conference paper, includes detailed pseudo-code, 51 pages, 30 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In deterministic optimization, line searches are a standard tool ensuring
    stability and efficiency. Where only stochastic gradients are available, no
    direct equivalent has so far been formulated, because uncertain gradients do
    not allow for a strict sequence of decisions collapsing the search space. We
    construct a probabilistic line search by combining the structure of existing
    deterministic methods with notions from Bayesian optimization. Our method
    retains a Gaussian process surrogate of the univariate optimization objective,
    and uses a probabilistic belief over the Wolfe conditions to monitor the
    descent. The algorithm has very low computational cost, and no user-controlled
    parameters. Experiments show that it effectively removes the need to define a
    learning rate for stochastic gradient descent.

    Efficient Private ERM for Smooth Objectives

    Jiaqi Zhang, Kai Zheng, Wenlong Mou, Liwei Wang
    Subjects: Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)

    In this paper, we consider efficient differentially private empirical risk
    minimization from the viewpoint of optimization algorithms. For strongly convex
    and smooth objectives, we prove that gradient descent with output perturbation
    not only achieves nearly optimal utility, but also significantly improves the
    running time of previous state-of-the-art private optimization algorithms, for
    both (epsilon)-DP and ((epsilon, delta))-DP. For non-convex but smooth
    objectives, we propose an RRPSGD (Random Round Private Stochastic Gradient
    Descent) algorithm, which provably converges to a stationary point with privacy
    guarantee. Besides the expected utility bounds, we also provide guarantees in
    high probability form. Experiments demonstrate that our algorithm
    consistently outperforms existing methods in both utility and running
    time.
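
    Output perturbation itself is a two-step recipe: run (non-private)
    gradient descent on the smooth, strongly convex objective, then add
    Gaussian noise calibrated to the L2 sensitivity of the solution. The
    sketch below uses ridge regression and a placeholder sensitivity bound; in
    the paper the bound follows from strong convexity, smoothness and the data
    norm.

        import numpy as np

        rng = np.random.default_rng(0)
        n, d = 200, 5
        X = rng.normal(size=(n, d))
        y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
        reg = 1.0

        w = np.zeros(d)
        for _ in range(500):                  # plain gradient descent on ERM
            grad = X.T @ (X @ w - y) / n + reg * w
            w -= 0.05 * grad

        eps, delta = 1.0, 1e-5
        sensitivity = 0.1                     # assumed L2 sensitivity bound
        sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / eps
        w_private = w + rng.normal(scale=sigma, size=d)  # Gaussian mechanism
        print(np.round(w_private, 2))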

    Grouped Convolutional Neural Networks for Multivariate Time Series

    Subin Yi, Janghoon Ju, Man-Ki Yoon, Jaesik Choi
    Subjects: Learning (cs.LG)

    Analyzing multivariate time series data is important for many applications
    such as automated control, fault diagnosis and anomaly detection. One of the
    key challenges is to learn latent features automatically from dynamically
    changing multivariate input. In visual recognition tasks, convolutional
    neural networks (CNNs) have been successful in learning generalized
    feature extractors with shared parameters over the spatial domain.
    However, when a high-dimensional multivariate time series is given,
    designing an appropriate CNN model structure becomes challenging because
    the kernels may need to be extended through the
    full dimension of the input volume. To address this issue, we present two
    structure learning algorithms for deep CNN models. Our algorithms exploit the
    covariance structure over multiple time series to partition input volume into
    groups. The first algorithm learns the group CNN structures explicitly by
    clustering individual input sequences. The second algorithm learns the group
    CNN structures implicitly from the error backpropagation. In experiments with
    two real-world datasets, we demonstrate that our group CNNs outperform existing
    CNN based regression methods.
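
    The explicit (clustering-based) variant can be sketched end to end: group
    the input series by their correlation structure, then give each group its
    own 1-D convolutional branch instead of one kernel spanning the full input
    volume. The clustering method, layer sizes and data below are illustrative
    assumptions.

        import numpy as np
        import torch
        import torch.nn as nn
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.spatial.distance import squareform

        series = np.random.randn(16, 1000)        # 16 series, 1000 time steps
        dist = 1.0 - np.abs(np.corrcoef(series))  # dissimilarity between series
        Z = linkage(squareform(dist, checks=False), method="average")
        groups = fcluster(Z, t=4, criterion="maxclust") - 1

        idx_per_group = [np.where(groups == g)[0] for g in np.unique(groups)]
        branches = nn.ModuleList(
            nn.Conv1d(len(idx), 8, kernel_size=5, padding=2)
            for idx in idx_per_group)

        x = torch.randn(32, 16, 200)              # batch of multivariate windows
        feats = [branch(x[:, torch.as_tensor(idx), :])
                 for branch, idx in zip(branches, idx_per_group)]
        out = torch.cat(feats, dim=1)             # grouped feature maps
        print(out.shape)                          # e.g. torch.Size([32, 32, 200])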

    Solar Power Forecasting Using Support Vector Regression

    Mohamed Abuella, Badrul Chowdhury
    Comments: This works has been presented in the American Society for Engineering Management, International Annual Conference, 2016
    Subjects: Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE)

    Balancing generation and load is required in the economic scheduling of
    generating units in the smart grid. Variable energy generation,
    particularly from wind and solar energy resources, is witnessing a rapid
    boost, and it is anticipated that, at a certain level of penetration, it
    can become a noteworthy source of uncertainty. As in the case of load
    demand, energy forecasting can also be used to mitigate some of the
    challenges that arise from
    considered mature, solar energy forecasting is witnessing a steadily growing
    attention from the research community. This paper presents a support vector
    regression model to produce solar power forecasts on a rolling basis for 24
    hours ahead over an entire year, to mimic the practical business of energy
    forecasting. Twelve weather variables are considered from a high-quality
    benchmark dataset and new variables are extracted. The added value of the heat
    index and wind speed as additional variables to the model is studied across
    different seasons. The support vector regression model performance is compared
    with artificial neural networks and multiple linear regression models for
    energy forecasting.
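
    A rolling day-ahead setup of the kind described here retrains the
    regressor each day and predicts the next 24 hours. The sketch below uses
    synthetic weather features; the paper instead uses twelve benchmark
    weather variables plus derived ones such as heat index and wind speed.

        import numpy as np
        from sklearn.svm import SVR
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(0)
        hours = 24 * 60                            # 60 days of hourly data
        weather = rng.normal(size=(hours, 3))      # e.g. irradiance, temp, wind
        power = (2 * weather[:, 0]
                 + np.sin(np.arange(hours) * 2 * np.pi / 24)
                 + 0.1 * rng.normal(size=hours))

        model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.01))
        preds, actuals = [], []
        for start in range(24 * 30, hours - 24, 24):  # roll forward day by day
            model.fit(weather[:start], power[:start])
            preds.append(model.predict(weather[start:start + 24]))
            actuals.append(power[start:start + 24])

        mae = np.mean(np.abs(np.concatenate(preds) - np.concatenate(actuals)))
        print(f"rolling 24h-ahead MAE: {mae:.3f}")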

    Multi-Scale Dense Convolutional Networks for Efficient Prediction

    Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten, Kilian Q. Weinberger
    Subjects: Learning (cs.LG)

    This paper studies convolutional networks that require limited computational
    resources at test time. We develop a new network architecture that performs on
    par with state-of-the-art convolutional networks, whilst facilitating
    prediction in two settings: (1) an anytime-prediction setting in which the
    network’s prediction for one example is progressively updated, facilitating the
    output of a prediction at any time; and (2) a batch computational budget
    setting in which a fixed amount of computation is available to classify a set
    of examples that can be spent unevenly across ‘easier’ and ‘harder’ examples.
    Our network architecture uses multi-scale convolutions and progressively
    growing feature representations, which allows for the training of multiple
    classifiers at intermediate layers of the network. Experiments on three
    image-classification datasets demonstrate the efficacy of our architecture, in
    particular, when measured in terms of classification accuracy as a function of
    the amount of compute available.

    Risk-Sensitive Inverse Reinforcement Learning via Gradient Methods

    Lillian J. Ratliff, Eric Mazumdar
    Subjects: Learning (cs.LG)

    We address the problem of inverse reinforcement learning in Markov decision
    processes where the agent is risk-sensitive. In particular, we model
    risk-sensitivity in a reinforcement learning framework by making use of models
    of human decision-making having their origins in behavioral psychology,
    behavioral economics, and neuroscience. We propose a gradient-based inverse
    reinforcement learning algorithm that minimizes a loss function defined on the
    observed behavior. We demonstrate the performance of the proposed technique on
    two examples, the first of which is the canonical Grid World example and the
    second of which is a Markov decision process modeling passengers’
    decisions regarding ride-sharing. In the latter, we use pricing and travel
    time data from
    a ride-sharing company to construct the transition probabilities and rewards of
    the Markov decision process.

    Theory II: Landscape of the Empirical Risk in Deep Learning

    Tomaso Poggio, Qianli Liao
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Previous theoretical work on deep learning and neural network optimization
    tends to focus on avoiding saddle points and local minima. However, the
    practical observation is that, at least for the most successful Deep
    Convolutional Neural Networks (DCNNs) for visual processing, practitioners can
    always increase the network size to fit the training data (an extreme example
    would be [1]). The most successful DCNNs such as VGG and ResNets are best used
    with a small degree of “overparametrization”. In this work, we
    characterize, with a mix of theory and experiments, the landscape of the
    empirical risk of
    overparametrized DCNNs. We first prove the existence of a large number of
    degenerate global minimizers with zero empirical error (modulo inconsistent
    equations). The zero-minimizers — in the case of classification — have a
    non-zero margin. The same minimizers are degenerate and thus very likely to be
    found by SGD that will furthermore select with higher probability the
    zero-minimizer with larger margin, as discussed in Theory III (to be released).
    We further experimentally explored and visualized the landscape of empirical
    risk of a DCNN on CIFAR-10 during the entire training process and especially
    the global minima. Finally, based on our theoretical and experimental results,
    we propose an intuitive model of the landscape of DCNN’s empirical loss
    surface, which might not be as complicated as people commonly believe.

    Disruptive Event Classification using PMU Data in Distribution Networks

    Iman Niazazari, Hanif Livani
    Comments: 5 pages, 5 figures, conference
    Subjects: Learning (cs.LG); Systems and Control (cs.SY)

    Proliferation of advanced metering devices with high sampling rates in
    distribution grids, e.g., micro-phasor measurement units (μPMU), provides
    unprecedented potentials for wide-area monitoring and diagnostic applications,
    e.g., situational awareness, health monitoring of distribution assets.
    Unexpected disruptive events interrupting the normal operation of assets in
    distribution grids can eventually lead to permanent failure with expensive
    replacement cost over time. Therefore, disruptive event classification provides
    useful information for preventive maintenance of the assets in distribution
    networks. Preventive maintenance provides wide range of benefits in terms of
    time, avoiding unexpected outages, maintenance crew utilization, and equipment
    replacement cost. In this paper, a PMU-data-driven framework is proposed for
    classification of disruptive events in distribution networks. The two
    disruptive events, i.e., malfunctioned capacitor bank switching and
    malfunctioned regulator on-load tap changer (OLTC) switching are considered and
    distinguished from the normal abrupt load change in distribution grids. The
    performance of the proposed framework is verified using the simulation of the
    events in the IEEE 13-bus distribution network. The event classification
    is formulated using two different algorithms: (i) principal component
    analysis (PCA) together with a multi-class support vector machine (SVM),
    and (ii) an autoencoder along with a softmax classifier. The results
    demonstrate the
    effectiveness of the proposed algorithms and satisfactory classification
    accuracies.
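
    The first pipeline (PCA followed by a multi-class SVM) is a standard
    scikit-learn composition; the sketch below uses synthetic measurement
    windows and arbitrary hyperparameters as illustrative assumptions in place
    of the simulated μPMU data.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.svm import SVC
        from sklearn.pipeline import make_pipeline
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        X = rng.normal(size=(300, 120))    # flattened measurement windows
        y = rng.integers(0, 3, size=300)   # 0: capacitor, 1: OLTC, 2: load change

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        clf = make_pipeline(PCA(n_components=10), SVC(kernel="rbf", C=1.0))
        clf.fit(X_tr, y_tr)
        print("test accuracy:", clf.score(X_te, y_te))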

    Collective Anomaly Detection based on Long Short Term Memory Recurrent Neural Network

    Loic Bontemps, Van Loi Cao, James McDermott, Nhien-An Le-Khac
    Subjects: Learning (cs.LG); Cryptography and Security (cs.CR)

    Intrusion detection for computer network systems has become one of the
    most critical tasks for network administrators today. It plays an
    important role for organizations, governments and our society due to the
    valuable resources hosted on computer networks. Traditional misuse
    detection strategies are unable to detect new and unknown intrusions.
    Anomaly detection in network security, by contrast, aims to distinguish
    between illegal or malicious events and the normal behavior of network
    systems. Anomaly detection can be considered a classification problem in
    which models of normal network behavior are built and used to detect new
    patterns that significantly deviate from them. Most of the current
    research on anomaly detection is based on learning normal and anomalous
    behaviors; it does not take the previous, recent events into account when
    classifying a new incoming one. In this paper, we propose a real-time
    collective anomaly detection model based on neural network learning and
    feature operating. Normally, a Long Short-Term Memory Recurrent Neural
    Network (LSTM RNN) is trained only on normal data and is capable of
    predicting several time steps ahead of an input. In our approach, an LSTM
    RNN is trained with normal time series data before performing a live
    prediction for each time step. Instead of considering each time step
    separately, we propose observing the prediction errors from a certain
    number of time steps as a new idea for detecting collective anomalies:
    prediction errors from a number of the latest time steps above a threshold
    indicate a collective anomaly. The model is built on a time series version
    of the KDD 1999 dataset. The experiments demonstrate that it is possible
    to offer reliable and efficient collective anomaly detection.
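
    The detection rule itself is compact: given the predictor’s one-step-ahead
    errors, raise a collective-anomaly alarm whenever the average error over
    the last k steps exceeds a threshold. The synthetic error sequence, window
    size and threshold below are illustrative assumptions (the paper obtains
    the errors from an LSTM RNN trained on normal traffic).

        import numpy as np

        rng = np.random.default_rng(0)
        errors = np.abs(rng.normal(scale=0.1, size=200))  # one-step errors
        errors[120:140] += 0.8        # a sustained burst: a collective anomaly

        k, threshold = 10, 0.5
        window_means = np.convolve(errors, np.ones(k) / k, mode="valid")
        alarm_steps = np.where(window_means > threshold)[0] + k - 1
        print("collective anomaly flagged at steps:", alarm_steps[:5], "...")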

    Inverse Reinforcement Learning from Summary Data

    Antti Kangasrääsiö, Samuel Kaski
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Inverse reinforcement learning (IRL) aims to explain observed complex
    behavior by fitting reinforcement learning models to behavioral data. However,
    traditional IRL methods are only applicable when the observations are in the
    form of state-action paths. This is a problem in many real-world modelling
    settings, where only more limited observations are easily available. To address
    this issue, we extend the traditional IRL problem formulation. We call this new
    formulation the inverse reinforcement learning from summary data (IRL-SD)
    problem, where instead of state-action paths, only summaries of the paths are
    observed. We propose exact and approximate methods for both maximum likelihood
    and full posterior estimation for IRL-SD problems. Through case studies we
    compare these methods, demonstrating that the approximate methods can be used
    to solve moderate-sized IRL-SD problems in reasonable time.

    Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

    Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous
    Comments: Submitted to Interspeech 2017
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    A text-to-speech synthesis system typically consists of multiple stages, such
    as a text analysis frontend, an acoustic model and an audio synthesis module.
    Building these components often requires extensive domain expertise and may
    contain brittle design choices. In this paper, we present Tacotron, an
    end-to-end generative text-to-speech model that synthesizes speech directly
    from characters. Given <text, audio> pairs, the model can be trained completely
    from scratch with random initialization. We present several key techniques to
    make the sequence-to-sequence framework perform well for this challenging task.
    Tacotron achieves a 3.82 subjective 5-scale mean opinion score on US English,
    outperforming a production parametric system in terms of naturalness. In
    addition, since Tacotron generates speech at the frame level, it’s
    substantially faster than sample-level autoregressive methods.

    Priv'IT: Private and Sample Efficient Identity Testing

    Bryan Cai, Constantinos Daskalakis, Gautam Kamath
    Subjects: Data Structures and Algorithms (cs.DS); Cryptography and Security (cs.CR); Information Theory (cs.IT); Learning (cs.LG); Statistics Theory (math.ST)

    We develop differentially private hypothesis testing methods for the small
    sample regime. Given a sample (cal D) from a categorical distribution (p) over
    some domain (Sigma), an explicitly described distribution (q) over (Sigma),
    some privacy parameter (varepsilon), accuracy parameter (alpha), and
    requirements (eta_I) and (eta_II) for the type I and type II errors of our
    test, the goal is to distinguish between (p=q) and (d_TV(p,q) geq alpha).

    We provide theoretical bounds for the sample size (|{cal D}|) so that our
    method both satisfies ((varepsilon,0))-differential privacy, and guarantees
    (eta_I) and (eta_II) type I and type II errors. We show that
    differential privacy may come for free in some regimes of parameters, and we
    always beat the sample complexity resulting from running the (chi^2)-test with
    noisy counts, or standard approaches such as repetition for endowing
    non-private (chi^2)-style statistics with differential privacy guarantees. We
    experimentally compare the sample complexity of our method to that of recently
    proposed methods for private hypothesis testing.

    Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games

    Peng Peng (1), Quan Yuan (1), Ying Wen (2), Yaodong Yang (2), Zhenkun Tang (1), Haitao Long (1), Jun Wang (2) ((1) Alibaba Group, (2) University College London)
    Comments: 13 pages, 10 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Real-world artificial intelligence (AI) applications often require multiple
    agents to work in a collaborative effort. Efficient learning for intra-agent
    communication and coordination is an indispensable step towards general AI. In
    this paper, we take StarCraft combat game as the test scenario, where the task
    is to coordinate multiple agents as a team to defeat their enemies. To maintain
    a scalable yet effective communication protocol, we introduce a multiagent
    bidirectionally-coordinated network (BiCNet [‘bIknet]) with a vectorised
    extension of actor-critic formulation. We show that BiCNet can handle different
    types of combats under diverse terrains with arbitrary numbers of AI agents for
    both sides. Our analysis demonstrates that, without any supervision such
    as human demonstrations or labelled data, BiCNet can learn various types
    of coordination strategies similar to those of experienced game players.
    Moreover, BiCNet is easily adaptable to tasks with heterogeneous agents.
    In
    our experiments, we evaluate our approach against multiple baselines under
    different scenarios; it shows state-of-the-art performance, and possesses
    potential values for large-scale real-world applications.

    LabelBank: Revisiting Global Perspectives for Semantic Segmentation

    Hexiang Hu, Zhiwei Deng, Guang-Tong Zhou, Fei Sha, Greg Mori
    Comments: Preprint
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Semantic segmentation requires a detailed labeling of image pixels by object
    category. Information derived from local image patches is necessary to describe
    the detailed shape of individual objects. However, this information is
    ambiguous and can result in noisy labels. Global inference of image content can
    instead capture the general semantic concepts present. We advocate that
    holistic inference of image concepts provides valuable information for detailed
    pixel labeling. We propose a generic framework to leverage holistic information
    in the form of a LabelBank for pixel-level segmentation.

    We show the ability of our framework to improve semantic segmentation
    performance in a variety of settings. We learn models for extracting a holistic
    LabelBank from visual cues, attributes, and/or textual descriptions. We
    demonstrate improvements in semantic segmentation accuracy on standard datasets
    across a range of state-of-the-art segmentation architectures and holistic
    inference approaches.

    A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment

    Haonan Yu, Haichao Zhang, Wei Xu
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    We tackle a task where an agent learns to navigate in a 2D maze-like
    environment called XWORLD. In each session, the agent perceives a sequence of
    raw-pixel frames, a natural language command issued by a teacher, and a set of
    rewards. The agent learns the teacher’s language from scratch in a grounded and
    compositional manner, such that after training it is able to correctly execute
    zero-shot commands: 1) the combination of words in the command never appeared
    before, and/or 2) the command contains new object concepts that are learned
    from another task but never learned from navigation. Our deep framework for the
    agent is trained end to end: it learns simultaneously the visual
    representations of the environment, the syntax and semantics of the language,
    and the action module that outputs actions. The zero-shot learning capability
    of our framework results from its compositionality and modularity with
    parameter tying. We visualize the intermediate outputs of the framework,
    demonstrating that the agent truly understands how to solve the problem. We
    believe that our results provide some preliminary insights on how to train an
    agent with similar abilities in a 3D environment.

    Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos

    Hossein Hosseini, Baicen Xiao, Radha Poovendran
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Despite the rapid progress of the techniques for image classification, video
    annotation has remained a challenging task. Automated video annotation would be
    a breakthrough technology, enabling users to search within the videos.
    Recently, Google introduced the Cloud Video Intelligence API for video
    analysis. As per the website, the system “separates signal from noise, by
    retrieving relevant information at the video, shot or per frame.” A
    demonstration website has been also launched, which allows anyone to select a
    video for annotation. The API then detects the video labels (objects within the
    video) as well as shot labels (description of the video events over time). In
    this paper, we examine the usability of Google’s Cloud Video Intelligence
    API in adversarial environments. In particular, we investigate whether an
    adversary can manipulate a video in such a way that the API will return
    only the adversary-desired labels. For this, we select an image that is
    different from the content of the video and insert it, periodically and at
    a very low
    rate, into the video. We found that if we insert one image every two seconds,
    the API is deceived into annotating the entire video as if it only contains the
    inserted image. Note that the modification to the video is hardly noticeable
    as, for instance, for a typical frame rate of 25, we insert only one image per
    50 video frames. We also found that, by inserting one image per second, all the
    shot labels returned by the API are related to the inserted image. We perform
    the experiments on the sample videos provided by the API demonstration website
    and show that our attack is successful with different videos and images.
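
    The insertion procedure is simple enough to sketch with OpenCV: copy the
    video while replacing one frame per second with the adversary-chosen
    image. The file names are placeholders, and this is an illustration of the
    attack described above rather than the authors’ exact tooling.

        import cv2

        cap = cv2.VideoCapture("input.mp4")
        fps = int(cap.get(cv2.CAP_PROP_FPS))
        w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        out = cv2.VideoWriter("output.mp4",
                              cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

        inserted = cv2.resize(cv2.imread("adversarial_image.png"), (w, h))

        i = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # every fps-th frame (once per second) becomes the inserted image
            out.write(inserted if i % fps == 0 else frame)
            i += 1

        cap.release()
        out.release()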

    Perception Driven Texture Generation

    Yanhai Gan, Huifang Chi, Ying Gao, Jun Liu, Guoqiang Zhong, Junyu Dong
    Comments: 7 pages, 4 figures, icme2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    This paper investigates a novel task of generating texture images from
    perceptual descriptions. Previous work on texture generation focused on either
    synthesis from examples or generation from procedural models. Generating
    textures from perceptual attributes has not been well studied yet, even
    though perceptual attributes such as directionality, regularity and
    roughness are important factors for human observers when describing a
    texture. In this paper, we
    propose a joint deep network model that combines adversarial training and
    perceptual feature regression for texture generation, while only random noise
    and user-defined perceptual attributes are required as input. In this model, a
    pre-trained convolutional neural network is integrated into the
    adversarial framework, driving the generated textures to possess
    given perceptual attributes. An important aspect of the proposed model is that,
    if we change one of the input perceptual features, the corresponding appearance
    of the generated textures will also be changed. We design several experiments
    to validate the effectiveness of the proposed method. The results show that the
    proposed method can produce high quality texture images with desired perceptual
    properties.

    Two-Stream RNN/CNN for Action Recognition in 3D Videos

    Rui Zhao, Haider Ali, Patrick van der Smagt
    Comments: 8 pages, 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    The recognition of actions from video sequences has many applications in
    health monitoring, assisted living, surveillance, and smart homes. Despite
    advances in sensing, in particular related to 3D video, the methodologies to
    process the data are still a subject of research. We demonstrate superior results
    by a system which combines recurrent neural networks with convolutional neural
    networks in a voting approach. The gated-recurrent-unit-based neural networks
    are particularly well-suited to distinguish actions based on long-term
    information from optical tracking data; the 3D-CNNs focus more on detailed,
    recent information from video data. The resulting features are merged in an SVM
    which then classifies the movement. In this architecture, our method improves
    recognition rates of state-of-the-art methods by 14% on standard data sets.
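
    One plausible reading of the fusion step, sketched below under the
    assumption that the GRU and 3D-CNN features have already been extracted
    (the random arrays are placeholders), is to concatenate the two feature
    streams and train scikit-learn's SVC as the final classification stage.

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    rnn_feats = rng.standard_normal((200, 128))   # placeholder GRU features
    cnn_feats = rng.standard_normal((200, 256))   # placeholder 3D-CNN features
    labels = rng.integers(0, 10, 200)             # placeholder action classes

    fused = np.hstack([rnn_feats, cnn_feats])     # merge the two streams
    clf = SVC(kernel="linear").fit(fused, labels)
    pred = clf.predict(fused)                     # API demonstration only:
                                                  # real use needs held-out data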


    Information Theory

    Cooperative Abnormality Detection via Diffusive Molecular Communications

    Reza Mosayebi, Vahid Jamali, Nafiseh Ghoroghchian, Robert Schober, Masoumeh Nasiri-Kenari, Mahdieh Mehrabi
    Comments: 30 pages, 9 figures
    Subjects: Information Theory (cs.IT)

    In this paper, we consider abnormality detection via diffusive molecular
    communications (MCs) for a network consisting of several sensors and a fusion
    center (FC). If a sensor detects an abnormality, it injects into the medium a
    number of molecules which is proportional to the sensed value. Two transmission
    schemes for releasing molecules into the medium are considered. In the first
    scheme, referred to as DTM, each sensor releases a different type of molecule,
    whereas in the second scheme, referred to as STM, all sensors release the same
    type of molecule. The molecules released by the sensors propagate through the
    MC channel and some may reach the FC where the final decision regarding whether
    or not an abnormality has occurred is made. We derive the optimal decision
    rules for both DTM and STM. However, the optimal detectors entail high
    computational complexity as log-likelihood ratios (LLRs) have to be computed.
    To overcome this issue, we show that the optimal decision rule for STM can be
    transformed into an equivalent low-complexity decision rule. Since a similar
    transformation is not possible for DTM, we propose simple low-complexity
    sub-optimal detectors based on different approximations of the LLR. The
    proposed low-complexity detectors are more suitable for practical MC systems
    than the original complex optimal decision rule, particularly when the FC is a
    nano-machine with limited computational capabilities. Furthermore, we analyze
    the performance of the proposed detectors in terms of their false alarm and
    missed detection probabilities. Simulation results verify our analytical
    derivations and reveal interesting insights regarding the trade-off between
    complexity and performance of the proposed detectors and the considered DTM and
    STM schemes.
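
    To make the detection step concrete, here is an illustrative LLR test at
    the FC that assumes, purely for the sketch, that the molecule count
    observed in each time slot is Poisson distributed with mean mu0 under the
    normal state and mu1 under an abnormality; the paper's actual channel
    model and optimal detectors are more involved.

    import numpy as np
    from scipy.stats import poisson

    def llr_decision(counts, mu0, mu1, threshold=0.0):
        """counts: observed molecule counts per slot; returns True if abnormal."""
        # log-likelihood ratio of the abnormal vs. normal hypothesis
        llr = np.sum(poisson.logpmf(counts, mu1) - poisson.logpmf(counts, mu0))
        return llr > threshold

    The low-complexity detectors discussed above replace this exact LLR with
    cheaper approximations better suited to a nano-machine FC.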

    Diversity-Multiplexing Tradeoff for Multi-Connectivity and the Gain of Joint Decoding

    Albrecht Wolf, Philipp Schulz, David Öhmann, Meik Dörpinghaus, Gerhard Fettweis
    Subjects: Information Theory (cs.IT)

    Multi-connectivity is considered to be key for enabling reliable
    transmissions and enhancing data rates in future wireless networks. In this
    work, we quantify the communication performance by the outage probability and
    the system throughput. We establish a remarkably simple, yet accurate
    analytical framework based on distributed source coding to describe the outage
    probability and the system throughput as functions of the number of links, the
    modulation scheme, the code rate, the bandwidth, and the received
    signal-to-noise ratio (SNR). It is known that a tradeoff exists between the
    outage probability and the system throughput. To investigate this tradeoff we
    define two modes to either achieve low outage probabilities or high system
    throughput which we refer to as the diversity and the multiplexing mode,
    respectively. For the diversity mode, we compare three signal processing
    schemes and show the SNR gain of joint decoding in comparison to maximum
    selection combining and maximum ratio combining while achieving the same outage
    probability. We then establish a diversity-multiplexing tradeoff analysis based
    on time sharing between both modes. Additionally, we apply our analytical
    framework to real field channel measurements and thereby illustrate the
    potential of multi-connectivity in real cellular networks to achieve high
    reliability or high throughput.
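
    A quick Monte Carlo sketch of the diversity-mode comparison, under the
    simplifying assumption of independent Rayleigh-fading links (so per-link
    SNRs are exponential), estimates outage probabilities for selection
    combining and maximum ratio combining at a target rate. It stands in for,
    and does not reproduce, the paper's analytical framework.

    import numpy as np

    def outage(n_links, snr_db, rate, trials=100_000, seed=0):
        rng = np.random.default_rng(seed)
        snr = 10 ** (snr_db / 10)
        g = rng.exponential(scale=snr, size=(trials, n_links))  # per-link SNRs
        sc = np.log2(1 + g.max(axis=1))    # selection combining capacity
        mrc = np.log2(1 + g.sum(axis=1))   # maximum ratio combining capacity
        return (sc < rate).mean(), (mrc < rate).mean()

    Increasing n_links drives the MRC outage down faster than SC at the same
    SNR, which is the diversity gain that gives the mode its name.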

    Painless Breakups — Efficient Demixing of Low Rank Matrices

    Thomas Strohmer, Ke Wei
    Subjects: Information Theory (cs.IT)

    Assume we are given a sum of linear measurements of $s$ different rank-$r$
    matrices of the form $y = \sum_{k=1}^{s} \mathcal{A}_k(X_k)$. When and under
    which conditions is it possible to extract (demix) the individual matrices
    $X_k$ from the single measurement vector $y$? And can we do the demixing
    numerically efficiently? We present two computationally efficient algorithms
    based on hard thresholding to solve this low rank demixing problem. We prove
    that under suitable conditions these algorithms are guaranteed to converge to
    the correct solution at a linear rate. We discuss applications in connection
    with quantum tomography and the Internet-of-Things. Numerical simulations
    demonstrate empirically the performance of the proposed algorithms.
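
    The flavor of a hard-thresholding demixing iteration can be sketched as
    follows, assuming for illustration that each measurement operator A_k is a
    random matrix acting on vectorized n x n matrices and that the rank-r
    projection is a truncated SVD; the step size and iteration count are
    arbitrary choices, not the paper's tuned values.

    import numpy as np

    def hard_threshold(X, r):
        """Best rank-r approximation of X via truncated SVD."""
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        return (U[:, :r] * s[:r]) @ Vt[:r]

    def demix(y, A, n, r, mu=0.5, iters=200):
        """A: list of (m, n*n) measurement matrices; y: length-m observation."""
        Xs = [np.zeros((n, n)) for _ in A]
        for _ in range(iters):
            # residual of the summed measurements against the observation
            resid = sum(Ak @ Xk.ravel() for Ak, Xk in zip(A, Xs)) - y
            # gradient step per component, then project back to rank r
            Xs = [hard_threshold(Xk - mu * (Ak.T @ resid).reshape(n, n), r)
                  for Ak, Xk in zip(A, Xs)]
        return Xs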

    Community detection and stochastic block models: recent developments

    Emmanuel Abbe
    Subjects: Probability (math.PR); Computational Complexity (cs.CC); Information Theory (cs.IT); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

    The stochastic block model (SBM) is a random graph model with planted
    clusters. It is widely employed as a canonical model to study clustering and
    community detection, and generally provides a fertile ground to study the
    statistical and computational tradeoffs that arise in network and data
    sciences.

    This note surveys the recent developments that establish the fundamental
    limits for community detection in the SBM, both with respect to
    information-theoretic and computational thresholds, and for various recovery
    requirements such as exact, partial and weak recovery (a.k.a. detection). The
    main results discussed are the phase transitions for exact recovery at the
    Chernoff-Hellinger threshold, the phase transition for weak recovery at the
    Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial
    recovery, the learning of the SBM parameters and the gap between
    information-theoretic and computational thresholds.

    The note also covers some of the algorithms developed in the quest to
    achieve the limits, in particular two-round algorithms via graph-splitting,
    semi-definite programming, linearized belief propagation, classical and
    nonbacktracking spectral methods. A few open problems are also discussed.
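
    As a toy illustration of the classical spectral approach mentioned above
    (not one of the sharper algorithms surveyed), the snippet below samples a
    two-community SBM and labels vertices by the sign of the adjacency
    matrix's second leading eigenvector; all parameters are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, q = 400, 0.10, 0.02               # within/across-community edge probs
    labels = np.repeat([0, 1], n // 2)
    probs = np.where(labels[:, None] == labels[None, :], p, q)
    A = rng.random((n, n)) < probs
    A = np.triu(A, 1); A = A + A.T          # symmetric, no self-loops

    vals, vecs = np.linalg.eigh(A.astype(float))
    guess = (vecs[:, -2] > 0).astype(int)   # sign of the 2nd leading eigenvector
    acc = max((guess == labels).mean(), ((1 - guess) == labels).mean())

    Well above the detection threshold (as here) acc should be near 1; as the
    parameters approach the Kesten-Stigum threshold, recovery degrades, which
    is precisely the regime the surveyed results characterize.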

    Priv'IT: Private and Sample Efficient Identity Testing

    Bryan Cai, Constantinos Daskalakis, Gautam Kamath
    Subjects: Data Structures and Algorithms (cs.DS); Cryptography and Security (cs.CR); Information Theory (cs.IT); Learning (cs.LG); Statistics Theory (math.ST)

    We develop differentially private hypothesis testing methods for the small
    sample regime. Given a sample $\mathcal{D}$ from a categorical distribution
    $p$ over some domain $\Sigma$, an explicitly described distribution $q$ over
    $\Sigma$, some privacy parameter $\varepsilon$, accuracy parameter $\alpha$,
    and requirements $\eta_{\rm I}$ and $\eta_{\rm II}$ for the type I and type
    II errors of our test, the goal is to distinguish between $p = q$ and
    $d_{\rm TV}(p, q) \geq \alpha$.

    We provide theoretical bounds for the sample size $|\mathcal{D}|$ so that
    our method both satisfies $(\varepsilon, 0)$-differential privacy, and
    guarantees $\eta_{\rm I}$ and $\eta_{\rm II}$ type I and type II errors. We
    show that differential privacy may come for free in some regimes of
    parameters, and we always beat the sample complexity resulting from running
    the $\chi^2$-test with noisy counts, or standard approaches such as
    repetition for endowing non-private $\chi^2$-style statistics with
    differential privacy guarantees. We
    experimentally compare the sample complexity of our method to that of recently
    proposed methods for private hypothesis testing.
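
    For orientation, the sketch below implements the noisy-count baseline that
    the abstract says its method beats, not the Priv'IT statistic itself:
    Laplace noise of scale $1/\varepsilon$ is added to the histogram (L1
    sensitivity 1 under add/remove-one neighboring datasets) before a
    chi-squared-style statistic against $q$ is computed. Threshold calibration
    is omitted and would be needed in practice.

    import numpy as np

    def noisy_chi2_stat(samples, q, domain_size, epsilon, seed=None):
        """samples: array of ints in [0, domain_size); q: reference distribution."""
        rng = np.random.default_rng(seed)
        counts = np.bincount(samples, minlength=domain_size).astype(float)
        # Laplace mechanism: scale = sensitivity / epsilon = 1 / epsilon
        counts += rng.laplace(scale=1.0 / epsilon, size=domain_size)
        expected = len(samples) * np.asarray(q, dtype=float)
        return np.sum((counts - expected) ** 2 / np.maximum(expected, 1e-12))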

    A generalized quantum Slepian-Wolf

    Anurag Anshu, Rahul Jain, Naqueeb Ahmad Warsi
    Comments: version 1, 1 figure, 18 pages
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)

    In this work we consider a quantum generalization of the task considered by
    Slepian and Wolf [1973] regarding distributed source compression. In our task,
    Alice, Bob, Charlie and Referee share a joint pure state. Alice and Bob wish to
    send a part of their respective systems to Charlie without collaborating with
    each other. We give achievability bounds for this task in the one-shot setting
    and provide asymptotic analysis in the case when there is no side information
    with Charlie.

    Our result implies the result of Abeyesinghe, Devetak, Hayden and Winter
    [2009] who studied a special case of this problem. As another special case
    wherein Bob holds trivial registers, we recover the result of Devetak and Yard
    [2008] regarding quantum state redistribution.



