
    arXiv Paper Daily: Wed, 23 Nov 2016

    Posted by 我爱机器学习 (52ml.net) on 2016-11-23 00:00:00

    Neural and Evolutionary Computing

    Recurrent Neural Networks With Limited Numerical Precision

    Joachim Ott, Zhouhan Lin, Ying Zhang, Shih-Chii Liu, Yoshua Bengio
    Comments: NIPS 2016 EMDNN Workshop paper, condensed version of arXiv:1608.06902
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Recurrent Neural Networks (RNNs) produce state-of-the-art performance on
    many machine learning tasks, but their demands on memory and computational
    power are often high. There is therefore great interest in optimizing the
    computations performed with these models, especially when considering the
    development of specialized low-power hardware for deep networks. One way of
    reducing the computational needs is to limit the numerical precision of the
    network weights and biases, and this is what we address here for the case
    of RNNs. We present results from applying different stochastic and
    deterministic reduced-precision training methods to two major RNN types,
    which are then tested on three datasets. The results show that the
    stochastic and deterministic ternarization, pow2-ternarization, and
    exponential quantization methods give rise to low-precision RNNs that
    produce similar and in some cases even higher accuracy on certain datasets,
    providing a path towards training more efficient implementations of RNNs
    in specialized hardware.
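
    To make the quantization schemes concrete, here is a minimal numpy sketch
    of deterministic and stochastic ternarization. The scaling and threshold
    choices below are illustrative assumptions, not the exact recipe evaluated
    in the paper.

```python
# A minimal sketch (not the authors' code) of deterministic vs. stochastic
# ternarization of a weight matrix. Scaling and threshold are illustrative.
import numpy as np

def ternarize(w, threshold=0.5, stochastic=False, rng=np.random.default_rng(0)):
    """Map full-precision weights to {-1, 0, +1} plus a shared scale.

    Deterministic mode keeps a weight when its scaled magnitude clears the
    threshold; stochastic mode keeps it with probability equal to that
    magnitude, which is unbiased in expectation.
    """
    scale = np.max(np.abs(w)) + 1e-12
    ws = w / scale                                  # scaled into [-1, 1]
    if stochastic:
        nonzero = rng.random(w.shape) < np.abs(ws)  # keep w.p. |ws|
    else:
        nonzero = np.abs(ws) >= threshold
    return np.sign(ws) * nonzero, scale             # w is approx. scale * ternary

w = np.random.default_rng(1).normal(0.0, 0.3, size=(4, 4))
print(ternarize(w)[0])
print(ternarize(w, stochastic=True)[0])
```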

    Deep Learning Approximation for Stochastic Control Problems

    Jiequn Han, E Weinan
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC); Machine Learning (stat.ML)

    Many real-world stochastic control problems suffer from the “curse of
    dimensionality”. To overcome this difficulty, we develop a deep learning
    approach that directly solves high-dimensional stochastic control problems
    based on Monte-Carlo sampling. We approximate the time-dependent controls
    as feedforward neural networks and stack these networks together through
    the model dynamics. The objective function of the control problem plays the
    role of the loss function for the deep neural network. We test this
    approach on examples from the areas of optimal trading and energy storage.
    Our results suggest that the algorithm presented here achieves satisfactory
    accuracy and, at the same time, can handle rather high-dimensional
    problems.
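
    A minimal PyTorch sketch of the stacking idea follows: one small
    feedforward network per time step emits the control, states are rolled
    forward through assumed dynamics with sampled noise, and the Monte-Carlo
    estimate of the control objective serves directly as the training loss.
    The dynamics, cost terms, horizon and dimensions are illustrative
    placeholders, not the paper's test problems.

```python
# A minimal PyTorch sketch of the stacking idea. The linear-Gaussian
# dynamics and quadratic costs below are illustrative placeholders, not
# the optimal-trading or energy-storage problems tested in the paper.
import torch
import torch.nn as nn

T, dim, batch = 5, 3, 256
controls = nn.ModuleList(
    nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, dim))
    for _ in range(T)          # one feedforward network per time step
)
opt = torch.optim.Adam(controls.parameters(), lr=1e-3)

for step in range(200):
    x = torch.zeros(batch, dim)                # Monte-Carlo batch of states
    cost = torch.zeros(())
    for t in range(T):
        u = controls[t](x)                     # time-dependent control u_t(x_t)
        x = x + u + 0.1 * torch.randn_like(x)  # assumed noisy dynamics
        cost = cost + (x.pow(2).sum(1) + 0.1 * u.pow(2).sum(1)).mean()
    opt.zero_grad()
    cost.backward()                            # objective doubles as the loss
    opt.step()
```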


    Computer Vision and Pattern Recognition

    Scene Labeling using Recurrent Neural Networks with Explicit Long Range Contextual Dependency

    Qiangui Huang, Weiyue Wang, Kevin Zhou, Suya You, Ulrich Neumann
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Spatial contextual dependencies are crucial for scene labeling problems.
    Recurrent neural networks (RNNs) are among the state-of-the-art methods for
    modeling contextual dependencies. However, RNNs are fundamentally designed
    for sequential data, not spatial data. This work shows that directly
    applying traditional RNN architectures, which unfold a 2D lattice grid into
    a sequence, is not sufficient to model structural dependencies in images,
    due to the “impact vanishing” problem. A new RNN unit with Explicit
    Long-range Conditioning (RNN-ELC) is designed to overcome this problem.
    Based on this new RNN-ELC unit, a novel neural network architecture is
    built for scene labeling tasks. This architecture achieves state-of-the-art
    performance on several standard scene labeling datasets. Comprehensive
    experiments demonstrate that scene labeling tasks benefit substantially
    from the explicit long-range contextual dependencies encoded in our
    algorithm.

    Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading

    Xiao Yang, Dafang He, Wenyi Huang, Zihan Zhou, Alex Ororbia, Dan Kifer, C. Lee Giles
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Physical library collections are valuable and long-standing resources for
    knowledge and learning. However, managing and locating books on large
    bookshelves often entails tedious manual work, especially for large
    collections where books may be missing or misplaced. Recently, deep
    neural models, such as Convolutional Neural Networks (CNN) and Recurrent Neural
    Networks (RNN) have achieved great success for scene text detection and
    recognition. Motivated by these recent successes, we aim to investigate their
    viability in facilitating book management, a task that introduces further
    challenges including large amounts of cluttered scene text, distortion, and
    varied lighting conditions. In this paper, we present a library inventory
    building and retrieval system based on scene text reading methods. We
    specifically design our scene text recognition model using rich supervision to
    accelerate training and achieve state-of-the-art performance on several
    benchmark datasets. Our proposed system has the potential to greatly reduce the
    amount of human labor required in managing book inventories as well as the
    space needed to store book information.

    PVR: Patch-to-Volume Reconstruction for Large Area Motion Correction of Fetal MRI

    Amir Alansary, Bernhard Kainz, Martin Rajchl, Maria Murgasova, Mellisa Damodaram, David F.A. Lloyd, Alice Davidson, Steven G. McDonagh, Mary Rutherford, Joseph V. Hajnal, Daniel Rueckert
    Comments: 10 pages, 13 figures, submitted to IEEE Transactions on Medical Imaging
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we present a novel method for the correction of motion
    artifacts that are present in fetal Magnetic Resonance Imaging (MRI) scans of
    the whole uterus. Contrary to current slice-to-volume registration (SVR)
    methods, requiring an inflexible anatomical enclosure of a single investigated
    organ, the proposed patch-to-volume reconstruction (PVR) approach is able to
    reconstruct a large field of view of non-rigidly deforming structures. It
    relaxes rigid motion assumptions by introducing a specific amount of redundant
    information that is exploited with parallelized patch-wise optimization,
    super-resolution, and automatic outlier rejection. We further describe and
    provide an efficient parallel implementation of PVR allowing its execution
    within reasonable time on commercially available graphics processing units
    (GPU), enabling its use in clinical practice. We evaluate PVR’s
    computational overhead compared to standard methods and, in synthetic
    experiments, observe reconstruction accuracy improved by approximately 30%
    over conventional SVR in the presence of affine motion artifacts.
    Furthermore, we have
    evaluated our method qualitatively and quantitatively on real fetal MRI data
    subject to maternal breathing and sudden fetal movements. We evaluate
    peak-signal-to-noise ratio (PSNR), structural similarity index (SSIM), and
    cross correlation (CC) with respect to the originally acquired data and provide
    a method for visual inspection of reconstruction uncertainty. With these
    experiments we demonstrate successful application of PVR motion compensation to
    the whole uterus, the human fetus, and the human placenta.

    Active learning with version spaces for object detection

    Soumya Roy, Vinay P. Namboodiri, Arijit Biswas
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Given an image, we would like to learn to detect objects belonging to
    particular object categories. Common object detection methods train on large
    annotated datasets which are annotated in terms of bounding boxes that contain
    the object of interest. Previous works on object detection model the
    problem as structured regression, which ranks the correct bounding boxes
    higher than the background ones. In this paper we develop algorithms which
    actively obtain annotations from human annotators for a small set of
    images, instead of all images, thereby reducing the annotation effort.
    Towards this goal, we make the following contributions: (1) we develop a
    principled version-space-based active learning method that solves object
    detection as a structured prediction problem in a weakly supervised
    setting; (2) we derive the relation between our proposed method and other
    active learning techniques such as maximum model change; (3) we propose two
    variants of the margin-based querying strategy; and (4) we analyse results
    on standard object detection benchmarks, which show that with only 20% of
    the data we can obtain more than 95% of the localization accuracy of full
    supervision. Our methods outperform random sampling and classical
    uncertainty-based active learning algorithms such as entropy.
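
    As a concrete illustration of the margin-based querying family these
    variants build on, the sketch below selects for human annotation the
    images whose top two candidate-box scores are closest; the random scores
    stand in for a real detector.

```python
# A minimal sketch of margin-based query selection: request annotations
# for the images where the detector is most ambiguous between its two
# best candidate boxes.
import numpy as np

def select_queries(scores, budget):
    """scores: (n_images, n_candidates) array of detector scores.
    Returns the indices of the `budget` most ambiguous images."""
    top2 = np.sort(scores, axis=1)[:, ::-1][:, :2]  # best and second best
    margin = top2[:, 0] - top2[:, 1]
    return np.argsort(margin)[:budget]              # smallest margins first

scores = np.random.default_rng(0).random((100, 20))
print(select_queries(scores, budget=5))
```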

    Deep Single and Direct Multi-View Depth Fusion

    José M. Fácil, Alejo Concha, Luis Montesano, Javier Civera
    Comments: Submitted to the International Conference on Robotics and Automation (ICRA) 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    Dense 3D mapping from a monocular sequence is a key technology for several
    applications and still a research problem. This paper leverages recent results
    on single-view CNN-based depth estimation and fuses them with direct multi-view
    depth estimation. Both approaches present complementary strengths. Multi-view
    depth estimation is highly accurate but only in high-texture and high-parallax
    cases. Single-view depth captures the local structure of mid-level regions,
    including textureless areas, but the estimated depth lacks global coherence.

    The single and multi-view fusion we propose has several challenges. First,
    both depths are related by a non-rigid deformation that depends on the image
    content. And second, the selection of multi-view points of high accuracy might
    be difficult for low-parallax configurations. We present contributions for both
    problems. Our results on the public NYU and TUM datasets show that our
    algorithm outperforms the individual single- and multi-view approaches.

    CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact Suppression

    Lukas Cavigelli, Pascal Hager, Luca Benini
    Comments: 8 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Information Retrieval (cs.IR); Multimedia (cs.MM)

    Lossy image compression algorithms are pervasively used to reduce the size of
    images transmitted over the web and recorded on data storage media. However, we
    pay for their high compression rate with visual artifacts degrading the user
    experience. Deep convolutional neural networks have become a widespread tool to
    address high-level computer vision tasks very successfully. Recently, they have
    found their way into the areas of low-level computer vision and image
    processing to solve regression problems mostly with relatively shallow
    networks.

    We present a novel 12-layer deep convolutional network for image compression
    artifact suppression with hierarchical skip connections and a multi-scale loss
    function. We achieve a boost of up to 1.79 dB in PSNR over ordinary JPEG and an
    improvement of up to 0.36 dB over the best previous ConvNet result. We show
    that a network trained for a specific quality factor (QF) is resilient to the
    QF used to compress the input image – a single network trained for QF 60
    provides a PSNR gain of more than 1.5 dB over the wide QF range from 40 to 76.
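
    For reference, the PSNR figures quoted above follow the standard
    definition, sketched here for 8-bit images:

```python
# Standard PSNR definition for 8-bit images, as used for the dB figures
# quoted above.
import numpy as np

def psnr(reference, test, peak=255.0):
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / (mse + 1e-12))
```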

    A Spatial and Temporal Non-Local Filter Based Data Fusion

    Qing Cheng, Huiqing Liu, Huanfeng Shen, Penghai Wu, Liangpei Zhang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The trade-off in remote sensing instruments that balances the spatial
    resolution and temporal frequency limits our capacity to monitor spatial and
    temporal dynamics effectively. The spatiotemporal data fusion technique is
    considered as a cost-effective way to obtain remote sensing data with both high
    spatial resolution and high temporal frequency, by blending observations from
    multiple sensors with different advantages or characteristics. In this paper,
    we develop the spatial and temporal non-local filter based fusion model
    (STNLFFM) to enhance the prediction capacity and accuracy, especially for
    complex changed landscapes. The STNLFFM method provides a new transformation
    relationship between the fine-resolution reflectance images acquired from the
    same sensor at different dates with the help of coarse-resolution reflectance
    data, and makes full use of the high degree of spatiotemporal redundancy in the
    remote sensing image sequence to produce the final prediction. The proposed
    method was tested over both the Coleambally Irrigation Area study site and the
    Lower Gwydir Catchment study site. The results show that the proposed method
    can provide a more accurate and robust prediction, especially for heterogeneous
    landscapes and temporally dynamic areas.
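
    A minimal sketch of the non-local weighting idea underlying the method: a
    predicted value is a similarity-weighted combination of observations at
    similar patches elsewhere in the image sequence. The bandwidth and the
    patch handling are illustrative assumptions, not the STNLFFM formulation
    itself.

```python
# A minimal sketch of non-local weighting: predict a value as the
# similarity-weighted average of observations at similar patches.
import numpy as np

def nonlocal_predict(ref_patches, ref_values, query_patch, h=10.0):
    """ref_patches: (n, p) candidate patches; ref_values: (n,) observations;
    query_patch: (p,) patch around the pixel being predicted."""
    d2 = np.sum((ref_patches - query_patch) ** 2, axis=1)
    w = np.exp(-d2 / h ** 2)          # more similar patches weigh more
    w /= np.sum(w)
    return np.dot(w, ref_values)      # weighted combination
```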

    Object detection can be improved using human-derived contextual expectations

    Harish Katti, Marius V. Peelen, S. P. Arun
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)

    Each object in the world occurs in a specific context: cars are seen on
    highways but not in forests. Contextual information is generally thought to
    facilitate computation by constraining locations to search. But can knowing
    context yield tangible benefits in object detection? For it to do so, scene
    context needs to be learned independently from target features. However this is
    impossible in traditional object detection where classifiers are trained on
    images containing both target features and surrounding coarse scene features.
    In contrast, we humans have the opportunity to learn context and target
    features separately, such as when we see highways without cars. Here we show
    for the first time that human-derived scene expectations can be used to improve
    object detection performance in machines. To measure these expectations, we
    asked human subjects to indicate the scale, location and likelihood at which
    targets may occur on scenes containing no targets. Humans showed highly
    systematic expectations that we could accurately predict using scene features.
    We then augmented state-of-the-art object detectors (based on deep neural
    networks) with these human-derived expectations on novel scenes to produce a
    significant (1-3%) improvement in detecting cars and people in scenes. This
    improvement was due to low-confidence detector matches being correctly
    relabeled as targets when they occurred in likely scenes.
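
    A minimal sketch of the augmentation step described above: detector
    confidences are blended with a human-derived prior over likely locations
    and scales. The prior callable and the blending rule are illustrative
    assumptions, not the authors' exact procedure.

```python
# A minimal sketch of context re-scoring: boost detections that occur
# where and at the scale humans expect the target.
def rescore(detections, context_prior, alpha=0.5):
    """detections: iterable of (x, y, scale, score) tuples;
    context_prior(x, y, scale) -> human-derived likelihood in [0, 1]."""
    rescored = []
    for x, y, s, score in detections:
        boosted = (1 - alpha) * score + alpha * score * context_prior(x, y, s)
        rescored.append((x, y, s, boosted))
    return rescored
```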

    Recurrent Attention Models for Depth-Based Person Identification

    Albert Haque, Alexandre Alahi, Li Fei-Fei
    Comments: Computer Vision and Pattern Recognition (CVPR) 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present an attention-based model that reasons on human body shape and
    motion dynamics to identify individuals in the absence of RGB information,
    hence in the dark. Our approach leverages unique 4D spatio-temporal signatures
    to address the identification problem across days. Formulated as a
    reinforcement learning task, our model is based on a combination of
    convolutional and recurrent neural networks with the goal of identifying small,
    discriminative regions indicative of human identity. We demonstrate that our
    model produces state-of-the-art results on several published datasets given
    only depth images. We further study the robustness of our model towards
    viewpoint, appearance, and volumetric changes. Finally, we share insights
    gleaned from interpretable 2D, 3D, and 4D visualizations of our model’s
    spatio-temporal attention.

    Exploiting Web Images for Dataset Construction: A Domain Robust Approach

    Yazhou Yao, Jian Zhang, Fumin Shen, Xiansheng Hua, Jingsong Xu, Zhenmin Tang
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

    Labelled image datasets have played a critical role in high-level image
    understanding; however, the process of manual labelling is both
    time-consuming and labor-intensive. To reduce this cost, there has been
    increased research interest in automatically constructing image datasets by
    exploiting web images. Datasets constructed by existing methods tend to
    have a weak domain adaptation ability, known as the “dataset bias problem”.
    To address this issue, we present a novel image dataset construction
    framework which generalizes well to unseen target domains. Specifically,
    the given
    queries are first expanded by searching in the Google Books Ngrams Corpus to
    obtain a richer semantic description, from which the visually non-salient and
    less relevant expansions are filtered out. By treating each unfiltered
    expansion as a “bag” and the retrieved images as “instances”, image selection
    can be formulated as a multi-instance learning problem with constrained
    positive bags. We propose to solve the employed problems by the cutting-plane
    and concave-convex procedure (CCCP) algorithm. Using this approach, images from
    different distributions will be retained while noisy images will be filtered
    out. To verify the effectiveness of our proposed approach, we build a
    domain-robust image dataset with 20 categories, which we refer to as DRID-20.
    We compare DRID-20 with three publicly available datasets STL-10, CIFAR-10 and
    ImageNet. The experimental results confirm the effectiveness of our dataset in
    terms of image classification ability, cross-dataset generalization ability and
    dataset diversity. We further run object detection on PASCAL VOC 2007 using our
    data, and the results demonstrate the superiority of our method to the weakly
    supervised and web-supervised state-of-the-art detection methods.

    Learning Multi-level Deep Representations for Image Emotion Classification

    Tianrong Rao, Min Xu, Dong Xu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose a new deep network that learns multi-level deep
    representations for image emotion classification (MldrNet). Image emotion can
    be recognized through image semantics, image aesthetics and low-level visual
    features from both global and local views. Existing image emotion
    classification works using hand-crafted features or deep features mainly focus
    on either low-level visual features or semantic-level image representations
    without taking all factors into consideration. Our proposed MldrNet unifies
    deep representations of three levels, i.e. image semantics, image aesthetics
    and low-level visual features through multiple instance learning (MIL) in order
    to effectively cope with noisy labeled data, such as images collected from the
    Internet. Extensive experiments on both Internet images and abstract
    paintings demonstrate that the proposed method outperforms state-of-the-art
    methods using deep features or hand-crafted features, with at least a 6%
    improvement in overall classification accuracy.

    Learning Multi-level Features For Sensor-based Human Action Recognition

    Yan Xu, Zhengyang Shen, Xin Zhang, Yifan Gao, Shujian Deng, Yipei Wang, Yubo Fan, Eric I-Chao Chang
    Comments: 28 pages, 23 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes a multi-level feature learning framework for human
    action recognition using body-worn inertial sensors. The framework consists
    of three phases, designed respectively to analyze signal-level (low-level),
    component-level (mid-level) and semantic-level (high-level) information.
    Low-level features, extracted from raw signals, capture time- and
    frequency-domain properties, while mid-level representations, obtained
    through dictionary learning, capture the composition of the action. The
    Max-margin Latent Pattern Learning (MLPL) method is proposed and applied to
    the concatenation of low- and mid-level features to learn high-level
    semantic descriptions of latent action patterns as the output of our
    framework. Various experiments on the Opp, Skoda and WISDM datasets show
    that the semantic features learned by this framework possess higher
    representational ability than low- and mid-level features. Compared with
    existing methods, the proposed method achieves state-of-the-art
    performance.

    Cascaded Neural Networks with Selective Classifiers and its evaluation using Lung X-ray CT Images

    Masaharu Sakamoto, Hiroki Nakano
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Lung nodule detection is a class imbalanced problem because nodules are found
    with much lower frequency than non-nodules. In the class imbalanced problem,
    conventional classifiers tend to be overwhelmed by the majority class and
    ignore the minority class. We therefore propose cascaded convolutional neural
    networks to cope with the class imbalanced problem. In the proposed approach,
    cascaded convolutional neural networks that perform as selective classifiers
    filter out obvious non-nodules. Successively, a convolutional neural network
    trained with a balanced data set calculates nodule probabilities. The
    proposed method achieved detection sensitivities of 85.3% and 90.7% at 1
    and 4 false positives per scan on the FROC curve, respectively.
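
    The cascade logic can be sketched as follows: early selective classifiers
    discard obvious non-nodules, and only survivors reach the final network
    trained on balanced data. The stage models and thresholds below are
    placeholders for the CNNs in the paper.

```python
# A minimal sketch of a selective-classifier cascade for an imbalanced
# detection problem.
import numpy as np

def cascade_predict(candidates, stages, thresholds, final_model):
    """candidates: (n, d) features; stages/final_model: callables that
    map a feature array to per-candidate nodule probabilities."""
    idx = np.arange(len(candidates))
    for stage, thr in zip(stages, thresholds):
        if idx.size == 0:
            break
        idx = idx[stage(candidates[idx]) >= thr]   # only survivors move on
    probs = np.zeros(len(candidates))
    if idx.size:
        probs[idx] = final_model(candidates[idx])  # balanced-data network
    return probs
```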

    Max-Margin Deep Generative Models for (Semi-)Supervised Learning

    Chongxuan Li, Jun Zhu, Bo Zhang
    Comments: arXiv admin note: substantial text overlap with arXiv:1504.06787
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Deep generative models (DGMs) are effective at learning multilayered
    representations of complex data and performing inference on input data by
    exploiting their generative ability. However, the discriminative ability of
    DGMs for making accurate predictions remains relatively weak. This
    paper presents max-margin deep generative models (mmDGMs) and a
    class-conditional variant (mmDCGMs), which explore the strongly discriminative
    principle of max-margin learning to improve the predictive performance of DGMs
    in both supervised and semi-supervised learning, while retaining the generative
    capability. In semi-supervised learning, we use the predictions of a max-margin
    classifier as the missing labels instead of performing full posterior inference
    for efficiency; we also introduce additional max-margin and label-balance
    regularization terms of unlabeled data for effectiveness. We develop an
    efficient doubly stochastic subgradient algorithm for the piecewise linear
    objectives in different settings. Empirical results on various datasets
    demonstrate that: (1) max-margin learning can significantly improve the
    prediction performance of DGMs and meanwhile retain the generative ability; (2)
    in supervised learning, mmDGMs are competitive to the best fully discriminative
    networks when employing convolutional neural networks as the generative and
    recognition models; and (3) in semi-supervised learning, mmDCGMs can perform
    efficient inference and achieve state-of-the-art classification results on
    several benchmarks.

    Inducing Interpretable Representations with Variational Autoencoders

    N. Siddharth, Brooks Paige, Alban Desmaison, Jan-Willem Van de Meent, Frank Wood, Noah D. Goodman, Pushmeet Kohli, Philip H.S. Torr
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We develop a framework for incorporating structured graphical models in the
    encoders of variational autoencoders (VAEs) that allows us to induce
    interpretable representations through approximate variational inference. This
    allows us to both perform reasoning (e.g. classification) under the structural
    constraints of a given graphical model, and use deep generative models to deal
    with messy, high-dimensional domains where it is often difficult to model all
    the variation. Learning in this framework is carried out end-to-end with a
    variational objective, applying to both unsupervised and semi-supervised
    schemes.

    Grad-CAM: Why did you say that?

    Ramprasaath R Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh, Dhruv Batra
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems. This is an extended abstract version of arXiv:1610.02391
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We propose a technique for making Convolutional Neural Network (CNN)-based
    models more transparent by visualizing input regions that are ‘important’ for
    predictions — or visual explanations. Our approach, called Gradient-weighted
    Class Activation Mapping (Grad-CAM), uses class-specific gradient information
    to localize important regions. These localizations are combined with existing
    pixel-space visualizations to create a novel high-resolution and
    class-discriminative visualization called Guided Grad-CAM. These methods help
    better understand CNN-based models, including image captioning and visual
    question answering (VQA) models. We evaluate our visual explanations by
    measuring their ability to discriminate between classes, to inspire trust in
    humans, and their correlation with occlusion maps. Grad-CAM provides a new way
    to understand CNN-based models.

    We have released code, an online demo hosted on CloudCV, and a full version
    of this extended abstract.
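
    For readers who want the mechanics, here is a compact PyTorch sketch of
    the core Grad-CAM computation: gradients of the class score with respect
    to a convolutional feature map are global-average-pooled into channel
    weights, and the ReLU'd weighted sum of the maps localizes the class.
    Capturing the feature map from a real network (e.g. via a forward hook)
    is left out.

```python
# A compact sketch of the Grad-CAM localization map computation.
import torch
import torch.nn.functional as F

def grad_cam(features, class_score):
    """features: (1, C, H, W) conv activations that require grad and lie on
    the graph of `class_score` (a scalar logit for the explained class)."""
    grads, = torch.autograd.grad(class_score, features)
    weights = grads.mean(dim=(2, 3), keepdim=True)  # GAP over spatial dims
    cam = F.relu((weights * features).sum(dim=1))   # (1, H, W) localization
    return cam / (cam.max() + 1e-8)                 # normalize to [0, 1]
```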

    3D Image Reconstruction from X-Ray Measurements with Overlap

    Maria Klodt, Raphael Hauser
    Comments: Published in Computer Vision – ECCV 2016. The final publication is available at link.springer.com/chapter/10.1007/978-3-319-46466-4_2
    Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)

    3D image reconstruction from a set of X-ray projections is an important image
    reconstruction problem, with applications in medical imaging, industrial
    inspection and airport security. The innovation of X-ray emitter arrays allows
    for a novel type of X-ray scanners with multiple simultaneously emitting
    sources. However, two or more sources emitting at the same time can yield
    measurements from overlapping rays, imposing a new type of image reconstruction
    problem based on nonlinear constraints. Using traditional linear reconstruction
    methods, respective scanner geometries have to be implemented such that no rays
    overlap, which severely restricts the scanner design. We derive a new type of
    3D image reconstruction model with nonlinear constraints, based on measurements
    with overlapping X-rays. Further, we show that the arising optimization problem
    is partially convex, and present an algorithm to solve it. Experiments show
    highly improved image reconstruction results from both simulated and real-world
    measurements.

    Geometry of 3D Environments and Sum of Squares Polynomials

    Amir Ali Ahmadi, Georgina Hall, Ameesh Makadia, Vikas Sindhwani
    Subjects: Optimization and Control (math.OC); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    Motivated by applications in robotics and computer vision, we study problems
    related to spatial reasoning of a 3D environment using sublevel sets of
    polynomials. These include: tightly containing a cloud of points (e.g.,
    representing an obstacle) with convex or nearly-convex basic semialgebraic
    sets, computation of Euclidean distances between two such sets, separation of
    two convex basic semialgebraic sets that overlap, and tight containment of the
    union of several basic semialgebraic sets with a single convex one. We use
    algebraic techniques from sum of squares optimization that reduce all these
    tasks to semidefinite programs of small size and present numerical experiments
    in realistic scenarios.

    Distributable Consistent Multi-Graph Matching

    Nan Hu, Boris Thibert, Leonidas Guibas
    Subjects: Data Structures and Algorithms (cs.DS); Computer Vision and Pattern Recognition (cs.CV)

    In this paper we propose an optimization-based framework for multiple graph
    matching. The framework takes as input maps computed between pairs of graphs,
    and outputs maps that 1) are consistent among all pairs of graphs, and 2)
    preserve edge connectivity between pairs of graphs. We show how to formulate
    this as solving a piece-wise low-rank matrix recovery problem using a
    generalized message passing scheme. We also present necessary and sufficient
    conditions under which such global consistency is guaranteed. The key feature
    of our approach is that it is scalable to large datasets while still
    producing maps whose quality is competitive with state-of-the-art global
    optimization-based techniques.


    Artificial Intelligence

    An unexpected unity among methods for interpreting model predictions

    Scott Lundberg, Su-In Lee
    Comments: Short extended abstract
    Subjects: Artificial Intelligence (cs.AI)

    Understanding why a model made a certain prediction is crucial in many data
    science fields. Interpretable predictions engender appropriate trust and
    provide insight into how the model may be improved. However, with large modern
    datasets the best accuracy is often achieved by complex models that even
    experts struggle to interpret, which creates a tension between accuracy and
    interpretability. Recently, several methods have been proposed for interpreting
    predictions from complex models by estimating the importance of input features.
    Here, we present how a model-agnostic additive representation of the importance
    of input features unifies current methods. This representation is optimal, in
    the sense that it is the only set of additive values that satisfies important
    properties. We show how we can leverage these properties to create novel visual
    explanations of model predictions. The thread of unity that this representation
    weaves through the literature indicates that there are common principles to be
    learned about the interpretation of model predictions that apply in many
    scenarios.
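
    The additive-representation idea can be illustrated with a toy attribution
    that, by construction, sums exactly to the change in model output from a
    baseline; averaging it over feature orderings would yield Shapley values.
    This illustrates the property discussed above, not the authors' estimator.

```python
# A toy additive attribution: reveal features one at a time from a
# baseline, so the attributions sum exactly to f(x) - f(baseline).
# One fixed feature order is used here for brevity.
import numpy as np

def additive_attribution(f, x, baseline):
    """f: callable mapping a 1-D feature vector to a scalar prediction."""
    phi = np.zeros(len(x))
    current = baseline.astype(float).copy()
    prev = f(current)
    for i in range(len(x)):
        current[i] = x[i]       # reveal feature i
        new = f(current)
        phi[i] = new - prev     # marginal contribution of feature i
        prev = new
    return phi                  # phi.sum() == f(x) - f(baseline)
```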

    A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games

    Felix Leibfried, Nate Kushman, Katja Hofmann
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Reinforcement learning is concerned with learning to interact with
    environments that are initially unknown. State-of-the-art reinforcement
    learning approaches, such as DQN, are model-free and learn to act effectively
    across a wide range of environments such as Atari games, but require huge
    amounts of data. Model-based techniques are more data-efficient, but need to
    acquire explicit knowledge about the environment dynamics or the reward
    structure.

    In this paper we take a step towards using model-based techniques in
    environments with high-dimensional visual state space when system dynamics and
    the reward structure are both unknown and need to be learned, by demonstrating
    that it is possible to learn both jointly. Empirical evaluation on five
    Atari games demonstrates accurate cumulative reward prediction up to 200
    frames. We
    consider these positive results as opening up important directions for
    model-based RL in complex, initially unknown environments.

    Variational Intrinsic Control

    Karol Gregor, Danilo Jimenez Rezende, Daan Wierstra
    Comments: 15 pages, 6 figures
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    In this paper, we present a fast cascaded regression for face alignment, via
    a novel local feature. Our proposed local lightweight feature, namely intimacy
    definition feature (IDF), is more discriminative than landmark shape-indexed
    feature, more efficient than the handcrafted scale-invariant feature transform
    (SIFT) feature, and more compact than the local binary feature (LBF).
    Experimental results show that our approach achieves state-of-the-art
    performance when tested on the most challenging benchmarks. Compared with
    an LBF-based algorithm, our method obtains about a two-fold speed-up and
    more than a 20% improvement in alignment error, while saving an order of
    magnitude in memory.

    Deep Learning Approximation for Stochastic Control Problems

    Jiequn Han, E Weinan
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC); Machine Learning (stat.ML)

    Many real-world stochastic control problems suffer from the “curse of
    dimensionality”. To overcome this difficulty, we develop a deep learning
    approach that directly solves high-dimensional stochastic control problems
    based on Monte-Carlo sampling. We approximate the time-dependent controls
    as feedforward neural networks and stack these networks together through
    the model dynamics. The objective function of the control problem plays the
    role of the loss function for the deep neural network. We test this
    approach on examples from the areas of optimal trading and energy storage.
    Our results suggest that the algorithm presented here achieves satisfactory
    accuracy and, at the same time, can handle rather high-dimensional
    problems.

    Randomized Mechanisms for Selling Reserved Instances in Cloud

    Jia Zhang, Weidong Ma, Tao Qin, Xiaoming Sun, Tie-Yan Liu
    Comments: Already accepted by AAAI’17
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI)

    Selling reserved instances (or virtual machines) is a basic service in cloud
    computing. In this paper, we consider a more flexible pricing model for
    instance reservation, in which a customer can propose the time length and
    number of resources of her request, while in today’s industry, customers can
    only choose from several predefined reservation packages. Under this model, we
    design randomized mechanisms for customers coming online to optimize social
    welfare and providers’ revenue.

    We first consider a simple case, where the requests from the customers do
    not vary too much in terms of both length and value density. We design a
    randomized mechanism that achieves a competitive ratio of \(\frac{1}{42}\)
    for both social welfare and revenue, which is an improvement, as there is
    usually no revenue guarantee in previous works such as
    [azar2015ec, wang2015selling]. This ratio can be improved up to
    \(\frac{1}{11}\) when we impose a realistic constraint on the maximum
    number of resources used by each request. On the hardness side, we show an
    upper bound of \(\frac{1}{3}\) on the competitive ratio of any randomized
    mechanism. We then extend our mechanism to the general case and achieve a
    competitive ratio of \(\frac{1}{42 \log k \log T}\) for both social welfare
    and revenue, where \(T\) is the ratio of the maximum request length to the
    minimum request length and \(k\) is the ratio of the maximum request value
    density to the minimum request value density. This result outperforms the
    previous upper bound of \(\frac{1}{CkT}\) for deterministic mechanisms
    [wang2015selling]. We also prove an upper bound of \(\frac{2}{\log 8kT}\)
    for any randomized mechanism. All the mechanisms we provide are greedy in
    style; they are truthful and easy to integrate into practical cloud
    systems.

    Limbo: A Fast and Flexible Library for Bayesian Optimization

    Antoine Cully, Konstantinos Chatzilygeroudis, Federico Allocati, Jean-Baptiste Mouret
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)

    Limbo is an open-source C++11 library for Bayesian optimization which is
    designed to be both highly flexible and very fast. It can be used to optimize
    functions for which the gradient is unknown, evaluations are expensive, and
    runtime cost matters (e.g., on embedded systems or robots). Benchmarks on
    standard functions show that Limbo is about 2 times faster than BayesOpt
    (another C++ library) for a similar accuracy.

    CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact Suppression

    Lukas Cavigelli, Pascal Hager, Luca Benini
    Comments: 8 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Information Retrieval (cs.IR); Multimedia (cs.MM)

    Lossy image compression algorithms are pervasively used to reduce the size of
    images transmitted over the web and recorded on data storage media. However, we
    pay for their high compression rate with visual artifacts degrading the user
    experience. Deep convolutional neural networks have become a widespread tool to
    address high-level computer vision tasks very successfully. Recently, they have
    found their way into the areas of low-level computer vision and image
    processing to solve regression problems mostly with relatively shallow
    networks.

    We present a novel 12-layer deep convolutional network for image compression
    artifact suppression with hierarchical skip connections and a multi-scale loss
    function. We achieve a boost of up to 1.79 dB in PSNR over ordinary JPEG and an
    improvement of up to 0.36 dB over the best previous ConvNet result. We show
    that a network trained for a specific quality factor (QF) is resilient to the
    QF used to compress the input image – a single network trained for QF 60
    provides a PSNR gain of more than 1.5 dB over the wide QF range from 40 to 76.

    Interpreting Finite Automata for Sequential Data

    Christian Albert Hammerschmidt, Qin Lin, Sicco Verwer, Radu State
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI)

    Automaton models are often seen as interpretable models. Interpretability
    itself is not well defined: it remains unclear what interpretability means
    without first explicitly specifying objectives or desired attributes. In this
    paper, we identify the key properties used to interpret automata and propose a
    modification of a state-merging approach to learn variants of finite state
    automata. We apply the approach to problems beyond typical grammar inference
    tasks. Additionally, we cover several use-cases for prediction, classification,
    and clustering on sequential data in both supervised and unsupervised scenarios
    to show how the identified key properties are applicable in a wide range of
    contexts.

    An Efficient Training Algorithm for Kernel Survival Support Vector Machines

    Sebastian Pölsterl, Nassir Navab, Amin Katouzian
    Comments: ECML PKDD MLLS 2016: 3rd Workshop on Machine Learning in Life Sciences
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Survival analysis is a fundamental tool in medical research to identify
    predictors of adverse events and develop systems for clinical decision support.
    In order to leverage large amounts of patient data, efficient optimisation
    routines are paramount. We propose an efficient training algorithm for the
    kernel survival support vector machine (SSVM). We directly optimise the primal
    objective function and employ truncated Newton optimisation and order statistic
    trees to significantly lower computational costs compared to previous
    training algorithms, which require \(O(n^4)\) space and \(O(p n^6)\) time
    for datasets with \(n\) samples and \(p\) features. Our results demonstrate
    that our proposed optimisation scheme allows analysing data of a much
    larger scale with no loss in prediction performance. Experiments on
    synthetic and 5 real-world datasets show that our technique outperforms
    existing kernel SSVM formulations if the amount of right censoring is high
    (\(\geq 85\%\)), and performs comparably otherwise.

    Associative Adversarial Networks

    Tarik Arici, Asli Celikyilmaz
    Comments: NIPS 2016 Workshop on Adversarial Training
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    We propose a higher-level associative memory for learning adversarial
    networks. The generative adversarial network (GAN) framework has a
    discriminator and a generator network. The generator \(G\) maps white noise
    \(z\) to data samples, while the discriminator \(D\) maps data samples to a
    single scalar. To do so, \(G\) learns how to map from a high-level
    representation space to data space, and \(D\) learns to do the opposite. We
    argue that higher-level representation spaces
    need not necessarily follow a uniform probability distribution. In this work,
    we use Restricted Boltzmann Machines (RBMs) as a higher-level associative
    memory and learn the probability distribution for the high-level features
    generated by \(D\). The associative memory samples its underlying
    probability distribution, and \(G\) learns how to map these samples to data
    space. The proposed associative adversarial networks (AANs) are generative
    models at the higher levels of learning, and use adversarial non-stochastic
    models \(D\) and \(G\) for learning the mapping between data and
    higher-level representation spaces.
    Experiments show the potential of the proposed networks.


    Information Retrieval

    CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact Suppression

    Lukas Cavigelli, Pascal Hager, Luca Benini
    Comments: 8 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Information Retrieval (cs.IR); Multimedia (cs.MM)

    Lossy image compression algorithms are pervasively used to reduce the size of
    images transmitted over the web and recorded on data storage media. However, we
    pay for their high compression rate with visual artifacts degrading the user
    experience. Deep convolutional neural networks have become a widespread tool to
    address high-level computer vision tasks very successfully. Recently, they have
    found their way into the areas of low-level computer vision and image
    processing to solve regression problems mostly with relatively shallow
    networks.

    We present a novel 12-layer deep convolutional network for image compression
    artifact suppression with hierarchical skip connections and a multi-scale loss
    function. We achieve a boost of up to 1.79 dB in PSNR over ordinary JPEG and an
    improvement of up to 0.36 dB over the best previous ConvNet result. We show
    that a network trained for a specific quality factor (QF) is resilient to the
    QF used to compress the input image – a single network trained for QF 60
    provides a PSNR gain of more than 1.5 dB over the wide QF range from 40 to 76.

    A Natural Language Query Interface for Searching Personal Information on Smartwatches

    Reza Rawassizadeh, Chelsea Dobbins, Manouchehr Nourizadeh, Zahra Ghamchili, Michael Pazzani
    Comments: 6 pages, 4 figures, 2 tables
    Subjects: Human-Computer Interaction (cs.HC); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    Currently, personal assistant systems run on smartphones and use natural
    language interfaces. However, these systems rely mostly on the web for finding
    information. Mobile and wearable devices can collect an enormous amount of
    contextual personal data such as sleep and physical activities. These
    information objects and their applications are known as quantified-self, mobile
    health or personal informatics, and they can be used to provide a deeper
    insight into our behavior. To our knowledge, existing personal assistant
    systems do not support all types of quantified-self queries. In response to
    this, we have undertaken a user study to analyze a set of “textual
    questions/queries” that users have used to search their quantified-self or
    mobile health data. Through analyzing these questions, we have constructed a
    light-weight natural language based query interface, including a text parser
    algorithm and a user interface, to process the users’ queries that have been
    used for searching quantified-self information. This query interface has been
    designed to operate on small devices, i.e. smartwatches, as well as augmenting
    the personal assistant systems by allowing them to process end users’ natural
    language queries about their quantified-self data.

    Leveraging Citation Networks to Visualize Scholarly Influence Over Time

    Jason Portenoy, Jessica Hullman, Jevin D. West
    Subjects: Human-Computer Interaction (cs.HC); Digital Libraries (cs.DL); Information Retrieval (cs.IR)

    Assessing the influence of a scholar’s work is an important task for funding
    organizations, academic departments, and researchers. Common methods, such as
    measures of citation counts, can ignore much of the nuance and
    multidimensionality of scholarly influence. We present an approach for
    generating dynamic narrative visualizations of scholars’ careers. This approach
    uses an animated node-link diagram showing the citation network accumulated
    around the researcher over the course of the career in concert with key
    indicators, highlighting influence both within and across fields. We developed
    our design in collaboration with one funding organization—the Pew Biomedical
    Scholars program—but the methods are generalizable to visualizations of
    scholarly influence in general. We applied the design method to the Microsoft
    Academic Graph, which includes more than 120 million publications. We validate
    our abstractions throughout the process through collaboration with the Pew
    Biomedical Scholars program officers and summative evaluations with their
    scholars.


    Computation and Language

    Compositional Learning of Relation Paths Embedding for Knowledge Base Completion

    Xixun Lin, Yanchun Liang, Renchu Guan
    Comments: 9 Pages,1 figure
    Subjects: Computation and Language (cs.CL)

    Nowadays, large-scale knowledge bases containing billions of facts have
    reached impressive sizes; however, they are still far from completion. In
    addition, most existing methods only consider direct links between
    entities, ignoring the vital semantic information conveyed by relation
    paths. In
    this paper, we study the problem of how to better embed entities and relations
    into different low dimensional spaces. A compositional learning model of
    relation paths embedding (RPE) is proposed to take full advantage of additional
    semantic information expressed by relation paths. More specifically, using
    corresponding projection matrices, RPE can simultaneously embed entities into
    corresponding relation and path spaces. It is also suggested that type
    constraints could be extended from traditional relation-specific to the new
    proposed path-specific ones. Both of the two type constraints can be seamlessly
    incorporated into RPE and decrease the errors in prediction. Experiments are
    conducted on the benchmark datasets and the proposed model achieves significant
    and consistent improvements compared with the state-of-the-art algorithms for
    knowledge base completion.
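
    A minimal numpy sketch of the path-composition idea in translation-based
    embeddings: a relation path is represented by composing its relation
    vectors, and a candidate triple is scored by translation distance. RPE's
    relation- and path-specific projection matrices are omitted here for
    brevity.

```python
# A minimal sketch of scoring a relation path in a translation-based
# embedding model.
import numpy as np

def path_score(h, t, path_relations):
    """h, t: entity embeddings; path_relations: list of relation vectors
    along the path. Lower scores mean the path better links h to t."""
    composed = np.sum(path_relations, axis=0)      # additive composition
    return np.linalg.norm(h + composed - t, ord=1)
```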

    Learning to Distill: The Essence Vector Modeling Framework

    Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang
    Subjects: Computation and Language (cs.CL)

    In the context of natural language processing, representation learning has
    emerged as a newly active research subject because of its excellent performance
    in many applications. Learning representations of words is a pioneering study
    in this school of research. However, paragraph (or sentence and document)
    embedding learning is more suitable/reasonable for some tasks, such as
    sentiment classification and document summarization. Nevertheless, as far as we
    are aware, relatively little work has focused on the development of
    unsupervised paragraph embedding methods. Classic paragraph embedding
    methods infer the representation of a given paragraph by considering all of
    the words occurring in it. Consequently, stop or function words that occur
    frequently may mislead the embedding learning process and produce an
    indistinct paragraph representation. Motivated by these observations, our major
    contributions in this paper are twofold. First, we propose a novel unsupervised
    paragraph embedding method, named the essence vector (EV) model, which aims at
    not only distilling the most representative information from a paragraph but
    also excluding the general background information to produce a more informative
    low-dimensional vector representation for the paragraph. Second, in view of the
    increasing importance of spoken content processing, an extension of the EV
    model, named the denoising essence vector (D-EV) model, is proposed. The D-EV
    model not only inherits the advantages of the EV model but also can infer a
    more robust representation for a given spoken paragraph against imperfect
    speech recognition.

    An Experimental Comparison of Deep Neural Networks for End-to-end Speech Recognition

    Zewang Zhang, Zheng Sun, Jiaqi Liu, Jingwen Chen, Zhao Huo, Xiao Zhang
    Comments: 10 pages, 13 figures
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    The performance of end-to-end automatic speech recognition (ASR) systems
    can be significantly improved by ever larger speech corpora and deeper
    neural networks. Given the resulting problem of training speed and the
    recent success of deep convolutional neural networks in ASR, we build a
    novel deep recurrent convolutional network for acoustic modeling and apply
    a deep residual learning framework to it. Our experiments show that it
    achieves not only faster convergence but also better recognition accuracy
    than a traditional deep convolutional recurrent network. We mainly compare
    the convergence speed of the two acoustic models: the novel deep recurrent
    convolutional network and the traditional deep convolutional recurrent
    network. With its faster convergence, our novel deep recurrent
    convolutional network reaches comparable performance. We further show that
    applying deep residual learning boosts both the convergence speed and the
    recognition accuracy of our novel recurrent convolutional networks.
    Finally, we evaluate all our experimental networks by phoneme error rate
    (PER) with a newly proposed bidirectional statistical language model. The
    evaluation results show that our model with deep residual learning reaches
    the best PER of 17.33%, with the fastest convergence, on the TIMIT
    database.
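
    The residual pattern applied to the convolutional layers can be sketched
    in PyTorch as below: each block learns a correction that is added back to
    its input, which eases optimization as depth grows. The channel count and
    kernel size are illustrative.

```python
# A minimal sketch of a residual convolutional block over speech frames.
import torch.nn as nn

class ResidualConvBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
        )
        self.act = nn.ReLU()

    def forward(self, x):                  # x: (batch, channels, frames)
        return self.act(x + self.body(x))  # identity shortcut
```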

    A Natural Language Query Interface for Searching Personal Information on Smartwatches

    Reza Rawassizadeh, Chelsea Dobbins, Manouchehr Nourizadeh, Zahra Ghamchili, Michael Pazzani
    Comments: 6 pages, 4 figures, 2 tables
    Subjects: Human-Computer Interaction (cs.HC); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    Currently, personal assistant systems run on smartphones and use natural
    language interfaces. However, these systems rely mostly on the web for finding
    information. Mobile and wearable devices can collect an enormous amount of
    contextual personal data such as sleep and physical activities. These
    information objects and their applications are known as quantified-self, mobile
    health or personal informatics, and they can be used to provide a deeper
    insight into our behavior. To our knowledge, existing personal assistant
    systems do not support all types of quantified-self queries. In response to
    this, we have undertaken a user study to analyze a set of “textual
    questions/queries” that users have used to search their quantified-self or
    mobile health data. Through analyzing these questions, we have constructed a
    light-weight natural language based query interface, including a text parser
    algorithm and a user interface, to process the users’ queries that have been
    used for searching quantified-self information. This query interface has been
    designed to operate on small devices, i.e. smartwatches, as well as augmenting
    the personal assistant systems by allowing them to process end users’ natural
    language queries about their quantified-self data.


    Distributed, Parallel, and Cluster Computing

    Can Broken Multicore Hardware be Mended?

    János Végh
    Comments: 3 figures; a Viewpoint
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR)

    A suggestion is made for mending multicore hardware, which has been diagnosed
    as broken.

    Time and Space Optimal Counting in Population Protocols

    James Aspnes (YALE), Joffroy Beauquier (LRI), Janna Burman (LRI), Devan Sohier
    Journal-ref: 20th International Conference on Principles of Distributed
    Systems (OPODIS 2016), Dec 2016, Madrid, Spain
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)

    This work concerns the general issue of combined optimality in terms of time
    and space complexity. In this context, we study the problem of (exact) counting
    resource-limited and passively mobile nodes in the model of population
    protocols, in which the space complexity is crucial. The counted nodes are
    memory-limited anonymous devices (called agents) communicating asynchronously
    in pairs (according to a fairness condition). Moreover, we assume that these
    agents are prone to failures so that they cannot be correctly initialized. This
    study considers two classical fairness conditions, and for each we investigate
    the issue of time optimality of counting given the optimal space per agent. In
    the case of randomly interacting agents (probabilistic fairness), as usual, the
    convergence time is measured in terms of parallel time (or parallel
    interactions), which is defined as the number of pairwise interactions until
    convergence, divided by n (the number of agents). In case of weak fairness,
    where it is only required that every pair of agents interacts infinitely often,
    the convergence time is defined in terms of non-null transitions, i.e, the
    transitions that affect the states of the interacting agents.First, assuming
    probabilistic fairness, we present a “non-guessing” time optimal protocol of
    O(n log n) expected time given an optimal space of only one bit, and we prove
    the time optimality of this protocol. Then, for weak fairness, we show that a
    space optimal (semi-uniform) solution cannot converge faster than in
    (Omega)(2^n) time (non-null transitions). This result, together with the time
    complexity analysis of an already known space optimal protocol, shows that it
    is also optimal in time (given the optimal space constrains).
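
    The parallel-time measure defined above is easy to reproduce in
    simulation: schedule uniformly random pairwise interactions and divide the
    interaction count at convergence by n. The toy one-way epidemic below only
    exercises the measure; it is not the counting protocol of the paper.

```python
# A toy simulation of parallel time under random pairwise interactions.
import numpy as np

def epidemic_parallel_time(n, rng=np.random.default_rng(0)):
    informed = np.zeros(n, dtype=bool)
    informed[0] = True
    interactions = 0
    while not informed.all():
        i, j = rng.choice(n, size=2, replace=False)  # a random pair meets
        if informed[i] or informed[j]:
            informed[i] = informed[j] = True
        interactions += 1
    return interactions / n        # parallel time, about Theta(log n) here

print(epidemic_parallel_time(1000))
```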

    Fast and Energy-Efficient CNN Inference on IoT Devices

    Mohammad Motamedi, Daniel Fong, Soheil Ghiasi
    Comments: 7 pages, 10 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    Convolutional Neural Networks (CNNs) exhibit remarkable performance in
    various machine learning tasks. As sensor-equipped internet of things (IoT)
    devices permeate into every aspect of modern life, it is increasingly important
    to run CNN inference, a computationally intensive application, on resource
    constrained devices. We present a technique for fast and energy-efficient CNN
    inference on mobile SoC platforms, which are projected to be a major player in
    the IoT space. We propose techniques for efficient parallelization of CNN
    inference targeting mobile GPUs, and explore the underlying tradeoffs.
    Experiments with running Squeezenet on three different mobile devices confirm
    the effectiveness of our approach. For further study, please refer to the
    project repository available on our GitHub page:
    this https URL

    pocl: A Performance-Portable OpenCL Implementation

    Pekka Jääskeläinen, Carlos Sánchez de La Lama, Erik Schnetter, Kalle Raiskila, Jarmo Takala, Heikki Berg
    Comments: This article was published in 2015; it is now openly accessible via arXiv
    Journal-ref: Int J Parallel Prog (2015) 43: 752
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    OpenCL is a standard for parallel programming of heterogeneous systems. The
    benefits of a common programming standard are clear; multiple vendors can
    provide support for application descriptions written according to the standard,
    thus reducing the program porting effort. While the standard brings the obvious
    benefits of platform portability, the performance portability aspects are
    largely left to the programmer. The situation is made worse by multiple
    proprietary vendor implementations with different characteristics and,
    thus, different required optimization strategies.

    In this paper, we propose an OpenCL implementation that is both portable and
    performance portable. At its core is a kernel compiler that can be used to
    exploit the data parallelism of OpenCL programs on multiple platforms with
    different parallel hardware styles. The kernel compiler is modularized to
    perform target-independent parallel region formation separately from the
    target-specific parallel mapping of the regions to enable support for various
    styles of fine-grained parallel resources such as subword SIMD extensions, SIMD
    datapaths and static multi-issue. Unlike previous similar techniques that work
    on the source level, the parallel region formation retains the information of
    the data parallelism using the LLVM IR and its metadata infrastructure. This
    data can be exploited by the later generic compiler passes for efficient
    parallelization.

    The proposed open source implementation of OpenCL is also platform portable,
    enabling OpenCL on a wide range of architectures, both already commercialized
    and on those that are still under research. The paper describes how the
    portability of the implementation is achieved. Our results show that most of
    the benchmarked applications when compiled using pocl were faster or close to
    as fast as the best proprietary OpenCL implementation for the platform at hand.
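
    For readers unfamiliar with OpenCL's programming model, the following minimal
    vector-addition example (written in Python with the pyopencl bindings, which
    are a separate project from pocl) shows the portability the standard provides:
    the same kernel source runs unmodified on pocl or on any vendor
    implementation.

        import numpy as np
        import pyopencl as cl

        a = np.random.rand(1024).astype(np.float32)
        b = np.random.rand(1024).astype(np.float32)

        ctx = cl.create_some_context()           # picks any installed OpenCL platform
        queue = cl.CommandQueue(ctx)
        mf = cl.mem_flags
        a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
        b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
        out_g = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

        # The kernel's data parallelism (one work-item per element) is what a
        # kernel compiler such as pocl's maps onto SIMD lanes or issue slots.
        prg = cl.Program(ctx, """
        __kernel void add(__global const float *a,
                          __global const float *b,
                          __global float *out) {
            int gid = get_global_id(0);
            out[gid] = a[gid] + b[gid];
        }
        """).build()

        prg.add(queue, a.shape, None, a_g, b_g, out_g)
        out = np.empty_like(a)
        cl.enqueue_copy(queue, out, out_g)
        assert np.allclose(out, a + b)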

    Call Trace and Memory Access Pattern based Runtime Insider Threat Detection for Big Data Platforms

    Santosh Aditham, Nagarajan Ranganathan, Srinivas Katkoori
    Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

    Big data platforms such as Hadoop and Spark are being widely adopted both by
    academia and industry. In this paper, we propose a runtime intrusion detection
    technique that understands and works according to the properties of such
    distributed compute platforms. The proposed method is based on runtime analysis
    of system and library calls and memory access patterns of tasks running on the
    datanodes (slaves). First, the primary datanode of a big data system creates a
    behavior profile for every task it executes. A behavior profile includes (a)
    trace of the system & library calls made, and (b) sequence representing the
    sizes of private and shared memory accesses made during task execution. Then,
    the process behavior profile is shared with other replica datanodes that are
    scheduled to execute the same task on their copy of the same data. Next, these
    replica datanodes verify their local tasks with the help of the information
    embedded in the received behavior profiles. This is realized in two steps: (i)
    comparing the system & library calls metadata, and (ii) statistical matching of
    the memory access patterns. Finally, datanodes share their observations for
    consensus and report an intrusion to the namenode (master) if they find any
    discrepancy. The proposed solution was tested on a small Hadoop cluster using
    the default MapReduce examples, and the results show that our approach can
    detect insider attacks that cannot be detected with traditional analysis
    metrics.
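
    The call-trace comparison step can be pictured with a small sketch. The
    snippet below is illustrative only (the paper's matching also covers memory
    access patterns and statistical tests); it flags a discrepancy when the
    n-gram overlap between a primary's and a replica's call traces falls below a
    threshold:

        def ngrams(trace, n=3):
            return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}

        def traces_match(trace_a, trace_b, threshold=0.8, n=3):
            """Jaccard similarity over call n-grams; report a mismatch below
            the threshold so the datanodes can vote on a possible intrusion."""
            a, b = ngrams(trace_a, n), ngrams(trace_b, n)
            sim = len(a & b) / len(a | b) if (a | b) else 1.0
            return sim >= threshold

        # replica datanodes would exchange behavior profiles and vote:
        primary = ["open", "read", "mmap", "read", "write", "close"]
        replica = ["open", "read", "mmap", "read", "write", "close"]
        assert traces_match(primary, replica)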


    Learning

    A causal framework for discovering and removing direct and indirect discrimination

    Lu Zhang (1), Yongkai Wu (1), Xintao Wu (1) ((1) University of Arkansas)
    Subjects: Learning (cs.LG)

    Anti-discrimination is an increasingly important task in data science. In
    this paper, we investigate the problem of discovering both direct and indirect
    discrimination from the historical data, and removing the discriminatory
    effects before the data is used for predictive analysis (e.g., building
    classifiers). We make use of the causal network to capture the causal structure
    of the data. Then we model direct and indirect discrimination as the
    path-specific effects, which explicitly distinguish the two types of
    discrimination as the causal effects transmitted along different paths in the
    network. Based on that, we propose an effective algorithm for discovering
    direct and indirect discrimination, as well as an algorithm for precisely
    removing both types of discrimination while retaining good data utility.
    Different from previous works, our approaches can ensure that the predictive
    models built from the modified data will not incur discrimination in decision
    making. Experiments using real datasets show the effectiveness of our
    approaches.
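
    To make the notion of path-specific effects concrete, consider a toy linear
    structural equation model in which a protected attribute C affects a decision
    E both directly and through a mediator R. In the linear case, a path-specific
    effect is simply the product of the edge coefficients along that path (a
    hypothetical illustration, not the paper's algorithm):

        import numpy as np

        # Linear SEM: C -> E (direct) and C -> R -> E (indirect via a proxy R).
        rng = np.random.default_rng(0)
        n = 100000
        C = rng.integers(0, 2, n).astype(float)        # protected attribute
        R = 0.8 * C + rng.normal(0, 0.1, n)            # mediator
        E = 0.3 * C + 0.5 * R + rng.normal(0, 0.1, n)  # decision score

        direct = 0.3                 # effect along the path C -> E
        indirect = 0.8 * 0.5         # effect along the path C -> R -> E
        total = np.mean(E[C == 1]) - np.mean(E[C == 0])
        print(total, direct + indirect)   # both close to 0.70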

    Variational Intrinsic Control

    Karol Gregor, Danilo Jimenez Rezende, Daan Wierstra
    Comments: 15 pages, 6 figures
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    In this paper we introduce a new unsupervised reinforcement learning method
    for discovering the set of intrinsic options available to an agent. This set is
    learned by maximizing the number of different states an agent can reliably
    reach, as measured by the mutual information between the set of options and
    option termination states. To this end, we instantiate two policy gradient
    based algorithms, one that creates an explicit embedding space of options and
    one that represents options implicitly. The algorithms also provide an explicit
    measure of empowerment in a given state that can be used by an empowerment
    maximizing agent. The algorithm scales well with function approximation and we
    demonstrate the applicability of the algorithm on a range of tasks.
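
    The quantity being maximized can be computed exactly in a tabular toy
    setting. The sketch below (an illustration under assumed distributions, not
    the paper's estimator) evaluates the mutual information between two options
    and their termination states:

        import numpy as np

        # p_sf_given_opt[k] = distribution over termination states for option k
        p_opt = np.array([0.5, 0.5])
        p_sf_given_opt = np.array([[0.9, 0.1],
                                   [0.1, 0.9]])
        p_sf = p_opt @ p_sf_given_opt
        mi = sum(p_opt[k] * p_sf_given_opt[k, s] *
                 np.log(p_sf_given_opt[k, s] / p_sf[s])
                 for k in range(2) for s in range(2))
        print(mi)   # approaches log 2 as the options reach distinct states reliably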

    Singularity of the Hessian in Deep Learning

    Levent Sagun, Leon Bottou, Yann LeCun
    Comments: ICLR 2017 Submission on Nov 4, 2016
    Subjects: Learning (cs.LG)

    We look at the eigenvalues of the Hessian of a loss function before and after
    training. The eigenvalue distribution is seen to be composed of two parts, the
    bulk which is concentrated around zero, and the edges which are scattered away
    from zero. We present empirical evidence for the bulk indicating how
    over-parametrized the system is, and for the edges indicating the complexity of
    the input data.
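
    The experiment is easy to reproduce in miniature. The sketch below (a toy
    setup, not the authors' configuration) computes the Hessian eigenvalues of a
    small over-parametrized network's loss with PyTorch; one typically sees a
    bulk of near-zero eigenvalues plus a few outliers:

        import torch

        torch.manual_seed(0)
        X, y = torch.randn(20, 3), torch.randn(20, 1)   # 20 points, 41 parameters
        n_h = 8
        n_params = 3 * n_h + n_h + n_h + 1

        def loss(theta):
            # unpack a flat parameter vector into a 3 -> 8 -> 1 network
            i = 0
            W1 = theta[i:i + 3 * n_h].reshape(3, n_h); i += 3 * n_h
            b1 = theta[i:i + n_h]; i += n_h
            W2 = theta[i:i + n_h].reshape(n_h, 1); i += n_h
            b2 = theta[i:i + 1]
            h = torch.tanh(X @ W1 + b1)
            return ((h @ W2 + b2 - y) ** 2).mean()

        theta0 = 0.5 * torch.randn(n_params)
        H = torch.autograd.functional.hessian(loss, theta0)
        print(torch.linalg.eigvalsh(H))   # bulk near zero, scattered edges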

    Achieving non-discrimination in data release

    Lu Zhang (1), Yongkai Wu (1), Xintao Wu (1) ((1) University of Arkansas)
    Subjects: Learning (cs.LG)

    Discrimination discovery and prevention/removal are increasingly important
    tasks in data mining. Discrimination discovery aims to unveil discriminatory
    practices on the protected attribute (e.g., gender) by analyzing the dataset of
    historical decision records, and discrimination prevention aims to remove
    discrimination by modifying the biased data before conducting predictive
    analysis. In this paper, we show that the key to discrimination discovery and
    prevention is to find the meaningful partitions that can be used to provide
    quantitative evidence for the judgment of discrimination. With the support of
    the causal graph, we present a graphical condition for identifying a meaningful
    partition. Based on that, we develop a simple criterion for the claim of
    non-discrimination, and propose discrimination removal algorithms which
    accurately remove discrimination while retaining good data utility. Experiments
    using real datasets show the effectiveness of our approaches.

    Deep Learning Approximation for Stochastic Control Problems

    Jiequn Han, E Weinan
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC); Machine Learning (stat.ML)

    Many real world stochastic control problems suffer from the “curse of
    dimensionality”. To overcome this difficulty, we develop a deep learning
    approach that directly solves high-dimensional stochastic control problems
    based on Monte-Carlo sampling. We approximate the time-dependent controls as
    feedforward neural networks and stack these networks together through model
    dynamics. The objective function for the control problem plays the role of the
    loss function for the deep neural network. We test this approach using examples
    from the areas of optimal trading and energy storage. Our results suggest that
    the algorithm presented here achieves satisfactory accuracy and at the same
    time, can handle rather high dimensional problems.
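
    The stacking construction is easy to sketch on a hypothetical linear-quadratic
    toy problem (not the paper's trading or storage examples): one feedforward
    network per time step is chained through the sampled dynamics, and the control
    objective is used directly as the training loss.

        import torch
        import torch.nn as nn

        # Toy problem: x_{t+1} = x_t + u_t + sigma * xi_t, cost sum_t (x_t^2 + u_t^2)
        T, d, sigma, batch = 5, 10, 0.5, 256
        policies = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, d))
             for _ in range(T)])                      # one control network per step
        opt = torch.optim.Adam(policies.parameters(), lr=1e-2)

        for step in range(500):
            x = torch.randn(batch, d)                 # Monte-Carlo initial states
            cost = torch.zeros(batch)
            for t in range(T):                        # stack networks through dynamics
                u = policies[t](x)
                cost = cost + (x ** 2).sum(1) + (u ** 2).sum(1)
                x = x + u + sigma * torch.randn(batch, d)   # sampled transition
            loss = cost.mean()                        # control objective = training loss
            opt.zero_grad(); loss.backward(); opt.step()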

    Limbo: A Fast and Flexible Library for Bayesian Optimization

    Antoine Cully, Konstantinos Chatzilygeroudis, Federico Allocati, Jean-Baptiste Mouret
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)

    Limbo is an open-source C++11 library for Bayesian optimization which is
    designed to be both highly flexible and very fast. It can be used to optimize
    functions for which the gradient is unknown, evaluations are expensive, and
    runtime cost matters (e.g., on embedded systems or robots). Benchmarks on
    standard functions show that Limbo is about 2 times faster than BayesOpt
    (another C++ library) for a similar accuracy.

    Correlation Clustering with Low-Rank Matrices

    Nate Veldt, Anthony Wirth, David F. Gleich
    Subjects: Learning (cs.LG); Data Structures and Algorithms (cs.DS); Numerical Analysis (cs.NA)

    Correlation clustering is a technique for aggregating data based on
    qualitative information about which pairs of objects are labeled ‘similar’ or
    ‘dissimilar.’ Because the optimization problem is NP-hard, much of the previous
    literature focuses on finding approximation algorithms. In this paper we
    explore a new approach to correlation clustering by considering how to solve
    the problem when the data to be clustered can be represented by a low-rank
    matrix. Many real-world datasets are known to be inherently low-dimensional,
    and our goal is to establish a tractable approach to correlation clustering in
    this important setting. We prove in particular that correlation clustering can
    be solved in polynomial time when the underlying matrix is positive
    semidefinite with small constant rank, but that the task remains NP-hard in the
    presence of even one negative eigenvalue. Based on our theoretical results, we
    develop an algorithm for efficiently solving low-rank positive semidefinite
    correlation clustering by employing a procedure for zonotope vertex
    enumeration. We demonstrate the effectiveness and speed of our algorithm by
    using it to solve several clustering problems on both synthetic and real-world
    data.
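
    For reference, the objective being minimized counts pairwise disagreements.
    The helper below (our illustration, not the authors' code) evaluates this
    cost for a candidate clustering; minimizing it over all clusterings is the
    NP-hard part that the low-rank structure makes tractable:

        import numpy as np

        def disagreements(S, labels):
            """Correlation clustering cost for similarity matrix S (entries +1/-1):
            count 'similar' pairs split apart plus 'dissimilar' pairs kept together."""
            n = len(labels)
            cost = 0
            for i in range(n):
                for j in range(i + 1, n):
                    same = labels[i] == labels[j]
                    if S[i, j] > 0 and not same:
                        cost += 1
                    elif S[i, j] < 0 and same:
                        cost += 1
            return cost

        S = np.array([[ 0,  1, -1],
                      [ 1,  0, -1],
                      [-1, -1,  0]])
        print(disagreements(S, [0, 0, 1]))   # 0: this clustering is consistent with S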

    Risk-Sensitive Learning and Pricing for Demand Response

    Kia Khezeli, Eilyan Bitar
    Subjects: Learning (cs.LG); Optimization and Control (math.OC)

    We consider the setting in which an electric power utility seeks to curtail
    its peak electricity demand by offering a fixed group of customers a uniform
    price for reductions in consumption relative to their predetermined baselines.
    The underlying demand curve, which describes the aggregate reduction in
    consumption in response to the offered price, is assumed to be affine and
    subject to unobservable random shocks. Assuming that both the parameters of the
    demand curve and the distribution of the random shocks are initially unknown to
    the utility, we investigate the extent to which the utility might dynamically
    adjust its offered prices to maximize its cumulative risk-sensitive payoff over
    a finite number of (T) days. In order to do so effectively, the utility must
    design its pricing policy to balance the tradeoff between the need to learn the
    unknown demand model (exploration) and maximize its payoff (exploitation) over
    time. In this paper, we propose such a pricing policy, which is shown to
    exhibit an expected payoff loss over (T) days that is at most (O(sqrt{T})),
    relative to an oracle pricing policy that knows the underlying demand model.
    Moreover, the proposed pricing policy is shown to yield a sequence of prices
    that converge to the oracle optimal prices in the mean square sense.
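
    The exploration-exploitation structure can be sketched with a
    certainty-equivalent policy plus decaying price perturbations (a simplified,
    risk-neutral stand-in for the paper's risk-sensitive policy; all numbers are
    hypothetical):

        import numpy as np

        rng = np.random.default_rng(0)
        a_true, b_true, noise = 1.0, 2.0, 0.5  # affine demand: D(p) = a + b*p + shock
        v = 3.0                                # utility's value per unit of reduction
        prices, demands, p = [], [], 1.0
        for t in range(1, 201):
            p_t = p + rng.normal(0, 1.0 / np.sqrt(t))           # decaying exploration
            D_t = a_true + b_true * p_t + rng.normal(0, noise)  # observed reduction
            prices.append(p_t); demands.append(D_t)
            A = np.column_stack([np.ones(len(prices)), prices])
            a_hat, b_hat = np.linalg.lstsq(A, np.array(demands), rcond=None)[0]
            if abs(b_hat) > 1e-3:                  # skip ill-conditioned early fits
                p = min(5.0, max(0.0, v / 2 - a_hat / (2 * b_hat)))
        print(p)   # converges toward the oracle price v/2 - a/(2b) = 1.25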

    An Efficient Training Algorithm for Kernel Survival Support Vector Machines

    Sebastian Pölsterl, Nassir Navab, Amin Katouzian
    Comments: ECML PKDD MLLS 2016: 3rd Workshop on Machine Learning in Life Sciences
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Survival analysis is a fundamental tool in medical research to identify
    predictors of adverse events and develop systems for clinical decision support.
    In order to leverage large amounts of patient data, efficient optimisation
    routines are paramount. We propose an efficient training algorithm for the
    kernel survival support vector machine (SSVM). We directly optimise the primal
    objective function and employ truncated Newton optimisation and order statistic
    trees to significantly lower computational costs compared to previous training
    algorithms, which require (O(n^4)) space and (O(p n^6)) time for datasets with
    (n) samples and (p) features. Our results demonstrate that our proposed
    optimisation scheme allows analysing data of a much larger scale with no loss
    in prediction performance. Experiments on synthetic and 5 real-world datasets
    show that our technique outperforms existing kernel SSVM formulations if the
    amount of right censoring is high ((geq 85%)), and performs comparably
    otherwise.

    Inducing Interpretable Representations with Variational Autoencoders

    N. Siddharth, Brooks Paige, Alban Desmaison, Jan-Willem Van de Meent, Frank Wood, Noah D. Goodman, Pushmeet Kohli, Philip H.S. Torr
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We develop a framework for incorporating structured graphical models in the
    encoders of variational autoencoders (VAEs) that allows us to induce
    interpretable representations through approximate variational inference. This
    allows us to both perform reasoning (e.g. classification) under the structural
    constraints of a given graphical model, and use deep generative models to deal
    with messy, high-dimensional domains where it is often difficult to model all
    the variation. Learning in this framework is carried out end-to-end with a
    variational objective, applying to both unsupervised and semi-supervised
    schemes.

    Can Co-robots Learn to Teach?

    Harshal Maske, Emily Kieson, Girish Chowdhary, Charles Abramson
    Comments: 9 pages, conference
    Subjects: Robotics (cs.RO); Learning (cs.LG)

    We explore beyond existing work on learning from demonstration by asking the
    question: Can robots learn to teach?, that is, can a robot autonomously learn
    an instructional policy from expert demonstration and use it to instruct or
    collaborate with humans in executing complex tasks in uncertain environments?
    In this paper we pursue a solution to this problem by leveraging the idea that
    humans often implicitly decompose a higher level task into several subgoals
    whose execution brings the task closer to completion. We propose a Dirichlet
    process based non-parametric Inverse Reinforcement Learning (DPMIRL) approach
    for reward based unsupervised clustering of the task space into subgoals. This
    approach is shown to capture the latent subgoals that a human teacher would
    have utilized to train a novice. The notion of action primitive is introduced
    as the means to communicate instruction policy to humans in the least
    complicated manner, and as a computationally efficient tool to segment
    demonstration data. We evaluate our approach through experiments on a
    hydraulically actuated scaled model of an excavator, and compare the different
    teaching strategies utilized by the robot.

    Grad-CAM: Why did you say that?

    Ramprasaath R Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh, Dhruv Batra
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems. This is an extended abstract version of arXiv:1610.02391
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We propose a technique for making Convolutional Neural Network (CNN)-based
    models more transparent by visualizing input regions that are ‘important’ for
    predictions — or visual explanations. Our approach, called Gradient-weighted
    Class Activation Mapping (Grad-CAM), uses class-specific gradient information
    to localize important regions. These localizations are combined with existing
    pixel-space visualizations to create a novel high-resolution and
    class-discriminative visualization called Guided Grad-CAM. These methods help
    better understand CNN-based models, including image captioning and visual
    question answering (VQA) models. We evaluate our visual explanations by
    measuring their ability to discriminate between classes, to inspire trust in
    humans, and their correlation with occlusion maps. Grad-CAM provides a new way
    to understand CNN-based models.

    We have released code, an online demo hosted on CloudCV, and a full version
    of this extended abstract.
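
    A compact PyTorch sketch of the Grad-CAM computation follows (hooks on an
    assumed convolutional layer; the model and layer choice are placeholders, and
    the guided-backprop fusion step of Guided Grad-CAM is omitted):

        import torch
        import torch.nn.functional as F

        def grad_cam(model, conv_layer, x, class_idx):
            """Heatmap = ReLU(sum_k w_k * A_k), w_k = global-average-pooled gradient."""
            store = {}
            h1 = conv_layer.register_forward_hook(
                lambda m, inp, out: store.update(act=out))
            h2 = conv_layer.register_full_backward_hook(
                lambda m, gin, gout: store.update(grad=gout[0]))
            model.zero_grad()
            model(x)[0, class_idx].backward()     # gradient of the class score
            h1.remove(); h2.remove()
            w = store['grad'].mean(dim=(2, 3), keepdim=True)  # class-specific weights
            cam = F.relu((w * store['act']).sum(dim=1))       # weighted feature maps
            return cam / (cam.max() + 1e-8)

        # e.g. for a torchvision ResNet: grad_cam(model, model.layer4, image, 243)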

    TreeView: Peeking into Deep Neural Networks Via Feature-Space Partitioning

    Jayaraman J. Thiagarajan, Bhavya Kailkhura, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    With the advent of highly predictive but opaque deep learning models, it has
    become more important than ever to understand and explain the predictions of
    such models. Existing approaches define interpretability as the inverse of
    complexity and achieve interpretability at the cost of accuracy. This
    introduces a risk of producing interpretable but misleading explanations. As
    humans, we are prone to engage in this kind of behavior [mythos]. In this
    paper, we take a step in the direction of tackling the problem of
    interpretability without compromising the model accuracy. We propose to build a
    TreeView representation of the complex model via hierarchical partitioning of
    the feature space, which reveals the iterative rejection of unlikely class
    labels until the correct association is predicted.

    Variational Graph Auto-Encoders

    Thomas N. Kipf, Max Welling
    Comments: Bayesian Deep Learning Workshop (NIPS 2016)
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We introduce the variational graph auto-encoder (VGAE), a framework for
    unsupervised learning on graph-structured data based on the variational
    auto-encoder (VAE). This model makes use of latent variables and is capable of
    learning interpretable latent representations for undirected graphs. We
    demonstrate this model using a graph convolutional network (GCN) encoder and a
    simple inner product decoder. Our model achieves competitive results on a link
    prediction task in citation networks. In contrast to most existing models for
    unsupervised learning on graph-structured data and link prediction, our model
    can naturally incorporate node features, which significantly improves
    predictive performance on a number of benchmark datasets.
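
    The model is small enough to sketch in full. Below is a minimal PyTorch
    rendering of a two-layer GCN encoder with an inner-product decoder (dimension
    names and the normalized adjacency A_norm are assumptions; training with the
    usual variational objective is omitted):

        import torch
        import torch.nn as nn

        class VGAE(nn.Module):
            def __init__(self, n_feat, n_hid, n_lat):
                super().__init__()
                self.lin0 = nn.Linear(n_feat, n_hid, bias=False)
                self.lin_mu = nn.Linear(n_hid, n_lat, bias=False)
                self.lin_logvar = nn.Linear(n_hid, n_lat, bias=False)

            def forward(self, A_norm, X):
                # A_norm: symmetrically normalized adjacency with self-loops
                h = torch.relu(A_norm @ self.lin0(X))      # shared first GCN layer
                mu = A_norm @ self.lin_mu(h)               # GCN layer -> means
                logvar = A_norm @ self.lin_logvar(h)       # GCN layer -> log-variances
                z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
                A_hat = torch.sigmoid(z @ z.t())           # inner-product decoder
                return A_hat, mu, logvar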

    Investigating the influence of noise and distractors on the interpretation of neural networks

    Pieter-Jan Kindermans, Kristof Schütt, Klaus-Robert Müller, Sven Dähne
    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Understanding neural networks is becoming increasingly important. Over the
    last few years different types of visualisation and explanation methods have
    been proposed. However, none of them explicitly considered the behaviour in the
    presence of noise and distracting elements. In this work, we show how
    noise and distracting dimensions can influence the result of an explanation
    model. This gives new theoretical insights that aid selection of the most
    appropriate explanation model within the deep-Taylor decomposition framework.

    An Experimental Comparison of Deep Neural Networks for End-to-end Speech Recognition

    Zewang Zhang, Zheng Sun, Jiaqi Liu, Jingwen Chen, Zhao Huo, Xiao Zhang
    Comments: 10 pages, 13 figures
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    The performance of end-to-end automatic speech recognition (ASR) systems can
    be significantly improved by increasingly large speech corpora and deeper
    neural networks. Given the resulting problem of training speed and the recent
    success of deep convolutional neural networks in ASR, we build a novel deep
    recurrent convolutional network for acoustic modeling and apply the deep
    residual learning framework to it. Our experiments show that it achieves not
    only faster convergence but also better recognition accuracy than the
    traditional deep convolutional recurrent network. We mainly compare the
    convergence speed of the two acoustic models, namely the novel deep recurrent
    convolutional network and the traditional deep convolutional recurrent
    network. With its faster convergence, our deep recurrent convolutional
    network reaches comparable performance. We further show that applying deep
    residual learning boosts both the convergence speed and the recognition
    accuracy of our recurrent convolutional network. Finally, we evaluate all our
    experimental networks by phoneme error rate (PER) with a newly proposed
    bidirectional statistical language model. The results show that our model
    with deep residual learning reaches the best PER of 17.33%, with the fastest
    convergence speed, on the TIMIT database.

    Fast and Energy-Efficient CNN Inference on IoT Devices

    Mohammad Motamedi, Daniel Fong, Soheil Ghiasi
    Comments: 7 pages, 10 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    Convolutional Neural Networks (CNNs) exhibit remarkable performance in
    various machine learning tasks. As sensor-equipped internet of things (IoT)
    devices permeate into every aspect of modern life, it is increasingly important
    to run CNN inference, a computationally intensive application, on resource
    constrained devices. We present a technique for fast and energy-efficient CNN
    inference on mobile SoC platforms, which are projected to be a major player in
    the IoT space. We propose techniques for efficient parallelization of CNN
    inference targeting mobile GPUs, and explore the underlying tradeoffs.
    Experiments with running Squeezenet on three different mobile devices confirm
    the effectiveness of our approach. For further study, please refer to the
    project repository available on our GitHub page:
    this https URL

    Max-Margin Deep Generative Models for (Semi-)Supervised Learning

    Chongxuan Li, Jun Zhu, Bo Zhang
    Comments: arXiv admin note: substantial text overlap with arXiv:1504.06787
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Deep generative models (DGMs) are effective at learning multilayered
    representations of complex data and at performing inference on input data by
    exploiting their generative ability. However, the discriminative ability of
    DGMs for making accurate predictions remains relatively weak. This
    paper presents max-margin deep generative models (mmDGMs) and a
    class-conditional variant (mmDCGMs), which explore the strongly discriminative
    principle of max-margin learning to improve the predictive performance of DGMs
    in both supervised and semi-supervised learning, while retaining the generative
    capability. In semi-supervised learning, we use the predictions of a max-margin
    classifier as the missing labels instead of performing full posterior inference
    for efficiency; we also introduce additional max-margin and label-balance
    regularization terms of unlabeled data for effectiveness. We develop an
    efficient doubly stochastic subgradient algorithm for the piecewise linear
    objectives in different settings. Empirical results on various datasets
    demonstrate that: (1) max-margin learning can significantly improve the
    prediction performance of DGMs and meanwhile retain the generative ability; (2)
    in supervised learning, mmDGMs are competitive to the best fully discriminative
    networks when employing convolutional neural networks as the generative and
    recognition models; and (3) in semi-supervised learning, mmDCGMs can perform
    efficient inference and achieve state-of-the-art classification results on
    several benchmarks.

    A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games

    Felix Leibfried, Nate Kushman, Katja Hofmann
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Reinforcement learning is concerned with learning to interact with
    environments that are initially unknown. State-of-the-art reinforcement
    learning approaches, such as DQN, are model-free and learn to act effectively
    across a wide range of environments such as Atari games, but require huge
    amounts of data. Model-based techniques are more data-efficient, but need to
    acquire explicit knowledge about the environment dynamics or the reward
    structure.

    In this paper we take a step towards using model-based techniques in
    environments with high-dimensional visual state space when system dynamics and
    the reward structure are both unknown and need to be learned, by demonstrating
    that it is possible to learn both jointly. Empirical evaluation on five Atari
    games demonstrates accurate cumulative reward prediction up to 200 frames ahead. We
    consider these positive results as opening up important directions for
    model-based RL in complex, initially unknown environments.

    The Recycling Gibbs Sampler for Efficient Learning

    Luca Martino, Victor Elvira, Gustau Camps-Valls
    Comments: The MATLAB code of the numerical examples is provided at this http URL
    Subjects: Computation (stat.CO); Learning (cs.LG); Machine Learning (stat.ML)

    Monte Carlo methods are essential tools for Bayesian inference. Gibbs
    sampling is a well-known Markov chain Monte Carlo (MCMC) algorithm, extensively
    used in signal processing, machine learning, and statistics, employed to draw
    samples from complicated high-dimensional posterior distributions. The key
    point for the successful application of the Gibbs sampler is the ability to
    draw samples efficiently from the full-conditional probability density
    functions. Since this is not possible in the general case, auxiliary samples
    must be generated to speed up the convergence of the chain, and their
    information is eventually disregarded. In this work, we show that these
    auxiliary samples can be recycled within the Gibbs estimators, improving their
    efficiency with no extra cost. This novel scheme arises naturally after
    pointing out the relationship between the standard Gibbs sampler and the chain
    rule used for sampling purposes. Numerical simulations involving simple and
    real inference problems confirm the excellent performance of the proposed
    scheme in terms of accuracy and computational efficiency. In particular we give
    empirical evidence of performance in a toy example, inference of Gaussian
    processes hyperparameters, and learning dependence graphs through regression.
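
    The idea can be sketched on a bivariate Gaussian, where both full
    conditionals are available in closed form. The recycled estimator below also
    averages over the intermediate states produced after each coordinate update
    (a toy illustration of the scheme, not the paper's experiments):

        import numpy as np

        rho, n_scans = 0.8, 20000
        rng = np.random.default_rng(0)
        x = np.zeros(2)
        std_samples, rec_samples = [], []
        for _ in range(n_scans):
            x[0] = rho * x[1] + np.sqrt(1 - rho ** 2) * rng.standard_normal()
            rec_samples.append(x.copy())    # auxiliary state, usually discarded
            x[1] = rho * x[0] + np.sqrt(1 - rho ** 2) * rng.standard_normal()
            rec_samples.append(x.copy())
            std_samples.append(x.copy())    # standard Gibbs keeps one per scan

        f = lambda s: s[:, 0] ** 2          # estimate E[x0^2] (true value: 1)
        print(f(np.array(std_samples)).mean(), f(np.array(rec_samples)).mean())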

    Measuring Sample Quality with Diffusions

    Jack Gorham, Andrew B. Duncan, Sebastian J. Vollmer, Lester Mackey
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Probability (math.PR)

    Standard Markov chain Monte Carlo diagnostics, like effective sample size,
    are ineffective for biased sampling procedures that sacrifice asymptotic
    correctness for computational speed. Recent work addresses this issue for a
    class of strongly log-concave target distributions by constructing a computable
    discrepancy measure based on Stein’s method that provably determines
    convergence to the target. We generalize this approach to cover any target with
    a fast-coupling Itô diffusion by bounding the derivatives of Stein equation
    solutions in terms of Markov process coupling rates. As example applications,
    we develop computable and convergence-determining diffusion Stein discrepancies
    for log-concave, heavy-tailed, and multimodal targets and use these quality
    measures to select the hyperparameters of biased samplers, compare random and
    deterministic quadrature rules, and quantify bias-variance tradeoffs in
    approximate Markov chain Monte Carlo. Our explicit multivariate Stein factor
    bounds may be of independent interest.


    Information Theory

    Non-Orthogonal Multiple Access (NOMA) for Downlink Multiuser MIMO Systems: User Clustering, Beamforming, and Power Allocation

    Md Shipon Ali, Ekram Hossain, Dong In Kim
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    We investigate the application of non-orthogonal multiple access (NOMA) with
    successive interference cancellation (SIC) in downlink multiuser multiple-input
    multiple-output (MIMO) cellular systems, where the total number of receive
    antennas at user equipment (UE) ends in a cell is more than the number of
    transmit antennas at the base station (BS). We first dynamically group the UE
    receive antennas into a number of clusters equal to or more than the number of
    BS transmit antennas. A single beamforming vector is then shared by all the
    receive antennas in a cluster. We propose a linear beamforming technique in
    which all the receive antennas can significantly cancel the inter-cluster
    interference. On the other hand, the receive antennas in each cluster are
    scheduled on power domain NOMA basis with SIC at the receiver ends. For
    inter-cluster and intra-cluster power allocation, we provide dynamic power
    allocation solutions with the objective of maximizing the overall cell capacity.
    An extensive performance evaluation is carried out for the proposed MIMO-NOMA
    system and the results are compared with those for conventional orthogonal
    multiple access (OMA)-based MIMO systems and other existing MIMO-NOMA
    solutions. The numerical results quantify the capacity gain of the proposed
    MIMO-NOMA model over MIMO-OMA and other existing MIMO-NOMA solutions.
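
    The power-domain NOMA principle behind the intra-cluster scheduling is easy
    to state numerically. The sketch below (illustrative two-user numbers, not
    the paper's allocation algorithm) compares per-user NOMA rates under SIC with
    an orthogonal baseline:

        import numpy as np

        P, N0 = 1.0, 1.0                 # total transmit power, noise power
        g1, g2 = 10.0, 1.0               # channel gains: user 1 strong, user 2 weak
        a1, a2 = 0.2, 0.8                # power split: more power to the weak user

        # NOMA with SIC: the strong user decodes and cancels the weak user's
        # signal; the weak user treats the strong user's signal as interference.
        R1_noma = np.log2(1 + a1 * P * g1 / N0)
        R2_noma = np.log2(1 + a2 * P * g2 / (a1 * P * g2 + N0))

        # OMA baseline: each user gets half of the orthogonal resource.
        R1_oma = 0.5 * np.log2(1 + P * g1 / N0)
        R2_oma = 0.5 * np.log2(1 + P * g2 / N0)

        print(R1_noma + R2_noma, R1_oma + R2_oma)   # NOMA sum rate is larger here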

    Efficient Feedback Mechanisms for FDD Massive MIMO under User-level Cooperation

    Junting Chen, Haifan Yin, Laura Cottatellucci, David Gesbert
    Comments: 13 pages
    Subjects: Information Theory (cs.IT)

    Channel state information (CSI) feedback is a challenging issue in frequency
    division multiplexing (FDD) massive MIMO systems. This paper studies a
    cooperative feedback scheme, where the users first exchange their CSI with each
    other by exploiting device-to-device (D2D) communications, then compute the
    precoder by themselves, and feed back the precoder to the base station (BS).
    Analytical results are derived to show that the cooperative precoder feedback
    is more efficient than the CSI feedback in terms of interference mitigation.
    Under the constraint of limited D2D communication capacity, we develop an
    adaptive CSI exchange strategy based on signal subspace projection and optimal
    bit partition. Numerical results demonstrate that the proposed cooperative
    precoder feedback scheme with adaptive CSI exchange significantly outperforms
    the CSI feedback scheme, even when the CSI is exchanged via rate-limited D2D
    communications.

    What to Expect When You Are Expecting on the Grassmannian

    Armin Eftekhari, Laura Balzano, Michael B. Wakin
    Subjects: Information Theory (cs.IT)

    Consider an incoming sequence of vectors, all belonging to an unknown
    subspace (S), and each with many missing entries. In order to
    estimate (S), it is common to partition the data into blocks and
    iteratively update the estimate of (S) with each new incoming
    measurement block.

    In this paper, we investigate a rather basic question: Is it possible to
    identify (S) by averaging the column span of the partially
    observed incoming measurement blocks on the Grassmannian?

    We find that in general the span of the incoming blocks is in fact a biased
    estimator of (S) when data suffers from erasures, and we find an
    upper bound for this bias. We reach this conclusion by examining the defining
    optimization program for the Fréchet expectation on the Grassmannian, and
    with the aid of a sharp perturbation bound and standard large deviation
    results.

    Comparing entropy rates on finite and infinite rooted trees with length functions

    Thomas Hirschler, Wolfgang Woess
    Subjects: Information Theory (cs.IT); Probability (math.PR)

    We consider stochastic processes with (or without) memory whose evolution is
    encoded by a finite or infinite rooted tree. The main goal is to compare the
    entropy rates of a given base process and a second one, to be considered as a
    perturbation of the former. The processes are described by probability measures
    on the boundary of the given tree, and by corresponding forward transition
    probabilities at the inner nodes. The comparison is in terms of
    Kullback-Leibler divergence. We elaborate and extend ideas and results of
    Böcherer and Amjad. Our extensions involve length functions on the edges of
    the tree as well as nodes with countably many successors. In the last part, we
    consider trees with infinite geodesic rays and random perturbations of a given
    process.

    New bounds of permutation codes under Hamming metric and Kendall's (τ)-metric

    Xin Wang, Yiwei Zhang, Yiting Yang, Gennian Ge
    Subjects: Information Theory (cs.IT)

    Permutation codes are widely studied objects due to their numerous
    applications in various areas, such as power line communications, block
    ciphers, and the rank modulation scheme for flash memories. Several kinds of
    metrics are considered for permutation codes according to their specific
    applications. This paper concerns some improvements on the bounds of
    permutation codes under Hamming metric and Kendall’s (τ)-metric
    respectively, using mainly a graph coloring approach. Specifically, under
    Hamming metric, we improve the Gilbert-Varshamov bound asymptotically by a
    factor (n), when the minimum Hamming distance (d) is fixed and the code length
    (n) goes to infinity. Under Kendall’s (τ)-metric, we narrow the gap between
    the known lower bounds and upper bounds. Besides, we also obtain some sporadic
    results under Kendall’s (τ)-metric for small parameters.

    Distance verification for classical and quantum LDPC codes

    Ilya Dumer, Alexey A. Kovalev, Leonid P. Pryadko
    Comments: 16 pages, 2 figures
    Subjects: Information Theory (cs.IT); Quantum Physics (quant-ph)

    The techniques of distance verification known for general linear codes are
    re-applied to quantum stabilizer codes. Then distance verification is addressed
    for classical and quantum LDPC codes. New complexity bounds for distance
    verification with provable performance are derived using the average weight
    spectra of the ensembles of LDPC codes. These bounds are expressed in terms of
    the erasure-correcting capacity of the corresponding ensemble. We also present
    a new irreducible-cluster technique that can be applied to any LDPC code and
    takes advantage of parity-checks’ sparsity for both classical and quantum LDPC
    codes. This technique reduces complexity exponents of all existing
    deterministic techniques designed for generic stabilizer codes with small
    relative distances, which also include all known families of quantum LDPC
    codes.

    Outage Effective Capacity of Buffer-Aided Diamond Relay Systems Using HARQ with Incremental Redundancy

    Deli Qiao
    Subjects: Information Theory (cs.IT)

    In this paper, transmission over buffer-aided diamond relay systems under
    statistical quality of service (QoS) constraints is studied. The statistical
    QoS constraints are imposed as limitations on delay violation probabilities. In
    the absence of channel state information (CSI) at the transmitter, truncated
    hybrid automatic repeat request-incremental redundancy (HARQ-IR) is
    incorporated to make better use of the wireless channel and the resources for
    each communication link. The packets that cannot be successfully received upon
    the maximum number of transmissions will be removed from the buffer, i.e., an
    outage occurs. The outage effective capacity of a communication link is defined
    as the maximum constant arrival rate to the source that can be supported by the
    goodput departure processes, i.e., the departures that can be
    successfully received by the receiver. Then, the outage effective capacity for
    the buffer-aided diamond relay system is obtained for the HARQ-IR incorporated
    transmission strategy under end-to-end delay constraints. In
    comparison with the decode-and-forward (DF) protocol with perfect CSI at the
    transmitters, it is shown that HARQ-IR can achieve superior performance when
    the SNR levels at the relay are not very high or when the delay constraints are
    stringent.
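
    Effective capacity itself has a compact Monte-Carlo form: for a QoS exponent
    theta imposed by the delay constraint, EC(theta) = -(1/theta) log E[exp(-theta S)]
    for the per-block service S. A toy estimate (Rayleigh-fading service with
    hypothetical numbers; the paper's outage and HARQ-IR mechanics are not
    modeled) looks like this:

        import numpy as np

        rng = np.random.default_rng(1)
        theta = 0.1                  # QoS exponent: larger = stricter delay constraint
        snr = 3.0
        # per-block service (bits/s/Hz) under Rayleigh fading: |h|^2 ~ Exp(1)
        service = np.log2(1 + snr * rng.exponential(size=100_000))
        ec = -np.log(np.mean(np.exp(-theta * service))) / theta
        print(ec)                    # approaches E[service] as theta -> 0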



