IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Tue, 6 Dec 2016

    我爱机器学习(52ml.net)发表于 2016-12-06 00:00:00
    love 0

    Neural and Evolutionary Computing

    BrainFrame: A heterogeneous accelerator platform for neuron simulations

    Georgios Smaragdos, Georgios Chatzikonstantis, Rahul Kukreja, Harrys Sidiropoulos, Dimitrios Rodopoulos, Ioannis Sourdis, Zaid Al-Ars, Christoforos Kachris, Dimitrios Soudris, Chris I. De Zeeuw, Christos Strydis
    Comments: 17 pages, 19 figures, 4 tables
    Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC)

    Objective: The advent of High-Performance Computing (HPC) in recent years has
    led to its increasing use in brain study through computational models. The
    scale and complexity of such models are constantly increasing, leading to
    challenging computational requirements. Even though modern HPC platforms can
    often deal with such challenges, the vast diversity of the modeling field does
    not permit for a single acceleration (or homogeneous) platform to effectively
    address the complete array of modeling requirements. Approach: In this paper we
    propose and build BrainFrame, a heterogeneous acceleration platform,
    incorporating three distinct acceleration technologies, a Dataflow Engine, a
    Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform.
    As a challenging proof of concept, we analyze the performance of BrainFrame on
    different instances of a state-of-the-art neuron model, modeling the Inferior-
    Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley
    representation. The model instances take into account not only the neuronal-
    network dimensions but also different network-connectivity circumstances that
    can drastically change application workload characteristics. Main results: The
    synthetic approach of three HPC technologies demonstrated that BrainFrame is
    better able to cope with the modeling diversity encountered. Our performance
    analysis shows clearly that the model directly affect performance and all three
    technologies are required to cope with all the model use cases.

    Enabling Bio-Plausible Multi-level STDP using CMOS Neurons with Dendrites and Bistable RRAMs

    Xinyu Wu, Vishal Saxena
    Subjects: Emerging Technologies (cs.ET); Neural and Evolutionary Computing (cs.NE)

    Large-scale integration of emerging nanoscale non-volatile memory devices,
    e.g. resistive random-access memory (RRAM), can enable a new generation of
    neuromorphic computers that can solve a wide range of machine learning
    problems. Such hybrid CMOS-RRAM neuromorphic architectures will result in
    several orders of magnitude reduction in energy consumption at a very small
    form factor, and herald autonomous learning machines capable of self-adapting
    to their environment. However, the progress in this area has been impeded from
    the realization that the actual memory devices fall well short of their
    expected behavior. In this work, we discuss the challenges associated with
    these memory devices and their use in neuromorphic computing circuits, and
    propose pathways to overcome these limitations by introducing ‘dendritic
    learning’.

    Message Passing Multi-Agent GANs

    Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri
    Comments: The first 2 authors contributed equally for this work
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Communicating and sharing intelligence among agents is an important facet of
    achieving Artificial General Intelligence. As a first step towards this
    challenge, we introduce a novel framework for image generation: Message Passing
    Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have
    recently been shown to be very effective for image generation and other tasks,
    these networks have been limited to mostly single generator-discriminator
    networks. We show that we can obtain multi-agent GANs that communicate through
    message passing to achieve better image generation. The objectives of the
    individual agents in this framework are two fold: a co-operation objective and
    a competing objective. The co-operation objective ensures that the message
    sharing mechanism guides the other generator to generate better than itself
    while the competing objective encourages each generator to generate better than
    its counterpart. We analyze and visualize the messages that these GANs share
    among themselves in various scenarios. We quantitatively show that the message
    sharing formulation serves as a regularizer for the adversarial training.
    Qualitatively, we show that the different generators capture different traits
    of the underlying data distribution.

    Known Unknowns: Uncertainty Quality in Bayesian Neural Networks

    Ramon Oliveira, Pedro Tabacof, Eduardo Valle
    Comments: Workshop on Bayesian Deep Learning, NIPS 2016, Barcelona, Spain
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We evaluate the uncertainty quality in neural networks using anomaly
    detection. We extract uncertainty measures (e.g. entropy) from the predictions
    of candidate models, use those measures as features for an anomaly detector,
    and gauge how well the detector differentiates known from unknown classes. We
    assign higher uncertainty quality to candidate models that lead to better
    detectors. We also propose a novel method for sampling a variational
    approximation of a Bayesian neural network, called One-Sample Bayesian
    Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We
    compare the following candidate neural network models: Maximum Likelihood,
    Bayesian Dropout, OSBA, and — for MNIST — the standard variational
    approximation. We show that Bayesian Dropout and OSBA provide better
    uncertainty information than Maximum Likelihood, and are essentially equivalent
    to the standard variational approximation, but much faster.

    Semi-supervised learning of deep metrics for stereo reconstruction

    Stepan Tulyakov, Anton Ivanov, Francois Fleuret
    Comments: 11 pages, 3 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Deep-learning metrics have recently demonstrated extremely good performance
    to match image patches for stereo reconstruction. However, training such
    metrics requires large amount of labeled stereo images, which can be difficult
    or costly to collect for certain applications. The main contribution of our
    work is a new semi-supervised method for learning deep metrics from unlabeled
    stereo images, given coarse information about the scenes and the optical
    system. Our method alternatively optimizes the metric with a standard
    stochastic gradient descent, and applies stereo constraints to regularize its
    prediction. Experiments on reference data-sets show that, for a given network
    architecture, training with this new method without ground-truth produces a
    metric with performance as good as state-of-the-art baselines trained with the
    said ground-truth. This work has three practical implications. Firstly, it
    helps to overcome limitations of training sets, in particular noisy ground
    truth. Secondly it allows to use much more training data during learning.
    Thirdly, it allows to tune deep metric for a particular stereo system, even if
    ground truth is not available.

    Positive blood culture detection in time series data using a BiLSTM network

    Leen De Baets, Joeri Ruyssinck, Thomas Peiffer, Johan Decruyenaere, Filip De Turck, Femke Ongenae, Tom Dhaene
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

    The presence of bacteria or fungi in the bloodstream of patients is abnormal
    and can lead to life-threatening conditions. A computational model based on a
    bidirectional long short-term memory artificial neural network, is explored to
    assist doctors in the intensive care unit to predict whether examination of
    blood cultures of patients will return positive. As input it uses nine
    monitored clinical parameters, presented as time series data, collected from
    2177 ICU admissions at the Ghent University Hospital. Our main goal is to
    determine if general machine learning methods and more specific, temporal
    models, can be used to create an early detection system. This preliminary
    research obtains an area of 71.95% under the precision recall curve, proving
    the potential of temporal neural networks in this context.

    Parameter Compression of Recurrent Neural Networks and Degredation of Short-term Memory

    Jonathan A. Cox
    Comments: Submitted to IJCNN 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    The significant computational costs of deploying neural networks in
    large-scale or resource constrained environments, such as data centers and
    mobile devices, has spurred interest in model compression, which can achieve a
    reduction in both arithmetic operations and storage memory. Several techniques
    have been proposed for reducing or compressing the parameters for feed-forward
    and convolutional neural networks, but less is understood about the effect of
    parameter compression on recurrent neural networks (RNN). In particular, the
    extent to which the recurrent parameters can be compressed and the impact on
    short-term memory performance, is not well understood. In this paper, we study
    the effect of complexity reduction, through singular value decomposition rank
    reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several
    tasks. We show that considerable rank reduction is possible when compressing
    recurrent weights, even without fine tuning. Furthermore, we propose a
    perturbation model for the effect of general perturbations, such as a
    compression, on the recurrent parameters of RNNs. The model is tested against a
    noiseless memorization experiment that elucidates the short-term memory
    performance. In this way, we demonstrate that the effect of compression of
    recurrent parameters is dependent on the degree of temporal coherence present
    in the data and task. This work can guide on-the-fly RNN compression for novel
    environments or tasks, and provides insight for applying RNN compression in
    low-power devices, such as hearing aids.


    Computer Vision and Pattern Recognition

    ROAM: a Rich Object Appearance Model with Application to Rotoscoping

    Ondrej Miksik, Juan-Manuel Pérez-Rúa, Philip H. S. Torr, Patrick Pérez
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Rotoscoping, the detailed delineation of scene elements through a video shot,
    is a painstaking task of tremendous importance in professional post-production
    pipelines. While pixel-wise segmentation techniques can help for this task,
    professional rotoscoping tools rely on parametric curves that offer the artists
    a much better interactive control on the definition, editing and manipulation
    of the segments of interest. Sticking to this prevalent rotoscoping paradigm,
    we propose a novel framework to capture and track the visual aspect of an
    arbitrary object in a scene, given a first closed outline of this object. This
    model combines a collection of local foreground/background appearance models
    spread along the outline, a global appearance model of the enclosed object and
    a set of distinctive foreground landmarks. The structure of this rich
    appearance model allows simple initialization, efficient iterative optimization
    with exact minimization at each step, and on-line adaptation in videos. We
    demonstrate qualitatively and quantitatively the merit of this framework
    through comparisons with tools based on either dynamic segmentation with a
    closed curve or pixel-wise binary labelling.

    Authoring image decompositions with generative models

    Jason Rock, Theerasit Issaranon, Aditya Deshpande, David Forsyth
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We show how to extend traditional intrinsic image decompositions to
    incorporate further layers above albedo and shading. It is hard to obtain data
    to learn a multi-layer decomposition. Instead, we can learn to decompose an
    image into layers that are “like this” by authoring generative models for each
    layer using proxy examples that capture the Platonic ideal (Mondrian images for
    albedo; rendered 3D primitives for shading; material swatches for shading
    detail). Our method then generates image layers, one from each model, that
    explain the image. Our approach rests on innovation in generative models for
    images. We introduce a Convolutional Variational Auto Encoder (conv-VAE), a
    novel VAE architecture that can reconstruct high fidelity images. The approach
    is general, and does not require that layers admit a physical interpretation.

    Articulated Multi-person Tracking in the Wild

    Eldar Insafutdinov, Mykhaylo Andriluka, Leonid Pishchulin, Siyu Tang, Bjoern Andres, Bernt Schiele
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we propose an approach for articulated tracking of multiple
    people in unconstrained videos. Our starting point is a model that resembles
    existing architectures for single-frame pose estimation but is several orders
    of magnitude faster. We achieve this in two ways: (1) by simplifying and
    sparsifying the body-part relationship graph and leveraging recent methods for
    faster inference, and (2) by offloading a substantial share of computation onto
    a feed-forward convolutional architecture that is able to detect and associate
    body joints of the same person even in clutter. We use this model to generate
    proposals for body joint locations and formulate articulated tracking as
    spatio-temporal grouping of such proposals. This allows to jointly solve the
    association problem for all people in the scene by propagating evidence from
    strong detections through time and enforcing constraints that each proposal can
    be assigned to one person only. We report results on a public MPII Human Pose
    benchmark and on a new dataset of videos with multiple people. We demonstrate
    that our model achieves state-of-the-art results while using only a fraction of
    time and is able to leverage temporal information to improve state-of-the-art
    for crowded scenes.

    ImageNet pre-trained models with batch normalization

    Marcel Simon, Erik Rodner, Joachim Denzler
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Convolutional neural networks (CNN) pre-trained on ImageNet are the backbone
    of most state-of-the-art approaches. In this paper, we present a new set of
    pre-trained models with popular state-of-the-art architectures for the Caffe
    framework. The first release includes Residual Networks (ResNets) with
    generation script as well as the batch-normalization-variants of AlexNet and
    VGG19. All models outperform previous models with the same architecture. The
    models and training code are available at
    this http URL and
    this https URL

    A Distance Function for Comparing Straight-Edge Geometric Figures

    Apoorva Honnegowda Roopa, Shrisha Rao
    Comments: 29 pages, 12 figures including appendices
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG)

    This paper defines a distance function that measures the dissimilarity
    between planar geometric figures formed with straight lines. This function can
    in turn be used in partial matching of different geometric figures. For a given
    pair of geometric figures that are graphically isomorphic, one function
    measures the angular dissimilarity and another function measures the edge
    length disproportionality. The distance function is then defined as the convex
    sum of these two functions. The novelty of the presented function is that it
    satisfies all properties of a distance function and the computation of the same
    is done by projecting appropriate features to a cartesian plane. To compute the
    deviation from the angular similarity property, the Euclidean distance between
    the given angular pairs and the corresponding points on the (y=x) line is
    measured. Further while computing the deviation from the edge length
    proportionality property, the best fit line, for the set of edge lengths, which
    passes through the origin is found, and the Euclidean distance between the
    given edge length pairs and the corresponding point on a (y=mx) line is
    calculated. Iterative Proportional Fitting Procedure (IPFP) is used to find
    this best fit line. We demonstrate the behavior of the defined function for
    some sample pairs of figures.

    From One-Trick Ponies to All-Rounders: On-Demand Learning for Image Restoration

    Ruohan Gao, Kristen Grauman
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    While machine learning approaches to image restoration offer great promise,
    current methods risk training “one-trick ponies” that perform well only for
    image corruption of a particular level of difficulty—such as a certain level
    of noise or blur. First, we expose the weakness of today’s one-trick pony and
    demonstrate that training general models equipped to handle arbitrary levels of
    corruption is indeed non-trivial. Then, we propose an on-demand learning
    algorithm for training image restoration models with deep convolutional neural
    networks. The main idea is to exploit a feedback mechanism to self-generate
    training instances where they are needed most, thereby learning models that can
    generalize across difficulty levels. On four restoration tasks—image
    inpainting, pixel interpolation, image deblurring, and image denoising—and
    three diverse datasets, our approach consistently outperforms both the status
    quo training procedure and curriculum learning alternatives.

    Human-In-The-Loop Person Re-Identification

    Hanxiao Wang, Shaogang Gong, Xiatian Zhu, Tao Xiang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Current person re-identification (re-id) methods assume that (1) pre-labelled
    training data are available for every camera pair, (2) the gallery size for
    re-identification is moderate. Both assumptions scale poorly to real-world
    applications when camera network size increases and gallery size becomes large.
    Human verification of automatic model ranked re-id results becomes inevitable.
    In this work, a novel human-in-the-loop re-id model based on Human Verification
    Incremental Learning (HVIL) is formulated which does not require any
    pre-labelled training data to learn a model, therefore readily scalable to new
    camera pairs. This HVIL model learns cumulatively from human feedback to
    provide instant improvement to re-id ranking of each probe on-the-fly enabling
    the model scalable to large gallery sizes. We further formulate a Regularised
    Metric Ensemble Learning (RMEL) model to combine a series of incrementally
    learned HVIL models into a single ensemble model to be used when human feedback
    becomes unavailable.

    Highly Efficient Regression for Scalable Person Re-Identification

    Hanxiao Wang, Shaogang Gong, Tao Xiang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Existing person re-identification models are poor for scaling up to large
    data required in real-world applications due to: (1) Complexity: They employ
    complex models for optimal performance resulting in high computational cost for
    training at a large scale; (2) Inadaptability: Once trained, they are
    unsuitable for incremental update to incorporate any new data available. This
    work proposes a truly scalable solution to re-id by addressing both problems.
    Specifically, a Highly Efficient Regression (HER) model is formulated by
    embedding the Fisher’s criterion to a ridge regression model for very fast
    re-id model learning with scalable memory/storage usage. Importantly, this new
    HER model supports faster than real-time incremental model updates therefore
    making real-time active learning feasible in re-id with human-in-the-loop.
    Extensive experiments show that such a simple and fast model not only
    outperforms notably the state-of-the-art re-id methods, but also is more
    scalable to large data with additional benefits to active learning for reducing
    human labelling effort in re-id deployment.

    Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection

    Dimitrios Marmanis, Konrad Schindler, Jan Dirk Wegner, Silvano Galliani, Mihai Datcu, Uwe Stilla
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present an end-to-end trainable deep convolutional neural network (DCNN)
    for semantic segmentation with built-in awareness of semantically meaningful
    boundaries. Semantic segmentation is a fundamental remote sensing task, and
    most state-of-the-art methods rely on DCNNs as their workhorse. A major reason
    for their success is that deep networks learn to accumulate contextual
    information over very large windows (receptive fields). However, this success
    comes at a cost, since the associated loss of effecive spatial resolution
    washes out high-frequency details and leads to blurry object boundaries. Here,
    we propose to counter this effect by combining semantic segmentation with
    semantically informed edge detection, thus making class-boundaries explicit in
    the model, First, we construct a comparatively simple, memory-efficient model
    by adding boundary detection to the Segnet encoder-decoder architecture.
    Second, we also include boundary detection in FCN-type models and set up a
    high-end classifier ensemble. We show that boundary detection significantly
    improves semantic segmentation with CNNs. Our high-end ensemble achieves > 90%
    overall accuracy on the ISPRS Vaihingen benchmark.

    Stereo image de-fencing using smartphones

    Sankaraganesh Jonna, Sukla Satapathy, Rajiv R. Sahay
    Comments: Under review as a conference paper
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Conventional approaches to image de-fencing have limited themselves to using
    only image data in adjacent frames of the captured video of an approximately
    static scene. In this work, we present a method to harness disparity using a
    stereo pair of fenced images in order to detect fence pixels. Tourists and
    amateur photographers commonly carry smartphones/phablets which can be used to
    capture a short video sequence of the fenced scene. We model the formation of
    the occluded frames in the captured video. Furthermore, we propose an
    optimization framework to estimate the de-fenced image using the total
    variation prior to regularize the ill-posed problem.

    Message Passing Multi-Agent GANs

    Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri
    Comments: The first 2 authors contributed equally for this work
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Communicating and sharing intelligence among agents is an important facet of
    achieving Artificial General Intelligence. As a first step towards this
    challenge, we introduce a novel framework for image generation: Message Passing
    Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have
    recently been shown to be very effective for image generation and other tasks,
    these networks have been limited to mostly single generator-discriminator
    networks. We show that we can obtain multi-agent GANs that communicate through
    message passing to achieve better image generation. The objectives of the
    individual agents in this framework are two fold: a co-operation objective and
    a competing objective. The co-operation objective ensures that the message
    sharing mechanism guides the other generator to generate better than itself
    while the competing objective encourages each generator to generate better than
    its counterpart. We analyze and visualize the messages that these GANs share
    among themselves in various scenarios. We quantitatively show that the message
    sharing formulation serves as a regularizer for the adversarial training.
    Qualitatively, we show that the different generators capture different traits
    of the underlying data distribution.

    Point Pair Feature based Object Detection for Random Bin Picking

    Wim Abbeloos, Toon Goedemé
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Point pair features are a popular representation for free form 3D object
    detection and pose estimation. In this paper, their performance in an
    industrial random bin picking context is investigated. A new method to generate
    representative synthetic datasets is proposed. This allows to investigate the
    influence of a high degree of clutter and the presence of self similar
    features, which are typical to our application. We provide an overview of
    solutions proposed in literature and discuss their strengths and weaknesses. A
    simple heuristic method to drastically reduce the computational complexity is
    introduced, which results in improved robustness, speed and accuracy compared
    to the naive approach.

    Panoramic Structure from Motion via Geometric Relationship Detection

    Satoshi Ikehata, Ivaylo Boyadzhiev, Qi Shan, Yasutaka Furukawa
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper addresses the problem of Structure from Motion (SfM) for indoor
    panoramic image streams, extremely challenging even for the state-of-the-art
    due to the lack of textures and minimal parallax. The key idea is the fusion of
    single-view and multi-view reconstruction techniques via geometric relationship
    detection (e.g., detecting 2D lines as coplanar in 3D). Rough geometry suffices
    to perform such detection, and our approach utilizes rough surface normal
    estimates from an image-to-normal deep network to discover geometric
    relationships among lines. The detected relationships provide exact geometric
    constraints in our line-based linear SfM formulation. A constrained linear
    least squares is used to reconstruct a 3D model and camera motions, followed by
    the bundle adjustment. We have validated our algorithm on challenging datasets,
    outperforming various state-of-the-art reconstruction techniques.

    Deep Image Category Discovery using a Transferred Similarity Function

    Yen-Chang Hsu, Zhaoyang Lv, Zsolt Kira
    Comments: 13 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Automatically discovering image categories in unlabeled natural images is one
    of the important goals of unsupervised learning. However, the task is
    challenging and even human beings define visual categories based on a large
    amount of prior knowledge. In this paper, we similarly utilize prior knowledge
    to facilitate the discovery of image categories. We present a novel end-to-end
    network to map unlabeled images to categories as a clustering network. We
    propose that this network can be learned with contrastive loss which is only
    based on weak binary pair-wise constraints. Such binary constraints can be
    learned from datasets in other domains as transferred similarity functions,
    which mimic a simple knowledge transfer. We first evaluate our experiments on
    the MNIST dataset as a proof of concept, based on predicted similarities
    trained on Omniglot, showing a 99\% accuracy which significantly outperforms
    clustering based approaches. Then we evaluate the discovery performance on
    Cifar-10, STL-10, and ImageNet, which achieves both state-of-the-art accuracy
    and shows it can be scalable to various large natural images.

    Cancerous Nuclei Detection and Scoring in Breast Cancer Histopathological Images

    Pegah Faridi, Habibollah Danyali, Mohammad Sadegh Helfroush, Mojgan Akbarzadeh Jahromi
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Early detection and prognosis of breast cancer are feasible by utilizing
    histopathological grading of biopsy specimens. This research is focused on
    detection and grading of nuclear pleomorphism in histopathological images of
    breast cancer. The proposed method consists of three internal steps. First,
    unmixing colors of H&E is used in the preprocessing step. Second, nuclei
    boundaries are extracted incorporating the center of cancerous nuclei which are
    detected by applying morphological operations and Difference of Gaussian filter
    on the preprocessed image. Finally, segmented nuclei are scored to accomplish
    one parameter of the Nottingham grading system for breast cancer. In this
    approach, the nuclei area, chromatin density, contour regularity, and nucleoli
    presence, are features for nuclear pleomorphism scoring. Experimental results
    showed that the proposed algorithm, with an accuracy of 86.6%, made significant
    advancement in detecting cancerous nuclei compared to existing methods in the
    related literature.

    Turning an Urban Scene Video into a Cinemagraph

    Hang Yan, Yebin Liu, Yasutaka Furukawa
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes an algorithm that turns a regular video capturing urban
    scenes into a high-quality endless animation, known as a Cinemagraph. The
    creation of a Cinemagraph usually requires a static camera in a carefully
    configured scene. The task becomes challenging for a regular video with a
    moving camera and objects. Our approach first warps an input video into the
    viewpoint of a reference camera. Based on the warped video, we propose
    effective temporal analysis algorithms to detect regions with static geometry
    and dynamic appearance, where geometric modeling is reliable and visually
    attractive animations can be created. Lastly, the algorithm applies a sequence
    of video processing techniques to produce a Cinemagraph movie. We have tested
    the proposed approach on numerous challenging real scenes. To our knowledge,
    this work is the first to automatically generate Cinemagraph animations from
    regular movies in the wild.

    Multi-way Particle Swarm Fusion

    Chen Liu, Hang Yan, Pushmeet Kohli, Yasutaka Furukawa
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes a novel MAP inference framework for Markov Random Field
    (MRF) in parallel computing environments. The inference framework, dubbed Swarm
    Fusion, is a natural generalization of the Fusion Move method. Every thread (in
    a case of multi-threading environments) maintains and updates a solution. At
    each iteration, a thread can generate arbitrary number of solution proposals
    and take arbitrary number of concurrent solutions from the other threads to
    perform multi-way fusion in updating its solution. The framework is general,
    making popular existing inference techniques such as alpha-expansion, fusion
    move, parallel alpha-expansion, and hierarchical fusion, its special cases. We
    have evaluated the effectiveness of our approach against competing methods on
    three problems of varying difficulties, in particular, the stereo, the optical
    flow, and the layered depthmap estimation problems.

    Deep Pyramidal Residual Networks with Separated Stochastic Depth

    Yoshihiro Yamada, Masakazu Iwamura, Koichi Kise
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    On general object recognition, Deep Convolutional Neural Networks (DCNNs)
    achieve high accuracy. In particular, ResNet and its improvements have broken
    the lowest error rate records. In this paper, we propose a method to
    successfully combine two ResNet improvements, ResDrop and PyramidNet. We
    confirmed that the proposed network outperformed the conventional methods; on
    CIFAR-100, the proposed network achieved an error rate of 16.18% in contrast to
    PiramidNet achieving that of 18.29% and ResNeXt 17.31%.

    Local Blur Mapping: Exploiting High-Level Semantics by Deep Neural Networks

    Kede Ma, Huan Fu, Tongliang Liu, Zhou Wang, Dacheng Tao
    Comments: Tech report
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The human visual system excels at detecting local blur of visual images, but
    the underlying mechanism is mysterious. Traditional views of blur such as
    reduction in local or global high-frequency energy and loss of local phase
    coherence have fundamental limitations. For example, they cannot well
    discriminate flat regions from blurred ones. Here we argue that high-level
    semantic information is critical in successfully detecting local blur.
    Therefore, we resort to deep neural networks that are proficient in learning
    high-level features and propose the first end-to-end local blur mapping
    algorithm based on a fully convolutional network (FCN). We empirically show
    that high-level features of deeper layers indeed play a more important role
    than low-level features of shallower layers in resolving challenging
    ambiguities for this task. We test the proposed method on a standard blur
    detection benchmark and demonstrate that it significantly advances the
    state-of-the-art (ODS F-score of 0.853). In addition, we explore the use of the
    generated blur map in three applications, including blur region segmentation,
    blur degree estimation, and blur magnification.

    Deep Multi-Modal Image Correspondence Learning

    Chen Liu, Jiajun Wu, Pushmeet Kohli, Yasutaka Furukawa
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Inference of correspondences between images from different modalities is an
    extremely important perceptual ability that enables humans to understand and
    recognize cross-modal concepts. In this paper, we consider an instance of this
    problem that involves matching photographs of building interiors with their
    corresponding floorplan. This is a particularly challenging problem because a
    floorplan, as a stylized architectural drawing, is very different in appearance
    from a color photograph. Furthermore, individual photographs by themselves
    depict only a part of a floorplan (e.g., kitchen, bathroom, and living room).
    We propose the use of a number of different neural network architectures for
    this task, which are trained and evaluated on a novel large-scale dataset of 5
    million floorplan images and 80 million associated photographs. Experimental
    evaluation reveals that our neural network architectures are able to identify
    visual cues that result in reliable matches across these two quite different
    modalities. In fact, the trained networks are able to even outperform human
    subjects in several challenging image matching problems. Our result implies
    that neural networks are effective at perceptual tasks that require long
    periods of reasoning even for humans to solve.

    Learnable Structured Clustering Framework for Deep Metric Learning

    Hyun Oh Song, Stefanie Jegelka, Vivek Rathod, Kevin Murphy
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Learning the representation and the similarity metric in an end-to-end
    fashion with deep networks have demonstrated outstanding results for clustering
    and retrieval. However, these recent approaches still suffer from the
    performance degradation stemming from the local metric training procedure which
    is unaware of the global structure of the embedding space.

    We propose a global metric learning scheme for optimizing the deep metric
    embedding with the learnable clustering function and the clustering metric
    (NMI) in a novel structured prediction framework.

    Our experiments on CUB200-2011, Cars196, and Stanford online products
    datasets show state of the art performance both on the clustering and retrieval
    tasks measured in the NMI and Recall@K evaluation metrics.

    DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild

    Rıza Alp Güler, George Trigeorgis, Epameinondas Antonakos, Patrick Snape, Stefanos Zafeiriou, Iasonas Kokkinos
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we propose to learn a mapping from image pixels into a dense
    template grid through a fully convolutional network. We formulate this task as
    a regression problem and train our network by leveraging upon manually
    annotated facial landmarks “in-the-wild”. We use such landmarks to establish a
    dense correspondence field between a three-dimensional object template and the
    input image, which then serves as the ground-truth for training our regression
    system. We show that we can combine ideas from semantic segmentation with
    regression networks, yielding a highly-accurate “quantized regression”
    architecture.

    Our system, called DenseReg, allows us to estimate dense image-to-template
    correspondences in a fully convolutional manner. As such our network can
    provide useful correspondence information as a stand-alone system, while when
    used as an initialization for Statistical Deformable Models we obtain landmark
    localization results that largely outperform the current state-of-the-art on
    the challenging 300W benchmark. We thoroughly evaluate our method on a host of
    facial analysis tasks, and also demonstrate its use for other correspondence
    estimation tasks, such as modelling of the human ear. DenseReg code is made
    available at this http URL along with supplementary
    materials.

    Online Localization and Prediction of Actions and Interactions

    Khurram Soomro, Haroon Idrees, Mubarak Shah
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes a person-centric and online approach to the challenging
    problem of localization and prediction of actions and interactions in videos.
    Typically, localization or recognition is performed in an offline manner where
    all the frames in the video are processed together. This prevents timely
    localization and prediction of actions and interactions – an important
    consideration for many tasks including surveillance and human-machine
    interaction.

    In our approach, we estimate human poses at each frame and train
    discriminative appearance models using the superpixels inside the pose bounding
    boxes. Since the pose estimation per frame is inherently noisy, the conditional
    probability of pose hypotheses at current time-step (frame) is computed using
    pose estimations in the current frame and their consistency with poses in the
    previous frames. Next, both the superpixel and pose-based foreground
    likelihoods are used to infer the location of actors at each time through a
    Conditional Random. The issue of visual drift is handled by updating the
    appearance models, and refining poses using motion smoothness on joint
    locations, in an online manner. For online prediction of action (interaction)
    confidences, we propose an approach based on Structural SVM that operates on
    short video segments, and is trained with the objective that confidence of an
    action or interaction increases as time progresses. Lastly, we quantify the
    performance of both detection and prediction together, and analyze how the
    prediction accuracy varies as a time function of observed action (interaction)
    at different levels of detection performance. Our experiments on several
    datasets suggest that despite using only a few frames to localize actions
    (interactions) at each time instant, we are able to obtain competitive results
    to state-of-the-art offline methods.

    Who is Mistaken?

    Benjamin Eysenbach, Carl Vondrick, Antonio Torralba
    Comments: See project website at: this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recognizing when people have false beliefs is crucial for understanding their
    actions. We introduce the novel problem of identifying when people in abstract
    scenes have incorrect beliefs. We present a dataset of scenes, each visually
    depicting an 8-frame story in which a character has a mistaken belief. We then
    create a representation of characters’ beliefs for two tasks in human action
    understanding: predicting who is mistaken, and when they are mistaken.
    Experiments suggest that our method for identifying mistaken characters
    performs better on these tasks than simple baselines. Diagnostics on our model
    suggest it learns important cues for recognizing mistaken beliefs, such as
    gaze. We believe models of people’s beliefs will have many applications in
    action understanding, robotics, and healthcare.

    General models for rational cameras and the case of two-slit projections

    Matthew Trager, Bernd Sturmfels, John Canny, Martial Hebert, Jean Ponce
    Comments: 9 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The rational camera model recently introduced in [18] provides a general
    methodology for studying abstract nonlinear imaging systems and their
    multi-view geometry. This paper provides a concrete embedding of rational
    cameras explicitly accounting for the mapping between physical visual rays and
    image points, missing in the original model. This allows us to derive simple
    but general analytical expressions for direct and inverse projections, and
    define primitive rational cameras equivalent under the action of various
    projective transformations, leading to a generalized notion of intrinsic
    coordinates in this setting. The methodology is general, but it is illustrated
    concretely by an in-depth study of two-slit cameras, which we describe using a
    pair of linear projections. This simple analytical form allows us to describe
    models for the corresponding primitive cameras, to introduce intrinsic
    parameters with a clear geometric meaning, and to define an epipolar tensor
    characterizing two-view correspondences. In turn, this leads to new algorithms
    for structure from motion and self-calibration.

    A method for the segmentation of images based on thresholding and applied to vesicular textures

    Amelia Carolina Sparavigna
    Comments: Keywords: Segmentation, Edge Detection, Image Analysis, 2D Textures, Texture Functions
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In image processing, a segmentation is a process of partitioning an image
    into multiple sets of pixels, that are defined as super-pixels. Each
    super-pixel is characterized by a label or parameter. Here, we are proposing a
    method for determining the super-pixels based on the thresholding of the image.
    This approach is quite useful for studying the images showing vesicular
    textures.

    Pyramid Scene Parsing Network

    Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Scene parsing is challenging for unrestricted open vocabulary and diverse
    scenes. In this paper, we exploit the capability of global context information
    by different-region-based context aggregation through our pyramid pooling
    module together with the proposed pyramid scene parsing network (PSPNet). Our
    global prior representation is effective to produce good quality results on the
    scene parsing task, while PSPNet provides a superior framework for pixel-level
    prediction tasks. The proposed approach achieves state-of-the-art performance
    on various datasets. It came first in ImageNet scene parsing challenge 2016,
    PASCAL VOC 2012 benchmark and Cityscapes benchmark. A single PSPNet yields new
    record of mIoU score as 85.4% on PASCAL VOC 2012 and 80.2% on Cityscapes.

    Multi-Label Image Classification with Regional Latent Semantic Dependencies

    Junjie Zhang, Qi Wu, Chunhua Shen, Jian Zhang, Jianfeng Lu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recent state-of-the-art approaches of multi-label image classification
    exploit the label dependencies at the global level, largely improving the
    labeling ability. However, accurately predicting small objects and visual
    concepts is still challenging due to the limited discrimination of the global
    visual features. In this paper, we propose a Regional Latent Semantic
    Dependencies model (RLSD) to address this problem. The utilized model includes
    a fully convolutional localization architecture to localize the regions that
    may contain multiple highly-dependent labels. The localized regions are further
    sent to the recurrent neural networks (RNN) to characterize the latent semantic
    dependencies at the regional level. Experimental results on several benchmark
    datasets show that our proposed model achieves the best performance compared to
    the state-of-the-art models, especially for predicting small objects occurred
    in the images. In addition, we set up an upper bound model (RLSD+ft-RPN) using
    bounding box coordinates during training, the experimental results also show
    that our RLSD can approach the upper bound without using the bounding-box
    annotations, which is more realistic in the real world.

    End-to-end Learning of Driving Models from Large-scale Video Datasets

    Huazhe Xu, Yang Gao, Fisher Yu, Trevor Darrell
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Robust perception-action models should be learned from training data with
    diverse visual appearances and realistic behaviors, yet current approaches to
    deep visuomotor policy learning have been generally limited to in-situ models
    learned from a single vehicle or a simulation environment. We advocate learning
    a generic vehicle motion model from large scale crowd-sourced video data, and
    develop an end-to-end trainable architecture for learning to predict a
    distribution over future vehicle egomotion from instantaneous monocular camera
    observations and previous vehicle state. Our model incorporates a novel
    FCN-LSTM architecture, which can be learned from large-scale crowd-sourced
    vehicle action data, and leverages available scene segmentation side tasks to
    improve performance under a privileged learning paradigm.

    Joint Visual Denoising and Classification using Deep Learning

    Gang Chen, Yawei Li, Sargur N. Srihari
    Comments: 5 pages, 7 figures, ICIP 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Visual restoration and recognition are traditionally addressed in pipeline
    fashion, i.e. denoising followed by classification. Instead, observing
    correlations between the two tasks, for example clearer image will lead to
    better categorization and vice visa, we propose a joint framework for visual
    restoration and recognition for handwritten images, inspired by advances in
    deep autoencoder and multi-modality learning. Our model is a 3-pathway deep
    architecture with a hidden-layer representation which is shared by multi-inputs
    and outputs, and each branch can be composed of a multi-layer deep model. Thus,
    visual restoration and classification can be unified using shared
    representation via non-linear mapping, and model parameters can be learnt via
    backpropagation. Using MNIST and USPS data corrupted with structured noise, the
    proposed framework performs at least 20\% better in classification than
    separate pipelines, as well as clearer recovered images. The noise model and
    the reproducible source code is available at
    {url{this https URL}}.

    Skin Cancer Detection and Tracking using Data Synthesis and Deep Learning

    Yunzhu Li, Andre Esteva, Brett Kuprel, Rob Novoa, Justin Ko, Sebastian Thrun
    Comments: 4 pages, 5 figures, Yunzhu Li and Andre Esteva contributed equally to this work
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Dense object detection and temporal tracking are needed across applications
    domains ranging from people-tracking to analysis of satellite imagery over
    time. The detection and tracking of malignant skin cancers and benign moles
    poses a particularly challenging problem due to the general uniformity of large
    skin patches, the fact that skin lesions vary little in their appearance, and
    the relatively small amount of data available. Here we introduce a novel data
    synthesis technique that merges images of individual skin lesions with
    full-body images and heavily augments them to generate significant amounts of
    data. We build a convolutional neural network (CNN) based system, trained on
    this synthetic data, and demonstrate superior performance to traditional
    detection and tracking techniques. Additionally, we compare our system to
    humans trained with simple criteria. Our system is intended for potential
    clinical use to augment the capabilities of healthcare providers. While
    domain-specific, we believe the methods invoked in this work will be useful in
    applying CNNs across domains that suffer from limited data availability.

    Word Recognition with Deep Conditional Random Fields

    Gang Chen, Yawei Li, Sargur N. Srihari
    Comments: 5 pages, published in ICIP 2016. arXiv admin note: substantial text overlap with arXiv:1412.3397
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recognition of handwritten words continues to be an important problem in
    document analysis and recognition. Existing approaches extract hand-engineered
    features from word images–which can perform poorly with new data sets.
    Recently, deep learning has attracted great attention because of the ability to
    learn features from raw data. Moreover they have yielded state-of-the-art
    results in classification tasks including character recognition and scene
    recognition. On the other hand, word recognition is a sequential problem where
    we need to model the correlation between characters. In this paper, we propose
    using deep Conditional Random Fields (deep CRFs) for word recognition.
    Basically, we combine CRFs with deep learning, in which deep features are
    learned and sequences are labeled in a unified framework. We pre-train the deep
    structure with stacked restricted Boltzmann machines (RBMs) for feature
    learning and optimize the entire network with an online learning algorithm. The
    proposed model was evaluated on two datasets, and seen to perform significantly
    better than competitive baseline models. The source code is available at
    this https URL

    Learning to Segment Object Proposals via Recursive Neural Networks

    Tianshui Chen, Liang Lin, Xian Wu, Xiaonan Luo
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    To avoid the exhaustive search over locations and scales, current
    state-of-the-art object detection systems usually involve a crucial component
    generating a batch of candidate object proposals from images. In this paper, we
    present a simple yet effective approach for segmenting object proposals via a
    deep architecture of recursive neural networks (RNNs), which hierarchically
    groups regions for detecting object candidates over scales. Unlike traditional
    methods that mainly adopt fixed similarity measures for merging regions or
    finding object proposals, our approach adaptively learns the region merging
    similarity and the objectness measure during the process of hierarchical region
    grouping. Specifically, guided by a structured loss, the RNN model jointly
    optimizes the cross-region similarity metric with the region merging process as
    well as the objectness prediction. During inference of the object proposal
    generation, we introduce randomness into the greedy search to cope with the
    ambiguity of grouping regions. Extensive experiments on standard benchmarks,
    e.g., PASCAL VOC and ImageNet, suggest that our approach is capable of
    producing object proposals with high recall while well preserving the object
    boundaries and outperforms other existing methods in both accuracy and
    efficiency.

    SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

    Bichen Wu, Forrest Iandola, Peter H. Jin, Kurt Keutzer
    Comments: The supplementary material of this paper, which discusses the energy efficiency of SqueezeDet, is attached after the main paper
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Object detection is a crucial task for autonomous driving. In addition to
    requiring high accuracy to ensure safety, object detection for autonomous
    driving also requires real-time inference speed to guarantee prompt vehicle
    control, as well as small model size and energy efficiency to enable embedded
    system deployment. In this work, we propose SqueezeDet, a fully convolutional
    neural network for object detection that aims to simultaneously satisfy all of
    the above constraints. In our network we use convolutional layers not only to
    extract feature maps, but also as the output layer to compute bounding boxes
    and class probabilities. The detection pipeline of our model only contains a
    single forward pass of a neural network, thus it is extremely fast. Our model
    is fully-convolutional, which leads to small model size and better energy
    efficiency. Finally, our experiments show that our model is very accurate,
    achieving state-of-the-art accuracy on the KITTI benchmark.

    Semi-Automated Annotation of Discrete States in Large Video Datasets

    Lex Fridman, Bryan Reimer
    Comments: To be presented at AAAI 2017. arXiv admin note: text overlap with arXiv:1508.04028
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a framework for semi-automated annotation of video frames where
    the video is of an object that at any point in time can be labeled as being in
    one of a finite number of discrete states. A Hidden Markov Model (HMM) is used
    to model (1) the behavior of the underlying object and (2) the noisy
    observation of its state through an image processing algorithm. The key insight
    of this approach is that the annotation of frame-by-frame video can be reduced
    from a problem of labeling every single image to a problem of detecting a
    transition between states of the underlying objected being recording on video.
    The performance of the framework is evaluated on a driver gaze classification
    dataset composed of 16,000,000 images that were fully annotated over 6,000
    hours of direct manual annotation labor. On this dataset, we achieve a 13x
    reduction in manual annotation for an average accuracy of 99.1% and a 84x
    reduction for an average accuracy of 91.2%.

    Areas of Attention for Image Captioning

    Marco Pedersoli, Thomas Lucas, Cordelia Schmid, Jakob Verbeek
    Comments: Submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose “Areas of Attention”, a novel attention-based model for automatic
    image caption generation. Our approach models the interplay between the state
    of the RNN, image region descriptors and word embedding vectors by three
    pairwise interactions. It allows association of caption words with local visual
    appearances rather than with descriptors of the entire scene. This enables
    better generalization to complex scenes not seen during training. Our model is
    agnostic to the type of attention areas, and we instantiate it using regions
    based on CNN activation grids, object proposals, and spatial transformer
    networks. Our results show that all components of our model contribute to
    obtain state-of-the-art performance on the MSCOCO dataset. In addition, our
    results indicate that attention areas are correctly associated to meaningful
    latent semantic structure in the generated captions.

    Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework

    Yuankai Wu, Huachun Tan
    Comments: 14 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep learning approaches have reached a celebrity status in artificial
    intelligence field, its success have mostly relied on Convolutional Networks
    (CNN) and Recurrent Networks. By exploiting fundamental spatial properties of
    images and videos, the CNN always achieves dominant performance on visual
    tasks. And the Recurrent Networks (RNN) especially long short-term memory
    methods (LSTM) can successfully characterize the temporal correlation, thus
    exhibits superior capability for time series tasks. Traffic flow data have
    plentiful characteristics on both time and space domain. However, applications
    of CNN and LSTM approaches on traffic flow are limited. In this paper, we
    propose a novel deep architecture combined CNN and LSTM to forecast future
    traffic flow (CLTFP). An 1-dimension CNN is exploited to capture spatial
    features of traffic flow, and two LSTMs are utilized to mine the short-term
    variability and periodicities of traffic flow. Given those meaningful features,
    the feature-level fusion is performed to achieve short-term forecasting. The
    proposed CLTFP is compared with other popular forecasting methods on an open
    datasets. Experimental results indicate that the CLTFP has considerable
    advantages in traffic flow forecasting. in additional, the proposed CLTFP is
    analyzed from the view of Granger Causality, and several interesting properties
    of CLTFP are discovered and discussed .

    A Non-Local Means Approach for Gaussian Noise Removal from Images using a Modified Weighting Kernel

    Mojtaba Kazemi, Ehsan Mohammadi.P, Parichehr shahidi sadeghi, Mohamad B. Menhaj
    Comments: 2017 25th Iranian Conference on Electrical Engineering (ICEE)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Gaussian noise removal is an interesting area in digital image processing not
    only to improve the visual quality, but for its impact on other post-processing
    algorithms like image registration or segmentation. Many presented
    state-of-the-art denoising methods are based on the self-similarity or
    patch-based image processing. Specifically, Non-Local Means (NLM) as a
    patch-based filter has gained increasing attention in recent years.
    Essentially, this filter tends to obtain the noise-less signal value by
    computing the Gaussian-weighted Euclidean distance between the patch
    under-processing and other patches inside the image. However, the NLM filter is
    sensitive to the outliers (pixels that their intensity values are far away from
    other pixels) inside the patch, meaning that the pixels with the symmetric
    locations in the patch are assigned the same weight. This can lead to
    sub-optimal denoising performance when the destructive nature of noise
    generates some outliers inside patches. In this paper, we propose a new
    weighting approach to modify the Gaussian kernel of the NLM filter. Our
    approach employs the geometric distance between image intensities to come up
    with new weights for each pixel of a patch, lowering the impact of outliers on
    the denoising performance. Experiments on a set of standard images and
    different noise levels show that our proposed method outperforms the other
    compared denoising filters.

    Mining Spatio-temporal Data on Industrialization from Historical Registries

    David Berenbaum, Dwyer Deighan, Thomas Marlow, Ashley Lee, Scott Frickel, Mark Howison
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    Despite the growing availability of big data in many fields, historical data
    on socioevironmental phenomena are often not available due to a lack of
    automated and scalable approaches for collecting, digitizing, and assembling
    them. We have developed a data-mining method for extracting tabulated, geocoded
    data from printed directories. While scanning and optical character recognition
    (OCR) can digitize printed text, these methods alone do not capture the
    structure of the underlying data. Our pipeline integrates both page layout
    analysis and OCR to extract tabular, geocoded data from structured text. We
    demonstrate the utility of this method by applying it to scanned manufacturing
    registries from Rhode Island that record 41 years of industrial land use. The
    resulting spatio-temporal data can be used for socioenvironmental analyses of
    industrialization at a resolution that was not previously possible. In
    particular, we find strong evidence for the dispersion of manufacturing from
    the urban core of Providence, the state’s capital, along the Interstate 95
    corridor to the north and south.

    Ensembles of Generative Adversarial Networks

    Yaxing Wang, Lichao Zhang, Joost van de Weijer
    Comments: accepted NIPS 2016 Workshop on Adversarial Training
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Ensembles are a popular way to improve results of discriminative CNNs. The
    combination of several networks trained starting from different initializations
    improves results significantly. In this paper we investigate the usage of
    ensembles of GANs. The specific nature of GANs opens up several new ways to
    construct ensembles. The first one is based on the fact that in the minimax
    game which is played to optimize the GAN objective the generator network keeps
    on changing even after the network can be considered optimal. As such ensembles
    of GANs can be constructed based on the same network initialization but just
    taking models which have different amount of iterations. These so-called self
    ensembles are much faster to train than traditional ensembles. The second
    method, called cascade GANs, redirects part of the training data which is badly
    modeled by the first GAN to another GAN. In experiments on the CIFAR10 dataset
    we show that ensembles of GANs obtain model probability distributions which
    better model the data distribution. In addition, we show that these improved
    results can be obtained at little additional computational cost.

    Deep Learning with Energy-efficient Binary Gradient Cameras

    Suren Jayasuriya, Orazio Gallo, Jinwei Gu, Jan Kautz
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Power consumption is a critical factor for the deployment of embedded
    computer vision systems. We explore the use of computational cameras that
    directly output binary gradient images to reduce the portion of the power
    consumption allocated to image sensing. We survey the accuracy of binary
    gradient cameras on a number of computer vision tasks using deep learning.
    These include object recognition, head pose regression, face detection, and
    gesture recognition. We show that, for certain applications, accuracy can be on
    par or even better than what can be achieved on traditional images. We are also
    the first to recover intensity information from binary spatial gradient
    images–useful for applications with a human observer in the loop, such as
    surveillance. Our results, which we validate with a prototype binary gradient
    camera, point to the potential of gradient-based computer vision systems.

    Food Image Recognition by Using Convolutional Neural Networks (CNNs)

    Yuzhen Lu
    Comments: 6 pages, 5 figures, 3 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Food image recognition is one of the promising applications of visual object
    recognition in computer vision. In this study, a small-scale dataset consisting
    of 5822 images of ten categories and a five-layer CNN was constructed to
    recognize these images. The bag-of-features (BoF) model coupled with support
    vector machine was first tested as comparison, resulting in an overall accuracy
    of 56%, while the CNN performed much better with an overall accuracy of 74%.
    Data expansion techniques were applied to increase the size of training images,
    which achieved a significantly improved accuracy of more than 90% and prevent
    the overfitting issue that occurred to the CNN without using data expansion.
    Further improvement is within reach by collecting more images and optimizing
    the network architecture and relevant hyper-parameters.

    Semi-supervised learning of deep metrics for stereo reconstruction

    Stepan Tulyakov, Anton Ivanov, Francois Fleuret
    Comments: 11 pages, 3 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Deep-learning metrics have recently demonstrated extremely good performance
    to match image patches for stereo reconstruction. However, training such
    metrics requires large amount of labeled stereo images, which can be difficult
    or costly to collect for certain applications. The main contribution of our
    work is a new semi-supervised method for learning deep metrics from unlabeled
    stereo images, given coarse information about the scenes and the optical
    system. Our method alternatively optimizes the metric with a standard
    stochastic gradient descent, and applies stereo constraints to regularize its
    prediction. Experiments on reference data-sets show that, for a given network
    architecture, training with this new method without ground-truth produces a
    metric with performance as good as state-of-the-art baselines trained with the
    said ground-truth. This work has three practical implications. Firstly, it
    helps to overcome limitations of training sets, in particular noisy ground
    truth. Secondly it allows to use much more training data during learning.
    Thirdly, it allows to tune deep metric for a particular stereo system, even if
    ground truth is not available.

    End-to-end learning of brain tissue segmentation from imperfect labeling

    Alex Fedorov, Jeremy Johnson, Eswar Damaraju, Alexei Ozerin, Vince Calhoun, Sergey Plis
    Comments: 7 pages, 8 figures, 2 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Segmenting a structural magnetic resonance imaging (MRI) scan is an important
    pre-processing step for analytic procedures and subsequent inferences about
    longitudinal tissue changes. Manual segmentation defines the current gold
    standard in quality but is prohibitively expensive. Automatic approaches are
    computationally intensive, incredibly slow at scale, and error prone due to
    usually involving many potentially faulty intermediate steps. In order to
    streamline the segmentation, we introduce a deep learning model that is based
    on volumetric dilated convolutions, subsequently reducing both processing time
    and errors. Compared to its competitors, the model has a reduced set of
    parameters and thus is easier to train and much faster to execute. The contrast
    in performance between the dilated network and its competitors becomes obvious
    when both are tested on a large dataset of unprocessed human brain volumes. The
    dilated network consistently outperforms not only another state-of-the-art deep
    learning approach, the up convolutional network, but also the ground truth on
    which it was trained. Not only can the incredible speed of our model make large
    scale analyses much easier but we also believe it has great potential in a
    clinical setting where, with little to no substantial delay, a patient and
    provider can go over test results.

    Commonly Uncommon: Semantic Sparsity in Situation Recognition

    Mark Yatskar, Vicente Ordonez, Luke Zettlemoyer, Ali Farhadi
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    Semantic sparsity is a common challenge in structured visual classification
    problems; when the output space is complex, the vast majority of the possible
    predictions are rarely, if ever, seen in the training set. This paper studies
    semantic sparsity in situation recognition, the task of producing structured
    summaries of what is happening in images, including activities, objects and the
    roles objects play within the activity. For this problem, we find empirically
    that most object-role combinations are rare, and current state-of-the-art
    models significantly underperform in this sparse data regime. We avoid many
    such errors by (1) introducing a novel tensor composition function that learns
    to share examples across role-noun combinations and (2) semantically augmenting
    our training data with automatically gathered examples of rarely observed
    outputs using web data. When integrated within a complete CRF-based structured
    prediction model, the tensor-based approach outperforms existing state of the
    art by a relative improvement of 2.11% and 4.40% on top-5 verb and noun-role
    accuracy, respectively. Adding 5 million images with our semantic augmentation
    techniques gives further relative improvements of 6.23% and 9.57% on top-5 verb
    and noun-role accuracy.

    Parameter Compression of Recurrent Neural Networks and Degredation of Short-term Memory

    Jonathan A. Cox
    Comments: Submitted to IJCNN 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    The significant computational costs of deploying neural networks in
    large-scale or resource constrained environments, such as data centers and
    mobile devices, has spurred interest in model compression, which can achieve a
    reduction in both arithmetic operations and storage memory. Several techniques
    have been proposed for reducing or compressing the parameters for feed-forward
    and convolutional neural networks, but less is understood about the effect of
    parameter compression on recurrent neural networks (RNN). In particular, the
    extent to which the recurrent parameters can be compressed and the impact on
    short-term memory performance, is not well understood. In this paper, we study
    the effect of complexity reduction, through singular value decomposition rank
    reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several
    tasks. We show that considerable rank reduction is possible when compressing
    recurrent weights, even without fine tuning. Furthermore, we propose a
    perturbation model for the effect of general perturbations, such as a
    compression, on the recurrent parameters of RNNs. The model is tested against a
    noiseless memorization experiment that elucidates the short-term memory
    performance. In this way, we demonstrate that the effect of compression of
    recurrent parameters is dependent on the degree of temporal coherence present
    in the data and task. This work can guide on-the-fly RNN compression for novel
    environments or tasks, and provides insight for applying RNN compression in
    low-power devices, such as hearing aids.

    Procedural Generation of Videos to Train Deep Action Recognition Networks

    César Roberto de Souza, Adrien Gaidon, Yohann Cabon, Antonio Manuel López Peña
    Comments: Under review at CVPR 2017. this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep learning for human action recognition in videos is making significant
    progress, but is slowed down by its dependency on expensive manual labeling of
    large video collections. In this work, we investigate the generation of
    synthetic training data for action recognition, as it has recently shown
    promising results for a variety of other computer vision tasks. We propose an
    interpretable parametric generative model of human action videos that relies on
    procedural generation and other computer graphics techniques of modern game
    engines.We generate a diverse, realistic, and physically plausible dataset of
    human action videos, called PHAV for “Procedural Human Action Videos”. It
    contains a total of 37,536 videos, more than 1,000 examples for each action of
    35 categories. Our approach is not limited to existing motion capture
    sequences, and we procedurally define 14 synthetic actions. We introduce a deep
    multi-task representation learning architecture to mix synthetic and real
    videos, even if the action categories differ. Our experiments on the UCF101 and
    HMDB51 benchmarks suggest that combining our large set of synthetic videos with
    small real-world datasets can boost recognition performance, significantly
    outperforming fine-tuning state-of-the-art unsupervised generative models of
    videos.

    Multi-resolution Data Fusion for Super-Resolution Electron Microscopy

    Suhas Sreehari, S. V. Venkatakrishnan, Katherine L. Bouman, Jeffrey P. Simmons, Lawrence F. Drummy, Charles A. Bouman
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Perhaps surprisingly, the total electron microscopy (EM) data collected to
    date is less than a cubic millimeter. Consequently, there is an enormous demand
    in the materials and biological sciences to image at greater speed and lower
    dosage, while maintaining resolution. Traditional EM imaging based on
    homogeneous raster-order scanning severely limits the volume of high-resolution
    data that can be collected, and presents a fundamental limitation to
    understanding physical processes such as material deformation, crack
    propagation, and pyrolysis.

    We introduce a novel multi-resolution data fusion (MDF) method for
    super-resolution computational EM. Our method combines innovative data
    acquisition with novel algorithmic techniques to dramatically improve the
    resolution/volume/speed trade-off. The key to our approach is to collect the
    entire sample at low resolution, while simultaneously collecting a small
    fraction of data at high resolution. The high-resolution measurements are then
    used to create a material-specific patch-library that is used within the
    “plug-and-play” framework to dramatically improve super-resolution of the
    low-resolution data. We present results using FEI electron microscope data that
    demonstrate super-resolution factors of 4x, 8x, and 16x, while substantially
    maintaining high image quality and reducing dosage.


    Artificial Intelligence

    The Complexity of Bayesian Networks Specified by Propositional and Relational Languages

    Fabio Gagliardi Cozman, Denis Deratani Mauá
    Subjects: Artificial Intelligence (cs.AI)

    We examine the complexity of inference in Bayesian networks specified by
    logical languages. We consider representations that range from fragments of
    propositional logic to function-free first-order logic with equality; in doing
    so we cover a variety of plate models and of probabilistic relational models.
    We study the complexity of inferences when network, query and domain are the
    input (the inferential and the combined complexity), when the network is fixed
    and query and domain are the input (the query/data complexity), and when the
    network and query are fixed and the domain is the input (the domain
    complexity). We draw connections with probabilistic databases and liftability
    results, and obtain complexity classes that range from polynomial to
    exponential levels.

    Deep Learning of Robotic Tasks using Strong and Weak Human Supervision

    Bar Hilleli, Ran El-Yaniv
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)

    We propose a scheme for training a computerized agent to perform complex
    human tasks such as highway steering. The scheme resembles natural
    teaching-learning processes used by humans to teach themselves and each other
    complex tasks, and consists of the following four stages. In the first stage
    the agent learns by itself an informative low-dimensional representations of
    raw input signals in an unsupervised learning manner. In the second stage the
    agent learns to mimic the human instructor using supervised learning so as to
    reach a basic performance level; the third stage is devoted to learning an
    instantaneous reward model. Here, the (human) instructor observes (possibly in
    real time) the agent performing the task and provides reward feedback. During
    this stage the agent monitors both itself and the instructor feedback and
    learns a reward model using supervised learning. This stage terminates when the
    reward model is sufficiently accurate. In the last stage a reinforcement
    learning algorithm is deployed to optimize the agent policy. The guidance
    reward signal in the reinforcement learning algorithm relies on the previously
    learned reward model. As a proof of concept for the proposed scheme, we
    designed a system consisting of deep convolutional neural networks, and applied
    it to successfully learn a computerized agent capable of autonomous highway
    steering over the well-known racing game Assetto Corsa.

    Algorithmic Songwriting with ALYSIA

    Margareta Ackerman, David Loker
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)

    This paper introduces ALYSIA: Automated LYrical SongwrIting Application.
    ALYSIA is based on a machine learning model using Random Forests, and we
    discuss its success at pitch and rhythm prediction. Next, we show how ALYSIA
    was used to create original pop songs that were subsequently recorded and
    produced. Finally, we discuss our vision for the future of Automated
    Songwriting for both co-creative and autonomous systems.

    DeepBach: a Steerable Model for Bach chorales generation

    Gaëtan Hadjeres, François Pachet
    Comments: 20 pages, 11 figures
    Subjects: Artificial Intelligence (cs.AI); Sound (cs.SD)

    The composition of polyphonic chorale music in the style of J.S Bach has
    represented a major challenge in automatic music composition over the last
    decades. The art of Bach chorales composition involves combining four-part
    harmony with characteristic rhythmic patterns and typical melodic movements to
    produce musical phrases which begin, evolve and end (cadences) in a harmonious
    way. To our knowledge, no model so far was able to solve all these problems
    simultaneously using an agnostic machine-learning approach. This paper
    introduces DeepBach, a statistical model aimed at modeling polyphonic music and
    specifically four parts, hymn-like pieces. We claim that, after being trained
    on the chorale harmonizations by Johann Sebastian Bach, our model is capable of
    generating highly convincing chorales in the style of Bach. We evaluate how
    indistinguishable our generated chorales are from existing Bach chorales with a
    listening test. The results corroborate our claim. A key strength of DeepBach
    is that it is agnostic and flexible. Users can constrain the generation by
    imposing some notes, rhythms or cadences in the generated score. This allows
    users to reharmonize user-defined melodies. DeepBach’s generation is fast,
    making it usable for interactive music composition applications. Several
    generation examples are provided and discussed from a musical point of view.

    RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting

    Andrzej Pacuk, Piotr Sankowski, Karol Węgrzycki, Adam Witkowski, Piotr Wygocki
    Comments: 6 pages, 1 figure, 2 tables, Description of 2nd place winning solution of RecSys 2016 Challange. To be published in RecSys’16 Challange Proceedings
    Subjects: Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Software Engineering (cs.SE)

    We present the Mim-Solution’s approach to the RecSys Challenge 2016, which
    ranked 2nd. The goal of the competition was to prepare job recommendations for
    the users of the website Xing.com.

    Our two phase algorithm consists of candidate selection followed by the
    candidate ranking. We ranked the candidates by the predicted probability that
    the user will positively interact with the job offer. We have used Gradient
    Boosting Decision Trees as the regression tool.

    Using Discourse Signals for Robust Instructor Intervention Prediction

    Muthu Kumar Chandrasekaran, Carrie Demmans Epp, Min-Yen Kan, Diane Litman
    Comments: To appear in proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY)

    We tackle the prediction of instructor intervention in student posts from
    discussion forums in Massive Open Online Courses (MOOCs). Our key finding is
    that using automatically obtained discourse relations improves the prediction
    of when instructors intervene in student discussions, when compared with a
    state-of-the-art, feature-rich baseline. Our supervised classifier makes use of
    an automatic discourse parser which outputs Penn Discourse Treebank (PDTB) tags
    that represent in-post discourse features. We show PDTB relation-based features
    increase the robustness of the classifier and complement baseline features in
    recalling more diverse instructor intervention patterns. In comprehensive
    experiments over 14 MOOC offerings from several disciplines, the PDTB discourse
    features improve performance on average. The resultant models are less
    dependent on domain-specific vocabulary, allowing them to better generalize to
    new courses.

    A Matrix Splitting Perspective on Planning with Options

    Pierre-Luc Bacon, Doina Precup
    Comments: Accepted at the Continual Learning and Deep Networks Workshop, NIPS 2016
    Subjects: Artificial Intelligence (cs.AI)

    We show that the Bellman operator underlying the options framework leads to a
    matrix splitting, an approach traditionally used to speed up convergence of
    iterative solvers for large linear systems of equations. Based on standard
    comparison theo- rems for matrix splittings, we then show how the asymptotic
    rate of convergence varies as a function of the inherent timescales of the
    options. This new perspective highlights a trade-off between asymptotic
    performance and the cost of computation associated with building a good set of
    options.

    N-gram Opcode Analysis for Android Malware Detection

    BooJoong Kang, Suleiman Y. Yerima, Sakir Sezer, Kieran McLaughlin
    Journal-ref: International Journal on Cyber Situational Awareness, Vol. 1, No.
    1, pp231-255 (2016)
    Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)

    Android malware has been on the rise in recent years due to the increasing
    popularity of Android and the proliferation of third party application markets.
    Emerging Android malware families are increasingly adopting sophisticated
    detection avoidance techniques and this calls for more effective approaches for
    Android malware detection. Hence, in this paper we present and evaluate an
    n-gram opcode features based approach that utilizes machine learning to
    identify and categorize Android malware. This approach enables automated
    feature discovery without relying on prior expert or domain knowledge for
    pre-determined features. Furthermore, by using a data segmentation technique
    for feature selection, our analysis is able to scale up to 10-gram opcodes. Our
    experiments on a dataset of 2520 samples showed an f-measure of 98% using the
    n-gram opcode based approach. We also provide empirical findings that
    illustrate factors that have probable impact on the overall n-gram opcodes
    performance trends.

    Proportional Rankings

    Piotr Skowron, Martin Lackner, Markus Brill, Dominik Peters, Edith Elkind
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

    In this paper we extend the principle of proportional representation to
    rankings. We consider the setting where alternatives need to be ranked based on
    approval preferences. In this setting, proportional representation requires
    that cohesive groups of voters are represented proportionally in each initial
    segment of the ranking. Proportional rankings are desirable in situations where
    initial segments of different lengths may be relevant, e.g., hiring decisions
    (if it is unclear how many positions are to be filled), the presentation of
    competing proposals on a liquid democracy platform (if it is unclear how many
    proposals participants are taking into consideration), or recommender systems
    (if a ranking has to accommodate different user types). We study the
    proportional representation provided by several ranking methods and prove
    theoretical guarantees. Furthermore, we experimentally evaluate these methods
    and present preliminary evidence as to which methods are most suitable for
    producing proportional rankings.

    A New Type-II Fuzzy Logic Based Controller for Non-linear Dynamical Systems with Application to a 3-PSP Parallel Robot

    Hamid Reza Hassanzadeh
    Comments: Master’s thesis
    Subjects: Systems and Control (cs.SY); Artificial Intelligence (cs.AI)

    The concept of uncertainty is posed in almost any complex system including
    parallel robots as an outstanding instance of dynamical robotics systems. As
    suggested by the name, uncertainty, is some missing information that is beyond
    the knowledge of human thus we may tend to handle it properly to minimize the
    side-effects through the control process.

    Type-II fuzzy logic has shown its superiority over traditional fuzzy logic
    when dealing with uncertainty. Type-II fuzzy logic controllers are however
    newer and more promising approaches that have been recently applied to various
    fields due to their significant contribution especially when noise (as an
    important instance of uncertainty) emerges. During the design of Type-I fuzzy
    logic systems, we presume that we are almost certain about the fuzzy membership
    functions which is not true in many cases. Thus T2FLS as a more realistic
    approach dealing with practical applications might have a lot to offer. Type-II
    fuzzy logic takes into account a higher level of uncertainty, in other words,
    the membership grade for a type-II fuzzy variable is no longer a crisp number
    but rather is itself a type-I linguistic term. In this thesis the effects of
    uncertainty in dynamic control of a parallel robot is considered. More
    specifically, it is intended to incorporate the Type-II Fuzzy Logic paradigm
    into a model based controller, the so-called computed torque control method,
    and apply the result to a 3 degrees of freedom parallel manipulator.

    …

    Message Passing Multi-Agent GANs

    Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri
    Comments: The first 2 authors contributed equally for this work
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Communicating and sharing intelligence among agents is an important facet of
    achieving Artificial General Intelligence. As a first step towards this
    challenge, we introduce a novel framework for image generation: Message Passing
    Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have
    recently been shown to be very effective for image generation and other tasks,
    these networks have been limited to mostly single generator-discriminator
    networks. We show that we can obtain multi-agent GANs that communicate through
    message passing to achieve better image generation. The objectives of the
    individual agents in this framework are two fold: a co-operation objective and
    a competing objective. The co-operation objective ensures that the message
    sharing mechanism guides the other generator to generate better than itself
    while the competing objective encourages each generator to generate better than
    its counterpart. We analyze and visualize the messages that these GANs share
    among themselves in various scenarios. We quantitatively show that the message
    sharing formulation serves as a regularizer for the adversarial training.
    Qualitatively, we show that the different generators capture different traits
    of the underlying data distribution.

    Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version)

    Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao
    Comments: Published in NAMPI workshop at NIPS 2016. Short version of arXiv:1611.00020
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Extending the success of deep neural networks to natural language
    understanding and symbolic reasoning requires complex operations and external
    memory. Recent neural program induction approaches have attempted to address
    this problem, but are typically limited to differentiable memory, and
    consequently cannot scale beyond small synthetic tasks. In this work, we
    propose the Manager-Programmer-Computer framework, which integrates neural
    networks with non-differentiable memory to support abstract, scalable and
    precise operations through a friendly neural computer interface. Specifically,
    we introduce a Neural Symbolic Machine, which contains a sequence-to-sequence
    neural “programmer”, and a non-differentiable “computer” that is a Lisp
    interpreter with code assist. To successfully apply REINFORCE for training, we
    augment it with approximate gold programs found by an iterative maximum
    likelihood training process. NSM is able to learn a semantic parser from weak
    supervision over a large knowledge base. It achieves new state-of-the-art
    performance on WebQuestionsSP, a challenging semantic parsing dataset, with
    weak supervision. Compared to previous approaches, NSM is end-to-end, therefore
    does not rely on feature engineering or domain specific knowledge.

    Representing Independence Models with Elementary Triplets

    Jose M. Peña
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI)

    In an independence model, the triplets that represent conditional
    independences between singletons are called elementary. It is known that the
    elementary triplets represent the independence model unambiguously under some
    conditions. In this paper, we show how this representation helps performing
    some operations with independence models, such as finding the dominant triplets
    or a minimal independence map of an independence model, or computing the union
    or intersection of a pair of independence models, or performing causal
    reasoning. For the latter, we rephrase in terms of conditional independences
    some of Pearl’s results for computing causal effects.

    Enhancing Use Case Points Estimation Method Using Soft Computing Techniques

    Ali Bou Nassif, Luiz Fernando Capretz, Danny Ho
    Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Software estimation is a crucial task in software engineering. Software
    estimation encompasses cost, effort, schedule, and size. The importance of
    software estimation becomes critical in the early stages of the software life
    cycle when the details of software have not been revealed yet. Several
    commercial and non-commercial tools exist to estimate software in the early
    stages. Most software effort estimation methods require software size as one of
    the important metric inputs and consequently, software size estimation in the
    early stages becomes essential. One of the approaches that has been used for
    about two decades in the early size and effort estimation is called use case
    points. Use case points method relies on the use case diagram to estimate the
    size and effort of software projects. Although the use case points method has
    been widely used, it has some limitations that might adversely affect the
    accuracy of estimation. This paper presents some techniques using fuzzy logic
    and neural networks to improve the accuracy of the use case points method.
    Results showed that an improvement up to 22% can be obtained using the proposed
    approach.

    Commonly Uncommon: Semantic Sparsity in Situation Recognition

    Mark Yatskar, Vicente Ordonez, Luke Zettlemoyer, Ali Farhadi
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    Semantic sparsity is a common challenge in structured visual classification
    problems; when the output space is complex, the vast majority of the possible
    predictions are rarely, if ever, seen in the training set. This paper studies
    semantic sparsity in situation recognition, the task of producing structured
    summaries of what is happening in images, including activities, objects and the
    roles objects play within the activity. For this problem, we find empirically
    that most object-role combinations are rare, and current state-of-the-art
    models significantly underperform in this sparse data regime. We avoid many
    such errors by (1) introducing a novel tensor composition function that learns
    to share examples across role-noun combinations and (2) semantically augmenting
    our training data with automatically gathered examples of rarely observed
    outputs using web data. When integrated within a complete CRF-based structured
    prediction model, the tensor-based approach outperforms existing state of the
    art by a relative improvement of 2.11% and 4.40% on top-5 verb and noun-role
    accuracy, respectively. Adding 5 million images with our semantic augmentation
    techniques gives further relative improvements of 6.23% and 9.57% on top-5 verb
    and noun-role accuracy.

    Toward Efficient Task Assignment and Motion Planning for Large Scale Underwater Mission

    Somaiyeh Mahmoud Zadeh, David MW Powers, Karl Sammut, Amirmehdi Yazdani
    Comments: arXiv admin note: substantial text overlap with arXiv:1604.03308
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

    An Autonomous Underwater Vehicle (AUV) needs to acquire a certain degree of
    autonomy for any particular underwater mission to fulfill the mission
    objectives successfully and ensure its safety in all stages of the mission in a
    large scale operating filed. In this paper, a novel combinatorial
    conflict-free-task assignment strategy consisting an interactive engagement of
    a local path planner and an adaptive global route planner, is introduced. The
    method is established upon the heuristic search potency of the Particle Swarm
    Optimisation (PSO) algorithm to address the discrete nature of routing-task
    assignment approach and the complexity of NP-hard path planning problem. The
    proposed hybrid method is highly efficient for having a reactive guidance
    framework that guarantees successful completion of missions specifically in
    cluttered environments. To examine the performance of the method in a context
    of mission productivity, mission time management and vehicle safety, a series
    of simulation studies are undertaken. The results of simulations declare that
    the proposed method is reliable and robust, particularly in dealing with
    uncertainties, and it can significantly enhance the level of vehicle’s autonomy
    by relying on its reactive nature and capability of providing fast feasible
    solutions.


    Information Retrieval

    We used Neural Networks to Detect Clickbaits: You won't believe what happened Next!

    Ankesh Anand, Tanmoy Chakraborty, Noseong Park
    Comments: Accepted to the European Conference on Information Retrieval (ECIR), 2017
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    Online content publishers often use catchy headlines for their articles in
    order to attract users to their websites. These headlines, popularly known as
    clickbaits, exploit a user’s curiosity gap and lure them to click on links that
    often disappoint them. Existing methods for automatically detecting clickbaits
    rely on heavy feature engineering and domain knowledge. Here, we introduce a
    neural network architecture based on Recurrent Neural Networks for detecting
    clickbaits. Our model relies on distributed word representations learned from a
    large unannotated corpora, and character embeddings learned via Convolutional
    Neural Networks. Experimental results on a dataset of news headlines show that
    our model outperforms existing techniques for clickbait detection with an
    accuracy of 0.98 with F1-score of 0.98 and ROC-AUC of 0.99.

    Mining Spatio-temporal Data on Industrialization from Historical Registries

    David Berenbaum, Dwyer Deighan, Thomas Marlow, Ashley Lee, Scott Frickel, Mark Howison
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    Despite the growing availability of big data in many fields, historical data
    on socioevironmental phenomena are often not available due to a lack of
    automated and scalable approaches for collecting, digitizing, and assembling
    them. We have developed a data-mining method for extracting tabulated, geocoded
    data from printed directories. While scanning and optical character recognition
    (OCR) can digitize printed text, these methods alone do not capture the
    structure of the underlying data. Our pipeline integrates both page layout
    analysis and OCR to extract tabular, geocoded data from structured text. We
    demonstrate the utility of this method by applying it to scanned manufacturing
    registries from Rhode Island that record 41 years of industrial land use. The
    resulting spatio-temporal data can be used for socioenvironmental analyses of
    industrialization at a resolution that was not previously possible. In
    particular, we find strong evidence for the dispersion of manufacturing from
    the urban core of Providence, the state’s capital, along the Interstate 95
    corridor to the north and south.

    RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting

    Andrzej Pacuk, Piotr Sankowski, Karol Węgrzycki, Adam Witkowski, Piotr Wygocki
    Comments: 6 pages, 1 figure, 2 tables, Description of 2nd place winning solution of RecSys 2016 Challange. To be published in RecSys’16 Challange Proceedings
    Subjects: Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Software Engineering (cs.SE)

    We present the Mim-Solution’s approach to the RecSys Challenge 2016, which
    ranked 2nd. The goal of the competition was to prepare job recommendations for
    the users of the website Xing.com.

    Our two phase algorithm consists of candidate selection followed by the
    candidate ranking. We ranked the candidates by the predicted probability that
    the user will positively interact with the job offer. We have used Gradient
    Boosting Decision Trees as the regression tool.


    Computation and Language

    Mapping the Dialog Act Annotations of the LEGO Corpus into the Communicative Functions of ISO 24617-2

    Eugénio Ribeiro, Ricardo Ribeiro, David Martins de Matos
    Comments: 20 pages, 2 figures
    Subjects: Computation and Language (cs.CL)

    In this paper we present strategies for mapping the dialog act annotations of
    the LEGO corpus into the communicative functions of the ISO 24617-2 standard.
    Using these strategies, we obtained an additional 347 dialogs annotated
    according to the standard. This is particularly important given the reduced
    amount of existing data in those conditions due to the recency of the standard.
    Furthermore, these are dialogs from a widely explored corpus for dialog related
    tasks. However, its dialog annotations have been neglected due to their high
    domain-dependency, which renders them unuseful outside the context of the
    corpus. Thus, through our mapping process, we both obtain more data annotated
    according to a recent standard and provide useful dialog act annotations for a
    widely explored corpus in the context of dialog research.

    We used Neural Networks to Detect Clickbaits: You won't believe what happened Next!

    Ankesh Anand, Tanmoy Chakraborty, Noseong Park
    Comments: Accepted to the European Conference on Information Retrieval (ECIR), 2017
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    Online content publishers often use catchy headlines for their articles in
    order to attract users to their websites. These headlines, popularly known as
    clickbaits, exploit a user’s curiosity gap and lure them to click on links that
    often disappoint them. Existing methods for automatically detecting clickbaits
    rely on heavy feature engineering and domain knowledge. Here, we introduce a
    neural network architecture based on Recurrent Neural Networks for detecting
    clickbaits. Our model relies on distributed word representations learned from a
    large unannotated corpora, and character embeddings learned via Convolutional
    Neural Networks. Experimental results on a dataset of news headlines show that
    our model outperforms existing techniques for clickbait detection with an
    accuracy of 0.98 with F1-score of 0.98 and ROC-AUC of 0.99.

    Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version)

    Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao
    Comments: Published in NAMPI workshop at NIPS 2016. Short version of arXiv:1611.00020
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Extending the success of deep neural networks to natural language
    understanding and symbolic reasoning requires complex operations and external
    memory. Recent neural program induction approaches have attempted to address
    this problem, but are typically limited to differentiable memory, and
    consequently cannot scale beyond small synthetic tasks. In this work, we
    propose the Manager-Programmer-Computer framework, which integrates neural
    networks with non-differentiable memory to support abstract, scalable and
    precise operations through a friendly neural computer interface. Specifically,
    we introduce a Neural Symbolic Machine, which contains a sequence-to-sequence
    neural “programmer”, and a non-differentiable “computer” that is a Lisp
    interpreter with code assist. To successfully apply REINFORCE for training, we
    augment it with approximate gold programs found by an iterative maximum
    likelihood training process. NSM is able to learn a semantic parser from weak
    supervision over a large knowledge base. It achieves new state-of-the-art
    performance on WebQuestionsSP, a challenging semantic parsing dataset, with
    weak supervision. Compared to previous approaches, NSM is end-to-end, therefore
    does not rely on feature engineering or domain specific knowledge.

    CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews

    Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu
    Comments: 10 pages, 2 figures, IEEE BigData 2016
    Subjects: Computation and Language (cs.CL)

    Product reviews contain a lot of useful information about product features
    and customer opinions. One important product feature is the complementary
    entity (products) that may potentially work together with the reviewed product.
    Knowing complementary entities of the reviewed product is very important
    because customers want to buy compatible products and avoid incompatible ones.
    In this paper, we address the problem of Complementary Entity Recognition
    (CER). Since no existing method can solve this problem, we first propose a
    novel unsupervised method to utilize syntactic dependency paths to recognize
    complementary entities. Then we expand category-level domain knowledge about
    complementary entities using only a few general seed verbs on a large amount of
    unlabeled reviews. The domain knowledge helps the unsupervised method to adapt
    to different products and greatly improves the precision of the CER task. The
    advantage of the proposed method is that it does not require any labeled data
    for training. We conducted experiments on 7 popular products with about 1200
    reviews in total to demonstrate that the proposed approach is effective.

    Unit Dependency Graph and its Application to Arithmetic Word Problem Solving

    Subhro Roy, Dan Roth
    Comments: AAAI 2017
    Subjects: Computation and Language (cs.CL)

    Math word problems provide a natural abstraction to a range of natural
    language understanding problems that involve reasoning about quantities, such
    as interpreting election results, news about casualties, and the financial
    section of a newspaper. Units associated with the quantities often provide
    information that is essential to support this reasoning. This paper proposes a
    principled way to capture and reason about units and shows how it can benefit
    an arithmetic word problem solver. This paper presents the concept of Unit
    Dependency Graphs (UDGs), which provides a compact representation of the
    dependencies between units of numbers mentioned in a given problem. Inducing
    the UDG alleviates the brittleness of the unit extraction system and allows for
    a natural way to leverage domain knowledge about unit compatibility, for word
    problem solving. We introduce a decomposed model for inducing UDGs with minimal
    additional annotations, and use it to augment the expressions used in the
    arithmetic word problem solver of (Roy and Roth 2015) via a constrained
    inference framework. We show that introduction of UDGs reduces the error of the
    solver by over 10 %, surpassing all existing systems for solving arithmetic
    word problems. In addition, it also makes the system more robust to adaptation
    to new vocabulary and equation forms .

    End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

    Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Natural language understanding and dialogue policy learning are both
    essential in conversational systems that predict the next system actions in
    response to a current user utterance. Conventional approaches aggregate
    separate models of natural language understanding (NLU) and system action
    prediction (SAP) as a pipeline that is sensitive to noisy outputs of
    error-prone NLU. To address the issues, we propose an end-to-end deep recurrent
    neural network with limited contextual dialogue memory by jointly training NLU
    and SAP on DSTC4 multi-domain human-human dialogues. Experiments show that our
    proposed model significantly outperforms the state-of-the-art pipeline models
    for both NLU and SAP, which indicates that our joint model is capable of
    mitigating the affects of noisy NLU outputs, and NLU model can be refined by
    error flows backpropagating from the extra supervised signals of system
    actions.

    Creating a Real-Time, Reproducible Event Dataset

    John Beieler
    Subjects: Computation and Language (cs.CL)

    The generation of political event data has remained much the same since the
    mid-1990s, both in terms of data acquisition and the process of coding text
    into data. Since the 1990s, however, there have been significant improvements
    in open-source natural language processing software and in the availability of
    digitized news content. This paper presents a new, next-generation event
    dataset, named Phoenix, that builds from these and other advances. This dataset
    includes improvements in the underlying news collection process and event
    coding software, along with the creation of a general processing pipeline
    necessary to produce daily-updated data. This paper provides a face validity
    checks by briefly examining the data for the conflict in Syria, and a
    comparison between Phoenix and the Integrated Crisis Early Warning System data.

    Using Discourse Signals for Robust Instructor Intervention Prediction

    Muthu Kumar Chandrasekaran, Carrie Demmans Epp, Min-Yen Kan, Diane Litman
    Comments: To appear in proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY)

    We tackle the prediction of instructor intervention in student posts from
    discussion forums in Massive Open Online Courses (MOOCs). Our key finding is
    that using automatically obtained discourse relations improves the prediction
    of when instructors intervene in student discussions, when compared with a
    state-of-the-art, feature-rich baseline. Our supervised classifier makes use of
    an automatic discourse parser which outputs Penn Discourse Treebank (PDTB) tags
    that represent in-post discourse features. We show PDTB relation-based features
    increase the robustness of the classifier and complement baseline features in
    recalling more diverse instructor intervention patterns. In comprehensive
    experiments over 14 MOOC offerings from several disciplines, the PDTB discourse
    features improve performance on average. The resultant models are less
    dependent on domain-specific vocabulary, allowing them to better generalize to
    new courses.


    Distributed, Parallel, and Cluster Computing

    A Randomized Concurrent Algorithm for Disjoint Set Union

    Siddhartha V. Jayanti, Robert E. Tarjan
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

    The disjoint set union problem is a basic problem in data structures with a
    wide variety of applications. We extend a known efficient sequential algorithm
    for this problem to obtain a simple and efficient concurrent wait-free
    algorithm running on an asynchronous parallel random access machine (APRAM).
    Crucial to our result is the use of randomization. Under a certain independence
    assumption, for a problem instance in which there are n elements, m operations,
    and p processes, our algorithm does Theta(m (alpha(n, m/(np)) + log(np/m + 1)))
    expected work, where the expectation is over the random choices made by the
    algorithm and alpha is a functional inverse of Ackermann’s function. In
    addition, each operation takes O(log n) steps with high probability. Our
    algorithm is significantly simpler and more efficient than previous algorithms
    proposed by Anderson and Woll. Under our independence assumption, our algorithm
    achieves almost-linear speed-up for applications in which all or most of the
    processes can be kept busy.

    Support vector regression model for BigData systems

    Alessandro Maria Rizzi
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Performance (cs.PF)

    Nowadays Big Data are becoming more and more important. Many sectors of our
    economy are now guided by data-driven decision processes. Big Data and business
    intelligence applications are facilitated by the MapReduce programming model
    while, at infrastructural layer, cloud computing provides flexible and cost
    effective solutions for allocating on demand large clusters. In such systems,
    capacity allocation, which is the ability to optimally size minimal resources
    for achieve a certain level of performance, is a key challenge to enhance
    performance for MapReduce jobs and minimize cloud resource costs. In order to
    do so, one of the biggest challenge is to build an accurate performance model
    to estimate job execution time of MapReduce systems. Previous works applied
    simulation based models for modeling such systems. Although this approach can
    accurately describe the behavior of Big Data clusters, it is too
    computationally expensive and does not scale to large system. We try to
    overcome these issues by applying machine learning techniques. More precisely
    we focus on Support Vector Regression (SVR) which is intrinsically more robust
    w.r.t other techniques, like, e.g., neural networks, and less sensitive to
    outliers in the training set. To better investigate these benefits, we compare
    SVR to linear regression.

    High-Performance Distributed Machine Learning using Apache SPARK

    Celestine Dünner, Thomas Parnell, Kubilay Atasu, Manolis Sifalakis, Haralampos Pozidis
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    In this paper we compare the performance of distributed learning using Apache
    SPARK and MPI by implementing a distributed linear learning algorithm from
    scratch on the two programming frameworks. We then explore the performance gap
    and show how SPARK learning can be accelerated, by reducing computational cost
    as well as communication-related overheads, to reduce the relative loss in
    performance versus MPI from 20x to 2x. With these different implementations at
    hand, we will illustrate how the optimal parameters of the algorithm depend
    strongly on the characteristics of the framework on which it is executed. We
    will show that carefully tuning a distributed algorithm to trade-off
    communication and computation can improve performance by orders of magnitude.
    Hence, understanding system aspects of the framework and their implications,
    and then correctly adapting the algorithm proves to be the key to performance.

    The communication-hiding pipelined BiCGStab method for the efficient parallel solution of large unsymmetric linear systems

    Siegfried Cools, Wim Vanroose
    Comments: 21 pages, 5 figures, 4 tables
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    A High Performance Computing alternative to traditional Krylov methods,
    pipelined Krylov solvers offer better scalability in the strong scaling limit
    compared to standard Krylov methods for large and sparse linear systems. The
    traditional synchronization bottleneck is mitigated by overlapping
    time-consuming global communication phases with local computations in the
    algorithm. This paper describes a general framework for deriving the pipelined
    variant of any Krylov algorithm. The proposed framework was implicitly used to
    derive the pipelined Conjugate Gradient (p-CG) method in Hiding global
    synchronization latency in the preconditioned Conjugate Gradient algorithm by
    P. Ghysels and W. Vanroose, Parallel Computing, 40(7):224–238, 2014. The
    pipelining framework is subsequently illustrated by formulating a pipelined
    version of the BiCGStab method for the solution of large unsymmetric linear
    systems on parallel hardware. A residual replacement strategy is proposed to
    account for the possible loss of attainable accuracy and robustness by the
    pipelined BiCGStab method. It is shown that the pipelined algorithm improves
    scalability on distributed memory machines, leading to significant speedups
    compared to standard preconditioned BiCGStab.

    Adaptive Work-Efficient Connected Components on the GPU

    Michael Sutton, Tal Ben-Nun, Amnon Barak, Sreepathi Pai, Keshav Pingali
    Comments: 4 pages
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

    This report presents an adaptive work-efficient approach for implementing the
    Connected Components algorithm on GPUs. The results show a considerable
    increase in performance (up to 6.8( imes)) over current state-of-the-art
    solutions.

    QoS-based Computing Resources Partitioning between Virtual Machines in the Cloud Architecture

    Evgeny Nikulchev, Evgeniy Pluzhnik, Oleg Lukyanchikov, Dmitry Biryukov, Elena Andrianova
    Comments: 6 pages, International Journal of Advanced Computer Science and Applications (2016) 7
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Cloud services have been used very widely, but configuration of the
    parameters, including the efficient allocation of resources, is an important
    objective for the system architect. The article is devoted to solving the
    problem of choosing the architecture of computers based on simulation and
    developed program for monitoring computing resources. Techniques were developed
    aimed at providing the required quality of service and efficient use of
    resources. The article describes the monitoring program of computing resources
    and time efficiency of the target application functions. On the basis of this
    application the technique is shown and described in the experiment, designed to
    ensure the requirements for quality of service, by isolating one process from
    the others on different virtual machines inside the hypervisor.

    On spreading rumor in heterogeneous systems

    Jacek Cichoń, Zbigniew Gołeobbiewski, Marcin Kardas, Marek Klonowski, Filip Zagórski
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper we consider a model of spreading information in heterogeneous
    systems wherein we have two kinds of objects. Some of them are active and
    others are passive. Active objects can, if they possess information, share it
    with an encountered passive object. We focus on a particular case such that
    active objects communicate independently with randomly chosen passive objects.
    Such model is motivated by two real-life scenarios. The first one is a very
    dynamic system of mobile devices distributing information among stationary
    devices. The second is an architecture wherein clients communicate with several
    servers and can leave some information learnt from other servers. The main
    question we investigate is how many rounds is needed to deliver the information
    to all objects under the assumption that at the beginning exactly one object
    has the information?

    In this paper we provide mathematical models of such process and show rigid
    and very precise mathematical analysis for some special cases important from
    practical point of view. Some mathematical results are quite surprising — we
    find relation of investigated process to both coupon collector’s problem as
    well as the birthday paradox. Additionally, we present simulations for showing
    behaviour for general parameters

    BrainFrame: A heterogeneous accelerator platform for neuron simulations

    Georgios Smaragdos, Georgios Chatzikonstantis, Rahul Kukreja, Harrys Sidiropoulos, Dimitrios Rodopoulos, Ioannis Sourdis, Zaid Al-Ars, Christoforos Kachris, Dimitrios Soudris, Chris I. De Zeeuw, Christos Strydis
    Comments: 17 pages, 19 figures, 4 tables
    Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC)

    Objective: The advent of High-Performance Computing (HPC) in recent years has
    led to its increasing use in brain study through computational models. The
    scale and complexity of such models are constantly increasing, leading to
    challenging computational requirements. Even though modern HPC platforms can
    often deal with such challenges, the vast diversity of the modeling field does
    not permit for a single acceleration (or homogeneous) platform to effectively
    address the complete array of modeling requirements. Approach: In this paper we
    propose and build BrainFrame, a heterogeneous acceleration platform,
    incorporating three distinct acceleration technologies, a Dataflow Engine, a
    Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform.
    As a challenging proof of concept, we analyze the performance of BrainFrame on
    different instances of a state-of-the-art neuron model, modeling the Inferior-
    Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley
    representation. The model instances take into account not only the neuronal-
    network dimensions but also different network-connectivity circumstances that
    can drastically change application workload characteristics. Main results: The
    synthetic approach of three HPC technologies demonstrated that BrainFrame is
    better able to cope with the modeling diversity encountered. Our performance
    analysis shows clearly that the model directly affect performance and all three
    technologies are required to cope with all the model use cases.

    StreamNF: Performance and Correctness for Stateful Chained NFs

    Junaid Khalid, Aditya Akella
    Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC)

    Network functions virtualization (NFV) — deploying network functions in
    software on commodity machines — allows operators to employ rich chains of NFs
    to realize custom performance, security, and compliance policies, and ensure
    high performance by dynamically adding instances and/or failing over. Because
    NFs are stateful, it is important to carefully manage their state, especially
    during such dynamic actions. Crucially, state management must: (1) offer good
    performance to match the needs of modern networks; (2) ensure NF chain-wide
    properties; and (3) not require the operator to manage low-level state
    management details. We present StreamNF, an NFV framework that satisfies the
    above requirements. To do so, StreamNF leverages an external state store with
    novel caching strategies and offloading of state operations, and chain-level
    logical packet clocks and packet logging/replay. Extensive evaluation of a
    StreamNF prototype built atop Apache Storm shows that the significant benefits
    of StreamNF in terms of state management performance and chain-wide properties
    come at a modest per-packet latency cost.

    Online Page Migration on Ring Networks in Uniform Model

    Amanj Khorramian, Akira Matsubayashi
    Comments: 8 pages, 5 figures
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    This paper explores the problem of page migration in ring networks. A ring
    network is a connected graph, in which each node is connected with exactly two
    other nodes. In this problem, one of the nodes in a given network holds a page
    of size D. This node is called the server and the page is a non-duplicable data
    in the network. Requests are issued by nodes to access the page one after
    another. Every time a new request is issued, the server must serve the request
    and may migrate to another node before the next request arrives. A service
    costs the distance between the server and the requesting node, and the
    migration costs the distance of the migration multiplied by D. The problem is
    to minimize the total costs of services and migrations. We study this problem
    in uniform model, for which the page has a unit size, i.e. D=1. A
    3.326-competitive algorithm improving the current best upper bound is designed.
    We show that this ratio is tight for our algorithm.

    Expander Graph and Communication-Efficient Decentralized Optimization

    Yat-Tin Chow, Wei Shi, Tianyu Wu, Wotao Yin
    Comments: 2016 IEEE Asilomar Conference on Signals, Systems, and Computers
    Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)

    In this paper, we discuss how to design the graph topology to reduce the
    communication complexity of certain algorithms for decentralized optimization.
    Our goal is to minimize the total communication needed to achieve a prescribed
    accuracy. We discover that the so-called expander graphs are near-optimal
    choices. We propose three approaches to construct expander graphs for different
    numbers of nodes and node degrees. Our numerical results show that the
    performance of decentralized optimization is significantly better on expander
    graphs than other regular graphs.


    Learning

    Incomplete data representation for SVM classification

    Łukasz Struski, Marek Śmieja, Jacek Tabor
    Comments: 15 pages, 3 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this paper we propose two ways of incomplete data representation. The
    first one is a generalization of a flag representation, where a vector with
    missing attributes is filled with some values and joined with flag vectors
    indicating missing components. Our generalization uses pointed affine
    subspaces, which in addition to flag representation allows to perform various
    affine transformations of data, as whitening or dimensionality reduction. We
    show how to embed such affine subspaces into a vector space and how to define a
    proper scalar product. In the second approach, we represent missing data points
    by degenerated Gaussian densities, which additionally model the uncertainty
    connected with missing features. This representation allows to construct an
    analogue of RBF kernel on incomplete data space.

    Zeroth-order Asynchronous Doubly Stochastic Algorithm with Variance Reduction

    Bin Gu, Zhouyuan Huo, Heng Huang
    Subjects: Learning (cs.LG)

    Zeroth-order (derivative-free) optimization attracts a lot of attention in
    machine learning, because explicit gradient calculations may be computationally
    expensive or infeasible. To handle large scale problems both in volume and
    dimension, recently asynchronous doubly stochastic zeroth-order algorithms were
    proposed. The convergence rate of existing asynchronous doubly stochastic
    zeroth order algorithms is (O(frac{1}{sqrt{T}})) (also for the sequential
    stochastic zeroth-order optimization algorithms). In this paper, we focus on
    the finite sums of smooth but not necessarily convex functions, and propose an
    asynchronous doubly stochastic zeroth-order optimization algorithm using the
    accelerated technology of variance reduction (AsyDSZOVR). Rigorous theoretical
    analysis show that the convergence rate can be improved from
    (O(frac{1}{sqrt{T}})) the best result of existing algorithms to
    (O(frac{1}{T})). Also our theoretical results is an improvement to the ones of
    the sequential stochastic zeroth-order optimization algorithms.

    Sparse Label Propagation

    Alexander Jung
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We consider massive heterogeneous datasets with intrinsic network structure,
    i.e., big data over networks. These datasets can be modelled by graph signals,
    which are defined over large-scale irregular graphs representing complex
    networks. We show that (semi-supervised) learning of the entire underlying
    graph signal based on incomplete information provided by few initial labels can
    be reduced to a compressed sensing recovery problem within the cosparse
    analysis model. This reduction provides two things: First, it allows to apply
    highly developed compressed sensing methods to the learning problem. In
    particular, by implementing a recent primal-dual method for convex
    optimization, we obtain a sparse label propagation algorithm. Moreover, by
    casting the learning problem within compressed sensing, we are able to derive
    sufficient conditions on the graph structure and available label information,
    such that sparse label propagation is accurate.

    Learning Adversary-Resistant Deep Neural Networks

    Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, C. Lee Giles, Xue Liu
    Subjects: Learning (cs.LG)

    Deep neural networks (DNNs) have proven to be quite effective in a vast array
    of machine learning tasks, with recent examples in cyber security and
    autonomous vehicles. Despite the superior performance of DNNs in these
    applications, it has been recently shown that these models are susceptible to a
    particular type of attack that exploits a fundamental flaw in their design.
    This attack consists of generating particular synthetic examples referred to as
    adversarial samples. These samples are constructed by slightly manipulating
    real data-points in order to “fool” the original DNN model, forcing it to
    mis-classify previously correctly classified samples with high confidence.
    Addressing this flaw in the model is essential if DNNs are to be used in
    critical applications such as those in cyber security. Previous work has
    provided various defense mechanisms by either augmenting the training set or
    enhancing model complexity. However, after a thorough analysis, we discover
    that DNNs protected by these defense mechanisms are still susceptible to
    adversarial samples, indicating that there are no theoretical guarantees of
    resistance provided by these mechanisms. To the best of our knowledge, we are
    the first to investigate this issue shared across previous research work and to
    propose a unifying framework for protecting DNN models by integrating a data
    transformation module with the DNN. More importantly, we provide a theoretical
    guarantee for protection under our proposed framework. We evaluate our method
    and several other existing solutions on both MNIST, CIFAR-10, and a malware
    dataset, to demonstrate the generality of our proposed method and its potential
    for handling cyber security applications. The results show that our framework
    provides better resistance compared to state-of-art solutions while
    experiencing negligible degradation in accuracy.

    Implicit Modeling — A Generalization of Discriminative and Generative Approaches

    Dmitrij Schlesinger, Carsten Rother
    Subjects: Learning (cs.LG)

    We propose a new modeling approach that is a generalization of generative and
    discriminative models. The core idea is to use an implicit parameterization of
    a joint probability distribution by specifying only the conditional
    distributions. The proposed scheme combines the advantages of both worlds — it
    can use powerful complex discriminative models as its parts, having at the same
    time better generalization capabilities. We thoroughly evaluate the proposed
    method for a simple classification task with artificial data and illustrate its
    advantages for real-word scenarios on a semantic image segmentation problem.

    A Nearly Optimal Contextual Bandit Algorithm

    Mohammadreza Mohaghegh Neyshabouri, Kaan Gokcesu, Selami Ciftci, Suleyman S. Kozat
    Subjects: Learning (cs.LG)

    We investigate the contextual multi-armed bandit problem in an adversarial
    setting and introduce an online algorithm that asymptotically achieves the
    performance of the best contextual bandit arm selection strategy under certain
    conditions. We show that our algorithm is highly efficient and provides
    significantly improved performance with a guaranteed performance upper bound in
    a strong mathematical sense. We have no statistical assumptions on the context
    vectors and the loss of the bandit arms, hence our results are guaranteed to
    hold even in adversarial environments. We use a tree notion in order to
    partition the space of context vectors in a nested structure. Using this tree,
    we construct a large class of context dependent bandit arm selection strategies
    and adaptively combine them to achieve the performance of the best strategy. We
    use the hierarchical nature of introduced tree to implement this combination
    with a significantly low computational complexity, thus our algorithm can be
    efficiently used in applications involving big data. Through extensive set of
    experiments involving synthetic and real data, we demonstrate significant
    performance gains achieved by the proposed algorithm with respect to the
    state-of-the-art adversarial bandit algorithms.

    Diagnostic Prediction Using Discomfort Drawings

    Cheng Zhang, Hedvig Kjellstrom, Bo C. Bertilson
    Comments: NIPS 2016 Workshop on Machine Learning for Health
    Subjects: Learning (cs.LG)

    In this paper, we explore the possibility to apply machine learning to make
    diagnostic predictions using discomfort drawings. A discomfort drawing is an
    intuitive way for patients to express discomfort and pain related symptoms.
    These drawings have proven to be an effective method to collect patient data
    and make diagnostic decisions in real-life practice. A dataset from real-world
    patient cases is collected for which medical experts provide diagnostic labels.
    Next, we extend a factorized multimodal topic model, Inter-Battery Topic Model
    (IBTM), to train a system that can make diagnostic predictions given an unseen
    discomfort drawing. Experimental results show reasonable predictions of
    diagnostic labels given an unseen discomfort drawing. The positive result
    indicates a significant potential of machine learning to be used for parts of
    the pain diagnostic process and to be a decision support system for physicians
    and other health care personnel.

    A One class Classifier based Framework using SVDD : Application to an Imbalanced Geological Dataset

    Soumi Chaki, Akhilesh Kumar Verma, Aurobinda Routray, William K. Mohanty, Mamata Jenamani
    Comments: presented at IEEE Students Technology Symposium (TechSym), 28 February to 2 March 2014, IIT Kharagpur, India. 6 pages, 7 figures, 2tables
    Subjects: Learning (cs.LG); Applications (stat.AP); Machine Learning (stat.ML)

    Evaluation of hydrocarbon reservoir requires classification of petrophysical
    properties from available dataset. However, characterization of reservoir
    attributes is difficult due to the nonlinear and heterogeneous nature of the
    subsurface physical properties. In this context, present study proposes a
    generalized one class classification framework based on Support Vector Data
    Description (SVDD) to classify a reservoir characteristic water saturation into
    two classes (Class high and Class low) from four logs namely gamma ray, neutron
    porosity, bulk density, and P sonic using an imbalanced dataset. A comparison
    is carried out among proposed framework and different supervised classification
    algorithms in terms of g metric means and execution time. Experimental results
    show that proposed framework has outperformed other classifiers in terms of
    these performance evaluators. It is envisaged that the classification analysis
    performed in this study will be useful in further reservoir modeling.

    Cryptocurrency Portfolio Management with Deep Reinforcement Learning

    Zhengyao Jiang, Jinjun Liang
    Subjects: Learning (cs.LG)

    We present a convolutional neural network with historic prices of a set of
    financial assets as its input, outputting portfolio weights of the set. The
    network is trained to achieve maximum accumulative return. A back-test trading
    experiment is conducted in a cryptocurrency market, achieving 10-fold returns
    in 1.8 month’s periods, outperforming many traditional portfolio management
    algorithms and benchmarks.

    Deep Symbolic Representation Learning for Heterogeneous Time-series Classification

    Shengdong Zhang, Soheil Bahrampour, Naveen Ramakrishnan, Mohak Shah
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we consider the problem of event classification with
    multi-variate time series data consisting of heterogeneous (continuous and
    categorical) variables. The complex temporal dependencies between the variables
    combined with sparsity of the data makes the event classification problem
    particularly challenging. Most state-of-art approaches address this either by
    designing hand-engineered features or breaking up the problem over homogeneous
    variates. In this work, we propose and compare three representation learning
    algorithms over symbolized sequences which enables classification of
    heterogeneous time-series data using a deep architecture. The proposed
    representations are trained jointly along with the rest of the network
    architecture in an end-to-end fashion that makes the learned features
    discriminative for the given task. Experiments on three real-world datasets
    demonstrate the effectiveness of the proposed approaches.

    Robust nonparametric nearest neighbor random process clustering

    Michael Tschannen, Helmut Bölcskei
    Comments: 13 pages, 4 figures
    Subjects: Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)

    We consider the problem of clustering noisy finite-length observations of
    stationary ergodic random processes according to their generative models
    without prior knowledge of the model statistics and the number of generative
    models. Two algorithms, both using the L1-distance between estimated power
    spectral densities (PSDs) as a measure of dissimilarity, are analyzed. The
    first one, termed nearest neighbor process clustering (NNPC) is new and relies
    on partitioning the nearest neighbor graph of the observations via spectral
    clustering. The second algorithm, simply referred to as k-means (KM), consists
    of a single k-means iteration with farthest point initialization and was
    considered before in the literature, albeit with a different dissimilarity
    measure and with asymptotic performance results only. We prove that both
    algorithms succeed with high probability in the presence of noise and missing
    entries, and even when the generative process PSDs overlap significantly, all
    provided that the observation length is sufficiently large. Our results
    quantify the tradeoff between the overlap of the generative process PSDs, the
    observation length, the fraction of missing entries, and the noise variance.
    Furthermore, we prove that treating the finite-length observations of
    stationary ergodic random processes as vectors in Euclidean space and
    clustering them using the thresholding-based subspace clustering (TSC)
    algorithm, the subspace clustering cousin of NNPC, results in performance
    strictly inferior to that of NNPC. We argue that the underlying cause is to be
    found in TSC employing spherical distance as dissimilarity measure, thereby
    ignoring the stationary process structure of the observations. Finally, we
    provide extensive numerical results for synthetic and real data and find that
    NNPC outperforms state-of-the-art algorithms in human motion sequence
    clustering.

    Learning to superoptimize programs – Workshop Version

    Rudy Bunel, Alban Desmaison, M. Pawan Kumar, Philip H.S.Torr, Pushmeet Kohli
    Comments: Workshop version for the NIPS NAMPI Workshop. Extended version at arXiv:1611.01787
    Subjects: Learning (cs.LG)

    Superoptimization requires the estimation of the best program for a given
    computational task. In order to deal with large programs, superoptimization
    techniques perform a stochastic search. This involves proposing a modification
    of the current program, which is accepted or rejected based on the improvement
    achieved. The state of the art method uses uniform proposal distributions,
    which fails to exploit the problem structure to the fullest. To alleviate this
    deficiency, we learn a proposal distribution over possible modifications using
    Reinforcement Learning. We provide convincing results on the superoptimization
    of “Hacker’s Delight” programs.

    Trained Ternary Quantization

    Chenzhuo Zhu, Song Han, Huizi Mao, William J. Dally
    Comments: Submitted to ICLR 17
    Subjects: Learning (cs.LG)

    Product reviews contain a lot of useful information about product features
    and customer opinions. One important product feature is the complementary
    entity (products) that may potentially work together with the reviewed product.
    Knowing complementary entities of the reviewed product is very important
    because customers want to buy compatible products and avoid incompatible ones.
    In this paper, we address the problem of Complementary Entity Recognition
    (CER). Since no existing method can solve this problem, we first propose a
    novel unsupervised method to utilize syntactic dependency paths to recognize
    complementary entities. Then we expand category-level domain knowledge about
    complementary entities using only a few general seed verbs on a large amount of
    unlabeled reviews. The domain knowledge helps the unsupervised method to adapt
    to different products and greatly improves the precision of the CER task. The
    advantage of the proposed method is that it does not require any labeled data
    for training. We conducted experiments on 7 popular products with about 1200
    reviews in total to demonstrate that the proposed approach is effective.

    Positive blood culture detection in time series data using a BiLSTM network

    Leen De Baets, Joeri Ruyssinck, Thomas Peiffer, Johan Decruyenaere, Filip De Turck, Femke Ongenae, Tom Dhaene
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

    The presence of bacteria or fungi in the bloodstream of patients is abnormal
    and can lead to life-threatening conditions. A computational model based on a
    bidirectional long short-term memory artificial neural network, is explored to
    assist doctors in the intensive care unit to predict whether examination of
    blood cultures of patients will return positive. As input it uses nine
    monitored clinical parameters, presented as time series data, collected from
    2177 ICU admissions at the Ghent University Hospital. Our main goal is to
    determine if general machine learning methods and more specific, temporal
    models, can be used to create an early detection system. This preliminary
    research obtains an area of 71.95% under the precision recall curve, proving
    the potential of temporal neural networks in this context.

    Success Probability of Exploration: a Concrete Analysis of Learning Efficiency

    Liangpeng Zhang, Ke Tang, Xin Yao
    Subjects: Learning (cs.LG)

    Exploration has been a crucial part of reinforcement learning, yet several
    important questions concerning exploration efficiency are still not answered
    satisfactorily by existing analytical frameworks. These questions include
    exploration parameter setting, situation analysis, and hardness of MDPs, all of
    which are unavoidable for practitioners. To bridge the gap between the theory
    and practice, we propose a new analytical framework called the success
    probability of exploration. We show that those important questions of
    exploration above can all be answered under our framework, and the answers
    provided by our framework meet the needs of practitioners better than the
    existing ones. More importantly, we introduce a concrete and practical approach
    to evaluating the success probabilities in certain MDPs without the need of
    actually running the learning algorithm. We then provide empirical results to
    verify our approach, and demonstrate how the success probability of exploration
    can be used to analyse and predict the behaviours and possible outcomes of
    exploration, which are the keys to the answer of the important questions of
    exploration.

    A Novel Framework based on SVDD to Classify Water Saturation from Seismic Attributes

    Soumi Chaki, Akhilesh Kumar Verma, Aurobinda Routray, William K. Mohanty, Mamata Jenamani
    Comments: 6 pages, 8 figures, 2table Presented at Fourth International Conference on Emerging Applications of Information Technology (EAIT 2014), ISI Kolkata, India
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Water saturation is an important property in reservoir engineering domain.
    Thus, satisfactory classification of water saturation from seismic attributes
    is beneficial for reservoir characterization. However, diverse and non-linear
    nature of subsurface attributes makes the classification task difficult. In
    this context, this paper proposes a generalized Support Vector Data Description
    (SVDD) based novel classification framework to classify water saturation into
    two classes (Class high and Class low) from three seismic attributes seismic
    impedance, amplitude envelop, and seismic sweetness. G-metric means and program
    execution time are used to quantify the performance of the proposed framework
    along with established supervised classifiers. The documented results imply
    that the proposed framework is superior to existing classifiers. The present
    study is envisioned to contribute in further reservoir modeling.

    A novel multiclassSVM based framework to classify lithology from well logs: a real-world application

    Soumi Chaki, Aurobinda Routray, William K. Mohanty, Mamata Jenamani
    Comments: 5 pages, 5 figures, 4 tables Presented at INDICON 2015 at New Delhi, India
    Subjects: Learning (cs.LG); Applications (stat.AP); Machine Learning (stat.ML)

    Support vector machines (SVMs) have been recognized as a potential tool for
    supervised classification analyses in different domains of research. In
    essence, SVM is a binary classifier. Therefore, in case of a multiclass
    problem, the problem is divided into a series of binary problems which are
    solved by binary classifiers, and finally the classification results are
    combined following either the one-against-one or one-against-all strategies. In
    this paper, an attempt has been made to classify lithology using a multiclass
    SVM based framework using well logs as predictor variables. Here, the lithology
    is classified into four classes such as sand, shaly sand, sandy shale and shale
    based on the relative values of sand and shale fractions as suggested by an
    expert geologist. The available dataset consisting well logs (gamma ray,
    neutron porosity, density, and P-sonic) and class information from four closely
    spaced wells from an onshore hydrocarbon field is divided into training and
    testing sets. We have used one-against-all strategy to combine the results of
    multiple binary classifiers. The reported results established the superiority
    of multiclass SVM compared to other classifiers in terms of classification
    accuracy. The selection of kernel function and associated parameters has also
    been investigated here. It can be envisaged from the results achieved in this
    study that the proposed framework based on multiclass SVM can further be used
    to solve classification problems. In future research endeavor, seismic
    attributes can be introduced in the framework to classify the lithology
    throughout a study area from seismic inputs.

    A Nonparametric Latent Factor Model For Location-Aware Video Recommendations

    Ehtsham Elahi
    Comments: NIPS 2016 Workshop on Practical Bayesian Nonparametrics
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We are interested in learning customers’ video preferences from their
    historic viewing patterns and geographical location. We consider a Bayesian
    latent factor modeling approach for this task. In order to tune the complexity
    of the model to best represent the data, we make use of Bayesian nonparameteric
    techniques. We describe an inference technique that can scale to large
    real-world data sets. Finally we show results obtained by applying the model to
    a large internal Netflix data set, that illustrates that the model was able to
    capture interesting relationships between viewing patterns and geographical
    location.

    Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

    Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Deep neural networks are powerful black box predictors that have recently
    achieved impressive performance on a wide spectrum of tasks. Quantifying
    predictive uncertainty in neural networks is a challenging and yet unsolved
    problem. Bayesian neural networks, which learn a distribution over weights, are
    currently the state-of-the-art for estimating predictive uncertainty; however
    these require significant modifications to the training procedure and are
    computationally expensive compared to standard (non-Bayesian) neural neural
    networks. We propose an alternative to Bayesian neural networks, that is simple
    to implement, readily parallelisable and yields high quality predictive
    uncertainty estimates. Through a series of experiments on classification and
    regression benchmarks, we demonstrate that our method produces well-calibrated
    uncertainty estimates which are as good or better than approximate Bayesian
    neural networks. Finally, we evaluate the predictive uncertainty on test
    examples from known and unknown classes, and show that our method is able to
    express higher degree of uncertainty on unknown classes, unlike existing
    methods which make overconfident predictions even on unknown classes.

    Support vector regression model for BigData systems

    Alessandro Maria Rizzi
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Performance (cs.PF)

    Nowadays Big Data are becoming more and more important. Many sectors of our
    economy are now guided by data-driven decision processes. Big Data and business
    intelligence applications are facilitated by the MapReduce programming model
    while, at infrastructural layer, cloud computing provides flexible and cost
    effective solutions for allocating on demand large clusters. In such systems,
    capacity allocation, which is the ability to optimally size minimal resources
    for achieve a certain level of performance, is a key challenge to enhance
    performance for MapReduce jobs and minimize cloud resource costs. In order to
    do so, one of the biggest challenge is to build an accurate performance model
    to estimate job execution time of MapReduce systems. Previous works applied
    simulation based models for modeling such systems. Although this approach can
    accurately describe the behavior of Big Data clusters, it is too
    computationally expensive and does not scale to large system. We try to
    overcome these issues by applying machine learning techniques. More precisely
    we focus on Support Vector Regression (SVR) which is intrinsically more robust
    w.r.t other techniques, like, e.g., neural networks, and less sensitive to
    outliers in the training set. To better investigate these benefits, we compare
    SVR to linear regression.

    High-Performance Distributed Machine Learning using Apache SPARK

    Celestine Dünner, Thomas Parnell, Kubilay Atasu, Manolis Sifalakis, Haralampos Pozidis
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    In this paper we compare the performance of distributed learning using Apache
    SPARK and MPI by implementing a distributed linear learning algorithm from
    scratch on the two programming frameworks. We then explore the performance gap
    and show how SPARK learning can be accelerated, by reducing computational cost
    as well as communication-related overheads, to reduce the relative loss in
    performance versus MPI from 20x to 2x. With these different implementations at
    hand, we will illustrate how the optimal parameters of the algorithm depend
    strongly on the characteristics of the framework on which it is executed. We
    will show that carefully tuning a distributed algorithm to trade-off
    communication and computation can improve performance by orders of magnitude.
    Hence, understanding system aspects of the framework and their implications,
    and then correctly adapting the algorithm proves to be the key to performance.

    Extracting Implicit Social Relation for Social Recommendation Techniques in User Rating Prediction

    Seyed Mohammad Taheri, Hamidreza Mahyar, Mohammad Firouzi, Ali Movaghar
    Subjects: Social and Information Networks (cs.SI); Learning (cs.LG)

    Recommendation plays an increasingly important role in our daily lives.
    Recommender systems automatically suggest items to users that might be
    interesting for them. Recent studies illustrate that incorporating social trust
    in Matrix Factorization methods demonstrably improves accuracy of rating
    prediction. Such approaches mainly use the trust scores explicitly expressed by
    users. However, it is often challenging to have users provide explicit trust
    scores of each other. There exist quite a few works, which propose Trust
    Metrics to compute and predict trust scores between users based on their
    interactions. In this paper, first we present how social relation can be
    extracted from users’ ratings to items by describing Hellinger distance between
    users in recommender systems. Then, we propose to incorporate the predicted
    trust scores into social matrix factorization models. By analyzing social
    relation extraction from three well-known real-world datasets, which both:
    trust and recommendation data available, we conclude that using the implicit
    social relation in social recommendation techniques has almost the same
    performance compared to the actual trust scores explicitly expressed by users.
    Hence, we build our method, called Hell-TrustSVD, on top of the
    state-of-the-art social recommendation technique to incorporate both the
    extracted implicit social relations and ratings given by users on the
    prediction of items for an active user. To the best of our knowledge, this is
    the first work to extend TrustSVD with extracted social trust information. The
    experimental results support the idea of employing implicit trust into matrix
    factorization whenever explicit trust is not available, can perform much better
    than the state-of-the-art approaches in user rating prediction.

    Ranking Biomarkers Through Mutual Information

    Konstantinos Sechidis, Emily Turner, Paul D. Metcalfe, James Weatherall, Gavin Brown
    Comments: Accepted at NIPS 2016 Workshop on Machine Learning for Health
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Applications (stat.AP)

    We study information theoretic methods for ranking biomarkers. In clinical
    trials there are two, closely related, types of biomarkers: predictive and
    prognostic, and disentangling them is a key challenge. Our first step is to
    phrase biomarker ranking in terms of optimizing an information theoretic
    quantity. This formalization of the problem will enable us to derive rankings
    of predictive/prognostic biomarkers, by estimating different, high dimensional,
    conditional mutual information terms. To estimate these terms, we suggest
    efficient low dimensional approximations, and we derive an empirical Bayes
    estimator, which is suitable for small or sparse datasets. Finally, we
    introduce a new visualisation tool that captures the prognostic and the
    predictive strength of a set of biomarkers. We believe this representation will
    prove to be a powerful tool in biomarker discovery.

    Message Passing Multi-Agent GANs

    Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri
    Comments: The first 2 authors contributed equally for this work
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Communicating and sharing intelligence among agents is an important facet of
    achieving Artificial General Intelligence. As a first step towards this
    challenge, we introduce a novel framework for image generation: Message Passing
    Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have
    recently been shown to be very effective for image generation and other tasks,
    these networks have been limited to mostly single generator-discriminator
    networks. We show that we can obtain multi-agent GANs that communicate through
    message passing to achieve better image generation. The objectives of the
    individual agents in this framework are two fold: a co-operation objective and
    a competing objective. The co-operation objective ensures that the message
    sharing mechanism guides the other generator to generate better than itself
    while the competing objective encourages each generator to generate better than
    its counterpart. We analyze and visualize the messages that these GANs share
    among themselves in various scenarios. We quantitatively show that the message
    sharing formulation serves as a regularizer for the adversarial training.
    Qualitatively, we show that the different generators capture different traits
    of the underlying data distribution.

    Deep Image Category Discovery using a Transferred Similarity Function

    Yen-Chang Hsu, Zhaoyang Lv, Zsolt Kira
    Comments: 13 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Automatically discovering image categories in unlabeled natural images is one
    of the important goals of unsupervised learning. However, the task is
    challenging and even human beings define visual categories based on a large
    amount of prior knowledge. In this paper, we similarly utilize prior knowledge
    to facilitate the discovery of image categories. We present a novel end-to-end
    network to map unlabeled images to categories as a clustering network. We
    propose that this network can be learned with contrastive loss which is only
    based on weak binary pair-wise constraints. Such binary constraints can be
    learned from datasets in other domains as transferred similarity functions,
    which mimic a simple knowledge transfer. We first evaluate our experiments on
    the MNIST dataset as a proof of concept, based on predicted similarities
    trained on Omniglot, showing a 99\% accuracy which significantly outperforms
    clustering based approaches. Then we evaluate the discovery performance on
    Cifar-10, STL-10, and ImageNet, which achieves both state-of-the-art accuracy
    and shows it can be scalable to various large natural images.

    Known Unknowns: Uncertainty Quality in Bayesian Neural Networks

    Ramon Oliveira, Pedro Tabacof, Eduardo Valle
    Comments: Workshop on Bayesian Deep Learning, NIPS 2016, Barcelona, Spain
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We evaluate the uncertainty quality in neural networks using anomaly
    detection. We extract uncertainty measures (e.g. entropy) from the predictions
    of candidate models, use those measures as features for an anomaly detector,
    and gauge how well the detector differentiates known from unknown classes. We
    assign higher uncertainty quality to candidate models that lead to better
    detectors. We also propose a novel method for sampling a variational
    approximation of a Bayesian neural network, called One-Sample Bayesian
    Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We
    compare the following candidate neural network models: Maximum Likelihood,
    Bayesian Dropout, OSBA, and — for MNIST — the standard variational
    approximation. We show that Bayesian Dropout and OSBA provide better
    uncertainty information than Maximum Likelihood, and are essentially equivalent
    to the standard variational approximation, but much faster.

    Learnable Structured Clustering Framework for Deep Metric Learning

    Hyun Oh Song, Stefanie Jegelka, Vivek Rathod, Kevin Murphy
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Learning the representation and the similarity metric in an end-to-end
    fashion with deep networks have demonstrated outstanding results for clustering
    and retrieval. However, these recent approaches still suffer from the
    performance degradation stemming from the local metric training procedure which
    is unaware of the global structure of the embedding space.

    We propose a global metric learning scheme for optimizing the deep metric
    embedding with the learnable clustering function and the clustering metric
    (NMI) in a novel structured prediction framework.

    Our experiments on CUB200-2011, Cars196, and Stanford online products
    datasets show state of the art performance both on the clustering and retrieval
    tasks measured in the NMI and Recall@K evaluation metrics.

    Optimal and Adaptive Off-policy Evaluation in Contextual Bandits

    Yu-Xiang Wang, Alekh Agarwal, Miroslav Dudik
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We consider the problem of off-policy evaluation—estimating the value of a
    target policy using data collected by another policy—under the contextual
    bandit model. We establish a minimax lower bound on the mean squared error
    (MSE), and show that it is matched up to constant factors by the inverse
    propensity scoring (IPS) estimator. Since in the multi-armed bandit problem the
    IPS is suboptimal (Li et. al, 2015), our result highlights the difficulty of
    the contextual setting with non-degenerate context distributions. We further
    consider improvements on this minimax MSE bound, given access to a reward
    model. We show that the existing doubly robust approach, which utilizes such a
    reward model, may continue to suffer from high variance even when the reward
    model is perfect. We propose a new estimator called SWITCH which more
    effectively uses the reward model and achieves a superior bias-variance
    tradeoff compared with prior work. We prove an upper bound on its MSE and
    demonstrate its benefits empirically on a diverse collection of datasets, often
    seeing orders of magnitude improvements over a number of baselines.

    Intra-day Activity Better Predicts Chronic Conditions

    Tom Quisel, David C. Kale, Luca Foschini
    Comments: Presented at the NIPS 2016 Workshop on Machine Learning for Health
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    In this work we investigate intra-day patterns of activity on a population of
    7,261 users of mobile health wearable devices and apps. We show that: (1) using
    intra-day step and sleep data recorded from passive trackers significantly
    improves classification performance on self-reported chronic conditions related
    to mental health and nervous system disorders, (2) Convolutional Neural
    Networks achieve top classification performance vs. baseline models when
    trained directly on multivariate time series of activity data, and (3) jointly
    predicting all condition classes via multi-task learning can be leveraged to
    extract features that generalize across data sets and achieve the highest
    classification performance.

    Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version)

    Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao
    Comments: Published in NAMPI workshop at NIPS 2016. Short version of arXiv:1611.00020
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Extending the success of deep neural networks to natural language
    understanding and symbolic reasoning requires complex operations and external
    memory. Recent neural program induction approaches have attempted to address
    this problem, but are typically limited to differentiable memory, and
    consequently cannot scale beyond small synthetic tasks. In this work, we
    propose the Manager-Programmer-Computer framework, which integrates neural
    networks with non-differentiable memory to support abstract, scalable and
    precise operations through a friendly neural computer interface. Specifically,
    we introduce a Neural Symbolic Machine, which contains a sequence-to-sequence
    neural “programmer”, and a non-differentiable “computer” that is a Lisp
    interpreter with code assist. To successfully apply REINFORCE for training, we
    augment it with approximate gold programs found by an iterative maximum
    likelihood training process. NSM is able to learn a semantic parser from weak
    supervision over a large knowledge base. It achieves new state-of-the-art
    performance on WebQuestionsSP, a challenging semantic parsing dataset, with
    weak supervision. Compared to previous approaches, NSM is end-to-end, therefore
    does not rely on feature engineering or domain specific knowledge.

    On the propriety of restricted Boltzmann machines

    Andee Kaplan, Daniel Nordman, Stephen Vardeman
    Comments: 35 pages, 13 figures
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    A restricted Boltzmann machine (RBM) is an undirected graphical model
    constructed for discrete or continuous random variables, with two layers, one
    hidden and one visible, and no conditional dependency within a layer. In recent
    years, RBMs have risen to prominence due to their connection to deep learning.
    By treating a hidden layer of one RBM as the visible layer in a second RBM, a
    deep architecture can be created. RBMs are thought to thereby have the ability
    to encode very complex and rich structures in data, making them attractive for
    supervised learning. However, the generative behavior of RBMs is largely
    unexplored. In this paper, we discuss the relationship between RBM parameter
    specification in the binary case and the tendency to undesirable model
    properties such as degeneracy, instability and uninterpretability. We also
    describe the difficulties that arise in likelihood-based and Bayes fitting of
    such (highly flexible) models, especially as Gibbs sampling (quasi-Bayes)
    methods are often advocated for the RBM model structure.

    Deep Learning of Robotic Tasks using Strong and Weak Human Supervision

    Bar Hilleli, Ran El-Yaniv
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)

    We propose a scheme for training a computerized agent to perform complex
    human tasks such as highway steering. The scheme resembles natural
    teaching-learning processes used by humans to teach themselves and each other
    complex tasks, and consists of the following four stages. In the first stage
    the agent learns by itself an informative low-dimensional representations of
    raw input signals in an unsupervised learning manner. In the second stage the
    agent learns to mimic the human instructor using supervised learning so as to
    reach a basic performance level; the third stage is devoted to learning an
    instantaneous reward model. Here, the (human) instructor observes (possibly in
    real time) the agent performing the task and provides reward feedback. During
    this stage the agent monitors both itself and the instructor feedback and
    learns a reward model using supervised learning. This stage terminates when the
    reward model is sufficiently accurate. In the last stage a reinforcement
    learning algorithm is deployed to optimize the agent policy. The guidance
    reward signal in the reinforcement learning algorithm relies on the previously
    learned reward model. As a proof of concept for the proposed scheme, we
    designed a system consisting of deep convolutional neural networks, and applied
    it to successfully learn a computerized agent capable of autonomous highway
    steering over the well-known racing game Assetto Corsa.

    Enhancing Use Case Points Estimation Method Using Soft Computing Techniques

    Ali Bou Nassif, Luiz Fernando Capretz, Danny Ho
    Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Software estimation is a crucial task in software engineering. Software
    estimation encompasses cost, effort, schedule, and size. The importance of
    software estimation becomes critical in the early stages of the software life
    cycle when the details of software have not been revealed yet. Several
    commercial and non-commercial tools exist to estimate software in the early
    stages. Most software effort estimation methods require software size as one of
    the important metric inputs and consequently, software size estimation in the
    early stages becomes essential. One of the approaches that has been used for
    about two decades in the early size and effort estimation is called use case
    points. Use case points method relies on the use case diagram to estimate the
    size and effort of software projects. Although the use case points method has
    been widely used, it has some limitations that might adversely affect the
    accuracy of estimation. This paper presents some techniques using fuzzy logic
    and neural networks to improve the accuracy of the use case points method.
    Results showed that an improvement up to 22% can be obtained using the proposed
    approach.

    Algorithmic Songwriting with ALYSIA

    Margareta Ackerman, David Loker
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)

    Support vector machines (SVMs) have been recognized as a potential tool for
    supervised classification analyses in different domains of research. In
    essence, SVM is a binary classifier. Therefore, in case of a multiclass
    problem, the problem is divided into a series of binary problems which are
    solved by binary classifiers, and finally the classification results are
    combined following either the one-against-one or one-against-all strategies. In
    this paper, an attempt has been made to classify lithology using a multiclass
    SVM based framework using well logs as predictor variables. Here, the lithology
    is classified into four classes such as sand, shaly sand, sandy shale and shale
    based on the relative values of sand and shale fractions as suggested by an
    expert geologist. The available dataset consisting well logs (gamma ray,
    neutron porosity, density, and P-sonic) and class information from four closely
    spaced wells from an onshore hydrocarbon field is divided into training and
    testing sets. We have used one-against-all strategy to combine the results of
    multiple binary classifiers. The reported results established the superiority
    of multiclass SVM compared to other classifiers in terms of classification
    accuracy. The selection of kernel function and associated parameters has also
    been investigated here. It can be envisaged from the results achieved in this
    study that the proposed framework based on multiclass SVM can further be used
    to solve classification problems. In future research endeavor, seismic
    attributes can be introduced in the framework to classify the lithology
    throughout a study area from seismic inputs.

    Modeling trajectories of mental health: challenges and opportunities

    Lauren Erdman, Ekansh Sharma, Eva Unternahrer, Shantala Hari Dass, Kieran ODonnell, Sara Mostafavi, Rachel Edgar, Michael Kobor, Helene Gaudreau, Michael Meaney, Anna Goldenberg
    Comments: extended abstract for ML4HC at NIPS 2016, 4 pages
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Applications (stat.AP)

    More than two thirds of mental health problems have their onset during
    childhood or adolescence. Identifying children at risk for mental illness later
    in life and predicting the type of illness is not easy. We set out to develop a
    platform to define subtypes of childhood social-emotional development using
    longitudinal, multifactorial trait-based measures. Subtypes discovered through
    this study could ultimately advance psychiatric knowledge of the early
    behavioural signs of mental illness. To this extent we have examined two types
    of models: latent class mixture models and GP-based models. Our findings
    indicate that while GP models come close in accuracy of predicting future
    trajectories, LCMMs predict the trajectories as well in a fraction of the time.
    Unfortunately, neither of the models are currently accurate enough to lead to
    immediate clinical impact. The available data related to the development of
    childhood mental health is often sparse with only a few time points measured
    and require novel methods with improved efficiency and accuracy.

    Large scale modeling of antimicrobial resistance with interpretable classifiers

    Alexandre Drouin, Frédéric Raymond, Gaël Letarte St-Pierre, Mario Marchand, Jacques Corbeil, François Laviolette
    Comments: Peer-reviewed and accepted for presentation at the Machine Learning for Health Workshop, NIPS 2016, Barcelona, Spain
    Subjects: Genomics (q-bio.GN); Learning (cs.LG); Machine Learning (stat.ML)

    Antimicrobial resistance is an important public health concern that has
    implications in the practice of medicine worldwide. Accurately predicting
    resistance phenotypes from genome sequences shows great promise in promoting
    better use of antimicrobial agents, by determining which antibiotics are likely
    to be effective in specific clinical cases. In healthcare, this would allow for
    the design of treatment plans tailored for specific individuals, likely
    resulting in better clinical outcomes for patients with bacterial infections.
    In this work, we present the recent work of Drouin et al. (2016) on using Set
    Covering Machines to learn highly interpretable models of antibiotic resistance
    and complement it by providing a large scale application of their method to the
    entire PATRIC database. We report prediction results for 36 new datasets and
    present the Kover AMR platform, a new web-based tool allowing the visualization
    and interpretation of the generated models.

    Transformation Function Based Methods for Model Shift

    Simon Shaolei Du, Jayanth Koushik, Aarti Singh, Barnabas Poczos
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Transfer learning techniques are often used when one tries to adapt a model
    learned from a source domain with abundant labeled samples to the target domain
    with limited labeled samples. In this paper, we consider the regression problem
    under model shift condition, i.e., regression functions are different but
    related in the source and target domains. We approach this problem through the
    use of transformation functions which characterize the relation between the
    source and the target domain. These transformation functions are able to
    transform the original problem of learning the complicated regression function
    of target domain into a problem of learning a simple auxiliary function. This
    transformation function based technique includes some previous works as special
    cases, but the class we propose is significantly more general. In this work we
    consider two widely used non-parametric estimators, Kernel Smoothing (KS) and
    Kernel Ridge Regression (KRR) for this setting and show improved statistical
    rates for excess risk than non-transfer learning. Through an (epsilon)-cover
    technique, we show that we can find the best transformation function a function
    class. Lastly, experiments on synthesized, robotics and neural imaging data
    demonstrate the effectiveness of our framework.

    Estimating latent feature-feature interactions in large feature-rich graphs

    Corrado Monti, Paolo Boldi
    Subjects: Social and Information Networks (cs.SI); Learning (cs.LG); Machine Learning (stat.ML)

    Real-world complex networks describe connections between objects; in reality,
    those objects are often endowed with some kind of features. How does the
    presence or absence of such features interplay with the network link structure?
    Although the situation here described is truly ubiquitous, there is a limited
    body of research dealing with large graphs of this kind. Many previous works
    considered homophily as the only possible transmission mechanism translating
    node features into links. Other authors, instead, developed more sophisticated
    models, that are able to handle complex feature interactions, but are unfit to
    scale to very large networks. We expand on the MGJ model, where interactions
    between pairs of features can foster or discourage link formation. In this
    work, we will investigate how to estimate the latent feature-feature
    interactions in this model. We shall propose two solutions: the first one
    assumes feature independence and it is essentially based on Naive Bayes; the
    second one, which relaxes the independence assumption assumption, is based on
    perceptrons. In fact, we show it is possible to cast the model equation in
    order to see it as the prediction rule of a perceptron. We analyze how
    classical results for the perceptrons can be interpreted in this context; then,
    we define a fast and simple perceptron-like algorithm for this task, which can
    process (10^8) links in minutes. We then compare these two techniques, first
    with synthetic datasets that follows our model, gaining evidence that the Naive
    independence assumptions are detrimental in practice. Secondly, we consider a
    real, large-scale citation network where each node (i.e., paper) can be
    described by different types of characteristics; there, our algorithm can
    assess how well each set of features can explain the links, and thus finding
    meaningful latent feature-feature interactions.

    End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

    Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Natural language understanding and dialogue policy learning are both
    essential in conversational systems that predict the next system actions in
    response to a current user utterance. Conventional approaches aggregate
    separate models of natural language understanding (NLU) and system action
    prediction (SAP) as a pipeline that is sensitive to noisy outputs of
    error-prone NLU. To address the issues, we propose an end-to-end deep recurrent
    neural network with limited contextual dialogue memory by jointly training NLU
    and SAP on DSTC4 multi-domain human-human dialogues. Experiments show that our
    proposed model significantly outperforms the state-of-the-art pipeline models
    for both NLU and SAP, which indicates that our joint model is capable of
    mitigating the affects of noisy NLU outputs, and NLU model can be refined by
    error flows backpropagating from the extra supervised signals of system
    actions.

    Parameter Compression of Recurrent Neural Networks and Degredation of Short-term Memory

    Jonathan A. Cox
    Comments: Submitted to IJCNN 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    The significant computational costs of deploying neural networks in
    large-scale or resource constrained environments, such as data centers and
    mobile devices, has spurred interest in model compression, which can achieve a
    reduction in both arithmetic operations and storage memory. Several techniques
    have been proposed for reducing or compressing the parameters for feed-forward
    and convolutional neural networks, but less is understood about the effect of
    parameter compression on recurrent neural networks (RNN). In particular, the
    extent to which the recurrent parameters can be compressed and the impact on
    short-term memory performance, is not well understood. In this paper, we study
    the effect of complexity reduction, through singular value decomposition rank
    reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several
    tasks. We show that considerable rank reduction is possible when compressing
    recurrent weights, even without fine tuning. Furthermore, we propose a
    perturbation model for the effect of general perturbations, such as a
    compression, on the recurrent parameters of RNNs. The model is tested against a
    noiseless memorization experiment that elucidates the short-term memory
    performance. In this way, we demonstrate that the effect of compression of
    recurrent parameters is dependent on the degree of temporal coherence present
    in the data and task. This work can guide on-the-fly RNN compression for novel
    environments or tasks, and provides insight for applying RNN compression in
    low-power devices, such as hearing aids.


    Information Theory

    Capacity Regions of Two-Receiver Broadcast Erasure Channels with Feedback and Memory

    Michael Heindlmaier, Shirin Saeedi Bidokhti
    Subjects: Information Theory (cs.IT)

    The two-receiver broadcast packet erasure channel with feedback and memory is
    studied. Memory is modeled using a finite- state Markov chain representing a
    channel state. Two scenarios are considered: (i) when the transmitter has
    causal knowledge of the channel state (i.e., the state is visible), and (ii)
    when the channel state is unknown at the transmitter, but observations of it
    are available at the transmitter through feedback (i.e., the state is hidden).
    In both scenarios, matching outer and inner bounds are derived on the rates of
    communication and the capacity region is determined. It is shown that similar
    results carry over to channels with memory and delayed feedback, and memoryless
    compound channels with feedback. When the state is visible, the capacity region
    has a single-letter characterization and is in terms of a linear program. Two
    optimal coding schemes are devised that use feedback to keep track of the
    sent/received packets via a network of queues: a probabilistic scheme and a
    deterministic backpressure-like algorithm. The former bases its decisions
    solely on the past channel state information and the latter follows a
    max-weight queue-based policy. The performance of the algorithms are analyzed
    using the frameworks of rate-stability in networks of queues, max-flow min-cut
    duality in networks, and finite-horizon Lyapunov drift analysis. When the state
    is hidden, the capacity region does not have a single-letter characterization
    and is, in this sense, uncomputable. Approximations of the capacity region are
    provided and two optimal coding algorithms are outlined. The first algorithm is
    a probabilistic coding scheme that bases its decisions on the past L feedback
    sequences and its achievable rate-region approaches the capacity region
    exponentially fast in L. The second algorithm is a backpressure-like algorithm
    that performs optimally in the long run.

    Approximate Support Recovery of Atomic Line Spectral Estimation: A Tale of Resolution and Precision

    Qiuwei Li, Gongguo Tang
    Subjects: Information Theory (cs.IT); Optimization and Control (math.OC)

    This work investigates the parameter estimation performance of
    super-resolution line spectral estimation using atomic norm minimization. The
    focus is on analyzing the algorithm’s accuracy of inferring the frequencies and
    complex magnitudes from noisy observations. When the Signal-to-Noise Ratio is
    reasonably high and the true frequencies are separated by (O(frac{1}{n})), the
    atomic norm estimator is shown to localize the correct number of frequencies,
    each within a neighborhood of size (O(sqrt{frac{log n}{n^3}} sigma)) of one
    of the true frequencies. Here (n) is half the number of temporal samples and
    (sigma^2) is the Gaussian noise variance. The analysis is based on a
    primal-dual witness construction procedure. The obtained error bound matches
    the Cram’er-Rao lower bound up to a logarithmic factor. The relationship
    between resolution (separation of frequencies) and precision or accuracy of the
    estimator is highlighted. Our analysis also reveals that the atomic norm
    minimization can be viewed as a convex way to solve a (ell_1)-norm
    regularized, nonlinear and nonconvex least-squares problem to global
    optimality.

    Repairing Reed-Solomon Codes With Multiple Erasures

    Hoang Dau, Iwan Duursma, Han Mao Kiah, Olgica Milenkovic
    Comments: 16 pages
    Subjects: Information Theory (cs.IT)

    Despite their exceptional error-correcting properties, Reed-Solomon (RS)
    codes have been overlooked in distributed storage applications due to the
    common belief that they have poor repair bandwidth: A naive repair approach
    would require the whole file to be reconstructed in order to recover a single
    erased codeword symbol. In a recent work, Guruswami and Wootters (STOC’16)
    proposed a single-erasure repair method for RS codes that achieves the optimal
    repair bandwidth amongst all linear encoding schemes. Their key idea is to
    recover the erased symbol by collecting a sufficiently large number of its
    traces, each of which can be constructed from a number of traces of other
    symbols. As all traces belong to a subfield of the defining field of the RS
    code and many of them are linearly dependent, the total repair bandwidth is
    significantly reduced compared to that of the naive repair scheme. We extend
    the trace collection technique to cope with multiple erasures.

    Rate-Compatible Punctured Polar Codes: Optimal Construction Based on Polar Spectra

    Kai Niu, Jincheng Dai, Kai Chen, Jiaru Lin, Q. T. Zhang, Athanasios V. Vasilakos
    Comments: Submitted to IEEE Transactions on Information Theory
    Subjects: Information Theory (cs.IT)

    Polar codes are the first class of constructive channel codes achieving the
    symmetric capacity of the binary-input discrete memoryless channels. But the
    corresponding code length is limited to the power of two. In this paper, we
    establish a systematic framework to design the rate-compatible punctured polar
    (RCPP) codes with arbitrary code length. A new theoretic tool, called polar
    spectra, is proposed to count the number of paths on the code tree with the
    same number of zeros or ones respectively. Furthermore, a spectrum distance SD0
    (SD1) and a joint spectrum distance (JSD) are presented as performance criteria
    to optimize the puncturing tables. For the capacity-zero puncturing mode
    (punctured bits are unknown to the decoder), we propose a quasi-uniform
    puncturing algorithm, analyze the number of equivalent puncturings and prove
    that this scheme can maximize SD1 and JSD. Similarly, for the capacity-one mode
    (punctured bits are known to the decoder), we also devise a reversal
    quasi-uniform puncturing scheme and prove that it has the maximum SD0 and JSD.
    Both schemes have a universal puncturing table without any exhausted search.
    These optimal RCPP codes outperform the performance of turbo codes in LTE
    wireless communication systems.

    On the Capacity of Discrete-Time Laguerre Optical Channels

    Hossein Esmaeili, Jawad Salehi
    Subjects: Information Theory (cs.IT)

    In this paper, new upper and lower bounds are proposed for the capacity of
    discrete-time Laguerre channel. Laguerre behavior is used to model various
    types of optical systems and networks such as optical amplifiers, short
    distance visible light communication systems with direct detection and coherent
    code division multiple access (CDMA) networks. Bounds are derived for short
    distance visible light communication systems and coherent CDMA networks. These
    bounds are separated in three main cases: when both average and peak power
    constraints are imposed, when peak power constraint is inactive and when only
    peak power constraint is active.

    In-network Compression for Multiterminal Cascade MIMO Systems

    Inaki Estella Aguerri, Abdellatif Zaidi
    Comments: Submitted to IEEE Transactions on Communications
    Subjects: Information Theory (cs.IT)

    We study the problem of receive beamforming in uplink cascade multiple-input
    multiple-output (MIMO) systems as an instance of that of cascade multiterminal
    source coding for lossy function computation. Using this connection, we develop
    two coding schemes for the second and show that their application leads to
    efficient beamforming schemes for the first. In the first coding scheme, each
    terminal in the cascade sends a description of the source that it observes; the
    decoder reconstructs all sources, lossily, and then computes an estimate of the
    desired function. This scheme improves upon standard routing in that every
    terminal only compresses the innovation of its source w.r.t. the descriptions
    that are sent by the previous terminals in the cascade. In the second scheme,
    the desired function is computed gradually in the cascade network, and each
    terminal sends a finer description of it. In the context of uplink cascade MIMO
    systems, the application of these two schemes leads to efficient centralized
    receive-beamforming and distributed receive-beamforming, respectively.

    Towards a Tractable Delay Analysis in Large Wireless Networks

    Yi Zhong, Martin Haenggi, Fu-Chun Zheng, Wenyi Zhang, Tony Q.S. Quek, Weili Nie
    Comments: 17 pages, 4 figures, 1 table, submitted to IEEE Communications Magazine
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Meeting the diverse delay requirements of next-generation wireless
    communication networks is one of the most critical goals of wireless system
    design. Though the delay of point-to-point communications has been well
    investigated using classical queueing theory, the delay of multi-point to
    multi-point communications has not been explored in depth. The main technical
    difficulty lies in the interacting queues problem, in which the service rate is
    coupled with the statuses of other queues. In this article, we elaborate on the
    main challenges of delay analysis in large wireless networks. Several promising
    approaches to bypass these difficulties are proposed and summarized to provide
    useful guidance.

    Cache-Enabled Physical-Layer Security for Video Streaming in Wireless Networks with Limited Backhaul

    Lin Xiang, Derrick Wing Kwan Ng, Robert Schober, Vincent W.S. Wong
    Comments: Accepted for presentation at IEEE Globecom 2016, Washington, DC, Dec. 2016
    Subjects: Information Theory (cs.IT)

    In this paper, we investigate for the first time the benefits of wireless
    caching for the physical layer security (PLS) of wireless networks. In
    particular, a caching scheme enabling power-efficient PLS is proposed for
    cellular video streaming with constrained backhaul capacity. By sharing video
    data across a subset of base stations (BSs) through both caching and backhaul
    loading, secure cooperative transmission of several BSs is dynamically enabled
    in accordance with the cache status, the channel conditions, and the backhaul
    capacity. Thereby, caching reduces the data sharing overhead over the
    capacity-constrained backhaul links. More importantly, caching introduces
    additional secure degrees of freedom and enables a power-efficient design. We
    investigate the optimal caching and transmission policies for minimizing the
    total transmit power while providing quality of service (QoS) and guaranteeing
    secrecy during video delivery. A two-stage non-convex mixed-integer
    optimization problem is formulated, which optimizes the caching policy in an
    offline video caching stage and the cooperative transmission policy in an
    online video delivery stage. As the problem is NP-hard, suboptimal
    polynomial-time algorithms are proposed for low-complexity cache training and
    delivery control, respectively. Sufficient optimality conditions, under which
    the proposed schemes attain global optimal solutions, are also provided.
    Simulation results show that the proposed schemes achieve low secrecy outage
    probability and high power efficiency simultaneously.

    Vector Approximate Message Passing for the Generalized Linear Model

    Philip Schniter, Sundeep Rangan, Alyson K. Fletcher
    Subjects: Information Theory (cs.IT)

    The generalized linear model (GLM), where a random vector (oldsymbol{x}) is
    observed through a noisy, possibly nonlinear, function of a linear transform
    output (oldsymbol{z}=oldsymbol{Ax}), arises in a range of applications such
    as robust regression, binary classification, quantized compressed sensing,
    phase retrieval, photon-limited imaging, and inference from neural spike
    trains. When (oldsymbol{A}) is large and i.i.d. Gaussian, the generalized
    approximate message passing (GAMP) algorithm is an efficient means of MAP or
    marginal inference, and its performance can be rigorously characterized by a
    scalar state evolution. For general (oldsymbol{A}), though, GAMP can
    misbehave. Damping and sequential-updating help to robustify GAMP, but their
    effects are limited. Recently, a “vector AMP” (VAMP) algorithm was proposed for
    additive white Gaussian noise channels. VAMP extends AMP’s guarantees from
    i.i.d. Gaussian (oldsymbol{A}) to the larger class of rotationally invariant
    (oldsymbol{A}). In this paper, we show how VAMP can be extended to the GLM.
    Numerical experiments show that the proposed GLM-VAMP is much more robust to
    ill-conditioning in (oldsymbol{A}) than damped GAMP.

    Onsager-Corrected Deep Networks for Sparse Linear Inverse Problems

    Mark Borgerding, Philip Schniter
    Comments: arXiv admin note: text overlap with arXiv:1607.05966
    Subjects: Information Theory (cs.IT)

    Deep learning has gained great popularity due to its widespread success on
    many inference problems. We consider the application of deep learning to the
    sparse linear inverse problem encountered in compressive sensing, where one
    seeks to recover a sparse signal from a few noisy linear measurements. In this
    paper, we propose two novel neural-network architectures that decouple
    prediction errors across layers in the same way that the approximate message
    passing (AMP) algorithms decouple them across iterations: through Onsager
    correction. We show numerically that our “learned AMP” network significantly
    improves upon Gregor and LeCun’s “learned ISTA” when both use soft-thresholding
    shrinkage. We then show that additional improvements result from jointly
    learning the shrinkage functions together with the linear transforms. Finally,
    we propose a network design inspired by an unfolding of the recently proposed
    “vector AMP” (VAMP) algorithm, and show that it outperforms all previously
    considered networks. Interestingly, the linear transforms and shrinkage
    functions prescribed by VAMP coincide with the values learned through
    backpropagation, yielding an intuitive explanation for the design of this deep
    network.

    Novel Delivery Schemes for Decentralized Coded Caching in the Finite File Size Regime

    Kai Wan, Daniela Tuninetti, Pablo Piantanida
    Comments: 6 pages, 1 figure, submitted to ICC 2017-WT10
    Subjects: Information Theory (cs.IT)

    This paper analyzes the achievable tradeoff between cache~size and
    download~rate in decentralized caching systems with the uncoded cache placement
    originally proposed by Maddah-Ali and Niesen. It proposes two novel delivery
    schemes that take advantage of the multicasting opportunities that arise when a
    file is demanded by multiple users. These delivery schemes are extensions of
    known ones to the regime where the file size is finite. Numerical evaluations
    for the case of file uniform popularity show that the proposed schemes
    outperform previous ones for all value of the cache size.

    On the Performance of Visible Light Communications Systems with Non-Orthogonal Multiple Access

    Hanaa Marshoud, Paschalis C. Sofotasios, Sami Muhaidat, George K. Karagiannidis
    Subjects: Information Theory (cs.IT)

    Visible light communications (VLC) have been recently proposed as a promising
    and efficient solution to indoor ubiquitous broadband connectivity. In this
    paper, non-orthogonal multiple access, which has been recently proposed as an
    effective scheme for fifth generation (5G) wireless networks, is considered in
    the context of VLC systems, under different channel uncertainty models. To this
    end, we first derive a novel closed-form expression for the bit-error-rate
    (BER) under perfect channel state information (CSI). Capitalizing on this, we
    quantify the effect of noisy and outdated CSI by deriving a simple approximated
    expression for the former and a tight upper bound for the latter. The offered
    results are corroborated by respective results from extensive Monte Carlo
    simulations and are used to provide useful insights on the effect of imperfect
    CSI knowledge on the system performance. It was shown that, while noisy CSI
    leads to slight degradation in the BER performance, outdated CSI can cause
    detrimental performance degradation if the order of the users’ channel gains
    change as a result of mobility

    Linear Codes over (mathbb{F}_{q}[x]/(x^2)) and (GR(p^2,m)) Reaching the Griesmer Bound

    Jin Li, Aixian Zhang, Keqin Feng
    Subjects: Information Theory (cs.IT)

    We construct two series of linear codes over finite ring
    (mathbb{F}_{q}[x]/(x^2)) and Galois ring (GR(p^2,m)) respectively reaching the
    Griesmer bound. They derive two series of codes over finite field
    (mathbb{F}_{q}) by Gray map. The first series of codes over (mathbb{F}_{q})
    derived from (mathbb{F}_{q}[x]/(x^2)) are linear and also reach the Griesmer
    bound in some cases. Many of linear codes over finite field we constructed have
    two Hamming (non-zero) weights.

    A Mathematical Proof of the Superiority of NOMA Compared to Conventional OMA

    Zhiyong Chen, Zhiguo Ding, Xuchu Dai, Rui Zhang
    Comments: 28 pages, 8 figures, submitted to IEEE Transactions on Signal Processing
    Subjects: Information Theory (cs.IT)

    While existing works about non-orthogonal multiple access (NOMA) have
    indicated that NOMA can yield a significant performance gain over orthogonal
    multiple access (OMA) with fixed resource allocation, it is not clear whether
    such a performance gain will diminish when optimal resource
    (Time/Frequency/Power) allocation is carried out. In this paper, the
    performance comparison between NOMA and conventional OMA systems is
    investigated, from an optimization point of view. Firstly, by using the idea of
    power splitting, a closed-form expression for the optimum sum rate of NOMA
    systems is derived. Then, with rigorous mathematical proofs, we reveal the fact
    that NOMA can always outperform conventional OMA systems, even if both are
    equipped with the optimal resource allocation policies. Finally, computer
    simulations are conducted to validate the accuracy of the analytical results.

    Placement Optimization of UAV-Mounted Mobile Base Stations

    Jiangbin Lyu, Yong Zeng, Rui Zhang, Teng Joon Lim
    Comments: 4 pages, 3 figures, 1 table, accepted for publications in IEEE Communications Letters
    Subjects: Information Theory (cs.IT)

    In terrestrial communication networks without fixed infrastructure, unmanned
    aerial vehicle (UAV)-mounted mobile base stations (MBSs) provide an efficient
    solution to achieve wireless connectivity. This letter aims to minimize the
    number of MBSs needed to provide wireless coverage for a group of distributed
    ground terminals (GTs), ensuring that each GT is within the communication range
    of at least one MBS. We propose a polynomial-time algorithm with successive MBS
    placement, where the MBSs are placed sequentially starting on the area
    perimeter of the uncovered GTs along a spiral path towards the center, until
    all GTs are covered. Each MBS is placed to cover as many uncovered GTs as
    possible, with higher priority given to the GTs on the boundary to reduce the
    occurrence of outlier GTs that each may require one dedicated MBS for its
    coverage. Numerical results show that the proposed algorithm performs favorably
    compared to other schemes in terms of the total number of required MBSs and/or
    time complexity.

    A Protocol for a Secure Remote Keyless Entry System Applicable in Vehicles using Symmetric-Key Cryptography

    Tobias Glocker, Timo Mantere, Mohammed Elmusrati
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)

    In our modern society comfort became a standard. This comfort, especially in
    cars can only be achieved by equipping the car with more electronic devices.
    Some of the electronic devices must cooperate with each other and thus they
    require a communication channel, which can be wired or wireless. In these days,
    it would be hard to sell a new car operating with traditional keys. Almost all
    modern cars can be locked or unlocked with a Remote Keyless System. A Remote
    Keyless System consists of a key fob that communicates wirelessly with the car
    transceiver that is responsible for locking and unlocking the car. However
    there are several threats for wireless communication channels.

    This paper describes the possible attacks against a Remote Keyless System and
    introduces a secure protocol as well as a lightweight Symmetric Encryption
    Algorithm for a Remote Keyless Entry System applicable in vehicles.

    Two new families of two-weight codes

    Minjia Shi, Yue Guan, Patrick Sole
    Comments: 13 pages, submitted on 10 November
    Subjects: Information Theory (cs.IT)

    We construct two new infinite families of trace codes of dimension (2m), over
    the ring (mathbb{F}_p+umathbb{F}_p,) when (p) is an odd prime. They have the
    algebraic structure of abelian codes. Their Lee weight distribution is computed
    by using Gauss sums. By Gray mapping, we obtain two infinite families of linear
    (p)-ary codes of respective lengths ((p^m-1)^2) and (2(p^m-1)^2.) When (m) is
    singly-even, the first family gives five-weight codes. When (m) is odd, and
    (pequiv 3 pmod{4},) the first family yields (p)-ary two-weight codes, which
    are shown to be optimal by application of the Griesmer bound. The second family
    consists of two-weight codes that are shown to be optimal, by the Griesmer
    bound, whenever (p=3) and (m ge 3,) or (pge 5) and (mge 4.) Applications to
    secret sharing schemes are given.

    Optimal two weight codes from trace codes over a chain ring

    Minjia Shi, Rongsheng Wu, Patrick Sole
    Comments: 12 pages, submitted on 4 October. arXiv admin note: text overlap with arXiv:1612.00123, arXiv:1612.00128
    Subjects: Information Theory (cs.IT)

    In this paper, we construct an infinite family of two-weight codes for the
    homogeneous metric over the ring (R=mathbb{F}_2+umathbb{F}_2+cdots
    +u^{k-1}mathbb{F}_{2}), where (u^k=0.) These codes are defined as trace codes.
    They have the algebraic structure of abelian codes. Their homogeneous weight
    distribution is computed by using character sums. In particular, we give a
    necessary and sufficient condition of optimality for the Gray image codes by
    using the Griesmer bound. We have shown that if (k>3), these images are
    projective. Their support structure is determined. An application to secret
    sharing schemes is given.

    Several Classes of (p)-ary Linear Codes with Few Weights

    Rongsheng Wu, Minjia Shi, Patrick Sole
    Comments: 11 pages
    Subjects: Information Theory (cs.IT)

    Few weights codes have been an important topic of coding theory for decades.
    In this paper, a class of two-weight and three-weight codes for the homogeneous
    metric over the chain ring (R=mathbb{F}_p+umathbb{F}_p+cdots
    +u^{k-1}mathbb{F}_{p}) are constructed. These codes are defined as trace
    codes. They are shown to be abelian. Their homogeneous weight distributions are
    computed by using exponential sums. In particular, we give a necessary and
    sufficient condition of the optimality for their Gray images by using the
    Griesmer bound in the two-weight case, and the information about the dual
    homogeneous distance is also given. In addition, the codewords of these codes
    have been turn out to be minimal, it is significant to obtain secret sharing
    schemes with interesting access structures.

    Some ternary cubic two-weight codes

    Minjia Shi, Daitao Huang, Patrick Sole
    Comments: 11 pages, submitted on 2 December. arXiv admin note: text overlap with arXiv:1612.00118
    Subjects: Information Theory (cs.IT)

    We study trace codes with defining set (L,) a subgroup of the multiplicative
    group of an extension of degree (m) of the alphabet ring
    (mathbb{F}_3+umathbb{F}_3+u^{2}mathbb{F}_{3},) with (u^{3}=1.) These codes
    are abelian, and their ternary images are quasi-cyclic of co-index three
    (a.k.a. cubic codes). Their Lee weight distributions are computed by using
    Gauss sums. These codes have three nonzero weights when (m) is singly-even and
    (|L|=frac{3^{3m}-3^{2m}}{2}.) When (m) is odd, and
    (|L|=frac{3^{3m}-3^{2m}}{2}), or (|L|={3^{3m}-3^{2m}}) and (m) is a positive
    integer, we obtain two new infinite families of two-weight codes which are
    optimal. Applications of the image codes to secret sharing schemes are also
    given.

    SNIPE for Memory-Limited PCA From Incomplete Data

    Armin Eftekhari, Laura Balzano, Dehui Yang, Michael B. Wakin
    Subjects: Information Theory (cs.IT)

    The linear subspace model is pervasive in science and engineering and
    particularly in large datasets which are often incomplete due to missing
    measurements and privacy issues. Therefore, a critical problem in modeling is
    to develop algorithms for estimating a low-dimensional subspace model from
    incomplete data efficiently in terms of both computational complexity and
    memory storage. In this paper we study an algorithm that processes blocks of
    incomplete data to estimate the underlying subspace model. Our algorithm has a
    simple interpretation as optimizing the subspace to fit the observed data block
    but remain close to the previous estimate. We prove a linear rate of
    convergence for the algorithm and our rate holds with high probability.

    Analysis of finite sample size quantum hypothesis testing via martingale concentration inequalities

    Cambyse Rouze, Nilanjana Datta
    Comments: 24 pages, 2 figures. arXiv admin note: text overlap with arXiv:1510.04682
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Statistics Theory (math.ST)

    Martingale concentration inequalities constitute a powerful mathematical tool
    in the analysis of problems in a wide variety of fields ranging from
    probability and statistics to information theory and machine learning. Here we
    apply such inequalities to finite sample size quantum hypothesis testing, which
    is the problem of discriminating quantum states belonging to two different
    sequences ({
    ho_n}_n) and ({sigma_n}_n), for a finite (n). We obtain
    upper bounds on the type II Stein- and Hoeffding errors, which, for
    i.i.d.~states, are in general tighter than the corresponding bounds obtained by
    Audenaert, Mosonyi and Verstrate. Moreover, our bounds are also valid for
    sequences of non-i.i.d.~states which satisfy certain conditions. We also use a
    martingale concentration inequality to obtain an upper bound on the second
    order asymptotics of the type II error exponent in the setting of quantum
    Stein’s lemma, for certain classes of states.

    Robust nonparametric nearest neighbor random process clustering

    Michael Tschannen, Helmut Bölcskei
    Comments: 13 pages, 4 figures
    Subjects: Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)

    We consider the problem of clustering noisy finite-length observations of
    stationary ergodic random processes according to their generative models
    without prior knowledge of the model statistics and the number of generative
    models. Two algorithms, both using the L1-distance between estimated power
    spectral densities (PSDs) as a measure of dissimilarity, are analyzed. The
    first one, termed nearest neighbor process clustering (NNPC) is new and relies
    on partitioning the nearest neighbor graph of the observations via spectral
    clustering. The second algorithm, simply referred to as k-means (KM), consists
    of a single k-means iteration with farthest point initialization and was
    considered before in the literature, albeit with a different dissimilarity
    measure and with asymptotic performance results only. We prove that both
    algorithms succeed with high probability in the presence of noise and missing
    entries, and even when the generative process PSDs overlap significantly, all
    provided that the observation length is sufficiently large. Our results
    quantify the tradeoff between the overlap of the generative process PSDs, the
    observation length, the fraction of missing entries, and the noise variance.
    Furthermore, we prove that treating the finite-length observations of
    stationary ergodic random processes as vectors in Euclidean space and
    clustering them using the thresholding-based subspace clustering (TSC)
    algorithm, the subspace clustering cousin of NNPC, results in performance
    strictly inferior to that of NNPC. We argue that the underlying cause is to be
    found in TSC employing spherical distance as dissimilarity measure, thereby
    ignoring the stationary process structure of the observations. Finally, we
    provide extensive numerical results for synthetic and real data and find that
    NNPC outperforms state-of-the-art algorithms in human motion sequence
    clustering.

    The Optimality of Correlated Sampling

    Mohammad Bavarian, Badih Ghazi, Elad Haramaty, Pritish Kamath, Ronald L. Rivest, Madhu Sudan
    Subjects: Computational Complexity (cs.CC); Information Theory (cs.IT)

    In the “correlated sampling” problem, two players, say Alice and Bob, are
    given two distributions, say (P) and (Q) respectively, over the same universe
    and access to shared randomness. The two players are required to output two
    elements, without any interaction, sampled according to their respective
    distributions, while trying to minimize the probability that their outputs
    disagree. A well-known protocol due to Holenstein, with close variants (for
    similar problems) due to Broder, and to Kleinberg and Tardos, solves this task
    with disagreement probability at most (2 delta/(1+delta)), where (delta) is
    the total variation distance between (P) and (Q). This protocol has been used
    in several different contexts including sketching algorithms, approximation
    algorithms based on rounding linear programming relaxations, the study of
    parallel repetition and cryptography.

    In this note, we give a surprisingly simple proof that this protocol is in
    fact tight. Specifically, for every (delta in (0,1)), we show that any
    correlated sampling scheme should have disagreement probability at least
    (2delta/(1+delta)). This partially answers a recent question of Rivest.

    Our proof is based on studying a new problem we call “constrained agreement”.
    Here, Alice is given a subset (A subseteq [n]) and is required to output an
    element (i in A), Bob is given a subset (B subseteq [n]) and is required to
    output an element (j in B), and the goal is to minimize the probability that
    (i
    eq j). We prove tight bounds on this question, which turn out to imply
    tight bounds for correlated sampling. Though we settle basic questions about
    the two problems, our formulation also leads to several questions that remain
    open.

    Online Page Migration on Ring Networks in Uniform Model

    Amanj Khorramian, Akira Matsubayashi
    Comments: 8 pages, 5 figures
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    This paper explores the problem of page migration in ring networks. A ring
    network is a connected graph, in which each node is connected with exactly two
    other nodes. In this problem, one of the nodes in a given network holds a page
    of size D. This node is called the server and the page is a non-duplicable data
    in the network. Requests are issued by nodes to access the page one after
    another. Every time a new request is issued, the server must serve the request
    and may migrate to another node before the next request arrives. A service
    costs the distance between the server and the requesting node, and the
    migration costs the distance of the migration multiplied by D. The problem is
    to minimize the total costs of services and migrations. We study this problem
    in uniform model, for which the page has a unit size, i.e. D=1. A
    3.326-competitive algorithm improving the current best upper bound is designed.
    We show that this ratio is tight for our algorithm.

    On The Fundamental Energy Tradeoffs of Geographical Load Balancing

    Abbas Kiani, Nirwan Ansari
    Comments: to appear IEEE Communications Magazine
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    Geographical load balancing can optimize the utilization of green energy and
    the cost of electricity by taking the advantages of green and price diversities
    at geographical dispersed data centers. However, higher green energy
    utilization or lower electricity cost may actually increase the total energy
    consumption, and is not necessarily the best option. The achievable energy
    tradeoffs can be captured by taking into consideration of a defined service
    efficiency parameter for geo-dispersed data centers.

    Asymptotically Good Convolutional Codes

    Giuliano Gadioli La Guardia
    Subjects: Combinatorics (math.CO); Information Theory (cs.IT)

    In this paper, we construct new sequences of asymptotically good
    convolutional codes. These sequences are obtained from sequences of transitive,
    self-orthogonal and self-dual block codes that attain the Tsfasman-Vladut-Zink
    bound. Furthermore, by applying the techniques of expanding, extending,
    puncturing, direct sum, the |u|u+v| construction and the product code
    construction to these block codes, we construct more new sequences of
    asymptotically good convolutional codes. Additionally, we show that the
    proposed construction method presented here also works when applied for all
    sequences of good block codes where lim kj/nj and lim dj/nj exist.




沪ICP备19023445号-2号
友情链接