IT博客汇

    arXiv Paper Daily: Tue, 27 Sep 2016

    Published by 我爱机器学习 (52ml.net) on 2016-09-27 00:00:00

    Neural and Evolutionary Computing

    An Ontology of Preference-Based Multiobjective Evolutionary Algorithms

    Longmei Li, Iryna Yevseyeva, Vitor Basto-Fernandes, Heike Trautmann, Ning Jing, Michael Emmerich
    Comments: 19 pages, 7 figures
    Subjects: Neural and Evolutionary Computing (cs.NE)

    User preference integration is of great importance in multiobjective
    optimization, in particular in many-objective optimization. Preferences have
    long been considered in traditional multicriteria decision making (MCDM),
    which is based on mathematical programming; more recently, they have been
    integrated into evolutionary multiobjective optimization (EMO), shifting the
    focus to preferred parts of the Pareto front instead of the whole front. The
    number of publications and results on preference-based multiobjective
    evolutionary algorithms (PMOEAs) has increased rapidly over the past decade.
    There already exists a large variety of preference handling methods and EMO
    methods, which have been combined in various ways. This article proposes to
    use the Web Ontology Language (OWL) to model and systematize the results
    developed in this field. An extensive review of the existing work is
    provided, based on which an ontology is built and instantiated with
    state-of-the-art results. The OWL ontology is made public and open to future
    extension. Moreover, its usage is exemplified for different use cases,
    including introducing new researchers to this knowledge domain, querying for
    methods that match an application problem in engineering optimization,
    checking the existence of combinations of preference models and EMO
    techniques, and discovering opportunities for new research and open research
    questions.

    Multiplicative LSTM for sequence modelling

    Ben Krause, Liang Lu, Iain Murray, Steve Renals
    Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    This paper introduces multiplicative LSTM, a novel hybrid recurrent neural
    network architecture for sequence modelling that combines the long short-term
    memory (LSTM) and multiplicative recurrent neural network architectures.
    Multiplicative LSTM is motivated by the flexibility of having very different
    recurrent transition functions for each possible input, which we argue helps
    make it more expressive for autoregressive density estimation. We show
    empirically that multiplicative LSTM outperforms standard LSTM and deep
    variants for a range of character level modelling tasks. We also found that
    this improvement increases as the complexity of the task scales up. This model
    achieves a validation error of 1.20 bits/character on the Hutter prize dataset
    when combined with dynamic evaluation.
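
    The transition described above can be sketched in a few lines of NumPy (a
    toy illustration, not the authors' implementation; layer sizes and weight
    scales are arbitrary assumptions): the intermediate state m is an
    elementwise product of projections of the input and the previous hidden
    state, and the usual LSTM gates are conditioned on m rather than on the
    hidden state directly.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlstm_step(x, h_prev, c_prev, p):
    """One multiplicative LSTM step (sketch)."""
    # Multiplicative intermediate state: each input effectively selects
    # its own hidden-to-hidden transition.
    m = (p["Wmx"] @ x) * (p["Wmh"] @ h_prev)
    # Standard LSTM gating, conditioned on m instead of h_prev.
    i = sigmoid(p["Wix"] @ x + p["Wim"] @ m)
    f = sigmoid(p["Wfx"] @ x + p["Wfm"] @ m)
    o = sigmoid(p["Wox"] @ x + p["Wom"] @ m)
    g = np.tanh(p["Wgx"] @ x + p["Wgm"] @ m)
    c = f * c_prev + i * g          # cell update
    h = o * np.tanh(c)              # new hidden state
    return h, c

rng = np.random.default_rng(0)
n_in, n_h = 4, 8
p = {k: rng.normal(scale=0.1, size=(n_h, n_in))
     for k in ("Wmx", "Wix", "Wfx", "Wox", "Wgx")}
p.update({k: rng.normal(scale=0.1, size=(n_h, n_h))
          for k in ("Wmh", "Wim", "Wfm", "Wom", "Wgm")})
h, c = np.zeros(n_h), np.zeros(n_h)
for _ in range(3):
    h, c = mlstm_step(rng.normal(size=n_in), h, c, p)
```

    Because m depends on the current input, each input symbol gets its own
    effective recurrent transition, which is the flexibility the abstract
    refers to.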

    Accurate and Efficient Hyperbolic Tangent Activation Function on FPGA using the DCT Interpolation Filter

    Ahmed M. Abdelsalam, J.M. Pierre Langlois, F. Cheriet
    Comments: 8 pages, 6 figures, 5 tables, submitted for the 25th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (ISFPGA), 22-24 February 2017, California, USA
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Implementing an accurate and fast activation function at low cost is a
    crucial aspect of implementing Deep Neural Networks (DNNs) on FPGAs.
    We propose a high-accuracy approximation approach for the hyperbolic tangent
    activation function of artificial neurons in DNNs. It is based on the Discrete
    Cosine Transform Interpolation Filter (DCTIF). The proposed architecture
    combines simple arithmetic operations on stored samples of the hyperbolic
    tangent function and on input data. The proposed DCTIF implementation achieves
    two orders of magnitude greater precision than previous work while using the
    same or fewer computational resources. Various combinations of DCTIF
    parameters can be chosen to trade off the accuracy and complexity of the
    hyperbolic tangent function. In one case, the proposed architecture
    approximates the hyperbolic tangent activation function with a maximum
    error of 10E-5 while requiring only 1.52
    Kbits memory and 57 LUTs of a Virtex-7 FPGA. We also discuss how the activation
    function accuracy affects the performance of DNNs in terms of their training
    and testing accuracies. We show that a high accuracy approximation can be
    necessary in order to maintain the same DNN training and testing performances
    realized by the exact function.
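
    The sample-plus-arithmetic idea can be illustrated with a stored table of
    tanh samples and simple interpolation. The sketch below uses plain linear
    interpolation as a stand-in for the paper's DCT interpolation filter, and
    the grid spacing is an arbitrary assumption, not a value from the paper:

```python
import numpy as np

# Look-up-table approximation of tanh: store samples on a uniform grid
# and combine them with simple arithmetic on the input.
STEP = 1.0 / 32.0                       # grid spacing (assumed)
GRID = np.arange(0.0, 4.0 + STEP, STEP)
TABLE = np.tanh(GRID)                   # stored samples

def tanh_approx(x):
    s = np.sign(x)                      # tanh is odd: handle |x| only
    x = np.minimum(np.abs(x), 4.0 - 1e-9)   # tanh saturates beyond ~4
    idx = (x / STEP).astype(int)
    frac = x / STEP - idx
    # Linear interpolation between the two nearest stored samples.
    return s * (TABLE[idx] * (1 - frac) + TABLE[idx + 1] * frac)

xs = np.linspace(-4, 4, 10001)
max_err = np.max(np.abs(tanh_approx(xs) - np.tanh(xs)))
```

    Shrinking the grid spacing (or using a higher-order interpolation filter,
    as the paper does) reduces the maximum error at the cost of more stored
    samples or more arithmetic.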

    The RNN-ELM Classifier

    Athanasios Vlontzos
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    In this paper we examine learning methods that combine the Random Neural
    Network, a biologically inspired neural network, with the Extreme Learning
    Machine, achieving state-of-the-art classification performance while
    requiring much shorter training times. The Random Neural Network is an
    integrate-and-fire computational model of a neural network whose
    mathematical structure permits the efficient analysis of large ensembles of
    neurons. An activation function is derived from the RNN and used in an
    Extreme Learning Machine. We compare the performance of this combination
    against the ELM with various activation functions, reduce the input
    dimensionality via PCA, and compare its performance against
    autoencoder-based versions of the RNN-ELM.
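
    The ELM side of the combination is easy to sketch: hidden-layer weights
    are random and fixed, and only the linear output layer is fitted, in
    closed form, by least squares. The sketch below uses a plain sigmoid in
    place of the RNN-derived activation, and the toy problem is illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def elm_fit(X, Y, n_hidden=64):
    W = rng.normal(size=(X.shape[1], n_hidden))   # random, never trained
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # hidden activations
    beta = np.linalg.pinv(H) @ Y                  # least-squares output layer
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy two-class problem: label is the sign of the sum of the inputs.
X = rng.normal(size=(200, 2))
Y = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)
W, b, beta = elm_fit(X, Y)
acc = np.mean((elm_predict(X, W, b, beta) > 0.5) == (Y > 0.5))
```

    Because only the pseudoinverse solve involves the data, training is a
    single linear-algebra step, which is where the short training times come
    from.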

    Sooner than Expected: Hitting the Wall of Complexity in Evolution

    Thomas Schmickl, Payam Zahadat, Heiko Hamann
    Comments: 24 pages, 15 figures
    Subjects: Neural and Evolutionary Computing (cs.NE)

    In evolutionary robotics an encoding of the control software, which maps
    sensor data (input) to motor control values (output), is shaped by stochastic
    optimization methods to complete a predefined task. This approach is assumed to
    be beneficial compared to standard methods of controller design in those cases
    where no a-priori model is available that could help to optimize performance.
    Also for robots that have to operate in unpredictable environments, an
    evolutionary robotics approach is favorable. We demonstrate here that such a
    model-free approach is not a free lunch, as even simple tasks can represent
    unsolvable barriers for fully open-ended uninformed evolutionary computation
    techniques. We propose here the ‘Wankelmut’ task as an objective for an
    evolutionary approach that starts from scratch without pre-shaped controller
    software or any other informed approach that would force the behavior to be
    evolved in a desired way. Our focal claim is that ‘Wankelmut’ represents the
    simplest set of problems that makes plain-vanilla evolutionary computation
    fail. We demonstrate this by a series of simple standard evolutionary
    approaches using different fitness functions and standard artificial neural
    networks as well as continuous-time recurrent neural networks. All our tested
    approaches failed. We claim that any other evolutionary approach that does
    not per se favor or enforce modularity, and does not freeze or protect
    already evolved functionalities, will also fail. Thus we propose a
    hard-to-pass benchmark and
    make a strong statement for self-complexifying and generative approaches in
    evolutionary computation. We anticipate that defining such a ‘simplest task to
    fail’ is a valuable benchmark for promoting future development in the field of
    artificial intelligence, evolutionary robotics and artificial life.

    Learning by Stimulation Avoidance: A Principle to Control Spiking Neural Networks Dynamics

    Lana Sinapayen, Atsushi Masumori, Takashi Ikegami
    Comments: 17 pages, 11 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Learning based on networks of real neurons, and by extension biologically
    inspired models of neural networks, has yet to find general learning rules
    leading to widespread applications. In this paper, we argue for the existence
    of a principle allowing to steer the dynamics of a biologically inspired neural
    network. Using carefully timed external stimulation, the network can be driven
    towards a desired dynamical state. We term this principle “Learning by
    Stimulation Avoidance” (LSA). We demonstrate through simulation that the
    minimal sufficient conditions leading to LSA in artificial networks are also
    sufficient to reproduce learning results similar to those obtained in
    biological neurons by Shahaf and Marom [1]. We examine the mechanism’s basic
    dynamics in a reduced network, and demonstrate how it scales up to a network of
    100 neurons. We show that LSA has a higher explanatory power than existing
    hypotheses about the response of biological neural networks to external
    stimulation, and can be used as a learning rule for an embodied application:
    learning of wall avoidance by a simulated robot. The surge in popularity of
    artificial neural networks is mostly directed to disembodied models of neurons
    with biologically irrelevant dynamics: to the authors’ knowledge, this is the
    first work demonstrating sensory-motor learning with random spiking networks
    through pure Hebbian learning.


    Computer Vision and Pattern Recognition

    Learning Language-Visual Embedding for Movie Understanding with Natural-Language

    Atousa Torabi, Niket Tandon, Leonid Sigal
    Comments: 13 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Learning a joint language-visual embedding has a number of very appealing
    properties and can result in a variety of practical applications, including
    natural-language image/video annotation and search. In this work, we study
    three different joint language-visual neural network model architectures.
    We evaluate our models on the large-scale LSMDC16 movie dataset for two
    tasks: 1) standard ranking for video annotation and retrieval; 2) our
    proposed movie multiple-choice test. This test facilitates automatic
    evaluation of visual-language models for natural-language video annotation
    based on human activities. In addition to the original Audio Description
    (AD) captions provided as part of LSMDC16, we collected and will make
    available a) manually generated re-phrasings of those captions obtained
    using Amazon MTurk and b) automatically generated human-activity elements
    in “Predicate + Object” (PO) phrases based on “Knowlywood”, an activity
    knowledge-mining model. Our best model achieves Recall@10 of 19.2% on
    annotation and 18.9% on video retrieval for a subset of 1000 samples. For
    the multiple-choice test, our best model achieves an accuracy of 58.11%
    over the whole LSMDC16 public test set.

    Swipe Mosaics from Video

    Malcolm Reynolds, Tom S. F. Haines, Gabriel J. Brostow
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A panoramic image mosaic is an attractive visualization for viewing many
    overlapping photos, but its images must be both captured and processed
    correctly to produce an acceptable composite. We propose Swipe Mosaics, an
    interactive visualization that places the individual video frames on a 2D
    planar map that represents the layout of the physical scene. Compared to
    traditional panoramic mosaics, our capture is easier because the user can both
    translate the camera center and film moving subjects. Processing and display
    degrade gracefully if the footage lacks distinct, overlapping, non-repeating
    texture. Our proposed visual odometry algorithm produces a distribution over
    (x,y) translations for image pairs. Inferring a distribution of possible camera
    motions allows us to better cope with parallax, lack of texture, dynamic
    scenes, and other phenomena that hurt deterministic reconstruction techniques.
    Robustness is obtained by training on synthetic scenes with known camera
    motions. We show that Swipe Mosaics are easy to generate, support a wide range
    of difficult scenes, and are useful for documenting a scene for closer
    inspection.

    Robust Matrix Decomposition for Image Segmentation under Heavy Noises and Uneven Background Intensities

    Garret Vo, Chiwoo Park
    Comments: 13 Pages; 11 Figures; 3 Tables; Submitted to IEEE T-PAMI
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a robust matrix decomposition approach that automatically
    segments a binary image to foreground regions and background regions under high
    observation noise levels and uneven background intensities. The work is
    motivated by the need of identifying foreground objects in a noisy electron
    microscopic image, but the method can be applied for a general binary
    classification problem. The proposed method models an input image as a matrix
    of image pixel values, and the matrix is represented by a mixture of three
    component matrices of the same size, background, foreground and noise matrices.
    We propose a robust matrix decomposition approach to separate the input matrix
    into the three components through robust singular value decomposition. The
    proposed approach is more robust to high image noises and uneven background
    than the existing matrix-based approaches, which is numerically shown using
    simulated images and five electron microscope images with manually
    annotated ground truth data.
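
    A generic low-rank-plus-sparse split conveys the idea (a standard
    alternating scheme, not the authors' exact robust SVD; the rank and
    threshold below are illustrative): the smooth background is captured by a
    truncated SVD, and the foreground/noise by soft-thresholding the residual.

```python
import numpy as np

def lowrank_sparse_split(M, rank=1, lam=0.5, n_iter=30):
    S = np.zeros_like(M)
    for _ in range(n_iter):
        U, sv, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * sv[:rank]) @ Vt[:rank]        # low-rank background
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0)  # sparse foreground
    return L, S

rng = np.random.default_rng(2)
bg = np.outer(np.linspace(1, 2, 20), np.linspace(1, 2, 20))  # uneven background
fg = np.zeros((20, 20))
fg[5:8, 5:8] = 5.0                                           # bright foreground blob
L, S = lowrank_sparse_split(bg + fg, rank=1, lam=0.5)
```

    The uneven background ends up in the low-rank component L, while the
    foreground object survives in the sparse component S, which is the
    separation the abstract describes.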

    BioLeaf: a professional mobile application to measure foliar damage caused by insect herbivory

    Bruno Machado, Jonatan Orue, Mauro Arruda, Cleidimar Santos, Diogo Sarath, Wesley Goncalves, Gercina Silva, Hemerson Pistori, Antonia Roel, Jose Rodrigues-Jr
    Journal-ref: Computers and Electronics in Agriculture 129: 1. 44-55 (2016)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Soybean is one of the ten largest crops in the world, accounting for
    billion-dollar businesses every year. This crop suffers from insect
    herbivory that costs producers millions. Hence, constant monitoring of the crop
    foliar damage is necessary to guide the application of insecticides. However,
    current methods to measure foliar damage are expensive and dependent on
    laboratory facilities, in some cases, depending on complex devices. To cope
    with these shortcomings, we introduce an image processing methodology to
    measure the foliar damage in soybean leaves. We developed a non-destructive
    imaging method based on two techniques, Otsu segmentation and Bezier curves, to
    estimate the foliar loss in leaves with or without border damage. We
    instantiate our methodology in a mobile application named BioLeaf, which is
    freely distributed for smartphone users. We experimented with real-world leaves
    collected from a soybean crop in Brazil. Our results demonstrated that BioLeaf
    achieves foliar damage quantification with precision comparable to that of
    human specialists. With these results, our proposal might assist soybean
    producers, reducing the time to measure foliar damage, reducing analytical
    costs, and defining a commodity application that is applicable not only to soy,
    but also to different crops such as cotton, bean, potato, coffee, and
    vegetables.
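
    Of the two techniques named above, Otsu segmentation is simple to sketch:
    the grey-level threshold is chosen to maximize the between-class variance
    of the resulting foreground/background split. The synthetic "leaf" below
    is illustrative only:

```python
import numpy as np

def otsu_threshold(img, n_bins=256):
    hist, edges = np.histogram(img, bins=n_bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                   # class-0 probability per threshold
    w1 = 1.0 - w0
    mu = np.cumsum(p * centers)         # class-0 cumulative mean mass
    mu_total = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        # Between-class variance: (mu_T*w0 - mu)^2 / (w0*w1)
        between = (mu_total * w0 - mu) ** 2 / (w0 * w1)
    between[~np.isfinite(between)] = 0.0
    return centers[np.argmax(between)]

rng = np.random.default_rng(3)
leaf = rng.normal(0.2, 0.05, size=(64, 64))                 # healthy tissue (dark)
leaf[20:40, 20:40] = rng.normal(0.8, 0.05, size=(20, 20))   # damaged area (bright)
t = otsu_threshold(leaf)
damage_fraction = np.mean(leaf > t)
```

    The fraction of pixels above the threshold is a direct estimate of foliar
    damage; the paper's Bezier-curve step additionally handles leaves with
    damaged borders.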

    Super-resolving multiresolution images with band-independent geometry of multispectral pixels

    Nicolas Brodu
    Comments: Source code with a ready-to-use script for super-resolving Sentinel-2 data is available at this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A new resolution enhancement method is presented for multispectral and
    multi-resolution images, such as those provided by the Sentinel-2 satellites.
    Starting from the highest resolution bands, band-dependent information
    (reflectance) is separated from information that is common to all bands
    (geometry of scene elements). This model is then applied to unmix
    low-resolution bands, preserving their reflectance, while propagating
    band-independent information to preserve the sub-pixel details. A reference
    implementation is provided, with an application example for super-resolving
    Sentinel-2 data.

    Optimistic and Pessimistic Neural Networks for Scene and Object Recognition

    Rene Grzeszick, Sebastian Sudholt, Gernot A. Fink
    Comments: Submitted to WACV’17
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper the application of uncertainty modeling to convolutional neural
    networks is evaluated. A novel method for adjusting the network’s predictions
    based on uncertainty information is introduced. This allows the network to be
    either optimistic or pessimistic in its prediction scores. The proposed method
    builds on the idea of applying dropout at test time and sampling a predictive
    mean and variance from the network’s output. Besides the methodological
    aspects, implementation details allowing for a fast evaluation are presented.
    Furthermore, a multilabel network architecture is introduced that strongly
    benefits from the presented approach. In the evaluation it will be shown that
    modeling uncertainty allows for improving the performance of a given model
    purely at test time without any further training steps. The evaluation
    considers several applications in the field of computer vision, including
    object classification and detection as well as scene attribute recognition.
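
    The core mechanism, dropout kept active at test time, can be sketched as
    follows (a toy two-layer network with hypothetical weights; the paper's
    actual architectures differ): several stochastic forward passes yield a
    predictive mean and variance, and the score is shifted by the standard
    deviation to be optimistic or pessimistic.

```python
import numpy as np

rng = np.random.default_rng(4)
W1 = rng.normal(size=(10, 32))
W2 = rng.normal(size=(32, 1))

def forward(x, drop=0.5):
    h = np.maximum(x @ W1, 0.0)
    mask = rng.random(h.shape) >= drop    # dropout stays on at test time
    h = h * mask / (1.0 - drop)
    return (h @ W2).ravel()

x = rng.normal(size=(1, 10))
samples = np.array([forward(x)[0] for _ in range(200)])
mean, std = samples.mean(), samples.std()
optimistic, pessimistic = mean + std, mean - std
```

    No retraining is involved: the same weights are reused, and only the
    prediction rule changes, which is how the paper improves a given model
    purely at test time.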

    Deep Structured Features for Semantic Segmentation

    Michael Tschannen, Lukas Cavigelli, Fabian Mentzer, Thomas Wiatowski, Luca Benini
    Comments: 10 pages, 2 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We propose a highly structured neural network architecture for semantic
    segmentation of images that combines i) a Haar wavelet-based tree-like
    convolutional neural network (CNN), ii) a random layer realizing a radial basis
    function kernel approximation, and iii) a linear classifier. While stages i)
    and ii) are completely pre-specified, only the linear classifier is learned
    from data. Thanks to its high degree of structure, our architecture has a very
    small memory footprint and thus fits onto low-power embedded and mobile
    platforms. We apply the proposed architecture to outdoor scene and aerial image
    semantic segmentation and show that the accuracy of our architecture is
    competitive with conventional pixel classification CNNs. Furthermore, we
    demonstrate that the proposed architecture is data efficient in the sense of
    matching the accuracy of pixel classification CNNs when trained on a much
    smaller data set.
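
    Stage ii), the random RBF kernel approximation, can be sketched with
    random Fourier features (a standard construction; the paper's exact random
    layer may differ): a fixed random projection followed by a cosine yields
    features whose inner products approximate the Gaussian kernel, so a linear
    classifier on top behaves like a kernel machine.

```python
import numpy as np

rng = np.random.default_rng(5)

def rbf_features(X, n_features=2000, gamma=0.5):
    # z(x).z(y) ~= exp(-gamma * ||x - y||^2); nothing here is trained.
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = rng.normal(size=(2, 3))
Z = rbf_features(X)
approx = Z[0] @ Z[1]                                  # kernel estimate
exact = np.exp(-0.5 * np.sum((X[0] - X[1]) ** 2))     # true RBF kernel
```

    Because the random layer needs no training and no stored weights beyond a
    seed, it fits the small-memory-footprint argument of the abstract.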

    Linear Support Tensor Machine: Pedestrian Detection in Thermal Infrared Images

    Sujoy Kumar Biswas, Peyman Milanfar
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Pedestrian detection in thermal infrared images poses unique challenges
    because of the low resolution and noisy nature of the images. Here we
    propose a mid-level attribute in the form of a multidimensional template,
    or tensor, using Local Steering Kernels (LSK) as low-level descriptors for
    detecting pedestrians
    in far infrared images. LSK is specifically designed to deal with intrinsic
    image noise and pixel level uncertainty by capturing local image geometry
    succinctly instead of collecting local orientation statistics (e.g., histograms
    in HOG). Our second contribution is the introduction of a new image similarity
    kernel in the popular maximum margin framework of support vector machines that
    results in a relatively short and simple training phase for building a rigid
    pedestrian detector. Our third contribution is to replace the sluggish but de
    facto sliding window based detection methodology with multichannel discrete
    Fourier transform, facilitating very fast and efficient pedestrian
    localization. The experimental studies on publicly available thermal infrared
    images justify our proposals and model assumptions. In addition, the proposed
    work also involves the release of our in-house annotations of pedestrians in
    more than 17000 frames of the OSU Color-Thermal database for the purpose of sharing
    with the research community.

    Visual Fashion-Product Search at SK Planet

    Taewan Kim, Seyeong Kim, Sangil Na, Hayoon Kim, Moonki Kim, Beyeongki Jeon
    Comments: 13 pages, 11 figures, 3 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We build a large-scale visual search system that finds similar product
    images given a fashion item. Defining similarity among arbitrary fashion
    products remains a challenging problem, as there is no exact ground truth.
    To resolve this problem, we define more than 90 fashion-related
    attributes, and combinations of these attributes can represent thousands
    of unique fashion styles. The fashion attributes are one of the
    ingredients used to define semantic similarity among fashion-product
    images. To build our system at scale, these fashion attributes are also
    used to build an inverted indexing scheme. In addition to these fashion
    attributes for semantic similarity, we extract colour and appearance
    features in a region of interest (ROI) of a fashion item for visual
    similarity. By sharing our approach, we hope to encourage active
    discussion on how to apply current computer vision research to the
    e-commerce industry.

    Multiview RGB-D Dataset for Object Instance Detection

    Georgios Georgakis, Md Alimoor Reza, Arsalan Mousavian, Phi-Hung Le, Jana Kosecka
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    This paper presents a new multi-view RGB-D dataset of nine kitchen scenes,
    each containing several objects in realistic cluttered environments including a
    subset of objects from the BigBird dataset. The viewpoints of the scenes are
    densely sampled and objects in the scenes are annotated with bounding boxes and
    in the 3D point cloud. Also, an approach for detection and recognition is
    presented, which comprises two parts: i) a new multi-view 3D proposal
    generation method and ii) the development of several recognition baselines
    using AlexNet to score our proposals, which is trained either on crops of the
    dataset or on synthetically composited training images. Finally, we compare the
    performance of the object proposals and a detection baseline to the Washington
    RGB-D Scenes (WRGB-D) dataset and demonstrate that our Kitchen scenes dataset
    is more challenging for object detection and recognition. The dataset is
    available at: this http URL

    Joint Rain Detection and Removal via Iterative Region Dependent Multi-Task Learning

    Wenhan Yang, Robby T. Tan, Jiashi Feng, Jiaying Liu, Zongming Guo, Shuicheng Yan
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we address a rain removal problem from a single image, even in
    the presence of heavy rain and rain accumulation. Our core ideas lie in our new
    rain image models and a novel deep learning architecture. We first modify the
    commonly used model, which is a linear combination of a rain streak layer and a
    background layer, by adding a binary map that locates rain streak regions.
    Second, we create a model consisting of a component representing rain
    accumulation (where individual streaks cannot be seen, and thus visually
    similar to mist or fog), and another component representing various shapes and
    directions of overlapping rain streaks, which normally happen in heavy rain.
    Based on the first model, we develop a multi-task deep learning architecture
    that learns the binary rain streak map, the appearance of rain streaks, and the
    clean background, which is our ultimate output. The additional binary map is
    critically beneficial, since its loss function can provide additional strong
    information to the network. In many cases, though, rain streaks can be
    dense and large, so to obtain the clean background we need spatial
    contextual information. For this, we utilize dilated convolutions. To handle
    rain accumulation (again, a phenomenon visually similar to mist or fog) and
    various shapes and directions of overlapping rain streaks, we propose an
    iterative information feedback (IIF) network that removes rain streaks and
    clears up the rain accumulation iteratively and progressively. Overall,
    the multi-task learning and iterative information feedback benefit each
    other and constitute a network that is end-to-end trainable. Our extensive
    evaluation on
    real images, particularly on heavy rain, shows the effectiveness of our novel
    models and architecture, outperforming the state-of-the-art methods
    significantly.
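
    The rain image models can be sketched numerically (illustrative values
    only; the blend below is a simplified stand-in for the paper's
    accumulation model): a clean background, a streak layer restricted to a
    binary streak-region map, and a fog-like accumulation term.

```python
import numpy as np

rng = np.random.default_rng(6)

B = rng.uniform(0.2, 0.6, size=(32, 32))       # clean background layer
M = np.zeros((32, 32))
M[:, 10:12] = 1.0                              # binary map of streak regions
S = 0.4 * M * rng.uniform(0.5, 1.0, size=(32, 32))  # streaks live inside M only
alpha, A = 0.2, 0.9                            # accumulation blend (mist-like)
O = (1 - alpha) * (B + S) + alpha * A          # observed rainy image
```

    The binary map M is what the multi-task network additionally predicts; its
    loss gives the network an explicit signal about where streaks are, which
    is the "critically beneficial" extra information the abstract mentions.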

    Deep learning based fence segmentation and removal from an image using a video sequence

    Sankaraganesh Jonna, Krishna K. Nakka, Rajiv R. Sahay
    Comments: ECCV Workshop on Video Segmentation, 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Conventional approaches to image de-fencing use multiple adjacent frames for
    segmentation of fences in the reference image and are limited to restoring
    images of static scenes only. In this paper, we propose a de-fencing algorithm
    for images of dynamic scenes using an occlusion-aware optical flow method. We
    divide the problem of image de-fencing into the tasks of automated fence
    segmentation from a single image, motion estimation under known occlusions and
    fusion of data from multiple frames of a captured video of the scene.
    Specifically, we use a pre-trained convolutional neural network to segment
    fence pixels from a single image. The knowledge of spatial locations of fences
    is used to subsequently estimate optical flow in the occluded frames of the
    video for the final data fusion step. We cast the fence removal problem in an
    optimization framework by modeling the formation of the degraded observations.
    The inverse problem is solved using the fast iterative
    shrinkage-thresholding algorithm (FISTA). Experimental results show the
    effectiveness of the proposed algorithm.
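
    FISTA itself is compact enough to sketch on a generic l1-regularized
    least-squares problem (problem sizes and the regularization weight are
    illustrative, not the paper's actual image-formation model): a gradient
    step, a soft-thresholding step, and a momentum extrapolation.

```python
import numpy as np

def fista(A, y, lam=0.1, n_iter=500):
    # Minimize 0.5*||A x - y||^2 + lam*||x||_1.
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = z = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        w = z - grad / L                   # gradient step
        x_new = np.sign(w) * np.maximum(np.abs(w) - lam / L, 0)  # shrinkage
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)              # momentum
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(7)
A = rng.normal(size=(40, 100))             # underdetermined system
x_true = np.zeros(100)
x_true[[3, 30, 77]] = [1.0, -2.0, 1.5]     # sparse ground truth
y = A @ x_true
x_hat = fista(A, y, lam=0.01)
```

    The momentum step is what distinguishes FISTA from plain iterative
    shrinkage and gives it its faster convergence rate.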

    Perceptual uniform descriptor and Ranking on manifold: A bridge between image representation and ranking for image retrieval

    Shenglan Liu, Jun Wu, Lin Feng, Yang Liu, Hong Qiao, Wenbo Luo Muxin Sun, Wei Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The incompatibility of image descriptors and ranking is often neglected in
    image retrieval. In this paper, manifold learning and Gestalt psychology
    theory are employed to solve this incompatibility problem. A new holistic
    descriptor called the Perceptual Uniform Descriptor (PUD), based on
    Gestalt psychology, is proposed, which combines color and gradient
    direction to imitate human visual uniformity. Because PUD improves the
    visual uniformity of traditional descriptors, PUD features of images in
    the same class usually lie on one manifold. Thus, we use manifold ranking
    together with PUD to realize image retrieval. Experiments were carried out
    on five benchmark data sets, and the proposed method greatly improves the
    accuracy of image retrieval. Our experimental results on the Ukbench and
    Corel-1K datasets show that the N-S score reached 3.58 (HSV: 3.4) and mAP
    reached 81.77% (ODBTC: 77.9%), respectively, using PUD with only 280
    dimensions. These results are higher than those of other holistic image
    descriptors (and even some local ones) and state-of-the-art retrieval
    methods.
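
    Manifold ranking can be sketched in a few lines (a standard formulation
    with illustrative parameters; the paper applies it to PUD features):
    relevance scores diffuse from the query over a similarity graph, so images
    lying on the query's manifold are promoted even when their raw distance to
    the query is large.

```python
import numpy as np

def manifold_rank(X, query_idx, alpha=0.9, sigma=0.5, n_iter=50):
    # Gaussian similarity graph over all items.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D = W.sum(axis=1)
    S = W / np.sqrt(np.outer(D, D))        # symmetric normalization
    y = np.zeros(len(X))
    y[query_idx] = 1.0                     # the query seeds the diffusion
    f = y.copy()
    for _ in range(n_iter):
        f = alpha * (S @ f) + (1 - alpha) * y
    return f

# Two well-separated clusters; the query sits in the first one.
rng = np.random.default_rng(8)
X = np.vstack([rng.normal(0, 0.1, size=(10, 2)),
               rng.normal(3, 0.1, size=(10, 2))])
f = manifold_rank(X, query_idx=0)
```

    Every item in the query's cluster ends up ranked above every item in the
    other cluster, which is the manifold-aware behavior the descriptor is
    designed to exploit.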

    Three Tiers Neighborhood Graph and Multi-graph Fusion Ranking for Multi-feature Image Retrieval: A Manifold Aspect

    Shenglan Liu, Muxin Sun, Lin Feng, Yang Liu, Jun Wu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A single feature is insufficient to describe the content of an image,
    which is a shortcoming of traditional image retrieval. Since one image can
    be described by different features, multi-feature fusion ranking can be
    utilized to improve the ranking list of a query. In this paper, we first
    analyze graph structure and multi-feature fusion re-ranking from a
    manifold perspective. Then, a Three Tiers Neighborhood Graph (TTNG) is
    constructed to re-rank the original ranking list produced by a single
    feature and to enhance its precision. Furthermore, we propose Multi-graph
    Fusion Ranking (MFR) for multi-feature ranking, which considers the
    correlation of all images in multiple neighborhood graphs. Evaluations are
    conducted on the UK-bench, Corel-1K, Corel-10K and Cifar-10 benchmark
    datasets. The experimental results show that TTNG and MFR outperform other
    state-of-the-art methods. For example, we achieve competitive results of
    an N-S score of 3.91 on UK-bench and a precision of 65.00% on Corel-10K.

    DimensionApp : android app to estimate object dimensions

    Suriya Singh, Vijay Kumar
    Comments: Project Report 2014
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this project, we develop an Android app that uses computer vision
    techniques to estimate the dimensions of an object present in the field of
    view. The app, while compact in size, is accurate up to +/- 5 mm and
    robust to touch inputs. We use single-view metrology to compute accurate
    measurements. Unlike previous approaches, our technique does not rely on
    line detection and can be generalized to any object shape easily.

    A Rotation Invariant Latent Factor Model for Moveme Discovery from Static Poses

    Matteo Ruggero Ronchi, Joon Sik Kim, Yisong Yue
    Comments: Long version of the paper accepted at the IEEE ICDM 2016 conference. 10 pages, 9 figures, 1 table. Project page: this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We tackle the problem of learning a rotation invariant latent factor model
    when the training data is comprised of lower-dimensional projections of the
    original feature space. The main goal is the discovery of a set of 3-D bases
    poses that can characterize the manifold of primitive human motions, or
    movemes, from a training set of 2-D projected poses obtained from still images
    taken at various camera angles. The proposed technique for basis discovery is
    data-driven rather than hand-designed. The learned representation is rotation
    invariant, and can reconstruct any training instance from multiple viewing
    angles. We apply our method to modeling human poses in sports (via the Leeds
    Sports Dataset), and demonstrate the effectiveness of the learned bases in a
    range of applications such as activity classification, inference of dynamics
    from a single frame, and synthetic representation of movements.

    Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation

    Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu
    Comments: arXiv admin note: substantial text overlap with arXiv:1608.08339
    Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

    We study the problem of recognizing video sequences of fingerspelled letters
    in American Sign Language (ASL). Fingerspelling comprises a significant but
    relatively understudied part of ASL. Recognizing fingerspelling is challenging
    for a number of reasons: It involves quick, small motions that are often highly
    coarticulated; it exhibits significant variation between signers; and there has
    been a dearth of continuous fingerspelling data collected. In this work we
    collect and annotate a new data set of continuous fingerspelling videos,
    compare several types of recognizers, and explore the problem of signer
    variation. Our best-performing models are segmental (semi-Markov) conditional
    random fields using deep neural network-based features. In the signer-dependent
    setting, our recognizers achieve up to about 92% letter accuracy. The
    multi-signer setting is much more challenging, but with neural network
    adaptation we achieve up to 83% letter accuracies in this setting.

    Autonomous Exploration with a Low-Cost Quadrocopter using Semi-Dense Monocular SLAM

    Lukas von Stumberg, Vladyslav Usenko, Jakob Engel, Jörg Stückler, Daniel Cremers
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    Micro aerial vehicles (MAVs) are strongly limited in their payload and power
    capacity. In order to implement autonomous navigation, algorithms are
    therefore desirable that use sensory equipment that is as small, lightweight,
    and low-power as possible. In this paper, we propose a method for autonomous
    MAV navigation and exploration using a low-cost consumer-grade quadrocopter
    equipped with a monocular camera. Our vision-based navigation system builds on
    LSD-SLAM which estimates the MAV trajectory and a semi-dense reconstruction of
    the environment in real-time. Since LSD-SLAM only determines depth at high
    gradient pixels, texture-less areas are not directly observed. We propose an
    obstacle mapping and exploration approach that takes this property into
    account. In experiments, we demonstrate our vision-based autonomous navigation
    and exploration system with a commercially available Parrot Bebop MAV.

    Low-complexity Image and Video Coding Based on an Approximate Discrete Tchebichef Transform

    P. A. M. Oliveira, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake
    Comments: 11 pages, 5 figures, 4 tables
    Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Data Structures and Algorithms (cs.DS); Computation (stat.CO); Methodology (stat.ME)

    Linear transformations are highly relevant for data decorrelation
    applications, such as image and video compression. In that context, the
    discrete Tchebichef transform (DTT) possesses useful coding and decorrelation
    properties. The DTT transform kernel does not depend on the input data, and
    fast algorithms can be developed for real-time applications. However, the DTT
    fast algorithms presented in the literature possess high computational
    complexity. In this work, we introduce a new low-complexity approximation for
    the DTT. The fast algorithm of the proposed transform is multiplication-free
    and requires a reduced number of additions and bit-shifting operations. Image
    and video compression simulations in popular standards show the good
    performance of the proposed transform. Regarding hardware resource
    consumption, an FPGA implementation shows a 43.1% reduction in configurable
    logic blocks, and an ASIC place-and-route realization shows a 57.7% reduction
    in the area-time figure, compared with the 2-D version of the exact DTT.
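
    As a point of reference for the approximation, the exact orthonormal DTT
    kernel can be generated numerically and applied separably to a 2-D block.
    A minimal numpy sketch (the paper's multiplication-free approximation matrix
    itself is not reproduced here):

```python
import numpy as np

def dtt_matrix(N):
    """Exact orthonormal discrete Tchebichef transform matrix, obtained by
    Gram-Schmidt orthonormalization (QR) of the monomial basis 1, x, x^2, ...
    over the uniform grid x = 0..N-1."""
    x = np.arange(N, dtype=float)
    V = np.vander(x, N, increasing=True)   # columns: 1, x, x^2, ...
    Q, _ = np.linalg.qr(V)
    T = Q.T
    signs = np.sign(T[:, 0])               # fix row signs to a common convention
    signs[signs == 0] = 1.0
    return T * signs[:, None]

def dtt2(block, T):
    """Separable 2-D DTT of a square block: T @ block @ T.T."""
    return T @ block @ T.T

T = dtt_matrix(8)
coeffs = dtt2(np.outer(np.arange(8.0), np.ones(8)), T)  # smooth ramp block
```

    Because T is orthogonal, the inverse transform is simply T.T @ coeffs @ T;
    the approximate DTT trades this exactness for a multiplierless kernel.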


    Artificial Intelligence

    Commonsense reasoning, commonsense knowledge, and the SP theory of intelligence

    J Gerard Wolff
    Subjects: Artificial Intelligence (cs.AI)

    This paper describes how the “SP theory of intelligence”, outlined in an
    Appendix, may throw light on aspects of commonsense reasoning (CSR) and
    commonsense knowledge (CSK) (together shortened to CSRK), as discussed in
    another paper by Ernest Davis and Gary Marcus (DM). The SP system has the
    generality needed for CSRK: Turing equivalence; the generality of information
    compression as the foundation for the SP system both in the representation of
    knowledge and in concepts of prediction and probability; the versatility of the
    SP system in the representation of knowledge and in aspects of intelligence
    including forms of reasoning; and the potential of the system for the seamless
    integration of diverse forms of knowledge and diverse aspects of intelligence.
    Several examples described by DM are considered, along with how they may be
    processed in the SP system. Also discussed are current successes in CSR
    (taxonomic reasoning, temporal reasoning, action and change, and qualitative
    reasoning),
    how the SP system may promote seamless integration across these areas, and how
    insights gained from the SP programme of research may yield some potentially
    useful new ways of approaching these topics. The paper considers how the SP
    system may help overcome several challenges in the automation of CSR described
    by DM, and how it meets several of the objectives for research in CSRK that
    they have described.

    Testing Quantum Models of Conjunction Fallacy on the World Wide Web

    Diederik Aerts, Jonito Aerts Arguëlles, Lyneth Beltran, Massimiliano Sassoli de Bianchi, Sandro Sozzo, Tomas Veloz
    Comments: 12 pages
    Subjects: Artificial Intelligence (cs.AI); Quantum Physics (quant-ph)

    The ‘conjunction fallacy’ has been extensively debated by scholars in
    cognitive science and, in recent times, the discussion has been enriched by the
    proposal of modeling the fallacy using the quantum formalism. Two major quantum
    approaches have been put forward: the first assumes that respondents use a
    two-step sequential reasoning and that the fallacy results from the presence of
    ‘question order effects’; the second assumes that respondents evaluate the
    cognitive situation as a whole and that the fallacy results from the ’emergence
    of new meanings’, as an ‘effect of overextension’ in the conceptual
    conjunction. Thus, the question arises whether, and to what extent,
    conjunction fallacies result from ‘order effects’ or, instead, from
    ’emergence effects’. To help clarify this situation, we propose to use the
    World Wide Web as an ‘information space’ that can be interrogated both in a
    sequential and non-sequential way, to test these two quantum approaches. We
    find that ’emergence effects’, and not ‘order effects’, should be considered
    the main cognitive mechanism producing the observed conjunction fallacies.

    Towards Evidence-Based Ontology for Supporting Systematic Literature Review

    Yueming Sun, Ye Yang, He Zhang, Wen Zhang, Qing Wang
    Subjects: Digital Libraries (cs.DL); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)

    [Background]: Systematic Literature Review (SLR) has become an important
    software engineering research method but requires tremendous effort. [Aim]:
    This paper proposes an approach that leverages an empirically evolved
    ontology to support automating key SLR activities. [Method]: First, we
    propose an ontology, SLRONT, built on SLR experiences and best practices, as
    a groundwork to capture common terminologies and their relationships during
    SLR processes; second, we present an extended version of SLRONT, the COSONT,
    and instantiate it with the knowledge and concepts extracted from structured
    abstracts. Case studies illustrate the details of applying it for supporting
    SLR steps. [Results]: Results show that by using COSONT we reach the same
    conclusions as with sheer manual work, but the effort involved is
    significantly reduced. [Conclusions]: The approach of using ontology can
    effectively and efficiently support the conduct of systematic literature
    reviews.

    Dropout with Expectation-linear Regularization

    Xuezhe Ma, Yingkai Gao, Zhiting Hu, Yaoliang Yu, Yuntian Deng, Eduard Hovy
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Dropout, a simple and effective way to train deep neural networks, has led to
    a number of impressive empirical successes and spawned many recent theoretical
    investigations. However, the gap between dropout’s training and inference
    phases, introduced due to tractability considerations, has largely remained
    under-appreciated. In this work, we first formulate dropout as a tractable
    approximation of some latent variable model, leading to a clean view of
    parameter sharing and enabling further theoretical analysis. Then, we introduce
    (approximate) expectation-linear dropout neural networks, whose inference gap
    we are able to formally characterize. Algorithmically, we show that our
    proposed measure of the inference gap can be used to regularize the standard
    dropout training objective, resulting in an explicit control of the gap.
    Our method is as simple and efficient as standard dropout. We further prove
    upper bounds on the loss in accuracy due to expectation-linearization, and
    describe classes of input distributions that expectation-linearize easily. Experiments
    on three image classification benchmark datasets demonstrate that reducing the
    inference gap can indeed improve the performance consistently.
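
    The training/inference gap the abstract refers to can be made concrete with
    a toy network: average many sampled-dropout forward passes and compare them
    with the single deterministic pass that standard dropout uses at test time.
    A hedged numpy sketch (this is not the authors' regularizer, just the gap
    quantity it targets):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, W2, W3, mask=None, p=0.5):
    """Two hidden layers; inverted dropout after the first.
    With mask=None this is the deterministic inference pass."""
    h1 = np.maximum(0.0, x @ W1)
    if mask is not None:
        h1 = h1 * mask / p                 # sampled dropout, train-time
    h2 = np.maximum(0.0, h1 @ W2)          # nonlinearity after dropout => gap
    return h2 @ W3

x = rng.normal(size=(16, 10))
W1, W2, W3 = (rng.normal(size=s) * 0.5 for s in [(10, 32), (32, 32), (32, 4)])

# Monte-Carlo estimate of the expected dropout output
mc = np.mean([forward(x, W1, W2, W3, mask=rng.binomial(1, 0.5, size=32))
              for _ in range(2000)], axis=0)
det = forward(x, W1, W2, W3)               # standard dropout inference

# Relative inference gap: the quantity an expectation-linear
# regularizer would drive toward zero
gap = np.linalg.norm(mc - det) / np.linalg.norm(det)
```

    The gap is nonzero exactly because a nonlinearity follows the dropout layer,
    so the expectation does not commute with the network.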

    Pointer Sentinel Mixture Models

    Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Recent neural network sequence models with softmax classifiers have achieved
    their best language modeling performance only with very large hidden states and
    large vocabularies. Even then they struggle to predict rare or unseen words
    even if the context makes the prediction unambiguous. We introduce the pointer
    sentinel mixture architecture for neural sequence models which has the ability
    to either reproduce a word from the recent context or produce a word from a
    standard softmax classifier. Our pointer sentinel-LSTM model achieves state of
    the art language modeling performance on the Penn Treebank (70.9 perplexity)
    while using far fewer parameters than a standard softmax LSTM. In order to
    evaluate how well language models can exploit longer contexts and deal with
    more realistic vocabularies and larger corpora we also introduce the freely
    available WikiText corpus.
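
    The mixture described above can be sketched in a few lines: a pointer
    distribution over recent context words and a vocabulary softmax are
    combined, with the sentinel's share of the pointer attention acting as the
    gate. A simplified numpy illustration (the sentinel score is fixed to 0
    here; in the model it comes from a learned vector):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def pointer_sentinel(vocab_logits, ptr_scores, context_ids):
    """Mix p_vocab with a pointer over the context. The last attention slot
    is the sentinel; its mass g gates the vocabulary softmax."""
    attn = softmax(np.append(ptr_scores, 0.0))   # context scores + sentinel
    g = attn[-1]
    p = g * softmax(vocab_logits)                # vocabulary component
    for pos, w in enumerate(context_ids):        # scatter pointer mass
        p[w] += attn[pos]
    return p

# Context "a b a" with word ids [0, 1, 0] over a 5-word vocabulary
p = pointer_sentinel(np.zeros(5), np.array([2.0, 0.5, 2.0]), [0, 1, 0])
```

    Words seen in the recent context receive extra probability mass from the
    pointer, which is what lets the model predict rare words the softmax alone
    would miss.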

    Learning by Stimulation Avoidance: A Principle to Control Spiking Neural Networks Dynamics

    Lana Sinapayen, Atsushi Masumori, Takashi Ikegami
    Comments: 17 pages, 11 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Learning based on networks of real neurons, and by extension biologically
    inspired models of neural networks, has yet to find general learning rules
    leading to widespread applications. In this paper, we argue for the existence
    of a principle for steering the dynamics of a biologically inspired neural
    network. Using carefully timed external stimulation, the network can be driven
    towards a desired dynamical state. We term this principle “Learning by
    Stimulation Avoidance” (LSA). We demonstrate through simulation that the
    minimal sufficient conditions leading to LSA in artificial networks are also
    sufficient to reproduce learning results similar to those obtained in
    biological neurons by Shahaf and Marom [1]. We examine the mechanism’s basic
    dynamics in a reduced network, and demonstrate how it scales up to a network of
    100 neurons. We show that LSA has a higher explanatory power than existing
    hypotheses about the response of biological neural networks to external
    stimulation, and can be used as a learning rule for an embodied application:
    learning of wall avoidance by a simulated robot. The surge in popularity of
    artificial neural networks is mostly directed to disembodied models of neurons
    with biologically irrelevant dynamics: to the authors’ knowledge, this is the
    first work demonstrating sensory-motor learning with random spiking networks
    through pure Hebbian learning.

    Predictive modelling of football injuries

    Stylianos Kampakis
    Comments: PhD Thesis submitted and defended successfully at the Department of Computer Science at University College London
    Subjects: Applications (stat.AP); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    The goal of this thesis is to investigate the potential of predictive
    modelling for football injuries. This work was conducted in close collaboration
    with Tottenham Hotspurs FC (THFC), the PGA European tour and the participation
    of Wolverhampton Wanderers (WW).

    Three investigations were conducted:

    1. Predicting the recovery time of football injuries using the UEFA injury
    recordings: The UEFA recordings is a common standard for recording injuries in
    professional football. For this investigation, three datasets of UEFA injury
    recordings were available. Different machine learning algorithms were used in
    order to build a predictive model. The performance of the machine learning
    models is then improved by using feature selection conducted through
    correlation-based subset feature selection and random forests.

    2. Predicting injuries in professional football using exposure records: The
    relationship between exposure (in training hours and match hours) in
    professional football athletes and injury incidence was studied. A common
    problem in football is understanding how the training schedule of an athlete
    can affect the chance of him getting injured. The task was to predict the
    number of days a player can train before he gets injured.

    3. Predicting intrinsic injury incidence using in-training GPS measurements:
    A significant percentage of football injuries can be attributed to overtraining
    and fatigue. GPS data collected during training sessions might provide
    indicators of fatigue, or might be used to detect very intense training
    sessions which can lead to overtraining. This research used GPS data gathered
    during training sessions of the first team of THFC, in order to predict whether
    an injury would take place during a week.


    Computation and Language

    An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages

    Antonios Anastasopoulos, David Chiang, Long Duong
    Comments: accepted at EMNLP 2016
    Subjects: Computation and Language (cs.CL)

    For many low-resource languages, spoken language resources are more likely to
    be annotated with translations than with transcriptions. Translated speech data
    is potentially valuable for documenting endangered languages or for training
    speech translation systems. A first step towards making use of such data would
    be to automatically align spoken words with their translations. We present a
    model that combines Dyer et al.’s reparameterization of IBM Model 2
    (fast-align) and k-means clustering using Dynamic Time Warping as a distance
    metric. The two components are trained jointly using expectation-maximization.
    In an extremely low-resource scenario, our model performs significantly better
    than both a neural model and a strong baseline.
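
    The clustering component described above relies on Dynamic Time Warping as
    its distance metric; a minimal sketch of DTW between two feature sequences
    (the jointly trained EM model itself is not reproduced):

```python
import numpy as np

def dtw(a, b):
    """Dynamic Time Warping distance between two sequences of feature
    vectors (frames x dims), with Euclidean local cost and the standard
    three-way recurrence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

s = np.array([[0.0], [1.0], [2.0]])
stretched = np.array([[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]])
# A time-stretched copy aligns at zero cost, which plain frame-by-frame
# Euclidean distance cannot do
```

    This invariance to local time stretching is what makes DTW a natural
    distance for clustering spoken word tokens of varying duration.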

    Creating Causal Embeddings for Question Answering with Minimal Supervision

    Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Peter Clark, Michael Hammond
    Comments: To appear in EMNLP 2016
    Subjects: Computation and Language (cs.CL)

    A common model for question answering (QA) is that a good answer is one that
    is closely related to the question, where relatedness is often determined using
    general-purpose lexical models such as word embeddings. We argue that a better
    approach is to look for answers that are related to the question in a relevant
    way, according to the information need of the question, which may be determined
    through task-specific embeddings. With causality as a use case, we implement
    this insight in three steps. First, we generate causal embeddings
    cost-effectively by bootstrapping cause-effect pairs extracted from free text
    using a small set of seed patterns. Second, we train dedicated embeddings over
    this data, by using task-specific contexts, i.e., the context of a cause is its
    effect. Finally, we extend a state-of-the-art reranking approach for QA to
    incorporate these causal embeddings. We evaluate the causal embedding models
    both directly with a causal implication task, and indirectly, in a downstream
    causal QA task using data from Yahoo! Answers. We show that explicitly modeling
    causality improves performance in both tasks. In the QA task our best model
    achieves 37.3% P@1, significantly outperforming a strong baseline by 7.7%
    (relative).

    Toward Socially-Infused Information Extraction: Embedding Authors, Mentions, and Entities

    Yi Yang, Ming-Wei Chang, Jacob Eisenstein
    Comments: Accepted to EMNLP 2016
    Subjects: Computation and Language (cs.CL)

    Entity linking is the task of identifying mentions of entities in text, and
    linking them to entries in a knowledge base. This task is especially difficult
    in microblogs, as there is little additional text to provide disambiguating
    context; rather, authors rely on an implicit common ground of shared knowledge
    with their readers. In this paper, we attempt to capture some of this implicit
    context by exploiting the social network structure in microblogs. We build on
    the theory of homophily, which implies that socially linked individuals share
    interests, and are therefore likely to mention the same sorts of entities. We
    implement this idea by encoding authors, mentions, and entities in a continuous
    vector space, which is constructed so that socially-connected authors have
    similar vector representations. These vectors are incorporated into a neural
    structured prediction model, which captures structural constraints that are
    inherent in the entity linking task. Together, these design decisions yield F1
    improvements of 1%-5% on benchmark datasets, as compared to the previous
    state-of-the-art.

    S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking

    Yi Yang, Ming-Wei Chang
    Comments: Appeared in ACL 2015 proceedings. This is an updated version. More details available in the pdf file
    Subjects: Computation and Language (cs.CL)

    Non-linear models have recently received much attention, as researchers are
    starting to discover the power of statistical and embedding features. However,
    tree-based models are seldom studied in the context of structured learning
    despite their recent success on various classification and ranking tasks. In
    this paper, we propose S-MART, a tree-based structured learning framework based
    on multiple additive regression trees. S-MART is especially suitable for
    handling tasks with dense features, and can be used to learn many different
    structures under various loss functions.

    We apply S-MART to the task of tweet entity linking — a core component of
    tweet information extraction, which aims to identify and link name mentions to
    entities in a knowledge base. A novel inference algorithm is proposed to handle
    the special structure of the task. The experimental results show that S-MART
    significantly outperforms state-of-the-art tweet entity linking systems.

    Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation

    Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu
    Comments: arXiv admin note: substantial text overlap with arXiv:1608.08339
    Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

    We study the problem of recognizing video sequences of fingerspelled letters
    in American Sign Language (ASL). Fingerspelling comprises a significant but
    relatively understudied part of ASL. Recognizing fingerspelling is challenging
    for a number of reasons: It involves quick, small motions that are often highly
    coarticulated; it exhibits significant variation between signers; and there has
    been a dearth of continuous fingerspelling data collected. In this work we
    collect and annotate a new data set of continuous fingerspelling videos,
    compare several types of recognizers, and explore the problem of signer
    variation. Our best-performing models are segmental (semi-Markov) conditional
    random fields using deep neural network-based features. In the signer-dependent
    setting, our recognizers achieve up to about 92% letter accuracy. The
    multi-signer setting is much more challenging, but with neural network
    adaptation we achieve up to 83% letter accuracies in this setting.

    Pointer Sentinel Mixture Models

    Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Recent neural network sequence models with softmax classifiers have achieved
    their best language modeling performance only with very large hidden states and
    large vocabularies. Even then they struggle to predict rare or unseen words
    even if the context makes the prediction unambiguous. We introduce the pointer
    sentinel mixture architecture for neural sequence models which has the ability
    to either reproduce a word from the recent context or produce a word from a
    standard softmax classifier. Our pointer sentinel-LSTM model achieves state of
    the art language modeling performance on the Penn Treebank (70.9 perplexity)
    while using far fewer parameters than a standard softmax LSTM. In order to
    evaluate how well language models can exploit longer contexts and deal with
    more realistic vocabularies and larger corpora we also introduce the freely
    available WikiText corpus.

    A Factorized Model for Transitive Verbs in Compositional Distributional Semantics

    Lilach Edelstein, Roi Reichart
    Subjects: Computation and Language (cs.CL)

    We present a factorized compositional distributional semantics model for the
    representation of transitive verb constructions. Our model first produces
    (subject, verb) and (verb, object) vector representations based on the
    similarity of the nouns in the construction to each of the nouns in the
    vocabulary and the tendency of these nouns to take the subject and object roles
    of the verb. These vectors are then combined into a final (subject,verb,object)
    representation through simple vector operations. On two established tasks for
    the transitive verb construction our model outperforms recent previous work.

    Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation

    Jinsong Su, Zhixing Tan, Deyi Xiong, Yang Liu
    Subjects: Computation and Language (cs.CL)

    Neural machine translation (NMT) heavily relies on word level modelling to
    learn semantic representations of input sentences. However, for languages
    without natural word delimiters (e.g., Chinese) where input sentences have to
    be tokenized first, conventional NMT is confronted with two issues: 1) it is
    difficult to find an optimal tokenization granularity for source sentence
    modelling, and 2) errors in 1-best tokenizations may propagate to the encoder
    of NMT. To handle these issues, we propose word-lattice based Recurrent Neural
    Network (RNN) encoders for NMT, which generalize the standard RNN to word
    lattice topology. The proposed encoders take as input a word lattice that
    compactly encodes multiple tokenizations, and learn to generate new hidden
    states from arbitrarily many inputs and hidden states in preceding time steps.
    As such, the word-lattice based encoders not only alleviate the negative impact
    of tokenization errors but also are more expressive and flexible to embed input
    sentences. Experimental results on Chinese-English translation demonstrate
    the superiority of the proposed encoders over the conventional encoder.

    Large-Scale Machine Translation between Arabic and Hebrew: Available Corpora and Initial Results

    Yonatan Belinkov, James Glass
    Comments: SeMaT 2016
    Subjects: Computation and Language (cs.CL)

    Machine translation between Arabic and Hebrew has so far been limited by a
    lack of parallel corpora, despite the political and cultural importance of this
    language pair. Previous work relied on manually-crafted grammars or pivoting
    via English, both of which are unsatisfactory for building a scalable and
    accurate MT system. In this work, we compare standard phrase-based and neural
    systems on Arabic-Hebrew translation. We experiment with tokenization by
    external tools and sub-word modeling by character-level neural models, and show
    that both methods lead to improved translation performance, with a small
    advantage to the neural models.

    The distribution of information content in English sentences

    Shuiyuan Yu, Jin Cong, Junying Liang, Haitao Liu
    Subjects: Computation and Language (cs.CL)

    The sentence is a basic linguistic unit; however, little is known about how
    information content is distributed across different positions of a sentence.
    Based on authentic language data of English, the present study calculated the
    entropy and other entropy-related statistics for different sentence positions.
    The statistics indicate a three-step staircase-shaped distribution pattern,
    with entropy in the initial position lower than the medial positions (positions
    other than the initial and final), the medial positions lower than the final
    position and the medial positions showing no significant difference. The
    results suggest that: (1) the hypotheses of Constant Entropy Rate and Uniform
    Information Density do not hold for the sentence-medial positions; (2) the
    context of a word in a sentence should not be simply defined as all the words
    preceding it in the same sentence; and (3) the contextual information content
    in a sentence does not accumulate incrementally but follows a pattern of “the
    whole is greater than the sum of parts”.
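
    The per-position statistic at the center of the study is straightforward to
    compute: group words by their position index and take the Shannon entropy of
    each position's word distribution. A toy sketch:

```python
import math
from collections import Counter

def positional_entropy(sentences):
    """Shannon entropy (bits) of the word distribution observed at each
    sentence position, over all sentences long enough to have that position."""
    max_len = max(len(s) for s in sentences)
    ents = []
    for pos in range(max_len):
        counts = Counter(s[pos] for s in sentences if len(s) > pos)
        total = sum(counts.values())
        ents.append(-sum(c / total * math.log2(c / total)
                         for c in counts.values()))
    return ents

corpus = [["the", "cat", "sat"], ["the", "dog", "ran"],
          ["a", "cat", "ran"], ["the", "dog", "sat"]]
ents = positional_entropy(corpus)
```

    On this toy corpus the initial position (dominated by "the") has lower
    entropy than the later positions; the paper computes the same kind of
    statistic over authentic English data.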

    Existence of Hierarchies and Human's Pursuit of Top Hierarchy Lead to Power Law

    Shuiyuan Yu, Junying Liang, Haitao Liu
    Subjects: Computation and Language (cs.CL); Physics and Society (physics.soc-ph)

    The power law is ubiquitous in natural and social phenomena, and is
    considered as a universal relationship between the frequency and its rank for
    diverse social systems. However, a general model is still lacking to interpret
    why these seemingly unrelated systems share great similarity. Through a
    detailed analysis of natural language texts and simulation experiments based on
    the proposed ‘Hierarchical Selection Model’, we found that the existence of
    hierarchies and human’s pursuit of top hierarchy lead to the power law.
    Further, the power law is a statistical and emergent performance of
    hierarchies, and it is the universality of hierarchies that contributes to the
    ubiquity of the power law.
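
    The rank-frequency relationship in question is usually checked by fitting a
    line to log(frequency) versus log(rank); a slope near -1 indicates a
    Zipf-like power law. A small sketch (the paper's 'Hierarchical Selection
    Model' simulation itself is not reproduced):

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank); a power law
    appears as a straight line in log-log coordinates."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic Zipfian data: word r occurs ~100/r times
tokens = [f"w{r}" for r in range(1, 6) for _ in range(100 // r)]
slope = zipf_slope(tokens)   # close to -1
```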

    An Investigation of Recurrent Neural Architectures for Drug Name Recognition

    Raghavendra Chalapathy, Ehsan Zare Borzeshi, Massimo Piccardi
    Comments: Accepted for Oral Presentation at LOUHI 2016 : EMNLP 2016 Workshop – The Seventh International Workshop on Health Text Mining and Information Analysis (LOUHI 2016)
    Subjects: Computation and Language (cs.CL)

    Drug name recognition (DNR) is an essential step in the Pharmacovigilance
    (PV) pipeline. DNR aims to find drug name mentions in unstructured biomedical
    texts and classify them into predefined categories. State-of-the-art DNR
    approaches heavily rely on hand crafted features and domain specific resources
    which are difficult to collect and tune. For this reason, this paper
    investigates the effectiveness of contemporary recurrent neural architectures –
    the Elman and Jordan networks and the bidirectional LSTM with CRF decoding – at
    performing DNR straight from the text. The experimental results achieved on the
    authoritative SemEval-2013 Task 9.1 benchmarks show that the bidirectional
    LSTM-CRF ranks closely to highly-dedicated, hand-crafted systems.

    A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects

    Yonatan Belinkov, James Glass
    Comments: DSL 2016
    Subjects: Computation and Language (cs.CL)

    Discriminating between closely-related language varieties is considered a
    challenging and important task. This paper describes our submission to the DSL
    2016 shared-task, which included two sub-tasks: one on discriminating similar
    languages and one on identifying Arabic dialects. We developed a
    character-level neural network for this task. Given a sequence of characters,
    our model embeds each character in vector space, runs the sequence through
    multiple convolutions with different filter widths, and pools the convolutional
    representations to obtain a hidden vector representation of the text that is
    used for predicting the language or dialect. We primarily focused on the Arabic
    dialect identification task and obtained an F1 score of 0.4834, ranking 6th out
    of 18 participants. We also analyze errors made by our system on the Arabic
    data in some detail, and point to challenges that such an approach faces.

    Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser

    Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith
    Comments: 10 pages. To appear at EMNLP 2016
    Subjects: Computation and Language (cs.CL)

    We introduce two first-order graph-based dependency parsers achieving a new
    state of the art. The first is a consensus parser built from an ensemble of
    independently trained greedy LSTM transition-based parsers with different
    random initializations. We cast this approach as minimum Bayes risk decoding
    (under the Hamming cost) and argue that weaker consensus within the ensemble is
    a useful signal of difficulty or ambiguity. The second parser is a
    “distillation” of the ensemble into a single model. We train the distillation
    parser using a structured hinge loss objective with a novel cost that
    incorporates ensemble uncertainty estimates for each possible attachment,
    thereby avoiding the intractable cross-entropy computations required by
    applying standard distillation objectives to problems with structured outputs.
    The first-order distillation parser matches or surpasses the state of the art
    on English, Chinese, and German.
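
    Under Hamming cost, the MBR consensus over head attachments reduces
    (ignoring the tree constraint, which the paper's first-order graph decoder
    enforces) to a per-word plurality vote, with the vote margin serving as the
    ambiguity signal mentioned above. A minimal sketch:

```python
from collections import Counter

def consensus_heads(ensemble):
    """Per-word plurality vote over head predictions from K parsers.
    `ensemble[k][i]` is the head chosen for word i by parser k. Returns the
    voted heads and each vote's margin (a weak margin signals an ambiguous
    attachment). The tree constraint is ignored in this sketch."""
    K, n = len(ensemble), len(ensemble[0])
    heads, margins = [], []
    for i in range(n):
        votes = Counter(parse[i] for parse in ensemble)
        head, count = votes.most_common(1)[0]
        heads.append(head)
        margins.append(count / K)
    return heads, margins

# Three greedy parsers agree on words 0 and 1, disagree on word 2
heads, margins = consensus_heads([[0, 1, 1], [0, 1, 2], [0, 1, 1]])
```

    The per-attachment margins are exactly the kind of ensemble uncertainty
    estimate that the distillation cost in the abstract builds on.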

    Speaker Recognition for Children's Speech

    Saeid Safavi, Maryam Najafian, Abualsoud Hanani, Martin J Russell, Peter Jancovic, Michael J Carey
    Comments: INTERSPEECH 2012, Pages 1836-1839
    Subjects: Sound (cs.SD); Computation and Language (cs.CL)

    This paper presents results on Speaker Recognition (SR) for children’s
    speech, using the OGI Kids corpus and GMM-UBM and GMM-SVM SR systems. Regions
    of the spectrum containing important speaker information for children are
    identified by conducting SR experiments over 21 frequency bands. As for adults,
    the spectrum can be split into four regions, with the first (containing primary
    vocal tract resonance information) and third (corresponding to high frequency
    speech sounds) being most useful for SR. However, the frequencies at which
    these regions occur are from 11% to 38% higher for children. It is also noted
    that subband SR rates are lower for younger children. Finally results are
    presented of SR experiments to identify a child in a class (30 children,
    similar age) and school (288 children, varying ages). Class performance depends
    on age, with accuracy varying from 90% for young children to 99% for older
    children. The identification rate achieved for a child in a school is 81%.


    Distributed, Parallel, and Cluster Computing

    Solving Batched Linear Programs on GPU and Multicore CPU

    Amit Gurung, Rajarshi Ray
    Comments: contains 31 pages in double line spacing, 11 figures with 2 tables. A preliminary work has been accepted in the 8th IEEE International Student Research Symposium on High Performance Computing, HiPC’2015, Bangalore, December 16-19, 2015. this http URL
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Linear Programs (LPs) appear in a large number of applications, and offloading
    them to the GPU is a viable way to gain performance. Existing work on offloading
    and solving an LP on the GPU suggests that performance gains come from
    large-sized LPs (typically 500 constraints and 500 variables or more). In order to gain
    performance from GPU for applications involving small to medium sized LPs, we
    propose batched solving of a large number of LPs in parallel. In this paper, we
    present the design and CUDA implementation of our batched LP solver library,
    keeping memory coalescent access, reduced CPU-GPU memory transfer latency and
    load balancing as the goals. The performance of the batched LP solver is
    compared against sequential solving in the CPU using an open source solver GLPK
    (GNU Linear Programming Kit). The performance is evaluated for three types of
    LPs. The first type is the initial basic solution as feasible, the second type
    is the initial basic solution as infeasible and the third type is the feasible
    region as a Hyperbox. For the first type, we show a maximum speedup of
    $18.3 imes$ when running a batch of $50k$ LPs of size $100$ ($100$ variables,
    $100$ constraints). For the second type, a maximum speedup of $12 imes$ is
    obtained with a batch of $10k$ LPs of size $200$. For the third type, we show a
    significant speedup of $63 imes$ in solving a batch of nearly $4$ million LPs
    of size 5 and $34 imes$ in solving 6 million LPs of size $28$. In addition, we
    show that the open source library for solving linear programs-GLPK, can be
    easily extended to solve many LPs in parallel with multi-threading. The thread
    parallel GLPK implementation runs $9.6 imes$ faster in solving a batch of
    $1e5$ LPs of size $100$, on a $12$-core Intel Xeon processor. We demonstrate
    the application of our batched LP solver in the domain of state-space
    exploration of mathematical models of control systems design.
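
    The batching idea — many independent small LPs dispatched to parallel
    workers — can be sketched in plain Python. The naive vertex-enumeration
    kernel and thread pool below are stand-ins for the paper's CUDA simplex and
    thread-parallel GLPK, not reproductions of them:

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def solve_lp_2d(c, A, b, eps=1e-9):
    """Naive solver for max c.x s.t. A x <= b, x >= 0 with 2 variables,
    by enumerating constraint-pair intersections (vertices). Assumes the
    LP is bounded; fine only for tiny illustrative problems."""
    rows = [list(r) for r in A] + [[-1.0, 0.0], [0.0, -1.0]]  # x, y >= 0
    rhs = list(b) + [0.0, 0.0]
    best = None
    for (a1, b1), (a2, b2) in itertools.combinations(zip(rows, rhs), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < eps:
            continue  # parallel constraints: no vertex
        x = (b1 * a2[1] - a1[1] * b2) / det   # Cramer's rule
        y = (a1[0] * b2 - b1 * a2[0]) / det
        if all(r[0] * x + r[1] * y <= rb + eps for r, rb in zip(rows, rhs)):
            val = c[0] * x + c[1] * y
            if best is None or val > best[0]:
                best = (val, (x, y))
    return best  # None if infeasible

def solve_batch(lps, workers=4):
    # Solve many independent small LPs in parallel (the batching idea).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda lp: solve_lp_2d(*lp), lps))

# Batch of identical LPs: max x+y s.t. x <= 1, y <= 1 -> optimum 2 at (1,1).
batch = [([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])] * 8
results = solve_batch(batch)
```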

    Scalable Estimation of Precision Maps in a MapReduce Framework

    Claus Brenner
    Comments: ACM SIGSPATIAL’16, October 31-November 03, 2016, Burlingame, CA, USA
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Robotics (cs.RO)

    This paper presents a large-scale strip adjustment method for LiDAR mobile
    mapping data, yielding highly precise maps. It uses several concepts to achieve
    scalability. First, an efficient graph-based pre-segmentation is used, which
    directly operates on LiDAR scan strip data, rather than on point clouds.
    Second, observation equations are obtained from a dense matching, which is
    formulated in terms of an estimation of a latent map. As a result of this
    formulation, the number of observation equations is not quadratic, but rather
    linear in the number of scan strips. Third, the dynamic Bayes network, which
    results from all observation and condition equations, is partitioned into two
    sub-networks. Consequently, the estimation matrices for all position and
    orientation corrections are linear instead of quadratic in the number of
    unknowns and can be solved very efficiently using an alternating least squares
    approach. It is shown how this approach can be mapped to a standard key/value
    MapReduce implementation, where each of the processing nodes operates
    independently on small chunks of data, leading to essentially linear
    scalability. Results are demonstrated for a dataset of one billion measured
    LiDAR points and 278,000 unknowns, leading to maps with a precision of a few
    millimeters.

    Opportunistic Network Decoupling With Virtual Full-Duplex Operation in Multi-Source Interfering Relay Networks

    Won-Yong Shin, Vien V. Mai, Bang Chul Jung, Hyun Jong Yang
    Comments: 22 pages, 5 figures, To appear in IEEE Transactions on Mobile Computing
    Subjects: Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)

    We introduce a new achievability scheme, termed opportunistic network
    decoupling (OND), operating in virtual full-duplex mode. In the scheme, a novel
    relay scheduling strategy is utilized in the $K \times N \times K$ channel with
    interfering relays, consisting of $K$ source–destination pairs and $N$
    half-duplex relays in-between them. A subset of relays using alternate relaying
    is opportunistically selected in terms of producing the minimum total
    interference level, thereby resulting in network decoupling. As our main
    result, it is shown that under a certain relay scaling condition, the OND
    protocol achieves $K$ degrees of freedom even in the presence of interfering
    links among relays. Numerical evaluation also validates the performance of the
    proposed OND protocol. Our protocol operates in a fully distributed fashion
    using only local channel state information, and is therefore relatively easy
    to implement.

    Benchmarking SciDB Data Import on HPC Systems

    Siddharth Samsi, Laura Brattain, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner, Albert Reuther
    Comments: 5 pages, 4 figures, IEEE High Performance Extreme Computing (HPEC) 2016, best paper finalist
    Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Quantitative Methods (q-bio.QM)

    SciDB is a scalable, computational database management system that uses an
    array model for data storage. The array data model of SciDB makes it ideally
    suited for storing and managing large amounts of imaging data. SciDB is
    designed to support advanced in-database analytics, thus reducing the need for
    extracting data for analysis. It is designed to be massively parallel and can
    run on commodity hardware in a high performance computing (HPC) environment. In
    this paper, we present the performance of SciDB using simulated image data. The
    Dynamic Distributed Dimensional Data Model (D4M) software is used to implement
    the benchmark on a cluster running the MIT SuperCloud software stack. A peak
    performance of 2.2M database inserts per second was achieved on a single node
    of this system. We also show that SciDB and the D4M toolbox provide more
    efficient ways to access random sub-volumes of massive datasets compared to the
    traditional approaches of reading volumetric data from individual files. This
    work describes the D4M and SciDB tools we developed and presents the initial
    performance results. This performance was achieved by using parallel inserts,
    an in-database merging of arrays, and supercomputing techniques such as
    distributed arrays and single-program-multiple-data programming.


    Learning

    Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

    Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean
    Subjects: Learning (cs.LG)

    Neural Machine Translation (NMT) is an end-to-end learning approach for
    automated translation, with the potential to overcome many of the weaknesses of
    conventional phrase-based translation systems. Unfortunately, NMT systems are
    known to be computationally expensive both in training and in translation
    inference. Also, most NMT systems have difficulty with rare words. These issues
    have hindered NMT’s use in practical deployments and services, where both
    accuracy and speed are essential. In this work, we present GNMT, Google’s
    Neural Machine Translation system, which attempts to address many of these
    issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder
    layers using attention and residual connections. To improve parallelism and
    therefore decrease training time, our attention mechanism connects the bottom
    layer of the decoder to the top layer of the encoder. To accelerate the final
    translation speed, we employ low-precision arithmetic during inference
    computations. To improve handling of rare words, we divide words into a limited
    set of common sub-word units (“wordpieces”) for both input and output. This
    method provides a good balance between the flexibility of “character”-delimited
    models and the efficiency of “word”-delimited models, naturally handles
    translation of rare words, and ultimately improves the overall accuracy of the
    system. Our beam search technique employs a length-normalization procedure and
    uses a coverage penalty, which encourages generation of an output sentence that
    is most likely to cover all the words in the source sentence. On the WMT’14
    English-to-French and English-to-German benchmarks, GNMT achieves results
    competitive with the state of the art. Using a human side-by-side evaluation on
    a set of
    isolated simple sentences, it reduces translation errors by an average of 60%
    compared to Google’s phrase-based production system.
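
    The length normalization mentioned above divides a hypothesis's
    log-probability by a penalty that grows with length, so beam search stops
    favoring short outputs. The sketch below uses the penalty form
    lp(Y) = ((5 + |Y|) / 6)^alpha reported in the GNMT paper; the candidate
    scores are made-up numbers:

```python
def length_penalty(length, alpha=0.6):
    # GNMT-style length normalization: lp(Y) = ((5 + |Y|) / 6)^alpha
    return ((5.0 + length) / 6.0) ** alpha

def normalized_score(log_prob, length, alpha=0.6):
    # Without normalization, beam search favors short outputs, since every
    # added token can only decrease the total log-probability.
    return log_prob / length_penalty(length, alpha)

# A longer hypothesis with a worse raw log-prob can win after normalization.
short = normalized_score(-4.0, 4)   # raw log-prob -4.0, 4 tokens
long_ = normalized_score(-4.5, 9)   # raw log-prob -4.5, 9 tokens
```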

    Dropout with Expectation-linear Regularization

    Xuezhe Ma, Yingkai Gao, Zhiting Hu, Yaoliang Yu, Yuntian Deng, Eduard Hovy
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Dropout, a simple and effective way to train deep neural networks, has led to
    a number of impressive empirical successes and spawned many recent theoretical
    investigations. However, the gap between dropout’s training and inference
    phases, introduced due to tractability considerations, has largely remained
    under-appreciated. In this work, we first formulate dropout as a tractable
    approximation of some latent variable model, leading to a clean view of
    parameter sharing and enabling further theoretical analysis. Then, we introduce
    (approximate) expectation-linear dropout neural networks, whose inference gap
    we are able to formally characterize. Algorithmically, we show that our
    proposed measure of the inference gap can be used to regularize the standard
    dropout training objective, resulting in an \emph{explicit} control of the gap.
    Our method is as simple and efficient as standard dropout. We further prove
    upper bounds on the loss in accuracy due to expectation-linearization, and
    describe classes of input distributions that expectation-linearize easily. Experiments
    on three image classification benchmark datasets demonstrate that reducing the
    inference gap can indeed improve the performance consistently.
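
    The inference gap in question can be estimated empirically for a toy
    network: compare the Monte Carlo average of dropout-masked outputs with the
    single deterministic forward pass used at test time. A stdlib-only sketch
    with a hypothetical one-unit tanh network (not the paper's setup):

```python
import math
import random

def net(x, w):
    # One tanh unit: the nonlinearity is what makes the expectation of
    # masked outputs differ from the output of the expected (unmasked) input.
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

def dropout_mask(n, keep, rng):
    # Inverted dropout: keep with prob `keep`, scale kept units by 1/keep.
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(n)]

def inference_gap(x, w, keep=0.5, samples=20000, seed=0):
    """Monte Carlo estimate of | E_mask[ f(mask * x) ] - f(x) |, i.e. the
    gap between dropout training-time expectation and standard inference."""
    rng = random.Random(seed)
    mc = 0.0
    for _ in range(samples):
        m = dropout_mask(len(x), keep, rng)
        mc += net([mi * xi for mi, xi in zip(m, x)], w)
    mc /= samples
    return abs(mc - net(x, w))

gap = inference_gap([1.0, -0.5, 2.0], [0.3, 0.8, -0.2])
```

    An exactly expectation-linear network would drive this gap to zero; the
    paper's regularizer penalizes it during training.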

    Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes

    Roy Fox
    Subjects: Learning (cs.LG)

    Bounded agents are limited by intrinsic constraints on their ability to
    process information that is available in their sensors and memory and choose
    actions and memory updates. In this dissertation, we model these constraints as
    information-rate constraints on communication channels connecting these various
    internal components of the agent.

    We make four major contributions detailed below and many smaller
    contributions detailed in each section. First, we formulate the problem of
    optimizing the agent under both extrinsic and intrinsic constraints and develop
    the main tools for solving it. Second, we identify another reason for the
    challenging convergence properties of the optimization algorithm, which is the
    bifurcation structure of the update operator near phase transitions. Third, we
    study the special case of linear-Gaussian dynamics and quadratic cost (LQG),
    where the optimal solution has a particularly simple and solvable form. Fourth,
    we explore the learning task, where the model of the world dynamics is unknown
    and sample-based updates are used instead.

    Derivative Delay Embedding: Online Modeling of Streaming Time Series

    Zhifei Zhang, Yang Song, Wei Wang, Hairong Qi
    Comments: Accepted by The 25th ACM International Conference on Information and Knowledge Management (CIKM 2016)
    Subjects: Learning (cs.LG)

    The staggering amount of streaming time series coming from the real world
    calls for more efficient and effective online modeling solutions. Most
    existing work on time series modeling makes unrealistic assumptions, such as
    requiring input data of fixed length or good alignment, which demands extra
    effort to segment or normalize the raw streaming data. Approaches that claim
    to be invariant to data length and misalignment are typically too
    time-consuming to model a streaming time series in an online manner. We
    propose a novel and more practical online modeling and classification scheme,
    DDE-MGM, which makes no assumptions about the time series while maintaining
    high efficiency and state-of-the-art performance. The derivative delay
    embedding (DDE) is developed to incrementally transform time series into the
    embedding space, where the intrinsic characteristics of the data are
    preserved as recursive patterns regardless of stream length and misalignment.
    Then, a non-parametric Markov geographic model (MGM) is proposed
    to both model and classify the pattern in an online manner. Experimental
    results demonstrate the effectiveness and superior classification accuracy of
    the proposed DDE-MGM in an online setting as compared to the state-of-the-art.
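
    The embedding step can be illustrated with a classical delay embedding
    built on first differences — a simplified stand-in for the paper's DDE; the
    dimension, delay, and differencing scheme here are illustrative assumptions:

```python
def derivative_delay_embedding(stream, dim=3, delay=2):
    """Map a scalar stream into an embedding space built from its first
    differences (a sketch of the delay-embedding idea; the paper's DDE
    details may differ). Returns one embedded point per sample once the
    window is warm, so it can run incrementally over a stream."""
    diffs = [b - a for a, b in zip(stream, stream[1:])]
    span = (dim - 1) * delay
    points = []
    for t in range(span, len(diffs)):
        points.append(tuple(diffs[t - k * delay] for k in range(dim)))
    return points

# A linear ramp has a constant derivative, so every embedded point coincides,
# regardless of where the stream starts or how long it is.
pts = derivative_delay_embedding([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```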

    Grounding object perception in a naive agent's sensorimotor experience

    Alban Laflaquière, Nikolas Hemion
    Comments: 7 pages, 4 figures, ICDL-Epirob 2015 conference
    Subjects: Robotics (cs.RO); Learning (cs.LG)

    Artificial object perception usually relies on a priori defined models and
    feature extraction algorithms. We study how the concept of object can be
    grounded in the sensorimotor experience of a naive agent. Without any knowledge
    about itself or the world it is immersed in, the agent explores its
    sensorimotor space and identifies objects as consistent networks of
    sensorimotor transitions, independent from their context. A fundamental drive
    for prediction is assumed to explain the emergence of such networks from a
    developmental standpoint. An algorithm is proposed and tested to illustrate the
    approach.

    Deep Structured Features for Semantic Segmentation

    Michael Tschannen, Lukas Cavigelli, Fabian Mentzer, Thomas Wiatowski, Luca Benini
    Comments: 10 pages, 2 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We propose a highly structured neural network architecture for semantic
    segmentation of images that combines i) a Haar wavelet-based tree-like
    convolutional neural network (CNN), ii) a random layer realizing a radial basis
    function kernel approximation, and iii) a linear classifier. While stages i)
    and ii) are completely pre-specified, only the linear classifier is learned
    from data. Thanks to its high degree of structure, our architecture has a very
    small memory footprint and thus fits onto low-power embedded and mobile
    platforms. We apply the proposed architecture to outdoor scene and aerial image
    semantic segmentation and show that the accuracy of our architecture is
    competitive with conventional pixel classification CNNs. Furthermore, we
    demonstrate that the proposed architecture is data efficient in the sense of
    matching the accuracy of pixel classification CNNs when trained on a much
    smaller data set.

    Random Forest for Malware Classification

    Felan Carlo C. Garcia, Felix P. Muga II
    Subjects: Cryptography and Security (cs.CR); Learning (cs.LG)

    The challenge in countering malware activities involves the correct
    identification and classification of different malware variants. Many malware
    variants incorporate code obfuscation methods that alter their code
    signatures, effectively defeating anti-malware detection techniques that rely
    on static methods and signature databases. In this study, we convert malware
    binaries into images and use Random Forest to classify various malware
    families. The resulting accuracy of 0.9562 demonstrates the effectiveness of
    the method in detecting malware.

    Accurate and Efficient Hyperbolic Tangent Activation Function on FPGA using the DCT Interpolation Filter

    Ahmed M. Abdelsalam, J.M. Pierre Langlois, F. Cheriet
    Comments: 8 pages, 6 figures, 5 tables, submitted for the 25th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (ISFPGA), 22-24 February 2017, California, USA
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Implementing an accurate and fast activation function with low cost is a
    crucial aspect to the implementation of Deep Neural Networks (DNNs) on FPGAs.
    We propose a high-accuracy approximation approach for the hyperbolic tangent
    activation function of artificial neurons in DNNs. It is based on the Discrete
    Cosine Transform Interpolation Filter (DCTIF). The proposed architecture
    combines simple arithmetic operations on stored samples of the hyperbolic
    tangent function and on input data. The proposed DCTIF implementation achieves
    two orders of magnitude greater precision than previous work while using the
    same or fewer computational resources. Various combinations of DCTIF parameters
    can be chosen to trade off the accuracy and complexity of the hyperbolic
    tangent function. In one case, the proposed architecture approximates the
    hyperbolic tangent activation function with a maximum error of $10^{-5}$ while
    requiring only 1.52 Kbits of memory and 57 LUTs of a Virtex-7 FPGA. We also
    discuss how the activation
    function accuracy affects the performance of DNNs in terms of their training
    and testing accuracies. We show that a high accuracy approximation can be
    necessary in order to maintain the same DNN training and testing performances
    realized by the exact function.
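
    A simpler cousin of the DCTIF scheme — stored tanh samples plus
    piecewise-linear interpolation, with odd symmetry and saturation — already
    shows how table size trades off against maximum error (a sketch only; the
    paper's interpolation filter and error figures differ):

```python
import math

def make_tanh_table(x_max=4.0, n=64):
    # Uniformly sampled tanh on [0, x_max]; odd symmetry handles x < 0.
    step = x_max / (n - 1)
    return [math.tanh(i * step) for i in range(n)], step

def tanh_approx(x, table, step, x_max=4.0):
    """Piecewise-linear interpolation between stored samples, saturating
    beyond x_max (where tanh is nearly flat anyway)."""
    s = -1.0 if x < 0 else 1.0
    x = min(abs(x), x_max)
    i = min(int(x / step), len(table) - 2)
    frac = x / step - i
    return s * (table[i] * (1 - frac) + table[i + 1] * frac)

table, step = make_tanh_table()
# Sweep [-5, 5] and record the worst-case approximation error.
max_err = max(abs(tanh_approx(x / 100.0, table, step) - math.tanh(x / 100.0))
              for x in range(-500, 501))
```

    With 64 stored samples the worst-case error of this linear scheme is on the
    order of 1e-3; the point of a higher-order interpolation filter like DCTIF
    is to reach far smaller errors from a comparably small table.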

    The RNN-ELM Classifier

    Athanasios Vlontzos
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    In this paper we examine learning methods combining the Random Neural
    Network, a biologically inspired neural network and the Extreme Learning
    Machine that achieve state of the art classification performance while
    requiring much shorter training time. The Random Neural Network is an
    integrate-and-fire computational model of a neural network whose mathematical
    structure
    permits the efficient analysis of large ensembles of neurons. An activation
    function is derived from the RNN and used in an Extreme Learning Machine. We
    compare the performance of this combination against the ELM with various
    activation functions, reduce the input dimensionality via PCA, and compare its
    performance against autoencoder-based versions of the RNN-ELM.

    Learning by Stimulation Avoidance: A Principle to Control Spiking Neural Networks Dynamics

    Lana Sinapayen, Atsushi Masumori, Takashi Ikegami
    Comments: 17 pages, 11 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Learning based on networks of real neurons, and by extension biologically
    inspired models of neural networks, has yet to find general learning rules
    leading to widespread applications. In this paper, we argue for the existence
    of a principle allowing to steer the dynamics of a biologically inspired neural
    network. Using carefully timed external stimulation, the network can be driven
    towards a desired dynamical state. We term this principle “Learning by
    Stimulation Avoidance” (LSA). We demonstrate through simulation that the
    minimal sufficient conditions leading to LSA in artificial networks are also
    sufficient to reproduce learning results similar to those obtained in
    biological neurons by Shahaf and Marom [1]. We examine the mechanism’s basic
    dynamics in a reduced network, and demonstrate how it scales up to a network of
    100 neurons. We show that LSA has a higher explanatory power than existing
    hypotheses about the response of biological neural networks to external
    stimulation, and can be used as a learning rule for an embodied application:
    learning of wall avoidance by a simulated robot. The surge in popularity of
    artificial neural networks is mostly directed to disembodied models of neurons
    with biologically irrelevant dynamics: to the authors’ knowledge, this is the
    first work demonstrating sensory-motor learning with random spiking networks
    through pure Hebbian learning.

    Dynamic Pricing in High-dimensions

    Adel Javanmard, Hamid Nazerzadeh
    Comments: 32 pages
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We study the pricing problem faced by a firm that sells a large number of
    products, described via a wide range of features, to customers that arrive over
    time. This is motivated in part by the prevalence of online marketplaces that
    allow for real-time pricing. We propose a dynamic policy, called Regularized
    Maximum Likelihood Pricing (RMLP), that obtains asymptotically optimal revenue.
    Our policy leverages the structure (sparsity) of a high-dimensional demand
    space in order to obtain a logarithmic regret compared to the clairvoyant
    policy that knows the parameters of the demand in advance. More specifically,
    the regret of our algorithm is of $O(s_0 \log T (\log d + \log T))$, where $d$
    and $s_0$ correspond to the dimension of the demand space and its sparsity.
    Furthermore, we show that no policy can obtain regret better than $O(s_0 (\log
    d + \log T))$.

    Informative Planning and Online Learning with Sparse Gaussian Processes

    Kai-Chieh Ma, Lantao Liu, Gaurav S. Sukhatme
    Subjects: Robotics (cs.RO); Learning (cs.LG); Machine Learning (stat.ML)

    A big challenge in environmental monitoring is the spatiotemporal variation
    of the phenomena to be observed. To enable persistent sensing and estimation in
    such a setting, it is beneficial to have a time-varying underlying
    environmental model. Here we present a planning and learning method that
    enables an autonomous marine vehicle to perform persistent ocean monitoring
    tasks by learning and refining an environmental model. To alleviate the
    computational bottleneck caused by large-scale data accumulated, we propose a
    framework that iterates between a planning component aimed at collecting the
    most information-rich data, and a sparse Gaussian Process learning component
    where the environmental model and hyperparameters are learned online by taking
    advantage of only a subset of the data that provides the greatest contribution.
    Our simulations with ground-truth ocean data show that the proposed method is
    both accurate and efficient.

    A Tutorial on Distributed (Non-Bayesian) Learning: Problem, Algorithms and Results

    Angelia Nedić, Alex Olshevsky, César A. Uribe
    Comments: Tutorial Presented in CDC2016
    Subjects: Optimization and Control (math.OC); Learning (cs.LG); Multiagent Systems (cs.MA); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

    We overview some results on distributed learning with focus on a family of
    recently proposed algorithms known as non-Bayesian social learning. We consider
    different approaches to the distributed learning problem and its algorithmic
    solutions for the case of finitely many hypotheses. The original centralized
    problem is discussed at first, and then followed by a generalization to the
    distributed setting. The results on convergence and convergence rate are
    presented for both asymptotic and finite time regimes. Various extensions are
    discussed such as those dealing with directed time-varying networks, Nesterov’s
    acceleration technique, and continuum sets of hypotheses.
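
    One canonical non-Bayesian social learning update covered in this setting
    is log-linear: each agent geometrically averages its neighbors' beliefs
    (using the rows of a stochastic weight matrix) and reweights by its private
    likelihood. A small sketch, with a made-up network and likelihoods:

```python
import math

def social_learning_step(beliefs, weights, likelihoods):
    """One round of log-linear non-Bayesian social learning.
    beliefs[i][k]: agent i's belief in hypothesis k;
    weights[i][j]: influence of agent j on agent i (stochastic matrix rows);
    likelihoods[i][k]: likelihood of agent i's latest private signal."""
    n, m = len(beliefs), len(beliefs[0])
    new = []
    for i in range(n):
        logs = [sum(weights[i][j] * math.log(beliefs[j][k]) for j in range(n))
                + math.log(likelihoods[i][k]) for k in range(m)]
        z = max(logs)                       # normalize in log-space
        unnorm = [math.exp(l - z) for l in logs]
        s = sum(unnorm)
        new.append([u / s for u in unnorm])
    return new

# Two agents, two hypotheses; both signals favor hypothesis 0 on average.
beliefs = [[0.5, 0.5], [0.5, 0.5]]
W = [[0.5, 0.5], [0.5, 0.5]]
L = [[0.8, 0.2], [0.6, 0.4]]
for _ in range(10):
    beliefs = social_learning_step(beliefs, W, L)
```

    After a few rounds both agents concentrate on the hypothesis favored by the
    pooled evidence, which is the qualitative convergence behavior the tutorial
    analyzes.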

    A Rotation Invariant Latent Factor Model for Moveme Discovery from Static Poses

    Matteo Ruggero Ronchi, Joon Sik Kim, Yisong Yue
    Comments: Long version of the paper accepted at the IEEE ICDM 2016 conference. 10 pages, 9 figures, 1 table. Project page: this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We tackle the problem of learning a rotation invariant latent factor model
    when the training data is comprised of lower-dimensional projections of the
    original feature space. The main goal is the discovery of a set of 3-D bases
    poses that can characterize the manifold of primitive human motions, or
    movemes, from a training set of 2-D projected poses obtained from still images
    taken at various camera angles. The proposed technique for basis discovery is
    data-driven rather than hand-designed. The learned representation is rotation
    invariant, and can reconstruct any training instance from multiple viewing
    angles. We apply our method to modeling human poses in sports (via the Leeds
    Sports Dataset), and demonstrate the effectiveness of the learned bases in a
    range of applications such as activity classification, inference of dynamics
    from a single frame, and synthetic representation of movements.


    Information Theory

    The Capacity of Private Information Retrieval from Coded Databases

    Karim Banawan, Sennur Ulukus
    Comments: Submitted to IEEE Transactions on Information Theory, September 2016
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)

    We consider the problem of private information retrieval (PIR) over a
    distributed storage system. The storage system consists of $N$ non-colluding
    databases, each storing a coded version of $M$ messages. In the PIR problem,
    the user wishes to retrieve one of the available messages without revealing the
    message identity to any individual database. We derive the
    information-theoretic capacity of this problem, which is defined as the maximum
    number of bits of the desired message that can be privately retrieved per one
    bit of downloaded information. We show that the PIR capacity in this case is
    $C=\left(1+\frac{K}{N}+\frac{K^2}{N^2}+\cdots+\frac{K^{M-1}}{N^{M-1}}\right)^{-1}=(1+R_c+R_c^2+\cdots+R_c^{M-1})^{-1}=\frac{1-R_c}{1-R_c^M}$,
    where $R_c$ is the rate of the $(N,K)$ code used. The capacity is a function of
    the code rate and the number of messages only, regardless of the explicit
    structure of the storage code. The result implies a fundamental tradeoff
    between the optimal retrieval cost and the storage cost. The result generalizes
    the achievability and converse results for the classical PIR with replicating
    databases to the case of coded databases.
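
    The capacity expression is a finite geometric series in the code rate,
    which is what yields the closed form. A quick numeric sanity check:

```python
def pir_capacity(N, K, M):
    """PIR capacity from the abstract, as a geometric series:
    C = (1 + R + R^2 + ... + R^{M-1})^{-1} with code rate R = K/N."""
    R = K / N
    return 1.0 / sum(R ** m for m in range(M))

def pir_capacity_closed(N, K, M):
    # Equivalent closed form C = (1 - R) / (1 - R^M), valid for R != 1.
    R = K / N
    return (1.0 - R) / (1.0 - R ** M)

# N=4 databases, (4,2) code, M=3 messages: C = (1 + 1/2 + 1/4)^{-1} = 4/7.
c1 = pir_capacity(4, 2, 3)
c2 = pir_capacity_closed(4, 2, 3)
```

    As the abstract notes, only the rate $R_c = K/N$ and the number of messages
    $M$ enter the formula, not the structure of the storage code.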

    Nearest-Neighbor and Contact Distance Distributions for Thomas Cluster Process

    Mehrnaz Afshang, Chiranjib Saha, Harpreet S. Dhillon
    Subjects: Information Theory (cs.IT)

    We characterize the statistics of the nearest-neighbor and contact distance
    distributions for the Thomas cluster process (TCP), a special case of the
    Poisson cluster process. In particular, we derive the cumulative distribution
    function (CDF) of the distance to the nearest point of TCP from a reference
    point for three different cases: (i) reference point is not a part of the point
    process, (ii) it is chosen uniformly at random from the TCP, and (iii) it is a
    randomly chosen point from a cluster chosen uniformly at random from the TCP.
    While the first corresponds to the contact distance distribution, the other two
    provide two different viewpoints for the nearest-neighbor distance
    distribution.
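
    The contact distance (case (i) above) is easy to probe by simulation:
    sample Poisson parents, scatter Gaussian daughters around them, and measure
    the distance from a reference origin to the nearest point. A stdlib sketch
    with illustrative parameters:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's inversion sampler; adequate for the moderate means used here.
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_tcp(rng, lam_parent=0.1, mean_daughters=5, sigma=0.3, half=5.0):
    """One realization of a Thomas cluster process on [-half, half]^2:
    Poisson parents, each with a Poisson number of Gaussian-scattered
    daughters; the daughters are the points of the process."""
    pts = []
    area = (2 * half) ** 2
    for _ in range(poisson(lam_parent * area, rng)):
        px, py = rng.uniform(-half, half), rng.uniform(-half, half)
        for _ in range(poisson(mean_daughters, rng)):
            pts.append((px + rng.gauss(0, sigma), py + rng.gauss(0, sigma)))
    return pts

def contact_distance(rng):
    # Distance from the origin (a reference point NOT in the process) to the
    # nearest point of the realization -- the contact distance of case (i).
    pts = sample_tcp(rng)
    return min(math.hypot(x, y) for x, y in pts) if pts else float("inf")

rng = random.Random(42)
samples = sorted(contact_distance(rng) for _ in range(500))
median_cd = samples[250]  # empirical median of the contact distance
```

    Sorting the samples gives an empirical CDF that can be compared against the
    closed-form CDFs derived in the paper.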

    Power Talk for Multibus DC MicroGrids: Creating and Optimizing Communication Channels

    Marko Angjelichinoski, Cedomir Stefanovic, Petar Popovski
    Comments: Shorter version of a paper accepted for presentation at GLOBECOM 2016
    Subjects: Information Theory (cs.IT)

    We study a communication framework for nonlinear multibus DC MicroGrids based
    on a deliberate modification of the parameters of the primary control and
    termed power talk. We assess the case in which the information is modulated in
    the deviations of reference voltages of the primary control loops and show that
    the outputs of the power talk communication channels can be approximated
    through linear combinations of the respective inputs. We show that the
    coefficients of the linear combinations, representing equivalent channel gains,
    depend on the virtual resistances of the primary control loops, implying that
    they can be modified such that effective received signal-to-noise ratio (SNR)
    is increased. On the other hand, we investigate the constraints that power talk
    incurs on the supplied power deviations. We show that these constraints
    translate into constraints on the reference voltages and virtual resistances
    that are imposed on all units in the system. In this regard, we develop an
    optimization approach to find the set of controllable virtual resistances that
    maximize SNR under the constraints on the supplied power deviations.

    Metrics based on Finite Directed Graphs

    Tuvi Etzion, Marcelo Firer, Roberto Assis Machado
    Subjects: Information Theory (cs.IT); Discrete Mathematics (cs.DM)

    Given a finite directed graph with $n$ vertices, we define a metric $d_G$
    over $\mathbb{F}_q^n$, where the weight of a word is the number of vertices
    that can be reached by a directed path starting at the support of the vector.
    Two canonical forms, which do not affect the metric, are given to each graph.
    Based on these forms we characterize each such metric. We further use these
    forms to prove that two graphs with different canonical forms yield different
    metrics. Efficient algorithms to check if a set of metric weights define a
    metric based on a graph are given. We provide tight bounds on the number of
    metric weights required to reconstruct the metric. Furthermore, we give a
    complete description of the group of linear isometries of the graph metrics and
    a characterization of the graphs for which every linear code admits a
    $G$-canonical decomposition. Considering those graphs, we are able to derive an
    expression of the packing radius of linear codes in such metric spaces.
    Finally, given a directed graph which determines a hierarchical poset, we
    present sufficient and necessary conditions to ensure the validity of the
    MacWilliams Identity and the MacWilliams Extension Property.
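    The weight definition above is easy to prototype: the weight of a word is
    the size of the reachability closure of its support. A minimal sketch in
    Python, assuming the convention that a support vertex reaches itself via
    the empty path:

    ```python
    from collections import deque

    def graph_weight(support, adj):
        """Weight of a word: number of vertices reachable by a directed
        path starting at the word's support (the support itself counts,
        via the empty path)."""
        seen = set(support)
        queue = deque(support)
        while queue:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return len(seen)

    def graph_distance(x, y, adj):
        """d_G(x, y) = weight of x - y; the support of the difference is
        simply the set of coordinates where the two words differ."""
        support = [i for i, (a, b) in enumerate(zip(x, y)) if a != b]
        return graph_weight(support, adj)

    # Example: a 3-vertex directed path 0 -> 1 -> 2.
    adj = {0: [1], 1: [2], 2: []}
    print(graph_weight([0], adj))                      # 3: vertex 0 reaches {0, 1, 2}
    print(graph_distance([1, 0, 0], [1, 0, 1], adj))   # 1: support {2} reaches only itself
    ```

    The same code works over any $\mathbb{F}_q$, since only the coordinates
    where the words differ matter.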

    Antenna Selection for MIMO-NOMA Networks

    Yuehua Yu, He Chen, Yonghui Li, Zhiguo Ding, Branka Vucetic
    Comments: Submitted for possible conference publication
    Subjects: Information Theory (cs.IT)

    This paper considers the joint antenna selection (AS) problem for a classical
    two-user non-orthogonal multiple access (NOMA) network where both the base
    station and users are equipped with multiple antennas. Since the
    exhaustive-search-based optimal AS scheme is computationally prohibitive when
    the number of antennas is large, two computationally efficient joint AS
    algorithms, namely max-min-max AS (AIA-AS) and max-max-max AS (A$^3$-AS), are
    proposed to maximize the system sum-rate. Asymptotic closed-form
    expressions for the average sum-rates of AIA-AS and A$^3$-AS are derived
    in the high signal-to-noise ratio (SNR) regime. Numerical results
    demonstrate that both AIA-AS and A$^3$-AS can yield significant performance
    gains over comparable schemes. Furthermore, AIA-AS can provide better user
    fairness, while the A$^3$-AS scheme can achieve the near-optimal sum-rate
    performance.

    Function Computation through a Bidirectional Relay

    Jithin Ravi, Bikash Kumar Dey
    Comments: 32 pages, 6 figures
    Subjects: Information Theory (cs.IT)

    We consider a function computation problem in a three node wireless network.
    Nodes A and B observe two correlated sources $X$ and $Y$ respectively, and want
    to compute a function $f(X,Y)$. To achieve this, nodes A and B send messages to
    a relay node C at rates $R_A$ and $R_B$ respectively. The relay C then
    broadcasts a message to A and B at rate $R_C$. We allow block coding, and study
    the achievable region of rate triples under both zero-error and
    $\epsilon$-error. As a preparation, we first consider a broadcast network from
    the relay to A and B. A and B have side information $X$ and $Y$ respectively.
    The relay node C observes both $X$ and $Y$ and broadcasts an encoded message to
    A and B. We want to obtain the optimal broadcast rate such that A and B can
    recover the function $f(X,Y)$ from the received message and their individual
    side information $X$ and $Y$ respectively. For the special case of $f(X,Y) = X
    \oplus Y$, we show equivalence between $\epsilon$-error and zero-error
    computations; this gives a rate characterization for zero-error XOR
    computation in the broadcast network. As a corollary, this also gives a rate
    characterization for zero-error in the relay network when the support set of
    $p_{XY}$ is full. For the relay network, the zero-error rate region for
    arbitrary functions is characterized in terms of graph coloring of some
    suitably defined probabilistic graphs. We then give a single-letter inner bound
    to this rate region. Further, we extend the graph theoretic ideas to address
    the $\epsilon$-error problem and obtain a single-letter inner bound.

    Performance Comparison of Short-Length Error-Correcting Codes

    J. Van Wonterghem, A. Alloum, J. J. Boutros, M. Moeneclaey
    Comments: 6 pages, 5 figures, paper submitted to the IEEE SCVT 2016 conference
    Subjects: Information Theory (cs.IT)

    We compare the performance of short-length linear binary codes on the binary
    erasure channel and the binary-input Gaussian channel. We use a universal
    decoder that can decode any linear binary block code: Gaussian-elimination
    based Maximum-Likelihood decoder on the erasure channel and probabilistic
    Ordered Statistics Decoder on the Gaussian channel. As such we compare codes
    and not decoders. The word error rate versus the channel parameter is found for
    LDPC, Reed-Muller, Polar, and BCH codes at length 256 bits. BCH codes
    outperform other codes in absence of cyclic redundancy check. Under joint
    decoding, the concatenation of a cyclic redundancy check makes all codes
    perform very close to optimal lower bounds.
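    The erasure-channel side of this comparison is simple to sketch: on the
    BEC, maximum-likelihood decoding of any linear code reduces to solving the
    parity-check equations for the erased positions, which is exactly Gaussian
    elimination over GF(2). A minimal illustration (the parity-check matrix
    and received word below are toy examples, not the length-256 codes from
    the paper):

    ```python
    def ml_erasure_decode(H, y):
        """ML decoding of a binary linear code on the erasure channel:
        solve the parity checks for the erased positions by Gaussian
        elimination over GF(2).  y entries are 0, 1, or None (erasure)."""
        n = len(y)
        erased = [i for i in range(n) if y[i] is None]
        # Augmented system A * x_E = s over GF(2), one row per parity check.
        A = []
        for row in H:
            s = 0
            for i in range(n):
                if y[i] is not None:
                    s ^= row[i] & y[i]
            A.append([row[i] for i in erased] + [s])
        m = len(erased)
        r = 0
        for c in range(m):                      # eliminate column by column
            pivot = next((i for i in range(r, len(A)) if A[i][c]), None)
            if pivot is None:
                return None                     # erasure pattern not uniquely decodable
            A[r], A[pivot] = A[pivot], A[r]
            for i in range(len(A)):
                if i != r and A[i][c]:
                    A[i] = [a ^ b for a, b in zip(A[i], A[r])]
            r += 1
        x = list(y)
        for c in range(m):                      # row c of the reduced system solves x_E[c]
            x[erased[c]] = A[c][-1]
        return x

    # Length-3 repetition code: first bit received, two erasures recovered.
    H = [[1, 1, 0], [1, 0, 1]]
    print(ml_erasure_decode(H, [1, None, None]))  # [1, 1, 1]
    ```

    When the sub-matrix on the erased columns is rank deficient, the decoder
    reports a failure instead of a codeword, matching the ML behaviour on the
    erasure channel.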

    Role of Interference Alignment in Wireless Cellular Network Optimization

    Gokul Sridharan, Siyu Liu, Wei Yu
    Comments: 15 pages, accepted for publication
    Subjects: Information Theory (cs.IT)

    The emergence of interference alignment (IA) as a degrees-of-freedom optimal
    strategy motivates the need to investigate whether IA can be leveraged to aid
    conventional network optimization algorithms that are only capable of finding
    locally optimal solutions. To test the usefulness of IA in this context, this
    paper proposes a two-stage optimization framework for the downlink of a
    $G$-cell multi-antenna network with $K$ users/cell. The first stage of the
    proposed framework focuses on nulling interference from a set of dominant
    interferers using IA, while the second stage optimizes transmit and receive
    beamformers to maximize a network-wide utility using the IA solution as the
    initial condition. Further, this paper establishes a set of new feasibility
    results for partial IA that can be used to guide the number of dominant
    interferers to be nulled in the first stage. Through simulations on specific
    topologies of a cluster of base-stations, it is observed that the impact of IA
    depends on the choice of the utility function and the presence of
    out-of-cluster interference. In the absence of out-of-cluster interference, the
    proposed framework outperforms straightforward optimization when maximizing the
    minimum rate, while providing marginal gains when maximizing sum-rate. However,
    the benefit of IA is greatly diminished in the presence of significant
    out-of-cluster interference.

    Uplink Performance Analysis of Dense Cellular Networks with LoS and NLoS Transmissions

    Tian Ding, Ming Ding, Guoqiang Mao, Zihuai Lin, David Lopez-Perez, Albert Zomaya
    Subjects: Information Theory (cs.IT)

    In this paper, we analyse the coverage probability and the area spectral
    efficiency (ASE) for the uplink (UL) of dense small cell networks (SCNs)
    considering a practical path loss model incorporating both line-of-sight (LoS)
    and non-line-of-sight (NLoS) transmissions. Compared with the existing work, we
    adopt the following novel approaches in our study: (i) we assume a practical
    user association strategy (UAS) based on the smallest path loss, or
    equivalently the strongest received signal strength; (ii) we model the
    positions of both base stations (BSs) and the user equipments (UEs) as two
    independent Homogeneous Poisson point processes (HPPPs); and (iii) the
    correlation of BSs’ and UEs’ positions is considered, thus making our
    analytical results more accurate. The performance impact of LoS and NLoS
    transmissions on the ASE for the UL of dense SCNs is shown to be significant,
    both quantitatively and qualitatively, compared with existing work that does
    not differentiate LoS and NLoS transmissions. In particular, existing work
    predicted that a larger UL power compensation factor would always result in a
    better ASE in the practical range of BS density, i.e., 10^1-10^3 BSs/km^2.
    However, our results show that a smaller UL power compensation factor can
    greatly boost the ASE in the UL of dense SCNs, i.e., 10^2-10^3 BSs/km^2, while
    a larger UL power compensation factor is more suitable for sparse SCNs, i.e.,
    10^1-10^2 BSs/km^2.

    Friendship-based Cooperative Jamming for Secure Communication in Poisson Networks

    Yuanyu Zhang, Yulong Shen, Hua Wang, Xiaohong Jiang
    Subjects: Information Theory (cs.IT)

    Wireless networks with the consideration of social relationships among
    network nodes are highly appealing for lots of important data communication
    services. Ensuring the security of such networks is of great importance to
    facilitate their applications in supporting future social-based services with
    strong security guarantee. This paper explores the physical layer
    security-based secure communication in a finite Poisson network with social
    friendships among nodes, for which a social friendship-based cooperative
    jamming scheme is proposed. The jamming scheme consists of a Local Friendship
    Circle (LFC) and a Long-range Friendship Annulus (LFA), where all legitimate
    nodes in the LFC serve as jammers, but the legitimate nodes in the LFA are
    selected as jammers through three location-based policies. To understand both
    the security and reliability performance of the proposed jamming scheme, we
    first model the sum interference at any location in the network by deriving its
    Laplace transform under two typical path loss scenarios. With the help of the
    interference Laplace transform results, we then derive the exact expression for
    the transmission outage probability (TOP) and determine both the upper and
    lower bounds on the secrecy outage probability (SOP), such that the overall
    outage performances of the proposed jamming scheme can be depicted. Finally, we
    present extensive numerical results to validate the theoretical analysis of TOP
    and SOP and also to illustrate the impacts of the friendship-based cooperative
    jamming on the network performances.

    Semantic Information Measure with Two Types of Probability for Falsification and Confirmation

    Cheguang Lu
    Comments: 27 pages, 8 figures, 10 tables, a list of symbols
    Subjects: Information Theory (cs.IT); Logic (math.LO); Probability (math.PR); Statistics Theory (math.ST)

    Logical Probability (LP) is strictly distinguished from Statistical
    Probability (SP). To measure semantic information or confirm hypotheses, we
    need to use sampling distribution (conditional SP function) to test or confirm
    fuzzy truth function (conditional LP function). The Semantic Information
    Measure (SIM) proposed is compatible with Shannon’s information theory and
    Fisher’s likelihood method. It ensures that the smaller the LP of a
    predicate and the larger the truth value of the proposition, the more
    information there is, so the SIM can serve as Popper’s information
    criterion for falsification or testing. The SIM also allows us to optimize
    the truth value of counterexamples, or degrees of disbelief in a
    hypothesis, to obtain the optimized degree of belief, i.e., the Degree of
    Confirmation (DOC). To explain confirmation,
    this paper 1) provides the calculation method of the DOC of universal
    hypotheses; 2) discusses how to resolve Raven Paradox with new DOC and its
    increment; 3) derives the DOC of rapid HIV tests: DOC of
    test-positive=1-(1-specificity)/sensitivity, which is similar to Likelihood
    Ratio (=sensitivity/(1-specificity)) but has the upper limit 1; 4) discusses
    negative DOC for excessive affirmations, wrong hypotheses, or lies; and 5)
    discusses the DOC of general hypotheses with GPS as example.
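    The test-positive formulas quoted above are straightforward to check
    numerically; the sensitivity and specificity values below are illustrative
    placeholders, not figures from the paper:

    ```python
    def likelihood_ratio(sensitivity, specificity):
        """Classical positive likelihood ratio: unbounded as specificity -> 1."""
        return sensitivity / (1.0 - specificity)

    def doc_test_positive(sensitivity, specificity):
        """Degree of Confirmation of a positive test, as quoted in the
        abstract: 1 - (1 - specificity)/sensitivity, upper-bounded by 1."""
        return 1.0 - (1.0 - specificity) / sensitivity

    # Illustrative rapid-test figures (placeholders, not from the paper).
    sens, spec = 0.997, 0.985
    print(round(likelihood_ratio(sens, spec), 2))   # 66.47
    print(round(doc_test_positive(sens, spec), 4))  # 0.985
    ```

    Unlike the likelihood ratio, the DOC reaches its ceiling of exactly 1 when
    the specificity equals 1, which is the upper-limit property the abstract
    emphasizes.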

    The Exact Rate-Memory Tradeoff for Caching with Uncoded Prefetching

    Qian Yu, Mohammad Ali Maddah-Ali, A. Salman Avestimehr
    Subjects: Information Theory (cs.IT)

    We consider a basic cache network, in which a single server is connected to
    multiple users via a shared bottleneck link. The server has a database of a set
    of files (content). Each user has an isolated memory that can be used to cache
    content in a prefetching phase. In a following delivery phase, each user
    requests a file from the database and the server needs to deliver users’
    demands as efficiently as possible by taking into account their cache contents.
    We focus on an important and commonly used class of prefetching schemes, where
    the caches are filled with uncoded data. We provide the exact characterization
    of the rate-memory tradeoff for this problem, by deriving both the minimum
    average rate (for a uniform file popularity) and the minimum peak rate required
    on the bottleneck link for a given cache size available at each user. In
    particular, we propose a novel caching scheme, which strictly improves the
    state of the art by exploiting commonality among users’ demands. We then
    demonstrate the exact optimality of our proposed scheme through a matching
    converse, by dividing the set of all demands into types, and showing that the
    placement phase in the proposed caching scheme is universally optimal for all
    types. Using these techniques, we also fully characterize the rate-memory
    tradeoff for a decentralized setting, in which users fill out their cache
    content without any coordination.
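    For the centralized setting, the peak-rate side of this tradeoff is
    commonly reported in closed form at the integer cache points M = tN/K.
    The sketch below assumes the binomial-coefficient expression associated
    with this line of work and should be read as an illustration, not a
    restatement of the paper's results:

    ```python
    from math import comb

    def peak_rate_uncoded(K, N, t):
        """Peak delivery rate with uncoded prefetching at cache size
        M = t*N/K (integer t in 0..K).  The second binomial term removes
        multicast messages that become redundant when there are more
        users than files (K > N)."""
        Ne = min(N, K)
        return (comb(K, t + 1) - comb(K - Ne, t + 1)) / comb(K, t)

    # With at least as many files as users this reduces to the classic
    # (K - t)/(t + 1) uncoded-prefetching rate.
    K, N = 4, 4
    for t in range(K + 1):
        assert peak_rate_uncoded(K, N, t) == (K - t) / (t + 1)

    print(peak_rate_uncoded(4, 2, 1))  # 1.25: fewer files than users lowers the rate
    ```

    The correction term is exactly the "commonality among users' demands"
    effect: with fewer files than users, some demands coincide and the
    corresponding multicast messages can be dropped.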

    On Analytical and Geometric Lattice Design Criteria for Wiretap Coset Codes

    Alex Karrila, David Karpuk, Camilla Hollanti
    Comments: 25 pages, 6 figures
    Subjects: Information Theory (cs.IT)

    This paper considers physical layer security and the design of secure coset
    codes for wiretap channels, where information is not to be leaked to an
    eavesdropper having a degraded channel. The eavesdropper’s bounds for correct
    decoding probability and information are first revisited, and a new variant of
    the information bound is derived. The new bound is valid for a general channel
    with any fading and Gaussian noise. From these bounds, it is explicit that both
    the information and probability are upper bounded by the average flatness
    factor, i.e., the expected theta function of the faded lattice related to the
    eavesdropper. Taking the minimization of the average flatness factor as a
    design criterion, simple geometric heuristics to minimize it in the Gaussian
    and Rayleigh fast fading channels are motivated. It is concluded that in the
    Gaussian channel, the security boils down to the sphere packing density of the
    eavesdropper’s lattice, whereas in the Rayleigh fading channel a full-diversity
    well-rounded lattice with a dense sphere packing will provide the best secrecy.
    The proposed criteria are backed up by extensive numerical experiments.

    Simultaneous Spectrum Sensing and Data Reception for Cognitive Spatial Multiplexing Distributed Systems

    Nikolaos I. Miridakis, Theodoros A. Tsiftsis, George C. Alexandropoulos, Merouane Debbah
    Subjects: Information Theory (cs.IT)

    A multi-user cognitive (secondary) radio system is considered, where the
    spatial multiplexing mode of operation is implemented amongst the nodes, under
    the presence of multiple primary transmissions. The secondary receiver carries
    out minimum mean-squared error (MMSE) detection to effectively decode the
    secondary data streams, while it performs spectrum sensing at the remaining
    signal to capture the presence of primary activity or not. New analytical
    closed-form expressions regarding some important system measures are obtained,
    namely, the outage and detection probabilities; the transmission power of the
    secondary nodes; the probability of unexpected interference at the primary
    nodes; and the detection efficiency with the aid of the area under the
    receiver operating characteristic (ROC) curve. The realistic scenarios of
    time-varying channel fading and channel estimation errors are accounted for
    in the derived results. Finally, the presented numerical results verify the
    accuracy of the proposed framework, while some useful engineering insights are
    also revealed, such as the key role of the detection accuracy to the overall
    performance and the impact of transmission power from the secondary nodes to
    the primary system.

    Performance Impact of Idle Mode Capability on Dense Small Cell Networks with LoS and NLoS Transmissions

    Ming Ding, David Lopez-Perez, Guoqiang Mao, Zihuai Lin
    Comments: submitted to IEEE TWC
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    In dense small cell networks (SCNs), a large number of base stations (BSs)
    can be put to idle modes without signal transmission, if there is no active
    user equipment (UE) within their coverage areas. Setting those BSs to idle
    modes can mitigate unnecessary inter-cell interference and reduce energy
    consumption. Such idle mode feature at BSs is referred to as the idle mode
    capability (IMC) and it can largely improve the performance of the
    5th-generation (5G) networks. In this paper, we study the performance impact of
    the BS IMC on dense SCNs. Different from existing work, we consider a
    sophisticated and more realistic path loss model incorporating both
    line-of-sight (LoS) and non-line-of-sight (NLoS) transmissions. Analytical
    results are obtained for the coverage probability, the area spectral efficiency
    and the energy efficiency performance for SCNs with the BS IMC. An upper bound,
    a lower bound and an approximate expression of the density of the non-idle BSs
    are also derived. The performance impact of the IMC on network densification is
    shown to be significant. As the BS density surpasses the UE density, thus
    creating a surplus of BSs, the coverage probability will continuously increase
    toward one, which addresses the critical issue of coverage probability decrease
    caused by the LoS/NLoS transmissions. The results derived from our analysis
    shed valuable new light on the deployment and the operation of future dense
    SCNs in 5G.

    Well-Rounded Lattices for Coset Coding in MIMO Wiretap Channels

    Oliver W. Gnilke, Amaro Barreal, Alex Karrila, Ha Thanh Nguyen Tran, David A. Karpuk, Camilla Hollanti
    Subjects: Information Theory (cs.IT)

    The concept of well-rounded lattices has recently found important
    applications in the setting of a fading single-input single-output (SISO)
    wiretap channel. It has been shown that, under this setup, the property of
    being well-rounded is critical for minimizing the eavesdropper’s probability of
    correct decoding in lower SNR regimes. The superior performance of coset codes
    constructed from well-rounded lattices has been illustrated in several
    simulations.

    In the present article, this work is extended to fading multiple-input
    multiple-output (MIMO) wiretap channels, and similar design criteria as in the
    SISO case are derived. Further, explicit coset codes for Rayleigh fading MIMO
    wiretap channels are designed. In particular, it is shown through extensive
    simulations that sublattices of the well-known Alamouti code and Golden code
    which meet our design criteria perform better than scalar multiples of the code
    lattice for the same parameters.

    Opportunistic Network Decoupling With Virtual Full-Duplex Operation in Multi-Source Interfering Relay Networks

    Won-Yong Shin, Vien V. Mai, Bang Chul Jung, Hyun Jong Yang
    Comments: 22 pages, 5 figures, To appear in IEEE Transactions on Mobile Computing
    Subjects: Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)

    We introduce a new achievability scheme, termed opportunistic network
    decoupling (OND), operating in virtual full-duplex mode. In the scheme, a novel
    relay scheduling strategy is utilized in the $K \times N \times K$ channel with
    interfering relays, consisting of $K$ source–destination pairs and $N$
    half-duplex relays in-between them. A subset of relays using alternate relaying
    is opportunistically selected in terms of producing the minimum total
    interference level, thereby resulting in network decoupling. As our main
    result, it is shown that under a certain relay scaling condition, the OND
    protocol achieves $K$ degrees of freedom even in the presence of interfering
    links among relays. Numerical evaluation also validates the performance of
    the proposed OND. The protocol operates in a fully distributed fashion
    using only local channel state information, making it relatively easy to
    implement.

    Compressed Hypothesis Testing: To Mix or Not to Mix?

    Myung Cho, Weiyu Xu, Lifeng Lai
    Comments: compressed sensing, hypothesis testing, Chernoff information, anomaly detection, anomalous random variable, quickest detection. arXiv admin note: substantial text overlap with arXiv:1208.2311
    Subjects: Information Theory (cs.IT)

    In this paper, we study the problem of determining $k$ anomalous random
    variables that have different probability distributions from the rest $(n-k)$
    random variables. Instead of sampling each individual random variable
    separately as in the conventional hypothesis testing, we propose to perform
    hypothesis testing using mixed observations that are functions of multiple
    random variables. We characterize the error exponents for correctly identifying
    the $k$ anomalous random variables under fixed time-invariant mixed
    observations, random time-varying mixed observations, and deterministic
    time-varying mixed observations. Our error exponent characterization is through
    the newly introduced notions of inner conditional Chernoff information and
    outer conditional Chernoff information. It is demonstrated that mixed
    observations can strictly improve the error exponents of hypothesis testing,
    over separate observations of individual random variables. We further
    characterize the optimal sensing vector maximizing the error exponents, which
    leads to explicit constructions of the optimal mixed observations in special
    cases of hypothesis testing for Gaussian random variables. These results show
    that mixed observations of random variables can reduce the number of required
    samples in hypothesis testing applications.
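    As a concrete anchor for the error-exponent language above: for two
    equal-variance Gaussian hypotheses the Chernoff information has a simple
    closed form, which the sketch below evaluates by the usual maximization
    over the tilting parameter (this is generic hypothesis-testing background,
    not the paper's inner/outer conditional quantities):

    ```python
    import numpy as np

    def chernoff_information_gauss(mu0, mu1, sigma):
        """Chernoff information between N(mu0, sigma^2) and N(mu1, sigma^2).
        For equal variances the Chernoff coefficient integral has a closed
        form, leaving a 1-D maximization over the tilting parameter lambda;
        the optimum sits at lambda = 1/2."""
        lams = np.linspace(1e-3, 1 - 1e-3, 999)
        vals = lams * (1 - lams) * (mu0 - mu1) ** 2 / (2 * sigma ** 2)
        return float(vals.max())

    # Matches the textbook value (mu0 - mu1)^2 / (8 sigma^2).
    print(round(chernoff_information_gauss(0.0, 1.0, 1.0), 3))  # 0.125
    ```

    The error exponent of the best test between the two hypotheses decays at
    this rate, which is the quantity the mixed-observation design tries to
    enlarge.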

    A Compressed Sampling and Dictionary Learning Framework for WDM-Based Distributed Fiber Sensing

    Christian Weiss, Abdelhak M. Zoubir
    Comments: Submitted on July 30. 2016, to [copyright 2016 Optical Society of America.]. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modifications of the content of this paper are prohibited
    Subjects: Methodology (stat.ME); Information Theory (cs.IT)

    We propose a versatile framework that unifies compressed sampling and
    dictionary learning for fiber-optic sensing. It employs a redundant dictionary
    that is generated from a parametric signal model and establishes a relation to
    the physical quantity of interest. Imperfect prior knowledge is considered in
    terms of uncertain local and global parameters. To estimate a sparse
    representation and the dictionary parameters, we present a modified
    alternating-minimization algorithm that is equipped with a pre-processing
    routine to handle strong dictionary coherence. The performance is evaluated by
    simulations and experimental data for a practical system with common core
    architecture.

    Multi-Rate Control over AWGN Channels via Analog Joint Source-Channel Coding

    Anatoly Khina, Gustav M. Pettersson, Victoria Kostina, Babak Hassibi
    Comments: An extended version of a paper to be presented in CDC2016
    Subjects: Systems and Control (cs.SY); Information Theory (cs.IT)

    We consider the problem of controlling an unstable plant over an additive
    white Gaussian noise (AWGN) channel with a transmit power constraint, where the
    signaling rate of communication is larger than the sampling rate (for
    generating observations and applying control inputs) of the underlying plant.
    Such a situation is quite common since sampling is done at a rate that captures
    the dynamics of the plant and which is often much lower than the rate that can
    be communicated. This setting offers the opportunity of improving the system
    performance by employing multiple channel uses to convey a single message
    (output plant observation or control input). Common ways of doing so are
    through either repeating the message, or by quantizing it to a number of bits
    and then transmitting a channel coded version of the bits whose length is
    commensurate with the number of channel uses per sampled message. We argue that
    such “separated source and channel coding” can be suboptimal and propose to
    perform joint source-channel coding. Since the block length is short we obviate
    the need to go to the digital domain altogether and instead consider analog
    joint source-channel coding. For the case where the communication signaling
    rate is twice the sampling rate, we employ the Archimedean bi-spiral-based
    Shannon-Kotel’nikov analog maps to show significant improvement in stability
    margins and linear-quadratic Gaussian (LQG) costs over simple schemes that
    employ repetition.
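    The analog map in question can be sketched quickly: a double Archimedean
    spiral maps one source sample to two channel uses, and decoding picks the
    nearest spiral point. The geometry below is a toy version with made-up
    tuning constants (delta, scale), not the parameters used in the paper:

    ```python
    import numpy as np

    def spiral_encode(x, delta=1.0, scale=6.0):
        """1:2 analog map onto a double Archimedean spiral: the magnitude
        of the source value picks the angle, the sign picks the arm.
        delta and scale are made-up tuning constants for illustration."""
        theta = scale * abs(x)
        r = (delta / np.pi) * theta
        arm = 1.0 if x >= 0 else -1.0
        return np.array([r * np.cos(theta), arm * r * np.sin(theta)])

    def spiral_decode(z, delta=1.0, scale=6.0, grid=None):
        """Approximate minimum-distance decoding: nearest spiral point
        over a fine grid of candidate source values."""
        if grid is None:
            grid = np.linspace(-3, 3, 20001)
        points = np.stack([spiral_encode(x, delta, scale) for x in grid])
        return float(grid[np.argmin(np.sum((points - z) ** 2, axis=1))])

    x = 0.7
    z = spiral_encode(x)          # one sample -> two channel uses
    xhat = spiral_decode(z)       # close to 0.7 in the noiseless case
    print(abs(xhat - x) < 1e-2)   # True
    ```

    Adding AWGN to z before decoding and sweeping the spiral parameters
    reproduces the usual stretch-versus-threshold tradeoff of such maps:
    stretching the spiral reduces small-noise distortion, but makes jumps to
    the wrong arm more likely.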



