
    arXiv Paper Daily: Thu, 8 Dec 2016

    Published by 我爱机器学习 (52ml.net) on 2016-12-08 00:00:00

    Neural and Evolutionary Computing

    Neural Turing Machines: Convergence of Copy Tasks

    Janez Aleš
    Comments: Predictor weights can be provided upon request
    Subjects: Neural and Evolutionary Computing (cs.NE)

    The architecture of neural Turing machines is differentiable end to end and
    is trainable with gradient descent methods. Due to their large unfolded depth,
    neural Turing machines are hard to train, and because they access the complete
    memory linearly, they do not scale. Other architectures have been studied to
    overcome these difficulties. In this report we focus on improving the quality
    of prediction of the original linear memory architecture on copy and repeat
    copy tasks. Copy task predictions on sequences six times longer than those the
    neural Turing machine was trained on prove to be highly accurate, and so do
    predictions of repeat copy tasks for sequences with twice the repetition
    number and twice the sequence length the neural Turing machine was trained on.
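
    For concreteness, here is a small batch generator for the copy task evaluated
    in the report. The layout (random binary sequence, one extra delimiter
    channel, target equal to the input sequence) follows common NTM conventions
    and is an assumption, not necessarily the report's exact encoding.

```python
import numpy as np

def copy_task_batch(batch, seq_len, width, seed=0):
    """Random binary sequences; the target is the input sequence itself,
    to be reproduced after a delimiter symbol is presented."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, size=(batch, seq_len, width)).astype(np.float32)
    # the input gets one extra channel that is set only on the delimiter step
    inp = np.concatenate([bits, np.zeros((batch, seq_len, 1), np.float32)], axis=2)
    delim = np.zeros((batch, 1, width + 1), dtype=np.float32)
    delim[:, 0, -1] = 1.0
    x = np.concatenate([inp, delim], axis=1)  # sequence, then delimiter
    return x, bits                            # target: a copy of the sequence

x, y = copy_task_batch(batch=4, seq_len=6, width=8)
print(x.shape, y.shape)  # (4, 7, 9) (4, 6, 8)
```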

    A simple and efficient SNN and its performance & robustness evaluation method to enable hardware implementation

    Anmol Biswas, Sidharth Prasad, Sandip Lashkare, Udayan Ganguly
    Comments: 9 page conference paper submitted at IJCNN 2017
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Spiking Neural Networks (SNN) are more closely related to brain-like
    computation and inspire hardware implementation. This is enabled by small
    networks that give high performance on standard classification problems. In
    literature, typical SNNs are deep and complex in terms of network structure,
    weight update rules and learning algorithms. This makes it difficult to
    translate them into hardware. In this paper, we first develop a simple
    2-layered network in software which is competitive with the state of the art
    on four different standard data-sets within SNNs and has improved efficiency.
    For example, it uses 3x fewer neurons, 3.5x fewer synapses and 30x fewer
    training epochs for the Fisher Iris classification problem. The efficient
    network is based on effective population coding and synapse-neuron co-design.
    Second, we develop a computationally efficient (15000 x) and accurate
    (correlation of 0.98) method to evaluate the performance of the network without
    standard recognition tests. Third, we show that the method produces a
    robustness metric that can be used to evaluate noise tolerance.
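
    For context, the sketch below shows the standard leaky integrate-and-fire
    dynamics that simple SNNs of this kind build on. Parameters and dynamics are
    illustrative, not the paper's neuron model.

```python
import numpy as np

def lif_spike_train(input_current, v_th=1.0, tau=20.0, dt=1.0):
    """Leaky integrate-and-fire dynamics: integrate the input, leak towards
    zero, and emit a spike plus reset on threshold crossing."""
    v, spikes = 0.0, []
    for i in input_current:
        v += dt * (-v / tau + i)  # leaky integration
        if v >= v_th:
            spikes.append(1)
            v = 0.0               # reset after the spike
        else:
            spikes.append(0)
    return np.array(spikes)

t = np.arange(100)
current = 0.08 * (1.0 + np.sin(t / 10.0))  # slowly varying input
print(lif_spike_train(current).sum(), "spikes in 100 steps")
```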

    Mode Regularized Generative Adversarial Networks

    Tong Che, Yanran Li, Athul Paul Jacob, Yoshua Bengio, Wenjie Li
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Although Generative Adversarial Networks achieve state-of-the-art results on
    a variety of generative tasks, they are regarded as highly unstable and prone
    to missing modes. We argue that these bad behaviors of GANs are due to the
    very particular functional shape of the trained discriminators in
    high-dimensional spaces, which can easily make training get stuck or push
    probability mass in the wrong direction, towards regions of higher
    concentration than the data generating distribution.

    We introduce several ways of regularizing the objective, which can
    dramatically stabilize the training of GAN models. We also show that our
    regularizers can help distribute probability mass fairly across the modes of
    the data generating distribution during the early phases of training, thus
    providing a unified solution to the missing modes problem.
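
    The abstract does not spell out the regularizers; as a hedged illustration of
    the general idea (pairing the generator G with an encoder E and adding
    reconstruction terms to the generator objective), a sketch might look like the
    following. All architectures, dimensions, and weights are placeholders, not
    the authors' setup.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64
G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
E = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
D = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

bce = nn.BCEWithLogitsLoss()

def generator_loss(x_real, z, lam1=1.0, lam2=1.0):
    """Adversarial term plus mode-regularizing reconstruction terms."""
    x_fake = G(z)
    ones = torch.ones(x_fake.size(0), 1)
    adv = bce(D(x_fake), ones)            # fool D on samples from the prior
    x_rec = G(E(x_real))                  # reconstruct real data through E, G
    rec = ((x_real - x_rec) ** 2).mean()  # pulls G's output toward all modes
    adv_rec = bce(D(x_rec), ones)         # reconstructions should also fool D
    return adv + lam1 * rec + lam2 * adv_rec

x = torch.randn(8, data_dim)    # stand-in batch of real data
z = torch.randn(8, latent_dim)
generator_loss(x, z).backward()
```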


    Computer Vision and Pattern Recognition

    DeMoN: Depth and Motion Network for Learning Monocular Stereo

    Benjamin Ummenhofer, Huizhong Zhou, Jonas Uhrig, Nikolaus Mayer, Eddy Ilg, Alexey Dosovitskiy, Thomas Brox
    Comments: Supplementary material included. Project page: this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we formulate structure from motion as a learning problem. We
    train a convolutional network end-to-end to compute depth and camera motion
    from successive, unconstrained image pairs. The architecture is composed of
    multiple stacked encoder-decoder networks, the core part being an iterative
    network that is able to improve its own predictions. The network estimates not
    only depth and motion, but additionally surface normals, optical flow between
    the images and confidence of the matching. A crucial component of the approach
    is a training loss based on spatial relative differences. Compared to
    traditional two-frame structure from motion methods, results are more accurate
    and more robust. In contrast to the popular depth-from-single-image networks,
    DeMoN learns the concept of matching and, thus, better generalizes to
    structures not seen during training.
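
    One plausible reading of a training loss "based on spatial relative
    differences" penalizes mismatches in spatial gradients rather than absolute
    values; the form below is an assumption for illustration, not necessarily the
    paper's exact loss.

```python
import torch

def relative_difference_loss(pred, gt):
    """Penalize differences between the spatial finite differences of the
    predicted and ground-truth depth maps (shape: batch x H x W)."""
    dx_pred = pred[:, :, 1:] - pred[:, :, :-1]
    dy_pred = pred[:, 1:, :] - pred[:, :-1, :]
    dx_gt = gt[:, :, 1:] - gt[:, :, :-1]
    dy_gt = gt[:, 1:, :] - gt[:, :-1, :]
    return (dx_pred - dx_gt).abs().mean() + (dy_pred - dy_gt).abs().mean()

pred = torch.rand(2, 32, 32, requires_grad=True)  # predicted depth maps
gt = torch.rand(2, 32, 32)                        # ground-truth depth maps
relative_difference_loss(pred, gt).backward()
```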

    Automatic Detection of ADHD and ASD from Expressive Behaviour in RGBD Data

    Shashank Jaiswal, Michel Valstar, Alinda Gillott, David Daley
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Attention Deficit Hyperactivity Disorder (ADHD) and Autism Spectrum Disorder
    (ASD) are neurodevelopmental conditions which affect a significant number of
    children and adults. Currently, the diagnosis of such disorders is done by
    experts who employ standard questionnaires and look for certain behavioural
    markers through manual observation. Such methods for their diagnosis are not
    only subjective, difficult to repeat, and costly but also extremely time
    consuming. In this work, we present a novel methodology to aid diagnostic
    predictions about the presence/absence of ADHD and ASD by automatic visual
    analysis of a person’s behaviour. To do so, we conduct the questionnaires in a
    computer-mediated way while recording participants with modern RGBD
    (Colour+Depth) sensors. In contrast to previous automatic approaches, which
    focused only on detecting certain behavioural markers, our approach provides a
    fully automatic end-to-end system for directly predicting ADHD and ASD in
    adults. Using state-of-the-art facial expression analysis based on Dynamic Deep
    Learning and 3D analysis of behaviour, we attain classification rates of 96%
    for Controls vs Condition (ADHD/ASD) group and 94% for Comorbid (ADHD+ASD) vs
    ASD only group. We show that our system is a potentially useful time saving
    contribution to the diagnostic field of ADHD and ASD.

    Differential Angular Imaging for Material Recognition

    Jia Xue, Hang Zhang, Kristin Dana, Ko Nishino
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Material recognition for real-world outdoor surfaces has become increasingly
    important for computer vision to support its operation “in the wild.”
    Computational surface modeling that underlies material recognition has
    transitioned from reflectance modeling using in-lab controlled radiometric
    measurements to image-based representations based on internet-mined images of
    materials captured in the scene. We propose to take a middle-ground approach
    for material recognition that takes advantage of both rich radiometric cues and
    flexible image capture. We realize this by developing a framework for
    differential angular imaging, where small angular variations in image capture
    provide an enhanced appearance representation and significant recognition
    improvement. We build a large-scale material database, the Ground Terrain in
    Outdoor Scenes (GTOS) database, geared towards real use by autonomous agents.
    The database consists of over 30,000 images covering 40 classes of outdoor
    ground terrain under varying weather and lighting conditions. We develop a
    novel approach for material recognition called a Differential Angular Imaging
    Network (DAIN) to fully leverage this large dataset. With this novel network
    architecture, we extract characteristics of materials encoded in the angular
    and spatial gradients of their appearance. Our results show that DAIN achieves
    recognition performance that surpasses single view or coarsely quantized
    multiview images. These results demonstrate the effectiveness of differential
    angular imaging as a means for flexible, in-place material recognition.

    Pano2Vid: Automatic Cinematography for Watching 360° Videos

    Yu-Chuan Su, Dinesh Jayaraman, Kristen Grauman
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce the novel task of Pano2Vid: automatic cinematography in
    panoramic 360° videos. Given a 360° video, the goal is to
    direct an imaginary camera to virtually capture natural-looking normal
    field-of-view (NFOV) video. By selecting “where to look” within the panorama at
    each time step, Pano2Vid aims to free both the videographer and the end viewer
    from the task of determining what to watch. Towards this goal, we first compile
    a dataset of 360° videos downloaded from the web, together with
    human-edited NFOV camera trajectories to facilitate evaluation. Next, we
    propose AutoCam, a data-driven approach to solve the Pano2Vid task. AutoCam
    leverages NFOV web video to discriminatively identify space-time “glimpses” of
    interest at each time instant, and then uses dynamic programming to select
    optimal human-like camera trajectories. Through experimental evaluation on
    multiple newly defined Pano2Vid performance measures against several baselines,
    we show that our method successfully produces informative videos that could
    conceivably have been captured by human videographers.
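
    The abstract describes AutoCam's trajectory selection as dynamic programming
    over per-glimpse scores. A toy sketch of that selection step follows; the
    scoring values, neighborhood structure, and grid size are stand-ins, not the
    paper's.

```python
import numpy as np

def best_trajectory(scores, neighbors):
    """Maximize summed glimpse scores over time, moving only between
    'neighboring' view directions. scores: (T, K); neighbors maps each
    glimpse index to the allowed previous indices."""
    T, K = scores.shape
    dp = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    dp[0] = scores[0]
    for t in range(1, T):
        for k in range(K):
            j = max(neighbors[k], key=lambda p: dp[t - 1, p])
            dp[t, k] = dp[t - 1, j] + scores[t, k]
            back[t, k] = j
    path = [int(np.argmax(dp[-1]))]
    for t in range(T - 1, 0, -1):   # backtrack the argmax path
        path.append(int(back[t, path[-1]]))
    return path[::-1]

scores = np.random.default_rng(0).random((5, 4))
neighbors = {k: [max(k - 1, 0), k, min(k + 1, 3)] for k in range(4)}
print(best_trajectory(scores, neighbors))
```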

    Spatially Adaptive Computation Time for Residual Networks

    Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, Ruslan Salakhutdinov
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    This paper proposes a deep learning architecture based on Residual Network
    that dynamically adjusts the number of executed layers for the regions of the
    image. This architecture is end-to-end trainable, deterministic and
    problem-agnostic. It is therefore applicable without any modifications to a
    wide range of computer vision problems such as image classification, object
    detection and image segmentation. We present experimental results showing that
    this model improves the computational efficiency of Residual Networks on the
    challenging ImageNet classification and COCO object detection datasets.
    Additionally, we evaluate the computation time maps on the visual saliency
    dataset cat2000 and find that they correlate surprisingly well with human eye
    fixation positions.
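
    The halting mechanism is not detailed in the abstract; the sketch below shows
    an ACT-style per-position halting rule, which is one plausible reading (an
    assumption for illustration, not the paper's exact rule).

```python
import numpy as np

def halting_depth(halt_probs, eps=0.01):
    """Per spatial position, execute residual units until the running sum
    of predicted halting scores reaches 1 - eps. halt_probs: (L, H, W)
    scores for L units. Returns the number of units executed per position."""
    L, _, _ = halt_probs.shape
    cum = np.cumsum(halt_probs, axis=0)
    halted = cum >= 1.0 - eps
    # first unit index where the threshold is crossed (or L if never)
    return np.where(halted.any(axis=0), halted.argmax(axis=0) + 1, L)

probs = np.random.default_rng(0).uniform(0, 0.5, size=(6, 4, 4))
print(halting_depth(probs))  # cheap regions halt early, hard ones run deeper
```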

    Global Hypothesis Generation for 6D Object Pose Estimation

    Frank Michel, Alexander Kirillov, Eric Brachmann, Alexander Krull, Stefan Gumhold, Bogdan Savchynskyy, Carsten Rother
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper addresses the task of estimating the 6D pose of a known 3D object
    from a single RGB-D image. Most modern approaches solve this task in three
    steps: i) Compute local features; ii) Generate a pool of pose-hypotheses; iii)
    Select and refine a pose from the pool. This work focuses on the second step.
    While all existing approaches generate the hypotheses pool via local reasoning,
    e.g. RANSAC or Hough-voting, we are the first to show that global reasoning is
    beneficial at this stage. In particular, we formulate a novel fully-connected
    Conditional Random Field (CRF) that outputs a very small number of
    pose-hypotheses. Despite the potential functions of the CRF being non-Gaussian,
    we give a new and efficient two-step optimization procedure, with some
    guarantees of optimality. We utilize our global hypothesis generation
    procedure to produce results that exceed state-of-the-art for the challenging
    “Occluded Object Dataset”.

    Exploring the potential of combining time of flight and thermal infrared cameras for person detection

    Wim Abbeloos, Toon Goedemé
    Comments: Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics
    Journal-ref: Proceedings of the International Conference on Informatics in
    Control, Automation and Robotics (2013) 464-470
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Combining new, low-cost thermal infrared and time-of-flight range sensors
    provides new opportunities. In this position paper we explore the possibilities
    of combining these sensors and using their fused data for person detection. The
    proposed calibration approach for this sensor combination differs from the
    traditional stereo camera calibration in two fundamental ways. A first
    distinction is that the spectral sensitivity of the two sensors differs
    significantly. In fact, there is no sensitivity range overlap at all. A second
    distinction is that their resolution is typically very low, which requires
    special attention. We assume a situation in which the sensors’ relative
    position is known, but their orientation is unknown. In addition, some of the
    typical measurement errors are discussed, and methods to compensate for them
    are proposed. We discuss how the fused data could allow increased accuracy and
    robustness without the need for complex algorithms requiring large amounts of
    computational power and training data.

    Process Monitoring of Extrusion Based 3D Printing via Laser Scanning

    Matthias Faes, Wim Abbeloos, Frederik Vogeler, Hans Valkenaers, Kurt Coppens, Toon Goedemé, Eleonora Ferraris
    Comments: International Conference on Polymers and Moulds Innovations(PMI) 2014
    Journal-ref: Conference Proceedings PMI 6 (2014) 363-367
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Extrusion based 3D Printing (E3DP) is an Additive Manufacturing (AM)
    technique that extrudes thermoplastic polymer in order to build up components
    using a layerwise approach. However, AM typically requires long production times
    in comparison to mass production processes such as Injection Molding. Failures
    during the AM process are often only noticed after build completion and
    frequently lead to part rejection because of dimensional inaccuracy or lack of
    mechanical performance, resulting in an important loss of time and material. A
    solution to improve the accuracy and robustness of a manufacturing technology
    is the integration of sensors to monitor and control process state-variables
    online. In this way, errors can be rapidly detected and possibly compensated at
    an early stage. To achieve this, we integrated a modular 2D laser triangulation
    scanner into an E3DP machine and analyzed feedback signals. A 2D laser
    triangulation scanner was selected here owing to the very compact size,
    achievable accuracy and the possibility of capturing geometrical 3D data. Thus,
    our implemented system is able to provide both quantitative and qualitative
    information. Also, in this work, first steps towards the development of a
    quality control loop for E3DP processes are presented and opportunities are
    discussed.

    Embedded Line Scan Image Sensors: The Low Cost Alternative for High Speed Imaging

    Stef Van Wolputte, Wim Abbeloos, Stijn Helsen, Abdellatif Bey-Temsamani, Toon Goedemé
    Comments: 2015 International Conference on Image Processing Theory, Tools and Applications (IPTA)
    Journal-ref: Proceedings of the International Conference on Image Processing
    Theory, Tools and Applications (2015) 543-549
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we propose a low-cost high-speed imaging line scan system. We
    replace an expensive industrial line scan camera and illumination with a
    custom-built set-up of cheap off-the-shelf components, yielding a measurement
    system of comparable quality at about 20 times lower cost. We use a
    low-cost linear (1D) image sensor, cheap optics including LED-based or
    laser-based lighting, and an embedded platform to process the images. A
    step-by-step method to design such a custom high-speed imaging system and
    select proper components is proposed. Simulations that predict the final
    image quality obtained by the set-up have been developed. Finally, we
    applied our method in a lab set-up closely representing real-life cases. Our
    results show that our simulations are very accurate and that our low-cost line
    scan set-up acquires image quality comparable to that of the high-end
    commercial vision system, at a fraction of the price.

    A Functional Regression approach to Facial Landmark Tracking

    Enrique Sánchez-Lozano, Georgios Tzimiropoulos, Brais Martinez, Fernando De la Torre, Michel Valstar
    Comments: Manuscript submitted to review at TPAMI, extending ECCV 2012 and ECCV 2016 papers on Continuous Regression
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Linear regression is a fundamental building block in many face detection and
    tracking algorithms, typically used to predict shape displacements from image
    features through a linear mapping. This paper presents a Functional Regression
    solution to the least squares problem, which we coin Continuous Regression,
    resulting in the first real-time incremental face tracker. Contrary to prior
    work in Functional Regression, in which B-splines or Fourier series were used,
    we propose to approximate the input space by its first-order Taylor expansion,
    yielding a closed-form solution for the continuous domain of displacements. We
    then extend the continuous least squares problem to correlated variables, and
    demonstrate the generalisation of our approach. We incorporate Continuous
    Regression into the cascaded regression framework, and show its computational
    benefits for both training and testing. We then present a fast approach for
    incremental learning within Cascaded Continuous Regression, coined iCCR, and
    show that its complexity allows real-time face tracking, being 20 times faster
    than the state of the art. To the best of our knowledge, this is the first
    incremental face tracker that is shown to operate in real-time. We show that
    iCCR achieves state-of-the-art performance in the 300-VW dataset, the most
    recent, large-scale benchmark for face tracking.

    Template Matching with Deformable Diversity Similarity

    Itamar Talmi, Roey Mechrez, Lihi Zelnik-Manor
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a novel measure for template matching named Deformable Diversity
    Similarity — based on the diversity of feature matches between a target image
    window and the template. We rely on both local appearance and geometric
    information that jointly lead to a powerful approach for matching. Our key
    contribution is a similarity measure that is robust to complex deformations,
    significant background clutter, and occlusions. Empirical evaluation on the
    most up-to-date benchmark shows that our method outperforms the current
    state-of-the-art in its detection accuracy while improving computational
    complexity.
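
    A minimal sketch of the diversity idea: count how many distinct template
    patches are matched by patches of a candidate window. The paper's full
    measure also weights matches by deformation, which is omitted here.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def diversity_similarity(template_patches, window_patches):
    """Count the *distinct* template patches that are the nearest neighbor
    of some window patch; more diverse matching = better match."""
    nn = NearestNeighbors(n_neighbors=1).fit(template_patches)
    _, idx = nn.kneighbors(window_patches)
    return len(np.unique(idx))

rng = np.random.default_rng(0)
template = rng.normal(size=(50, 27))  # e.g. 3x3x3 patches, flattened
window_good = template + 0.05 * rng.normal(size=template.shape)
window_bad = np.tile(template[0], (50, 1))
print(diversity_similarity(template, window_good))  # high: many distinct matches
print(diversity_similarity(template, window_bad))   # low: all map to one patch
```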

    Saliency Driven Image Manipulation

    Roey Mechrez, Eli Shechtman, Lihi Zelnik-Manor
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Have you ever taken a picture only to find out that an unimportant background
    object ended up being overly salient? Or one of those team sports photos where
    your favorite player blends with the rest? Wouldn’t it be nice if you could
    tweak these pictures just a little bit so that the distractor would be
    attenuated and your favorite player would stand out among her peers?
    Manipulating images in order to control the saliency of objects is the goal of
    this paper. We propose an approach that considers the internal color and
    saliency properties of the image. It changes the saliency map via an
    optimization framework that relies on patch-based manipulation using only
    patches from within the same image to achieve realistic looking results.
    Applications include object enhancement, distractor attenuation and background
    decluttering. Comparing our method to previous ones shows significant
    improvement, both in the achieved saliency manipulation and in the realistic
    appearance of the resulting images.

    Fusion of Range and Thermal Images for Person Detection

    Wim Abbeloos, Toon Goedemé
    Comments: VII International Conference on Electrical Engineering FIE 2014, Santiago de Cuba
    Journal-ref: Proceedings Conferencia Internacional de Ingeniería Eléctrica
    7 (2014) 1-4
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Detecting people in images is a challenging problem. Differences in pose,
    clothing and lighting, along with other factors, cause a lot of variation in
    their appearance. To overcome these issues, we propose a system based on fused
    range and thermal infrared images. These measurements show considerably less
    variation and provide more meaningful information. We provide a brief
    introduction to the sensor technology used and propose a calibration method.
    Several data fusion algorithms are compared and their performance is assessed
    on a simulated data set. The results of initial experiments on real data are
    analyzed and the measurement errors and the challenges they present are
    discussed. The resulting fused data are used to efficiently detect people in a
    fixed camera set-up. The system is extended to include person tracking.

    Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring

    Seungjun Nah, Tae Hyun Kim, Kyoung Mu Lee
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Non-uniform blind deblurring for general dynamic scenes is a challenging
    computer vision problem since blurs are caused by camera shake, scene depth as
    well as multiple object motions. To remove these complicated motion blurs,
    conventional energy optimization based methods rely on simple assumptions such
    as the blur kernel being partially uniform or locally linear. Moreover, recent
    machine learning based methods also depend on synthetic blur datasets generated
    under these assumptions. This makes conventional deblurring methods fail to
    remove blurs where the blur kernel is difficult to approximate or parameterize
    (e.g. object motion boundaries). In this work, we propose a multi-scale
    convolutional neural network that restores blurred images caused by various
    sources in an end-to-end manner. Furthermore, we present a multi-scale loss
    function that mimics conventional coarse-to-fine approaches. Moreover, we
    propose a new large-scale dataset that provides pairs of realistic blurry images
    and corresponding ground-truth sharp images obtained by a
    high-speed camera. With the proposed model trained on this dataset, we
    demonstrate empirically that our method achieves the state-of-the-art
    performance in dynamic scene deblurring not only qualitatively, but also
    quantitatively.
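
    A minimal reading of the multi-scale loss, assuming per-level MSE against a
    downsampled sharp target; the paper's exact level weighting may differ.

```python
import torch
import torch.nn.functional as F

def multiscale_loss(preds, sharp):
    """preds: list of restored images at coarse-to-fine resolutions; the
    sharp target is downsampled to match each level."""
    loss = 0.0
    for p in preds:
        target = F.interpolate(sharp, size=p.shape[-2:], mode='bilinear',
                               align_corners=False)
        loss = loss + F.mse_loss(p, target)
    return loss / len(preds)

sharp = torch.rand(2, 3, 64, 64)
preds = [torch.rand(2, 3, s, s, requires_grad=True) for s in (16, 32, 64)]
multiscale_loss(preds, sharp).backward()
```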

    Semi-Supervised Learning And Graph Cuts For Consensus Based Medical Image Segmentation

    Dwarikanath Mahapatra
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Medical image segmentation requires consensus ground truth segmentations to
    be derived from multiple expert annotations. A novel approach is proposed that
    obtains consensus segmentations from experts using graph cuts (GC) and
    semi-supervised learning (SSL). Popular approaches use iterative Expectation
    Maximization (EM) to estimate the final annotation and quantify each annotator's
    performance. Such techniques pose the risk of getting trapped in local minima.
    We propose a self-consistency (SC) score to quantify annotator consistency
    using low level image features. SSL is used to predict missing annotations by
    considering global features and local image consistency. The SC score also
    serves as the penalty cost in a second order Markov random field (MRF) cost
    function optimized using graph cuts to derive the final consensus label. Graph
    cuts obtain a global optimum without an iterative procedure. Experimental
    results on synthetic images, real data of Crohn’s disease patients and retinal
    images show our final segmentation to be accurate and more consistent than
    competing methods.

    Re-identification of Humans in Crowds using Personal, Social and Environmental Constraints

    Shayan Modiri Assari, Haroon Idrees, Mubarak Shah
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper addresses the problem of human re-identification across
    non-overlapping cameras in crowds. Re-identification in crowded scenes is a
    challenging problem due to the large number of people and frequent occlusions,
    coupled with changes in their appearance due to different properties and
    exposure of cameras. To solve this problem, we model multiple Personal, Social
    and Environmental (PSE) constraints on human motion across cameras. The
    personal constraints include appearance and preferred speed of each individual
    assumed to be similar across the non-overlapping cameras. The social influences
    (constraints) are quadratic in nature, i.e. occur between pairs of individuals,
    and modeled through grouping and collision avoidance. Finally, the
    environmental constraints capture the transition probabilities between gates
    (entrances / exits) in different cameras, defined as multi-modal distributions
    of transition time and destination between all pairs of gates. We incorporate
    these constraints into an energy minimization framework for solving human
    re-identification. Assigning (1-1) correspondence while modeling PSE
    constraints is NP-hard. We present a stochastic local search algorithm to
    restrict the search space of hypotheses, and obtain (1-1) solution in the
    presence of linear and quadratic PSE constraints. Moreover, we present an
    alternate optimization using Frank-Wolfe algorithm that solves the convex
    approximation of the objective function with linear relaxation on binary
    variables, and yields an order of magnitude speed up over stochastic local
    search with minor drop in performance. We evaluate our approach using
    Cumulative Matching Curves as well as (1-1) assignment on several thousand frames
    of Grand Central, PRID and DukeMTMC datasets, and obtain significantly better
    results compared to existing re-identification methods.

    A Deep 3D Convolutional Neural Network Based Design for Manufacturability Framework

    Aditya Balu, Kin Gwn Lore, Gavin Young, Adarsh Krishnamurthy, Soumik Sarkar
    Comments: 9 Pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep 3D Convolutional Neural Networks (3D-CNN) are traditionally used for
    object recognition, video data analytics and human gesture recognition. In this
    paper, we present a novel application of 3D-CNNs in understanding
    difficult-to-manufacture features from computer-aided design (CAD) models to
    develop a decision support tool for cyber-enabled manufacturing. Traditionally,
    design for manufacturability (DFM) rules are hand-crafted and used to
    accelerate the engineering product design cycle by integrating
    manufacturability analysis during the design stage. Such a practice relies on
    the experience and training of the designer to create a complex component that
    is manufacturable. However, even after careful design, the inclusion of certain
    features might cause the part to be non-manufacturable. In this paper, we
    develop a framework using Deep 3D-CNNs to learn salient features from a CAD
    model of a mechanical part and determine if the part can be manufactured or
    not. CAD models of different manufacturable and non-manufacturable parts are
    generated using a solid modeling kernel and then converted into 3D voxel data
    using a fast GPU-accelerated voxelization algorithm. The voxel data is used to
    train a 3D-CNN model for manufacturability classification. Feature space and
    filter visualization is also performed to understand the learning capability in
    the context of manufacturability features. We demonstrate that the proposed
    3D-CNN based DFM framework is able to learn the DFM rules for
    non-manufacturable features without a human prior. The framework can be
    extended to identify a large variety of difficult-to-manufacture features at
    multiple spatial scales leading to a real-time decision support system for DFM.
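
    A toy voxel-based 3D-CNN classifier in the spirit of the described pipeline;
    the layer sizes, grid resolution, and PyTorch framing are illustrative
    assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Voxel3DCNN(nn.Module):
    """Binary manufacturable / non-manufacturable decision over a voxelized
    CAD part (occupancy grid)."""
    def __init__(self, grid=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Linear(16 * (grid // 4) ** 3, 2)

    def forward(self, vox):
        h = self.features(vox)
        return self.classifier(h.flatten(1))

vox = torch.rand(4, 1, 32, 32, 32)  # batch of voxel occupancy grids
print(Voxel3DCNN()(vox).shape)      # torch.Size([4, 2])
```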

    Richer Convolutional Features for Edge Detection

    Yun Liu, Ming-Ming Cheng, Xiaowei Hu, Kai Wang, Xiang Bai
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose an accurate edge detector using richer
    convolutional features (RCF). Since objects in natural images have various
    scales and aspect ratios, the rich hierarchical representations automatically
    learned by CNNs are critical and effective for detecting edges and object
    boundaries. Moreover, convolutional features gradually become coarser as
    receptive fields increase. Based on these observations, our proposed network
    architecture makes full use of multiscale and multi-level information to
    perform the image-to-image edge prediction by combining all of the useful
    convolutional features into a holistic framework. It is the first attempt to
    adopt such rich convolutional features in computer vision tasks. Using the
    VGG16 network, we achieve state-of-the-art results on several available
    datasets. When evaluating on the well-known BSDS500 benchmark, we achieve an
    ODS F-measure of 0.811 while retaining a fast speed (8 FPS). Besides, our fast
    version of RCF achieves an ODS F-measure of 0.806 at 30 FPS.

    Mining Pixels: Weakly Supervised Semantic Segmentation Using Image Labels

    Qibin Hou, Puneet Kumar Dokania, Daniela Massiceti, Yunchao Wei, Ming-Ming Cheng, Philip Torr
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We consider the task of learning a classifier for semantic segmentation using
    weak supervision, in this case, image labels specifying the objects within the
    image. Our method uses deep convolutional neural networks (CNNs) and adopts an
    Expectation-Maximization (EM) based approach maintaining the uncertainty on
    pixel labels. We focus on the following three crucial aspects of the EM based
    approach: (i) initialization; (ii) latent posterior estimation (E step); and
    (iii) the parameter update (M step). We show that saliency and attention
    maps provide good cues to learn an initialization model and allow
    us to skip the bad local minima to which EM methods are otherwise
    traditionally prone. In order to update the parameters, we propose minimizing
    the combination of the standard softmax loss and the KL divergence
    between the true latent posterior and the likelihood given by the CNN. We argue
    that this combination is more robust to wrong predictions made by the
    expectation step of the EM method. We support this argument with empirical and
    visual results. We additionally incorporate an approximate
    intersection-over-union (IoU) term into the loss function for better parameter
    estimation. Extensive experiments and discussions show that: (i) our method is
    very simple and intuitive; (ii) requires only image-level labels; and (iii)
    consistently outperforms other weakly supervised state-of-the-art methods by
    a large margin on the PASCAL VOC 2012 dataset.
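
    One way to read the proposed M-step objective (softmax loss plus a KL term
    between the latent posterior and the CNN likelihood) is sketched below; the
    shapes, the pseudo-label construction, and the mixing weight alpha are
    assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def m_step_loss(logits, posterior, alpha=0.5):
    """logits, posterior: (B, C, H, W). Cross-entropy against the
    posterior's argmax plus KL(posterior || CNN likelihood), per pixel."""
    log_q = F.log_softmax(logits, dim=1)
    hard = posterior.argmax(dim=1)  # pseudo ground-truth labels
    ce = F.nll_loss(log_q, hard)
    kl = (posterior * (posterior.clamp_min(1e-8).log() - log_q)).sum(1).mean()
    return ce + alpha * kl

logits = torch.randn(2, 21, 8, 8, requires_grad=True)
posterior = torch.softmax(torch.randn(2, 21, 8, 8), dim=1)
m_step_loss(logits, posterior).backward()
```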

    Semi-Supervised Detection of Extreme Weather Events in Large Climate Datasets

    Evan Racah, Christopher Beckham, Tegan Maharaj, Prabhat, Christopher Pal
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    The detection and identification of extreme weather events in large scale
    climate simulations is an important problem for risk management, informing
    governmental policy decisions and advancing our basic understanding of the
    climate system. Recent work has shown that fully supervised convolutional
    neural networks (CNNs) can yield acceptable accuracy for classifying well-known
    types of extreme weather events when large amounts of labeled data are
    available. However, there are many different types of spatially localized
    climate patterns of interest (including hurricanes, extra-tropical cyclones,
    weather fronts, blocking events, etc.) found in simulation data for which
    labeled data is not available at large scale for all simulations of interest.
    We present a multichannel spatiotemporal encoder-decoder CNN architecture for
    semi-supervised bounding box prediction and exploratory data analysis. This
    architecture is designed to fully model multi-channel simulation data, temporal
    dynamics and unlabelled data within a reconstruction and prediction framework
    so as to improve the detection of a wide range of extreme weather events. Our
    architecture can be viewed as a 3D convolutional autoencoder with an additional
    modified one-pass bounding box regression loss. We demonstrate that our
    approach is able to leverage temporal information and unlabelled data to
    improve localization of extreme weather events. Further, we explore the
    representations learned by our model in order to better understand this
    important data, and facilitate further work in understanding and mitigating the
    effects of climate change.


    Artificial Intelligence

    Extend natural neighbor: a novel classification method with self-adaptive neighborhood parameters in different stages

    Ji Feng, Qingsheng Zhu, Jinlong Huang, Lijun Yang
    Comments: 10 pages, 2 figures, 2 tables
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Various kinds of k-nearest neighbor (KNN) based classification methods are
    the basis of many well-established and high-performance pattern-recognition
    techniques, but they are vulnerable to the choice of their parameter.
    Essentially, the challenge is to detect the neighborhood of various data sets
    while remaining ignorant of the data characteristics. This article introduces a
    new supervised classification method, the extend natural neighbor (ENaN)
    method, and shows that it provides a better classification result without
    choosing the neighborhood parameter artificially. Unlike the original KNN based
    methods, which need a predefined k, the ENaN method predicts different k in
    different stages. Therefore, the ENaN method is able to learn more from
    flexible neighbor information in both the training and testing stages, and
    provide a better classification result.
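
    A sketch of the natural-neighbor search that removes the hand-picked k: grow
    k until every point appears in some other point's k-NN list. This stopping
    rule follows the natural-neighbor literature; the ENaN classifier built on
    top of it is not shown.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def natural_neighbor_eigenvalue(X):
    """Return the smallest k at which every point is a neighbor of someone,
    replacing a hand-picked k (O(n^2) here; fine for a sketch)."""
    n = len(X)
    _, idx = NearestNeighbors(n_neighbors=n).fit(X).kneighbors(X)
    in_degree = np.zeros(n, dtype=int)
    for k in range(1, n):           # idx[:, 0] is the point itself
        for i in range(n):
            in_degree[idx[i, k]] += 1
        if np.all(in_degree > 0):   # everyone is somebody's neighbor
            return k
    return n - 1

X = np.random.default_rng(0).normal(size=(40, 2))
print(natural_neighbor_eigenvalue(X))
```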

    Knowledge Representation in Graphs using Convolutional Neural Networks

    Armando Vieira
    Subjects: Artificial Intelligence (cs.AI)

    Knowledge Graphs (KG) constitute a flexible representation of complex
    relationships between entities, particularly useful for biomedical data. These
    KG, however, are very sparse, with many missing edges (facts), and the
    visualisation of the mesh of interactions is nontrivial. Here we apply a
    compositional model to embed nodes and relationships into a vectorised semantic
    space to perform graph completion. A visualisation tool based on Convolutional
    Neural Networks and Self-Organised Maps (SOM) is proposed to extract high-level
    insights from the KG. We apply this technique to a subset of CTD, containing
    interactions of compounds with human genes / proteins and show that the
    performance is comparable to the one obtained by structural models.

    Measuring the non-asymptotic convergence of sequential Monte Carlo samplers using probabilistic programming

    Marco F. Cusumano-Towner, Vikash K. Mansinghka
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    A key limitation of sampling algorithms for approximate inference is that it
    is difficult to quantify their approximation error. Widely used sampling
    schemes, such as sequential importance sampling with resampling and
    Metropolis-Hastings, produce output samples drawn from a distribution that may
    be far from the target posterior distribution. This paper shows how to
    upper-bound the symmetric KL divergence between the output distribution of a
    broad class of sequential Monte Carlo (SMC) samplers and their target posterior
    distributions, subject to assumptions about the accuracy of a separate
    gold-standard sampler. The proposed method applies to samplers that combine
    multiple particles, multinomial resampling, and rejuvenation kernels. The
    experiments show the technique being used to estimate bounds on the divergence
    of SMC samplers for posterior inference in a Bayesian linear regression model
    and a Dirichlet process mixture model.

    Effect of Reward Function Choices in Risk-Averse Reinforcement Learning

    Shuai Ma, Jiayuan Yu
    Comments: 23 pages, 4 figures
    Subjects: Artificial Intelligence (cs.AI)

    This paper studies Value-at-Risk problems in finite-horizon Markov decision
    processes (MDPs) with finite state space and two forms of reward function.
    Firstly we study the effect of reward function on two criteria in a
    short-horizon MDP. Secondly, for long-horizon MDPs, we estimate the total
    reward distribution in a finite-horizon Markov chain (MC) with the help of
    spectral theory and the central limit theorem, and present a transformation
    algorithm for the MCs with a three-argument reward function and a salvage
    reward.

    A Multi-Pass Approach to Large-Scale Connectomics

    Yaron Meirovitch, Alexander Matveev, Hayk Saribekyan, David Budden, David Rolnick, Gergely Odor, Seymour Knowles-Barley, Thouis Raymond Jones, Hanspeter Pfister, Jeff William Lichtman, Nir Shavit
    Comments: 18 pages, 10 figures
    Subjects: Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)

    The field of connectomics faces unprecedented “big data” challenges. To
    reconstruct neuronal connectivity, automated pixel-level segmentation is
    required for petabytes of streaming electron microscopy data. Existing
    algorithms provide relatively good accuracy but are unacceptably slow, and
    would require years to extract connectivity graphs from even a single cubic
    millimeter of neural tissue. Here we present a viable real-time solution, a
    multi-pass pipeline optimized for shared-memory multicore systems, capable of
    processing data at near the terabyte-per-hour pace of multi-beam electron
    microscopes. The pipeline makes an initial fast-pass over the data, and then
    makes a second slow-pass to iteratively correct errors in the output of the
    fast-pass. We demonstrate the accuracy of a sparse slow-pass reconstruction
    algorithm and suggest new methods for detecting morphological errors. Our
    fast-pass approach posed many algorithmic challenges, including the design
    and implementation of novel shallow convolutional neural nets and the
    parallelization of watershed and object-merging techniques. We use it to
    reconstruct, from image stack to skeletons, the full dataset of Kasthuri et al.
    (463 GB capturing 120,000 cubic microns) in a matter of hours on a single
    multicore machine rather than the weeks it has taken in the past on much larger
    distributed systems.


    Information Retrieval

    An Information-theoretic Approach to Machine-oriented Music Summarization

    Francisco Raposo, David Martins de Matos, Ricardo Ribeiro
    Comments: 5 pages, 1 table
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Sound (cs.SD)

    Applying generic media-agnostic summarization to music allows for higher
    efficiency in automatic processing, storage, and communication of datasets
    while also alleviating copyright issues. This process has already been proven
    useful in the context of music genre classification. In this paper, we
    generalize conclusions from previous work by evaluating the impact of generic
    summarization in music from a probabilistic perspective, agnostic to
    specific tasks. We estimate Gaussian distributions for original and
    summarized songs and compute their relative entropy to measure how much
    information is lost in the summarization process. Based on this observation, we
    further propose a simple yet expressive summarization method that objectively
    outperforms previous methods and is better suited to avoid copyright issues. We
    present results suggesting that relative entropy is a good predictor of
    summarization performance in the context of tasks relying on a bag-of-features
    assumption.
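
    The relative-entropy computation is standard; below is a sketch using the
    closed-form KL divergence between two fitted Gaussians. The feature choice
    (e.g. MFCC frames) and frame counts are illustrative assumptions.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL(N0 || N1) between multivariate Gaussians."""
    d = len(mu0)
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

rng = np.random.default_rng(0)
full = rng.normal(size=(1000, 12))  # e.g. MFCC frames of the full song
summary = full[:200]                # frames kept by some summarizer
kl = gaussian_kl(full.mean(0), np.cov(full.T), summary.mean(0), np.cov(summary.T))
sym = kl + gaussian_kl(summary.mean(0), np.cov(summary.T), full.mean(0), np.cov(full.T))
print(sym)  # symmetric relative entropy: lower = less information lost
```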


    Computation and Language

    Multitask learning for semantic sequence prediction under varying data conditions

    Héctor Martínez Alonso, Barbara Plank
    Comments: To appear in EACL 2017
    Subjects: Computation and Language (cs.CL)

    Multitask learning has been applied successfully to a range of tasks, mostly
    morphosyntactic. However, little is known about when MTL works and whether there
    are data characteristics that help to determine the success of MTL. In this
    paper we evaluate a range of semantic sequence labeling tasks in a MTL setup.
    We examine different auxiliary task configurations, including a novel
    setup, and correlate their impact with data-dependent conditions. Our results
    show that MTL is not always effective, because significant improvements are
    obtained only for 1 out of 5 tasks. When successful, auxiliary tasks with
    compact and more uniform label distributions are preferable.


    Distributed, Parallel, and Cluster Computing

    Asynchronous approach in the plane: A deterministic polynomial algorithm

    Sébastien Bouchard, Marjorie Bournat, Yoann Dieudonné, Swan Dubois, Franck Petit
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper we study the task of approach of two mobile agents having the
    same limited range of vision and moving asynchronously in the plane. This task
    consists in getting them in finite time within each other’s range of vision.
    The agents execute the same deterministic algorithm and are assumed to have a
    compass showing the cardinal directions as well as a unit measure. On the other
    hand, they do not share any global coordinates system (like GPS), cannot
    communicate and have distinct labels. Each agent knows its label but does not
    know the label of the other agent or the initial position of the other agent
    relative to its own. The route of an agent is a sequence of segments that are
    subsequently traversed in order to achieve approach. For each agent, the
    computation of its route depends only on its algorithm and its label. An
    adversary chooses the initial positions of both agents in the plane and
    controls the way each of them moves along every segment of the routes, in
    particular by arbitrarily varying the speeds of the agents. A deterministic
    approach algorithm is a deterministic algorithm that always allows two agents
    with any distinct labels to solve the task of approach regardless of the
    choices and the behavior of the adversary. The cost of a complete execution of
    an approach algorithm is the length of both parts of route travelled by the
    agents until approach is completed. Let Δ and l be the initial distance
    separating the agents and the length of the shortest label, respectively.
    Assuming that Δ and l are unknown to both agents, does there exist a
    deterministic approach algorithm always working at a cost that is polynomial
    in Δ and l? In this paper, we provide a positive answer to
    the above question by designing such an algorithm.


    Learning

    A Communication-Efficient Parallel Method for Group-Lasso

    Binghong Chen, Jun Zhu
    Comments: 7 pages
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Group-Lasso (gLasso) identifies important explanatory factors in predicting
    the response variable by considering the grouping structure over input
    variables. However, most existing algorithms for gLasso are not scalable to
    large-scale datasets, which are becoming the norm in many applications.
    In this paper, we present a divide-and-conquer based parallel algorithm
    (DC-gLasso) to scale up gLasso in the tasks of regression with grouping
    structures. DC-gLasso only needs two iterations to collect and aggregate the
    local estimates on subsets of the data, and is provably correct to recover the
    true model under certain conditions. We further extend it to deal with
    overlaps between groups. Empirical results on a wide range of synthetic and
    real-world datasets show that DC-gLasso can significantly improve the time
    efficiency without sacrificing regression accuracy.
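
    A hedged sketch of the divide-and-conquer flavor: a plain proximal-gradient
    group lasso fitted on data subsets whose local estimates are then averaged.
    The paper's actual two-round aggregation scheme is more involved; this only
    shows the general shape.

```python
import numpy as np

def group_lasso(X, y, groups, lam=0.1, lr=0.01, iters=500):
    """Proximal gradient for group lasso: gradient step on the squared
    loss, then block soft-thresholding of each group."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        z = beta - lr * (X.T @ (X @ beta - y) / n)
        for g in groups:
            norm = np.linalg.norm(z[g])
            if norm > 0:
                z[g] *= max(0.0, 1.0 - lr * lam / norm)
        beta = z
    return beta

def dc_glasso(X, y, groups, m=4, **kw):
    """Fit gLasso on m subsets and average the local estimates."""
    parts = np.array_split(np.arange(len(y)), m)
    return np.mean([group_lasso(X[p], y[p], groups, **kw) for p in parts], axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 12))
beta_true = np.r_[np.ones(4), np.zeros(8)]  # only the first group is active
y = X @ beta_true + 0.1 * rng.normal(size=400)
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
print(np.round(dc_glasso(X, y, groups), 2))
```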

    Mode Regularized Generative Adversarial Networks

    Tong Che, Yanran Li, Athul Paul Jacob, Yoshua Bengio, Wenjie Li
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Although Generative Adversarial Networks achieve state-of-the-art results on
    a variety of generative tasks, they are regarded as highly unstable and prone
    to missing modes. We argue that these bad behaviors of GANs are due to the
    very particular functional shape of the trained discriminators in
    high-dimensional spaces, which can easily make training get stuck or push
    probability mass in the wrong direction, towards regions of higher
    concentration than the data generating distribution.

    We introduce several ways of regularizing the objective, which can
    dramatically stabilize the training of GAN models. We also show that our
    regularizers can help distribute probability mass fairly across the modes of
    the data generating distribution during the early phases of training, thus
    providing a unified solution to the missing modes problem.

    An Information-theoretic Approach to Machine-oriented Music Summarization

    Francisco Raposo, David Martins de Matos, Ricardo Ribeiro
    Comments: 5 pages, 1 table
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Sound (cs.SD)

    Applying generic media-agnostic summarization to music allows for higher
    efficiency in automatic processing, storage, and communication of datasets
    while also alleviating copyright issues. This process has already been proven
    useful in the context of music genre classification. In this paper, we
    generalize conclusions from previous work by evaluating the impact of generic
    summarization in music from a probabilistic perspective, agnostic to
    specific tasks. We estimate Gaussian distributions for original and
    summarized songs and compute their relative entropy to measure how much
    information is lost in the summarization process. Based on this observation, we
    further propose a simple yet expressive summarization method that objectively
    outperforms previous methods and is better suited to avoid copyright issues. We
    present results suggesting that relative entropy is a good predictor of
    summarization performance in the context of tasks relying on a bag-of-features
    assumption.

    Extend natural neighbor: a novel classification method with self-adaptive neighborhood parameters in different stages

    Ji Feng, Qingsheng Zhu, Jinlong Huang, Lijun Yang
    Comments: 10 pages, 2 figures, 2 tables
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Various kinds of k-nearest neighbor (KNN) based classification methods are
    the basis of many well-established and high-performance pattern-recognition
    techniques, but they are vulnerable to the choice of their parameter.
    Essentially, the challenge is to detect the neighborhood of various data sets
    while remaining ignorant of the data characteristics. This article introduces a
    new supervised classification method, the extend natural neighbor (ENaN)
    method, and shows that it provides a better classification result without
    choosing the neighborhood parameter artificially. Unlike the original KNN based
    methods, which need a predefined k, the ENaN method predicts different k in
    different stages. Therefore, the ENaN method is able to learn more from
    flexible neighbor information in both the training and testing stages, and
    provide a better classification result.

    Spatially Adaptive Computation Time for Residual Networks

    Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, Ruslan Salakhutdinov
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    This paper proposes a deep learning architecture based on Residual Network
    that dynamically adjusts the number of executed layers for the regions of the
    image. This architecture is end-to-end trainable, deterministic and
    problem-agnostic. It is therefore applicable without any modifications to a
    wide range of computer vision problems such as image classification, object
    detection and image segmentation. We present experimental results showing that
    this model improves the computational efficiency of Residual Networks on the
    challenging ImageNet classification and COCO object detection datasets.
    Additionally, we evaluate the computation time maps on the visual saliency
    dataset cat2000 and find that they correlate surprisingly well with human eye
    fixation positions.

    Large-Margin Softmax Loss for Convolutional Neural Networks

    Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang
    Comments: Published in ICML 2016. Revised some typos
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Cross-entropy loss together with softmax is arguably one of the most commonly
    used supervision components in convolutional neural networks (CNNs). Despite
    its simplicity, popularity and excellent performance, the component does not
    explicitly encourage discriminative learning of features. In this paper, we
    propose a generalized large-margin softmax (L-Softmax) loss which explicitly
    encourages intra-class compactness and inter-class separability between learned
    features. Moreover, L-Softmax not only can adjust the desired margin but also
    can avoid overfitting. We also show that the L-Softmax loss can be optimized by
    typical stochastic gradient descent. Extensive experiments on four benchmark
    datasets demonstrate that the deeply-learned features with L-softmax loss
    become more discriminative, hence significantly boosting the performance on a
    variety of visual classification and verification tasks.
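
    A simplified sketch of the L-Softmax idea for m = 2, replacing cos(theta) by
    cos(2*theta) for the target class only. The published loss additionally uses
    a piecewise monotonic extension of cos(m*theta), omitted here for brevity.

```python
import torch
import torch.nn.functional as F

def l_softmax_logits(feats, weight, labels, m=2):
    """Logits |f| * cos(theta) with the target-class entry replaced by
    |f| * cos(2*theta) = |f| * (2*cos^2(theta) - 1), enlarging the margin."""
    w_norm = F.normalize(weight, dim=1)           # unit-norm class vectors
    f_norm = feats.norm(dim=1, keepdim=True)
    cos = F.normalize(feats, dim=1) @ w_norm.t()  # cosine to each class
    logits = f_norm * cos
    target_cos = cos.gather(1, labels[:, None])
    margin = f_norm * (2 * target_cos ** 2 - 1)
    return logits.scatter(1, labels[:, None], margin)

feats = torch.randn(8, 32, requires_grad=True)
weight = torch.randn(10, 32)
labels = torch.randint(0, 10, (8,))
F.cross_entropy(l_softmax_logits(feats, weight, labels), labels).backward()
```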

    Fast Adaptation in Generative Models with Generative Matching Networks

    Sergey Bartunov, Dmitry P. Vetrov
    Comments: Submitted to ICLR 2017
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Despite recent advances, the remaining bottlenecks in deep generative models
    are the necessity of extensive training and difficulties with generalization
    from a small number of training examples. Both problems may be addressed by
    conditional generative models that are trained to adapt the generative
    distribution to additional input data. So far this idea was explored only under
    certain limitations such as restricting the input data to be a single object or
    multiple objects representing the same concept. In this work we develop a new
    class of deep generative models called generative matching networks, which is
    inspired by the recently proposed matching networks for one-shot learning in
    discriminative tasks and by ideas from meta-learning. By conditioning on the
    additional input dataset, generative matching networks may instantly learn new
    concepts that were not available during the training but conform to a similar
    generative process, without explicit limitations on the number of additional
    input objects or the number of concepts they represent. Our experiments on the
    Omniglot dataset demonstrate that generative matching networks can
    significantly improve predictive performance on the fly as more additional data
    is available to the model and also adapt the latent space which is beneficial
    in the context of feature extraction.

    Model-based Adversarial Imitation Learning

    Nir Baram, Oron Anschel, Shie Mannor
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Generative adversarial learning is a popular new approach to training
    generative models which has been proven successful for other related problems
    as well. The general idea is to maintain an oracle D that discriminates
    between the expert’s data distribution and that of the generative model G.
    The generative model is trained to capture the expert’s distribution by
    maximizing the probability of D misclassifying the data it generates.
    Overall, the system is differentiable end-to-end and is trained using
    basic backpropagation. This type of learning was successfully applied to the
    problem of policy imitation in a model-free setup. However, a model-free
    approach does not allow the system to be differentiable, which requires the use
    of high-variance gradient estimations. In this paper we introduce the
    Model-based Adversarial Imitation Learning (MAIL) algorithm, a model-based
    approach to adversarial imitation learning. We show how to use a forward
    model to make the system fully differentiable, which enables us to train
    policies using the (stochastic) gradient of D. Moreover, our approach
    requires relatively few environment interactions, and fewer hyper-parameters to
    tune. We test our method on the MuJoCo physics simulator and report initial
    results that surpass the current state-of-the-art.

    Measuring the non-asymptotic convergence of sequential Monte Carlo samplers using probabilistic programming

    Marco F. Cusumano-Towner, Vikash K. Mansinghka
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    A key limitation of sampling algorithms for approximate inference is that it
    is difficult to quantify their approximation error. Widely used sampling
    schemes, such as sequential importance sampling with resampling and
    Metropolis-Hastings, produce output samples drawn from a distribution that may
    be far from the target posterior distribution. This paper shows how to
    upper-bound the symmetric KL divergence between the output distribution of a
    broad class of sequential Monte Carlo (SMC) samplers and their target posterior
    distributions, subject to assumptions about the accuracy of a separate
    gold-standard sampler. The proposed method applies to samplers that combine
    multiple particles, multinomial resampling, and rejuvenation kernels. The
    experiments show the technique being used to estimate bounds on the divergence
    of SMC samplers for posterior inference in a Bayesian linear regression model
    and a Dirichlet process mixture model.

    Predictive Business Process Monitoring with LSTM Neural Networks

    Niek Tax, Ilya Verenich, Marcello La Rosa, Marlon Dumas
    Subjects: Applications (stat.AP); Databases (cs.DB); Learning (cs.LG); Machine Learning (stat.ML)

    Predictive business process monitoring methods exploit logs of completed
    cases of a process in order to make predictions about running cases thereof.
    Existing methods in this space are tailor-made for specific prediction tasks.
    Moreover, their relative accuracy is highly sensitive to the dataset at hand,
    thus requiring users to engage in trial-and-error and tuning when applying them
    in a specific setting. This paper investigates Long Short-Term Memory (LSTM)
    neural networks as an approach to build consistently accurate models for a wide
    range of predictive process monitoring tasks. First, we show that LSTMs
    outperform existing techniques to predict the next event of a running case and
    its timestamp. Next, we show how to use models for predicting the next task in
    order to predict the full continuation of a running case. Finally, we apply the
    same approach to predict the remaining time, and show that this approach
    outperforms existing tailor-made methods.
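
    A minimal sketch of this kind of model, under encoding assumptions of our
    own (each event as a one-hot activity vector plus one scalar time feature):
    a shared LSTM feeds two heads, one classifying the next activity and one
    regressing its timestamp. Predicting the full continuation of a case then
    amounts to feeding the prediction back in and iterating.

    import torch
    import torch.nn as nn

    class NextEventLSTM(nn.Module):
        def __init__(self, n_activities, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(n_activities + 1, hidden, batch_first=True)
            self.next_act = nn.Linear(hidden, n_activities)   # classification head
            self.next_time = nn.Linear(hidden, 1)             # regression head

        def forward(self, x):              # x: (batch, prefix_len, n_activities + 1)
            h, _ = self.lstm(x)
            last = h[:, -1]                # hidden state after the observed prefix
            return self.next_act(last), self.next_time(last)

    model = NextEventLSTM(n_activities=10)
    x = torch.randn(32, 5, 11)             # toy batch of length-5 case prefixes
    act_logits, t_hat = model(x)
    loss = nn.functional.cross_entropy(act_logits, torch.randint(10, (32,))) \
         + nn.functional.l1_loss(t_hat.squeeze(-1), torch.rand(32))
    loss.backward()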

    Statistical and Computational Guarantees of Lloyd's Algorithm and its Variants

    Yu Lu, Harrison H. Zhou
    Subjects: Statistics Theory (math.ST); Learning (cs.LG); Machine Learning (stat.ML)

    Clustering is a fundamental problem in statistics and machine learning.
    Lloyd’s algorithm, proposed in 1957, is still possibly the most widely used
    clustering algorithm in practice due to its simplicity and empirical
    performance. However, there has been little theoretical investigation on the
    statistical and computational guarantees of Lloyd’s algorithm. This paper is an
    attempt to bridge this gap between practice and theory. We investigate the
    performance of Lloyd’s algorithm on clustering sub-Gaussian mixtures. Under an
    appropriate initialization for labels or centers, we show that Lloyd’s
    algorithm converges to an exponentially small clustering error after an order
    of (log n) iterations, where (n) is the sample size. The error rate is shown
    to be minimax optimal. For the two-mixture case, we only require the
    initializer to be slightly better than a random guess.

    In addition, we extend Lloyd’s algorithm and its analysis to community
    detection and crowdsourcing, two problems that have received a lot of attention
    recently in statistics and machine learning. Two variants of Lloyd’s algorithm
    are proposed respectively for community detection and crowdsourcing. On the
    theoretical side, we provide statistical and computational guarantees of the
    two algorithms, and the results improve upon some previous signal-to-noise
    ratio conditions in the literature for both problems. Experimental results on
    simulated and real data sets demonstrate competitive performance of our
    algorithms to the state-of-the-art methods.
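
    For reference, the algorithm under analysis is the classic two-step
    iteration; a minimal NumPy version of our own, which takes the initial
    centers as an argument to reflect the paper's emphasis on initialization:

    import numpy as np

    def lloyd(X, centers, n_iter=20):
        """Classic Lloyd iteration: assign points, then recompute centers."""
        for _ in range(n_iter):
            d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            labels = d.argmin(1)                 # step 1: nearest-center labels
            for k in range(len(centers)):        # step 2: per-cluster means
                if (labels == k).any():
                    centers[k] = X[labels == k].mean(0)
        return labels, centers

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
    init = X[rng.choice(len(X), 2, replace=False)].copy()   # crude initializer
    labels, centers = lloyd(X, init)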


    Information Theory

    Robust Low-Complexity Randomized Methods for Locating Outliers in Large Matrices

    Xingguo Li, Jarvis Haupt
    Comments: 16 pages, 4 figures
    Subjects: Information Theory (cs.IT)

    This paper examines the problem of locating outlier columns in a large,
    otherwise low-rank matrix, in settings where the data are noisy or where
    the overall matrix has missing elements. We propose a randomized two-step
    inference framework, and establish sufficient conditions on the required sample
    complexities under which these methods succeed (with high probability) in
    accurately locating the outliers for each task. Comprehensive numerical
    experimental results are provided to verify the theoretical bounds and
    demonstrate the computational efficiency of the proposed algorithm.
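
    A sketch of the two-step flavor of such methods, under simplifying
    assumptions of our own (exactly low-rank inliers, known rank, no missing
    entries; the paper's estimators and guarantees are more general): estimate
    the inlier column space from a random column sample, then flag columns with
    large residual energy outside that subspace.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, r = 50, 200, 3                    # rows, columns, inlier rank
    M = rng.normal(size=(n, r)) @ rng.normal(size=(r, m))   # low-rank inliers
    M[:, [17, 120]] = rng.normal(scale=3.0, size=(n, 2))    # planted outliers

    # Step 1: estimate the inlier column space from a random column sample.
    sample = rng.choice(m, size=40, replace=False)
    U = np.linalg.svd(M[:, sample], full_matrices=False)[0][:, :r]

    # Step 2: score every column by its relative energy outside that subspace.
    resid = M - U @ (U.T @ M)
    scores = (resid ** 2).sum(0) / (M ** 2).sum(0)
    print(np.argsort(scores)[-2:])          # planted outliers should score highest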

    An Energy Efficiency Perspective on Massive MIMO Quantization

    Muris Sarajlić, Liang Liu, Ove Edfors
    Comments: To be published in Proceedings of 50th Asilomar Conference on Signals, Systems and Computers
    Subjects: Information Theory (cs.IT)

    One of the basic aspects of Massive MIMO (MaMi) that is the focus of
    current investigations is its potential for using low-cost and energy-efficient
    hardware. It is often claimed that MaMi will allow for using analog-to-digital
    converters (ADCs) with very low resolutions and that this will result in
    overall improvement of energy efficiency. In this contribution, we perform a
    parametric energy efficiency analysis of MaMi uplink for the entire base
    station receiver system with varying ADC resolutions. The analysis shows that,
    for a wide variety of system parameters, ADCs with intermediate bit
    resolutions (4 – 10 bits) are optimal in an energy-efficiency sense, and
    that using very low bit resolutions degrades energy efficiency.
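
    The shape of this tradeoff is easy to reproduce with a toy model of our own
    (illustrative constants, not the paper's power model): the achievable rate
    saturates once quantization noise falls below thermal noise, while ADC power
    grows exponentially in the resolution b, so the rate-per-watt curve peaks at
    an intermediate number of bits.

    import numpy as np

    snr = 10 ** (10.0 / 10)                    # 10 dB operating point
    b = np.arange(1, 15)                       # ADC resolution in bits

    # Toy quantization model: noise power ~ 2^(-2b) relative to the signal.
    snr_eff = snr / (1 + snr * 2.0 ** (-2 * b))
    rate = np.log2(1 + snr_eff)                # bits/s/Hz per antenna

    p_fixed = 1.0                              # rest of the receiver (arbitrary)
    p_adc = 1e-3 * 2.0 ** b                    # exponential ADC power model
    ee = rate / (p_fixed + p_adc)              # energy-efficiency proxy
    print(b[np.argmax(ee)])                    # peaks at an intermediate resolution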

    On Counting Subring-Subcodes of Free Linear Codes Over Finite Principal Ideal Rings

    Ramakrishna Bandi, Alexandre Fotue Tabue, Edgar Martínez-Moro
    Subjects: Information Theory (cs.IT)

    Let (R) be a finite principal ideal ring and (S) the Galois extension of (R)
    of degree (m). For positive integers (k) and (k_0), we determine the number
    of free (S)-linear codes (B) of length (l) with the property (k = rank_S(B))
    and (k_0 = rank_R(B cap R^l)). This corrects an erroneous result previously
    given for the case of finite fields.

    Full diversity sets of unitary matrices from orthogonal sets of idempotents

    Ted Hurley
    Comments: arXiv admin note: text overlap with arXiv:1205.0703
    Subjects: Information Theory (cs.IT)

    Orthogonal sets of idempotents are used to design sets of unitary matrices,
    known as constellations, such that the modulus of the determinant of the
    difference of any two distinct elements is greater than (0). It is shown
    that unitary matrices in general are derived from orthogonal sets of
    idempotents, reducing the design problem to one of constructing unitary
    matrices from such sets. The quality of the constellations constructed in
    this way and the
    actual differences between the unitary matrices can be determined algebraically
    from the idempotents used.

    This has applications to the design of unitary space time constellations.
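
    The design criterion is concrete: a constellation {A_i} of unitary matrices
    has full diversity when det(A_i - A_j) is nonzero for all distinct pairs,
    and its quality is measured by the worst-case modulus of these
    determinants. A small numerical check of our own, on a cyclic-group
    constellation rather than the paper's idempotent construction:

    import numpy as np
    from itertools import combinations

    def min_det_distance(mats):
        """Worst-case |det(A - B)| over distinct pairs; > 0 means full diversity."""
        return min(abs(np.linalg.det(A - B)) for A, B in combinations(mats, 2))

    # Toy constellation: powers of a diagonal unitary generator (cyclic design).
    L = 5
    g = np.diag(np.exp(2j * np.pi * np.array([1, 2]) / L))
    constellation = [np.linalg.matrix_power(g, k) for k in range(L)]
    print(min_det_distance(constellation))   # strictly positive: full diversity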

    Efficient use of paired spectrum bands through TDD small cell deployments

    A. Agustin, S. Lagen, J. Vidal, O. Muñoz, A. Pascual-Iserte, G. Zhiheng, W.Ronghui
    Comments: submitted to IEEE Communications Magazine
    Subjects: Information Theory (cs.IT)

    Traditionally, wireless cellular systems have been designed to operate in
    Frequency Division Duplexing (FDD) paired bands that allocate the same
    amount of spectrum to both downlink (DL) and uplink (UL) communication.
    Such a design is very convenient under symmetric DL/UL traffic conditions,
    as used to be the case when voice transmission was the predominant service.
    However, with
    the overwhelming advent of data services, bringing along large asymmetries
    between DL and UL, the conventional FDD solution becomes inefficient. In this
    regard, flexible duplexing concepts aim to derive procedures for improving the
    spectrum utilization, by adjusting resources to the actual traffic demand. In
    this work we review these concepts and propose the use of unpaired Time
    Division Duplexing (TDD) spectrum on the unused resources for small eNBs
    (SeNB), so that user equipment (UEs) associated with those SeNBs could be
    served either in DL or UL. This proposal alleviates the saturated DL of the
    FDD-based system through user offloading towards the TDD-based system. The
    flexible duplexing concept is analyzed from three points of view: a)
    regulation, b) Long Term Evolution (LTE) standardization, and c) technical
    solutions.

    EMC Regulations and Spectral Constraints for Multicarrier Modulation in PLC

    Mauro Girotto, Andrea M. Tonello
    Comments: A version of this manuscript has been submitted to the IEEE Access for possible publication
    Subjects: Information Theory (cs.IT)

    This paper considers Electromagnetic Compatibility (EMC) aspects in the
    context of Power Line Communication (PLC) systems. It offers a complete
    overview of both narrowband PLC and broadband PLC EMC norms. How to
    interpret and translate such norms and measurement procedures into the
    typical constraints used by designers of communication systems is
    discussed. In particular, the constraints on the modulated signal spectrum
    are considered, and the ability of pulse-shaped OFDM (PS-OFDM), used in
    most PLC standards such as IEEE P1901 and P1901.2, to fulfill them is
    analyzed. In addition, aiming to improve the
    spectrum management ability, a novel scheme named Pulse Shaped Cyclic Block
    Filtered Multitone modulation (PS-CB-FMT) is introduced and compared to
    PS-OFDM. It is shown that PS-CB-FMT offers a better ability to fulfill the
    norms, which translates into higher system capacity.
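
    The effect at stake can be illustrated numerically (a toy sketch of our
    own; the tone plan, window and measurement are invented rather than taken
    from the norms): windowing the OFDM symbol lowers out-of-band sidelobes,
    which determines how deep a spectral notch the modulation can honor.

    import numpy as np

    N, cp, n_sym = 256, 32, 200
    rng = np.random.default_rng(0)
    active = np.r_[10:60, 80:120]            # hypothetical plan: notch at 60..79

    def ofdm_psd(win):
        """Average periodogram of windowed, CP-extended OFDM symbols."""
        acc = np.zeros(4 * N)
        for _ in range(n_sym):
            X = np.zeros(N, complex)
            X[active] = rng.choice([1, -1], active.size) \
                      + 1j * rng.choice([1, -1], active.size)
            x = np.fft.ifft(X)
            x = np.concatenate([x[-cp:], x]) * win   # cyclic prefix + shaping
            acc += np.abs(np.fft.fft(x, 4 * N)) ** 2
        return 10 * np.log10(acc / n_sym + 1e-12)

    rect = np.ones(N + cp)                   # plain OFDM (rectangular window)
    hann = np.hanning(N + cp)                # crude stand-in for pulse shaping
    # PSD at the center of the notch: windowing suppresses the leakage.
    print(ofdm_psd(rect)[4 * 70], ofdm_psd(hann)[4 * 70])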

    A Unified Linear Precoding Design for Multi-user MIMO Systems

    Md. Abdul Latif Sarker
    Comments: 4
    Subjects: Information Theory (cs.IT)

    We address the problem of the bit error rate (BER) performance gap between
    sub-optimal and optimal linear precoders (LPs) for multi-user (MU)
    multiple-input multiple-output (MIMO) broadcast systems. In particular,
    mobile users suffer a noise enhancement effect under a sub-optimal LP, an
    effect that can be suppressed by an optimal LP matrix. A sub-optimal LP
    matrix such as the linear zero-forcing (LZF) precoder performs well only in
    the high signal-to-noise ratio (SNR) regime; in contrast, an optimal
    precoder, for instance the linear minimum mean-square-error (LMMSE)
    precoder, performs well in both low and high SNR regimes. Either precoder,
    used on its own in an MU MIMO system, exhibits a BER gap of at least 0.1.
    Thus, we propose and design a unified linear precoding (ULP) matrix, using
    a precoder selection technique that combines the sub-optimal and optimal LP
    matrices, to ensure a zero BER performance gap in MU MIMO systems. The
    numerical results show that the proposed ULP technique offers significant
    performance in both low and high SNR scenarios.
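
    The two baseline precoders are standard; a NumPy sketch in our own
    notation, with an SNR threshold as one possible selection rule (the paper's
    actual selection technique may differ):

    import numpy as np

    def zf_precoder(H):
        """Linear zero-forcing: W = H^H (H H^H)^(-1), power-normalized."""
        W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
        return W / np.linalg.norm(W)

    def mmse_precoder(H, snr):
        """LMMSE-style (regularized) precoding with diagonal loading K/snr."""
        K = H.shape[0]
        W = H.conj().T @ np.linalg.inv(H @ H.conj().T + (K / snr) * np.eye(K))
        return W / np.linalg.norm(W)

    def unified_precoder(H, snr, threshold=10.0):
        """Toy selection rule: LZF in the high-SNR regime, LMMSE otherwise."""
        return zf_precoder(H) if snr > threshold else mmse_precoder(H, snr)

    rng = np.random.default_rng(0)
    H = (rng.normal(size=(4, 8)) + 1j * rng.normal(size=(4, 8))) / np.sqrt(2)
    W = unified_precoder(H, snr=5.0)        # 8 tx antennas, 4 single-antenna users
    print(np.round(np.abs(H @ W), 2))       # near-diagonal: interference suppressed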

    Rate-cost tradeoffs in control. Part II: achievable scheme

    Victoria Kostina, Babak Hassibi
    Subjects: Information Theory (cs.IT); Systems and Control (cs.SY)

    Consider a distributed control problem with a communication channel
    connecting the observer of a linear stochastic system to the controller. The
    goal of the controller is to minimize a quadratic cost function in the state
    variables and control signal, known as the linear quadratic regulator (LQR). We
    study the fundamental tradeoff between the communication rate r bits/sec and
    the limsup of the expected cost b. In the companion paper, which can be read
    independently of the current one, we show a lower bound on a certain cost
    function, which quantifies the minimum mutual information between the channel
    input and output, given the past, that is compatible with a target LQR cost.
    The bound applies as long as the system noise has a probability density
    function, and it holds for a general class of codes that can take full
    advantage of the memory of the data observed so far and that are not
    constrained to have any particular structure. In this paper, we prove that the
    bound can be approached by a simple variable-length lattice quantization
    scheme, as long as the system noise satisfies a smoothness condition. The
    quantization scheme only quantizes the innovation, that is, the difference
    between the controller’s belief about the current state and the encoder’s state
    estimate. Our proof technique leverages some recent results on
    non-asymptotic high-resolution vector quantization.
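
    A scalar toy version of the scheme's central idea, of our own making (the
    paper uses variable-length lattice quantization with explicit guarantees):
    only the innovation, the gap between the encoder's state estimate and the
    controller's belief, crosses the channel, so the quantizer works on a
    small, roughly stationary quantity even though the open-loop plant is
    unstable.

    import numpy as np

    a, T = 1.5, 1000                 # unstable scalar plant: x' = a x + u + w
    rng = np.random.default_rng(0)
    q = lambda v, step: step * np.round(v / step)    # uniform scalar quantizer

    x, belief = 0.0, 0.0             # true state; controller's shared belief
    costs = []
    for _ in range(T):
        innovation = x - belief              # only this is quantized and sent
        belief += q(innovation, step=0.5)
        u = -a * belief                      # certainty-equivalent control
        costs.append(x ** 2 + 0.1 * u ** 2)
        x = a * x + u + rng.normal()
        belief = a * belief + u              # both ends run the same dynamics
    print(np.mean(costs))                    # quadratic cost stays bounded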

    Rate-cost tradeoffs in control. Part I: lower bounds

    Victoria Kostina, Babak Hassibi
    Subjects: Information Theory (cs.IT); Systems and Control (cs.SY)

    Consider a distributed control problem with a communication channel
    connecting the observer of a linear stochastic system to the controller. The
    goal of the controller is to minimize a quadratic cost function in the state
    variables and control signal, known as the linear quadratic regulator (LQR). We
    study the fundamental tradeoff between the communication rate r bits/sec and
    the limsup of the expected cost b. We obtain a lower bound on a certain cost
    function, which quantifies the minimum mutual information between the channel
    input and output, given the past, that is compatible with a target LQR cost.
    The rate-cost function has operational significance in multiple scenarios of
    interest: among others, it allows us to lower bound the minimum communication
    rate for fixed and variable length quantization, and for control over a noisy
    channel. Our results extend and generalize an earlier explicit expression,
    due to Tatikonda et al., for the scalar Gaussian case to the vector,
    non-Gaussian,
    and partially observed one. The bound applies as long as the system noise has a
    probability density function. Apart from standard dynamic programming
    arguments, our proof technique leverages the Shannon lower bound on the
    rate-distortion function and proposes new estimates for information measures of
    linear combinations of random vectors.
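
    For reference, the classical Shannon lower bound invoked here states that,
    for an (n)-dimensional source (X) with a density and mean-square distortion
    (D),

    R(D) \ge h(X) - \frac{n}{2} \log_2\!\left( \frac{2 \pi e D}{n} \right),

    so an estimate of the differential entropy of the relevant state variables
    immediately yields a rate bound; the new estimates for information measures
    of linear combinations of random vectors supply exactly such entropy terms.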

    Fountain Code-Inspired Channel Estimation for Multi-user Millimeter Wave MIMO Systems

    Matthew Kokshoorn, He Chen, Yonghui Li, Branka Vucetic
    Comments: Submitted for publication
    Subjects: Information Theory (cs.IT)

    This paper develops a novel channel estimation approach for multi-user
    millimeter wave (mmWave) wireless systems with large antenna arrays. By
    exploiting the inherent mmWave channel sparsity, we propose a novel
    simultaneous-estimation with iterative fountain training (SWIFT) framework, in
    which the average number of channel measurements is adapted to various channel
    conditions. To this end, the base station (BS) and each user continue to
    measure the channel with a random subset of transmit/receive beamforming
    directions until the channel estimate converges. We formulate the channel
    estimation process as a compressed sensing problem and apply a sparse
    estimation approach to recover the virtual channel information. As SWIFT does
    not adapt the BS’s transmitting beams to any single user, we are able to
    estimate all user channels simultaneously. Simulation results show that SWIFT
    can significantly outperform existing random-beamforming based approaches that
    use a fixed number of measurements, over a range of signal-to-noise ratios and
    channel coherence times.
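
    The sparse-recovery step that SWIFT builds on can be sketched as follows,
    under simplifications of our own (an on-grid virtual channel and Gaussian
    measurement rows standing in for random transmit/receive beam pairs),
    solved here with orthogonal matching pursuit; the adaptivity then amounts
    to growing the number of measurements until the estimate stops changing.

    import numpy as np

    def omp(A, y, k):
        """Orthogonal matching pursuit: recover a k-sparse x from y = A x + n."""
        resid, support = y.copy(), []
        for _ in range(k):
            support.append(int(np.argmax(np.abs(A.conj().T @ resid))))
            coef = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
            resid = y - A[:, support] @ coef
        x = np.zeros(A.shape[1], complex)
        x[support] = coef
        return x

    rng = np.random.default_rng(0)
    n_grid, n_paths, n_meas = 64, 3, 25     # virtual angles, paths, measurements
    h = np.zeros(n_grid, complex)
    h[rng.choice(n_grid, n_paths, replace=False)] = (
        rng.normal(size=n_paths) + 1j * rng.normal(size=n_paths))
    A = rng.normal(size=(n_meas, n_grid)) + 1j * rng.normal(size=(n_meas, n_grid))
    y = A @ h + 0.01 * (rng.normal(size=n_meas) + 1j * rng.normal(size=n_meas))
    print(np.linalg.norm(omp(A, y, n_paths) - h))   # small recovery error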



