    arXiv Paper Daily: Thu, 13 Oct 2016

    Posted by 我爱机器学习 (52ml.net) on 2016-10-13 00:00:00

    Neural and Evolutionary Computing

    RetiNet: Automatic AMD identification in OCT volumetric data

    Stefanos Apostolopoulos, Carlos Ciller, Sandro I. De Zanet, Sebastian Wolf, Raphael Sznitman
    Comments: 14 pages, 10 figures, Code available
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Optical Coherence Tomography (OCT) provides a unique ability to image the eye
    retina in 3D at micrometer resolution and gives ophthalmologists the ability to
    visualize retinal diseases such as Age-Related Macular Degeneration (AMD).
    While visual inspection of OCT volumes remains the main method for AMD
    identification, doing so is time consuming as each cross-section within the
    volume must be inspected individually by the clinician. In much the same way,
    acquiring ground truth information for each cross-section is expensive and time
    consuming. This fact heavily limits the ability to acquire large amounts of
    ground truth, which subsequently impacts the performance of learning-based
    methods geared at automatic pathology identification. To avoid this burden, we
    propose a novel strategy for automatic analysis of OCT volumes where only
    volume labels are needed. That is, we train a classifier in a semi-supervised
    manner to conduct this task. Our approach uses a novel Convolutional Neural
    Network (CNN) architecture that only needs volume-level labels to be trained
    to automatically assess whether an OCT volume is healthy or contains AMD. Our
    architecture involves first learning a cross-section pathology classifier using
    pseudo-labels that could be corrupted and then leveraging these towards a more
    accurate volume-level classification. We then show that our approach provides
    excellent performance on a publicly available dataset and outperforms a number
    of existing automatic techniques.

    Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs

    Chao Li, Yi Yang, Min Feng, Srimat Chakradhar, Huiyang Zhou
    Comments: Published as a conference paper at the International Conference on High Performance Computing, Networking, Storage, and Analysis (SC’16), 2016
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Performance (cs.PF)

    Leveraging large data sets, deep Convolutional Neural Networks (CNNs) achieve
    state-of-the-art recognition accuracy. Due to the substantial compute and
    memory operations, however, they require significant execution time. The
    massive parallel computing capability of GPUs makes them one of the ideal
    platforms to accelerate CNNs, and a number of GPU-based CNN libraries have been
    developed. While existing works mainly focus on the computational efficiency of
    CNNs, the memory efficiency of CNNs has been largely overlooked. Yet CNNs have
    intricate data structures and their memory behavior can have significant impact
    on the performance. In this work, we study the memory efficiency of various CNN
    layers and reveal the performance implication from both data layouts and memory
    access patterns. Experiments show the universal effect of our proposed
    optimizations on both single layers and various networks, with up to 27.9x for
    a single layer and up to 5.6x on the whole networks.
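
    The interaction between data layout and memory access pattern mentioned in
    the abstract can be illustrated with a small sketch. The layout names NCHW
    and NHWC are an assumption here (the abstract does not name specific
    layouts), and the function names are hypothetical. The sketch only computes
    flat offsets and shows how far apart two neighbouring channels of the same
    pixel end up in memory under each layout.

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) when width varies fastest (NCHW)."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    """Flat offset of the same element when channel varies fastest (NHWC)."""
    return ((n * H + h) * W + w) * C + c


def channel_stride(offset_fn, C, H, W):
    """Distance, in elements, between channel 0 and channel 1 of one pixel."""
    return offset_fn(0, 1, 0, 0, C, H, W) - offset_fn(0, 0, 0, 0, C, H, W)


if __name__ == "__main__":
    C, H, W = 3, 4, 5  # illustrative tensor shape, not taken from the paper
    print("NCHW channel stride:", channel_stride(offset_nchw, C, H, W))
    print("NHWC channel stride:", channel_stride(offset_nhwc, C, H, W))
```

    A kernel that reads all channels of one pixel touches adjacent elements in
    the second layout but elements H*W apart in the first, which is the kind of
    effect a memory-efficiency study would need to account for.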


    Computer Vision and Pattern Recognition

    Video Depth-From-Defocus

    Hyeongwoo Kim, Christian Richardt, Christian Theobalt
    Comments: 13 pages, supplemental document included as appendix, 3DV 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Many compelling video post-processing effects, in particular aesthetic focus
    editing and refocusing effects, are feasible if per-frame depth information is
    available. Existing computational methods to capture RGB and depth either
    purposefully modify the optics (coded aperture, light-field imaging), or employ
    active RGB-D cameras. Since these methods are less practical for users with
    normal cameras, we present an algorithm to capture all-in-focus RGB-D video of
    dynamic scenes with an unmodified commodity video camera. Our algorithm turns
    the often unwanted defocus blur into a valuable signal. The input to our method
    is a video in which the focus plane is continuously moving back and forth
    during capture, and thus defocus blur is provoked and strongly visible. This
    can be achieved by manually turning the focus ring of the lens during
    recording. The core algorithmic ingredient is a new video-based
    depth-from-defocus algorithm that computes space-time-coherent depth maps,
    deblurred all-in-focus video, and the focus distance for each frame. We
    extensively evaluate our approach, and show that it enables compelling video
    post-processing effects, such as different types of refocusing.

    Deep disentangled representations for volumetric reconstruction

    Edward Grant, Pushmeet Kohli, Marcel van Gerven
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce a convolutional neural network for inferring a compact
    disentangled graphical description of objects from 2D images that can be used
    for volumetric reconstruction. The network comprises an encoder and a
    twin-tailed decoder. The encoder generates a disentangled graphics code. The
    first decoder generates a volume, and the second decoder reconstructs the input
    image using a novel training regime that allows the graphics code to learn a
    separate representation of the 3D object and a description of its lighting and
    pose conditions. We demonstrate this method by generating volumes and
    disentangled graphical descriptions from images and videos of faces and chairs.

    Generating captions without looking beyond objects

    Hendrik Heuer, Christof Monz, Arnold W.M. Smeulders
    Comments: This paper is accepted to the ECCV2016 2nd Workshop on Storytelling with Images and Videos (VisStory)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)

    This paper explores new evaluation perspectives for image captioning and
    introduces a noun translation task that achieves comparable image caption
    generation performance by translating from a set of nouns to captions. This
    implies that in image captioning, all word categories other than nouns can be
    evoked by a powerful language model without sacrificing performance on the
    precision-oriented metric BLEU. The paper also investigates lower and upper
    bounds of how much individual word categories in the captions contribute to the
    final BLEU score. A large possible improvement exists for nouns, verbs, and
    prepositions.

    Light Field Compression with Disparity Guided Sparse Coding based on Structural Key Views

    Jie Chen, Junhui Hou, Lap-Pui Chau
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recent imaging technologies are rapidly evolving for sampling richer and more
    immersive representations of the 3D world. One of the emerging technologies
    is the light field (LF) camera based on micro-lens arrays. To record the
    directional information of the light rays, a much larger storage space and
    transmission bandwidth are required by a LF image as compared to a conventional
    2D image of similar spatial dimension, and the compression of LF data becomes a
    vital part of its application.

    In this paper, we propose a LF codec that fully exploits the intrinsic
    geometry between the LF sub-views by first approximating the LF with disparity
    guided sparse coding over a perspective shifted light field dictionary. The
    sparse coding is based on only several optimized Structural Key Views (SKV);
    however, the entire LF can be recovered from the coding coefficients. By keeping
    the approximation identical between encoder and decoder, only the sparse coding
    residual and the SKVs need to be transmitted. An optimized SKV selection
    method is proposed such that most LF spatial information could be preserved.
    To achieve optimum dictionary efficiency, the LF is divided into several
    Coding Regions (CR), over which the reconstruction works individually.
    Experiments and comparisons carried out over a benchmark LF dataset
    show that the proposed SC-SKV codec produces state-of-the-art compression
    results in terms of rate-distortion performance and visual quality compared
    with High Efficiency Video Coding (HEVC): with 37.79% BD rate reduction and
    0.92 dB BD-PSNR improvement achieved on average, especially with up to 4 dB
    improvement for low bit rate scenarios.

    Multi-Task Curriculum Transfer Deep Learning of Clothing Attributes

    Qi Dong, Shaogang Gong, Xiatian Zhu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recognising detailed clothing characteristics (fine-grained attributes) in
    unconstrained images of people in-the-wild is a challenging task for computer
    vision, especially when there is only limited training data from the wild
    whilst most data available for model learning are captured in well-controlled
    environments using fashion models (well lit, no background clutter, frontal
    view, high-resolution). In this work, we develop a deep learning framework
    capable of model transfer learning from well-controlled shop clothing images
    collected from web retailers to in-the-wild images from the street.
    Specifically, we formulate a novel Multi-Task Curriculum Transfer (MTCT) deep
    learning method to explore multiple sources of different types of web
    annotations with multi-labelled fine-grained attributes. Our multi-task loss
    function is designed to extract more discriminative representations in training
    by jointly learning all attributes, and our curriculum strategy exploits the
    staged easy-to-complex transfer learning motivated by cognitive studies. We
    demonstrate the advantages of the MTCT model over the state-of-the-art methods
    on the X-Domain benchmark, a large scale clothing attribute dataset. Moreover,
    we show that the MTCT model has a notable advantage over contemporary models
    when the training data size is small.

    Image Based Camera Localization: an Overview

    Yihong Wu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recently, virtual reality, augmented reality, robotics, self-driving cars, and
    related fields have attracted much attention from industry, and image based
    camera localization is a key task in all of them. An overview of image based
    camera localization is therefore urgently needed. This paper presents such an
    overview, which should be useful not only to researchers but also to
    engineers.

    Analyzing the Affect of a Group of People Using Multi-modal Framework

    Xiaohua Huang, Abhinav Dhall, Xin Liu, Guoying Zhao, Jingang Shi, Roland Goecke, Matti Pietikainen
    Comments: Submitted to IEEE Transactions on Cybernetics
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Millions of images on the web enable us to explore images from social events
    such as a family party, thus it is of interest to understand and model the
    affect exhibited by a group of people in images. But analysis of the affect
    expressed by multiple people is challenging due to varied indoor and outdoor
    settings, and interactions taking place between various numbers of people. A
    few existing works on Group-level Emotion Recognition (GER) have investigated
    face-level information. Due to the challenging environments, the face may not
    provide enough information for GER. Relatively few studies have investigated
    multi-modal GER. Therefore, we propose a novel multi-modal approach based on a
    new feature description for understanding the emotional state of a group of people
    in an image. In this paper, we first exploit three kinds of rich information,
    namely face, upper body and scene, in a group-level image. Furthermore, in
    order to integrate information from multiple people in a group-level image, we
    propose an information aggregation method to generate three features for face,
    upper body and scene, respectively. We fuse face, upper body and scene
    information for robustness of GER against the challenging environments.
    Intensive experiments are performed on two challenging group-level emotion
    databases to investigate the role of face, upper body and scene as well as
    the multi-modal framework. Experimental results demonstrate that our framework
    achieves very promising performance for GER.

    Fast Training of Convolutional Neural Networks via Kernel Rescaling

    Pedro Porto Buarque de Gusmão, Gianluca Francini, Skjalg Lepsøy, Enrico Magli
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Training deep Convolutional Neural Networks (CNN) is a time consuming task
    that may take weeks to complete. In this article we propose a novel,
    theoretically founded method for reducing CNN training time without incurring
    any loss in accuracy. The basic idea is to begin by pre-training a
    network with lower-resolution kernels and input images, and then refine the
    results at the full resolution by exploiting the spatial scaling property of
    convolutions. We apply our method to the ImageNet winner OverFeat and to the
    more recent ResNet architecture and show a reduction in training time of nearly
    20% while test set accuracy is preserved in both cases.

    The Virtual Electromagnetic Interaction between Digital Images for Image Matching with Shifting Transformation

    Xiaodong Zhuang, N. E. Mastorakis
    Comments: 17 pages, 39 figures. arXiv admin note: substantial text overlap with arXiv:1610.02762
    Journal-ref: WSEAS Transactions on Computers, pp. 107-123, Volume 14, 2015
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A novel way of matching two images with shifting transformation is studied.
    The approach is based on the presentation of the virtual edge current in
    images, and also the study of virtual electromagnetic interaction between two
    related images inspired by electromagnetism. The edge current in images is
    proposed as a discrete simulation of the physical current, which is based on
    the significant edge line extracted by Canny-like edge detection. Then the
    virtual interaction of the edge currents between related images is studied by
    imitating the electro-magnetic interaction between current-carrying wires.
    Based on the virtual interaction force between two related images, a novel
    method is presented and applied in image matching for shifting transformation.
    The preliminary experimental results indicate the effectiveness of the proposed
    method.

    A Model of Virtual Carrier Immigration in Digital Images for Region Segmentation

    Xiaodong Zhuang, N. E. Mastorakis
    Comments: 11 pages, 17 figures. arXiv admin note: text overlap with arXiv:1610.02760
    Journal-ref: WSEAS TRANSACTIONS on COMPUTERS, pp. 708-718, Volume 14, 2015
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A novel model for image segmentation is proposed, which is inspired by the
    carrier immigration mechanism in a physical P-N junction. The carrier diffusing
    and drifting are simulated in the proposed model, which imitates the physical
    self-balancing mechanism in a P-N junction. The effect of virtual carrier
    immigration in digital images is analyzed and studied by experiments on test
    images and real world images. The sign distribution of net carrier at the
    model’s balance state is exploited for region segmentation. The experimental
    results for both test images and real-world images demonstrate self-adaptive
    and meaningful gathering of pixels to suitable regions, which proves the
    effectiveness of the proposed method for image region segmentation.

    The Analysis of Local Motion and Deformation in Image Sequences Inspired by Physical Electromagnetic Interaction

    Xiaodong Zhuang, N. E. Mastorakis
    Comments: 15 pages, 23 figures. arXiv admin note: substantial text overlap with arXiv:1610.02762
    Journal-ref: WSEAS TRANSACTIONS on COMPUTERS, pp. 231-245, Volume 14, 2015
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    To analyze the motion and deformation of objects in image
    sequences, a novel way is presented to analyze the local changes of object edges
    between two related images (such as two adjacent frames in a video sequence),
    which is inspired by the physical electromagnetic interaction. The changes of
    edge between adjacent frames in sequences are analyzed by simulation of virtual
    current interaction, which can reflect the change of the object’s position or
    shape. The virtual current along the main edge line is proposed based on the
    significant edge extraction. Then the virtual interaction between the current
    elements in the two related images is studied by imitating the interaction
    between physical current-carrying wires. The experimental results prove that
    the distribution of magnetic forces on the current elements in one image
    applied by the other can reflect the local change of edge lines from one image
    to the other, which is important in further analysis.

    Subspace clustering based on low rank representation and weighted nuclear norm minimization

    Yu Song, Yiquan Wu, Yimian Dai
    Comments: 17 pages, 3 figures, 5 tables. This paper is also submitted to the journal ‘Pattern Recognition’. arXiv admin note: substantial text overlap with arXiv:1203.1005 by other authors
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Subspace clustering refers to the problem of segmenting a set of data points
    approximately drawn from a union of multiple linear subspaces. Aiming at the
    subspace clustering problem, various subspace clustering algorithms have been
    proposed and low rank representation based subspace clustering is a very
    promising and efficient subspace clustering algorithm. Low rank representation
    method seeks the lowest rank representation among all the candidates that can
    represent the data points as linear combinations of the bases in a given
    dictionary. Nuclear norm minimization is adopted to minimize the rank of the
    representation matrix. However, nuclear norm is not a very good approximation
    of the rank of a matrix and the representation matrix thus obtained can be of
    high rank which will affect the final clustering accuracy. Weighted nuclear
    norm (WNN) is a better approximation of the rank of a matrix and WNN is adopted
    in this paper to describe the rank of the representation matrix. The convex
    program is solved via the conventional alternating direction method of multipliers
    (ADMM) and the linearized alternating direction method of multipliers (LADMM); the
    resulting methods are referred to as WNNM-LRR and WNNM-LRR(L), respectively. Experimental
    results show that, compared with low rank representation method and several
    other state-of-the-art subspace clustering methods, WNNM-LRR and WNNM-LRR(L)
    can get higher clustering accuracy.

    Recursive Diffeomorphism-Based Regression for Shape Functions

    Jieren Xu, Haizhao Yang, Ingrid Daubechies
    Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST)

    This paper proposes a recursive diffeomorphism based regression method for
    one-dimensional generalized mode decomposition problem that aims at extracting
    generalized modes $\alpha_k(t)s_k(2\pi N_k\phi_k(t))$ from their superposition
    $\sum_{k=1}^K \alpha_k(t)s_k(2\pi N_k\phi_k(t))$. First, a one-dimensional
    synchrosqueezed transform is applied to estimate instantaneous information,
    e.g., $\alpha_k(t)$ and $N_k\phi_k(t)$. Second, a novel approach based on
    diffeomorphisms and nonparametric regression is proposed to estimate wave shape
    functions $s_k(t)$. These two methods lead to a framework for the generalized
    mode decomposition problem under a weak well-separation condition. Numerical
    examples of synthetic and real data are provided to demonstrate the fruitful
    applications of these methods.
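
    To make the signal model in the abstract concrete, the following sketch
    evaluates a superposition of generalized modes, each of the form
    amplitude(t) * shape(2*pi*N*phase(t)). It only synthesizes the signal; it
    does not implement the synchrosqueezed transform or the regression step.
    The function names, the tuple layout, and the example modes are
    hypothetical, and the shape functions are assumed to be 2*pi-periodic.

```python
import math


def generalized_mode(t, alpha, shape, N, phi):
    """Evaluate one mode alpha(t) * shape(2*pi*N*phi(t)) at time t."""
    return alpha(t) * shape(2.0 * math.pi * N * phi(t))


def superposition(t, modes):
    """Sum of generalized modes; each mode is an (alpha, shape, N, phi) tuple."""
    return sum(generalized_mode(t, alpha, shape, N, phi)
               for alpha, shape, N, phi in modes)


# Two illustrative modes: a unit-amplitude sine and a cosine of amplitude 2.
example_modes = [
    (lambda t: 1.0, math.sin, 1, lambda t: t),
    (lambda t: 2.0, math.cos, 2, lambda t: t),
]

if __name__ == "__main__":
    for t in (0.0, 0.125, 0.25):
        print(t, superposition(t, example_modes))
```

    The decomposition problem described above is the inverse of this function:
    recovering each amplitude, phase and shape from samples of the sum alone.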

    Deep Fruit Detection in Orchards

    Suchet Bargoti, James Underwood
    Comments: Submitted to the IEEE International Conference on Robotics and Automation 2017
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    An accurate and reliable image based fruit detection system is critical for
    supporting higher level agriculture tasks such as yield mapping and robotic
    harvesting. This paper presents the use of a state-of-the-art object detection
    framework, Faster R-CNN, in the context of fruit detection in orchards,
    including mangoes, almonds and apples. Ablation studies are presented to better
    understand the practical deployment of the detection network, including how
    much training data is required to capture variability in the dataset. Data
    augmentation techniques are shown to yield significant performance gains,
    resulting in a greater than two-fold reduction in the number of training images
    required. In contrast, transferring knowledge between orchards contributed to
    negligible performance gain over initialising the Deep Convolutional Neural
    Network directly from ImageNet features. Finally, to operate over orchard data
    containing between 100 and 1000 fruit per image, a tiling approach is introduced
    for the Faster R-CNN framework. The study has resulted in the best yet
    detection performance for these orchards relative to previous works, with an
    F1-score of >0.9 achieved for apples and mangoes.
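
    The tiling idea mentioned in the abstract can be sketched as splitting a
    large image into overlapping windows that a detector processes one at a
    time. The tile size, the overlap, and the function names below are
    assumptions made for illustration; the abstract does not give these
    details, and merging detections across tile borders is left out.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles of size `tile` covering [0, length)."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]  # a single (possibly smaller) tile covers everything
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile sits flush with the border
    return starts


def tile_boxes(width, height, tile=400, overlap=100):
    """Return (x0, y0, x1, y1) boxes, row by row, that cover the image."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in tile_starts(height, tile, overlap)
            for x in tile_starts(width, tile, overlap)]


if __name__ == "__main__":
    for box in tile_boxes(1000, 500):
        print(box)
```

    The overlap exists so that a fruit cut by one tile border is fully visible
    in a neighbouring tile; a full pipeline would then need to de-duplicate
    detections that appear in both.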

    DOTmark – A Benchmark for Discrete Optimal Transport

    Jörn Schrieber, Dominic Schuhmacher, Carsten Gottschlich
    Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV)

    The Wasserstein metric or earth mover’s distance (EMD) is a useful tool in
    statistics, machine learning and computer science with many applications to
    biological or medical imaging, among others. Especially in the light of
    increasingly complex data, the computation of these distances via optimal
    transport is often the limiting factor. Inspired by this challenge, a variety
    of new approaches to optimal transport has been proposed in recent years and
    along with these new methods comes the need for a meaningful comparison.

    In this paper, we introduce a benchmark for discrete optimal transport,
    called DOTmark, which is designed to serve as a neutral collection of problems,
    where discrete optimal transport methods can be tested, compared to one
    another, and brought to their limits on large-scale instances. It consists of a
    variety of grayscale images, in various resolutions and classes, such as
    several types of randomly generated images, classical test images and real data
    from microscopy.

    Along with the DOTmark we present a survey and a performance test for a cross
    section of established methods ranging from more traditional algorithms, such
    as the transportation simplex, to recently developed approaches, such as the
    shielding neighborhood method, and including also a comparison with commercial
    solvers.
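
    As a small illustration of the earth mover's distance discussed above, the
    one-dimensional case has a closed form: for two histograms with equal total
    mass on the same unit-spaced bins, the distance equals the sum of absolute
    differences of their cumulative sums. The sketch below covers only this
    special case and is not one of the benchmarked methods; the
    two-dimensional image instances in the benchmark need general optimal
    transport solvers. The function name is hypothetical.

```python
from itertools import accumulate


def emd_1d(a, b):
    """Earth mover's distance between two 1D histograms on unit-spaced bins."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(a) - sum(b)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # The running sum of (a - b) is the mass that must cross each bin border.
    differences = [x - y for x, y in zip(a, b)]
    return sum(abs(flow) for flow in accumulate(differences))


if __name__ == "__main__":
    print(emd_1d([1, 0, 0], [0, 0, 1]))
    print(emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5]))
```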


    Artificial Intelligence

    Detecting Unseen Falls from Wearable Devices using Channel-wise Ensemble of Autoencoders

    Shehroz S. Khan, Babak Taati
    Comments: 8 pages, 2 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    A fall is an abnormal activity that occurs rarely, so it is hard to collect
    real data for falls. It is, therefore, difficult to use supervised learning
    methods to automatically detect falls. Another challenge in using machine
    learning methods to automatically detect falls is the choice of features. In
    this paper, we propose to use an ensemble of autoencoders to extract features
    from different channels of wearable sensor data trained only on normal
    activities. We show that choosing a threshold as maximum of the reconstruction
    error on the training normal data is not the right way to identify unseen
    falls. We propose two methods for automatic tightening of reconstruction error
    from only the normal activities for better identification of unseen falls. We
    present our results on two activity recognition datasets and show the efficacy
    of our proposed method against traditional autoencoder models and two standard
    one-class classification methods.
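
    The thresholding issue raised in the abstract can be sketched as follows.
    The first function is the baseline the abstract argues against: the
    maximum reconstruction error seen on normal training data. The second is
    only an illustrative stand-in for "tightening" (it discards training
    errors above mean + k standard deviations before taking the maximum); the
    abstract does not describe the two proposed methods, so this rule, the
    value of k, and all names are assumptions.

```python
from statistics import mean, pstdev


def max_threshold(errors):
    """Baseline: largest reconstruction error on normal training data."""
    return max(errors)


def tightened_threshold(errors, k=2.0):
    """Illustrative tightening: ignore errors above mean + k * std, then take max."""
    mu = mean(errors)
    sigma = pstdev(errors)
    kept = [e for e in errors if e <= mu + k * sigma]
    return max(kept)


def is_fall(error, threshold):
    """Flag a test sample as an unseen fall if its error exceeds the threshold."""
    return error > threshold


if __name__ == "__main__":
    # Hypothetical errors on normal activities, with one unusually large value.
    train_errors = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.11, 0.12, 0.10, 0.95]
    print(max_threshold(train_errors), tightened_threshold(train_errors))
```

    A single atypical normal sample inflates the maximum-based threshold, so a
    test sample with a clearly elevated error can slip below it; the tightened
    variant is less sensitive to such samples.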

    Exploring the Entire Regularization Path for the Asymmetric Cost Linear Support Vector Machine

    Daniel Wesierski
    Comments: 8 pages, 2 figures
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We propose an algorithm for exploring the entire regularization path of
    asymmetric-cost linear support vector machines. Empirical evidence suggests the
    predictive power of support vector machines depends on the regularization
    parameters of the training algorithms. Algorithms exploring the entire
    regularization path have been proposed for single-cost support vector machines,
    thereby providing complete knowledge of the behavior of the trained model
    over the hyperparameter space. Considering the problem in a two-dimensional
    hyperparameter space, however, enables our algorithm to maintain greater
    flexibility in dealing with special cases and sheds light on problems
    encountered by algorithms building the paths in one-dimensional spaces. We
    demonstrate two-dimensional regularization paths for linear support vector
    machines that we train on synthetic and real data.

    Maximum entropy models for generation of expressive music

    Simon Moulieras, François Pachet
    Subjects: Artificial Intelligence (cs.AI); Sound (cs.SD)

    In the context of contemporary monophonic music, expression can be seen as
    the difference between a musical performance and its symbolic representation,
    i.e. a musical score. In this paper, we show how Maximum Entropy (MaxEnt)
    models can be used to generate musical expression in order to mimic a human
    performance. As a training corpus, we had a professional pianist play about 150
    melodies of jazz, pop, and latin jazz. The results show a good predictive
    power, validating the choice of our model. Additionally, we set up a listening
    test whose results reveal that on average, people significantly prefer the
    melodies generated by the MaxEnt model to the ones without any expression, or
    with fully random expression. Furthermore, in some cases, MaxEnt melodies are
    almost as popular as the human performed ones.

    A Chain-Detection Algorithm for Two-Dimensional Grids

    Paul Bonham, Azlan Iqbal
    Comments: 28 pages, 10 figures
    Subjects: Artificial Intelligence (cs.AI)

    We describe a general method of detecting valid chains or links of pieces on
    a two-dimensional grid, using the example of the chess variant
    known as Switch-Side Chain-Chess (SSCC). Presently, no foolproof method of
    detecting such chains in any given chess position is known and existing graph
    theory, to our knowledge, is unable to fully address this problem either. We
    therefore propose a solution implemented and tested using the C++ programming
    language. We have been unable to find an incorrect result and therefore offer
    it as the most viable solution thus far to the chain-detection problem in this
    chess variant. The algorithm is also scalable, in principle, to areas beyond
    two-dimensional grids such as 3D analysis and molecular chemistry.

    Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model

    Paul Christiano, Zain Shah, Igor Mordatch, Jonas Schneider, Trevor Blackwell, Joshua Tobin, Pieter Abbeel, Wojciech Zaremba
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG); Systems and Control (cs.SY)

    Developing control policies in simulation is often more practical and safer
    than directly running experiments in the real world. This applies to policies
    obtained from planning and optimization, and even more so to policies obtained
    from reinforcement learning, which is often very data demanding. However, a
    policy that succeeds in simulation often doesn’t work when deployed on a real
    robot. Nevertheless, often the overall gist of what the policy does in
    simulation remains valid in the real world. In this paper we investigate such
    settings, where the sequence of states traversed in simulation remains
    reasonable for the real world, even if the details of the controls are not, as
    could be the case when the key differences lie in detailed friction, contact,
    mass and geometry properties. During execution, at each time step our approach
    computes what the simulation-based control policy would do, but then, rather
    than executing these controls on the real robot, our approach computes what the
    simulation expects the resulting next state(s) will be, and then relies on a
    learned deep inverse dynamics model to decide which real-world action is most
    suitable to achieve those next states. Deep models are only as good as their
    training data, and we also propose an approach for data collection to
    (incrementally) learn the deep inverse dynamics model. Our experiments show
    that our approach compares favorably with various baselines that have been
    developed for dealing with simulation-to-real-world model discrepancy,
    including output error control and Gaussian dynamics adaptation.
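
    The control loop described above can be sketched on a toy one-dimensional
    system. Everything here is an assumption made for illustration: the "real"
    system is a simulator whose actuator is half as strong, and a one-parameter
    least-squares fit stands in for the deep inverse dynamics model; the
    incremental data collection proposed in the paper is not modelled.

```python
import random


def sim_step(x, u):
    """Simulator dynamics (hypothetical): the action moves the state fully."""
    return x + u


def real_step(x, u):
    """'Real' dynamics (hypothetical): the actuator is half as strong."""
    return x + 0.5 * u


def sim_policy(x, goal=1.0):
    """Policy developed in simulation: move halfway to the goal each step."""
    return 0.5 * (goal - x)


def fit_inverse_dynamics(samples):
    """Fit u ~ k * (x_next - x) by least squares on real transitions."""
    num = sum(u * (xn - x) for x, u, xn in samples)
    den = sum((xn - x) ** 2 for x, u, xn in samples)
    k = num / den
    return lambda x, x_target: k * (x_target - x)


def collect_real_data(n, rng):
    """Random exploration on the real system: (state, action, next state)."""
    samples = []
    for _ in range(n):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        samples.append((x, u, real_step(x, u)))
    return samples


def run(steps=20, seed=0):
    rng = random.Random(seed)
    inverse = fit_inverse_dynamics(collect_real_data(50, rng))
    x_transfer = x_naive = 0.0
    for _ in range(steps):
        # Ask the simulator where its policy would take us next ...
        target = sim_step(x_transfer, sim_policy(x_transfer))
        # ... and let the inverse model pick the real action that gets there.
        x_transfer = real_step(x_transfer, inverse(x_transfer, target))
        # Baseline: apply the simulation policy's action directly.
        x_naive = real_step(x_naive, sim_policy(x_naive))
    return x_transfer, x_naive
```

    In this toy setting the transferred controller reproduces the simulated
    state sequence, while the naive baseline lags behind it.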


    Computation and Language

    Domain-specific Question Generation from a Knowledge Base

    Linfeng Song, Lin Zhao
    Subjects: Computation and Language (cs.CL)

    Question generation has been a research topic for a long time, where a big
    challenge is how to generate deep and natural questions. To tackle this
    challenge, we propose a system to generate natural language questions from a
    domain-specific knowledge base (KB) by utilizing rich web information. A small
    number of question templates are first created based on the KB and instantiated
    into questions, which are used as a seed set and further expanded through the web
    to get more question candidates. A filtering model is then applied to select
    candidates with high grammaticality and domain relevance. The system is able to
    generate a large number of in-domain natural language questions with considerable
    semantic diversity and is easily applicable to other domains. We evaluate the
    quality of the generated questions by human judgments and the results show the
    effectiveness of our proposed system.

    SentiHood: Targeted Aspect Based Sentiment Analysis Dataset for Urban Neighbourhoods

    Marzieh Saeidi, Guillaume Bouchard, Maria Liakata, Sebastian Riedel
    Comments: Accepted at COLING 2016
    Subjects: Computation and Language (cs.CL)

    In this paper, we introduce the task of targeted aspect-based sentiment
    analysis. The goal is to extract fine-grained information with respect to
    entities mentioned in user comments. This work extends both aspect-based
    sentiment analysis that assumes a single entity per document and targeted
    sentiment analysis that assumes a single sentiment towards a target entity. In
    particular, we identify the sentiment towards each aspect of one or more
    entities. As a testbed for this task, we introduce the SentiHood dataset,
    extracted from a question answering (QA) platform where urban neighbourhoods
    are discussed by users. In this context, units of text often mention several
    aspects of one or more neighbourhoods. This is the first time that a generic
    social media platform, in this case a QA platform, is used for fine-grained
    opinion mining. Text coming from QA platforms is far less constrained compared
    to text from review-specific platforms, on which current datasets are based. We
    develop several strong baselines, relying on logistic regression and
    state-of-the-art recurrent neural networks.

    Language Models with GloVe Word Embeddings

    Victor Makarenkov, Bracha Shapira, Lior Rokach
    Subjects: Computation and Language (cs.CL)

    In this work we implement the training of a Language Model (LM) using a
    Recurrent Neural Network (RNN) and GloVe word embeddings, introduced by
    Pennington et al. in [1]. The implementation follows the general idea of
    training RNNs for LM tasks presented in [2], but uses a Gated Recurrent Unit
    (GRU) [3] as the memory cell rather than the more commonly used LSTM [4].

    Semi-supervised Discovery of Informative Tweets During the Emerging Disasters

    Shanshan Zhang, Slobodan Vucetic
    Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    The first objective towards the effective use of microblogging services such
    as Twitter for situational awareness during emerging disasters is the discovery
    of disaster-related postings. Given the wide range of possible disasters,
    using a pre-selected set of disaster-related keywords for the discovery is
    suboptimal. An alternative that we focus on in this work is to train a
    classifier using a small set of labeled postings that are becoming available as
    a disaster is emerging. Our hypothesis is that utilizing large quantities of
    historical microblogs could improve the quality of classification, as compared
    to training a classifier only on the labeled data. We propose to use unlabeled
    microblogs to cluster words into a limited number of clusters and use the word
    clusters as features for classification. To evaluate the proposed
    semi-supervised approach, we used Twitter data from 6 different disasters. Our
    results indicate that when the number of labeled tweets is 100 or fewer, the
    proposed approach is superior to the standard classification based on the
    bag-of-words feature representation. Our results also reveal that the choice of the
    unlabeled corpus, the choice of word clustering algorithm, and the choice of
    hyperparameters can have a significant impact on the classification accuracy.
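
    As a rough illustration of using word clusters as features, the sketch
    below maps each token of a posting to a cluster id and counts cluster
    occurrences. In the paper the clusters are learned from unlabeled
    microblogs; the hand-written `WORD_TO_CLUSTER` table here is purely
    hypothetical and only shows the shape of the resulting feature vector.

```python
from collections import Counter

# Hypothetical clustering: in practice this mapping would come from a word
# clustering algorithm run on a large unlabeled corpus.
WORD_TO_CLUSTER = {
    "flood": 0, "flooding": 0, "water": 0,
    "quake": 1, "earthquake": 1, "tremor": 1,
    "lol": 2, "coffee": 2,
}


def cluster_features(tokens, word_to_cluster, n_clusters):
    """Count how many tokens fall into each word cluster.

    Words missing from the clustering are ignored. The result has one entry
    per cluster, so its length is fixed regardless of vocabulary size.
    """
    counts = Counter(
        word_to_cluster[t] for t in tokens if t in word_to_cluster
    )
    return [counts.get(c, 0) for c in range(n_clusters)]
```

    Two postings that share no words ("flooding" vs. "flood") still share a
    feature if their words fall into the same cluster, which is one plausible
    reason such features help when few labeled postings are available.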

    A Paradigm for Situated and Goal-Driven Language Learning

    Jon Gauthier, Igor Mordatch
    Comments: 5 pages, submitted to Machine Intelligence @ NIPS workshop
    Subjects: Computation and Language (cs.CL)

    A distinguishing property of human intelligence is the ability to flexibly
    use language in order to communicate complex ideas with other humans in a
    variety of contexts. Research in natural language dialogue should focus on
    designing communicative agents which can integrate themselves into these
    contexts and productively collaborate with humans. In this abstract, we propose
    a general situated language learning paradigm which is designed to bring about
    robust language agents able to cooperate productively with humans.


    Distributed, Parallel, and Cluster Computing

    Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs

    Chao Li, Yi Yang, Min Feng, Srimat Chakradhar, Huiyang Zhou
    Comments: Published as a conference paper at the International Conference on High Performance Computing, Networking, Storage, and Analysis (SC’16), 2016
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Performance (cs.PF)

    Leveraging large data sets, deep Convolutional Neural Networks (CNNs) achieve
    state-of-the-art recognition accuracy. Due to the substantial compute and
    memory operations, however, they require significant execution time. The
    massive parallel computing capability of GPUs makes them one of the ideal
    platforms to accelerate CNNs, and a number of GPU-based CNN libraries have been
    developed. While existing works mainly focus on the computational efficiency of
    CNNs, the memory efficiency of CNNs has been largely overlooked. Yet CNNs have
    intricate data structures and their memory behavior can have significant impact
    on the performance. In this work, we study the memory efficiency of various CNN
    layers and reveal the performance implication from both data layouts and memory
    access patterns. Experiments show the universal effect of our proposed
    optimizations on both single layers and various networks, with up to 27.9x for
    a single layer and up to 5.6x on the whole networks.

    Improved Parallel Construction of Wavelet Trees and Rank/Select Structures

    Julian Shun
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Existing parallel algorithms for wavelet tree construction have a work
    complexity of $O(n\log\sigma)$. This paper presents parallel algorithms for the
    problem with improved work complexity. Our first algorithm is based on parallel
    integer sorting and has either
    $O(n\sqrt{\log\log n}\lceil\log\sigma/\sqrt{\log n}\rceil)$ work and
    polylogarithmic depth, or $O(n\lceil\log\sigma/\sqrt{\log n}\rceil)$ work and
    sub-linear depth. We also describe another algorithm that has
    $O(n\lceil\log\sigma/\sqrt{\log n}\rceil)$ work and $O(\sigma+\log n)$
    depth. We then show how to use similar ideas to construct variants of wavelet
    trees (arbitrary-shaped binary trees and multiary trees) as well as wavelet
    matrices in parallel with lower work complexity than prior algorithms. Finally,
    we show that the rank and select structures on binary sequences and multiary
    sequences, which are stored on wavelet tree nodes, can be constructed in
    parallel with improved work bounds. In particular, we show that the rank and
    select structures can be constructed for binary sequences in $O(n/\log n)$ work
    and $O(\log n)$ depth, and for multiary sequences in $O(n\log\sigma/\log n)$
    work and $O(\log n)$ depth. These work bounds match those of the best existing
    sequential algorithms for constructing rank and select structures.
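
    For readers unfamiliar with rank structures, here is a minimal sequential
    sketch of a rank index over a binary sequence: per-block counts of ones are
    prefix-summed so that a rank query only scans one block. It is not the
    paper's algorithm and does not achieve the stated work bounds; the class
    name and block size are assumptions. It does show why the construction is
    amenable to parallelism: the block counts are independent, and combining
    them is a prefix sum.

```python
class RankBits:
    """Answer rank1(i) = number of 1s in bits[0:i] using block prefix counts."""

    def __init__(self, bits, block=8):
        self.bits = bits
        self.block = block
        # prefix[b] = number of 1s in all blocks before block b.
        self.prefix = [0]
        for start in range(0, len(bits), block):
            ones_in_block = sum(bits[start:start + block])
            self.prefix.append(self.prefix[-1] + ones_in_block)

    def rank1(self, i):
        """Count the 1s among the first i bits (0 <= i <= len(bits))."""
        b, r = divmod(i, self.block)
        start = b * self.block
        return self.prefix[b] + sum(self.bits[start:start + r])

    def rank0(self, i):
        """Count the 0s among the first i bits."""
        return i - self.rank1(i)
```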


    Learning

    Introduction to the "Industrial Benchmark"

    Daniel Hein, Alexander Hentschel, Volkmar Sterzing, Michel Tokic, Steffen Udluft
    Comments: 11 pages
    Subjects: Learning (cs.LG)

    A novel reinforcement learning benchmark, called Industrial Benchmark, is
    introduced. The Industrial Benchmark aims at being realistic in the sense
    that it includes a variety of aspects that we found to be vital in industrial
    applications. It is not designed to be an approximation of any real system, but
    to pose the same hardness and complexity.

    On statistical learning via the lens of compression

    Ofir David, Shay Moran, Amir Yehudayoff
    Comments: To appear in NIPS ’16 (oral), 14 pages (not including appendix)
    Subjects: Learning (cs.LG); Discrete Mathematics (cs.DM); Logic in Computer Science (cs.LO); Combinatorics (math.CO); Logic (math.LO)

    This work continues the study of the relationship between sample compression
    schemes and statistical learning, which has been mostly investigated within the
    framework of binary classification. The central theme of this work is
    establishing equivalences between learnability and compressibility, and
    utilizing these equivalences in the study of statistical learning theory.

    We begin with the setting of multiclass categorization (zero/one loss). We
    prove that in this case learnability is equivalent to compression of
    logarithmic sample size, and that uniform convergence implies compression of
    constant size.

    We then consider Vapnik’s general learning setting: we show that in order to
    extend the compressibility-learnability equivalence to this case, it is
    necessary to consider an approximate variant of compression.

    Finally, we provide some applications of the compressibility-learnability
    equivalences:

    (i) Agnostic-case learnability and realizable-case learnability are
    equivalent in multiclass categorization problems (in terms of sample
    complexity).

    (ii) This equivalence between agnostic-case learnability and realizable-case
    learnability does not hold for general learning problems: There exists a
    learning problem whose loss function takes just three values, under which
    agnostic-case and realizable-case learnability are not equivalent.

    (iii) Uniform convergence implies compression of constant size in multiclass
    categorization problems. Part of the argument includes an analysis of the
    uniform convergence rate in terms of the graph dimension, in which we improve
    upon previous bounds.

    (iv) A dichotomy for sample compression in multiclass categorization
    problems: If a non-trivial compression exists then a compression of logarithmic
    size exists.

    (v) A compactness theorem for multiclass categorization problems.

    Minimax Filter: Learning to Preserve Privacy from Inference Attacks

    Jihun Hamm
    Subjects: Learning (cs.LG)

    Preserving privacy of continuous and/or high-dimensional data such as images,
    video, and audio can be challenging with syntactic anonymization methods
    which are designed for discrete attributes. Differential privacy, which
    provides a more formal definition of privacy, has shown more success in
    sanitizing continuous data. However, both syntactic and differential privacy
    are susceptible to inference attacks, i.e., an adversary can accurately infer
    sensitive attributes from sanitized data. The paper proposes a novel
    filter-based mechanism which preserves privacy of continuous and
    high-dimensional attributes against inference attacks. Finding the optimal
    utility-privacy tradeoff is formulated as a min-diff-max optimization problem.
    The paper provides an ERM-like analysis of the generalization error and also a
    practical algorithm to perform the optimization. In addition, the paper
    proposes an extension that combines the minimax filter with a differentially
    private noisy mechanism. Advantages of the method over purely noisy mechanisms
    are explained and demonstrated with examples. Experiments with several real-world
    tasks including facial expression classification, speech emotion
    classification, and activity classification from motion, show that the minimax
    filter can simultaneously achieve similar or better target task accuracy and
    lower inference accuracy, often significantly lower than previous methods.

    Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging

    Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford
    Comments: 34 pages
    Subjects: Machine Learning (stat.ML); Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    This work characterizes the benefits of averaging techniques widely used in
    conjunction with stochastic gradient descent (SGD). In particular, this work
    sharply analyzes: (1) mini-batching, a method of averaging many samples of the
    gradient both to reduce the variance of a stochastic gradient estimate and to
    parallelize SGD, and (2) tail-averaging, a method involving averaging the
    final few iterates of SGD in order to decrease the variance in SGD’s final
    iterate. This work presents the first tight non-asymptotic generalization error
    bounds for these schemes for the stochastic approximation problem of least
    squares regression.

    Furthermore, this work establishes a precise problem-dependent extent to
    which mini-batching can be used to yield provable near-linear parallelization
    speedups over SGD with batch size one. These results are utilized in providing
    a highly parallelizable SGD algorithm that obtains the optimal statistical
    error rate with nearly the same number of serial updates as batch gradient
    descent, which improves significantly over existing SGD-style methods.

    Finally, this work sheds light on some fundamental differences in SGD’s
    behavior when dealing with agnostic noise in the (non-realizable) least squares
    regression problem. In particular, the work shows that the stepsizes that
    ensure optimal statistical error rates for the agnostic case must be a function
    of the noise properties.

    The central analysis tools used by this paper are obtained through
    generalizing the operator view of averaged SGD, introduced by Defossez and Bach
    (2015), followed by developing a novel analysis in bounding these operators to
    characterize the generalization error. These techniques may be of broader
    interest in analyzing various computational aspects of stochastic
    approximation.
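
    The two averaging techniques can be sketched for a one-dimensional least
    squares problem. This is only an illustration under assumed settings (step
    size, batch size, and the point at which the tail starts are arbitrary
    choices, not values from the paper): each update averages the gradient
    over a mini-batch, and the returned estimate averages the iterates from
    `tail_start` onward.

```python
import random


def minibatch_tail_averaged_sgd(data, step, batch_size, n_iters, tail_start, rng):
    """SGD for min_w E[(w*x - y)^2 / 2] with mini-batching and tail-averaging.

    Returns (tail_average, last_iterate).
    """
    w = 0.0
    tail_sum, tail_count = 0.0, 0
    for t in range(n_iters):
        # Mini-batching: average the stochastic gradient over several samples.
        batch = [rng.choice(data) for _ in range(batch_size)]
        grad = sum((w * x - y) * x for x, y in batch) / batch_size
        w -= step * grad
        # Tail-averaging: average only the iterates after the initial phase.
        if t >= tail_start:
            tail_sum += w
            tail_count += 1
    return tail_sum / tail_count, w


def make_data(n, true_w, noise_std, rng):
    """Synthetic regression data y = true_w * x + Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, true_w * x + rng.gauss(0.0, noise_std)))
    return data


rng = random.Random(0)
data = make_data(500, true_w=3.0, noise_std=0.5, rng=rng)
w_avg, w_last = minibatch_tail_averaged_sgd(
    data, step=0.5, batch_size=8, n_iters=2000, tail_start=1000, rng=rng
)
```

    The gradients within a batch are independent of one another, which is what
    makes the mini-batch step a natural unit of parallel work.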

    Detecting Unseen Falls from Wearable Devices using Channel-wise Ensemble of Autoencoders

    Shehroz S. Khan, Babak Taati
    Comments: 8 pages, 2 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    A fall is an abnormal activity that occurs rarely, so it is hard to collect
    real data for falls. It is, therefore, difficult to use supervised learning
    methods to automatically detect falls. Another challenge in using machine
    learning methods to automatically detect falls is the choice of features. In
    this paper, we propose to use an ensemble of autoencoders to extract features
    from different channels of wearable sensor data trained only on normal
    activities. We show that choosing a threshold as the maximum of the reconstruction
    error on the training normal data is not the right way to identify unseen
    falls. We propose two methods for automatic tightening of reconstruction error
    from only the normal activities for better identification of unseen falls. We
    present our results on two activity recognition datasets and show the efficacy
    of our proposed method against traditional autoencoder models and two standard
    one-class classification methods.

    Exploring the Entire Regularization Path for the Asymmetric Cost Linear Support Vector Machine

    Daniel Wesierski
    Comments: 8 pages, 2 figures
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We propose an algorithm for exploring the entire regularization path of
    asymmetric-cost linear support vector machines. Empirical evidence suggests the
    predictive power of support vector machines depends on the regularization
    parameters of the training algorithms. Algorithms exploring the entire
    regularization path have been proposed for single-cost support vector machines,
    thereby providing complete knowledge of the behavior of the trained model
    over the hyperparameter space. Considering the problem in a two-dimensional
    hyperparameter space, though, enables our algorithm to maintain greater
    flexibility in dealing with special cases and sheds light on problems
    encountered by algorithms building the paths in one-dimensional spaces. We
    demonstrate two-dimensional regularization paths for linear support vector
    machines that we train on synthetic and real data.

    Optimistic Semi-supervised Least Squares Classification

    Jesse H. Krijthe, Marco Loog
    Comments: 6 pages, 6 figures. International Conference on Pattern Recognition (ICPR) 2016, Cancun, Mexico
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    The goal of semi-supervised learning is to improve supervised classifiers by
    using additional unlabeled training examples. In this work we study a simple
    self-learning approach to semi-supervised learning applied to the least squares
    classifier. We show that a soft-label and a hard-label variant of self-learning
    can be derived by applying block coordinate descent to two related but slightly
    different objective functions. The resulting soft-label approach is related to
    an idea about dealing with missing data that dates back to the 1930s. We show
    that the soft-label variant typically outperforms the hard-label variant on
    benchmark datasets and partially explain this behaviour by studying the
    relative difficulty of finding good local minima for the corresponding
    objective functions.
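
    A minimal sketch of the two self-learning variants, under simplifying
    assumptions: a single feature plus intercept, 0/1 class targets, and a
    fixed number of rounds. Each round alternates two block updates: refit the
    least squares line on labeled plus currently imputed labels, then re-impute
    the unlabeled targets, either clipped to [0, 1] (soft) or thresholded at
    0.5 (hard). The function names are illustrative and the exact objectives
    are those of the paper, not of this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y ~ a*x + b for one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def self_learn(x_lab, y_lab, x_unl, soft=True, n_rounds=20):
    """Self-learning least squares classifier with soft or hard labels."""
    a, b = fit_line(x_lab, y_lab)  # start from the supervised solution
    for _ in range(n_rounds):
        preds = [a * x + b for x in x_unl]
        if soft:
            # Soft labels: best targets in [0, 1] for the current line.
            imputed = [min(1.0, max(0.0, p)) for p in preds]
        else:
            # Hard labels: best targets in {0, 1} for the current line.
            imputed = [1.0 if p > 0.5 else 0.0 for p in preds]
        a, b = fit_line(x_lab + x_unl, y_lab + imputed)
    return a, b


x_lab, y_lab = [-1.0, 1.0], [0.0, 1.0]
x_unl = [-2.0, -1.5, 1.5, 2.0]
soft_fit = self_learn(x_lab, y_lab, x_unl, soft=True)
hard_fit = self_learn(x_lab, y_lab, x_unl, soft=False)
```

    A point is classified as class 1 when `a * x + b > 0.5`, so the decision
    boundary sits at `(0.5 - b) / a`.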

    Deep Fruit Detection in Orchards

    Suchet Bargoti, James Underwood
    Comments: Submitted to the IEEE International Conference on Robotics and Automation 2017
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    An accurate and reliable image-based fruit detection system is critical for
    supporting higher-level agriculture tasks such as yield mapping and robotic
    harvesting. This paper presents the use of a state-of-the-art object detection
    framework, Faster R-CNN, in the context of fruit detection in orchards,
    including mangoes, almonds and apples. Ablation studies are presented to better
    understand the practical deployment of the detection network, including how
    much training data is required to capture variability in the dataset. Data
    augmentation techniques are shown to yield significant performance gains,
    resulting in a greater than two-fold reduction in the number of training images
    required. In contrast, transferring knowledge between orchards contributed to
    negligible performance gain over initialising the Deep Convolutional Neural
    Network directly from ImageNet features. Finally, to operate over orchard data
    containing between 100 and 1000 fruit per image, a tiling approach is introduced
    for the Faster R-CNN framework. The study has resulted in the best yet
    detection performance for these orchards relative to previous works, with an
    F1-score of >0.9 achieved for apples and mangoes.

    RetiNet: Automatic AMD identification in OCT volumetric data

    Stefanos Apostolopoulos, Carlos Ciller, Sandro I. De Zanet, Sebastian Wolf, Raphael Sznitman
    Comments: 14 pages, 10 figures, Code available
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Optical Coherence Tomography (OCT) provides a unique ability to image the eye
    retina in 3D at micrometer resolution and gives ophthalmologists the ability to
    visualize retinal diseases such as Age-Related Macular Degeneration (AMD).
    While visual inspection of OCT volumes remains the main method for AMD
    identification, doing so is time-consuming as each cross-section within the
    volume must be inspected individually by the clinician. In much the same way,
    acquiring ground truth information for each cross-section is expensive and
    time-consuming. This fact heavily limits the ability to acquire large amounts of
    ground truth, which subsequently impacts the performance of learning-based
    methods geared at automatic pathology identification. To avoid this burden, we
    propose a novel strategy for automatic analysis of OCT volumes where only
    volume labels are needed. That is, we train a classifier in a semi-supervised
    manner to conduct this task. Our approach uses a novel Convolutional Neural
    Network (CNN) architecture that only needs volume-level labels to be trained
    to automatically assess whether an OCT volume is healthy or contains AMD. Our
    architecture involves first learning a cross-section pathology classifier using
    pseudo-labels that could be corrupted and then leveraging these towards a more
    accurate volume-level classification. We then show that our approach provides
    excellent performance on a publicly available dataset and outperforms a number
    of existing automatic techniques.

    Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs

    Chao Li, Yi Yang, Min Feng, Srimat Chakradhar, Huiyang Zhou
    Comments: Published as a conference paper at the International Conference on High Performance Computing, Networking, Storage, and Analysis (SC’16), 2016
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Performance (cs.PF)

    Leveraging large data sets, deep Convolutional Neural Networks (CNNs) achieve
    state-of-the-art recognition accuracy. Due to the substantial compute and
    memory operations, however, they require significant execution time. The
    massive parallel computing capability of GPUs makes them one of the ideal
    platforms to accelerate CNNs, and a number of GPU-based CNN libraries have been
    developed. While existing works mainly focus on the computational efficiency of
    CNNs, the memory efficiency of CNNs has been largely overlooked. Yet CNNs have
    intricate data structures and their memory behavior can have significant impact
    on the performance. In this work, we study the memory efficiency of various CNN
    layers and reveal the performance implication from both data layouts and memory
    access patterns. Experiments show the universal effect of our proposed
    optimizations on both single layers and various networks, with up to 27.9x for
    a single layer and up to 5.6x on the whole networks.

    Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model

    Paul Christiano, Zain Shah, Igor Mordatch, Jonas Schneider, Trevor Blackwell, Joshua Tobin, Pieter Abbeel, Wojciech Zaremba
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG); Systems and Control (cs.SY)

    Developing control policies in simulation is often more practical and safer
    than directly running experiments in the real world. This applies to policies
    obtained from planning and optimization, and even more so to policies obtained
    from reinforcement learning, which is often very data demanding. However, a
    policy that succeeds in simulation often doesn’t work when deployed on a real
    robot. Nevertheless, often the overall gist of what the policy does in
    simulation remains valid in the real world. In this paper we investigate such
    settings, where the sequence of states traversed in simulation remains
    reasonable for the real world, even if the details of the controls are not, as
    could be the case when the key differences lie in detailed friction, contact,
    mass and geometry properties. During execution, at each time step our approach
    computes what the simulation-based control policy would do, but then, rather
    than executing these controls on the real robot, our approach computes what the
    simulation expects the resulting next state(s) will be, and then relies on a
    learned deep inverse dynamics model to decide which real-world action is most
    suitable to achieve those next states. Deep models are only as good as their
    training data, and we also propose an approach for data collection to
    (incrementally) learn the deep inverse dynamics model. Our experiments show
    that our approach compares favorably with various baselines that have been
    developed for dealing with simulation-to-real-world model discrepancy,
    including output error control and Gaussian dynamics adaptation.


    Information Theory

    Decentralized Coded Caching with Distinct Cache Capacities

    Mohammad Mohammadi Amiri, Qianqian Yang, Deniz Gündüz
    Comments: To be presented in ASILOMAR conference, 2016
    Subjects: Information Theory (cs.IT)

    Decentralized coded caching is studied for a content server with $N$ files,
    each of size $F$ bits, serving $K$ active users, each equipped with a cache of
    distinct capacity. It is assumed that the users’ caches are filled in advance
    during the off-peak traffic period without the knowledge of the number of
    active users, their identities, or the particular demands. User demands are
    revealed during the peak traffic period, and are served simultaneously through
    an error-free shared link. A new decentralized coded caching scheme is proposed
    for this scenario, and it is shown to improve upon the state-of-the-art in
    terms of the required delivery rate over the shared link, when there are more
    users in the system than the number of files. Numerical results indicate that
    the improvement becomes more significant as the cache capacities of the users
    become more skewed.

    Burst Transmission Symbol Synchronization in the Presence of Cycle Slip Arising from Different Clock Frequencies

    Somaye Bazin, Mahmoud Ferdosizade Naeiny, Roya Khanzade
    Subjects: Information Theory (cs.IT)

    In digital communication systems, a difference between the clock frequencies
    of the transmitter and receiver usually translates into cycle slips. A receiver
    might experience a sampling frequency different from that of the transmitter
    due to manufacturing imperfections, the Doppler effect introduced by the
    channel, or wrong estimation of the symbol rate. Timing synchronization in the
    presence of cycle slips for a burst sequence of received information leads to
    severe degradation in system performance, which appears as a shortening or
    prolonging of the bit stream. Therefore, prior detection and elimination of
    cycle slips is necessary. Accordingly, the main idea introduced in this paper
    is to employ the Gardner Detector (GAD) not only to recover a fixed timing
    offset, but also to process its output in such a way that timing drifts can be
    estimated and corrected. The derived two-step algorithm first eliminates the
    cycle slips arising from wrong estimation of the symbol rate, and then
    iteratively synchronizes the symbol timing of a received burst signal by
    applying the GAD in a feed-forward structure, with the additional benefit that
    the convergence and stability problems typical of the feedback schemes
    normally used with the GAD are avoided. The proposed algorithm is able to
    compensate considerable symbol rate offsets at the receiver side. BER results
    confirm the algorithm's proficiency.

    An Inter-User Interference Suppression Method in Full-Duplex Networks

    Fei Wu, Si Li, Youxi Tang
    Comments: 34 pages, 15 figures, plan to submit to IEEE Transactions on Vehicular Technology
    Subjects: Information Theory (cs.IT)

    We consider a full-duplex network comprising a full-duplex (FD) base
    station and two half-duplex (HD) users: one user transmits on the uplink
    channel and the other receives through the downlink channel on the same
    frequency. Thus, the uplink user generates inter-user interference (IUI) on
    the downlink user through the interference channel. In this paper, we propose a
    base-station-assisted IUI suppression approach for the case where the base
    station knows the full channel state information (CSI). To evaluate the
    performance of the proposed approach, four cases are considered: (i) the
    uplink, downlink, and interference channels are Gaussian; (ii) the downlink
    and interference channels are Rayleigh fading and the uplink channel is
    Gaussian; (iii) the uplink, downlink, and interference channels are Rayleigh
    fading; (iv) the uplink, downlink, and interference channels are Rician
    fading. We derive closed-form expressions for the sum achievable rate and
    energy efficiency in the first two cases and investigate the sum achievable
    rate and energy efficiency in the last two cases through Monte Carlo
    simulations. Analytic and simulation results show that the sum achievable rate
    and energy efficiency of the proposed IUI suppression approach are
    significantly influenced by the signal-to-noise ratio (SNR), the Rician
    factor, and the channel power ratio between the uplink and interference
    channels.

    New families of Strictly optimal Frequency hopping sequence sets

    Jingjun Bao
    Comments: arXiv admin note: substantial text overlap with arXiv:1511.02924
    Subjects: Information Theory (cs.IT)

    Frequency hopping sequences (FHSs) with favorable partial Hamming correlation
    properties have important applications in many synchronization and
    multiple-access systems. In this paper, we investigate constructions of FHS
    sets with optimal partial Hamming correlation. We present several direct
    constructions for balanced nested cyclic difference packings (BNCDPs) and
    balanced nested cyclic relative difference packings (BNCRDPs) such that both of
    them have a special property by using trace functions and discrete logarithm.
    We also show three recursive constructions for FHS sets with partial Hamming
    correlation, which are based on cyclic difference matrices and discrete
    logarithm. Combining these BNCDPs, BNCRDPs and three recursive constructions, we
    obtain infinitely many new strictly optimal FHS sets with respect to the
    Peng-Fan bounds.

    Capacity bounds for distributed storage

    Michael Luby
    Comments: 18 pages, 6 figures, submitted to IEEE Transactions on Information Theory on October 11, 2016
    Subjects: Information Theory (cs.IT)

    The information capacity of a distributed storage system is the amount of
    source data that can be reliably stored for long durations. Storage nodes fail
    over time and are replaced, and thus data is erased at an erasure rate. To
    maintain recoverability of source data, a repairer generates redundant data
    from data read from nodes, and writes redundant data to nodes, where the repair
    rate is the rate at which the repairer reads and writes data. We prove the
    information capacity approaches (1-1/(2*sigma))*N*s as N and sigma grow, where
    N is the number of nodes, s is the amount of data each node can store, and
    sigma is the repair rate to erasure rate ratio.
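
    To get a feel for the stated expression, the following sketch simply
    evaluates (1-1/(2*sigma))*N*s for given inputs. The abstract presents this
    as a value the capacity approaches as N and sigma grow, so the numbers are
    only indicative for finite parameters. The function and parameter names
    are hypothetical.

```python
def asymptotic_capacity(num_nodes, node_size, sigma):
    """Evaluate (1 - 1/(2*sigma)) * N * s from the abstract.

    num_nodes : N, the number of storage nodes
    node_size : s, the amount of data each node can store
    sigma     : ratio of repair rate to erasure rate
    This is the limiting expression only, not an exact finite-N capacity.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (1.0 - 1.0 / (2.0 * sigma)) * num_nodes * node_size


if __name__ == "__main__":
    # Example inputs chosen purely for illustration.
    for sigma in (1.0, 5.0, 50.0):
        print(sigma, asymptotic_capacity(100, 1.0, sigma))
```

    Under this expression, the fraction of raw storage N*s that is lost shrinks
    as 1/(2*sigma), so raising the repair rate relative to the erasure rate
    gives diminishing returns.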

    Cooperative Strategies for Wireless-Powered Communications

    He Chen, Chao Zhai, Yonghui Li, Branka Vucetic
    Comments: Submitted for possible publications
    Subjects: Information Theory (cs.IT)

    Radio frequency (RF) energy transfer and harvesting has been intensively
    studied recently as a promising approach to significantly extend the lifetime
    of energy-constrained wireless networks. This technique has great potential
    to provide relatively stable and continuous RF energy to devices wirelessly,
    and it has thus opened a new research paradigm, termed wireless-powered
    communication (WPC), which has raised many new research opportunities with
    wide applications. Among these, the design and evaluation of cooperative
    schemes towards energy-efficient WPC have attracted tremendous research
    interest.
    This article provides an overview of various energy-efficient cooperative
    strategies for WPC, with particular emphasis on relaying protocols for
    wireless-powered cooperative communications, cooperative spectrum sharing
    schemes for wireless-powered cognitive radio networks, and cooperative jamming
    strategies towards wireless-powered secure communications. We also identify
    some valuable research directions in this area before concluding this article.

    Enhancing Secrecy with Multi-Antenna Transmission in Millimeter Wave Vehicular Communication Systems

    Mohammed E. Eltayeb, Junil Choi, Tareq Y. Al-Naffouri, Robert W. Heath Jr
    Subjects: Information Theory (cs.IT)

    Millimeter wave (mmWave) vehicular communication systems will provide an
    abundance of bandwidth for the exchange of raw sensor data and support
    driver-assisted and safety-related functionalities. Lack of secure
    communication links, however, may lead to abuses and attacks that jeopardize
    the efficiency of transportation systems and the physical safety of drivers. In
    this paper, we propose two physical layer (PHY) security techniques for
    vehicular mmWave communication systems. The first technique uses multiple
    antennas with a single RF chain to transmit information symbols to a target
    receiver and noise-like signals in non-receiver directions. The second
    technique uses multiple antennas with a few RF chains to transmit information
    symbols to a target receiver and opportunistically inject artificial noise in
    controlled directions, thereby reducing interference in vehicular environments.
    Theoretical and numerical results show that the proposed techniques provide
    a higher secrecy rate than traditional PHY security techniques that
    require digital or more complex antenna architectures.

    Sparse Channel Estimation for Massive MIMO with 1-bit Feedback per Dimension

    Zhiyi Zhou, Xu Chen, Dongning Guo, Michael L. Honig
    Comments: Submitted
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    In massive multiple-input multiple-output (MIMO) systems, acquisition of the
    channel state information at the transmitter side (CSIT) is crucial. In this
    paper, a practical CSIT estimation scheme is proposed for frequency division
    duplexing (FDD) massive MIMO systems. Specifically, each received pilot symbol
    is first quantized to one bit per dimension at the receiver side and then the
    quantized bits are fed back to the transmitter. A joint one-bit compressed
    sensing algorithm is implemented at the transmitter to recover the channel
    matrices. The algorithm leverages the hidden joint sparsity structure in the
    user channel matrices to minimize the training and feedback overhead, which is
    considered to be a major challenge for FDD systems. Moreover, the one-bit
    compressed sensing algorithm accurately recovers the channel directions for
    beamforming. The one-bit feedback mechanism can be implemented in practical
    systems using the uplink control channel. Simulation results show that the
    proposed scheme nearly achieves the maximum output signal-to-noise-ratio for
    beamforming based on the estimated CSIT.
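
    The quantization step can be sketched as follows, assuming that "one bit
    per dimension" means keeping only the signs of the real and imaginary parts
    of each complex received pilot symbol. That reading, and the function name,
    are assumptions; the joint one-bit compressed sensing recovery at the
    transmitter is not shown.

```python
def one_bit_quantize(symbols):
    """Quantize complex samples to one sign bit per real dimension.

    Returns a list of (real_bit, imag_bit) pairs, where a bit is 1 when the
    corresponding component is non-negative and 0 otherwise. The mapping of
    signs to bit values is an illustrative choice.
    """
    return [(int(z.real >= 0), int(z.imag >= 0)) for z in symbols]


if __name__ == "__main__":
    received = [1 + 2j, -0.5 + 0.1j, 0.3 - 4j, -1 - 1j]
    # Each pilot symbol costs two feedback bits regardless of its magnitude.
    print(one_bit_quantize(received))
```

    Because the magnitudes are discarded, feedback of this kind can convey the
    channel direction but not its overall scale.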



