
    arXiv Paper Daily: Fri, 17 Mar 2017

    Published by 我爱机器学习 (52ml.net) on 2017-03-17 00:00:00

    Neural and Evolutionary Computing

    Large Scale Evolution of Convolutional Neural Networks Using Volunteer Computing

    Travis Desell
    Comments: 17 pages, 13 figures. Submitted to the 2017 Genetic and Evolutionary Computation Conference (GECCO 2017)
    Subjects: Neural and Evolutionary Computing (cs.NE)

    This work presents a new algorithm called evolutionary exploration of
    augmenting convolutional topologies (EXACT), which is capable of evolving the
    structure of convolutional neural networks (CNNs). EXACT is in part modeled
    after the neuroevolution of augmenting topologies (NEAT) algorithm, with
    notable exceptions to allow it to scale to large scale distributed computing
    environments and evolve networks with convolutional filters. In addition to
    multithreaded and MPI versions, EXACT has been implemented as part of a BOINC
    volunteer computing project, allowing large scale evolution. During a period of
    two months, over 4,500 volunteered computers on the Citizen Science Grid
    trained over 120,000 CNNs and evolved networks reaching 98.32% test data
    accuracy on the MNIST handwritten digits dataset. These results are even
    stronger as the backpropagation strategy used to train the CNNs was fairly
    rudimentary (ReLU units, L2 regularization and Nesterov momentum) and these
    were initial test runs done without refinement of the backpropagation
    hyperparameters. Further, the EXACT evolutionary strategy is independent of the
    method used to train the CNNs, so they could be further improved by advanced
    techniques like elastic distortions, pretraining and dropout. The evolved
    networks are also quite interesting, showing “organic” structures and
    significant differences from standard human-designed architectures.
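
    As a rough illustration of the generate-evaluate-select loop such
    neuroevolution systems run, consider the toy Python sketch below; the genome
    fields and the stand-in fitness function are hypothetical, whereas real
    EXACT evolves graphs of convolutional filters and trains each candidate CNN
    on volunteered hosts.

        import random

        random.seed(0)

        # Toy genome: a flat description of a network. (Hypothetical; EXACT's
        # genomes encode graphs of convolutional filters.)
        def mutate(genome):
            child = dict(genome)
            child["filters"] = max(1, child["filters"] + random.choice([-1, 1]))
            child["layers"] = max(1, child["layers"] + random.choice([-1, 0, 1]))
            return child

        def fitness(genome):
            # Stand-in for "train the CNN and return its test accuracy".
            return -abs(genome["filters"] - 8) - abs(genome["layers"] - 4)

        population = [{"filters": 2, "layers": 1} for _ in range(10)]
        for generation in range(50):
            population.sort(key=fitness, reverse=True)
            parents = population[:5]  # keep the fittest half
            population = parents + [mutate(random.choice(parents)) for _ in range(5)]

        print(population[0])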

    A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

    Thomas E. Potok, Catherine Schuman, Steven R. Young, Robert M. Patton, Federico Spedalieri, Jeremy Liu, Ke-Thia Yao, Garrett Rose, Gangotree Chakma
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Current Deep Learning approaches have been very successful using
    convolutional neural networks (CNN) trained on large graphics processing
    unit (GPU)-based computers. Three limitations of this approach are: 1) they
    are based on a simple layered network topology, i.e., highly connected
    layers, without intra-layer connections; 2) the networks are manually
    configured to achieve optimal results; and 3) the implementation of the
    neuron model is expensive in both cost and power. In this paper, we evaluate
    deep learning models using
    three different computing architectures to address these problems: quantum
    computing to train complex topologies, high performance computing (HPC) to
    automatically determine network topology, and neuromorphic computing for a
    low-power hardware implementation. We use the MNIST dataset for our experiment,
    due to input size limitations of current quantum computers. Our results show
    the feasibility of using the three architectures in tandem to address the above
    deep learning limitations. We show a quantum computer can find high-quality
    values of intra-layer connection weights in tractable time as the
    complexity of the network increases; a high performance computer can find
    optimal layer-based topologies; and a neuromorphic computer can represent the
    complex topology and weights derived from the other architectures in low power
    memristive hardware.


    Computer Vision and Pattern Recognition

    Learning Robust Hash Codes for Multiple Instance Image Retrieval

    Sailesh Conjeti, Magdalini Paschali, Amin Katouzian, Nassir Navab
    Comments: 10 pages, 7 figures, under review at MICCAI 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, for the first time, we introduce a multiple instance (MI) deep
    hashing technique for learning discriminative hash codes with weak bag-level
    supervision suited for large-scale retrieval. We learn such hash codes by
    aggregating deeply learnt hierarchical representations across bag members
    through a dedicated MI pool layer. For better trainability and retrieval
    quality, we propose a two-pronged approach that includes robust optimization
    and training with an auxiliary single instance hashing arm which is
    down-regulated gradually. We pose retrieval for tumor assessment as an MI
    problem because tumors often coexist with benign masses and could exhibit
    complementary signatures when scanned from different anatomical views.
    Experimental validations on benchmark mammography and histology datasets
    demonstrate improved retrieval performance over the state-of-the-art methods.
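
    As a sketch of the multiple-instance pooling idea in PyTorch: per-instance
    activations for the members of a bag (e.g., views of one tumor) are
    aggregated into a single bag-level code. Max-pooling is shown as one common
    MI aggregator; the paper's dedicated MI pool layer and its auxiliary
    single-instance arm are not reproduced here.

        import torch

        bag = torch.randn(7, 64)          # 7 instances in a bag, 64-d features each
        bag_code = bag.max(dim=0).values  # MI pooling: one 64-d bag representation
        # Relaxed activations are binarized into a hash code at retrieval time.
        binary_code = (torch.tanh(bag_code) > 0).int()
        print(binary_code)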

    SVDNet for Pedestrian Retrieval

    Yifan Sun, Liang Zheng, Weijian Deng, Shengjin Wang
    Comments: This version is not fully edited and will be updated soon
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes the SVDNet for retrieval problems, with focus on the
    application of person re-identification (re-ID). We view the weight matrix (W)
    as a set of weight vectors or basis. It is observed that the weight vectors are
    usually highly correlated. This problem leads to correlations among entries
    of the FC descriptor and compromises the retrieval performance based on the
    Euclidean distance. To alleviate the correlation problem, this paper
    proposes to decorrelate the learned weight vectors using singular vector
    decomposition (SVD). Specifically, we design a novel training strategy with
    the “restraint and relaxation iteration” (RRI) scheme. We conduct experiments on the
    Market-1501, CUHK03, and Duke datasets, and show that RRI effectively reduces
    the correlation among the projection vectors and significantly improves the
    re-ID accuracy. On the Market-1501 dataset, for instance, rank-1 accuracy is
    improved from 55.2% to 80.5% for CaffeNet, and from 73.8% to 83.1% for
    ResNet-50.
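
    A minimal numpy sketch of the decorrelation step (the RRI training loop
    itself is not reproduced): replacing W by U diag(S) from the SVD
    W = U S V^T yields mutually orthogonal weight vectors, so descriptor
    entries are no longer correlated through W. The matrix shape is an
    illustrative guess.

        import numpy as np

        rng = np.random.default_rng(0)
        W = rng.standard_normal((2048, 512))  # hypothetical FC weight matrix

        # Decorrelate the weight vectors (columns of W) with SVD.
        U, S, Vt = np.linalg.svd(W, full_matrices=False)
        W_decorrelated = U * S  # orthonormal columns of U, scaled by S

        # The Gram matrix of the new weight vectors is diagonal: no correlation.
        gram = W_decorrelated.T @ W_decorrelated
        assert np.allclose(gram, np.diag(S ** 2))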

    Segmented and Directional Impact Detection for Parked Vehicles using Mobile Devices

    Andre Ebert, Sebastian Feld, Florian Dorfmeister
    Comments: 4 pages, 6 figures, accepted at the 23rd International Conference on Systems, Signals and Image Processing (IWSSIP)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Mutual usage of vehicles as well as car sharing has become increasingly
    attractive in recent years. Especially in urban environments with limited
    parking possibilities and a higher risk of traffic jams, car rentals and
    sharing services may save time and money. But when renting a vehicle it could
    already be damaged (e.g., scratches or bumps inflicted by a previous user)
    without the damage being perceived by the service provider. In order to
    address such problems, we present an automated, motion-based system for
    impact detection that uses a common smartphone as a sensor platform. The
    system is capable of detecting the impact segment and the point of time of an
    impact event on a vehicle’s surface, as well as its direction of origin. With
    this additional specific knowledge, it may be possible to reconstruct the
    circumstances of an impact event, e.g., to prove possible innocence of a
    service’s customer.

    Anisotropic-Scale Junction Detection and Matching for Indoor Images

    Nan Xue, Gui-Song Xia, Xiang Bai, Liangpei Zhang, Weiming Shen
    Comments: 24 pages, 15 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Junctions play an important role in the characterization of local geometric
    structures in images, the detection of which is a longstanding and challenging
    task. Existing junction detectors usually focus on identifying the junction
    locations and the orientations of the junction branches while ignoring their
    scales; however, these scales also contain rich geometric information. This
    paper presents a novel approach to junction detection and characterization that
    exploits the locally anisotropic geometries of a junction and estimates the
    scales of these geometries using an a contrario model. The output
    junctions have anisotropic scales — i.e., each branch of a junction is
    associated with an independent scale parameter — and are thus termed
    anisotropic-scale junctions (ASJs). We then apply the newly detected ASJs for
    the matching of indoor images, in which there may be dramatic changes in
    viewpoint and the detected local visual features, e.g., key-points, are usually
    insufficiently distinctive. We propose to use the anisotropic geometries of our
    junctions to improve the matching precision for indoor images. Matching results
    obtained on sets of indoor images demonstrate that our approach achieves
    state-of-the-art performance in indoor image matching.

    Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval

    Li Liu, Fumin Shen, Yuming Shen, Xianglong Liu, Ling Shao
    Comments: This paper will appear as a spotlight paper in CVPR2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Free-hand sketch-based image retrieval (SBIR) is a specific cross-view
    retrieval task, in which queries are abstract and ambiguous sketches while the
    retrieval database is formed with natural images. Work in this area mainly
    focuses on extracting representative and shared features for sketches and
    natural images. However, these can neither cope well with the geometric
    distortion between sketches and images nor be feasible for large-scale SBIR due
    to the heavy continuous-valued distance computation. In this paper, we speed up
    SBIR by introducing a novel binary coding method, named Deep Sketch Hashing
    (DSH), where a semi-heterogeneous deep architecture is proposed and
    incorporated into an end-to-end binary coding framework. Specifically, three
    convolutional neural networks are utilized to encode free-hand sketches,
    natural images and, especially, the auxiliary sketch-tokens which are adopted
    as bridges to mitigate the sketch-image geometric distortion. The learned DSH
    codes can effectively capture the cross-view similarities as well as the
    intrinsic semantic correlations between different categories. To the best of
    our knowledge, DSH is the first hashing work specifically designed for
    category-level SBIR with an end-to-end deep architecture. The proposed DSH is
    comprehensively evaluated on two large-scale datasets of TU-Berlin Extension
    and Sketchy, and the experiments consistently show DSH’s superior SBIR
    accuracies over several state-of-the-art methods, while achieving significantly
    reduced retrieval time and memory footprint.

    Convolutional neural network architecture for geometric matching

    Ignacio Rocco, Relja Arandjelović, Josef Sivic
    Comments: To appear in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We address the problem of determining correspondences between two images in
    agreement with a geometric model such as an affine or thin-plate-spline
    transformation, and estimating its parameters. The contributions of this work
    are three-fold. First, we propose a convolutional neural network architecture
    for geometric matching. The architecture is based on three main components that
    mimic the standard steps of feature extraction, matching and simultaneous
    inlier detection and model parameter estimation, while being trainable
    end-to-end. Second, we demonstrate that the network parameters can be trained
    from synthetically generated imagery without the need for manual annotation and
    that our matching layer significantly increases generalization capabilities to
    never-before-seen images. Finally, we show that the same model can perform both
    instance-level and category-level matching giving state-of-the-art results on
    the challenging Proposal Flow dataset.
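
    A PyTorch sketch of the correlation-style matching component: dense CNN
    features from the two images are compared pairwise, producing a match score
    for every pair of spatial locations; a small regression network (omitted)
    then maps this correlation volume to transformation parameters. Shapes are
    illustrative, not the paper's exact configuration.

        import torch
        import torch.nn.functional as F

        fA = torch.randn(1, 128, 15, 15)           # features of image A
        fB = torch.randn(1, 128, 15, 15)           # features of image B
        a = F.normalize(fA.flatten(2), dim=1)      # (1, 128, 225), unit norm
        b = F.normalize(fB.flatten(2), dim=1)
        corr = torch.einsum('bci,bcj->bij', a, b)  # (1, 225, 225) match scores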

    From visual words to a visual grammar: using language modelling for image classification

    Antonio Foncubierta-Rodríguez, Henning Müller, Adrien Depeursinge
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The Bag-of-Visual-Words (BoVW) is a visual description technique that aims
    at shortening the semantic gap by partitioning a low-level feature space into
    regions that potentially correspond to visual concepts, thereby giving more
    semantic value to this space. In this paper we present a conceptual analysis
    of three major properties of language grammar and how they can be adapted to
    the computer vision and image understanding domain based on the bag of visual
    words paradigm. Evaluation of the visual grammar shows that a positive impact
    on classification accuracy and/or descriptor size is obtained when the
    proposed techniques are applied.

    Convolutional Neural Network on Three Orthogonal Planes for Dynamic Texture Classification

    Vincent Andrearczyk, Paul F. Whelan
    Comments: 19 pages, 10 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Dynamic Textures (DTs) are sequences of images of moving scenes that exhibit
    certain stationarity properties in time, such as smoke, vegetation and fire. The
    analysis of DT is important for recognition, segmentation, synthesis or
    retrieval for a range of applications including surveillance, medical imaging
    and remote sensing. Deep learning methods have shown impressive results and are
    now the new state of the art for a wide range of computer vision tasks
    including image and video recognition and segmentation. In particular,
    Convolutional Neural Networks (CNNs) have recently proven to be well suited for
    texture analysis with a design similar to a filter bank approach. In this
    paper, we develop a new approach to DT analysis based on a CNN method applied
    on three orthogonal planes xy, xt and yt. We train CNNs on spatial frames
    and temporal slices extracted from the DT sequences and combine their outputs
    to obtain a competitive DT classifier. Our results on a wide range of commonly
    used DT classification benchmark datasets prove the robustness of our approach.
    Significant improvement of the state of the art is shown on the larger
    datasets.
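
    A numpy sketch of the three-orthogonal-plane slicing (the CNNs themselves
    are omitted): a dynamic texture is a video volume indexed (t, y, x), whose
    xy plane carries appearance while the xt and yt planes carry motion.

        import numpy as np

        video = np.random.rand(50, 64, 64)  # (T, H, W): 50 frames of 64x64 pixels
        T, H, W = video.shape

        xy_frame = video[T // 2]            # (H, W) spatial frame
        xt_slice = video[:, H // 2, :]      # (T, W) temporal slice at a fixed y
        yt_slice = video[:, :, W // 2]      # (T, H) temporal slice at a fixed x

        # One CNN per plane is trained on such frames/slices; their class
        # scores are then combined into the final DT prediction.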

    Global and Local Information Based Deep Network for Skin Lesion Segmentation

    Jin Qi, Miao Le, Chunming Li, Ping Zhou
    Comments: 4 pages, 3 figures. ISIC2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    With a large influx of dermoscopy images and a growing shortage of
    dermatologists, automatic dermoscopic image analysis plays an essential role in
    skin cancer diagnosis. In this paper, a new deep fully convolutional neural
    network (FCNN) is proposed to automatically segment melanoma out of skin images
    by end-to-end learning with only pixels and labels as inputs. Our proposed FCNN
    is capable of using both local and global information to segment melanoma by
    adopting skipping layers. The public benchmark database consisting of 150
    validation images, 600 test images and 2000 training images in the melanoma
    detection challenge 2017 at International Symposium Biomedical Imaging 2017 is
    used to test the performance of our algorithm. All large images (for
    example, 4000 × 6000 pixels) are reduced to much smaller images of 384 × 384
    pixels (more than 10 times smaller). We obtained and submitted preliminary
    results to the challenge without any pre- or post-processing. The
    performance of our proposed method could be further improved by data
    augmentation and by avoiding image size reduction.

    Using Human Brain Activity to Guide Machine Learning

    Ruth Fong, Walter Scheirer, David Cox
    Comments: Supplemental material can be downloaded here: this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Machine learning is a field of computer science that builds algorithms that
    learn. In many cases, machine learning algorithms are used to recreate a human
    ability like adding a caption to a photo, driving a car, or playing a game.
    While the human brain has long served as a source of inspiration for machine
    learning, little effort has been made to directly use data collected from
    working brains as a guide for machine learning algorithms. Here we demonstrate
    a new paradigm of “neurally-weighted” machine learning, which takes fMRI
    measurements of human brain activity from subjects viewing images, and infuses
    these data into the training process of an object recognition learning
    algorithm to make it more consistent with the human brain. After training,
    these neurally-weighted classifiers are able to classify images without
    requiring any additional neural data. We show that our neural-weighting
    approach can lead to large performance gains when used with traditional machine
    vision features, as well as to significant improvements with already
    high-performing convolutional neural network features. The effectiveness of
    this approach points to a path forward for a new class of hybrid machine
    learning algorithms which take both inspiration and direct constraints from
    neuronal data.
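
    One plausible reading of "neurally-weighted" training, sketched with
    scikit-learn: per-image weights derived from fMRI responses (random
    placeholders here) bias a classifier toward examples that evoke strong
    brain activity. This is an illustration under that assumption, not the
    authors' exact formulation.

        import numpy as np
        from sklearn.svm import LinearSVC

        rng = np.random.default_rng(0)
        X = rng.standard_normal((200, 50))              # image features
        y = rng.integers(0, 2, size=200)                # object labels
        fmri_weights = rng.uniform(0.5, 2.0, size=200)  # stand-in neural weights

        # After fitting, the classifier needs no neural data at test time.
        clf = LinearSVC(dual=False).fit(X, y, sample_weight=fmri_weights)
        print(clf.score(X, y))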

    A New and Practical Design of Cancellable Biometrics: Index-of-Max Hashing

    Zhe Jin, Yen-Lung Lai, Jung-Yeon Hwang, Soohyung Kim, Andrew Beng Jin Teoh
    Comments: 13 pages, 7 figures, 6 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Although a variety of theoretically sound techniques have been proposed to
    generate cancellable biometric templates, there is rarely a practical
    solution that satisfies non-invertibility, revocability, non-linkability and
    performance simultaneously. In this paper, we propose a cancellable
    biometrics framework inspired by locality sensitive hashing, namely
    “Index-of-Max” (IoM) hashing. Briefly, IoM hashing non-linearly transforms a
    real-valued feature vector into a discrete index hashed code. We demonstrate
    two realizations of the IoM hashing framework, namely Gaussian Random
    Projection based and Uniformly Random Permutation based hashing schemes. The
    discrete-index nature of IoM hashed codes has several merits: it provides
    strong concealment of the biometric information, which contributes to a
    solid non-invertibility guarantee. IoM hashing is also insensitive to the
    magnitude of features, hence more robust against biometric feature
    variation, and its magnitude-independence trait makes the resultant hash
    codes scale-invariant, which is critical for matching and feature alignment.
    The experimental results demonstrate reasonable accuracy on the benchmark
    FVC2002 and FVC2004 fingerprint databases. Moreover, our analyses justify
    its resilience to existing and newly introduced security and privacy
    attacks, and show that it satisfies the revocability and unlinkability
    criteria of cancellable biometrics. Besides, the implementation of IoM
    hashing is remarkably simple for practical applications.
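
    A numpy sketch of the Gaussian Random Projection realization of IoM
    hashing; the parameter names m (number of hashed entries) and q (random
    directions per entry) are illustrative, not the paper's notation.

        import numpy as np

        def iom_hash(x, m=100, q=16, seed=0):
            rng = np.random.default_rng(seed)  # fixed seed = fixed hash keys
            codes = np.empty(m, dtype=np.int64)
            for i in range(m):
                R = rng.standard_normal((q, x.shape[0]))  # q random directions
                codes[i] = int(np.argmax(R @ x))          # index of max response
            return codes

        x = np.random.default_rng(1).standard_normal(128)  # e.g., a feature vector

        # Magnitude independence: positive scaling never changes an argmax,
        # so the hash code is scale-invariant.
        assert np.array_equal(iom_hash(x), iom_hash(3.0 * x))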

    Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

    Ke Gong, Xiaodan Liang, Xiaohui Shen, Liang Lin
    Comments: Accepted to appear in CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Human parsing has recently attracted a lot of research interest due to its
    huge application potential. However, existing datasets have a limited number
    of images and annotations, and lack variety of human appearances and coverage
    of challenging cases in unconstrained environments. In this paper, we
    introduce a new benchmark “Look into Person (LIP)” that makes a significant
    advance in terms of scalability, diversity and difficulty, a contribution that
    we feel is crucial for future developments in human-centric analysis. This
    comprehensive dataset contains over 50,000 elaborately annotated images with 19
    semantic part labels, which are captured from a wider range of viewpoints,
    occlusions and background complexity. Given these rich annotations we perform
    detailed analyses of the leading human parsing approaches, gaining insights
    into the successes and failures of these methods. Furthermore, in contrast to the
    existing efforts on improving the feature discriminative capability, we solve
    human parsing by exploring a novel self-supervised structure-sensitive learning
    approach, which imposes human pose structures into parsing results without
    resorting to extra supervision (i.e., no need for specifically labeling human
    joints in model training). Our self-supervised learning framework can be
    injected into any advanced neural networks to help incorporate rich high-level
    knowledge regarding human joints from a global perspective and improve the
    parsing results. Extensive evaluations on our LIP and the public
    PASCAL-Person-Part dataset demonstrate the superiority of our method.

    Convolutional Low-Resolution Fine-Grained Classification

    Dingding Cai, Ke Chen, Yanlin Qian, Joni-Kristian Kämäräinen
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Successful fine-grained image classification methods learn subtle details
    between visually similar (sub-)classes, but the problem becomes significantly
    more challenging if the details are missing due to low resolution. Encouraged
    by the recent success of Convolutional Neural Network (CNN) architectures in
    image classification, we propose a novel resolution-aware deep model which
    combines convolutional image super-resolution and convolutional fine-grained
    classification into a single model in an end-to-end manner. Extensive
    experiments on the Stanford Cars and Caltech-UCSD Birds 200-2011 benchmarks
    demonstrate that the proposed model consistently performs better than a
    conventional convolutional network at classifying fine-grained object classes in
    low-resolution images.

    Illuminant Estimation using Ensembles of Multivariate Regression Trees

    Peter van Beek, R. Wayne Oldford
    Comments: 20 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    White balancing is a fundamental step in the image processing pipeline. The
    process involves estimating the chromaticity of the illuminant or light source
    and using the estimate to correct the image to remove any color cast. Given the
    importance of the problem, there has been much previous work on illuminant
    estimation. Recently, an approach based on ensembles of univariate regression
    trees that are fit using the squared-error loss function has been proposed and
    shown to give excellent performance. In this paper, we show that a simpler and
    more accurate ensemble model can be learned by (i) using multivariate
    regression trees to take into account that the chromaticity components of the
    illuminant are correlated and constrained, and (ii) fitting each tree by
    directly minimizing a loss function of interest—such as recovery angular
    error or reproduction angular error—rather than indirectly using the
    squared-error loss function as a surrogate. We show empirically that overall
    our method leads to improved performance on diverse image sets.
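
    For reference, a small Python sketch of the recovery angular error
    mentioned above: the angle between the estimated and ground-truth
    illuminant vectors, which the multivariate trees are fit to minimize
    directly (the exact loss definitions are in the paper).

        import numpy as np

        def recovery_angular_error(estimate, truth):
            e = estimate / np.linalg.norm(estimate)
            t = truth / np.linalg.norm(truth)
            return np.degrees(np.arccos(np.clip(e @ t, -1.0, 1.0)))

        est = np.array([0.9, 1.0, 0.8])  # estimated illuminant (RGB)
        gt = np.array([1.0, 1.0, 1.0])   # ground-truth illuminant
        print(f"{recovery_angular_error(est, gt):.2f} degrees")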

    Cloud Radiative Effect Study Using Sky Camera

    Soumyabrata Dev, Shilpa Manandhar, Feng Yuan, Yee Hui Lee, Stefan Winkler
    Comments: Accepted in Proc. IEEE AP-S Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, 2017
    Subjects: Atmospheric and Oceanic Physics (physics.ao-ph); Computer Vision and Pattern Recognition (cs.CV)

    The analysis of clouds in the earth’s atmosphere is important for a variety
    of applications, viz. weather reporting, climate forecasting, and solar energy
    generation. In this paper, we focus our attention on the impact of cloud on the
    total solar irradiance reaching the earth’s surface. We use a weather station
    to record the total solar irradiance. Moreover, we employ a collocated
    ground-based sky camera to automatically compute the instantaneous cloud
    coverage. We
    analyze the relationship between measured solar irradiance and computed cloud
    coverage value, and conclude that higher cloud coverage greatly impacts the
    total solar irradiance. Such studies will immensely help in solar energy
    generation and forecasting.

    Combining Contrast Invariant L1 Data Fidelities with Nonlinear Spectral Image Decomposition

    Leonie Zeune, Stephan A. van Gils, Leon W.M.M. Terstappen, Christoph Brune
    Comments: 13 pages, 7 figures, conference SSVM 2017
    Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Spectral Theory (math.SP)

    This paper focuses on multi-scale approaches for variational methods and
    corresponding gradient flows. Recently, for convex regularization functionals
    such as total variation, new theory and algorithms for nonlinear eigenvalue
    problems via nonlinear spectral decompositions have been developed. Those
    methods open new directions for advanced image filtering. However, for an
    effective use in image segmentation and shape decomposition, a clear
    interpretation of the spectral response regarding size and intensity scales is
    needed but lacking in current approaches. In this context, \(L^1\) data
    fidelities are particularly helpful due to their interesting multi-scale
    properties such as contrast invariance. Hence, the novelty of this work is the
    combination of \(L^1\)-based multi-scale methods with nonlinear spectral
    decompositions. We compare \(L^1\) with \(L^2\) scale-space methods in view of
    spectral image representation and decomposition. We show that the contrast
    invariant multi-scale behavior of \(L^1\)-TV promotes sparsity in the spectral
    response, providing more informative decompositions. We provide a numerical
    method and analyze synthetic and biomedical images for which the
    decomposition leads to improved segmentation.

    Steganographic Generative Adversarial Networks

    Denis Volkhonskiy, Ivan Nazarov, Boris Borisenko, Evgeny Burnaev
    Comments: 8 pages, 2 figures, Workshop on Adversarial Training (NIPS 2016, Barcelona, Spain)
    Subjects: Multimedia (cs.MM); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)

    Steganography is a collection of methods for hiding secret information (the
    “payload”) within non-secret information (the “container”). Its counterpart,
    steganalysis, is the practice of determining whether a message contains a
    hidden payload, and recovering it if possible. The presence of hidden
    payloads is typically detected by a binary classifier. In the present study,
    we propose a new model for generating image-like containers based on Deep
    Convolutional Generative Adversarial Networks (DCGAN). This approach allows
    generating more steganalysis-secure message embeddings using standard
    steganography algorithms. Experimental results demonstrate that the new
    model successfully deceives the steganalyzer, and for this reason can be
    used in steganographic applications.

    Refining Image Categorization by Exploiting Web Images and General Corpus

    Yazhou Yao, Jian Zhang, Fumin Shen, Xiansheng Hua, Wankou Yang, Zhenmin Tang
    Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)

    Studies show that refining real-world categories into semantic subcategories
    contributes to better image modeling and classification. Previous image
    sub-categorization work relying on labeled images and WordNet’s hierarchy is
    not only labor-intensive, but also restricted to classify images into NOUN
    subcategories. To tackle these problems, in this work, we exploit general
    corpus information to automatically select and subsequently classify web images
    into semantically rich (sub-)categories. The following two major challenges are
    well studied: 1) noise in the labels of subcategories derived from the general
    corpus; 2) noise in the labels of images retrieved from the web. Specifically,
    we first obtain the semantically refined subcategories from the text
    perspective and remove the noise with a relevance-based approach. To suppress
    noisy images induced by search error, we then formulate image selection and
    classifier learning as a multi-class multi-instance learning problem and
    propose to solve it with the cutting-plane algorithm. The experiments show
    significant performance gains from using the data generated by our approach
    on both image categorization and sub-categorization tasks. The proposed approach also
    consistently outperforms existing weakly supervised and web-supervised
    approaches.

    3D Vision Guided Robotic Charging Station for Electric and Plug-in Hybrid Vehicles

    Justinas Miseikis, Matthias Ruther, Bernhard Walzel, Mario Hirz, Helmut Brunner
    Comments: 6 pages, 8 figures, OAGM and ARW Joint Workshop 2017 on Vision, Automation and Robotics
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    Electric vehicles (EVs) and plug-in hybrid vehicles (PHEVs) are rapidly
    gaining popularity on our roads. Besides a comparatively high purchasing price,
    the main two problems limiting their use are the short driving range and
    inconvenient charging process. In this paper we address the latter by
    presenting an automatic robot-based charging station with 3D vision guidance
    for plugging and unplugging the charger. First of all, the whole system
    concept, consisting of a 3D vision system, a UR10 robot and a charging
    station, is presented. Then we show the shape-based matching methods used to
    successfully identify and obtain the exact pose of the charging port. The
    same approach is used to calibrate the camera-robot system using just the
    known structure of the connector plug and no additional markers. Finally, a
    three-step robot motion
    planning procedure for plug-in is presented and functionality is demonstrated
    in a series of successful experiments.

    DeepVel: deep learning for the estimation of horizontal velocities at the solar surface

    A. Asensio Ramos (1,2), I. S. Requerey (1,2), N. Vitas (1,2) ((1) Instituto de Astrofisica de Canarias, (2) Universidad de La Laguna)
    Comments: 9 pages, 5 figures, submitted to A&A
    Subjects: Solar and Stellar Astrophysics (astro-ph.SR); Computer Vision and Pattern Recognition (cs.CV)

    Many phenomena taking place in the solar photosphere are controlled by plasma
    motions. Although the line-of-sight component of the velocity can be estimated
    using the Doppler effect, we do not have direct spectroscopic access to the
    components that are perpendicular to the line-of-sight. These components are
    typically estimated using methods based on local correlation tracking. We have
    designed DeepVel, an end-to-end deep neural network that produces an estimate
    of the velocity at every single pixel, every time step, and three different
    heights in the atmosphere from just two consecutive continuum images.
    We confront DeepVel with local correlation tracking, pointing out that they
    give very similar results in the time- and spatially-averaged cases. We use the
    network to study the evolution in height of the horizontal velocity field in
    fragmenting granules, supporting the buoyancy-braking mechanism for the
    formation of intergranular lanes in these granules. We also show that DeepVel
    can capture very small vortices, so that we can potentially expand the scaling
    cascade of vortices to very small sizes and durations.


    Artificial Intelligence

    ParaGraphE: A Library for Parallel Knowledge Graph Embedding

    Xiao-Fan Niu, Wu-Jun Li
    Subjects: Artificial Intelligence (cs.AI)

    Knowledge graph embedding aims at translating the knowledge graph into
    numerical representations by transforming the entities and relations into
    continuous low-dimensional vectors. Recently, many methods [1, 5, 3, 2, 6]
    have been proposed to deal with this problem, but existing single-thread
    implementations of them are time-consuming for large-scale knowledge graphs.
    Here, we design a unified parallel framework to parallelize these methods,
    which achieves a significant time reduction without influencing the accuracy.
    We name our framework ParaGraphE, which provides a library for parallel
    knowledge graph embedding. The source code can be downloaded from
    https://github.com/LIBBLE/LIBBLE-MultiThread/tree/master/ParaGraphE.
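
    For context, a numpy sketch of the TransE-style score that
    translation-based embedding methods optimize: a triple (head, relation,
    tail) is plausible when h + r is close to t, i.e., when ||h + r - t|| is
    small. The vectors below are random placeholders, not trained embeddings,
    and TransE is named here only as a representative method.

        import numpy as np

        dim = 50
        rng = np.random.default_rng(0)
        h = rng.standard_normal(dim)  # head entity embedding
        r = rng.standard_normal(dim)  # relation embedding
        t = rng.standard_normal(dim)  # tail entity embedding

        score = np.linalg.norm(h + r - t, ord=1)  # L1 distance; lower = better
        print(f"TransE score: {score:.3f}")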

    Efficient Online Learning for Optimizing Value of Information: Theory and Application to Interactive Troubleshooting

    Yuxin Chen, Jean-Michel Renders, Morteza Haghir Chehreghani, Andreas Krause
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We consider the optimal value of information (VoI) problem, where the goal is
    to sequentially select a set of tests with a minimal cost, so that one can
    efficiently make the best decision based on the observed outcomes. Existing
    algorithms are either heuristics with no guarantees, or scale poorly (with
    exponential run time in terms of the number of available tests). Moreover,
    these methods assume a known distribution over the test outcomes, which is
    often not the case in practice. We propose an efficient sampling-based online
    learning framework to address the above issues. First, assuming the
    distribution over hypotheses is known, we propose a dynamic hypothesis
    enumeration strategy, which allows efficient information gathering with strong
    theoretical guarantees. We show that with a sufficient number of samples, one can
    identify a near-optimal decision with high probability. Second, when the
    parameters of the hypotheses distribution are unknown, we propose an algorithm
    which learns the parameters progressively via posterior sampling in an online
    fashion. We further establish a rigorous bound on the expected regret. We
    demonstrate the effectiveness of our approach on a real-world interactive
    troubleshooting application and show that one can efficiently make high-quality
    decisions with low cost.

    Concentration Bounds for Two Timescale Stochastic Approximation with Applications to Reinforcement Learning

    Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor
    Subjects: Artificial Intelligence (cs.AI)

    Two-timescale Stochastic Approximation (SA) algorithms are widely used in
    Reinforcement Learning (RL). In such methods, the iterates consist of two parts
    that are updated using different stepsizes. We develop the first convergence
    rate result for these algorithms; in particular, we provide a general
    methodology for analyzing two-timescale linear SA. We apply our methodology to
    two-timescale RL algorithms such as GTD(0), GTD2, and TDC.
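
    A minimal Python sketch of the two-timescale pattern analyzed in the
    paper: the two parts of the iterate are updated with stepsizes that decay
    at different rates, so w moves on a faster timescale than theta. The noisy
    linear updates are toy stand-ins, not an actual RL algorithm.

        import numpy as np

        rng = np.random.default_rng(0)
        theta = np.zeros(4)  # slow iterate
        w = np.zeros(4)      # fast iterate

        for n in range(1, 10001):
            alpha = 1.0 / n        # slow stepsize
            beta = 1.0 / n ** 0.6  # fast stepsize: alpha/beta -> 0
            w += beta * (rng.standard_normal(4) - w)  # tracks its target quickly
            theta += alpha * (0.5 * w - theta)        # follows the slow dynamics

        print(theta)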

    Database Learning: Toward a Database that Becomes Smarter Every Time

    Yongjoo Park, Ahmad Shahab Tajik, Michael Cafarella, Barzan Mozafari
    Comments: This manuscript is an extended report of the work published in ACM SIGMOD conference 2017
    Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)

    In today’s databases, previous query answers rarely benefit answering future
    queries. For the first time, to the best of our knowledge, we change this
    paradigm in an approximate query processing (AQP) context. We make the
    following observation: the answer to each query reveals some degree of
    knowledge about the answer to another query because their answers stem from the
    same underlying distribution that has produced the entire dataset. Exploiting
    and refining this knowledge should allow us to answer queries more
    analytically, rather than by reading enormous amounts of raw data. Also,
    processing more queries should continuously enhance our knowledge of the
    underlying distribution, and hence lead to increasingly faster response times
    for future queries.

    We call this novel idea—learning from past query answers—Database
    Learning. We exploit the principle of maximum entropy to produce answers, which
    are in expectation guaranteed to be more accurate than existing sample-based
    approximations. Empowered by this idea, we build a query engine on top of Spark
    SQL, called Verdict. We conduct extensive experiments on real-world query
    traces from a large customer of a major database vendor. Our results
    demonstrate that database learning supports 73.7% of these queries, speeding
    them up by up to 23.0x for the same accuracy level compared to existing AQP
    systems.

    Minimax Regret Bounds for Reinforcement Learning

    Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We consider the problem of efficient exploration in finite horizon MDPs. We
    show that an optimistic modification to model-based value iteration can
    achieve a regret bound \(\tilde{O}(\sqrt{HSAT} + H^2S^2A + H\sqrt{T})\)
    where \(H\) is the time horizon, \(S\) the number of states, \(A\) the
    number of actions and \(T\) the time elapsed. This result improves over the
    best previous known bound \(\tilde{O}(HS\sqrt{AT})\) achieved by the UCRL2
    algorithm. The key significance of our new results is that when
    \(T \geq H^3S^3A\) and \(SA \geq H\), it leads to a regret of
    \(\tilde{O}(\sqrt{HSAT})\) that matches the established lower bound of
    \(\Omega(\sqrt{HSAT})\) up to a logarithmic factor. Our analysis contains
    two key insights. We use careful application of concentration inequalities
    to the optimal value function as a whole, rather than to the transition
    probabilities (to improve scaling in \(S\)), and we use “exploration
    bonuses” based on Bernstein’s inequality, together with a recursive,
    Bellman-type, law of total variance (to improve scaling in \(H\)).

    Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

    Ke Gong, Xiaodan Liang, Xiaohui Shen, Liang Lin
    Comments: Accepted to appear in CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Human parsing has recently attracted a lot of research interest due to its
    huge application potential. However, existing datasets have a limited number
    of images and annotations, and lack variety of human appearances and coverage
    of challenging cases in unconstrained environments. In this paper, we
    introduce a new benchmark “Look into Person (LIP)” that makes a significant
    advance in terms of scalability, diversity and difficulty, a contribution that
    we feel is crucial for future developments in human-centric analysis. This
    comprehensive dataset contains over 50,000 elaborately annotated images with 19
    semantic part labels, which are captured from a wider range of viewpoints,
    occlusions and background complexity. Given these rich annotations we perform
    detailed analyses of the leading human parsing approaches, gaining insights
    into the successes and failures of these methods. Furthermore, in contrast to the
    existing efforts on improving the feature discriminative capability, we solve
    human parsing by exploring a novel self-supervised structure-sensitive learning
    approach, which imposes human pose structures into parsing results without
    resorting to extra supervision (i.e., no need for specifically labeling human
    joints in model training). Our self-supervised learning framework can be
    injected into any advanced neural networks to help incorporate rich high-level
    knowledge regarding human joints from a global perspective and improve the
    parsing results. Extensive evaluations on our LIP and the public
    PASCAL-Person-Part dataset demonstrate the superiority of our method.

    Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

    Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
    Comments: Submitted to Interspeech 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Keyword spotting (KWS) constitutes a major component of human-technology
    interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate,
    while minimizing the footprint size, latency and complexity are the goals for
    KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks
    (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems,
    we combine the strengths of convolutional layers and recurrent layers to
    exploit local structure and long-range context. We analyze the effect of
    architecture parameters, and propose training strategies to improve
    performance. With only ~230k parameters, our CRNN model yields acceptably low
    latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise
    ratio.
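
    A PyTorch sketch of such a convolutional-recurrent arrangement; the layer
    sizes are illustrative guesses, not the paper's exact ~230k-parameter
    configuration.

        import torch
        import torch.nn as nn

        class CRNN(nn.Module):
            def __init__(self, n_mels=40, n_classes=2):
                super().__init__()
                # Convolutional front end: local time-frequency structure.
                self.conv = nn.Conv2d(1, 32, kernel_size=(20, 5), stride=(8, 2))
                # Recurrent layer: long-range temporal context.
                self.gru = nn.GRU(input_size=32 * 3, hidden_size=64,
                                  batch_first=True, bidirectional=True)
                self.fc = nn.Linear(2 * 64, n_classes)

            def forward(self, x):                     # x: (B, 1, n_mels, time)
                z = torch.relu(self.conv(x))          # (B, 32, 3, T')
                z = z.permute(0, 3, 1, 2).flatten(2)  # (B, T', 96)
                out, _ = self.gru(z)
                return self.fc(out[:, -1])            # keyword score per clip

        model = CRNN()
        scores = model(torch.randn(8, 1, 40, 151))    # 8 log-mel spectrograms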

    Legal Question Answering using Ranking SVM and Deep Convolutional Neural Network

    Phong-Khac Do, Huy-Tien Nguyen, Chien-Xuan Tran, Minh-Tien Nguyen, Minh-Le Nguyen
    Comments: 15 pages, 2 figures, Tenth International Workshop on Juris-informatics (JURISIN 2016) associated with JSAI International Symposia on AI 2016 (IsAI-2016)
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper presents a study of employing Ranking SVM and a Convolutional
    Neural Network for two tasks: legal information retrieval and question
    answering in the Competition on Legal Information Extraction/Entailment. For
    the first task, our proposed model uses a triple of features (LSI, Manhattan,
    Jaccard) and operates at the paragraph level instead of the article level as
    in previous studies. In fact, each single-paragraph article corresponds to a
    particular paragraph in a huge multiple-paragraph article. For the legal
    question answering task, additional statistical features from the information
    retrieval task, integrated into the Convolutional Neural Network, contribute
    to higher accuracy.


    Information Retrieval

    Improving Document Clustering by Eliminating Unnatural Language

    Myunga Jang, Jinho D. Choi, James Allan
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Technical documents contain a fair amount of unnatural language, such as
    tables, formulas, pseudo-code, etc. Unnatural language can be an important
    source of confusion for existing NLP tools. This paper presents an effective
    method for distinguishing unnatural language from natural language, and
    evaluates the impact of unnatural language detection on NLP tasks such as
    document clustering. We view this problem as an information extraction task
    and build a multiclass classification model that classifies unnatural
    language components into four categories. First, we create a new annotated
    corpus by collecting slides and papers in various formats (PPT, PDF, and
    HTML), where unnatural language components are annotated into four
    categories. We then explore features available from plain text to build a
    statistical model that can handle any format as long as it is converted into
    plain text. Our experiments show that removing unnatural language components
    gives an absolute improvement in document clustering of up to 15%. Our
    corpus and tool are publicly available.


    Computation and Language

    Neobility at SemEval-2017 Task 1: An Attention-based Sentence Similarity Model

    Wenli Zhuang, Ernie Chang
    Subjects: Computation and Language (cs.CL)

    This paper describes a neural-network model which performed competitively
    (top 6) at the SemEval 2017 cross-lingual Semantic Textual Similarity (STS)
    task. Our system employs an attention-based recurrent neural network model that
    optimizes the sentence similarity. In this paper, we describe our participation
    in the multilingual STS task which measures similarity across English, Spanish,
    and Arabic.

    End-to-end optimization of goal-driven and visually grounded dialogue systems

    Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin
    Subjects: Computation and Language (cs.CL)

    End-to-end design of dialogue systems has recently become a popular research
    topic thanks to powerful tools such as encoder-decoder architectures for
    sequence-to-sequence learning. Yet, most current approaches cast human-machine
    dialogue management as a supervised learning problem, aiming at predicting the
    next utterance of a participant given the full history of the dialogue. This
    vision is too simplistic to capture the intrinsic planning problem inherent to
    dialogue as well as its grounded nature, which makes the context of a dialogue
    larger than the history alone. This is why only chit-chat and question answering
    tasks have been addressed so far using end-to-end architectures. In this paper,
    we introduce a Deep Reinforcement Learning method to optimize visually grounded
    task-oriented dialogues, based on the policy gradient algorithm. This approach
    is tested on a dataset of 120k dialogues collected through Mechanical Turk and
    provides encouraging results at solving both the problem of generating natural
    dialogues and the task of discovering a specific object in a complex picture.

    Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

    Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
    Comments: Submitted to Interspeech 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Keyword spotting (KWS) constitutes a major component of human-technology
    interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate,
    while minimizing the footprint size, latency and complexity are the goals for
    KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks
    (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems,
    we combine the strengths of convolutional layers and recurrent layers to
    exploit local structure and long-range context. We analyze the effect of
    architecture parameters, and propose training strategies to improve
    performance. With only ~230k parameters, our CRNN model yields acceptably low
    latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise
    ratio.

    Legal Question Answering using Ranking SVM and Deep Convolutional Neural Network

    Phong-Khac Do, Huy-Tien Nguyen, Chien-Xuan Tran, Minh-Tien Nguyen, Minh-Le Nguyen
    Comments: 15 pages, 2 figures, Tenth International Workshop on Juris-informatics (JURISIN 2016) associated with JSAI International Symposia on AI 2016 (IsAI-2016)
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper presents a study of employing Ranking SVM and a Convolutional
    Neural Network for two tasks: legal information retrieval and question
    answering in the Competition on Legal Information Extraction/Entailment. For
    the first task, our proposed model uses a triple of features (LSI, Manhattan,
    Jaccard) and operates at the paragraph level instead of the article level as
    in previous studies. In fact, each single-paragraph article corresponds to a
    particular paragraph in a huge multiple-paragraph article. For the legal
    question answering task, additional statistical features from the information
    retrieval task, integrated into the Convolutional Neural Network, contribute
    to higher accuracy.

    Improving Document Clustering by Eliminating Unnatural Language

    Myunga Jang, Jinho D. Choi, James Allan
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Technical documents contain a fair amount of unnatural language, such as
    tables, formulas, pseudo-code, etc. Unnatural language can be an important
    source of confusion for existing NLP tools. This paper presents an effective
    method for distinguishing unnatural language from natural language, and
    evaluates the impact of unnatural language detection on NLP tasks such as
    document clustering. We view this problem as an information extraction task
    and build a multiclass classification model that classifies unnatural
    language components into four categories. First, we create a new annotated
    corpus by collecting slides and papers in various formats (PPT, PDF, and
    HTML), where unnatural language components are annotated into four
    categories. We then explore features available from plain text to build a
    statistical model that can handle any format as long as it is converted into
    plain text. Our experiments show that removing unnatural language components
    gives an absolute improvement in document clustering of up to 15%. Our
    corpus and tool are publicly available.


    Distributed, Parallel, and Cluster Computing

    The challenge of decentralized marketplaces

    Bas van IJzendoorn
    Comments: responsible teacher: Johan Pouwelse
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR); Computers and Society (cs.CY)

    Online trust systems play an important role in today's world, and building
    them poses various challenges. Billions of dollars of products and services
    are traded through electronic commerce, files are shared among large
    peer-to-peer networks, and smart contracts can potentially replace paper
    contracts with digital contracts. These systems rely on trust mechanisms in
    peer-to-peer networks like reputation systems or a trustless public ledger.
    In most cases, reputation systems are built to determine the trustworthiness
    of users and to provide incentives for users to make a fair contribution to
    the peer-to-peer network. The main challenges are how to set up a good trust
    system, how to deal with security issues, and how to deal with strategic
    users trying to cheat the system. The Sybil attack, the most important
    attack on reputation systems, is discussed. Finally, matchmaking in
    two-sided markets and the strategy-proofness of these markets are discussed.

    Replicable Parallel Branch and Bound Search

    Blair Archibald, Ciaran McCreesh, Patrick Maier, Rob Stewart, Phil Trinder
    Comments: 38 pages, 12 figures, submitted to the Journal of Parallel and Distributed Computing
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Branch and bound searches are a common technique for solving global
    optimisation and decision problems, yet their irregularity, search order
    dependence, and the need to share bound information globally make it
    challenging to implement them in parallel, and to reason about their parallel
    performance.

    We identify three key parallel search properties for replicable branch and
    bound implementations: Sequential Lower Bound, Non-increasing Runtimes, and
    Repeatability. We define a formal model for parallel branch and bound search
    problems and show its generality by using it to define three benchmarks:
    finding a Maximum Clique in a graph, 0/1 Knapsack and Travelling Salesperson
    (TSP). We present a Generic Branch and Bound search API that conforms to the
    model. For reusability we encapsulate the search behaviours as a pair of
    algorithmic skeletons in a distributed-memory parallel Haskell. Crucially,
    the Ordered skeleton is designed to guarantee the parallel search properties,
    potentially at a performance cost compared with the Unordered skeleton.

    We compare the sequential performance of the skeletons with a class leading
    C++ search implementation. We then use relative speedups to evaluate the
    skeletons for 40 benchmark instances on a cluster using 200 workers. The
    Ordered skeleton preserves the Sequential Lower Bound for all benchmark
    instances while the Unordered skeleton violates the property for 5 TSP
    instances. The Ordered skeleton preserves Non-increasing Runtimes for all
    benchmark instances while the Unordered skeleton violates the property for many
    instances of all three benchmarks. The Ordered skeleton delivers far more
    repeatable performance than the Unordered skeleton (Repeatability property)
    with a median relative standard deviation (RSD) of 1.78% vs 5.56%, 1.83% vs
    87.56% and 1.96% vs 8.61% for all Maximum Clique, Knapsack and TSP instances
    respectively.
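
    A sequential Python sketch of the generic branch-and-bound pattern that
    the skeletons parallelize (the paper's API is in Haskell; the names here
    are illustrative). For a maximisation problem:

        def branch_and_bound(root, branch, bound, value):
            """branch(node) -> children; bound(node) -> optimistic upper bound;
            value(node) -> objective of the (partial) solution at node."""
            best_value, best_node = value(root), root
            stack = [root]
            while stack:
                node = stack.pop()
                if bound(node) <= best_value:
                    continue                  # prune: cannot beat the incumbent
                if value(node) > best_value:  # new incumbent found
                    best_value, best_node = value(node), node
                stack.extend(branch(node))    # search order affects pruning,
            return best_value, best_node      # hence runtimes and repeatability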

    VieM v1.00 — Vienna Mapping and Sparse Quadratic Assignment User Guide

    Christian Schulz, Jesper Larsson Träff
    Comments: arXiv admin note: text overlap with arXiv:1311.1714
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)

    This paper serves as a user guide to the mapping framework VieM (Vienna
    Mapping and Sparse Quadratic Assignment). We give a rough overview of the
    techniques used within the framework and describe the user interface as well as
    the file formats used.

    Lower Bounds and Algorithm for Partially Replicated Causally Consistent Shared Memory

    Zhuolun Xiang, Nitin H. Vaidya
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Distributed shared memory systems maintain multiple replicas of the shared
    memory locations. Maintaining causal consistency in such systems has received
    significant attention in the past. However, much of the previous literature
    focuses on full replication wherein each replica stores a copy of all the
    locations in the shared memory. In this paper, we investigate causal
    consistency in partially replicated systems, wherein each replica may store
    only a subset of the shared data. To achieve causal consistency, it is
    necessary to ensure that, before an update is performed at any given replica,
    all causally preceding updates must also be performed. Achieving this goal
    requires some mechanism to track causal dependencies. In the context of full
    replication, this goal is often achieved using vector timestamps, with the
    number of vector elements being equal to the number of replicas. Building on
    the past work, this paper makes three key contributions: 1. We develop lower
    bounds on the size of the timestamps that must be maintained in order to
    achieve causal consistency in partially replicated systems. The size of the
    timestamps is a function of the manner in which the replicas share data, and
    the set of replicas accessed by each client. 2. We present an algorithm to
    achieve causal consistency in partially replicated systems using simple vector
    timestamps. 3. We present some optimizations to improve the overhead of the
    timestamps required with partial replication.
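
    A Python sketch of the full-replication baseline described above: vector
    timestamps with one entry per replica, where an update from a sender may
    be applied only after all causally preceding updates have been applied
    locally. (The paper's partial-replication timestamps are more involved.)

        def causally_ready(update_vt, sender, local_vt):
            """True iff the update is the next event from sender and every
            update it causally depends on has already been applied here."""
            return all(
                update_vt[i] == local_vt[i] + 1 if i == sender
                else update_vt[i] <= local_vt[i]
                for i in range(len(local_vt))
            )

        def apply_update(update_vt, local_vt):
            """Advance the local timestamp by a pointwise maximum."""
            return [max(u, l) for u, l in zip(update_vt, local_vt)]

        local = [1, 0, 0]  # this replica has seen one update from replica 0
        print(causally_ready([1, 1, 0], sender=1, local_vt=local))  # True
        print(causally_ready([2, 1, 0], sender=1, local_vt=local))  # False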

    Proof of Luck: an Efficient Blockchain Consensus Protocol

    Mitar Milutinovic, Warren He, Howard Wu, Maxinder Kanwal
    Comments: SysTEX ’16, December 12-16, 2016, Trento, Italy
    Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

    In the paper, we present designs for multiple blockchain consensus primitives
    and a novel blockchain system, all based on the use of trusted execution
    environments (TEEs), such as Intel SGX-enabled CPUs. First, we show how using
    TEEs for existing proof of work schemes can make mining equitably distributed
    by preventing the use of ASICs. Next, we extend the design with proof of time
    and proof of ownership consensus primitives to make mining energy- and
    time-efficient. Further improving on these designs, we present a blockchain
    using a proof of luck consensus protocol. Our proof of luck blockchain uses a
    TEE platform’s random number generation to choose a consensus leader, which
    offers low-latency transaction validation, deterministic confirmation time,
    negligible energy consumption, and equitably distributed mining. Lastly, we
    discuss a potential protection against up to a constant number of compromised
    TEEs.
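
    As a toy illustration of the leader-election idea (heavily simplified: an
    honest TEE is modeled as a trusted uniform random draw, and all names below
    are hypothetical):

        import secrets

        def tee_luck():
            # stand-in for a TEE-attested uniform random draw in [0, 1)
            return secrets.randbits(53) / 2.0**53

        def chain_luck(chain):
            # a chain's accumulated luck decides the winning fork
            return sum(block["luck"] for block in chain)

        proposals = [[{"txs": f"payload-{i}", "luck": tee_luck()}] for i in range(5)]
        winner = max(proposals, key=chain_luck)  # luckiest chain wins, no hashing race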


    Learning

    Shift Aggregate Extract Networks

    Francesco Orsini, Daniele Baracchi, Paolo Frasconi
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We introduce an architecture based on deep hierarchical decompositions to
    learn effective representations of large graphs. Our framework extends classic
    R-decompositions used in kernel methods, enabling nested “part-of-part”
    relations. Unlike recursive neural networks, which unroll a template on input
    graphs directly, we unroll a neural network template over the decomposition
    hierarchy, allowing us to deal with the high degree variability that typically
    characterizes social network graphs. Deep hierarchical decompositions are also
    amenable to domain compression, a technique that reduces both space and time
    complexity by exploiting symmetries. We show empirically that our approach is
    competitive with current state-of-the-art graph classification methods,
    particularly when dealing with social network datasets.

    Aggregation of Classifiers: A Justifiable Information Granularity Approach

    Tien Thanh Nguyen, Xuan Cuong Pham, Alan Wee-Chung Liew, Witold Pedrycz
    Comments: 33 pages, 3 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this study, we introduce a new approach to combining multiple classifiers in
    an ensemble system. Instead of using the numeric membership values encountered
    in fixed combining rules, we construct interval membership values associated
    with each class prediction at the meta-data level of each observation, using
    concepts of information granules. In the proposed method, uncertainty
    (diversity) of findings produced by the base classifiers is quantified by
    interval-based information granules. The discriminative decision model is
    generated by considering both the bounds and the length of the obtained
    intervals. We select ten and then fifteen learning algorithms to build
    heterogeneous ensemble systems and conduct experiments on a number of
    UCI datasets. The experimental results demonstrate that the proposed approach
    performs better than the benchmark algorithms including six fixed combining
    methods, one trainable combining method, AdaBoost, Bagging, and Random
    Subspace.

    Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

    Sainbayar Sukhbaatar, Ilya Kostrikov, Arthur Szlam, Rob Fergus
    Subjects: Learning (cs.LG)

    We describe a simple scheme that allows an agent to explore its environment
    in an unsupervised manner. Our scheme pits two versions of the same agent,
    Alice and Bob, against one another. Alice proposes a task for Bob to complete;
    and then Bob attempts to complete the task. In this work we focus on
    (nearly) reversible environments and environments that can be reset: Alice
    “proposes” a task by running a set of actions, and then Bob must partially
    undo them or repeat them, respectively. Via an appropriate reward structure, Alice
    and Bob automatically generate a curriculum of exploration, enabling
    unsupervised training of the agent. When deployed on an RL task within the
    environment, this unsupervised training reduces the number of episodes needed
    to learn.
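
    One plausible instantiation of such a reward structure, with an illustrative
    scale rather than the paper's exact constants, pays Alice for tasks that take
    Bob longer than they took her and charges Bob for every step he needs:

        GAMMA = 0.01  # per-step reward scale (hypothetical)

        def alice_reward(t_alice, t_bob):
            # Alice gains when Bob struggles, but only relative to her own
            # effort, pushing proposals to the frontier of Bob's ability.
            return GAMMA * max(0, t_bob - t_alice)

        def bob_reward(t_bob):
            # Bob is rewarded for completing the proposed task quickly.
            return -GAMMA * t_bob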

    Bayesian Sketch Learning for Program Synthesis

    Vijayaraghavan Murali, Swarat Chaudhuri, Chris Jermaine
    Subjects: Programming Languages (cs.PL); Learning (cs.LG)

    We present a data-driven approach to the problem of inductive computer
    program synthesis. Our method learns a probabilistic model for real-world
    programs from a corpus of existing code. It uses this model during synthesis to
    automatically infer a posterior distribution over sketches, or syntactic models
    of the problem to be synthesized. Sketches sampled from this posterior are then
    used to drive combinatorial synthesis of a program in a high-level programming
    language.

    The key technical innovation of our approach — embodied in a system called
    Bayou — is utilizing user-supplied evidence as to the program’s desired
    behavior, along with a Bayesian update, to obtain a posterior distribution over
    the program’s true, latent specification (indicating user intent), which in
    turn produces a posterior over possible sketches. As we show experimentally,
    explicitly modeling uncertainty in specification significantly increases the
    accuracy of the synthesis algorithm. We evaluate Bayou’s ability to synthesize
    Java and Android methods. We find that using just a few example API sequences
    to communicate user intent, Bayou can synthesize complex method bodies, some
    implementing tasks never encountered during training.

    End-to-End Learning for Structured Prediction Energy Networks

    David Belanger, Bishan Yang, Andrew McCallum
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Structured Prediction Energy Networks (Belanger and McCallum, 2016) (SPENs)
    are a simple, yet expressive family of structured prediction models. An energy
    function over candidate structured outputs is given by a deep network, and
    predictions are formed by gradient-based optimization. Unfortunately, we have
    struggled to apply the structured SVM (SSVM) learning method of Belanger and
    McCallum (2016) to applications with more complex structure than multi-label
    classification. In general, SSVMs are unreliable whenever exact energy
    minimization is intractable. In response, we present end-to-end learning for
    SPENs, where the energy function is discriminatively trained by
    back-propagating through gradient-based prediction. This paper presents a
    collection of methods necessary to apply the technique to problems with complex
    structure. For example, we avoid vanishing gradients when learning SPENs for
    convex relaxations of discrete prediction problems and explicitly train models
    such that energy minimization converges quickly in practice. Using end-to-end
    learning, we demonstrate the power of SPENs on 7-Scenes depth image denoising
    and CoNLL-2005 semantic role labeling tasks. In both, we outperform competitive
    baselines that employ more simplistic energy functions, but perform exact
    energy minimization. In particular, for denoising we achieve 40 PSNR,
    outperforming the previous state-of-the-art of 36.
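
    The core trick of backpropagating a task loss through gradient-based
    prediction can be sketched in a few lines of PyTorch; the toy energy network,
    dimensions, and step sizes below are illustrative assumptions only.

        import torch

        D_X, D_Y, T, ETA = 8, 4, 10, 0.1     # toy sizes, unroll length, step size

        energy_net = torch.nn.Sequential(    # E(x, y): small feed-forward energy
            torch.nn.Linear(D_X + D_Y, 16), torch.nn.Softplus(), torch.nn.Linear(16, 1))

        def predict(x):
            y = torch.zeros(x.shape[0], D_Y, requires_grad=True)
            for _ in range(T):               # unrolled gradient-based prediction
                e = energy_net(torch.cat([x, y], dim=1)).sum()
                (g,) = torch.autograd.grad(e, y, create_graph=True)  # keep graph
                y = y - ETA * g
            return y

        opt = torch.optim.Adam(energy_net.parameters(), lr=1e-3)
        x, y_true = torch.randn(32, D_X), torch.rand(32, D_Y)
        for _ in range(100):                 # end-to-end training loop
            loss = ((predict(x) - y_true) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()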

    Convolutional neural network architecture for geometric matching

    Ignacio Rocco, Relja Arandjelović, Josef Sivic
    Comments: To appear in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We address the problem of determining correspondences between two images in
    agreement with a geometric model such as an affine or thin-plate-spline
    transformation, and estimating its parameters. The contributions of this work
    are three-fold. First, we propose a convolutional neural network architecture
    for geometric matching. The architecture is based on three main components that
    mimic the standard steps of feature extraction, matching and simultaneous
    inlier detection and model parameter estimation, while being trainable
    end-to-end. Second, we demonstrate that the network parameters can be trained
    from synthetically generated imagery without the need for manual annotation and
    that our matching layer significantly increases generalization capabilities to
    never seen before images. Finally, we show that the same model can perform both
    instance-level and category-level matching giving state-of-the-art results on
    the challenging Proposal Flow dataset.
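
    The matching step can be illustrated as a dense correlation layer between
    L2-normalized feature maps, followed by a small regressor producing the
    transformation parameters; the sizes and the regressor below are illustrative
    assumptions rather than the paper's exact configuration.

        import torch

        def correlation_layer(f_a, f_b):
            # dense pairwise similarities between all spatial locations
            b, c, h, w = f_a.shape
            f_a = torch.nn.functional.normalize(f_a, dim=1).flatten(2)  # (b, c, hw)
            f_b = torch.nn.functional.normalize(f_b, dim=1).flatten(2)
            corr = torch.bmm(f_b.transpose(1, 2), f_a)                  # (b, hw, hw)
            return corr.view(b, h * w, h, w)

        corr = correlation_layer(torch.randn(2, 64, 15, 15), torch.randn(2, 64, 15, 15))
        regressor = torch.nn.Sequential(     # maps correlation maps to 6 affine params
            torch.nn.Conv2d(15 * 15, 128, 7, padding=3), torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(128, 6))
        theta = regressor(corr)              # (2, 6) estimated transformations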

    Fraternal Twins: Unifying Attacks on Machine Learning and Digital Watermarking

    Erwin Quiring, Daniel Arp, Konrad Rieck
    Subjects: Cryptography and Security (cs.CR); Learning (cs.LG)

    Machine learning is increasingly used in security-critical applications, such
    as autonomous driving, face recognition and malware detection. Most learning
    methods, however, have not been designed with security in mind and thus are
    vulnerable to different types of attacks. This problem has motivated the
    research field of adversarial machine learning that is concerned with attacking
    and defending learning methods. Concurrently, a different line of research has
    tackled a very similar problem: in digital watermarking, information is
    embedded in a signal in the presence of an adversary. As a consequence, this
    research field has also extensively studied techniques for attacking and
    defending watermarking methods.

    The two research communities have so far worked in parallel, developing
    similar attack and defense strategies without taking notice of each other.
    This paper is a first effort
    to bring these communities together. To this end, we present a unified notation
    of black-box attacks against machine learning and watermarking that reveals the
    similarity of both settings. To demonstrate the efficacy of this unified view,
    we apply concepts from watermarking to machine learning and vice versa. We show
    that countermeasures from watermarking can mitigate recent model-extraction
    attacks and, similarly, that techniques for hardening machine learning can fend
    off oracle attacks against watermarks. Our work provides a conceptual link
    between two research fields and thereby opens novel directions for improving
    the security of both, machine learning and digital watermarking.

    Using Reinforcement Learning for Demand Response of Domestic Hot Water Buffers: a Real-Life Demonstration

    Oscar De Somer, Ana Soares, Tristan Kuijpers, Koen Vossen, Koen Vanthournout, Fred Spiessens
    Comments: Submitted to IEEE ISGT Europe 2017
    Subjects: Systems and Control (cs.SY); Learning (cs.LG)

    This paper demonstrates a data-driven control approach for demand response in
    real-life residential buildings. The objective is to optimally schedule the
    heating cycles of the Domestic Hot Water (DHW) buffer to maximize the
    self-consumption of the local photovoltaic (PV) production. A model-based
    reinforcement learning technique is used to tackle the underlying sequential
    decision-making problem. The proposed algorithm learns the stochastic occupant
    behavior, predicts the PV production and takes into account the dynamics of the
    system. A real-life experiment with six residential buildings is performed
    using this algorithm. The results show that the self-consumption of the PV
    production is significantly increased, compared to the default thermostat
    control.

    Efficient Online Learning for Optimizing Value of Information: Theory and Application to Interactive Troubleshooting

    Yuxin Chen, Jean-Michel Renders, Morteza Haghir Chehreghani, Andreas Krause
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We consider the optimal value of information (VoI) problem, where the goal is
    to sequentially select a set of tests with a minimal cost, so that one can
    efficiently make the best decision based on the observed outcomes. Existing
    algorithms are either heuristics with no guarantees, or scale poorly (with
    exponential run time in terms of the number of available tests). Moreover,
    these methods assume a known distribution over the test outcomes, which is
    often not the case in practice. We propose an efficient sampling-based online
    learning framework to address the above issues. First, assuming the
    distribution over hypotheses is known, we propose a dynamic hypothesis
    enumeration strategy, which allows efficient information gathering with strong
    theoretical guarantees. We show that with a sufficient number of samples, one can
    identify a near-optimal decision with high probability. Second, when the
    parameters of the hypotheses distribution are unknown, we propose an algorithm
    which learns the parameters progressively via posterior sampling in an online
    fashion. We further establish a rigorous bound on the expected regret. We
    demonstrate the effectiveness of our approach on a real-world interactive
    troubleshooting application and show that one can efficiently make high-quality
    decisions with low cost.

    Minimax Regret Bounds for Reinforcement Learning

    Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We consider the problem of efficient exploration in finite horizon MDPs. We
    show that an optimistic modification to model-based value iteration can
    achieve a regret bound \(\tilde{O}(\sqrt{HSAT} + H^2S^2A + H\sqrt{T})\) where \(H\)
    is the time horizon, \(S\) the number of states, \(A\) the number of actions and
    \(T\) the time elapsed. This result improves over the best previously known bound
    \(\tilde{O}(HS\sqrt{AT})\) achieved by the UCRL2 algorithm. The key significance
    of our new results is that when \(T \geq H^3S^3A\) and \(SA \geq H\), it leads to a
    regret of \(\tilde{O}(\sqrt{HSAT})\) that matches the established lower bound of
    \(\Omega(\sqrt{HSAT})\) up to a logarithmic factor. Our analysis contains two key
    insights. We use a careful application of concentration inequalities to the
    optimal value function as a whole, rather than to the transition probabilities
    (to improve scaling in \(S\)), and we use “exploration bonuses” based on
    Bernstein’s inequality, together with a recursive, Bellman-type law of total
    variance (to improve scaling in \(H\)).
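
    A simplified sketch of the optimistic backward induction is given below; for
    brevity it uses a Hoeffding-style count-based bonus as a stand-in for the
    paper's Bernstein bonuses, and all shapes are illustrative.

        import numpy as np

        def optimistic_values(P_hat, R, counts, H, c=1.0):
            """P_hat: (S, A, S) empirical transitions, R: (S, A) rewards,
            counts: (S, A) visit counts, H: horizon."""
            S, A = R.shape
            V = np.zeros((H + 1, S))
            Q = np.zeros((H, S, A))
            bonus = c * H * np.sqrt(1.0 / np.maximum(counts, 1))  # optimism
            for h in range(H - 1, -1, -1):                        # backward induction
                Q[h] = R + bonus + P_hat @ V[h + 1]
                V[h] = np.minimum(Q[h].max(axis=1), H - h)        # clip at max return
            return Q, V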

    Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

    Ke Gong, Xiaodan Liang, Xiaohui Shen, Liang Lin
    Comments: Accepted to appear in CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Human parsing has recently attracted a lot of research interest due to its
    huge application potential. However, existing datasets have a limited number of
    images and annotations, and lack variety in human appearance and
    coverage of challenging cases in unconstrained environments. In this paper, we
    introduce a new benchmark “Look into Person (LIP)” that makes a significant
    advance in terms of scalability, diversity and difficulty, a contribution that
    we feel is crucial for future developments in human-centric analysis. This
    comprehensive dataset contains over 50,000 elaborately annotated images with 19
    semantic part labels, which are captured from a wider range of viewpoints,
    occlusions and background complexity. Given these rich annotations we perform
    detailed analyses of the leading human parsing approaches, gaining insights
    into the success and failures of these methods. Furthermore, in contrast to the
    existing efforts on improving the feature discriminative capability, we solve
    human parsing by exploring a novel self-supervised structure-sensitive learning
    approach, which imposes human pose structures into parsing results without
    resorting to extra supervision (i.e., no need for specifically labeling human
    joints in model training). Our self-supervised learning framework can be
    injected into any advanced neural networks to help incorporate rich high-level
    knowledge regarding human joints from a global perspective and improve the
    parsing results. Extensive evaluations on our LIP and the public
    PASCAL-Person-Part dataset demonstrate the superiority of our method.

    Cost-complexity pruning of random forests

    Kiran Bangalore Ravi, Jean Serra
    Comments: Accepted at ISMM 2017
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Random forests perform bootstrap-aggregation by sampling the training samples
    with replacement. This enables the evaluation of the out-of-bag error, which
    serves as an internal cross-validation mechanism. Our motivation lies in using the
    unsampled training samples to improve each decision tree in the ensemble. We
    study the effect of using the out-of-bag samples to improve the generalization
    error first of the decision trees and second the random forest by post-pruning.
    A preliminary empirical study on four UCI repository datasets shows a consistent
    decrease in the size of the forests without considerable loss in accuracy.
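
    A minimal scikit-learn sketch of the idea, bagging trees by hand and choosing
    each tree's cost-complexity pruning level on its own out-of-bag samples
    (dataset and forest size are arbitrary illustrative choices):

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_breast_cancer(return_X_y=True)
        rng = np.random.RandomState(0)
        trees = []
        for b in range(25):                              # bagging by hand
            idx = rng.randint(0, len(X), len(X))         # bootstrap sample
            oob = np.setdiff1d(np.arange(len(X)), idx)   # out-of-bag samples
            path = DecisionTreeClassifier(random_state=0)\
                .cost_complexity_pruning_path(X[idx], y[idx])
            # pick the pruning strength that maximizes OOB accuracy
            best = max((DecisionTreeClassifier(random_state=0, ccp_alpha=a)
                        .fit(X[idx], y[idx]) for a in path.ccp_alphas),
                       key=lambda t: t.score(X[oob], y[oob]))
            trees.append(best)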

    Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

    Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
    Comments: Submitted to Interspeech 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Keyword spotting (KWS) constitutes a major component of human-technology
    interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate,
    while minimizing the footprint size, latency and complexity are the goals for
    KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks
    (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems,
    we combine the strengths of convolutional layers and recurrent layers to
    exploit local structure and long-range context. We analyze the effect of
    architecture parameters, and propose training strategies to improve
    performance. With only ~230k parameters, our CRNN model yields acceptably low
    latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise
    ratio.
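
    For illustration, a minimal PyTorch CRNN in this spirit: a convolution over
    log-mel features followed by a bidirectional GRU and a linear classifier.
    Layer sizes are assumptions, not the paper's configuration.

        import torch
        import torch.nn as nn

        class CRNN(nn.Module):
            def __init__(self, n_mels=40, conv_ch=32, rnn_hidden=64, n_classes=2):
                super().__init__()
                self.conv = nn.Sequential(
                    nn.Conv2d(1, conv_ch, kernel_size=(20, 5), stride=(8, 2)),
                    nn.ReLU())
                freq_out = (n_mels - 20) // 8 + 1        # frequency bins after conv
                self.gru = nn.GRU(conv_ch * freq_out, rnn_hidden, num_layers=2,
                                  batch_first=True, bidirectional=True)
                self.fc = nn.Linear(2 * rnn_hidden, n_classes)

            def forward(self, x):                        # x: (batch, 1, n_mels, frames)
                h = self.conv(x)                         # (batch, ch, freq, time)
                h = h.permute(0, 3, 1, 2).flatten(2)     # (batch, time, ch*freq)
                out, _ = self.gru(h)
                return self.fc(out[:, -1])               # score from last frame

        logits = CRNN()(torch.randn(4, 1, 40, 151))      # ~1.5 s of 10 ms frames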

    A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

    Thomas E. Potok, Catherine Schuman, Steven R. Young, Robert M. Patton, Federico Spedalieri, Jeremy Liu, Ke-Thia Yao, Garrett Rose, Gangotree Chakma
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Current Deep Learning approaches have been very successful using
    convolutional neural networks (CNN) trained on large graphical processing units
    (GPU)-based computers. Three limitations of this approach are: 1) they are
    based on a simple layered network topology, i.e., highly connected layers,
    without intra-layer connections; 2) the networks are manually configured to
    achieve optimal results; and 3) the implementation of the neuron model is expensive
    in both cost and power. In this paper, we evaluate deep learning models using
    three different computing architectures to address these problems: quantum
    computing to train complex topologies, high performance computing (HPC) to
    automatically determine network topology, and neuromorphic computing for a
    low-power hardware implementation. We use the MNIST dataset for our experiment,
    due to input size limitations of current quantum computers. Our results show
    the feasibility of using the three architectures in tandem to address the above
    deep learning limitations. We show that a quantum computer can find high-quality
    values of intra-layer connection weights in a tractable time as the
    complexity of the network increases; a high performance computer can find
    optimal layer-based topologies; and a neuromorphic computer can represent the
    complex topology and weights derived from the other architectures in low power
    memristive hardware.


    Information Theory

    Simultaneous Wireless Information and Power Transfer in MISO Full-Duplex Systems

    Alexander A. Okandeji, Muhammad R. A. Khandaker, Kai-Kit Wong, Gan Zheng, Yangyang Zhang, Zhongbin Zheng
    Subjects: Information Theory (cs.IT)

    This paper investigates a multiuser multiple-input single-output (MISO)
    full-duplex (FD) system for simultaneous wireless information and power
    transfer (SWIPT), in which a multi-antenna base station (BS) simultaneously
    sends information and power wirelessly to a set of single-antenna mobile
    stations (MSs) using power splitters (PSs) in the downlink and receives
    information in the uplink in FD mode. In particular, we address the joint
    design of the receive PS ratio and the transmit power at the MSs, and the
    beamforming matrix at the BS under signal-to-interference-plus-noise ratio
    (SINR) and the harvested power constraints. Using semidefinite relaxation
    (SDR), we obtain the solution to the problem with imperfect channel state
    information (CSI) of the self-interfering channels. Furthermore, we propose
    another suboptimal zero-forcing (ZF) based solution by separating the
    optimization of the transmit beamforming vector and the PS ratio. Simulation
    results are provided to evaluate the performance of the proposed beamforming
    designs.

    Enhancing Coexistence in the Unlicensed Band with Massive MIMO

    Giovanni Geraci, Adrian Garcia-Rodriguez, David López-Pérez, Andrea Bonfante, Lorenzo Galati Giordano, Holger Claussen
    Comments: To appear in Proc. IEEE ICC 2017
    Subjects: Information Theory (cs.IT)

    We consider cellular base stations (BSs) equipped with a large number of
    antennas and operating in the unlicensed band. We denote such a system as
    massive MIMO unlicensed (mMIMO-U). We design the key procedures required to guarantee
    coexistence between a cellular BS and nearby Wi-Fi devices. These include:
    neighboring Wi-Fi channel covariance estimation, allocation of spatial degrees
    of freedom for interference suppression, and enhanced channel sensing and data
    transmission phases. We evaluate the performance of the so-designed mMIMO-U,
    showing that it allows simultaneous cellular and Wi-Fi transmissions by keeping
    their mutual interference below the regulatory threshold. The same is not true
    for conventional listen-before-talk (LBT) operations. As a result, mMIMO-U
    boosts the aggregate cellular-plus-Wi-Fi data rate in the unlicensed band with
    respect to conventional LBT, exhibiting increasing gains as the number of BS
    antennas grows.
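
    The spatial-suppression step can be sketched in NumPy: estimate the Wi-Fi
    covariance from sensed snapshots, then project the cellular precoder onto the
    complement of its dominant eigenvectors. This is a generic sketch; the
    paper's exact procedure and thresholds may differ.

        import numpy as np

        rng = np.random.default_rng(0)
        M, N_WIFI, SNAPSHOTS, NULLED = 64, 2, 200, 4   # BS antennas, Wi-Fi streams

        # sensed snapshots of the neighboring Wi-Fi signal at the BS array
        H_wifi = rng.standard_normal((M, N_WIFI)) + 1j * rng.standard_normal((M, N_WIFI))
        S = rng.standard_normal((N_WIFI, SNAPSHOTS)) + 1j * rng.standard_normal((N_WIFI, SNAPSHOTS))
        Y = H_wifi @ S
        R = Y @ Y.conj().T / SNAPSHOTS                 # sample covariance estimate

        eigval, eigvec = np.linalg.eigh(R)
        U = eigvec[:, -NULLED:]                        # dominant Wi-Fi subspace
        P = np.eye(M) - U @ U.conj().T                 # projector onto its complement

        w = rng.standard_normal((M, 1)) + 1j * rng.standard_normal((M, 1))
        w_null = P @ w                                 # precoder with Wi-Fi nulls
        leak = np.linalg.norm(H_wifi.conj().T @ w_null) / np.linalg.norm(H_wifi.conj().T @ w)
        print(f"residual leakage toward Wi-Fi: {leak:.2e}")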

    Low Complexity Beamforming Training Method for mmWave Communications

    Felix Fellhauer, Nabil Loghin, Dana Ciochina, Thomas Handte, Stephan ten Brink
    Comments: Submitted to 2017 International Workshop on Signal Processing Advances in Wireless Communications (SPAWC)
    Subjects: Information Theory (cs.IT)

    This paper introduces a low complexity method for antenna sector selection in
    mmWave Hybrid MIMO communication systems like the IEEE 802.11ay amendment for
    Wireless LANs. The method is backward compatible with the methods already
    defined in the released mmWave standard IEEE 802.11ad. We introduce an
    extension of the 802.11ad channel model to support common Hybrid MIMO
    configurations. The proposed method is evaluated and compared to the
    theoretical limit of transmission rates found by exhaustive search. In contrast
    to state-of-the-art solutions, the presented method requires sparse channel
    information only. Numerical results show a significant complexity reduction in
    terms of number of necessary trainings, while approaching maximum achievable
    rate.

    Modeling and Analysis of Non-Orthogonal MBMS Transmission in Heterogeneous Networks

    Zhengquan Zhang, Zheng Ma, Ming Xiao, Pingzhi Fan
    Comments: 30 pages, 8 figures
    Subjects: Information Theory (cs.IT)

    Broadcasting/multicasting is an efficient mechanism for multimedia
    communications due to its high spectrum efficiency, achieving
    point-to-multipoint transmission on the same radio resources. To satisfy the
    increasing demands for multimedia broadcast multicast service (MBMS), we
    present a power domain non-orthogonal MBMS transmission scheme in a K-tier
    heterogeneous network (HetNet). Firstly, the system model, usage scenarios, and
    fundamentals of the presented scheme are discussed. Next, a tractable framework
    is developed to analyse the performance of non-orthogonal MBMS transmission, by
    using stochastic geometry. Based on this framework, the analytical expressions
    for the signal-to-interference-plus-noise ratio (SINR) coverage probability,
    average number of served users, and sum rate are derived. Furthermore,
    synchronous non-orthogonal MBMS transmission to further improve the system
    performance is also studied. The results demonstrate that non-orthogonal MBMS
    transmission can achieve better performance than the conventional scheme: the
    non-orthogonal multirate variant can fully utilize channel conditions to
    achieve a significant rate gain, while the non-orthogonal multi-service
    variant can efficiently use power resources to guarantee the quality of
    service (QoS) of high-priority users while simultaneously providing services
    for low-priority users.

    Concatenated LDPC-Polar Codes Decoding Through Belief Propagation

    Syed Mohsin Abbas, YouZhe Fan, Ji Chen, Chi-Ying Tsui
    Comments: Accepted for publication in IEEE International Symposium on Circuits & Systems 2017 (ISCAS 2017)
    Subjects: Information Theory (cs.IT)

    Owing to their capacity-achieving performance and low encoding and decoding
    complexity, polar codes have drawn much research interest recently. Successive
    cancellation decoding (SCD) and belief propagation decoding (BPD) are two
    common approaches for decoding polar codes. SCD is sequential in nature while
    BPD can run in parallel, so BPD is more attractive for low latency
    applications. However, BPD suffers some performance degradation at higher SNR
    compared with SCD. Concatenating LDPC with polar codes is one popular approach
    to enhancing the performance of BPD, where a short LDPC code is used as an
    outer code and a polar code is used as an inner code. In this work we propose
    a new way to construct concatenated LDPC-polar codes, which not only
    outperforms conventional BPD and existing concatenated LDPC-polar codes but
    also shows a performance improvement of 0.5 dB at higher SNR when compared
    with SCD.

    A Compressive Method for Centralized PSD Map Construction with Imperfect Reporting Channel

    Mohammad Eslami, Seyed Hamid Safavi, Farah Torkamani-Azar, Esfandiar Mehrshahi
    Comments: Submitted to the 25th European Signal Processing Conference (EUSIPCO 2017). arXiv admin note: text overlap with arXiv:1612.02892
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Managing spectrum resources under growing demand is a challenging problem, and
    Cognitive Radio (CR) is known to be capable of improving spectrum utilization.
    Recently, the Power Spectral Density (PSD) map has been introduced to enable
    CRs to reuse frequency resources across an area. To this end, the sensed PSDs
    are collected by sensors distributed over the area and fused by a Fusion
    Center (FC). For a given zone, the PSDs sensed by neighboring CR sensors may
    contain a shared common component for a while. This component can be exploited,
    following the theory of Distributed Source Coding (DSC), to compress the data
    the sensors transmit. However, uncertain channel fading and random shadowing
    lead to varying signal strength at different CRs, even those placed close to
    each other. Hence, perturbations in the transmission procedure introduce
    imperfections in the reporting channel and, as a result, degrade performance
    considerably. The main focus of this paper is to reconstruct the PSDs of the
    sensors robustly, based on Distributed Compressive Sensing (DCS), when the
    data transmission is slightly imperfect. Simulation results verify the
    robustness of the proposed scheme.

    Multi-User Millimeter Wave Channel Estimation Using Generalized Block OMP Algorithm

    Manoj A, Arun Pachai Kannu
    Comments: 5 page, 3 figures
    Subjects: Information Theory (cs.IT)

    In a multi-user millimeter (mm) wave communication system, we consider the
    problem of estimating the channel response between the central node (base
    station) and each of the user equipments (UEs). We propose three different
    strategies: 1) each UE estimates its channel separately; 2) the base station
    estimates all the UEs' channels jointly; and 3) a two-stage process with
    estimation done at both the UE and the base station. Exploiting the low-rank
    nature of the mm wave channels, we propose a generalized block orthogonal
    matching pursuit (G.BOMP) framework for channel estimation in all three
    strategies. Our simulation results show that the average beamforming gain of
    the G.BOMP algorithm is higher than that of the conventional OMP algorithm
    and of other existing approaches for multi-user mm wave systems.
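
    A generic block-OMP skeleton conveys the flavor of the approach; the paper's
    G.BOMP generalizations are not reproduced here, and the contiguous block
    layout is an assumption.

        import numpy as np

        def block_omp(A, y, block_size, n_blocks):
            """Greedy block-sparse recovery of x from y ≈ A x."""
            m, n = A.shape
            blocks = [np.arange(i, i + block_size) for i in range(0, n, block_size)]
            chosen, r = [], y.astype(complex)
            for _ in range(n_blocks):
                # block with the largest aggregate correlation with the residual
                corr = [-np.inf if j in chosen else np.linalg.norm(A[:, b].conj().T @ r)
                        for j, b in enumerate(blocks)]
                chosen.append(int(np.argmax(corr)))
                S = np.concatenate([blocks[j] for j in chosen])
                x_S, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)  # re-fit support
                r = y - A[:, S] @ x_S
            x = np.zeros(n, dtype=complex)
            x[S] = x_S
            return x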

    Mobile Unmanned Aerial Vehicles (UAVs) for Energy-Efficient Internet of Things Communications

    Mohammad Mozaffari, Walid Saad, Mehdi Bennis, Merouane Debbah
    Subjects: Information Theory (cs.IT)

    In this paper, the efficient deployment and mobility of multiple unmanned
    aerial vehicles (UAVs), used as aerial base stations to collect data from
    ground Internet of Things (IoT) devices, are investigated. In particular, to
    enable reliable uplink communications for IoT devices with a minimum total
    transmit power, a novel framework is proposed for jointly optimizing the
    three-dimensional (3D) placement and mobility of the UAVs, device-UAV
    association, and uplink power control. First, given the locations of active IoT
    devices at each time instant, the optimal UAVs’ locations and associations are
    determined. Next, to dynamically serve the IoT devices in a time-varying
    network, the optimal mobility patterns of the UAVs are analyzed. To this end,
    based on the activation process of the IoT devices, the time instances at which
    the UAVs must update their locations are derived. Moreover, the optimal 3D
    trajectory of each UAV is obtained in a way that the total energy used for the
    mobility of the UAVs is minimized while serving the IoT devices. Simulation
    results show that, using the proposed approach, the total transmit power of the
    IoT devices is reduced by 45% compared to a case in which stationary aerial
    base stations are deployed. In addition, the proposed approach can yield a
    maximum of 28% enhanced system reliability compared to the stationary case. The
    results also reveal an inherent tradeoff between the number of update times,
    the mobility of the UAVs, and the transmit power of the IoT devices. In
    essence, a higher number of updates can lead to lower transmit powers for the
    IoT devices at the cost of an increased mobility for the UAVs.
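
    As a toy stand-in for the placement step (not the paper's optimal 3D
    solution), clustering the active devices and hovering one UAV above each
    centroid already shortens uplinks and hence reduces device transmit power:

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        devices = rng.uniform(0, 1000, size=(200, 2))            # active devices (m)
        uav_xy = KMeans(n_clusters=5, n_init=10, random_state=0)\
            .fit(devices).cluster_centers_                       # one UAV per cluster
        uav_pos = np.hstack([uav_xy, np.full((5, 1), 100.0)])    # fixed 100 m altitude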

    Layered black-box, behavioral interconnection perspective and applications to the problem of communication with fidelity criteria, Part II: stationary sources satisfying ψ-mixing criterion

    Mukul Agarwal, Sanjoy Mitter, Anant Sahai
    Subjects: Information Theory (cs.IT)

    Theorems from Part I of this paper are generalized to \(\psi\)-mixing sources.
    Applications to Markov chains and order-\(m\) Markov chains are presented. The
    main result is the generalization of Theorem 1 in Part I.

    Layered black-box, behavioral interconnection perspective and applications to the problem of communication with fidelity criteria, Part I: i.i.d. sources

    Mukul Agarwal, Sanjoy Mitter, Anant Sahai
    Subjects: Information Theory (cs.IT)

    In this paper, the problem of communication over an essentially unknown
    channel, which is known to be able to communicate a source to a destination to
    within a certain distortion level, is considered from a behavioral,
    interconnection viewpoint. Rates of reliable communication are derived and
    source-channel separation for communication with fidelity criteria is proved.
    The results are then generalized to the multi-user setting under certain
    assumptions. Other applications of this problem which follow from this
    perspective are discussed.

    On decoding algorithms for polar codes

    Ilya Dumer
    Comments: 4 pages, 2 figures
    Subjects: Information Theory (cs.IT)

    We consider the known list decoding algorithms for polar codes and compare
    their complexity.

    Index terms: Polar codes; Reed-Muller codes; successive cancellation
    decoding.

    Recursive Decoding and Its Performance for Low-Rate Reed-Muller Codes

    Ilya Dumer
    Journal-ref: IEEE Trans. Info. Theory, vol. 50, pp. 811-823, 2004
    Subjects: Information Theory (cs.IT)

    Recursive decoding techniques are considered for Reed-Muller (RM) codes of
    growing length \(n\) and fixed order \(r\). An algorithm is designed that has
    complexity of order \(n \log n\) and corrects most error patterns of weight up
    to \(n(1/2 - \varepsilon)\) given that \(\varepsilon\) exceeds \(n^{-1/2^{r}}\).
    This improves the asymptotic bounds known for decoding RM codes with
    nonexponential complexity.

    Soft decision decoding of Reed-Muller codes: recursive lists

    Ilya Dumer, Kirill Shabunov
    Journal-ref: IEEE Trans. Info. Theory, vol. 52, no. 3, pp. 1260-1266, 2006
    Subjects: Information Theory (cs.IT)

    Recursive list decoding is considered for Reed-Muller (RM) codes. The
    algorithm repeatedly relegates itself to the shorter RM codes by recalculating
    the posterior probabilities of their symbols. Intermediate decodings are only
    performed when these recalculations reach the trivial RM codes. In turn, the
    updated lists of most plausible codewords are used in subsequent decodings. The
    algorithm is further improved by using permutation techniques on code positions
    and by eliminating the most error-prone information bits. Simulation results
    show that for all RM codes of length 256 and many subcodes of length 512, these
    algorithms approach maximum-likelihood (ML) performance within a margin of 0.1
    dB. As a result, we present tight experimental bounds on ML performance for
    these codes.

    Recursive List Decoding for Reed-Muller Codes

    Ilya Dumer, Kirill Shabunov
    Journal-ref: “Information, Coding and Mathematics”, ed. M. Blaum, P. Farrell,
    and H.C.A. van Tilborg, Kluwer, Boston, 2002, pp. 279-298
    Subjects: Information Theory (cs.IT)

    We consider recursive decoding for Reed-Muller (RM) codes and their subcodes.
    Two new recursive techniques are described. We analyze asymptotic properties of
    these algorithms and show that they substantially outperform other decoding
    algorithms with nonexponential complexity known for RM codes. Decoding
    performance is further enhanced by using intermediate code lists and
    permutation procedures. For moderate lengths up to 512, near-optimum decoding
    with feasible complexity is obtained.

    Recursive decoding of Reed-Muller codes

    Ilya Dumer
    Journal-ref: Proc. 37th Allerton Conf. on Commun., Control, and Computing,
    Monticello, IL, USA, 1999, pp. 61-69
    Subjects: Information Theory (cs.IT)

    New soft- and hard-decision decoding algorithms are presented for general
    Reed-Muller codes \(\left\{\genfrac{}{}{0pt}{}{m}{r}\right\}\) of length \(2^{m}\)
    and distance \(2^{m-r}\). We use the Plotkin \((u, u+v)\) construction and
    decompose code \(\left\{\genfrac{}{}{0pt}{}{m}{r}\right\}\) onto subblocks
    \(u \in \left\{\genfrac{}{}{0pt}{}{m-1}{r}\right\}\) and
    \(v \in \left\{\genfrac{}{}{0pt}{}{m-1}{r-1}\right\}\). In decoding, we first try
    to find a subblock \(v\) from the better protected code and then proceed with the
    block \(u\). The likelihoods of the received symbols are recalculated in a way
    similar to belief propagation. Thus, decoding is relegated to the two
    constituent codes. We repeat this recursion and execute decoding only at the
    end nodes \(\left\{\genfrac{}{}{0pt}{}{j}{1}\right\}\) and
    \(\left\{\genfrac{}{}{0pt}{}{j}{j-1}\right\}\). The overall complexity has low
    order of \(n \log n\). It is shown that this decoding substantially outperforms
    other algorithms of polynomial complexity known for RM codes. In particular,
    for medium and high code rates, the algorithm corrects most error patterns of
    weight \(d \ln d / 2\).
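
    The recursion is compact enough to sketch directly. Below is a minimal
    min-sum variant of this Plotkin-split decoder, without the list and
    permutation enhancements of the companion papers, using the convention that a
    positive LLR favors bit 0.

        import numpy as np

        def rm_decode(llr, r, m):
            """Hard codeword estimate (0/1 array) for RM(r, m) from channel LLRs."""
            if r == 0:                                 # repetition code
                return np.full(2**m, int(llr.sum() < 0))
            if r == m:                                 # full space: bitwise decision
                return (llr < 0).astype(int)
            L1, L2 = llr[: 2**(m-1)], llr[2**(m-1):]
            # decode v from the better-protected RM(r-1, m-1) code (min-sum XOR LLR)
            lv = np.sign(L1) * np.sign(L2) * np.minimum(abs(L1), abs(L2))
            v = rm_decode(lv, r - 1, m - 1)
            # fold v back into the second half and decode u from RM(r, m-1)
            lu = L1 + (1 - 2 * v) * L2
            u = rm_decode(lu, r, m - 1)
            return np.concatenate([u, (u + v) % 2])

        # example: all-zero RM(1, 3) codeword over a noisy BPSK channel
        rng = np.random.default_rng(0)
        print(rm_decode(2.0 + rng.standard_normal(8), 1, 3))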

    Recursive constructions and their maximum likelihood decoding

    Ilya Dumer, Kirill Shabunov
    Journal-ref: Proc. 38th Allerton Conf. Communication, Control, and Computing,
    Monticello, IL, USA, 2000, pp. 71-80
    Subjects: Information Theory (cs.IT)

    We consider recursive decoding techniques for RM codes, their subcodes, and
    newly designed codes. For moderate lengths up to 512, we obtain near-optimum
    decoding with feasible complexity.

    Upper bounds for the Holevo quantity and their use

    M.E. Shirokov
    Comments: 25 pages, any comments are welcome
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Mathematical Physics (math-ph)

    We present a family of easily computable upper bounds for the Holevo quantity
    of an ensemble of quantum states, depending on a reference state as a free
    parameter. These upper bounds are obtained by combining probabilistic and
    metric characteristics of the ensemble. We show that an appropriate choice of
    the reference state gives tight upper bounds for the Holevo quantity which in
    many cases improve existing estimates in the literature.

    We also present an upper bound for the Holevo quantity of a generalized
    ensemble of quantum states with finite average energy, depending on the metric
    divergence of the ensemble. The specification of this upper bound for the
    multi-mode quantum oscillator is tight for large energy.

    The above results are used to obtain tight upper bounds for the Holevo
    capacity of finite-dimensional and infinite-dimensional energy-constrained
    quantum channels depending on metric characteristics of the channel output.

    Harmonic Mean Iteratively Reweighted Least Squares for Low-Rank Matrix Recovery

    Christian Kümmerle, Juliane Sigl
    Comments: 46 pages, 4 figures
    Subjects: Numerical Analysis (math.NA); Information Theory (cs.IT)

    We propose a new iteratively reweighted least squares (IRLS) algorithm for
    the recovery of a matrix \(X \in \mathbb{C}^{d_1 \times d_2}\) of rank
    \(r \ll \min(d_1, d_2)\) from incomplete linear observations, solving a
    sequence of low complexity linear problems. The easily implementable
    algorithm, which we call harmonic mean iteratively reweighted least squares
    (HM-IRLS), optimizes a non-convex Schatten-\(p\) quasi-norm penalization to
    promote low-rankness and carries three major strengths, in particular for the
    matrix completion setting. First, the algorithm converges globally to the
    low-rank matrix for relevant, interesting cases, for which any other
    (non-)convex state-of-the-art optimization approach fails the recovery.
    Secondly, HM-IRLS exhibits an empirical recovery probability close to \(100\%\)
    even for a number of measurements very close to the theoretical lower bound
    \(r(d_1 + d_2 - r)\), i.e., already for significantly fewer linear observations
    than any other tractable approach in the literature. Thirdly, HM-IRLS exhibits
    a locally superlinear rate of convergence (of order \(2-p\)) if the linear
    observations fulfill a suitable null space property. While for the first two
    properties we have so far only strong empirical evidence, we prove the third
    property as our main theoretical result.
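
    The IRLS template itself fits in a few lines. The NumPy sketch below uses a
    generic Schatten-\(p\) reweighting for matrix completion and is not the
    paper's harmonic-mean weight operator; all constants are illustrative.

        import numpy as np

        rng = np.random.default_rng(1)
        d1, d2, r, p = 40, 40, 3, 0.5
        M = rng.standard_normal((d1, r)) @ rng.standard_normal((r, d2))  # ground truth
        mask = rng.random((d1, d2)) < 0.5                                # observed set

        X, gamma, step = np.where(mask, M, 0.0), 1.0, 0.9
        for k in range(300):
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            w = (s**2 + gamma) ** (p / 2 - 1)          # spectral weights
            W = (U * w) @ U.T                          # W = (X X^T + gamma I)^{p/2-1}
            X = X - step * (W @ X) / w.max()           # normalized gradient step
            X[mask] = M[mask]                          # re-impose observations
            gamma = max(gamma * 0.95, 1e-8)            # anneal the smoothing
        print("relative error:", np.linalg.norm(X - M) / np.linalg.norm(M))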



