
    arXiv Paper Daily: Thu, 23 Mar 2017

    Posted by 我爱机器学习 (52ml.net) on 2017-03-23 00:00:00

    Neural and Evolutionary Computing

    ASP: Learning to Forget with Adaptive Synaptic Plasticity in Spiking Neural Networks

    Priyadarshini Panda, Jason M. Allred, Shriram Ramanathan, Kaushik Roy
    Comments: 14 pages, 14 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV)

    A fundamental feature of learning in animals is the “ability to forget” that
    allows an organism to perceive, model and make decisions from disparate streams
    of information and adapt to changing environments. Against this backdrop, we
    present a novel unsupervised learning mechanism ASP (Adaptive Synaptic
    Plasticity) for improved recognition with Spiking Neural Networks (SNNs) for
    real time on-line learning in a dynamic environment. We incorporate an adaptive
    weight decay mechanism with the traditional Spike Timing Dependent Plasticity
    (STDP) learning to model adaptivity in SNNs. The leak rate of the synaptic
    weights is modulated based on the temporal correlation between the spiking
    patterns of the pre- and post-synaptic neurons. This mechanism helps in gradual
    forgetting of insignificant data while retaining significant, yet old,
    information. ASP, thus, maintains a balance between forgetting and immediate
    learning to construct a stable-plastic self-adaptive SNN for continuously
    changing inputs. We demonstrate that the proposed learning methodology
    addresses catastrophic forgetting while yielding significantly improved
    accuracy over the conventional STDP learning method for digit recognition
    applications. Additionally, we observe that the proposed learning model
    automatically encodes selective attention towards relevant features in the
    input data while eliminating the influence of background noise (i.e.,
    denoising), further improving the robustness of ASP learning.
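
    To make the mechanism above concrete, the following is a minimal sketch of an
    STDP-style weight update with an adaptive leak, in which the decay of each
    synapse is modulated by a running estimate of pre/post-synaptic spike
    correlation. The trace variables, parameter names, and the exact form of the
    modulation are illustrative assumptions, not the paper's precise rule.

        import numpy as np

        def asp_style_update(w, pre_trace, post_trace, lr=0.01,
                             base_leak=0.001, corr_gain=0.9):
            # pre_trace / post_trace: low-pass-filtered spike trains of the pre-
            # and post-synaptic neurons; their product is a crude correlation proxy.
            corr = np.clip(pre_trace * post_trace, 0.0, 1.0)
            dw_stdp = lr * corr                          # Hebbian/STDP-like potentiation
            leak = base_leak * (1.0 - corr_gain * corr)  # weak correlation -> faster decay
            # Rarely co-active synapses gradually decay (forgetting), while strongly
            # correlated synapses are largely protected.
            return np.clip(w + dw_stdp - leak * w, 0.0, 1.0)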

    Deep Learning for Explicitly Modeling Optimization Landscapes

    Shumeet Baluja
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    In all but the most trivial optimization problems, the structure of the
    solutions exhibits complex interdependencies between the input parameters.
    Decades of research with stochastic search techniques have shown the benefit of
    explicitly modeling the interactions between sets of parameters and the overall
    quality of the solutions discovered. We demonstrate a novel method, based on
    learning deep networks, to model the global landscapes of optimization
    problems. To represent the search space concisely and accurately, the deep
    networks must encode information about the underlying parameter interactions
    and their contributions to the quality of the solution. Once the networks are
    trained, the networks are probed to reveal parameter combinations with high
    expected performance with respect to the optimization task. These estimates are
    used to initialize fast, randomized, local search algorithms, which in turn
    expose more information about the search space that is subsequently used to
    refine the models. We demonstrate the technique on multiple optimization
    problems that have arisen in a variety of real-world domains, including:
    packing, graphics, job scheduling, layout and compression. The problems include
    combinatoric search spaces, discontinuous and highly non-linear spaces, and
    span binary, higher-cardinality discrete, as well as continuous parameters.
    Strengths, limitations, and extensions of the approach are extensively
    discussed and demonstrated.
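
    A rough, hedged sketch of the model-then-search loop this describes (model
    size, step sizes, and the local-search rule are illustrative assumptions): fit
    a network on (parameters, quality) pairs, probe it for promising parameter
    combinations, refine them with fast randomized local search, and feed the
    evaluated points back to refine the model.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        def surrogate_guided_search(evaluate, dim, rounds=5, pool=2000, seeds=20, steps=50):
            X = np.random.uniform(-1, 1, size=(200, dim))        # initial random evaluations
            y = np.array([evaluate(x) for x in X])
            model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
            for _ in range(rounds):
                model.fit(X, y)                                   # model the landscape
                cand = np.random.uniform(-1, 1, size=(pool, dim))
                starts = cand[np.argsort(model.predict(cand))[-seeds:]]  # probe the network
                for x in starts:                                  # randomized local search
                    for _ in range(steps):
                        x_new = np.clip(x + 0.05 * np.random.randn(dim), -1, 1)
                        if evaluate(x_new) > evaluate(x):
                            x = x_new
                    X = np.vstack([X, x])
                    y = np.append(y, evaluate(x))                 # new data refines the model
            return X[np.argmax(y)], y.max()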

    Direct Acoustics-to-Word Models for English Conversational Speech Recognition

    Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon, Michael Picheny, David Nahamoo
    Comments: Submitted to Interspeech-2017
    Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Recent work on end-to-end automatic speech recognition (ASR) has shown that
    the connectionist temporal classification (CTC) loss can be used to convert
    acoustics to phone or character sequences. Such systems are used with a
    dictionary and separately-trained Language Model (LM) to produce word
    sequences. However, they are not truly end-to-end in the sense of mapping
    acoustics directly to words without an intermediate phone representation. In
    this paper, we present the first results employing direct acoustics-to-word CTC
    models on two well-known public benchmark tasks: Switchboard and CallHome.
    These models do not require an LM or even a decoder at run-time and hence
    recognize speech with minimal complexity. However, due to the large number of
    word output units, CTC word models require orders of magnitude more data to
    train reliably compared to traditional systems. We present some techniques to
    mitigate this issue. Our CTC word model achieves a word error rate of
    13.0%/18.8% on the Hub5-2000 Switchboard/CallHome test sets without any LM or
    decoder compared with 9.6%/16.0% for phone-based CTC with a 4-gram LM. We also
    present rescoring results on CTC word model lattices to quantify the
    performance benefits of a LM, and contrast the performance of word and phone
    CTC models.
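
    For reference, a minimal PyTorch-style sketch of a direct acoustics-to-word
    CTC setup (layer sizes, vocabulary size, and the greedy decoding step are
    illustrative assumptions, not the paper's actual architecture): the output
    layer spans a word vocabulary instead of phones or characters, so no separate
    LM or decoder is needed at run time.

        import torch
        import torch.nn as nn

        vocab = 10000                           # word output units; index 0 is the CTC blank
        encoder = nn.LSTM(input_size=40, hidden_size=320, num_layers=4, bidirectional=True)
        to_words = nn.Linear(2 * 320, vocab + 1)
        ctc = nn.CTCLoss(blank=0, zero_infinity=True)

        feats = torch.randn(200, 8, 40)         # (time, batch, feature) acoustic frames
        targets = torch.randint(1, vocab + 1, (8, 12))
        feat_lens = torch.full((8,), 200, dtype=torch.long)
        tgt_lens = torch.full((8,), 12, dtype=torch.long)

        hidden, _ = encoder(feats)
        log_probs = to_words(hidden).log_softmax(dim=-1)
        loss = ctc(log_probs, targets, feat_lens, tgt_lens)
        words = log_probs.argmax(dim=-1)        # greedy path; collapse repeats, drop blanks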

    In Defense of the Triplet Loss for Person Re-Identification

    Alexander Hermans, Lucas Beyer, Bastian Leibe
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    In the past few years, the field of computer vision has gone through a
    revolution fueled mainly by the advent of large datasets and the adoption of
    deep convolutional neural networks for end-to-end learning. The person
    re-identification subfield is no exception to this, thanks to the notable
    publication of the Market-1501 and MARS datasets and several strong deep
    learning approaches. Unfortunately, a prevailing belief in the community seems
    to be that the triplet loss is inferior to using surrogate losses
    (classification, verification) followed by a separate metric learning step. We
    show that, for models trained from scratch as well as pretrained ones, using a
    variant of the triplet loss to perform end-to-end deep metric learning
    outperforms any other published method by a large margin.
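
    The variant the authors advocate mines triplets within each mini-batch; a
    minimal sketch of such a batch-hard triplet loss is shown below (the margin
    value and distance choice are illustrative, and soft-margin variants are
    omitted).

        import torch

        def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
            # For each anchor: hardest positive = farthest same-label sample,
            # hardest negative = closest different-label sample, within the batch.
            dist = torch.cdist(embeddings, embeddings)
            same = labels.unsqueeze(0) == labels.unsqueeze(1)
            hardest_pos = dist.masked_fill(~same, float('-inf')).max(dim=1).values
            hardest_neg = dist.masked_fill(same, float('inf')).min(dim=1).values
            return torch.relu(hardest_pos - hardest_neg + margin).mean()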


    Computer Vision and Pattern Recognition

    In Defense of the Triplet Loss for Person Re-Identification

    Alexander Hermans, Lucas Beyer, Bastian Leibe
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    In the past few years, the field of computer vision has gone through a
    revolution fueled mainly by the advent of large datasets and the adoption of
    deep convolutional neural networks for end-to-end learning. The person
    re-identification subfield is no exception to this, thanks to the notable
    publication of the Market-1501 and MARS datasets and several strong deep
    learning approaches. Unfortunately, a prevailing belief in the community seems
    to be that the triplet loss is inferior to using surrogate losses
    (classification, verification) followed by a separate metric learning step. We
    show that, for models trained from scratch as well as pretrained ones, using a
    variant of the triplet loss to perform end-to-end deep metric learning
    outperforms any other published method by a large margin.

    Classifying Symmetrical Differences and Temporal Change in Mammography Using Deep Neural Networks

    Thijs Kooi, Nico Karssemeijer
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We investigate the addition of symmetry and temporal context information to a
    deep Convolutional Neural Network (CNN) with the purpose of detecting malignant
    soft tissue lesions in mammography. We employ a simple linear mapping that
    takes the location of a mass candidate and maps it to either the contra-lateral
    or prior mammogram and Regions Of Interest (ROI) are extracted around each
    location. We subsequently explore two different architectures: (1) a fusion
    model employing two data streams where both ROIs are fed to the network during
    training and testing, and (2) a stage-wise approach where a single ROI CNN is
    trained on the primary image and subsequently used as feature extractor for
    both primary and symmetrical or prior ROIs. A ‘shallow’ Gradient Boosted Tree
    (GBT) classifier is then trained on the concatenation of these features and
    used to classify the joint representation. Results show a significant increase
    in performance using the first architecture and symmetry information, but only
    marginal gains in performance using temporal data and the other setting. We
    feel the results are promising and can be greatly improved when more temporal
    data becomes available.

    Predicting Deeper into the Future of Semantic Segmentation

    Natalia Neverova, Pauline Luc, Camille Couprie, Jakob Verbeek, Yann LeCun
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    The ability to predict and therefore to anticipate the future is an important
    attribute of intelligence. It is also of utmost importance in real-time
    systems, e.g. in robotics or autonomous driving, which depend on visual scene
    understanding for decision making. While prediction of the raw RGB pixel values
    in future video frames has been studied in previous work, here we focus on
    predicting semantic segmentations of future frames. More precisely, given a
    sequence of semantically segmented video frames, our goal is to predict
    segmentation maps of not yet observed video frames that lie up to a second or
    further in the future. We develop an autoregressive convolutional neural
    network that learns to iteratively generate multiple frames. Our results on the
    Cityscapes dataset show that directly predicting future segmentations is
    substantially better than predicting and then segmenting future RGB frames. Our
    models predict trajectories of cars and pedestrians much more accurately (25%)
    than baselines that copy the most recent semantic segmentation or warp it using
    optical flow. Prediction results up to half a second in the future are visually
    convincing, with the mean IoU of the predicted segmentations reaching two
    thirds of that of the real future segmentations.

    Neural Ctrl-F: Segmentation-free Query-by-String Word Spotting in Handwritten Manuscript Collections

    Tomas Wilkinson, Jonas Lindström, Anders Brun
    Comments: Technical report
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we approach the problem of segmentation-free query-by-string
    word spotting for handwritten documents. In other words, we use methods
    inspired from computer vision and machine learning to search for words in large
    collections of digitized manuscripts. In particular, we are interested in
    historical handwritten texts, which are often far more challenging than modern
    printed documents. This task is important, as it provides people with a way to
    quickly find what they are looking for in large collections that are tedious
    and difficult to read manually. To this end, we introduce an end-to-end
    trainable model based on deep neural networks that we call Ctrl-F-Net. Given a
    full manuscript page, the model simultaneously generates region proposals, and
    embeds these into a distributed word embedding space, where searches are
    performed. We evaluate the model on common benchmarks for handwritten word
    spotting, outperforming the previous state-of-the-art segmentation-free
    approaches by a large margin, and in some cases even segmentation-based
    approaches. One interesting real-life application of our approach is to help
    historians to find and count specific words in court records that are related
    to women’s sustenance activities and division of labor. We provide promising
    preliminary experiments that validate our method on this task.

    Can you tell where in India I am from? Comparing humans and computers on fine-grained race face classification

    Harish Katti, S. P. Arun
    Comments: 9 pages, 5 figures, 2 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Faces form the basis for a rich variety of judgments in humans, yet the
    underlying features remain poorly understood. Although fine-grained
    distinctions within a race might more strongly constrain possible facial
    features used by humans than in case of coarse categories such as race or
    gender, such fine grained distinctions are relatively less studied.
    Fine-grained race classification is also interesting because even humans may
    not be perfectly accurate on these tasks. This allows us to compare errors made
    by humans and machines, in contrast to standard object detection tasks where
    human performance is nearly perfect. We have developed a novel face database of
    close to 1650 diverse Indian faces labeled for fine-grained race (South vs
    North India) as well as for age, weight, height and gender. We then asked close
    to 130 human subjects to categorize each face as belonging to a Northern or
    Southern state in India. We then compared human performance on
    this task with that of computational models trained on the ground-truth labels.
    Our main results are as follows: (1) Humans are highly consistent (average
    accuracy: 63.6%), with some faces being consistently classified with > 90%
    accuracy and others consistently misclassified with < 30% accuracy; (2) Models
    trained on ground-truth labels showed slightly worse performance (average
    accuracy: 62%) but showed higher accuracy (72.2%) on faces classified with >
    80% accuracy by humans. This was true for models trained on simple spatial and
    intensity measurements extracted from faces as well as deep neural networks
    trained on race or gender classification; (3) Using overcomplete banks of
    features derived from each face part, we found that mouth shape was the single
    largest contributor towards fine-grained race classification, whereas distances
    between face parts were the strongest predictor of gender.

    An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning

    Fan Wu, Zhongwen Xu, Yi Yang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose an end-to-end approach to the natural language object retrieval
    task, which localizes an object within an image according to a natural language
    description, i.e., referring expression. Previous works divide this problem
    into two independent stages: first, compute region proposals from the image
    without the exploration of the language description; second, score the object
    proposals with regard to the referring expression and choose the top-ranked
    proposals. The object proposals are generated independently from the referring
    expression, which makes the proposal generation redundant and even irrelevant
    to the referred object. In this work, we train an agent with deep reinforcement
    learning, which learns to move and reshape a bounding box to localize the
    object according to the referring expression. We incorporate both the spatial
    and temporal context information into the training procedure. By simultaneously
    exploiting local visual information, the spatial and temporal context and the
    referring language a priori, the agent selects an appropriate action to take at
    each time step. A special action is defined to indicate when the agent finds
    the referred object and terminates the procedure. We evaluate our model on various
    datasets, and our algorithm significantly outperforms the compared algorithms.
    Notably, the accuracy improvements of our method over the recent methods
    GroundeR and SCRC on the ReferItGame dataset are 7.67% and 18.25%, respectively.

    Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image

    Florian Chabot, Mohamed Chaouch, Jaonary Rabarisoa, Céline Teulière, Thierry Chateau
    Comments: CVPR 2017 (to appear)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we present a novel approach, called Deep MANTA (Deep
    Many-Tasks), for many-task vehicle analysis from a given image. A robust
    convolutional network is introduced for simultaneous vehicle detection, part
    localization, visibility characterization and 3D dimension estimation. Its
    architecture is based on a new coarse-to-fine object proposal that boosts the
    vehicle detection. Moreover, the Deep MANTA network is able to localize vehicle
    parts even if these parts are not visible. In the inference, the network’s
    outputs are used by a real time robust pose estimation algorithm for fine
    orientation estimation and 3D vehicle localization. We show in experiments that
    our method outperforms monocular state-of-the-art approaches on vehicle
    detection, orientation and 3D location tasks on the very challenging KITTI
    benchmark.

    Deeply-Supervised CNN for Prostate Segmentation

    Qikui Zhu, Bo Du, Baris Turkbey, Peter L . Choyke, Pingkun Yan
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Prostate segmentation from Magnetic Resonance (MR) images plays an important
    role in image-guided intervention. However, the lack of a clear boundary,
    specifically at the apex and base, and the huge variation in shape and texture
    between the images from different patients make the task very challenging. To
    overcome these problems, in this paper, we propose a deeply supervised
    convolutional neural network (CNN) utilizing the convolutional information to
    accurately segment the prostate from MR images. The proposed model can
    effectively detect the prostate region with additional deeply supervised layers
    compared with other approaches. Since some information will be abandoned after
    convolution, it is necessary to pass the features extracted from early stages
    to later stages. The experimental results show that significant segmentation
    accuracy improvement has been achieved by our proposed method compared to other
    reported approaches.

    Joint Intermodal and Intramodal Label Transfers for Extremely Rare or Unseen Classes

    Guo-Jun Qi, Wei Liu, Charu Aggarwal, Thomas Huang
    Comments: The paper has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence. It will appear in a future issue
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we present a label transfer model from texts to images for
    image classification tasks. The problem of image classification is often much
    more challenging than text classification. On one hand, labeled text data is
    more widely available than the labeled images for classification tasks. On the
    other hand, text data tends to have natural semantic interpretability, and they
    are often more directly related to class labels. On the contrary, the image
    features are not directly related to concepts inherent in class labels. One of
    our goals in this paper is to develop a model for revealing the functional
    relationships between text and image features so as to directly transfer
    intermodal and intramodal labels to annotate the images. This is implemented by
    learning a transfer function as a bridge to propagate the labels between two
    multimodal spaces. However, the intermodal label transfers could be undermined
    by blindly transferring the labels of noisy texts to annotate images. To
    mitigate this problem, we present an intramodal label transfer process, which
    complements the intermodal label transfer by transferring the image labels
    instead when relevant text is absent from the source corpus. In addition, we
    generalize the inter-modal label transfer to zero-shot learning scenario where
    there are only text examples available to label unseen classes of images
    without any positive image examples. We evaluate our algorithm on an image
    classification task and show the effectiveness with respect to the other
    compared algorithms.

    Video Frame Interpolation via Adaptive Convolution

    Simon Niklaus, Long Mai, Feng Liu
    Comments: CVPR 2017, this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Video frame interpolation typically involves two steps: motion estimation and
    pixel synthesis. Such a two-step approach heavily depends on the quality of
    motion estimation. This paper presents a robust video frame interpolation
    method that combines these two steps into a single process. Specifically, our
    method considers pixel synthesis for the interpolated frame as local
    convolution over two input frames. The convolution kernel captures both the
    local motion between the input frames and the coefficients for pixel synthesis.
    Our method employs a deep fully convolutional neural network to estimate a
    spatially-adaptive convolution kernel for each pixel. This deep neural network
    can be directly trained end to end using widely available video data without
    any difficult-to-obtain ground-truth data like optical flow. Our experiments
    show that the formulation of video interpolation as a single convolution
    process allows our method to gracefully handle challenges like occlusion, blur,
    and abrupt brightness change and enables high-quality video frame
    interpolation.
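
    A compact sketch of the pixel-synthesis step this describes (kernel size,
    tensor shapes, and the omission of kernel normalization are simplifying
    assumptions): each output pixel is a local convolution of co-located patches
    from the two input frames with a per-pixel kernel predicted by the CNN.

        import torch
        import torch.nn.functional as F

        def synthesize_frame(frame1, frame2, kernels, k=5):
            # frame1, frame2: (B, C, H, W); kernels: (B, 2*k*k, H, W), one kernel per pixel
            b, c, h, w = frame1.shape
            p1 = F.unfold(frame1, k, padding=k // 2).view(b, c, k * k, h, w)
            p2 = F.unfold(frame2, k, padding=k // 2).view(b, c, k * k, h, w)
            k1, k2 = kernels[:, :k * k], kernels[:, k * k:]
            # Weighted sum over each k x k neighbourhood, separately for both frames.
            return (p1 * k1.unsqueeze(1)).sum(2) + (p2 * k2.unsqueeze(1)).sum(2)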

    Deep Photo Style Transfer

    Fujun Luan, Sylvain Paris, Eli Shechtman, Kavita Bala
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper introduces a deep-learning approach to photographic style transfer
    that handles a large variety of image content while faithfully transferring the
    reference style. Our approach builds upon recent work on painterly transfer
    that separates style from the content of an image by considering different
    layers of a neural network. However, as is, this approach is not suitable for
    photorealistic style transfer. Even when both the input and reference images
    are photographs, the output still exhibits distortions reminiscent of a
    painting. Our contribution is to constrain the transformation from the input to
    the output to be locally affine in colorspace, and to express this constraint
    as a custom CNN layer through which we can backpropagate. We show that this
    approach successfully suppresses distortion and yields satisfying
    photorealistic style transfers in a broad variety of scenarios, including
    transfer of the time of day, weather, season, and artistic edits.

    Knowledge Transfer for Melanoma Screening with Deep Learning

    Afonso Menegola, Michel Fornaciali, Ramon Pires, Flávia Vasques Bittencourt, Sandra Avila, Eduardo Valle
    Comments: 4 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Knowledge transfer impacts the performance of deep learning — the state of
    the art for image classification tasks, including automated melanoma screening.
    Deep learning’s greed for large amounts of training data poses a challenge for
    medical tasks, which we can alleviate by recycling knowledge from models
    trained on different tasks, in a scheme called transfer learning. Although much
    of the best art on automated melanoma screening employs some form of transfer
    learning, a systematic evaluation was missing. Here we investigate the presence
    of transfer, from which task the transfer is sourced, and the application of
    fine tuning (i.e., retraining of the deep learning model after transfer). We
    also test the impact of picking deeper (and more expensive) models. Our results
    favor deeper models, pre-trained over ImageNet, with fine-tuning, reaching an
    AUC of 80.7% and 84.5% for the two skin-lesion datasets evaluated.
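
    A minimal sketch of the recipe these results favor (model choice, class
    count, and hyperparameters are illustrative assumptions, not the paper's exact
    setup): start from a deeper ImageNet-pretrained network, replace its classifier
    for the skin-lesion task, and fine-tune all layers at a small learning rate.

        import torch
        import torch.nn as nn
        from torchvision import models

        model = models.resnet50(pretrained=True)         # ImageNet-pretrained backbone
        model.fc = nn.Linear(model.fc.in_features, 2)    # melanoma vs. benign head
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
        criterion = nn.CrossEntropyLoss()
        # Fine-tuning loop over dermoscopy images (data loading not shown) updates all layers.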

    Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes

    S. Alireza Golestaneh, Lina J. Karam
    Comments: Paper got accepted in CVPR 2017
    Journal-ref: 2017 IEEE Conference on Computer Vision and Pattern Recognition
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The detection of spatially-varying blur without having any information about
    the blur type is a challenging task. In this paper, we propose a novel
    effective approach to address the blur detection problem from a single image
    without requiring any knowledge about the blur type, level, or camera settings.
    Our approach computes blur detection maps based on a novel High-frequency
    multiscale Fusion and Sort Transform (HiFST) of gradient magnitudes. The
    evaluations of the proposed approach on a diverse set of blurry images with
    different blur types, levels, and contents demonstrate that the proposed
    algorithm performs favorably against the state-of-the-art methods qualitatively
    and quantitatively.

    PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding

    Chunhui Liu, Yueyu Hu, Yanghao Li, Sijie Song, Jiaying Liu
    Comments: 10 pages, submitted to ICCV 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Despite the fact that many 3D human activity benchmarks have been proposed,
    most existing action datasets focus on action recognition tasks for segmented
    videos. There is a lack of standard large-scale benchmarks,
    especially for current popular data-hungry deep learning based methods. In this
    paper, we introduce a new large scale benchmark (PKU-MMD) for continuous
    multi-modality 3D human action understanding and cover a wide range of complex
    human activities with well annotated information. PKU-MMD contains 1076 long
    video sequences in 51 action categories, performed by 66 subjects in three
    camera views. It contains almost 20,000 action instances and 5.4 million frames
    in total. Our dataset also provides multi-modality data sources, including RGB,
    depth, Infrared Radiation and Skeleton. With different modalities, we conduct
    extensive experiments on our dataset in terms of two scenarios and evaluate
    different methods by various metrics, including a new proposed evaluation
    protocol 2D-AP. We believe this large-scale dataset will benefit future
    research on action detection in the community.

    Episode-Based Active Learning with Bayesian Neural Networks

    Feras Dayoub, Niko Sünderhauf, Peter Corke
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    We investigate different strategies for active learning with Bayesian deep
    neural networks. We focus our analysis on scenarios where new, unlabeled data
    is obtained episodically, such as commonly encountered in mobile robotics
    applications. An evaluation of different strategies for acquisition, updating,
    and final training on the CIFAR-10 dataset shows that incremental network
    updates with final training on the accumulated acquisition set are essential
    for best performance, while limiting the amount of required human labeling
    labor.

    No Fuss Distance Metric Learning using Proxies

    Yair Movshovitz-Attias, Alexander Toshev, Thomas K. Leung, Sergey Ioffe, Saurabh Singh
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We address the problem of distance metric learning (DML), defined as learning
    a distance consistent with a notion of semantic similarity. Traditionally, for
    this problem supervision is expressed in the form of sets of points that follow
    an ordinal relationship — an anchor point \(x\) is similar to a set of positive
    points \(Y\), and dissimilar to a set of negative points \(Z\), and a loss defined
    over these distances is minimized.

    While the specifics of the optimization differ, in this work we collectively
    call this type of supervision Triplets and all methods that follow this pattern
    Triplet-Based methods. These methods are challenging to optimize. A main issue
    is the need for finding informative triplets, which is usually achieved by a
    variety of tricks such as increasing the batch size, hard or semi-hard triplet
    mining, etc., but even with these tricks, the convergence rate of such methods
    is slow. In this paper we propose to optimize the triplet loss on a different
    space of triplets, consisting of an anchor data point and similar and
    dissimilar proxy points. These proxies approximate the original data points, so
    that a triplet loss over the proxies is a tight upper bound of the original
    loss. This proxy-based loss is empirically better behaved. As a result, the
    proxy-loss improves on state-of-art results for three standard zero-shot
    learning datasets, by up to 15% points, while converging three times as fast as
    other triplet-based losses.
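
    A hedged sketch of a proxy-based loss in the spirit described above (an
    NCA-style formulation with one learnable proxy per class; the exact loss and
    normalization in the paper may differ): each sample is pulled toward its class
    proxy and pushed away from all other proxies, so no triplet mining is needed.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ProxyLoss(nn.Module):
            def __init__(self, num_classes, dim):
                super().__init__()
                self.proxies = nn.Parameter(torch.randn(num_classes, dim))

            def forward(self, embeddings, labels):
                x = F.normalize(embeddings, dim=1)
                p = F.normalize(self.proxies, dim=1)
                d = torch.cdist(x, p) ** 2                        # squared distances to proxies
                pos = d.gather(1, labels.view(-1, 1)).squeeze(1)  # distance to own-class proxy
                neg = d.scatter(1, labels.view(-1, 1), float('inf'))
                return (pos + torch.logsumexp(-neg, dim=1)).mean()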

    IOD-CNN: Integrating Object Detection Networks for Event Recognition

    Sungmin Eum, Hyungtae Lee, Heesung Kwon, David Doermann
    Comments: submitted to IEEE International Conference on Image Processing 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Many previous methods have shown the importance of considering semantically
    relevant objects for performing event recognition, yet none of the methods have
    exploited the power of deep convolutional neural networks to directly integrate
    relevant object information into a unified network. We present a novel unified
    deep CNN architecture which integrates architecturally different, yet
    semantically-related object detection networks to enhance the performance of
    the event recognition task. Our architecture allows the sharing of the
    convolutional layers and a fully connected layer which effectively integrates
    event recognition, rigid object detection and non-rigid object detection.

    Simple Online and Realtime Tracking with a Deep Association Metric

    Nicolai Wojke, Alex Bewley, Dietrich Paulus
    Comments: 5 pages, 1 figure
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Simple Online and Realtime Tracking (SORT) is a pragmatic approach to
    multiple object tracking with a focus on simple, effective algorithms. In this
    paper, we integrate appearance information to improve the performance of SORT.
    Due to this extension we are able to track objects through longer periods of
    occlusions, effectively reducing the number of identity switches. In the spirit
    of the original framework, we place much of the computational complexity into an
    offline pre-training stage where we learn a deep association metric on a
    large-scale person re-identification dataset. During online application, we
    establish measurement-to-track associations using nearest neighbor queries in
    visual appearance space. Experimental evaluation shows that our extensions
    reduce the number of identity switches by 45%, achieving overall competitive
    performance at high frame rates.
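
    A small sketch of the appearance term this describes (the cosine metric and
    the omission of the motion and gating components are simplifying assumptions):
    each track keeps a gallery of past appearance descriptors, and the cost of
    associating a new detection with a track is its smallest cosine distance to
    that gallery.

        import numpy as np

        def appearance_cost(track_gallery, detection_descriptor):
            # track_gallery: (N, D) descriptors kept for one track; detection_descriptor: (D,)
            g = track_gallery / np.linalg.norm(track_gallery, axis=1, keepdims=True)
            d = detection_descriptor / np.linalg.norm(detection_descriptor)
            return float(np.min(1.0 - g @ d))   # nearest-neighbour cosine distance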

    ASP: Learning to Forget with Adaptive Synaptic Plasticity in Spiking Neural Networks

    Priyadarshini Panda, Jason M. Allred, Shriram Ramanathan, Kaushik Roy
    Comments: 14 pages, 14 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV)

    A fundamental feature of learning in animals is the “ability to forget” that
    allows an organism to perceive, model and make decisions from disparate streams
    of information and adapt to changing environments. Against this backdrop, we
    present a novel unsupervised learning mechanism ASP (Adaptive Synaptic
    Plasticity) for improved recognition with Spiking Neural Networks (SNNs) for
    real time on-line learning in a dynamic environment. We incorporate an adaptive
    weight decay mechanism with the traditional Spike Timing Dependent Plasticity
    (STDP) learning to model adaptivity in SNNs. The leak rate of the synaptic
    weights is modulated based on the temporal correlation between the spiking
    patterns of the pre- and post-synaptic neurons. This mechanism helps in gradual
    forgetting of insignificant data while retaining significant, yet old,
    information. ASP, thus, maintains a balance between forgetting and immediate
    learning to construct a stable-plastic self-adaptive SNN for continuously
    changing inputs. We demonstrate that the proposed learning methodology
    addresses catastrophic forgetting while yielding significantly improved
    accuracy over the conventional STDP learning method for digit recognition
    applications. Additionally, we observe that the proposed learning model
    automatically encodes selective attention towards relevant features in the
    input data while eliminating the influence of background noise (i.e.,
    denoising), further improving the robustness of ASP learning.


    Artificial Intelligence

    $1 Today or $2 Tomorrow? The Answer is in Your Facebook Likes

    Tao Ding, Shimei Pan, Warren K. Bickel
    Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Social and Information Networks (cs.SI)

    In economics and psychology, delay discounting is often used to characterize
    how individuals choose between a smaller immediate reward and a larger delayed
    reward. People with higher delay discounting rate (DDR) often choose smaller
    but more immediate rewards (a “today person”). In contrast, people with a lower
    discounting rate often choose a larger future rewards (a “tomorrow person”).
    Since the ability to modulate the desire of immediate gratification for long
    term rewards plays an important role in our decision-making, the lower
    discounting rate often predicts better social, academic and health outcomes. In
    contrast, the higher discounting rate is often associated with problematic
    behaviors such as alcohol/drug abuse, pathological gambling and credit card
    default. Thus, research on understanding and moderating delay discounting has
    the potential to produce substantial societal benefits.
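
    As a concrete reference, the hyperbolic model commonly used in the
    delay-discounting literature (an illustrative assumption; the paper may use a
    different functional form) values a reward of amount \(A\) delayed by \(D\) as

        V = A / (1 + kD),

    where \(k\) is the discounting rate. With \(k = 0.05\) per day, $2 delayed by 30
    days is worth 2 / (1 + 1.5) = $0.80, so a high-\(k\) “today person” prefers $1
    now; with \(k = 0.01\), the delayed reward is worth about $1.54, and a low-\(k\)
    “tomorrow person” waits.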

    RobustFill: Neural Program Learning under Noisy I/O

    Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, Pushmeet Kohli
    Comments: 8 pages + 9 pages of supplementary material
    Subjects: Artificial Intelligence (cs.AI)

    The problem of automatically generating a computer program from some
    specification has been studied since the early days of AI. Recently, two
    competing approaches for automatic program learning have received significant
    attention: (1) neural program synthesis, where a neural network is conditioned
    on input/output (I/O) examples and learns to generate a program, and (2) neural
    program induction, where a neural network generates new outputs directly using
    a latent program representation.

    Here, for the first time, we directly compare both approaches on a
    large-scale, real-world learning task. We additionally contrast to rule-based
    program synthesis, which uses hand-crafted semantics to guide the program
    generation. Our neural models use a modified attention RNN to allow encoding of
    variable-sized sets of I/O pairs. Our best synthesis model achieves 92%
    accuracy on a real-world test set, compared to the 34% accuracy of the previous
    best neural synthesis approach. The synthesis model also outperforms a
    comparable induction model on this task, but we more importantly demonstrate
    that the strength of each approach is highly dependent on the evaluation metric
    and end-user application. Finally, we show that we can train our neural models
    to remain very robust to the type of noise expected in real-world data (e.g.,
    typos), while a highly-engineered rule-based system fails entirely.

    S-Concave Distributions: Towards Broader Distributions for Noise-Tolerant and Sample-Efficient Learning Algorithms

    Maria-Florina Balcan, Hongyang Zhang
    Comments: 39 pages
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We provide new results concerning noise-tolerant and sample-efficient
    learning algorithms under \(s\)-concave distributions over \(\mathbb{R}^n\) for
    \(-\frac{1}{2n+3} \le s \le 0\). The new class of \(s\)-concave distributions is a
    broad and natural generalization of log-concavity, and includes many important
    additional distributions, e.g., the Pareto distribution and the \(t\)-distribution.
    This class has been studied in the context of efficient sampling, integration,
    and optimization, but much remains unknown concerning the geometry of this
    class of distributions and their applications in the context of learning.

    The challenge is that unlike the commonly used distributions in learning
    (uniform or more generally log-concave distributions), this broader class is
    not closed under the marginalization operator and many such distributions are
    fat-tailed. In this work, we introduce new convex geometry tools to study the
    properties of s-concave distributions and use these properties to provide
    bounds on quantities of interest to learning including the probability of
    disagreement between two halfspaces, disagreement outside a band, and
    disagreement coefficient. We use these results to significantly generalize
    prior results for margin-based active learning, disagreement-based active
    learning, and passively learning of intersections of halfspaces.

    Our analysis of geometric properties of s-concave distributions might be of
    independent interest to optimization more broadly.
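
    For reference, the standard definition behind this class (stated here for
    completeness, not quoted from the paper): a density \(f\) on \(\mathbb{R}^n\) is
    \(s\)-concave if, for all \(x, y\) and \(\lambda \in [0, 1]\),

        f(\lambda x + (1 - \lambda) y) \ge \bigl(\lambda f(x)^{s} + (1 - \lambda) f(y)^{s}\bigr)^{1/s},

    with the limit \(s \to 0\) recovering log-concavity,
    \(f(\lambda x + (1-\lambda) y) \ge f(x)^{\lambda} f(y)^{1-\lambda}\), so every
    log-concave density is also \(s\)-concave for the negative values of \(s\)
    considered here.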

    UBEV – A More Practical Algorithm for Episodic RL with Near-Optimal PAC and Regret Guarantees

    Christoph Dann, Tor Lattimore, Emma Brunskill
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We present UBEV, a simple and efficient reinforcement learning algorithm for
    fixed-horizon episodic Markov decision processes. The main contribution is a
    proof that UBEV enjoys a sample-complexity bound that holds for all accuracy
    levels simultaneously with high probability, and matches the lower bound except
    for logarithmic terms and one factor of the horizon. A consequence of the fact
    that our sample-complexity bound holds for all accuracy levels is that the new
    algorithm achieves a sub-linear regret of \(O(\sqrt{SAT})\), which is the first time
    the dependence on the size of the state space has provably appeared inside the
    square root. A brief empirical evaluation shows that UBEV is practically
    superior to existing algorithms with known sample-complexity guarantees.

    Deep Exploration via Randomized Value Functions

    Ian Osband, Daniel Russo, Zheng Wen, Benjamin Van Roy
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We study the use of randomized value functions to guide deep exploration in
    reinforcement learning. This offers an elegant means for synthesizing
    statistically and computationally efficient exploration with common practical
    approaches to value function learning. We present several reinforcement
    learning algorithms that leverage randomized value functions and demonstrate
    their efficacy through computational studies. We also prove a regret bound that
    establishes statistical efficiency with a tabular representation.

    Deep Learning for Explicitly Modeling Optimization Landscapes

    Shumeet Baluja
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    In all but the most trivial optimization problems, the structure of the
    solutions exhibits complex interdependencies between the input parameters.
    Decades of research with stochastic search techniques have shown the benefit of
    explicitly modeling the interactions between sets of parameters and the overall
    quality of the solutions discovered. We demonstrate a novel method, based on
    learning deep networks, to model the global landscapes of optimization
    problems. To represent the search space concisely and accurately, the deep
    networks must encode information about the underlying parameter interactions
    and their contributions to the quality of the solution. Once the networks are
    trained, the networks are probed to reveal parameter combinations with high
    expected performance with respect to the optimization task. These estimates are
    used to initialize fast, randomized, local search algorithms, which in turn
    expose more information about the search space that is subsequently used to
    refine the models. We demonstrate the technique on multiple optimization
    problems that have arisen in a variety of real-world domains, including:
    packing, graphics, job scheduling, layout and compression. The problems include
    combinatoric search spaces, discontinuous and highly non-linear spaces, and
    span binary, higher-cardinality discrete, as well as continuous parameters.
    Strengths, limitations, and extensions of the approach are extensively
    discussed and demonstrated.

    Ontology Based Pivoted normalization using Vector Based Approach for information Retrieval

    Vishal Jain, Dr. Mayank Singh
    Journal-ref: 7th International Conference on Advanced Computing and
    Communication Technologies, 16th November, 2013
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)

    The proposed methodology is procedural, i.e., it follows a finite number of
    steps that extract relevant documents according to the user's query. It is
    based on principles of Data Mining for analyzing web data. Data Mining first
    integrates the data to generate a warehouse. Then, it extracts useful
    information with the help of an algorithm. The task of representing the
    extracted documents is done using a Vector Based Statistical Approach that
    represents each document as a set of terms.

    Improving Statistical Multimedia Information Retrieval Model by using Ontology

    Gagandeep Singh Narula, Vishal Jain
    Journal-ref: International Journal of Computer Applications ISSN No 0975 8887
    Volume 94 No 2, May 2014
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)

    A typical IR system that delivers and stores information is affected by the
    problem of matching between the user query and the content available on the
    web. The use of an Ontology represents the extracted terms in the form of a
    network graph consisting of nodes, edges, index terms, etc. The above-mentioned
    IR approaches provide relevance, thus satisfying the user's query. The paper
    also emphasizes analyzing multimedia documents and performs calculations for
    the extracted terms using different statistical formulas. The proposed model
    reduces the semantic gap and satisfies user needs efficiently.


    Information Retrieval

    Ontology Based Pivoted normalization using Vector Based Approach for information Retrieval

    Vishal Jain, Dr. Mayank Singh
    Journal-ref: 7th International Conference on Advanced Computing and
    Communication Technologies, 16th November, 2013
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)

    The proposed methodology is procedural, i.e., it follows a finite number of
    steps that extract relevant documents according to the user's query. It is
    based on principles of Data Mining for analyzing web data. Data Mining first
    integrates the data to generate a warehouse. Then, it extracts useful
    information with the help of an algorithm. The task of representing the
    extracted documents is done using a Vector Based Statistical Approach that
    represents each document as a set of terms.

    Improving Statistical Multimedia Information Retrieval Model by using Ontology

    Gagandeep Singh Narula, Vishal Jain
    Journal-ref: International Journal of Computer Applications ISSN No 0975 8887
    Volume 94 No 2, May 2014
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)

    A typical IR system that delivers and stores information is affected by the
    problem of matching between the user query and the content available on the
    web. The use of an Ontology represents the extracted terms in the form of a
    network graph consisting of nodes, edges, index terms, etc. The above-mentioned
    IR approaches provide relevance, thus satisfying the user's query. The paper
    also emphasizes analyzing multimedia documents and performs calculations for
    the extracted terms using different statistical formulas. The proposed model
    reduces the semantic gap and satisfies user needs efficiently.


    Computation and Language

    Direct Acoustics-to-Word Models for English Conversational Speech Recognition

    Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon, Michael Picheny, David Nahamoo
    Comments: Submitted to Interspeech-2017
    Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Recent work on end-to-end automatic speech recognition (ASR) has shown that
    the connectionist temporal classification (CTC) loss can be used to convert
    acoustics to phone or character sequences. Such systems are used with a
    dictionary and separately-trained Language Model (LM) to produce word
    sequences. However, they are not truly end-to-end in the sense of mapping
    acoustics directly to words without an intermediate phone representation. In
    this paper, we present the first results employing direct acoustics-to-word CTC
    models on two well-known public benchmark tasks: Switchboard and CallHome.
    These models do not require an LM or even a decoder at run-time and hence
    recognize speech with minimal complexity. However, due to the large number of
    word output units, CTC word models require orders of magnitude more data to
    train reliably compared to traditional systems. We present some techniques to
    mitigate this issue. Our CTC word model achieves a word error rate of
    13.0%/18.8% on the Hub5-2000 Switchboard/CallHome test sets without any LM or
    decoder compared with 9.6%/16.0% for phone-based CTC with a 4-gram LM. We also
    present rescoring results on CTC word model lattices to quantify the
    performance benefits of a LM, and contrast the performance of word and phone
    CTC models.

    Hierarchical RNN with Static Sentence-Level Attention for Text-Based Speaker Change Detection

    Zhao Meng, Lili Mou, Zhi Jin
    Subjects: Computation and Language (cs.CL)

    Traditional speaker change detection in dialogues is typically based on audio
    input. In some scenarios, however, researchers can only obtain text, and do not
    have access to raw audio signals. Moreover, with the increasing need for deep
    semantic processing, text-based dialogue understanding is attracting more
    attention in the community. These raise the problem of text-based speaker
    change detection. In this paper, we formulate the task as a matching problem of
    utterances before and after a certain decision point; we propose a hierarchical
    recurrent neural network (RNN) with static sentence-level attention. Our model
    comprises three main components: a sentence encoder with a long short term
    memory (LSTM)-based RNN, a context encoder with another LSTM-RNN, and a static
    sentence-level attention mechanism, which allows rich information interaction.
    Experimental results show that neural networks consistently achieve better
    performance than feature-based approaches, and that our attention-based model
    significantly outperforms non-attention neural networks.

    Topic Identification for Speech without ASR

    Chunxi Liu, Jan Trmal, Matthew Wiesner, Craig Harman, Sanjeev Khudanpur
    Comments: 5 pages, 2 figures, submitted to Interspeech 2017
    Subjects: Computation and Language (cs.CL)

    Modern topic identification (topic ID) systems for speech use automatic
    speech recognition (ASR) to produce speech transcripts, and perform supervised
    classification on such ASR outputs. However, under resource-limited conditions,
    the manually transcribed speech required to develop standard ASR systems can be
    severely limited or unavailable. In this paper, we investigate alternative
    unsupervised solutions to obtaining tokenizations of speech in terms of a
    vocabulary of automatically discovered word-like or phoneme-like units, without
    depending on the supervised training of ASR systems. Moreover, using automatic
    phoneme-like tokenizations, we demonstrate that a convolutional neural network
    based framework for learning spoken document representations provides
    competitive performance compared to a standard bag-of-words representation, as
    evidenced by comprehensive topic ID evaluations on both single-label and
    multi-label classification tasks.

    The NLTK FrameNet API: Designing for Discoverability with a Rich Linguistic Resource

    Nathan Schneider, Chuck Wooters
    Subjects: Computation and Language (cs.CL)

    A new Python API, integrated within the NLTK suite, offers access to the
    FrameNet 1.7 lexical database. The lexicon (structured in terms of frames) as
    well as annotated sentences can be processed programmatically, or browsed with
    human-readable displays via the interactive Python prompt.
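
    A brief usage sketch of that API (a hedged example: the FrameNet 1.7 data must
    first be fetched with nltk.download('framenet_v17'), and attribute names follow
    the NLTK corpus reader and may vary slightly between NLTK versions):

        from nltk.corpus import framenet as fn

        print(len(fn.frames()))                       # all frames in the lexicon
        for f in fn.frames(r'(?i)motion')[:3]:        # regex search over frame names
            print(f.name, '-', f.definition[:60])

        motion = fn.frame('Motion')                   # look up one frame by name
        print(sorted(motion.FE.keys()))               # its frame elements
        print(sorted(motion.lexUnit.keys())[:5])      # a few of its lexical units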

    Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries

    Yu-Hsuan Wang, Cheng-Tao Chung, Hung-yi Lee
    Comments: 5 pages
    Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)

    In this paper we analyze the gate activation signals inside the gated
    recurrent neural networks, and find the temporal structure of such signals is
    highly correlated with the phoneme boundaries. This correlation is further
    verified by a set of experiments for phoneme segmentation, in which better
    results compared to standard approaches were obtained.


    Distributed, Parallel, and Cluster Computing

    Snafu: Function-as-a-Service (FaaS) Runtime Design and Implementation

    Josef Spillner
    Comments: 15 pages, 9 figures, 4 tables, repeatable, unreviewed
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Snafu, or Snake Functions, is a modular system to host, execute and manage
    language-level functions offered as stateless (micro-)services to diverse
    external triggers. The system interfaces resemble those of commercial FaaS
    providers but its implementation provides distinct features which make it
    overall useful to research on FaaS and prototyping of FaaS-based applications.
    This paper discusses the motivation for the system in the presence of already
    existing alternatives, its design and architecture, the open-source
    implementation, and collected metrics which characterise the system.

    Approximating k-spanners in the LOCAL model

    Michael Dinitz, Yasamin Nazari
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC); Combinatorics (math.CO)

    Graph spanners have been studied extensively, and have many applications in
    algorithms, distributed systems, and computer networks. For many of these
    applications, we want distributed constructions of spanners, i.e., algorithms
    which use only local information. Dinitz and Krauthgamer (PODC 2011) provided a
    distributed approximation algorithm for 2-spanners in the LOCAL model with
    polylogarithmic running time, but the question of whether a similar algorithm
    exists for k-spanners with k > 2 remained open. In this paper, we show that a
    similar algorithm also works for cases where k > 2.


    Learning

    Independently Controllable Features

    Emmanuel Bengio, Valentin Thomas, Joelle Pineau, Doina Precup, Yoshua Bengio
    Comments: RLDM submission
    Subjects: Learning (cs.LG)

    Finding features that disentangle the different causes of variation in real
    data is a difficult task that has nonetheless received considerable attention
    in static domains like natural images. Interactive environments, in which an
    agent can deliberately take actions, offer an opportunity to tackle this task
    better, because the agent can experiment with different actions and observe
    their effects. We introduce the idea that in interactive environments, latent
    factors that control the variation in observed data can be identified by
    figuring out what the agent can control. We propose a naive method to find
    factors that explain or measure the effect of the actions of a learner, and
    test it in illustrative experiments.

    UBEV – A More Practical Algorithm for Episodic RL with Near-Optimal PAC and Regret Guarantees

    Christoph Dann, Tor Lattimore, Emma Brunskill
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We present UBEV, a simple and efficient reinforcement learning algorithm for
    fixed-horizon episodic Markov decision processes. The main contribution is a
    proof that UBEV enjoys a sample-complexity bound that holds for all accuracy
    levels simultaneously with high probability, and matches the lower bound except
    for logarithmic terms and one factor of the horizon. A consequence of the fact
    that our sample-complexity bound holds for all accuracy levels is that the new
    algorithm achieves a sub-linear regret of \(O(\sqrt{SAT})\), which is the first time
    the dependence on the size of the state space has provably appeared inside the
    square root. A brief empirical evaluation shows that UBEV is practically
    superior to existing algorithms with known sample-complexity guarantees.

    Characterization of Deterministic and Probabilistic Sampling Patterns for Finite Completability of Low Tensor-Train Rank Tensor

    Morteza Ashraphijuo, Xiaodong Wang
    Comments: arXiv admin note: text overlap with arXiv:1612.01597
    Subjects: Learning (cs.LG); Information Theory (cs.IT); Algebraic Geometry (math.AG); Machine Learning (stat.ML)

    In this paper, we analyze the fundamental conditions for low-rank tensor
    completion given the separation or tensor-train (TT) rank, i.e., ranks of
    unfoldings. We exploit the algebraic structure of the TT decomposition to
    obtain the deterministic necessary and sufficient conditions on the locations
    of the samples to ensure finite completability. Specifically, we propose an
    algebraic geometric analysis on the TT manifold that can incorporate the whole
    rank vector simultaneously in contrast to the existing approach based on the
    Grassmannian manifold that can only incorporate one rank component. Our
    proposed technique characterizes the algebraic independence of a set of
    polynomials defined based on the sampling pattern and the TT decomposition,
    which is instrumental to obtaining the deterministic condition on the sampling
    pattern for finite completability. In addition, based on the proposed analysis,
    assuming that the entries of the tensor are sampled independently with
    probability \(p\), we derive a lower bound on the sampling probability \(p\), or
    equivalently, the number of sampled entries that ensures finite completability
    with high probability. Moreover, we also provide the deterministic and
    probabilistic conditions for unique completability.

    Machine Learning Based Source Code Classification Using Syntax Oriented Features

    Shaul Zevin, Catherine Holzem
    Comments: 13 pages, 4 tables, 4 examples
    Subjects: Learning (cs.LG); Programming Languages (cs.PL)

    As of today the programming language of the vast majority of the published
    source code is manually specified or programmatically assigned based solely on
    the file extension. In this paper we show that the source code programming
    language identification task can be fully automated using machine learning
    techniques. We first define the criteria that a production-level automatic
    programming language identification solution should meet. Our criteria include
    accuracy, programming language coverage, extensibility and performance. We then
    describe our approach: How training files are preprocessed for extracting
    features that mimic grammar productions, and then how these extracted grammar
    productions are used for the training and testing of our classifier. We achieve
    a 99 percent accuracy rate while classifying 29 of the most popular programming
    languages with a Maximum Entropy classifier.
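
    The paper's grammar-production-style features are not reproduced here; as a
    rough stand-in under stated assumptions, token n-grams approximate those
    features and scikit-learn's multinomial logistic regression plays the role of
    the maximum entropy classifier. The toy snippets and labels below are invented
    purely for illustration.

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy stand-in: token n-grams approximate grammar-production-like features;
    # multinomial logistic regression is the standard "maximum entropy" classifier.
    snippets = [
        ("def add(a, b):\n    return a + b", "Python"),
        ("print('hello world')", "Python"),
        ("int main() { return 0; }", "C"),
        ("#include <stdio.h>\nint x = 0;", "C"),
        ("function add(a, b) { return a + b; }", "JavaScript"),
        ("console.log('hello');", "JavaScript"),
    ]
    texts, labels = zip(*snippets)

    clf = make_pipeline(
        CountVectorizer(analyzer="word", token_pattern=r"\S+", ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    clf.fit(texts, labels)

    print(clf.predict(['printf("%d", x);', "lambda x: x + 1"]))
    ```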

    Clustering for Different Scales of Measurement – the Gap-Ratio Weighted K-means Algorithm

    Joris Guérin, Olivier Gibaru, Stéphane Thiery, Eric Nyiri
    Comments: 13 pages, 6 figures, 2 tables. This paper is under the review process for AIAP 2017
    Subjects: Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)

    This paper describes a method for clustering data that are spread out over
    large regions and whose dimensions are on different scales of measurement. The
    algorithm was developed for a robotics application consisting of sorting and
    storing objects in an unsupervised way. The toy dataset used to validate this
    application consists of Lego bricks of different shapes and colors. The
    uncontrolled lighting conditions and the use of RGB color features
    respectively lead to data with a large spread and to different scales of
    measurement across the data dimensions. To overcome the combination of these
    two characteristics in the data, we have developed a new weighted K-means
    algorithm, called gap-ratio K-means, which weights each dimension of the
    feature space before running the K-means algorithm. The weight associated with
    a feature is proportional to the ratio between the largest gap separating two
    consecutive data points and the average of all the other gaps.
    This method is compared with two other variants of K-means on the Lego bricks
    clustering problem as well as two other common classification datasets.
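
    A minimal sketch of the weighting rule described above, assuming a simple
    normalization of the weights (the exact scaling used by the authors may
    differ): each feature's weight is the largest gap between consecutive sorted
    values divided by the mean of the remaining gaps, and the features are
    rescaled by these weights before running K-means.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def gap_ratio_weights(X):
        """Per-feature weights: largest gap between consecutive sorted values
        divided by the mean of the remaining gaps (normalization is an
        assumption of this sketch)."""
        weights = np.empty(X.shape[1])
        for j in range(X.shape[1]):
            gaps = np.diff(np.sort(X[:, j]))
            biggest = gaps.max()
            others = gaps[gaps < biggest]
            weights[j] = biggest / others.mean() if others.size else 1.0
        return weights / weights.sum()

    rng = np.random.default_rng(0)
    # Two features on very different scales (e.g. a size in mm and an RGB channel).
    X = np.column_stack([
        np.concatenate([rng.normal(10, 1, 50), rng.normal(50, 1, 50)]),  # well separated
        rng.uniform(0, 255, 100),                                        # noisy, large scale
    ])

    w = gap_ratio_weights(X)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X * w)
    print("feature weights:", w)
    ```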

    Efficient PAC Learning from the Crowd

    Pranjal Awasthi, Avrim Blum, Nika Haghtalab, Yishay Mansour
    Subjects: Learning (cs.LG); Data Structures and Algorithms (cs.DS)

    In recent years crowdsourcing has become the method of choice for gathering
    labeled training data for learning algorithms. Standard approaches to
    crowdsourcing view the process of acquiring labeled data separately from the
    process of learning a classifier from the gathered data. This can give rise to
    computational and statistical challenges. For example, in most cases there are
    no known computationally efficient learning algorithms that are robust to the
    high level of noise that exists in crowdsourced data, and efforts to eliminate
    noise through voting often require a large number of queries per example.

    In this paper, we show how by interleaving the process of labeling and
    learning, we can attain computational efficiency with much less overhead in the
    labeling cost. In particular, we consider the realizable setting where there
    exists a true target function in (\mathcal{F}) and consider a pool of labelers.
    When a noticeable fraction of the labelers are perfect, and the rest behave
    arbitrarily, we show that any (\mathcal{F}) that can be efficiently learned in
    the traditional realizable PAC model can be learned in a computationally
    efficient manner by querying the crowd, despite high amounts of noise in the
    responses. Moreover, we show that this can be done while each labeler only
    labels a constant number of examples and the number of labels requested per
    example, on average, is a constant. When no perfect labelers exist, a related
    task is to find a set of labelers that are good but not perfect. We show that
    we can identify all good labelers when at least a majority of the labelers are
    good.
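
    As a toy illustration only (the paper's interleaved algorithm and its
    query-efficiency analysis are not reproduced), the sketch below simulates a
    pool of labelers, a fraction of which are perfect, requests a constant number
    of labels per example, aggregates them by majority vote, and trains a standard
    classifier on the voted labels. All quantities are made-up assumptions.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)

    # Ground-truth halfspace and a pool of labelers, a fraction of which are perfect.
    d, n_points, n_labelers, labels_per_example = 5, 400, 20, 3
    w_true = rng.standard_normal(d)
    X = rng.standard_normal((n_points, d))
    y_true = (X @ w_true > 0).astype(int)
    perfect = rng.random(n_labelers) < 0.7        # 70% of labelers are perfect

    def query(labeler, true_label):
        # Perfect labelers return the true label; the rest answer arbitrarily (randomly).
        return true_label if perfect[labeler] else rng.integers(0, 2)

    # Each example gets a constant number of labels; majority vote aggregates them.
    votes = np.zeros(n_points)
    for i in range(n_points):
        chosen = rng.choice(n_labelers, size=labels_per_example, replace=False)
        votes[i] = np.mean([query(l, y_true[i]) for l in chosen])
    y_crowd = (votes > 0.5).astype(int)

    clf = LogisticRegression().fit(X, y_crowd)
    print("accuracy vs. ground truth:", clf.score(X, y_true))
    ```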

    REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models

    George Tucker, Andriy Mnih, Chris J. Maddison, Jascha Sohl-Dickstein
    Comments: Accepted as a workshop submission at ICLR 2017
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Learning in models with discrete latent variables is challenging due to high
    variance gradient estimators. Generally, approaches have relied on control
    variates to reduce the variance of the REINFORCE estimator. Recent work (Jang
    et al. 2016, Maddison et al. 2016) has taken a different approach, introducing
    a continuous relaxation of discrete variables to produce low-variance, but
    biased, gradient estimates. In this work, we combine the two approaches through
    a novel control variate that produces low-variance, unbiased gradient
    estimates. We present encouraging preliminary results on a toy problem and on
    learning sigmoid belief networks.
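
    REBAR's specific relaxation-based control variate is not shown here; as
    background on the general idea, the sketch below runs plain REINFORCE for a
    single Bernoulli latent variable with a running-mean baseline as a simple
    control variate. The objective f and all constants are illustrative
    assumptions.

    ```python
    import numpy as np

    # Background sketch, not REBAR itself: REINFORCE for a single Bernoulli latent
    # variable with a moving-average baseline as a (simple) control variate.
    # Objective: maximize E_{b ~ Bernoulli(sigmoid(theta))}[f(b)], f(b) = (b - 0.45)**2.
    rng = np.random.default_rng(0)
    theta, baseline, lr = 0.0, 0.0, 0.1
    f = lambda b: (b - 0.45) ** 2

    for step in range(2000):
        p = 1.0 / (1.0 + np.exp(-theta))
        b = rng.random() < p                      # sample the discrete latent
        reward = f(float(b))
        # d/dtheta log Bernoulli(b; p) = b - p   (score function)
        grad = (reward - baseline) * (float(b) - p)
        theta += lr * grad                        # gradient ascent on the expectation
        baseline = 0.9 * baseline + 0.1 * reward  # control variate: running mean of f

    # p should drift toward 1, since f(1) > f(0).
    print("learned p:", 1.0 / (1.0 + np.exp(-theta)))
    ```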

    CNN-MERP: An FPGA-Based Memory-Efficient Reconfigurable Processor for Forward and Backward Propagation of Convolutional Neural Networks

    Xushen Han, Dajiang Zhou, Shihao Wang, Shinji Kimura
    Journal-ref: ICCD 2016
    Subjects: Learning (cs.LG); Hardware Architecture (cs.AR)

    Large-scale deep convolutional neural networks (CNNs) are widely used in
    machine learning applications. While CNNs involve huge complexity, VLSI (ASIC
    and FPGA) chips that deliver high-density integration of computational
    resources are regarded as a promising platform for CNN implementation. With
    massive parallelism of computational units, however, the external memory
    bandwidth, which is constrained by the pin count of the VLSI chip, becomes the
    system bottleneck. Moreover, VLSI solutions are usually regarded as lacking
    the flexibility to be reconfigured for the various parameters of CNNs. This
    paper presents CNN-MERP to address these issues. CNN-MERP incorporates an
    efficient memory hierarchy that significantly reduces the bandwidth
    requirements from multiple optimizations including on/off-chip data allocation,
    data flow optimization and data reuse. The proposed 2-level reconfigurability
    is utilized to enable fast and efficient reconfiguration, which is based on the
    control logic and the multiboot feature of FPGA. As a result, an external
    memory bandwidth requirement of 1.94MB/GFlop is achieved, which is 55% lower
    than prior arts. Under limited DRAM bandwidth, a system throughput of
    1244GFlop/s is achieved on the Virtex UltraScale platform, which is 5.48 times
    higher than the state-of-the-art FPGA implementations.

    Multitask Learning and Benchmarking with Clinical Time Series Data

    Hrayr Harutyunyan, Hrant Khachatrian, David C. Kale, Aram Galstyan
    Comments: Proposes clinical prediction benchmark tasks; details can be found at this https URL Manuscript revisions will be uploaded regularly to reflect updates to the benchmark data set, new results, and new findings. An earlier version of this manuscript is currently under review for SIGKDD 2017
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Health care is one of the most exciting frontiers in data mining and machine
    learning. Successful adoption of electronic health records (EHRs) created an
    explosion in digital clinical data available for analysis, but progress in
    machine learning for healthcare research has been difficult to measure because
    of the absence of publicly available benchmark data sets. To address this
    problem, we propose four clinical prediction benchmarks using data derived from
    the publicly available Medical Information Mart for Intensive Care (MIMIC-III)
    database. These tasks cover a range of clinical problems including modeling
    risk of mortality, forecasting length of stay, detecting physiologic decline,
    and phenotype classification. We formulate a heterogeneous multitask problem
    where the goal is to jointly learn multiple clinically relevant prediction
    tasks based on the same time series data. To address this problem, we propose a
    novel recurrent neural network (RNN) architecture that leverages the
    correlations between the various tasks to learn a better predictive model. We
    validate the proposed neural architecture on this benchmark, and demonstrate
    that it outperforms strong baselines, including single task RNNs.
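
    The benchmark's exact architecture is not reproduced here; the sketch below
    shows the generic shared-encoder pattern the abstract describes, assuming an
    LSTM encoder with separate heads for mortality, length of stay, and phenotype
    prediction, trained with a summed loss. All dimensions and targets are
    invented for illustration.

    ```python
    import torch
    import torch.nn as nn

    # Minimal sketch of a shared-encoder multitask RNN (dimensions and heads are
    # assumptions; the paper's architecture is not reproduced).
    class MultitaskRNN(nn.Module):
        def __init__(self, n_features=17, hidden=64, n_phenotypes=25):
            super().__init__()
            self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
            self.mortality_head = nn.Linear(hidden, 1)              # binary mortality
            self.los_head = nn.Linear(hidden, 1)                    # length of stay (regression)
            self.phenotype_head = nn.Linear(hidden, n_phenotypes)   # multi-label phenotypes

        def forward(self, x):
            _, (h, _) = self.encoder(x)              # h: (1, batch, hidden)
            h = h.squeeze(0)
            return self.mortality_head(h), self.los_head(h), self.phenotype_head(h)

    model = MultitaskRNN()
    x = torch.randn(8, 48, 17)                       # batch of 48-hour windows
    mort, los, pheno = model(x)

    # Joint loss: all tasks share the encoder, so every head shapes its gradients.
    loss = (nn.BCEWithLogitsLoss()(mort.squeeze(1), torch.randint(0, 2, (8,)).float())
            + nn.MSELoss()(los.squeeze(1), torch.rand(8) * 10)
            + nn.BCEWithLogitsLoss()(pheno, torch.randint(0, 2, (8, 25)).float()))
    loss.backward()
    print(float(loss))
    ```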

    S-Concave Distributions: Towards Broader Distributions for Noise-Tolerant and Sample-Efficient Learning Algorithms

    Maria-Florina Balcan, Hongyang Zhang
    Comments: 39 pages
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We provide new results concerning noise-tolerant and sample-efficient
    learning algorithms under (s)-concave distributions over (\mathbb{R}^n) for
    (-\frac{1}{2n+3} \le s \le 0). The new class of (s)-concave distributions is a
    broad and natural generalization of log-concavity, and includes many important
    additional distributions, e.g., the Pareto distribution and (t)-distribution.
    This class has been studied in the context of efficient sampling, integration,
    and optimization, but much remains unknown concerning the geometry of this
    class of distributions and their applications in the context of learning.

    The challenge is that unlike the commonly used distributions in learning
    (uniform or more generally log-concave distributions), this broader class is
    not closed under the marginalization operator and many such distributions are
    fat-tailed. In this work, we introduce new convex geometry tools to study the
    properties of s-concave distributions and use these properties to provide
    bounds on quantities of interest to learning including the probability of
    disagreement between two halfspaces, disagreement outside a band, and
    disagreement coefficient. We use these results to significantly generalize
    prior results for margin-based active learning, disagreement-based active
    learning, and passive learning of intersections of halfspaces.

    Our analysis of geometric properties of s-concave distributions might be of
    independent interest to optimization more broadly.

    Predicting Deeper into the Future of Semantic Segmentation

    Natalia Neverova, Pauline Luc, Camille Couprie, Jakob Verbeek, Yann LeCun
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    The ability to predict and therefore to anticipate the future is an important
    attribute of intelligence. It is also of utmost importance in real-time
    systems, e.g. in robotics or autonomous driving, which depend on visual scene
    understanding for decision making. While prediction of the raw RGB pixel values
    in future video frames has been studied in previous work, here we focus on
    predicting semantic segmentations of future frames. More precisely, given a
    sequence of semantically segmented video frames, our goal is to predict
    segmentation maps of not yet observed video frames that lie up to a second or
    further in the future. We develop an autoregressive convolutional neural
    network that learns to iteratively generate multiple frames. Our results on the
    Cityscapes dataset show that directly predicting future segmentations is
    substantially better than predicting and then segmenting future RGB frames. Our
    models predict trajectories of cars and pedestrians much more accurately (25%)
    than baselines that copy the most recent semantic segmentation or warp it using
    optical flow. Prediction results up to half a second into the future are
    visually convincing, with the mean IoU of the predicted segmentations reaching
    two thirds of that of the real future segmentations.
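
    The paper's autoregressive model is not reproduced here; as a minimal sketch
    of the iterative generation loop it describes, a small convolutional network
    (an assumption, not the authors' architecture) maps the last few segmentation
    maps to the next one and feeds its own predictions back to reach further into
    the future.

    ```python
    import torch
    import torch.nn as nn

    # Minimal autoregressive prediction loop: the net maps the last k segmentation
    # maps (as per-class probability maps) to the next one, and its own outputs are
    # fed back to predict further ahead. Architecture and sizes are assumptions.
    k, n_classes, H, W = 4, 19, 64, 128
    net = nn.Sequential(
        nn.Conv2d(k * n_classes, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, n_classes, 3, padding=1),
    )

    context = torch.rand(1, k, n_classes, H, W).softmax(dim=2)   # last k observed frames
    predictions = []
    with torch.no_grad():
        for _ in range(5):                                        # predict 5 future frames
            x = context.reshape(1, k * n_classes, H, W)
            next_frame = net(x).softmax(dim=1)                    # (1, n_classes, H, W)
            predictions.append(next_frame)
            # Slide the window: drop the oldest frame, append the prediction.
            context = torch.cat([context[:, 1:], next_frame.unsqueeze(1)], dim=1)

    print(torch.stack(predictions).shape)   # (5, 1, n_classes, H, W)
    ```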

    Deep Exploration via Randomized Value Functions

    Ian Osband, Daniel Russo, Zheng Wen, Benjamin Van Roy
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We study the use of randomized value functions to guide deep exploration in
    reinforcement learning. This offers an elegant means for synthesizing
    statistically and computationally efficient exploration with common practical
    approaches to value function learning. We present several reinforcement
    learning algorithms that leverage randomized value functions and demonstrate
    their efficacy through computational studies. We also prove a regret bound that
    establishes statistical efficiency with a tabular representation.
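
    The paper's randomized least-squares value iteration is not shown here; the
    toy sketch below only illustrates the underlying idea of acting greedily with
    respect to a value table sampled around the running estimates, with noise that
    shrinks with visit counts, on a simple chain MDP. The environment, noise
    scaling, and learning rate are assumptions.

    ```python
    import numpy as np

    # Toy illustration of exploration via randomized value functions: each episode
    # we act greedily w.r.t. a Q-table sampled around the running estimates, with
    # noise that shrinks as state-action counts grow. Not the paper's RLSVI.
    rng = np.random.default_rng(0)
    S, A, H, episodes, gamma, alpha = 8, 2, 15, 300, 0.9, 0.1

    # Simple chain MDP (assumed for the demo): action 1 moves right, action 0 resets.
    def step(s, a):
        s_next = min(s + 1, S - 1) if a == 1 else 0
        reward = 1.0 if s_next == S - 1 else 0.0
        return s_next, reward

    Q = np.zeros((S, A))
    counts = np.ones((S, A))

    for ep in range(episodes):
        # Sample a randomized value function: estimates plus count-dependent noise.
        Q_sample = Q + rng.standard_normal((S, A)) / np.sqrt(counts)
        s = 0
        for h in range(H):
            a = int(np.argmax(Q_sample[s]))
            s_next, r = step(s, a)
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            counts[s, a] += 1
            s = s_next

    print("greedy policy (1 = move right):", Q.argmax(axis=1))
    ```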

    Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries

    Yu-Hsuan Wang, Cheng-Tao Chung, Hung-yi Lee
    Comments: 5 pages
    Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)

    In this paper we analyze the gate activation signals inside the gated
    recurrent neural networks, and find the temporal structure of such signals is
    highly correlated with the phoneme boundaries. This correlation is further
    verified by a set of experiments for phoneme segmentation, in which better
    results compared to standard approaches were obtained.
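
    A minimal sketch of how such gate signals can be read out, assuming a PyTorch
    GRUCell and synthetic frame features: the update-gate activations are
    recomputed from the cell's weights at each frame and averaged, giving a
    per-frame signal whose temporal structure can then be compared against
    phoneme boundaries.

    ```python
    import torch
    import torch.nn as nn

    # Sketch: run a GRUCell over a feature sequence and record the update-gate
    # activations at every step, the kind of internal signal whose temporal
    # structure the paper relates to phoneme boundaries. Dimensions are assumptions.
    torch.manual_seed(0)
    input_dim, hidden_dim, T = 13, 32, 100          # e.g. 13 MFCCs per frame
    cell = nn.GRUCell(input_dim, hidden_dim)
    frames = torch.randn(T, input_dim)              # one utterance, T frames

    h = torch.zeros(hidden_dim)
    update_gates = []
    with torch.no_grad():
        for x in frames:
            # PyTorch packs gates as [reset, update, new]; slice out the update gate.
            gi = cell.weight_ih @ x + cell.bias_ih
            gh = cell.weight_hh @ h + cell.bias_hh
            z = torch.sigmoid(gi[hidden_dim:2 * hidden_dim] + gh[hidden_dim:2 * hidden_dim])
            update_gates.append(z.mean().item())    # average gate activation per frame
            h = cell(x.unsqueeze(0), h.unsqueeze(0)).squeeze(0)

    # Peaks and valleys of this per-frame signal are the candidate segment boundaries.
    print(update_gates[:10])
    ```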

    LogitBoost autoregressive networks

    Marc Goessling
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Multivariate binary distributions can be decomposed into products of
    univariate conditional distributions. Recently popular approaches have modeled
    these conditionals through neural networks with sophisticated weight-sharing
    structures. It is shown that state-of-the-art performance on several standard
    benchmark datasets can actually be achieved by training separate probability
    estimators for each dimension. In that case, model training can be trivially
    parallelized over data dimensions. On the other hand, complexity control has to
    be performed for each learned conditional distribution. Three possible methods
    are considered and experimentally compared. The estimator that is employed for
    each conditional is LogitBoost. Similarities and differences between the
    proposed approach and autoregressive models based on neural networks are
    discussed in detail.
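
    scikit-learn does not ship LogitBoost, so the sketch below uses gradient-
    boosted trees with log-loss as a stand-in; it only illustrates the
    factorization described above, training one separate boosted conditional per
    dimension on synthetic binary data, which is what makes per-dimension training
    trivially parallelizable.

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier

    # Sketch of the autoregressive decomposition p(x) = prod_d p(x_d | x_{<d}),
    # with one separate boosted estimator per dimension. GradientBoostingClassifier
    # (log-loss) stands in for LogitBoost, which scikit-learn does not provide.
    rng = np.random.default_rng(0)
    n, D = 2000, 8
    X = (rng.random((n, D)) < 0.5).astype(int)
    X[:, 1] = X[:, 0] ^ (rng.random(n) < 0.1)        # inject a dependency to learn

    models = []
    for d in range(1, D):                            # dimension 0 has no parents; use its base rate
        clf = GradientBoostingClassifier(n_estimators=50, max_depth=2)
        clf.fit(X[:, :d], X[:, d])
        models.append(clf)

    def log_likelihood(x):
        """Log p(x) under the fitted autoregressive factorization."""
        p0 = X[:, 0].mean()
        ll = np.log(p0 if x[0] == 1 else 1 - p0)
        for d, clf in enumerate(models, start=1):
            p = clf.predict_proba(x[:d].reshape(1, -1))[0, 1]
            ll += np.log(p if x[d] == 1 else 1 - p)
        return ll

    print(log_likelihood(X[0]))
    ```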

    Episode-Based Active Learning with Bayesian Neural Networks

    Feras Dayoub, Niko Sünderhauf, Peter Corke
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    We investigate different strategies for active learning with Bayesian deep
    neural networks. We focus our analysis on scenarios where new, unlabeled data
    is obtained episodically, as is commonly encountered in mobile robotics
    applications. An evaluation of different strategies for acquisition, updating,
    and final training on the CIFAR-10 dataset shows that incremental network
    updates with final training on the accumulated acquisition set are essential
    for best performance, while limiting the amount of required human labeling
    labor.
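
    The paper evaluates several acquisition and update strategies; the sketch
    below shows one common ingredient only, assuming Monte Carlo dropout as the
    approximate Bayesian posterior and predictive entropy as the acquisition score
    for each episode's unlabeled pool. The network size and pool data are
    invented.

    ```python
    import torch
    import torch.nn as nn

    # Minimal sketch: MC-dropout uncertainty as an acquisition score for episodic
    # active learning. This is an illustrative choice, not necessarily the
    # configuration the paper finds best.
    torch.manual_seed(0)

    model = nn.Sequential(
        nn.Linear(32, 64), nn.ReLU(), nn.Dropout(0.5), nn.Linear(64, 10)
    )

    def acquire(pool, k=16, n_mc=20):
        """Pick the k pool samples with highest predictive entropy under MC dropout."""
        model.train()                      # keep dropout active at "test" time
        with torch.no_grad():
            probs = torch.stack([torch.softmax(model(pool), dim=1) for _ in range(n_mc)])
        mean_p = probs.mean(dim=0)
        entropy = -(mean_p * mean_p.clamp_min(1e-12).log()).sum(dim=1)
        return entropy.topk(k).indices

    # One "episode": a new unlabeled pool arrives; select samples to send for
    # labeling, then (not shown) add them to the accumulated acquisition set and retrain.
    pool = torch.randn(500, 32)
    print(acquire(pool))
    ```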

    Deep Learning for Explicitly Modeling Optimization Landscapes

    Shumeet Baluja
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG)

    In all but the most trivial optimization problems, the structure of the
    solutions exhibits complex interdependencies between the input parameters.
    Decades of research with stochastic search techniques have shown the benefit of
    explicitly modeling the interactions between sets of parameters and the overall
    quality of the solutions discovered. We demonstrate a novel method, based on
    learning deep networks, to model the global landscapes of optimization
    problems. To represent the search space concisely and accurately, the deep
    networks must encode information about the underlying parameter interactions
    and their contributions to the quality of the solution. Once the networks are
    trained, the networks are probed to reveal parameter combinations with high
    expected performance with respect to the optimization task. These estimates are
    used to initialize fast, randomized, local search algorithms, which in turn
    expose more information about the search space that is subsequently used to
    refine the models. We demonstrate the technique on multiple optimization
    problems that have arisen in a variety of real-world domains, including:
    packing, graphics, job scheduling, layout and compression. The problems include
    combinatoric search spaces, discontinuous and highly non-linear spaces, and
    span binary, higher-cardinality discrete, as well as continuous parameters.
    Strengths, limitations, and extensions of the approach are extensively
    discussed and demonstrated.


    Information Theory

    Minimax Robust Decentralized Detection in Parallel Sensor Networks

    Gökhan Gül
    Subjects: Information Theory (cs.IT)

    Minimax robust decentralized detection is studied for parallel sensor
    networks. Random variables corresponding to sensor observations are assumed to
    follow a distribution function, which belongs to an uncertainty class. It has
    been proven that, for some uncertainty classes, if all probability
    distributions are absolutely continuous with respect to a common measure, the
    joint stochastic boundedness property, which is the fundamental rule for the
    derivations in Veeravalli's work, does not hold. This raises the natural
    question of whether minimax robust decentralized detection is possible if the
    uncertainty classes do not possess this property. The answer to this question
    has been shown to be positive, which leads to a generalization of the work of
    Veeravalli.
    Moreover, due to a direct consequence of Tsitsiklis’s work, quantization
    functions at the sensors are not required to be monotone. For the proposed
    model, some specific examples have been provided and possible generalizations
    have been discussed.

    Energy Efficiency in Cache Enabled Small Cell Networks With Adaptive User Clustering

    Salah Eddine Hajri, Mohamad Assaad
    Subjects: Information Theory (cs.IT)

    The exponential increase in mobile data traffic forces network operators to
    deal with a capacity shortage. One of the most promising technologies for 5G
    networks is proactive caching. Using a network of cache-enabled small cells,
    traffic during peak hours can be reduced by proactively caching the content
    that is most likely to be requested. Studying user behavior and caching files
    at base stations accordingly can offload backhaul traffic and improve network
    throughput. We explore a new caching framework in which the users are
    clustered according to their content popularity. The caching is then done
    based on the mean content popularity in each cluster. In
    order to achieve an efficient clustering of the users, we use a statistical
    model selection criterion, namely the Akaike information criterion. We derive a
    closed form expression of the hit probability, which will be optimized with
    respect to the fractions of small base stations associated to each cluster. We
    then investigate the Energy Efficiency of the proposed caching framework. We
    derive a closed form expression of the energy efficiency, which will be
    optimized by defining the optimal density of active small base stations. We
    then provide a small base station allocation algorithm in order to associate
    each individual base station with a given cluster. This algorithm aims at
    caching in each small base station the files that are most likely to be
    requested within their direct neighborhood. Coupled with channel inversion
    power control, this optimization will improve the energy efficiency and cache
    hit probability of the network. Numerical results show that the clustering
    scheme considerably improves the cache hit probability. We also show that
    optimizing the allocation of the small base stations improves the energy
    efficiency and hit probability of the network.
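
    The closed-form expressions above are not reproduced here; as a small sketch
    of the AIC-driven clustering step, the snippet fits Gaussian mixtures with
    different numbers of components to synthetic per-user popularity profiles and
    keeps the model with the lowest AIC. The mixture model and the data are
    illustrative stand-ins, not the paper's exact procedure.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Sketch: pick the number of user clusters by minimizing AIC over Gaussian
    # mixtures fit to per-user content-popularity profiles (the GMM is an
    # assumption of this sketch).
    rng = np.random.default_rng(0)
    n_users, n_contents = 300, 20

    # Synthetic popularity profiles drawn from 3 latent user groups.
    centers = rng.dirichlet(np.ones(n_contents) * 0.3, size=3)
    groups = rng.integers(0, 3, n_users)
    profiles = np.clip(centers[groups] + 0.02 * rng.standard_normal((n_users, n_contents)), 0, None)

    best_k, best_aic, best_model = None, np.inf, None
    for k in range(1, 8):
        gmm = GaussianMixture(n_components=k, covariance_type="diag", random_state=0)
        gmm.fit(profiles)
        aic = gmm.aic(profiles)
        if aic < best_aic:
            best_k, best_aic, best_model = k, aic, gmm

    clusters = best_model.predict(profiles)
    print("AIC-selected number of clusters:", best_k)
    print("cluster sizes:", np.bincount(clusters))
    ```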

    A New Approach for Throughput Enhancement of MIMO Interference Networks under Imperfect Channel State Information

    Ali Dalir, Hassan Aghaeinia
    Comments: arXiv admin note: text overlap with arXiv:1703.05944
    Subjects: Information Theory (cs.IT)

    This paper proposes a new algorithm to improve the throughput of the MIMO
    interference channel, under imperfect channel state information (CSI). Each
    transmitter and receiver has respectively M and N antennas and network operates
    in a time division duplex mode. With the knowledge of channel estimation error
    variance, mean of signal-to-interference-plus-noise ratio (SINR) is
    approximated. Each transceiver adjusts its filter to maximize the expected
    value of SINR. Formulation of maximization problem is convex. The proposed New
    Approach for Throughput Enhancement under imperfect CSI utilizes the
    reciprocity of wireless networks to maximize the estimated mean. The sum rate
    performance of the proposed algorithm is verified using Monte Carlo
    simulations.

    Hardware Trojan Detection Game: A Prospect-Theoretic Approach

    Walid Saad, Anibal Sanjab, Yunpeng Wang, Charles Kamhoua, Kevin Kwiat
    Comments: IEEE Transactions on Vehicular Technology
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Computer Science and Game Theory (cs.GT)

    Outsourcing integrated circuit (IC) manufacturing to offshore foundries has
    grown exponentially in recent years. Given the critical role of ICs in the
    control and operation of vehicular systems and other modern engineering
    designs, such offshore outsourcing has led to serious security threats due to
    the potential of insertion of hardware trojans – malicious designs that, when
    activated, can lead to highly detrimental consequences. In this paper, a novel
    game-theoretic framework is proposed to analyze the interactions between a
    hardware manufacturer, acting as attacker, and an IC testing facility, acting
    as defender. The problem is formulated as a noncooperative game in which the
    attacker must decide on the type of trojan that it inserts while taking into
    account the detection penalty as well as the damage caused by the trojan.
    Meanwhile, the resource-constrained defender must decide on the best testing
    strategy that allows optimizing its overall utility which accounts for both
    damages and the fines. The proposed game is based on the robust behavioral
    framework of prospect theory (PT) which allows capturing the potential
    uncertainty, risk, and irrational behavior in the decision making of both the
    attacker and defender. For both the standard rational expected utility (EUT)
    case and the PT case, a novel algorithm based on fictitious play is proposed
    and shown to converge to a mixed-strategy Nash equilibrium. For an illustrative
    case study, thorough analytical results are derived for both EUT and PT to
    study the properties of the reached equilibrium as well as the impact of key
    system parameters such as the defender-set fine. Simulation results assess the
    performance of the proposed framework under both EUT and PT and show that the
    use of PT will provide invaluable insights on the outcomes of the proposed
    hardware trojan game, in particular, and system security, in general.
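
    The paper's game (trojan types versus test strategies, with prospect-theoretic
    weighting) is richer than what is shown here; as a minimal illustration of the
    fictitious-play dynamic itself, the sketch below runs it on matching pennies,
    where the empirical action frequencies converge to the mixed-strategy
    equilibrium.

    ```python
    import numpy as np

    # Fictitious play on a 2x2 zero-sum game (matching pennies). Each player
    # best-responds to the empirical frequency of the opponent's past actions; the
    # empirical frequencies converge toward the mixed-strategy equilibrium (0.5, 0.5).
    A = np.array([[1.0, -1.0],
                  [-1.0, 1.0]])          # row player's payoffs; column player gets -A

    row_counts = np.ones(2)              # empirical action counts (smoothed)
    col_counts = np.ones(2)

    for t in range(20000):
        row_belief = col_counts / col_counts.sum()     # row's belief about column play
        col_belief = row_counts / row_counts.sum()
        row_action = int(np.argmax(A @ row_belief))    # best response to the belief
        col_action = int(np.argmax(-(A.T) @ col_belief))
        row_counts[row_action] += 1
        col_counts[col_action] += 1

    print("row empirical mix:", row_counts / row_counts.sum())
    print("col empirical mix:", col_counts / col_counts.sum())
    ```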

    Frequency Offset Estimation and Training Sequence Design for MIMO OFDM

    Yanxiang Jiang, Hlaing Minn, Xiqi Gao, Xiaohu You
    Comments: 12 pages, 5 figures, IEEE TWC, 2008
    Journal-ref: IEEE TWC, Apr. 2008
    Subjects: Information Theory (cs.IT)

    This paper addresses carrier frequency offset (CFO) estimation and training
    sequence design for multiple-input multiple-output (MIMO) orthogonal frequency
    division multiplexing (OFDM) systems over frequency selective fading channels.
    By exploiting the orthogonality of the training sequences in the frequency
    domain, integer CFO (ICFO) is estimated. With the uniformly spaced non-zero
    pilots in the training sequences and the corresponding geometric mapping,
    fractional CFO (FCFO) is estimated through the roots of a real polynomial.
    Furthermore, the condition for the training sequences to guarantee estimation
    identifiability is developed. Through the analysis of the correlation property
    of the training sequences, two types of sub-optimal training sequences
    generated from the Chu sequence are constructed. Simulation results verify the
    good performance of the CFO estimator assisted by the proposed training
    sequences.

    Energy Efficient Precoding Design for SWIPT in MIMO Two-Way Relay Networks

    Javane Rostampoor, S. Mohammad Razavizadeh, Inkyu Lee
    Comments: 16 pages, 6 figures, to appear in IEEE Transactions on Vehicular Technology
    Subjects: Information Theory (cs.IT)

    In this paper, we study the energy efficiency (EE) maximization problem in
    multiple-input multiple-output (MIMO) two-way relay networks with simultaneous
    wireless information and power transfer (SWIPT). The network consists of a
    multiple-antenna amplify-and-forward relay node which provides bidirectional
    communications between two multiple-antenna transceiver nodes.

    Comment on the Equality Condition for the I-MMSE Proof of Entropy Power Inequality

    Alex Dytso, Ronit Bustin, H. Vincent Poor, Shlomo Shamai (Shitz)
    Subjects: Information Theory (cs.IT)

    The paper establishes the equality condition in the I-MMSE proof of the
    entropy power inequality (EPI). This is done by establishing an exact
    expression for the deficit between the two sides of the EPI. Interestingly, a
    necessary condition for the equality is established by making a connection to
    the famous Cauchy functional equation.

    Cognitive Hierarchy Theory for Distributed Resource Allocation in the Internet of Things

    Nof Abuzainab, Walid Saad, Choong-Seon Hong, H. Vincent Poor
    Subjects: Information Theory (cs.IT)

    In this paper, the problem of distributed resource allocation is studied for
    an Internet of Things (IoT) system composed of a heterogeneous group of nodes
    comprising both machine-type devices (MTDs) and human-type devices (HTDs).
    The problem is formulated as a noncooperative game between the heterogeneous
    IoT devices that seek to find the optimal time allocation so as to meet their
    quality-of-service (QoS) requirements in terms of energy, rate and latency.
    Since the strategy space of each device is dependent on the actions of the
    other devices, the generalized Nash equilibrium (GNE) solution is first
    characterized, and the conditions for uniqueness of the GNE are derived. Then,
    to explicitly capture the heterogeneity of the devices, in terms of resource
    constraints and QoS needs, a novel and more realistic game-theoretic approach,
    based on the behavioral framework of cognitive hierarchy (CH) theory, is
    proposed. This approach is then shown to enable the IoT devices to reach a CH
    equilibrium (CHE) concept that takes into account the various levels of
    rationality corresponding to the heterogeneous computational capabilities and
    the information accessible for each one of the MTDs and HTDs. Simulation
    results show that the CHE solution maintains a stable performance. In
    particular, the proposed CHE solution keeps the percentage of devices with
    satisfied QoS constraints above 96% for IoT networks containing up to 4000
    devices while not considerably degrading the overall system performance in
    terms of the total utility. Simulation results also show that the proposed CHE
    solution brings a twofold increase in the total rate of the HTDs and decreases
    the total energy consumed by the MTDs by 78% compared to the equal-time policy.

    Characterization of Deterministic and Probabilistic Sampling Patterns for Finite Completability of Low Tensor-Train Rank Tensor

    Morteza Ashraphijuo, Xiaodong Wang
    Comments: arXiv admin note: text overlap with arXiv:1612.01597
    Subjects: Learning (cs.LG); Information Theory (cs.IT); Algebraic Geometry (math.AG); Machine Learning (stat.ML)

    In this paper, we analyze the fundamental conditions for low-rank tensor
    completion given the separation or tensor-train (TT) rank, i.e., ranks of
    unfoldings. We exploit the algebraic structure of the TT decomposition to
    obtain the deterministic necessary and sufficient conditions on the locations
    of the samples to ensure finite completability. Specifically, we propose an
    algebraic geometric analysis on the TT manifold that can incorporate the whole
    rank vector simultaneously in contrast to the existing approach based on the
    Grassmannian manifold that can only incorporate one rank component. Our
    proposed technique characterizes the algebraic independence of a set of
    polynomials defined based on the sampling pattern and the TT decomposition,
    which is instrumental to obtaining the deterministic condition on the sampling
    pattern for finite completability. In addition, based on the proposed analysis,
    assuming that the entries of the tensor are sampled independently with
    probability (p), we derive a lower bound on the sampling probability (p), or
    equivalently, the number of sampled entries that ensures finite completability
    with high probability. Moreover, we also provide the deterministic and
    probabilistic conditions for unique completability.

    A Las Vegas algorithm to solve the elliptic curve discrete logarithm problem

    Ayan Mahalanobis, Vivek Mallick
    Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT); Algebraic Geometry (math.AG)

    In this short paper, we develop a probabilistic algorithm for the elliptic
    curve discrete logarithm problem. This algorithm is not generic in nature; it
    uses some properties of the elliptic curve.

    Energy Efficient Power Control for the Two-tier Networks with Small Cells and Massive MIMO

    Ningning Lu, Yanxiang Jiang, Fuchun Zheng, Xiaohu You
    Comments: 6 pages, 4 figures, IEEE Wireless Communications and Networking Conference Workshops (WCNCW’16)
    Journal-ref: IEEE Wireless Communications and Networking Conference Workshops
    (WCNCW’16), April 2016
    Subjects: Networking and Internet Architecture (cs.NI); Computer Science and Game Theory (cs.GT); Information Theory (cs.IT)

    In this paper, energy efficient power control for the uplink two-tier
    networks where a macrocell tier with a massive multiple-input multiple-output
    (MIMO) base station is overlaid with a small cell tier is investigated. We
    propose a distributed energy efficient power control algorithm which allows
    each user in the two-tier network to take individual decisions to optimize its
    own energy efficiency (EE) in the multi-user and multi-cell scenario. The
    distributed power control algorithm is implemented by decoupling the EE
    optimization problem into two steps. In the first step, we propose to assign
    the users on the same resource to the same group, with each group optimizing
    its own EE. In the second step, multiple power control games based on
    evolutionary game theory (EGT) are formulated for each group, which allows
    each user to optimize its own EE. In the EGT-based power control games,
    each player selects a strategy giving a higher payoff than the average payoff,
    which can improve the fairness among the users. The proposed algorithm has a
    linear complexity with respect to the number of subcarriers and the number of
    cells in comparison with the brute force approach which has an exponential
    complexity. Simulation results show the remarkable improvements in terms of
    fairness by using the proposed algorithm.

    Energy Efficient Non-Cooperative Power Control in Small Cell Networks

    Yanxiang Jiang, Ningning Lu, Yan Chen, Mehdi Bennis, Fuchun Zheng, Xiqi Gao, Xiaohu You
    Comments: 8 pages, 10 figures. This paper has been accepted by IEEE Transactions on Vehicular Technology
    Journal-ref: IEEE Transactions on Vehicular Technology, 2017
    Subjects: Computer Science and Game Theory (cs.GT); Information Theory (cs.IT)

    In this paper, energy efficient power control for small cells underlaying a
    macro cellular network is investigated. We formulate the power control problem
    in self-organizing small cell networks as a non-cooperative game, and propose a
    distributed energy efficient power control scheme, which allows the small base
    stations (SBSs) to take individual decisions for attaining the Nash equilibrium
    (NE) with minimum information exchange. Specifically, in the non-cooperative
    power control game, a non-convex optimization problem is formulated for each
    SBS to maximize its energy efficiency (EE). By exploiting the properties of
    parameter-free fractional programming and the concept of perspective function,
    the non-convex optimization problem for each SBS is transformed into an
    equivalent constrained convex optimization problem. Then, the constrained
    convex optimization problem is converted into an unconstrained convex
    optimization problem by exploiting the mixed penalty function method. The
    inequality constraints are eliminated by introducing the logarithmic barrier
    functions and the equality constraint is eliminated by introducing the
    quadratic penalty function. We also theoretically show the existence and the
    uniqueness of the NE in the non-cooperative power control game. Simulation
    results show remarkable improvements in terms of EE by using the proposed
    scheme.
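
    The paper's parameter-free fractional programming and penalty constructions
    are not reproduced here; as a generic illustration of how a ratio objective
    such as energy efficiency is handled, the sketch below runs the classical
    Dinkelbach iteration for a single link's rate-over-power maximization. The
    channel gain, circuit power, and power budget are assumed values.

    ```python
    import numpy as np
    from scipy.optimize import minimize_scalar

    # Generic Dinkelbach iteration for a single-link energy-efficiency problem
    #   maximize  R(p) / (P_c + p),  R(p) = log2(1 + g * p / N0),  0 <= p <= P_max.
    # It turns the fractional objective into a sequence of concave subproblems;
    # this is an illustration, not the paper's parameter-free scheme.
    g, N0, P_c, P_max = 2.0, 1.0, 0.5, 4.0
    rate = lambda p: np.log2(1.0 + g * p / N0)

    lam = 0.0
    for _ in range(30):
        # Inner problem: maximize R(p) - lam * (P_c + p) over the power budget.
        res = minimize_scalar(lambda p: -(rate(p) - lam * (P_c + p)),
                              bounds=(0.0, P_max), method="bounded")
        p_star = res.x
        lam_new = rate(p_star) / (P_c + p_star)      # update the EE estimate
        if abs(lam_new - lam) < 1e-9:
            break
        lam = lam_new

    print(f"optimal power {p_star:.3f}, energy efficiency {lam:.3f} bits/Joule (per Hz)")
    ```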

    Demixing Sines and Spikes: Robust Spectral Super-resolution in the Presence of Outliers

    Carlos Fernandez-Granda, Gongguo Tang, Xiaodong Wang, Le Zheng
    Subjects: Optimization and Control (math.OC); Information Theory (cs.IT)

    We consider the problem of super-resolving the line spectrum of a
    multisinusoidal signal from a finite number of samples, some of which may be
    completely corrupted. Measurements of this form can be modeled as an additive
    mixture of a sinusoidal and a sparse component. We propose to demix the two
    components and super-resolve the spectrum of the multisinusoidal signal by
    solving a convex program. Our main theoretical result is that, up to
    logarithmic factors, this approach is guaranteed to be successful with high
    probability for a number of spectral lines that is linear in the number of
    measurements, even if a constant fraction of the data are outliers. The result
    holds under the assumption that the phases of the sinusoidal and sparse
    components are random and the line spectrum satisfies a minimum-separation
    condition. We show that the method can be implemented via semidefinite
    programming and explain how to adapt it in the presence of dense perturbations,
    as well as exploring its connection to atomic-norm denoising. In addition, we
    propose a fast greedy demixing method which provides good empirical results
    when coupled with a local nonconvex-optimization step.



