
    arXiv Paper Daily: Wed, 12 Oct 2016

    Published by 我爱机器学习 (52ml.net) on 2016-10-12 00:00:00

    Neural and Evolutionary Computing

    Long Short-Term Memory based Convolutional Recurrent Neural Networks for Large Vocabulary Speech Recognition

    Xiangang Li, Xihong Wu
    Comments: Published in INTERSPEECH 2015, September 6-10, 2015, Dresden, Germany
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)

    Long short-term memory (LSTM) recurrent neural networks (RNNs) have been
    shown to give state-of-the-art performance on many speech recognition tasks,
    as they can learn a dynamically changing contextual window over the entire
    sequence history. Convolutional neural networks (CNNs), on the other hand,
    have brought significant improvements to deep feed-forward neural networks
    (FFNNs), as they are able to better reduce spectral variation in the input
    signal. In this paper, a network architecture called the convolutional
    recurrent neural network (CRNN) is proposed by combining the CNN and the
    LSTM RNN. In the proposed CRNNs, each speech frame, without adjacent context
    frames, is organized as a number of local feature patches along the
    frequency axis, and an LSTM network is then applied to each feature patch
    along the time axis. We train and compare FFNNs, LSTM RNNs and the proposed
    LSTM CRNNs in various configurations. Experimental results show that the
    LSTM CRNNs can exceed state-of-the-art speech recognition performance.
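
    To make the frame-to-patch organization concrete, here is a minimal numpy
    sketch (not the authors' code; patch size, stride and feature dimensions
    are illustrative assumptions). Each frame is split into overlapping local
    patches along the frequency axis; an LSTM would then be run along the time
    axis for each patch position.

        import numpy as np

        def frequency_patches(frame, patch_size=8, stride=4):
            """Split one speech frame (a vector of filterbank energies)
            into overlapping local patches along the frequency axis."""
            starts = range(0, len(frame) - patch_size + 1, stride)
            return np.stack([frame[s:s + patch_size] for s in starts])

        # Hypothetical utterance: 100 frames of 40 log-mel features.
        utterance = np.random.randn(100, 40)
        patches = np.stack([frequency_patches(f) for f in utterance])
        print(patches.shape)   # (100 frames, 9 patches, 8 bins); an LSTM is
                               # then run over time for each patch position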


    Computer Vision and Pattern Recognition

    Deep Learning Assessment of Tumor Proliferation in Breast Cancer Histological Images

    Manan Shah, Christopher Rubadue, David Suster, Dayong Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Current analysis of tumor proliferation, the most salient prognostic
    biomarker for invasive breast cancer, is limited to subjective mitosis counting
    by pathologists in localized regions of tissue images. This study presents the
    first data-driven integrative approach to characterize the severity of tumor
    growth and spread on a categorical and molecular level, utilizing multiple
    biologically salient deep learning classifiers to develop a comprehensive
    prognostic model. Our approach achieves pathologist-level performance on
    three-class categorical tumor severity prediction. It additionally pioneers
    prediction of molecular expression data from a tissue image, obtaining a
    Spearman’s rank correlation coefficient of 0.60 with mean RNA expression
    calculated ex vivo. Furthermore, our framework is applied to identify over two
    hundred unprecedented biomarkers critical to the accurate assessment of tumor
    proliferation, validating our proposed integrative pipeline as the first to
    holistically and objectively analyze histopathological images.

    Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

    Xianzhi Du, Mostafa El-Khamy, Jungwon Lee, Larry S. Davis
    Comments: 11 pages and 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a deep neural network fusion architecture for fast and robust
    pedestrian detection. The proposed network fusion architecture allows for
    parallel processing of multiple networks for speed. A single shot deep
    convolutional network is trained as an object detector to generate all possible
    pedestrian candidates of different sizes and occlusions. This network outputs a
    large variety of pedestrian candidates to cover the majority of ground-truth
    pedestrians while also introducing a large number of false positives. Next,
    multiple deep neural networks are used in parallel for further refinement of
    these pedestrian candidates. We introduce a soft-rejection based network fusion
    method to fuse the soft metrics from all networks together to generate the
    final confidence scores. Our method performs better than existing
    state-of-the-art methods, especially when detecting small-size and occluded
    pedestrians. Furthermore, we propose a method for integrating a pixel-wise
    semantic segmentation network into the network fusion architecture to
    reinforce the pedestrian detector. The approach outperforms
    state-of-the-art methods on most protocols on Caltech Pedestrian dataset, with
    significant boosts on several protocols. It is also faster than all other
    methods.
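
    The abstract does not give the exact fusion rule, so the sketch below
    shows one plausible soft-rejection scheme (an illustration only; the
    threshold and floor constants are assumptions, not the paper's values):

        import numpy as np

        def soft_rejection_fuse(det_score, net_probs, thresh=0.7, floor=0.1):
            """Fuse a detector's confidence with the soft scores of several
            verification networks: confident agreement keeps the score
            (factor capped at 1), disagreement attenuates it, but no single
            network can veto a candidate outright."""
            score = det_score
            for p in net_probs:
                score *= max(min(p / thresh, 1.0), floor)
            return score

        # A candidate the detector likes but two of three networks doubt:
        print(soft_rejection_fuse(0.9, [0.8, 0.3, 0.2]))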

    Restoring STM images via Sparse Coding: noise and artifact removal

    João P. Oliveira, Ana Bragança, José Bioucas-Dias, Mário Figueiredo, Luís Alcácer, Jorge Morgado, Quirina Ferreira
    Comments: 14 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this article, we present a denoising algorithm to improve the
    interpretation and quality of scanning tunneling microscopy (STM) images. Given
    the high level of self-similarity of STM images, we propose a denoising
    algorithm by reformulating the true estimation problem as a sparse regression,
    often termed sparse coding. We introduce modifications to the algorithm to cope
    with the existence of artifacts, mainly dropouts, which appear in a structured
    way as consecutive line segments along the scanning direction. The
    resulting algorithm treats the artifacts as missing data, and its estimates
    outperform those of algorithms that substitute the outliers via local
    filtering.
    We provide code implementations for both Matlab and Gwyddion.
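
    As a rough illustration of sparse coding with dropouts treated as missing
    data, here is a small ISTA-based sketch (a simplification, not the paper's
    algorithm; the dictionary here is random rather than learned from STM
    patches):

        import numpy as np

        def sparse_code(y, D, mask, lam=0.1, n_iter=200):
            """Restore one patch y by ISTA over dictionary D, using only
            pixels where mask == 1: dropout lines are treated as missing
            data instead of observed outliers."""
            Dm = D * mask[:, None]                 # dictionary at observed pixels
            L = np.linalg.norm(Dm, 2) ** 2 + 1e-8  # Lipschitz constant
            x = np.zeros(D.shape[1])
            for _ in range(n_iter):
                z = x - Dm.T @ (Dm @ x - mask * y) / L
                x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0)
            return D @ x                           # full restored patch

        rng = np.random.default_rng(0)
        D = rng.standard_normal((64, 128))          # 8x8 patches, 128 atoms
        y = D @ (rng.standard_normal(128) * (rng.random(128) < 0.05))
        mask = (rng.random(64) > 0.2).astype(float) # 20% dropout pixels
        restored = sparse_code(y, D, mask)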

    Crossing the Road Without Traffic Lights: An Android-based Safety Device

    Adi Perry, Dor Verbin, Nahum Kiryati
    Comments: Planned submission to “Pattern Recognition Letters”
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In the absence of pedestrian crossing lights, finding a safe moment to cross
    the road is often hazardous and challenging, especially for people with visual
    impairments. We present a reliable low-cost solution, an Android device
    attached to a traffic sign or lighting pole near the crossing, indicating
    whether it is safe to cross the road. The indication can be by sound, display,
    vibration, and various communication modalities provided by the Android device.
    The integral system camera is aimed at approaching traffic. Optical flow is
    computed from the incoming video stream, and projected onto an influx map,
    automatically acquired during a brief training period. The crossing safety is
    determined based on a 1-dimensional temporal signal derived from the
    projection. We implemented the complete system on a Samsung Galaxy K-Zoom
    Android smartphone, and obtained real-time operation. The system achieves
    promising experimental results, providing pedestrians with sufficiently early
    warning of approaching vehicles. The system can serve as a stand-alone
    safety device that can be installed where pedestrian crossing lights are
    ruled out.
    Requiring no dedicated infrastructure, it can be powered by a solar panel and
    remotely maintained via the cellular network.
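
    A minimal sketch of the flow-projection step (an illustration; the influx
    map is assumed precomputed as a unit vector field, and the threshold is
    made up):

        import cv2
        import numpy as np

        def crossing_signal(prev_gray, curr_gray, influx):
            """Project dense optical flow onto a precomputed influx map (a
            unit vector field along the expected direction of approaching
            traffic) and reduce it to one scalar per frame."""
            flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            return float(np.sum(flow * influx))

        prev = np.random.randint(0, 255, (240, 320), np.uint8)
        curr = np.roll(prev, 4, axis=1)        # simulated rightward motion
        influx = np.zeros((240, 320, 2), np.float32)
        influx[..., 0] = 1.0                   # rightward flow counts
        signal = crossing_signal(prev, curr, influx)
        safe = signal < 1e4                    # illustrative threshold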

    Proposal for Automatic License and Number Plate Recognition System for Vehicle Identification

    Hamed Saghaei
    Comments: 5 pages, 3 figures, 2016 1st International Conference on New Research Achievements in Electrical and Computer Engineering
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose an automatic and mechanized license and number
    plate recognition (LNPR) system which can extract the license plate number of
    the vehicles passing through a given location using image processing
    algorithms. No additional devices such as GPS or radio frequency identification
    (RFID) need to be installed for implementing the proposed system. Using special
    cameras, the system takes pictures of each passing vehicle and forwards the
    image to a computer, where it is processed by the LNPR software. The plate
    recognition software uses different algorithms such as localization,
    orientation, normalization, segmentation and finally optical character
    recognition (OCR). The resulting data is then compared with the records in
    a database. Experimental results reveal that the presented system successfully
    detects and recognizes the vehicle number plate on real images. This system can
    also be used for security and traffic control.
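
    A skeletal version of such a pipeline might look as follows (a sketch
    assuming OpenCV and the Tesseract OCR engine; the aspect-ratio filter and
    sizes are illustrative, and the paper's actual algorithms are not
    specified at this level of detail):

        import cv2
        import pytesseract   # requires the Tesseract OCR engine

        def read_plate(image_bgr):
            """Localization -> normalization -> OCR, in the simplest form."""
            gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
            _, binary = cv2.threshold(gray, 0, 255,
                                      cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
            contours = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                        cv2.CHAIN_APPROX_SIMPLE)[-2]
            for c in sorted(contours, key=cv2.contourArea, reverse=True):
                x, y, w, h = cv2.boundingRect(c)
                if 2.0 < w / float(h) < 6.0:   # plate-like aspect ratio
                    plate = cv2.resize(gray[y:y + h, x:x + w], (240, 60))
                    return pytesseract.image_to_string(plate, config="--psm 7")
            return None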

    Multiple Instance Learning Convolutional Neural Networks for Object Recognition

    Miao Sun, Tony X. Han, Ming-Chang Liu, Ahmad Khodayari-Rostamabad
    Comments: International Conference on Pattern Recognition(ICPR) 2016, Oral paper
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Convolutional Neural Networks (CNNs) have demonstrated successful
    applications in computer vision, speech recognition, and natural language
    processing. For object recognition, CNNs might be limited by their strict
    label requirement and the implicit assumption that images are supposed to
    be target-object-dominated for optimal solutions. However, the labeling
    procedure, which requires laying out the locations of target objects, is
    very tedious, making high-quality large-scale datasets prohibitively
    expensive. Data augmentation schemes are widely used when deep networks
    suffer from insufficient training data. All the images produced through
    data augmentation share the same label, which may be problematic since not
    all data augmentation methods are label-preserving. In this paper, we
    propose a weakly supervised CNN framework named Multiple Instance Learning
    Convolutional Neural Networks (MILCNN) to solve this problem. We apply the
    MILCNN framework to object recognition and report state-of-the-art
    performance on three benchmark datasets: CIFAR10, CIFAR100 and the
    ILSVRC2015 classification dataset.
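
    The abstract does not spell out the MIL aggregation; one common choice,
    shown below, is noisy-OR pooling over the instance predictions with a
    cross-entropy loss on the bag label (a sketch of the general technique,
    not the MILCNN code):

        import numpy as np

        def bag_probability(instance_probs):
            """Noisy-OR pooling: the bag (image) is positive if at least
            one instance (e.g. one augmented crop) is positive."""
            return 1.0 - np.prod(1.0 - np.asarray(instance_probs))

        def mil_loss(instance_probs, bag_label):
            """Cross-entropy on the pooled bag probability: only the bag
            carries the label, so no single crop has to contain the object."""
            p = np.clip(bag_probability(instance_probs), 1e-7, 1 - 1e-7)
            return -(bag_label * np.log(p) + (1 - bag_label) * np.log(1 - p))

        # Three crops of one positive image; one crop contains the object.
        print(mil_loss([0.05, 0.9, 0.1], bag_label=1))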

    FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality

    Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce FaceVR, a novel method for gaze-aware facial reenactment in the
    Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm
    to perform real-time facial motion capture of an actor who is wearing a
    head-mounted display (HMD), as well as a new data-driven approach for eye
    tracking from monocular videos. In addition to these face reconstruction
    components, FaceVR incorporates photo-realistic re-rendering in real time, thus
    allowing artificial modifications of face and eye appearances. For instance, we
    can alter facial expressions, change gaze directions, or remove the VR goggles
    in realistic re-renderings. In a live setup with a source and a target actor,
    we apply these newly-introduced algorithmic components. We assume that the
    source actor is wearing a VR device, and we capture his facial expressions and
    eye movement in real-time. For the target video, we mimic a similar tracking
    process; however, we use the source input to drive the animations of the target
    video, thus enabling gaze-aware facial reenactment. To render the modified
    target video on a stereo display, we augment our capture and reconstruction
    process with stereo data. In the end, FaceVR produces compelling results for a
    variety of applications, such as gaze-aware facial reenactment, reenactment in
    virtual reality, removal of VR goggles, and re-targeting of somebody’s gaze
    direction in a video conferencing call.

    Tangled Splines

    Aditya Tatu
    Comments: 12 pages, To be sent to a Journal/Conference
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG)

    Extracting shape information from object boundaries is a well studied
    problem in vision, and has found tremendous use in applications like object
    recognition. Conversely, studying the space of shapes represented by curves
    satisfying certain constraints is also intriguing. In this paper, we model
    and analyze the space of shapes represented by a 3D curve (space curve)
    formed by connecting n quarter arcs of a unit circle. Such a space curve is
    what we call a Tangle, the name coming from a toy built on the same
    principle. We provide two models for the shape space of n-link open and
    closed tangles, and we show that tangles are a subset of trigonometric
    splines of a certain order. We give algorithms for curve approximation
    using open/closed tangles, for computing geodesics on these shape spaces,
    and for finding the deformation that takes one given tangle to another,
    i.e., the Log map. The algorithms yield tangles up to a small and
    acceptable tolerance, as shown by the results given in the paper.
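
    One plausible construction of such a curve, consistent with the
    description above (a sketch only; the paper's exact parameterization may
    differ), joins unit-radius quarter arcs with tangent continuity and a
    twist about the tangent at every joint:

        import numpy as np

        def tangle(twists, samples=16):
            """Join unit-radius quarter-circle arcs with tangent continuity,
            twisting the osculating plane about the tangent at each joint."""
            p = np.zeros(3)
            T = np.array([1.0, 0.0, 0.0])   # tangent at the start
            N = np.array([0.0, 1.0, 0.0])   # direction toward the arc center
            pts = [p]
            for theta in twists:
                B = np.cross(T, N)          # rotate N about T by the twist
                N = np.cos(theta) * N + np.sin(theta) * B
                s = np.linspace(0, np.pi / 2, samples)[1:]
                pts.extend(p + np.outer(np.sin(s), T)
                             + np.outer(1.0 - np.cos(s), N))
                p = pts[-1]
                T, N = N, -T                # frame at the end of the arc
            return np.array(pts)

        curve = tangle([0.0, np.pi / 2, np.pi])   # a 3-link open tangle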


    Artificial Intelligence

    Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving

    Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Autonomous driving is a multi-agent setting where the host vehicle must apply
    sophisticated negotiation skills with other road users when overtaking, giving
    way, merging, taking left and right turns and while pushing ahead in
    unstructured urban roadways. Since there are many possible scenarios,
    manually tackling all possible cases will likely yield an overly simplistic
    policy. Moreover, one must balance caution against the unexpected behavior
    of other drivers and pedestrians while not being so defensive that normal
    traffic flow is disrupted.

    In this paper we apply deep reinforcement learning to the problem of forming
    long-term driving strategies. We note two major challenges that make
    autonomous driving different from other robotic tasks. The first is the
    necessity of ensuring functional safety, something that machine learning
    has difficulty with, given that performance is optimized at the level of an
    expectation over many instances. The second is that the Markov Decision
    Process model often used in robotics is problematic in our case because of
    the unpredictable behavior of other agents in this multi-agent scenario. We
    make three contributions in our work. First, we show how policy gradient
    iterations can be used without Markovian assumptions. Second, we decompose
    the problem into a composition of a Policy for Desires (which is to be
    learned) and trajectory planning with hard constraints (which is not
    learned). The goal of Desires is to enable driving comfort, while the hard
    constraints guarantee driving safety. Third, we introduce a hierarchical
    temporal abstraction we call an “Option Graph” with a gating mechanism that
    significantly reduces the effective horizon and thereby further reduces the
    variance of the gradient estimation.

    Error Asymmetry in Causal and Anticausal Regression

    Patrick Blöbaum, Takashi Washio, Shohei Shimizu
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    It is generally difficult to make any statements about the expected
    prediction error in a univariate setting without further knowledge about
    how the data were generated. Recent work showed that knowledge about the
    true underlying causal structure of a data generation process has
    implications for various machine learning settings. Assuming additive noise
    and independence between the data-generating mechanism and its input, we
    draw a novel connection between the intrinsic causal relationship of two
    variables and the expected prediction error. We formulate a theorem stating
    that the expected error of the true data-generating function, used as a
    prediction model, is generally smaller when the effect is predicted from
    its cause, and conversely greater when the cause is predicted from its
    effect. The theorem implies an asymmetry in the error depending on the
    prediction direction. This is further corroborated with empirical
    evaluations on artificial and real-world data sets.
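
    The claimed asymmetry is easy to illustrate numerically (a toy simulation,
    not the paper's experiments): with additive noise, predicting the effect
    via the true function beats predicting the cause via the best regression
    in the reverse direction.

        import numpy as np

        rng = np.random.default_rng(1)
        x = rng.uniform(-1, 1, 100_000)              # cause
        y = x ** 3 + rng.normal(0, 0.1, x.size)      # effect = f(cause) + noise

        # Causal direction: predict the effect with the true function f.
        err_causal = np.mean((y - x ** 3) ** 2)

        # Anticausal direction: predict the cause from the effect with the
        # best regression function, approximated by binned conditional means.
        bins = np.digitize(y, np.linspace(y.min(), y.max(), 200))
        means = {b: x[bins == b].mean() for b in np.unique(bins)}
        err_anticausal = np.mean((x - np.array([means[b] for b in bins])) ** 2)

        print(err_causal, err_anticausal)   # the anticausal error is larger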

    PCG-Based Game Design Patterns

    Michael Cook, Mirjam Eladhari, Andy Nealen, Mike Treanor, Eddy Boxerman, Alex Jaffe, Paul Sottosanti, Steve Swink
    Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)

    People enjoy encounters with generative software, but rarely are they
    encouraged to interact with, understand or engage with it. In this paper we
    define the term ‘PCG-based game’, and explain how this concept follows on from
    the idea of an AI-based game. We look at existing examples of games which
    foreground their AI, put forward a methodology for designing PCG-based games,
    describe some example case study designs for PCG-based games, and describe
    lessons learned during this process of sketching and developing ideas.

    Is psychosis caused by defective dissociation? An Artificial Life model for schizophrenia

    Alessandro Fontana
    Comments: 11 pages, 4 figures
    Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI)

    Both neurobiological and environmental factors are known to play a role in
    the origin of schizophrenia, but no model has been proposed that accounts for
    both. This work presents a functional model of schizophrenia that merges
    psychodynamic elements with ingredients borrowed from the theory of
    psychological traumas, and evidences the interplay of traumatic experiences and
    defective mental functions in the pathogenesis of the disorder. Our model
    posits that dissociation is a standard tool used by the mind to protect
    itself from emotional pain. In case of repeated traumas, the mind learns to
    adopt selective forms of dissociation to avoid pain without losing touch with
    external reality. We conjecture that this process is defective in
    schizophrenia, where dissociation is either too weak, giving rise to positive
    symptoms, or too strong, causing negative symptoms.

    Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation

    Andrea F. Daniele, Mohit Bansal, Matthew R. Walter
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)

    Modern robotics applications that involve human-robot interaction require
    robots to be able to communicate with humans seamlessly and effectively.
    Natural language provides a flexible and efficient medium through which robots
    can exchange information with their human partners. Significant advancements
    have been made in developing robots capable of interpreting free-form
    instructions, but less attention has been devoted to endowing robots with the
    ability to generate natural language. We propose a navigational guide model
    that enables robots to generate natural language instructions that allow humans
    to navigate a priori unknown environments. We first decide which information to
    share with the user according to their preferences, using a policy trained from
    human demonstrations via inverse reinforcement learning. We then “translate”
    this information into a natural language instruction using a neural
    sequence-to-sequence model that learns to generate free-form instructions from
    natural language corpora. We evaluate our method on a benchmark route
    instruction dataset and achieve a BLEU score of 72.18% when compared to
    human-generated reference instructions. We additionally conduct navigation
    experiments with human participants that demonstrate that our method generates
    instructions that people follow as accurately and easily as those produced by
    humans.


    Information Retrieval

    Context-Aware Online Learning for Course Recommendation of MOOC Big Data

    Yifan Hou, Pan Zhou, Ting Wang, Yuchong Hu, Dapeng Wu
    Subjects: Learning (cs.LG); Computers and Society (cs.CY); Information Retrieval (cs.IR)

    The Massive Open Online Course (MOOC) has expanded significantly in recent
    years. With the widespread adoption of MOOCs, the opportunity to study
    fascinating courses for free has attracted numerous people of diverse
    educational backgrounds all over the world. In the big data era, a key
    research topic for MOOCs is how to mine the courses each individual learner
    needs from the massive course databases in the cloud, accurately and
    rapidly, as the number of courses increases quickly. In this respect, the
    key challenge is how to realize personalized course recommendation while
    reducing the computing and storage costs for the tremendous course data. In
    this paper, we propose a big-data-supported, context-aware, online-learning
    based course recommender system that can handle dynamic and massive
    datasets and recommends courses using personalized context information and
    historical statistics. The context-awareness takes personal preferences
    into consideration, making the recommendation suitable for people with
    different backgrounds. Besides, the algorithm achieves sublinear regret,
    which means it gradually learns to recommend the most preferred and
    best-matched courses to learners. Unlike other existing algorithms, ours
    bounds both time complexity and space complexity linearly. In addition, our
    storage module extends to distributed, connected clouds, which can handle
    massive course storage from heterogeneous sources. Our experimental results
    verify the superiority of our algorithms compared with existing works in
    the big data setting.
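
    As a flavor of context-aware online learning with sublinear regret, here
    is a minimal context-partitioned UCB recommender (a sketch; the paper's
    algorithm, with its complexity and storage guarantees, is considerably
    more involved, and all names here are made up):

        import math
        import random
        from collections import defaultdict

        class ContextUCB:
            """Per-(context, course) statistics; recommend the course with
            the highest upper confidence bound for the learner's context."""
            def __init__(self, courses):
                self.courses = courses
                self.n = defaultdict(int)        # recommendation counts
                self.mean = defaultdict(float)   # running mean feedback
                self.t = 0

            def recommend(self, context):
                self.t += 1
                def ucb(c):
                    k = (context, c)
                    if self.n[k] == 0:
                        return float("inf")      # explore unseen courses
                    return self.mean[k] + math.sqrt(
                        2 * math.log(self.t) / self.n[k])
                return max(self.courses, key=ucb)

            def feedback(self, context, course, reward):
                k = (context, course)
                self.n[k] += 1
                self.mean[k] += (reward - self.mean[k]) / self.n[k]

        rec = ContextUCB(["ml101", "stats", "databases"])
        for _ in range(100):
            ctx = random.choice(["beginner", "advanced"])
            course = rec.recommend(ctx)
            rec.feedback(ctx, course, random.random())   # simulated feedback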

    Correlation-Based Method for Sentiment Classification

    Hussam Hamdan
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    The classic supervised classification algorithms are effective but
    time-consuming, complicated and not interpretable, which makes it difficult
    to analyze their results and limits the possibility of improving them based
    on real observations. In this paper, we propose a new, simple classifier to
    predict the sentiment label of a short text. This model keeps the capacity
    for human interpretability and can be extended to integrate NLP techniques
    in a more interpretable way. Our model is based on a correlation metric
    which measures the degree of association between a sentiment label and a
    word. Ten correlation metrics are proposed and evaluated intrinsically. A
    classifier based on each metric is then proposed, evaluated and compared to
    the classic classification algorithms, which have proved their performance
    in many studies. Our model outperforms these algorithms with several of the
    correlation metrics.
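
    Using Pearson correlation as a stand-in for the ten metrics, the idea fits
    in a few lines (an illustration with toy data; the paper's metrics and
    vocabulary handling differ):

        import numpy as np

        def word_label_correlations(docs, labels, vocab):
            """Pearson correlation between each word's presence indicator
            and the binary sentiment label over the training documents."""
            y = np.asarray(labels, float)
            corr = {}
            for w in vocab:
                xw = np.array([w in d for d in docs], float)
                if xw.std() > 0:
                    corr[w] = np.corrcoef(xw, y)[0, 1]
            return corr

        def predict(doc, corr):
            """Sum the words' correlations; the sign gives the label."""
            return 1 if sum(corr.get(w, 0.0) for w in doc) >= 0 else 0

        docs = [{"great", "movie"}, {"awful", "plot"},
                {"great", "acting"}, {"awful"}]
        labels = [1, 0, 1, 0]
        corr = word_label_correlations(docs, labels, set().union(*docs))
        print(predict({"great", "plot"}, corr))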

    Supervised Term Weighting Metrics for Sentiment Analysis in Short Text

    Hussam Hamdan, Patrice Bellot, Frederic Bechet
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)

    Term weighting metrics assign weights to terms in order to discriminate the
    important terms from the less crucial ones. Due to this characteristic,
    these metrics have attracted growing attention in text classification and,
    recently, in sentiment analysis. Using the weights given by such metrics
    can lead to a more accurate document representation, which may improve the
    performance of the classification. While previous studies have focused on
    proposing or comparing different weighting metrics for two-class,
    document-level sentiment analysis, this study proposes to analyse the
    results given by each metric in order to find out the characteristics of
    good and bad weighting metrics. We therefore present an empirical study of
    fifteen global supervised weighting metrics combined with four local
    weighting metrics adopted from information retrieval. We also analyse the
    behavior of each metric by observing how it distributes the terms, and
    deduce some characteristics which may distinguish the good metrics from the
    bad ones. The evaluation has been done using a Support Vector Machine on
    three different datasets: Twitter, restaurant and laptop reviews.
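
    A minimal example of combining a local weight (term frequency) with one
    common global supervised weight (chi-square) before an SVM (a sketch with
    toy data; the paper studies fifteen global and four local metrics):

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.feature_selection import chi2
        from sklearn.svm import LinearSVC

        docs = ["great food great staff", "terrible service",
                "food was great", "service was terrible and slow"]
        labels = np.array([1, 0, 1, 0])

        # Local weight: raw term frequency.  Global supervised weight:
        # chi-square between each term and the class label.
        tf = CountVectorizer().fit_transform(docs).toarray().astype(float)
        global_w, _ = chi2(tf, labels)
        X = tf * global_w            # weighted document representation

        clf = LinearSVC().fit(X, labels)
        print(clf.predict(X))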


    Computation and Language

    Survey on the Use of Typological Information in Natural Language Processing

    Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Anna Korhonen
    Journal-ref: COLING 2016
    Subjects: Computation and Language (cs.CL)

    In recent years linguistic typology, which classifies the world’s languages
    according to their functional and structural properties, has been widely used
    to support multilingual NLP. While the growing importance of typological
    information in supporting multilingual tasks has been recognised, no systematic
    survey of existing typological resources and their use in NLP has been
    published. This paper provides such a survey as well as discussion which we
    hope will both inform and inspire future work in the area.

    From phonemes to images: levels of representation in a recurrent neural model of visually-grounded language learning

    Lieke Gelderloos, Grzegorz Chrupała
    Comments: Accepted at COLING 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    We present a model of visually-grounded language learning based on stacked
    gated recurrent neural networks which learns to predict visual features given
    an image description in the form of a sequence of phonemes. The learning task
    resembles that faced by human language learners who need to discover both
    structure and meaning from noisy and ambiguous data across modalities. We show
    that our model indeed learns to predict features of the visual context given
    phonetically transcribed image descriptions, and show that it represents
    linguistic information in a hierarchy of levels: lower layers in the stack are
    comparatively more sensitive to form, whereas higher layers are more sensitive
    to meaning.

    Keystroke dynamics as signal for shallow syntactic parsing

    Barbara Plank
    Comments: In COLING 2016
    Subjects: Computation and Language (cs.CL)

    Keystroke dynamics have been extensively used in psycholinguistic and writing
    research to gain insights into cognitive processing. But do keystroke logs
    contain actual signal that can be used to learn better natural language
    processing models?

    We postulate that keystroke dynamics contain information about syntactic
    structure that can inform shallow syntactic parsing. To test this hypothesis,
    we explore labels derived from keystroke logs as an auxiliary task in a
    multi-task bidirectional Long Short-Term Memory (bi-LSTM). We obtain
    promising results on two shallow syntactic parsing tasks, chunking and CCG
    supertagging.
    Our model is simple, has the advantage that data can come from distinct
    sources, and produces models that are significantly better than models trained
    on the text annotations alone.

    GMM-Free Flat Start Sequence-Discriminative DNN Training

    Gábor Gosztolya, Tamás Grósz, László Tóth
    Subjects: Computation and Language (cs.CL)

    Recently, attempts have been made to remove Gaussian mixture models (GMM)
    from the training process of deep neural network-based hidden Markov models
    (HMM/DNN). For the GMM-free training of an HMM/DNN hybrid we have to solve two
    problems, namely the initial alignment of the frame-level state labels and the
    creation of context-dependent states. Although flat-start training via
    iteratively realigning and retraining the DNN using a frame-level error
    function is viable, it is quite cumbersome. Here, we propose to use a
    sequence-discriminative training criterion for flat start. While
    sequence-discriminative training is routinely applied only in the final phase
    of model training, we show that with proper caution it is also suitable for
    getting an alignment of context-independent DNN models. For the construction of
    tied states we apply a recently proposed KL-divergence-based state clustering
    method, hence our whole training process is GMM-free. In the experimental
    evaluation we found that the sequence-discriminative flat start training method
    is not only significantly faster than the straightforward approach of iterative
    retraining and realignment, but the word error rates attained are slightly
    better as well.

    Toward a new instances of NELL

    Maisa C. Duarte, Pierre Maret
    Comments: 6 pages, 1 figure and 2 tables
    Subjects: Computation and Language (cs.CL)

    We are developing a method for starting new instances of NELL in various
    languages, thereby developing NELL's multilingualism. We base our method on
    our experience with NELL Portuguese and NELL French. This report explains
    our method and develops some research perspectives.

    An Empirical Exploration of Skip Connections for Sequential Tagging

    Huijia Wu, Jiajun Zhang, Chengqing Zong
    Comments: Accepted at COLING 2016
    Subjects: Computation and Language (cs.CL)

    In this paper, we empirically explore the effects of various kinds of skip
    connections in stacked bidirectional LSTMs for sequential tagging. We
    investigate three kinds of skip connections connecting to LSTM cells: (a) skip
    connections to the gates, (b) skip connections to the internal states and (c)
    skip connections to the cell outputs. We present comprehensive experiments
    showing that skip connections to cell outputs outperform the remaining two.
    Furthermore, we observe that using gated identity functions as skip
    mappings works well. Based on these novel skip connections, we successfully
    train deep stacked bidirectional LSTM models and obtain state-of-the-art
    results on CCG supertagging and comparable results on POS tagging.
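
    A gated identity skip mapping to the cell outputs can be sketched as
    follows (an illustration; the sizes and the gate parameterization are
    assumptions, not the paper's exact formulation):

        import numpy as np

        def sigmoid(v):
            return 1.0 / (1.0 + np.exp(-v))

        def layer_with_skip(x, h, Wg, bg):
            """Gated identity skip to the cell outputs: a learned gate
            decides, per dimension, how much of the lower layer's output x
            to pass straight through alongside this layer's output h."""
            g = sigmoid(Wg @ x + bg)
            return h + g * x

        rng = np.random.default_rng(0)
        d = 8
        x = rng.standard_normal(d)      # output of layer l-1 (skip source)
        h = rng.standard_normal(d)      # output of layer l's LSTM cell
        out = layer_with_skip(x, h, rng.standard_normal((d, d)), np.zeros(d))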

    Long Short-Term Memory based Convolutional Recurrent Neural Networks for Large Vocabulary Speech Recognition

    Xiangang Li, Xihong Wu
    Comments: Published in INTERSPEECH 2015, September 6-10, 2015, Dresden, Germany
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)

    Long short-term memory (LSTM) recurrent neural networks (RNNs) have been
    shown to give state-of-the-art performance on many speech recognition tasks,
    as they can learn a dynamically changing contextual window over the entire
    sequence history. Convolutional neural networks (CNNs), on the other hand,
    have brought significant improvements to deep feed-forward neural networks
    (FFNNs), as they are able to better reduce spectral variation in the input
    signal. In this paper, a network architecture called the convolutional
    recurrent neural network (CRNN) is proposed by combining the CNN and the
    LSTM RNN. In the proposed CRNNs, each speech frame, without adjacent context
    frames, is organized as a number of local feature patches along the
    frequency axis, and an LSTM network is then applied to each feature patch
    along the time axis. We train and compare FFNNs, LSTM RNNs and the proposed
    LSTM CRNNs in various configurations. Experimental results show that the
    LSTM CRNNs can exceed state-of-the-art speech recognition performance.

    Correlation-Based Method for Sentiment Classification

    Hussam Hamdan
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    The classic supervised classification algorithms are effective but
    time-consuming, complicated and not interpretable, which makes it difficult
    to analyze their results and limits the possibility of improving them based
    on real observations. In this paper, we propose a new, simple classifier to
    predict the sentiment label of a short text. This model keeps the capacity
    for human interpretability and can be extended to integrate NLP techniques
    in a more interpretable way. Our model is based on a correlation metric
    which measures the degree of association between a sentiment label and a
    word. Ten correlation metrics are proposed and evaluated intrinsically. A
    classifier based on each metric is then proposed, evaluated and compared to
    the classic classification algorithms, which have proved their performance
    in many studies. Our model outperforms these algorithms with several of the
    correlation metrics.

    Leveraging Recurrent Neural Networks for Multimodal Recognition of Social Norm Violation in Dialog

    Tiancheng Zhao, Ran Zhao, Zhao Meng, Justine Cassell
    Comments: Submitted to NIPS Workshop. arXiv admin note: text overlap with arXiv:1608.02977 by other authors
    Subjects: Computation and Language (cs.CL)

    Social norms are shared rules that govern and facilitate social interaction.
    Violating such social norms via teasing and insults may serve to upend
    power imbalances or, on the contrary, reinforce solidarity and rapport in
    conversation; such rapport is highly situated and context-dependent. In this
    work, we investigate the task of automatically identifying the phenomena of
    social norm violation in discourse. Towards this goal, we leverage the power of
    recurrent neural networks and multimodal information present in the
    interaction, and propose a predictive model to recognize social norm violation.
    Using long-term temporal and contextual information, our model achieves an F1
    score of 0.705. Implications of our work regarding developing a social-aware
    agent are discussed.

    Supervised Term Weighting Metrics for Sentiment Analysis in Short Text

    Hussam Hamdan, Patrice Bellot, Frederic Bechet
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)

    Term weighting metrics assign weights to terms in order to discriminate the
    important terms from the less crucial ones. Due to this characteristic,
    these metrics have attracted growing attention in text classification and,
    recently, in sentiment analysis. Using the weights given by such metrics
    can lead to a more accurate document representation, which may improve the
    performance of the classification. While previous studies have focused on
    proposing or comparing different weighting metrics for two-class,
    document-level sentiment analysis, this study proposes to analyse the
    results given by each metric in order to find out the characteristics of
    good and bad weighting metrics. We therefore present an empirical study of
    fifteen global supervised weighting metrics combined with four local
    weighting metrics adopted from information retrieval. We also analyse the
    behavior of each metric by observing how it distributes the terms, and
    deduce some characteristics which may distinguish the good metrics from the
    bad ones. The evaluation has been done using a Support Vector Machine on
    three different datasets: Twitter, restaurant and laptop reviews.

    Neural Paraphrase Generation with Stacked Residual LSTM Networks

    Aaditya Prakash, Sadid A. Hasan, Kathy Lee, Vivek Datla, Ashequl Qadir, Joey Liu, Oladimeji Farri
    Comments: COLING 2016
    Subjects: Computation and Language (cs.CL)

    In this paper, we propose a novel neural approach for paraphrase generation.
    Conventional paraphrase generation methods either leverage hand-written rules
    and thesauri-based alignments, or use statistical machine learning principles.
    To the best of our knowledge, this work is the first to explore deep learning
    models for paraphrase generation. Our primary contribution is a stacked
    residual LSTM network, where we add residual connections between LSTM layers.
    This allows for efficient training of deep LSTMs. We evaluate our model and
    other state-of-the-art deep learning models on three different datasets: PPDB,
    WikiAnswers and MSCOCO. Evaluation results demonstrate that our model
    outperforms sequence-to-sequence, attention-based and bidirectional LSTM
    models on BLEU, METEOR, TER and an embedding-based sentence similarity metric.
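
    The core idea, residual connections around stacked layers, fits in a few
    lines (a sketch with tanh layers standing in for the LSTMs):

        import numpy as np

        def stacked_residual(layers, x):
            """Residual connections between stacked layers:
            h_{l+1} = layer_l(h_l) + h_l, easing training of deep stacks."""
            h = x
            for layer in layers:
                h = layer(h) + h
            return h

        rng = np.random.default_rng(0)
        Ws = [0.1 * rng.standard_normal((16, 16)) for _ in range(4)]
        layers = [lambda h, W=W: np.tanh(W @ h) for W in Ws]  # LSTM stand-ins
        out = stacked_residual(layers, rng.standard_normal(16))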

    Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation

    Andrea F. Daniele, Mohit Bansal, Matthew R. Walter
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)

    Modern robotics applications that involve human-robot interaction require
    robots to be able to communicate with humans seamlessly and effectively.
    Natural language provides a flexible and efficient medium through which robots
    can exchange information with their human partners. Significant advancements
    have been made in developing robots capable of interpreting free-form
    instructions, but less attention has been devoted to endowing robots with the
    ability to generate natural language. We propose a navigational guide model
    that enables robots to generate natural language instructions that allow humans
    to navigate a priori unknown environments. We first decide which information to
    share with the user according to their preferences, using a policy trained from
    human demonstrations via inverse reinforcement learning. We then “translate”
    this information into a natural language instruction using a neural
    sequence-to-sequence model that learns to generate free-form instructions from
    natural language corpora. We evaluate our method on a benchmark route
    instruction dataset and achieve a BLEU score of 72.18% when compared to
    human-generated reference instructions. We additionally conduct navigation
    experiments with human participants that demonstrate that our method generates
    instructions that people follow as accurately and easily as those produced by
    humans.


    Distributed, Parallel, and Cluster Computing

    A Distributed Multi Agents Based Platform for High Performance Computing Infrastructures

    Chairi Kiourt, Dimitris Kalles
    Comments: 12 pages,4 figures, Conference: Workshop Parallel and Distributed Computing for Knowledge Discovery in Data Bases, a workshop of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery, At Porto, Portugal, 2015
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)

    This work introduces a novel, modular, layered web based platform for
    managing machine learning experiments on grid-based High Performance Computing
    infrastructures. The coupling of the communication services offered by the
    grid, with an administration layer and conventional web server programming, via
    a data synchronization utility, leads to the straightforward development of a
    web-based user interface that allows the monitoring and managing of diverse
    online distributed computing applications. It also introduces an experiment
    generation and monitoring tool particularly suitable for investigating machine
    learning in game playing. The platform is demonstrated with experiments for two
    different games.

    Implementing High-Order FIR Filters in FPGAs

    Philipp Födisch, Artsiom Bryksa, Bert Lange, Wolfgang Enghardt, Peter Kaever
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR)

    Contemporary field-programmable gate arrays (FPGAs) are predestined for the
    application of finite impulse response (FIR) filters. Their embedded digital
    signal processing (DSP) blocks for multiply-accumulate operations enable
    efficient fixed-point computations, in cases where the filter structure is
    accurately mapped to the dedicated hardware architecture. This brief presents a
    generic systolic structure for high-order FIR filters, efficiently exploiting
    the hardware resources of an FPGA in terms of routability and timing. Although
    this seems to be an easily implementable task, the synthesizing tools require
    an adaptation of the straightforward digital filter implementation for an
    optimal mapping. Using the example of a symmetric FIR filter with 90 taps, we
    demonstrate the performance of the proposed structure with FPGAs from Xilinx
    and Altera. The implementation utilizes less than 1% of slice logic and runs at
    clock frequencies up to 526 MHz. Moreover, an enhancement of the structure
    ultimately provides an extended dynamic range for the quantized coefficients
    without the costs of additional slice logic.
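
    The symmetry exploited by the DSP pre-adders can be checked in software:
    with h[k] = h[N-1-k], pairs of delayed samples are added before
    multiplying, halving the multiplier count (a numpy model of a 90-tap
    symmetric filter, not the HDL):

        import numpy as np

        def symmetric_fir(x, c):
            """Model of a symmetric FIR filter using the pre-adder trick:
            each pair of delayed samples is summed before one shared
            multiplication with the corresponding coefficient."""
            h = np.concatenate([c, c[::-1]])        # full 2*len(c)-tap response
            N = len(h)
            y = np.zeros(len(x))
            for n in range(N - 1, len(x)):
                w = x[n - N + 1:n + 1][::-1]        # x[n], x[n-1], ..., x[n-N+1]
                pre = w[:N // 2] + w[::-1][:N // 2] # pre-adder pairs
                y[n] = np.dot(c, pre)               # one multiplier per pair
            return y

        x = np.random.randn(256)
        c = np.hanning(90)[:45]    # 45 unique taps of a 90-tap symmetric filter
        h = np.concatenate([c, c[::-1]])
        assert np.allclose(symmetric_fir(x, c)[89:], np.convolve(x, h)[89:256])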

    Cloud Kotta: Enabling Secure and Scalable Data Analytics in the Cloud

    Yadu N. Babuji, Kyle Chard, Aaron Gerow, Eamon Duede
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Distributed communities of researchers rely increasingly on valuable,
    proprietary, or sensitive datasets. Given the growth of such data, especially
    in fields new to data-driven, computationally intensive research like the
    social sciences and humanities, coupled with what are often strict and complex
    data-use agreements, many research communities now require methods that allow
    secure, scalable and cost-effective storage and analysis. Here we present CLOUD
    KOTTA: a cloud-based data management and analytics framework. CLOUD KOTTA
    delivers an end-to-end solution for coordinating secure access to large
    datasets, and an execution model that provides both automated infrastructure
    scaling and support for executing analytics near to the data. CLOUD KOTTA
    implements a fine-grained security model ensuring that only authorized users
    may access, analyze, and download protected data. It also implements automated
    methods for acquiring and configuring low-cost storage and compute resources as
    they are needed. We present the architecture and implementation of CLOUD KOTTA
    and demonstrate the advantages it provides in terms of increased performance
    and flexibility. We show that CLOUD KOTTA’s elastic provisioning model can
    reduce costs by up to 16x when compared with statically provisioned models.

    A Secure Data Enclave and Analytics Platform for Social Scientists

    Yadu N. Babuji, Kyle Chard, Aaron Gerow, Eamon Duede
    Comments: Forthcoming eScience 2016
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Data-driven research is increasingly ubiquitous and data itself is a defining
    asset for researchers, particularly in the computational social sciences and
    humanities. Entire careers and research communities are built around valuable,
    proprietary or sensitive datasets. However, many existing computation resources
    fail to support secure and cost-effective storage of data while also enabling
    secure and flexible analysis of the data. To address these needs we present
    CLOUD KOTTA, a cloud-based architecture for the secure management and
    analysis of social science data. CLOUD KOTTA leverages reliable, secure,
    and scalable cloud resources to deliver capabilities to users, and removes
    the need for users to manage complicated infrastructure. CLOUD KOTTA
    implements automated, cost-aware models for efficiently provisioning tiered
    storage and automatically scaled compute resources. CLOUD KOTTA has been
    used in production for several months, currently manages approximately 10TB
    of data, and has been used to process more than 5TB of data with over
    75,000 CPU hours. It has been used for
    a broad variety of text analysis workflows, matrix factorization, and various
    machine learning algorithms, and more broadly, it supports fast, secure and
    cost-effective research.

    Deterministic parallel algorithms for fooling polylogarithmic juntas and the Lovasz Local Lemma

    David G. Harris
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC); Probability (math.PR)

    Many randomized algorithms can be derandomized efficiently using either the
    method of conditional expectations or probability spaces with low (almost-)
    independence. A series of papers, beginning with work by Luby (1988) and
    continuing with Berger & Rompel (1991) and Chari et al. (1994), showed that
    these techniques can be combined to give deterministic parallel algorithms for
    combinatorial optimization problems involving sums of $w$-juntas. We improve
    these algorithms through derandomized variable partitioning. This reduces the
    processor complexity to essentially independent of $w$ while the running time
    is reduced from exponential in $w$ to approximately $O(w)$. For example, we
    improve the time complexity of an algorithm of Berger & Rompel (1991) for
    rainbow hypergraph coloring by a factor of approximately $\log^2 n$ and the
    processor complexity by a factor of approximately $m^{\ln 2}$.

    As a major application of this, we give an NC algorithm for the Lovász
    Local Lemma. Previous NC algorithms, including Moser & Tardos (2010) and
    Chandrasekaran et al. (2013), required that (essentially) the bad events
    could span only $O(\log n)$ variables; we relax this to allow
    $\mathrm{polylog}(n)$ variables. As two applications of our new algorithm,
    we give algorithms for defective vertex coloring and domatic graph
    partition.

    One main sub-problem encountered in these algorithms is to generate a
    probability space which can “fool” a given list of $GF(2)$ Fourier characters.
    Schulman (1992) gave an NC algorithm for this; we dramatically improve its
    efficiency to near-optimal time, processor complexity, and code dimension.
    This leads to a new algorithm for the heavy-codeword problem, introduced by
    Naor & Naor (1993), with a near-linear processor complexity of
    $(mn)^{1+o(1)}$; this improves on the algorithm of Chari et al. (1994),
    which requires $O(m n^2)$ processors.


    Learning

    Deep Variational Canonical Correlation Analysis

    Weiran Wang, Honglak Lee, Karen Livescu
    Subjects: Learning (cs.LG)

    We present deep variational canonical correlation analysis (VCCA), a deep
    multi-view learning model that extends the latent variable model
    interpretation of linear CCA (Bach & Jordan, 2005) to nonlinear observation
    models parameterized by deep neural networks (DNNs). Both the marginal data
    likelihood and inference are intractable under this model. We derive a
    variational lower bound of the data likelihood by parameterizing the
    posterior density of the latent variables with another DNN, and approximate
    the lower bound via Monte Carlo sampling. Interestingly, the resulting
    model resembles that of multi-view autoencoders (Ngiam et al., 2011), with
    the key distinction of an additional sampling procedure at the bottleneck
    layer. We also propose a variant of VCCA called VCCA-private which can, in
    addition to the “common variables” underlying both views, extract the
    “private variables” within each view. We demonstrate that VCCA-private is
    able to disentangle the shared and private information for multi-view data
    without hard supervision.
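
    Sketching the bound in notation suggested by the abstract (a
    reconstruction, with the approximate posterior q conditioned on the view
    x; the paper's exact notation may differ), the variational objective has
    the familiar form

        \mathcal{L}(x, y; \theta, \phi)
          = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)
            + \log p_\theta(y \mid z)\right]
          - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right),

    estimated by Monte Carlo samples from $q_\phi(z \mid x)$.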

    Context-Aware Online Learning for Course Recommendation of MOOC Big Data

    Yifan Hou, Pan Zhou, Ting Wang, Yuchong Hu, Dapeng Wu
    Subjects: Learning (cs.LG); Computers and Society (cs.CY); Information Retrieval (cs.IR)

    The Massive Open Online Course (MOOC) has expanded significantly in recent
    years. With the widespread adoption of MOOCs, the opportunity to study
    fascinating courses for free has attracted numerous people of diverse
    educational backgrounds all over the world. In the big data era, a key
    research topic for MOOCs is how to mine the courses each individual learner
    needs from the massive course databases in the cloud, accurately and
    rapidly, as the number of courses increases quickly. In this respect, the
    key challenge is how to realize personalized course recommendation while
    reducing the computing and storage costs for the tremendous course data. In
    this paper, we propose a big-data-supported, context-aware, online-learning
    based course recommender system that can handle dynamic and massive
    datasets and recommends courses using personalized context information and
    historical statistics. The context-awareness takes personal preferences
    into consideration, making the recommendation suitable for people with
    different backgrounds. Besides, the algorithm achieves sublinear regret,
    which means it gradually learns to recommend the most preferred and
    best-matched courses to learners. Unlike other existing algorithms, ours
    bounds both time complexity and space complexity linearly. In addition, our
    storage module extends to distributed, connected clouds, which can handle
    massive course storage from heterogeneous sources. Our experimental results
    verify the superiority of our algorithms compared with existing works in
    the big data setting.

    Dynamic Metric Learning from Pairwise Comparisons

    Kristjan Greenewald, Stephen Kelley, Alfred Hero III
    Comments: to appear Allerton 2016. arXiv admin note: substantial text overlap with arXiv:1603.03678
    Subjects: Learning (cs.LG)

    Recent work in distance metric learning has focused on learning
    transformations of data that best align with specified pairwise similarity and
    dissimilarity constraints, often supplied by a human observer. The learned
    transformations lead to improved retrieval, classification, and clustering
    algorithms due to the better adapted distance or similarity measures. Here, we
    address the problem of learning these transformations when the underlying
    constraint generation process is nonstationary. This nonstationarity can be due
    to changes in either the ground-truth clustering used to generate constraints
    or changes in the feature subspaces in which the class structure is apparent.
    We propose Online Convex Ensemble StrongLy Adaptive Dynamic Learning
    (OCELAD), a general adaptive, online approach for learning and tracking
    optimal metrics as they change over time, which is highly robust to a
    variety of nonstationary behaviors in the changing metric. We apply the
    OCELAD framework to an ensemble of online learners. Specifically, we create
    a retro-initialized composite objective mirror descent (COMID) ensemble
    (RICE) consisting of a set of parallel COMID learners with different
    learning rates. We demonstrate RICE-OCELAD on both real and synthetic data
    sets and show significant performance improvements relative to previously
    proposed batch and online distance metric learning algorithms.

    Learning in Implicit Generative Models

    Shakir Mohamed, Balaji Lakshminarayanan
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Computation (stat.CO)

    Generative adversarial networks (GANs) provide an algorithmic framework for
    constructing generative models with several appealing properties: they do not
    require a likelihood function to be specified, only a generating procedure;
    they provide samples that are sharp and compelling; and they allow us to
    harness our knowledge of building highly accurate neural network classifiers.
    Here, we develop our understanding of GANs with the aim of forming a rich
    view of this growing area of machine learning, building connections to the
    diverse set of statistical thinking on this topic, from which much can be
    gained by a mutual exchange of ideas. We frame GANs within the wider
    landscape of algorithms for learning in implicit generative models, i.e.,
    models that only specify a stochastic procedure with which to generate
    data, and relate these ideas to
    modelling problems in related fields, such as econometrics and approximate
    Bayesian computation. We develop likelihood-free inference methods and
    highlight hypothesis testing as a principle for learning in implicit generative
    models, using which we are able to derive the objective function used by GANs,
    and many other related objectives. The testing viewpoint directs our focus to
    the general problem of density ratio estimation. There are four approaches for
    density ratio estimation, one of which is a solution using classifiers to
    distinguish real from generated data. Other approaches such as divergence
    minimisation and moment matching have also been explored in the GAN literature,
    and we synthesise these views to form an understanding in terms of the
    relationships between them and the wider literature, highlighting avenues for
    future exploration and cross-pollination.
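
    The classifier route to density ratio estimation is easy to demonstrate
    (a sketch with Gaussian stand-ins for the real and generated data):

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # Stand-ins for real data and samples from an implicit model
        # (a generating procedure with no tractable likelihood).
        rng = np.random.default_rng(0)
        real = rng.normal(0.0, 1.0, (2000, 1))
        fake = rng.normal(0.5, 1.2, (2000, 1))

        # Train a probabilistic classifier to tell real from generated.
        X = np.vstack([real, fake])
        y = np.concatenate([np.ones(2000), np.zeros(2000)])
        clf = LogisticRegression().fit(X, y)

        # With balanced classes, p_real(x) / p_model(x) ~= D(x) / (1 - D(x)).
        d = clf.predict_proba([[0.0]])[0, 1]
        print(d / (1 - d))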

    Maximum entropy models capture melodic styles

    Jason Sakellariou, Francesca Tria, Vittorio Loreto, François Pachet
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We introduce a Maximum Entropy model able to capture the statistics of
    melodies in music. The model can be used to generate new melodies that emulate
    the style of the musical corpus which was used to train it. Instead of using
    the $n$-body interactions of $(n-1)$-order Markov models, traditionally
    used in automatic music generation, we use a $k$-nearest-neighbour model
    with pairwise
    interactions only. In that way, we keep the number of parameters low and avoid
    over-fitting problems typical of Markov models. We show that long-range musical
    phrases don’t need to be explicitly enforced using high-order Markov
    interactions, but can instead emerge from multiple, competing, pairwise
    interactions. We validate our Maximum Entropy model by contrasting how much the
    generated sequences capture the style of the original corpus without
    plagiarizing it. To this end we use a data-compression approach to discriminate
    the levels of borrowing and innovation featured by the artificial sequences.
    The results show that our modelling scheme outperforms both fixed-order and
    variable-order Markov models. This shows that, despite being based only on
    pairwise interactions, this Maximum Entropy scheme opens the possibility to
    generate musically sensible alterations of the original phrases, providing a
    way to generate innovation.
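
    A toy version of such a pairwise maximum entropy model over notes, with
    Metropolis sampling of melodies (a sketch; here the couplings are random,
    whereas the paper fits them to a musical corpus):

        import numpy as np

        rng = np.random.default_rng(0)
        A, K, L = 12, 4, 32     # alphabet (pitches), interaction range, length
        J = {d: rng.normal(0, 0.5, (A, A)) for d in range(1, K + 1)}

        def energy(s):
            """Pairwise maximum-entropy energy: notes interact with all
            neighbours up to K steps away (no high-order Markov terms)."""
            return -sum(J[d][s[i], s[i + d]]
                        for d in range(1, K + 1) for i in range(L - d))

        # Metropolis sampling of melodies from the model.
        s = rng.integers(0, A, L)
        for _ in range(10_000):
            i, new = rng.integers(L), rng.integers(A)
            t = s.copy()
            t[i] = new
            if rng.random() < np.exp(energy(s) - energy(t)):
                s = t
        print(s)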

    From phonemes to images: levels of representation in a recurrent neural model of visually-grounded language learning

    Lieke Gelderloos, Grzegorz Chrupała
    Comments: Accepted at COLING 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    We present a model of visually-grounded language learning based on stacked
    gated recurrent neural networks which learns to predict visual features given
    an image description in the form of a sequence of phonemes. The learning task
    resembles that faced by human language learners who need to discover both
    structure and meaning from noisy and ambiguous data across modalities. We show
    that our model indeed learns to predict features of the visual context given
    phonetically transcribed image descriptions, and show that it represents
    linguistic information in a hierarchy of levels: lower layers in the stack are
    comparatively more sensitive to form, whereas higher layers are more sensitive
    to meaning.

    A Greedy Approach for Budgeted Maximum Inner Product Search

    Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    Maximum Inner Product Search (MIPS) is an important task in many machine
    learning applications such as the prediction phase of a low-rank matrix
    factorization model for a recommender system. There have been some works on how
    to perform MIPS in sub-linear time recently. However, most of them do not have
    the flexibility to control the trade-off between search efficiency and
    search quality. In this paper, we study the MIPS problem with a
    computational budget.
    By carefully studying the problem structure of MIPS, we develop a novel
    Greedy-MIPS algorithm, which can handle budgeted MIPS by design. While simple
    and intuitive, Greedy-MIPS yields surprisingly superior performance compared to
    state-of-the-art approaches. As a specific example, on a candidate set
    containing half a million vectors of dimension 200, Greedy-MIPS runs 200x
    faster than the naive approach while yielding search results with the top-5
    precision greater than 75%.
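
    The sketch below is in the spirit of the budgeted screening idea (the paper's
    actual Greedy-MIPS construction differs in detail and is more refined):
    candidates are screened by their largest single-coordinate contribution
    q[t] * H[j, t], using per-dimension sorted lists and a frontier heap, and full
    inner products are computed only for the budgeted candidates. All sizes are
    placeholders.

        import heapq
        import numpy as np

        rng = np.random.default_rng(0)
        n, d = 100_000, 50
        H = rng.normal(size=(n, d))              # candidate vectors
        order = np.argsort(-H, axis=0)           # per-dimension indices, descending H

        def greedy_mips(q, budget=512, topk=5):
            # Frontier heap over dimensions, keyed by the best remaining
            # single-coordinate contribution q[t] * H[j, t].
            dim_order = [order[:, t] if q[t] >= 0 else order[::-1, t] for t in range(d)]
            ptr = [0] * d
            heap = [(-q[t] * H[dim_order[t][0], t], t) for t in range(d)]
            heapq.heapify(heap)
            seen = set()
            while len(seen) < budget and heap:
                _, t = heapq.heappop(heap)
                seen.add(int(dim_order[t][ptr[t]]))
                ptr[t] += 1
                if ptr[t] < n:
                    heapq.heappush(heap, (-q[t] * H[dim_order[t][ptr[t]], t], t))
            cand = np.fromiter(seen, dtype=int)
            scores = H[cand] @ q                 # full inner products on budget only
            return cand[np.argsort(-scores)[:topk]]

        q = rng.normal(size=d)
        print(greedy_mips(q))                    # indices of the approximate top-5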

    Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving

    Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Autonomous driving is a multi-agent setting where the host vehicle must apply
    sophisticated negotiation skills with other road users when overtaking, giving
    way, merging, taking left and right turns, and pushing ahead in
    unstructured urban roadways. Since there are many possible scenarios, manually
    tackling all possible cases will likely yield an overly simplistic policy.
    Moreover, one must balance robustness to the unexpected behavior of other
    drivers/pedestrians against the need not to be too defensive, so that normal
    traffic flow is maintained.

    In this paper we apply deep reinforcement learning to the problem of forming
    long-term driving strategies. We note that there are two major challenges that
    make autonomous driving different from other robotic tasks. First is the
    necessity of ensuring functional safety – something that machine learning has
    difficulty with given that performance is optimized at the level of an
    expectation over many instances. Second, the Markov Decision Process model
    often used in robotics is problematic in our case because of unpredictable
    behavior of other agents in this multi-agent scenario. We make three
    contributions in our work. First, we show how policy gradient iterations can be
    used without Markovian assumptions. Second, we decompose the problem into a
    composition of a Policy for Desires (which is to be learned) and trajectory
    planning with hard constraints (which is not learned). The goal of Desires is
    to enable comfortable driving, while the hard constraints guarantee the safety of
    driving. Third, we introduce a hierarchical temporal abstraction we call an
    “Option Graph” with a gating mechanism that significantly reduces the effective
    horizon, thereby reducing the variance of the gradient estimation even
    further.
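
    A highly simplified sketch of the Desires/hard-constraints decomposition is
    shown below; every interface here is invented for illustration, and the learned
    component is reduced to a toy linear scorer.

        import numpy as np

        DESIRES = ["keep_lane", "merge_left", "merge_right"]

        def desires_policy(state, theta):
            # Learned component (toy stand-in): score discrete desires.
            return DESIRES[int(np.argmax(theta @ state))]

        def plan_with_hard_constraints(state, desire):
            # Non-learned component: refuse trajectories violating a hard
            # safety constraint (here a minimum headway, as a stand-in).
            headway_ok = state[0] > 2.0
            if desire != "keep_lane" and not headway_ok:
                return None
            return "trajectory(" + desire + ")"

        rng = np.random.default_rng(0)
        theta = rng.normal(size=(3, 3))
        state = np.array([1.5, 0.3, -0.2])       # toy state; state[0] = headway [s]
        desire = desires_policy(state, theta)
        plan = plan_with_hard_constraints(state, desire) \
            or plan_with_hard_constraints(state, "keep_lane")
        print(desire, "->", plan)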

    Error Asymmetry in Causal and Anticausal Regression

    Patrick Blöbaum, Takashi Washio, Shohei Shimizu
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    It is generally difficult to make any statements about the expected
    prediction error in a univariate setting without further knowledge about how
    the data were generated. Recent work showed that knowledge about the real
    underlying causal structure of a data generation process has implications for
    various machine learning settings. Assuming additive noise and independence
    between the data-generating mechanism and its input, we draw a novel
    connection between the intrinsic causal relationship of two variables and the
    expected prediction error. We formulate a theorem stating that, when the true
    data-generating function is used as the prediction model, the expected error is
    generally smaller when the effect is predicted from its cause and, conversely,
    greater when the cause is predicted from its effect. The theorem implies an asymmetry in the
    error depending on the prediction direction. This is further corroborated with
    empirical evaluations in artificial and real-world data sets.
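
    The claimed asymmetry can be checked on toy data (a rough illustration, not the
    paper's experiment; whether the inequality holds depends on the additive-noise
    and independence assumptions stated above). Data are generated causally as
    y = f(x) + n, both variables are standardized, and a simple nonparametric
    regression is fit in both directions.

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.uniform(-1, 1, 5000)
        y = np.tanh(3 * x) + rng.normal(0, 0.1, size=x.size)   # additive-noise model
        x = (x - x.mean()) / x.std()                           # standardize both
        y = (y - y.mean()) / y.std()                           # for a fair comparison

        def binned_mse(a, b, bins=50):
            # MSE of predicting b from a with a piecewise-constant (binned) fit.
            edges = np.quantile(a, np.linspace(0, 1, bins + 1))
            idx = np.clip(np.digitize(a, edges[1:-1]), 0, bins - 1)
            means = np.array([b[idx == k].mean() for k in range(bins)])
            return float(np.mean((b - means[idx]) ** 2))

        print("causal     x -> y:", binned_mse(x, y))   # typically the smaller error
        print("anticausal y -> x:", binned_mse(y, x))   # typically the larger error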

    Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation

    Andrea F. Daniele, Mohit Bansal, Matthew R. Walter
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)

    Modern robotics applications that involve human-robot interaction require
    robots to be able to communicate with humans seamlessly and effectively.
    Natural language provides a flexible and efficient medium through which robots
    can exchange information with their human partners. Significant advancements
    have been made in developing robots capable of interpreting free-form
    instructions, but less attention has been devoted to endowing robots with the
    ability to generate natural language. We propose a navigational guide model
    that enables robots to generate natural language instructions that allow humans
    to navigate a priori unknown environments. We first decide which information to
    share with the user according to their preferences, using a policy trained from
    human demonstrations via inverse reinforcement learning. We then “translate”
    this information into a natural language instruction using a neural
    sequence-to-sequence model that learns to generate free-form instructions from
    natural language corpora. We evaluate our method on a benchmark route
    instruction dataset and achieve a BLEU score of 72.18% when compared to
    human-generated reference instructions. We additionally conduct navigation
    experiments with human participants that demonstrate that our method generates
    instructions that people follow as accurately and easily as those produced by
    humans.
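
    A minimal encoder-decoder sketch of the second, "translation" stage is given
    below (an assumption-laden illustration, not the authors' model: their system
    uses a richer sequence-to-sequence architecture, and the IRL content-selection
    stage is mocked here as random attribute ids).

        import torch
        import torch.nn as nn

        class AttrToInstruction(nn.Module):
            def __init__(self, n_attr=40, n_words=500, hidden=256):
                super().__init__()
                self.enc_emb = nn.Embedding(n_attr, hidden)
                self.dec_emb = nn.Embedding(n_words, hidden)
                self.encoder = nn.GRU(hidden, hidden, batch_first=True)
                self.decoder = nn.GRU(hidden, hidden, batch_first=True)
                self.out = nn.Linear(hidden, n_words)

            def forward(self, attrs, words):
                _, h = self.encoder(self.enc_emb(attrs))    # summarize content
                dec, _ = self.decoder(self.dec_emb(words), h)
                return self.out(dec)                        # next-word logits

        model = AttrToInstruction()
        attrs = torch.randint(0, 40, (4, 6))     # stand-in for IRL-selected content
        words = torch.randint(0, 500, (4, 12))   # gold instruction tokens
        logits = model(attrs, words[:, :-1])     # teacher forcing
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, 500), words[:, 1:].reshape(-1))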

    Supervised Term Weighting Metrics for Sentiment Analysis in Short Text

    Hussam Hamdan, Patrice Bellot, Frederic Bechet
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)

    Term weighting metrics assign weights to terms in order to discriminate the
    important terms from the less crucial ones. Due to this characteristic, these
    metrics have attracted growing attention in text classification and recently in
    sentiment analysis. Using the weights given by such metrics could lead to a more
    accurate document representation, which may improve the performance of the
    classification. While previous studies have focused on proposing or comparing
    different weighting metrics for two-class document-level sentiment analysis,
    this study proposes to analyse the results given by each metric in order to find
    out the characteristics of good and bad weighting metrics. We therefore present
    an empirical study of fifteen global supervised weighting metrics combined with
    four local weighting metrics adopted from information retrieval. We also analyse
    the behavior of each metric by observing how it distributes the terms, and
    deduce some characteristics that may distinguish the good metrics from the bad
    ones. The evaluation has been done using a Support Vector Machine on three
    different datasets: Twitter, restaurant reviews, and laptop reviews.
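
    As an illustration of the general recipe (local weight times global supervised
    weight, fed to an SVM), the sketch below computes one log-odds-ratio-style
    global weight on a toy corpus; the specific metric, smoothing, and the sklearn
    dependency are assumptions of this sketch, not necessarily among the fifteen
    metrics studied.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.svm import LinearSVC

        docs = ["great food", "awful service", "great great service", "awful food"]
        labels = np.array([1, 0, 1, 0])

        tf = CountVectorizer().fit(docs)
        X = tf.transform(docs).toarray().astype(float)

        pos = X[labels == 1].sum(axis=0) + 1.0    # smoothed term counts per class
        neg = X[labels == 0].sum(axis=0) + 1.0
        global_w = np.abs(np.log((pos / pos.sum()) / (neg / neg.sum())))

        Xw = np.log1p(X) * global_w               # local (log tf) x global weight
        clf = LinearSVC().fit(Xw, labels)
        print(clf.score(Xw, labels))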

    An efficient high-probability algorithm for Linear Bandits

    Gábor Braun, Sebastian Pokutta
    Comments: 17 pages
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    For the linear bandit problem, we extend the analysis of algorithm CombEXP
    from [R. Combes, M. S. Talebi Mazraeh Shahi, A. Proutiere, and M. Lelarge.
    Combinatorial bandits revisited. In C. Cortes, N. D. Lawrence, D. D. Lee, M.
    Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing
    Systems 28, pages 2116–2124. Curran Associates, Inc., 2015. URL
    this http URL] to the
    high-probability case against adaptive adversaries, allowing actions to come
    from an arbitrary polytope. We prove a high-probability regret of
    $O(T^{2/3})$ for time horizon $T$. While this bound is weaker than the
    optimal $O(\sqrt{T})$ bound achieved by GeometricHedge in [P. L. Bartlett, V.
    Dani, T. Hayes, S. Kakade, A. Rakhlin, and A. Tewari. High-probability regret
    bounds for bandit online linear optimization. In 21th Annual Conference on
    Learning Theory (COLT 2008), July 2008.
    this http URL], CombEXP is computationally
    efficient, requiring only an efficient linear optimization oracle over the
    convex hull of the actions.


    Information Theory

    Secrecy in MIMO Networks with No Eavesdropper CSIT

    Pritam Mukherjee, Sennur Ulukus
    Comments: Submitted to IEEE Transactions on Communications, October 2016
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)

    We consider two fundamental multi-user channel models: the multiple-input
    multiple-output (MIMO) wiretap channel with one helper (WTH) and the MIMO
    multiple access wiretap channel (MAC-WT). In each case, the eavesdropper has
    $K$ antennas while the remaining terminals have $N$ antennas each. We consider
    a fast fading channel where the channel state information (CSI) of the
    legitimate receiver is available at the transmitters but no channel state
    information at the transmitters (CSIT) is available for the eavesdropper’s
    channel. We determine the optimal sum secure degrees of freedom (s.d.o.f.) for
    each channel model for the regime $K \leq N$, and show that in this regime, the
    MAC-WT channel reduces to the WTH in the absence of eavesdropper CSIT. For the
    regime $N \leq K \leq 2N$, we obtain the optimal linear s.d.o.f., and show that
    the MAC-WT channel and the WTH have the same optimal s.d.o.f. when restricted
    to linear encoding strategies. In the absence of any such restrictions, we
    provide an upper bound for the sum s.d.o.f. of the MAC-WT channel in the regime
    $N \leq K \leq 2N$. Our results show that, unlike in the single-input
    single-output (SISO) case, there is a loss of s.d.o.f. even for the WTH due to
    the lack of eavesdropper CSIT when $K \geq N$.

    Modulation Classification via Subspace Detection in MIMO Systems

    Hadi Sarieddeen, Mohammad M. Mansour, Ali Chehab
    Subjects: Information Theory (cs.IT)

    The problem of efficient modulation classification (MC) in multiple-input
    multiple-output (MIMO) systems is considered. Per-layer likelihood-based MC is
    proposed by employing subspace decomposition to partially decouple the
    transmitted streams. When detecting the modulation type of the stream of
    interest, a dense constellation is assumed on all remaining streams. The
    proposed classifier outperforms existing MC schemes at a lower complexity cost,
    and can be efficiently implemented in the context of joint MC and subspace data
    detection.
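
    The flavor of likelihood-based modulation classification can be shown in a SISO
    simplification (an assumption: the paper's per-layer subspace decoupling for
    MIMO is not reproduced here). Each received sample's Gaussian likelihood is
    averaged over the candidate constellation, and the best-scoring hypothesis wins.

        import numpy as np

        rng = np.random.default_rng(0)
        qpsk = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)
        psk8 = np.exp(2j * np.pi * np.arange(8) / 8)
        hypotheses = {"QPSK": qpsk, "8PSK": psk8}

        sigma = 0.2
        r = rng.choice(qpsk, 200) + sigma * (rng.normal(size=200)
                                             + 1j * rng.normal(size=200))

        def log_likelihood(r, const, sigma2=2 * sigma**2):
            # Average the Gaussian likelihood over the constellation; the common
            # normalization constant is dropped (identical across hypotheses).
            d2 = np.abs(r[:, None] - const[None, :]) ** 2
            return float(np.sum(np.log(np.mean(np.exp(-d2 / sigma2), axis=1))))

        print(max(hypotheses, key=lambda m: log_likelihood(r, hypotheses[m])))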

    Channel Training for Analog FDD Repeaters: Optimal Estimators and Cramér-Rao Bounds

    Stefan Wesemann, Thomas L. Marzetta
    Comments: Submitted to IEEE Transactions on Signal Processing, 9 pages, 6 figures
    Subjects: Information Theory (cs.IT)

    A network of analog repeaters, each fed by a wireless fronthaul link and
    powered by, e.g., solar energy, is a promising candidate for a flexible small
    cell deployment. A key challenge is the acquisition of accurate channel state
    information by the fronthaul hub (FH), which is needed for the spatial
    multiplexing of multiple fronthaul links over the same time/frequency resource.
    For frequency division duplex channels, a simple pilot loop-back procedure has
    been proposed that allows the estimation of the UL & DL channels at the FH
    without relying on any digital signal processing at the repeater side. For this
    scheme, we derive the maximum likelihood (ML) estimators for the UL & DL
    channel subspaces, formulate the corresponding Cramér-Rao bounds and show the
    asymptotic efficiency of both (SVD-based) estimators by means of Monte Carlo
    simulations. In addition, we illustrate how to compute the underlying (rank-1)
    SVD with quadratic time complexity by employing the power iteration method. To
    enable power control for the fronthaul links, knowledge of the channel gains is
    needed. Assuming that the UL & DL channels have on average the same gain, we
    formulate the ML estimator for the UL channel gain, and illustrate its
    robustness against strong noise by means of simulations.
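
    The power-iteration idea mentioned above is easy to sketch: the leading (rank-1)
    singular triple is obtained with two matrix-vector products per iteration,
    avoiding a full SVD. Iteration count and tolerance below are arbitrary choices.

        import numpy as np

        def rank1_svd(A, iters=200, tol=1e-12):
            # Power iteration on A^H A converges to the leading right singular
            # vector; u and sigma follow from one extra matrix-vector product.
            v = np.random.default_rng(0).normal(size=A.shape[1])
            v = v / np.linalg.norm(v)
            for _ in range(iters):
                u = A @ v
                u = u / np.linalg.norm(u)
                w = A.conj().T @ u
                sigma = np.linalg.norm(w)
                w = w / sigma
                if np.linalg.norm(w - v) < tol:
                    v = w
                    break
                v = w
            return u, sigma, v

        rng = np.random.default_rng(1)
        A = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
        u, s, v = rank1_svd(A)
        print(abs(s - np.linalg.svd(A, compute_uv=False)[0]))   # ~ 0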

    Throughput Optimal Listen-Before-Talk for Cellular in Unlicensed Spectrum

    Ning Wei, Xingqin Lin, Wanwan Li, Youzhi Xiong, Zhongpei Zhang
    Comments: 5 pages, 3 figures, submitted to IEEE ICC 2017
    Subjects: Information Theory (cs.IT)

    The effort to extend cellular technologies to unlicensed spectrum has been
    gaining high momentum. Listen-before-talk (LBT) is enforced in regions such as
    the European Union and Japan to harmonize the coexistence of cellular and
    incumbent systems in unlicensed spectrum. In this paper, we study the
    throughput-optimal LBT transmission strategy for load-based equipment (LBE). We find that the optimal
    rule is a pure threshold policy: The LBE should stop listening and transmit
    once the channel quality exceeds an optimized threshold. We also reveal the
    optimal set of LBT parameters that are compliant with regulatory requirements.
    Our results shed light on how the regulatory LBT requirements can affect the
    transmission strategies of radio equipment in unlicensed spectrum.
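
    A toy simulation (not the paper's model) illustrates why a pure threshold rule
    can be throughput-optimal: waiting for a better channel raises the per-
    transmission rate but costs listening time, so an interior threshold wins. The
    channel-quality distribution, rate formula, and slot costs are invented here.

        import numpy as np

        rng = np.random.default_rng(0)

        def throughput(threshold, slots=100_000, slot_cost=1.0):
            total_rate, total_tx_time, t = 0.0, 0.0, 0
            while t < slots:
                q = rng.exponential(1.0)          # sensed channel quality
                t += 1                            # one listening slot spent
                if q >= threshold:                # stop listening and transmit
                    total_rate += np.log2(1 + q)  # rate on this transmission
                    total_tx_time += slot_cost    # transmission duration
            return total_rate / (t + total_tx_time)

        for th in [0.0, 0.5, 1.0, 2.0, 4.0]:
            print(th, round(throughput(th), 3))   # peak at an interior threshold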

    Optimal Beamforming for MIMO Shared Relaying in Downlink Cellular Networks with ARQ

    Ahmed Raafat Hosny, Ramy Abdallah Tannious, Amr El-Keyi
    Subjects: Information Theory (cs.IT)

    In this paper, we study the performance of the downlink of a cellular network
    with automatic repeat-request (ARQ) and a half-duplex decode-and-forward shared
    relay. In this system, two multiple-input-multiple-output (MIMO) base stations
    serve two single antenna users. A MIMO shared relay retransmits the lost
    packets to the target users. First, we study the system with direct
    retransmission from the base station and derive a closed form expression for
    the outage probability of the system. We show that direct retransmission can
    overcome fading; however, it cannot overcome interference. After that,
    we invoke the shared relay and design the relay beamforming matrices such that
    the signal-to-interference-and-noise ratio (SINR) is improved at the users
    subject to power constraints on the relay. In the case when the transmission of
    only one user fails, we derive a closed form solution for the relay
    beamformers. On the other hand when both transmissions fail, we pose the
    beamforming problem as a sequence of non-convex feasibility problems. We use
    semidefinite relaxation (SDR) to convert each feasibility problem into a convex
    optimization problem. We ensure a rank one solution, and hence, there is no
    loss of optimality in SDR. Simulation results are presented showing the
    superior performance of the proposed relay beamforming strategy compared to
    the direct ARQ system in terms of outage probability.
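
    A generic SDR sketch for one SINR-constrained beamforming problem is shown
    below (an illustration assuming the cvxpy package, not the paper's exact
    feasibility sequence or relay model): lift w w^H to a PSD matrix W, solve the
    convex relaxation, and read off a (here, dominant-eigenvector) beamformer.

        import numpy as np
        import cvxpy as cp

        rng = np.random.default_rng(0)
        n, gamma, sigma2 = 4, 2.0, 1.0
        h = rng.normal(size=n) + 1j * rng.normal(size=n)   # desired user channel
        g = rng.normal(size=n) + 1j * rng.normal(size=n)   # interfered user channel
        H, G = np.outer(h, h.conj()), np.outer(g, g.conj())

        W = cp.Variable((n, n), hermitian=True)
        prob = cp.Problem(
            cp.Minimize(cp.real(cp.trace(W))),             # transmit power
            [W >> 0,
             cp.real(cp.trace(H @ W))
             >= gamma * (cp.real(cp.trace(G @ W)) + sigma2)])
        prob.solve()

        vals, vecs = np.linalg.eigh(W.value)
        w = np.sqrt(vals[-1]) * vecs[:, -1]                # rank-1 beamformer
        print(prob.status, np.round(np.abs(w), 3))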

    Vector Approximate Message Passing

    Sundeep Rangan, Philip Schniter, Alyson Fletcher
    Subjects: Information Theory (cs.IT)

    The standard linear regression (SLR) problem is to recover a vector
    $\mathbf{x}^0$ from noisy linear observations
    $\mathbf{y}=\mathbf{Ax}^0+\mathbf{w}$. The approximate message passing (AMP)
    algorithm recently proposed by Donoho, Maleki, and Montanari is a
    computationally efficient iterative approach to SLR that has a remarkable
    property: for large i.i.d. sub-Gaussian matrices $\mathbf{A}$, its
    per-iteration behavior is rigorously characterized by a scalar state-evolution
    whose fixed points, when unique, are Bayes optimal. AMP, however, is fragile in
    that even small deviations from the i.i.d. sub-Gaussian model can cause the
    algorithm to diverge. This paper considers a “vector AMP” (VAMP) algorithm and
    shows that VAMP has a rigorous scalar state-evolution that holds under a much
    broader class of large random matrices $\mathbf{A}$: those that are
    right-rotationally invariant. After performing an initial singular value
    decomposition (SVD) of $\mathbf{A}$, the per-iteration complexity of VAMP can
    be made similar to that of AMP. In addition, the fixed points of VAMP’s state
    evolution are consistent with the replica prediction of the minimum
    mean-squared error recently derived by Tulino, Caire, Verdú, and Shamai. The
    effectiveness and state evolution predictions of VAMP are confirmed in
    numerical experiments.
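
    The two-step extrinsic-message structure of the VAMP iteration can be sketched
    as below (assumptions: a soft-threshold denoiser for a sparse signal, known
    noise precision, and arbitrary threshold/clipping constants; this follows the
    iteration's structure, not the authors' code).

        import numpy as np

        rng = np.random.default_rng(0)
        N, M, gamma_w = 512, 256, 1e4                    # sizes, noise precision
        x0 = rng.normal(size=N) * (rng.random(N) < 0.1)  # sparse ground truth
        A = rng.normal(size=(M, N)) / np.sqrt(M)
        y = A @ x0 + rng.normal(size=M) / np.sqrt(gamma_w)

        AtA, Aty = A.T @ A, A.T @ y
        r1, gamma1 = np.zeros(N), 1e-3
        for _ in range(30):
            # Step 1: separable denoising, then pass the extrinsic message.
            tau = 1.1 / np.sqrt(gamma1)
            x1 = np.sign(r1) * np.maximum(np.abs(r1) - tau, 0.0)
            alpha1 = max(np.mean(np.abs(r1) > tau), 1e-6)   # mean denoiser slope
            eta1 = gamma1 / alpha1
            gamma2 = max(eta1 - gamma1, 1e-9)
            r2 = (eta1 * x1 - gamma1 * r1) / gamma2
            # Step 2: LMMSE estimation, then pass the extrinsic message back.
            C = np.linalg.inv(gamma_w * AtA + gamma2 * np.eye(N))
            x2 = C @ (gamma_w * Aty + gamma2 * r2)
            alpha2 = gamma2 * np.trace(C) / N
            eta2 = gamma2 / alpha2
            gamma1 = max(eta2 - gamma2, 1e-9)
            r1 = (eta2 * x2 - gamma2 * r2) / gamma1
        print("NMSE:", np.sum((x1 - x0) ** 2) / np.sum(x0 ** 2))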

    Multi-User MIMO with flexible numerology for 5G

    Sridhar Rajagopal, Md. Saifur Rahman
    Comments: 6 pages, 10 figures
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Flexible numerologies are being considered as part of designs for 5G systems
    to support vertical services with diverse requirements such as enhanced mobile
    broadband, ultra-reliable low-latency communications, and massive machine type
    communication. Different vertical services can be multiplexed in either
    frequency domain, time domain, or both. In this paper, we investigate the use
    of spatial multiplexing of services using MU-MIMO where the numerologies for
    different users may be different. The users are grouped according to the chosen
    numerology and a separate pre-coder and FFT size is used per numerology at the
    transmitter. The pre-coded signals for the multiple numerologies are added in
    the time domain before transmission. We analyze the performance gains of this
    approach using capacity analysis and link level simulations using conjugate
    beamforming and signal-to-leakage-and-noise ratio maximization techniques. We show
    that the MU interference between users with different numerologies can be
    suppressed efficiently with a reasonable number of antennas at the base station.
    This feature enables MU-MIMO techniques to be applied for 5G across different
    numerologies.
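
    The time-domain superposition step can be illustrated in a toy sketch
    (assumptions: no cyclic prefixes, one antenna port shown, QPSK symbols; the
    paper applies per-numerology MU-MIMO precoding before this addition).

        import numpy as np

        rng = np.random.default_rng(0)
        N1, N2 = 1024, 512                    # FFT sizes, e.g. 15 kHz vs 30 kHz spacing
        qpsk = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)
        sym1 = rng.choice(qpsk, N1)           # one long symbol, numerology 1
        sym2 = rng.choice(qpsk, (2, N2))      # two short symbols, numerology 2

        t1 = np.fft.ifft(sym1) * np.sqrt(N1)
        t2 = np.concatenate([np.fft.ifft(s) * np.sqrt(N2) for s in sym2])
        tx = t1 + t2                          # superposition in the time domain
        print(tx.shape)                       # (1024,) samples sent over the air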

    Quantum authentication with key recycling

    Christopher Portmann
    Comments: 34+13 pages, 12 figures, comments welcome
    Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR); Information Theory (cs.IT)

    We show that a family of quantum authentication protocols introduced in
    [Barnum et al., FOCS 2002] can be used to construct a secure quantum channel
    and additionally recycle all of the secret key if the message is successfully
    authenticated, and recycle part of the key if tampering is detected. We give a
    full security proof that constructs the secure channel given only insecure
    noisy channels and a shared secret key. We also prove that the number of
    recycled key bits is optimal for this family of protocols, i.e., there exists
    an adversarial strategy to obtain all non-recycled bits. Previous works
    recycled less key and only gave partial security proofs, since they did not
    consider all possible distinguishers (environments) that may be used to
    distinguish the real setting from the ideal secure quantum channel.



