
    arXiv Paper Daily: Wed, 26 Apr 2017

    Published by 我爱机器学习 (52ml.net) on 2017-04-26 00:00:00

    Neural and Evolutionary Computing

    Tapping the sensorimotor trajectory

    Oswald Berthold, Verena Hafner
    Comments: Paper submitted to the ICDL-Epirob 2017 conference. Currently under review
    Subjects: Neural and Evolutionary Computing (cs.NE)

    We propose sensorimotor tappings, a new graphical technique that explicitly
    represents relations between the time steps of an agent’s sensorimotor loop and
    a single training step of an adaptive model that the agent is using internally.
    In the simplest case this is a relation linking two time steps. In realistic
    cases these relations can extend over several time steps and over different
    sensory channels. The aim is to capture the footprint of information intake
    relative to the agent’s current time step. We argue that this view allows us to
    make prior considerations explicit and then use them in implementations without
    modification once they are established. In the paper we introduce the problem
    domain, explain the basic idea, provide example tappings for standard
    configurations used in developmental models, and show how tappings can be
    applied to problems in related fields.

    Introspective Generative Modeling: Decide Discriminatively

    Justin Lazarow, Long Jin, Zhuowen Tu
    Comments: 10 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We study unsupervised learning by developing introspective generative
    modeling (IGM) that attains a generator using progressively learned deep
    convolutional neural networks. The generator is itself a discriminator, capable
    of introspection: being able to self-evaluate the difference between its
    generated samples and the given training data. When followed by repeated
    discriminative learning, desirable properties of modern discriminative
    classifiers are directly inherited by the generator. IGM learns a cascade of
    CNN classifiers using a synthesis-by-classification algorithm. In the
    experiments, we observe encouraging results on a number of applications
    including texture modeling, artistic style transferring, face modeling, and
    semi-supervised learning.

    Introspective Classifier Learning: Empower Generatively

    Long Jin, Justin Lazarow, Zhuowen Tu
    Comments: 11 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    In this paper we propose introspective classifier learning (ICL) that
    emphasizes the importance of having a discriminative classifier empowered with
    generative capabilities. We develop a reclassification-by-synthesis algorithm
    to perform training using a formulation stemmed from the Bayes theory. Our
    classifier is able to iteratively: (1) synthesize pseudo-negative samples in
    the synthesis step; and (2) enhance itself by improving the classification in
    the reclassification step. The single classifier learned is at the same time
    generative — being able to directly synthesize new samples within its own
    discriminative model. We conduct experiments on standard benchmark datasets
    including MNIST, CIFAR, and SVHN using state-of-the-art CNN architectures, and
    observe improved classification results.

    DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning

    Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, Sunghun Kim
    Comments: Accepted at IJCAI 2017 (The 26th International Joint Conference on Artificial Intelligence)
    Subjects: Software Engineering (cs.SE); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)

    Computer programs written in one language are often required to be ported to
    other languages to support multiple devices and environments. When programs use
    language specific APIs (Application Programming Interfaces), it is very
    challenging to migrate these APIs to the corresponding APIs written in other
    languages. Existing approaches mine API mappings from projects that have
    corresponding versions in two languages. They rely on the sparse availability
    of bilingual projects, thus producing a limited number of API mappings. In this
    paper, we propose an intelligent system called DeepAM for automatically mining
    API mappings from a large-scale code corpus without bilingual projects. The key
    component of DeepAM is based on a multimodal sequence-to-sequence learning
    architecture that aims to learn joint semantic representations of bilingual API
    sequences from big source code data. Experimental results indicate that DeepAM
    significantly increases the accuracy of API mappings as well as the number of
    API mappings, when compared with the state-of-the-art approaches.


    Computer Vision and Pattern Recognition

    Introspective Generative Modeling: Decide Discriminatively

    Justin Lazarow, Long Jin, Zhuowen Tu
    Comments: 10 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We study unsupervised learning by developing introspective generative
    modeling (IGM) that attains a generator using progressively learned deep
    convolutional neural networks. The generator is itself a discriminator, capable
    of introspection: being able to self-evaluate the difference between its
    generated samples and the given training data. When followed by repeated
    discriminative learning, desirable properties of modern discriminative
    classifiers are directly inherited by the generator. IGM learns a cascade of
    CNN classifiers using a synthesis-by-classification algorithm. In the
    experiments, we observe encouraging results on a number of applications
    including texture modeling, artistic style transferring, face modeling, and
    semi-supervised learning.

    Introspective Classifier Learning: Empower Generatively

    Long Jin, Justin Lazarow, Zhuowen Tu
    Comments: 11 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    In this paper we propose introspective classifier learning (ICL) that
    emphasizes the importance of having a discriminative classifier empowered with
    generative capabilities. We develop a reclassification-by-synthesis algorithm
    to perform training using a formulation stemmed from the Bayes theory. Our
    classifier is able to iteratively: (1) synthesize pseudo-negative samples in
    the synthesis step; and (2) enhance itself by improving the classification in
    the reclassification step. The single classifier learned is at the same time
    generative — being able to directly synthesize new samples within its own
    discriminative model. We conduct experiments on standard benchmark datasets
    including MNIST, CIFAR, and SVHN using state-of-the-art CNN architectures, and
    observe improved classification results.

    Unsupervised Learning of Depth and Ego-Motion from Video

    Tinghui Zhou, Matthew Brown, Noah Snavely, David G. Lowe
    Comments: Accepted to CVPR 2017. Project webpage: this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present an unsupervised learning framework for the task of monocular depth
    and camera motion estimation from unstructured video sequences. We achieve this
    by simultaneously training depth and camera pose estimation networks using the
    task of view synthesis as the supervisory signal. The networks are thus coupled
    via the view synthesis objective during training, but can be applied
    independently at test time. Empirical evaluation on the KITTI dataset
    demonstrates the effectiveness of our approach: 1) monocular depth performing
    comparably with supervised methods that use either ground-truth pose or depth
    for training, and 2) pose estimation performing favorably with established SLAM
    systems under comparable input settings.
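
    The supervision here is purely photometric: the predicted depth and relative
    pose are used to inverse-warp one frame into the other, and the pixel
    difference is penalized. Below is a minimal NumPy sketch of such a
    view-synthesis loss under simplifying assumptions (single scale,
    nearest-neighbour sampling instead of a differentiable bilinear sampler, and
    an assumed convention for the relative pose R, t); it is an illustration, not
    the authors' implementation.

```python
import numpy as np

def view_synthesis_loss(target, source, depth, K, R, t):
    """Minimal sketch of a photometric view-synthesis loss (illustration only).
    target, source : grayscale frames, shape (H, W)
    depth          : predicted per-pixel depth of the target frame, shape (H, W)
    K              : 3x3 camera intrinsics
    R, t           : assumed relative pose mapping target-camera coordinates
                     into the source camera frame."""
    H, W = target.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(float)
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # back-project target pixels
    cam_src = R @ cam + t.reshape(3, 1)                   # transform into source frame
    proj = K @ cam_src
    us, vs = proj[0] / proj[2], proj[1] / proj[2]
    # nearest-neighbour sampling; a trainable version would use bilinear sampling
    us = np.clip(np.round(us).astype(int), 0, W - 1)
    vs = np.clip(np.round(vs).astype(int), 0, H - 1)
    warped = source[vs, us].reshape(H, W)
    return np.abs(target.astype(float) - warped).mean()
```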

    Hand Keypoint Detection in Single Images using Multiview Bootstrapping

    Tomas Simon, Hanbyul Joo, Iain Matthews, Yaser Sheikh
    Comments: CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present an approach that uses a multi-camera system to train fine-grained
    detectors for keypoints that are prone to occlusion, such as the joints of a
    hand. We call this procedure multiview bootstrapping: first, an initial
    keypoint detector is used to produce noisy labels in multiple views of the
    hand. The noisy detections are then triangulated in 3D using multiview geometry
    or marked as outliers. Finally, the reprojected triangulations are used as new
    labeled training data to improve the detector. We repeat this process,
    generating more labeled data in each iteration. We derive a result analytically
    relating the minimum number of views to achieve target true and false positive
    rates for a given detector. The method is used to train a hand keypoint
    detector for single images. The resulting keypoint detector runs in realtime on
    RGB images and has accuracy comparable to methods that use depth sensors. The
    single view detector, triangulated over multiple views, enables 3D markerless
    hand motion capture with complex object interactions.
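
    The core geometric step of multiview bootstrapping is triangulating noisy 2D
    detections from calibrated views and reprojecting them as new labels. A
    minimal two-view DLT triangulation sketch (hypothetical helper names; the
    paper triangulates across many views with robust outlier rejection) is shown
    below.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one keypoint seen in two calibrated views.
    P1, P2: 3x4 projection matrices; x1, x2: 2D detections in pixels."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]                       # inhomogeneous 3D point

def reproject(P, X):
    """Project a triangulated 3D point back into a view; the reprojections
    serve as pseudo-labels for retraining the keypoint detector."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]
```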

    SfM-Net: Learning of Structure and Motion from Video

    Sudheendra Vijayanarasimhan, Susanna Ricco, Cordelia Schmid, Rahul Sukthankar, Katerina Fragkiadaki
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose SfM-Net, a geometry-aware neural network for motion estimation in
    videos that decomposes frame-to-frame pixel motion in terms of scene and object
    depth, camera motion and 3D object rotations and translations. Given a sequence
    of frames, SfM-Net predicts depth, segmentation, camera and rigid object
    motions, converts those into a dense frame-to-frame motion field (optical
    flow), differentiably warps frames in time to match pixels and back-propagates.
    The model can be trained with various degrees of supervision: 1)
    self-supervised by the re-projection photometric error (completely
    unsupervised), 2) supervised by ego-motion (camera motion), or 3) supervised by
    depth (e.g., as provided by RGBD sensors). SfM-Net extracts meaningful depth
    estimates and successfully estimates frame-to-frame camera rotations and
    translations. It often successfully segments the moving objects in the scene,
    even though such supervision is never provided.

    Arabidopsis roots segmentation based on morphological operations and CRFs

    José Ignacio Orlando, Hugo Luis Manterola, Enzo Ferrante, Federico Ariel
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Arabidopsis thaliana is a plant species widely utilized by scientists to
    estimate the impact of genetic differences in root morphological features. For
    this purpose, images of this plant after genetic modifications are taken to
    study differences in the root architecture. This task requires manual
    segmentation of radicular structures, which is a particularly tedious
    and time-consuming labor. In this work, we present an unsupervised method for
    Arabidopsis thaliana root segmentation based on morphological operations and
    fully-connected Conditional Random Fields. Although other approaches have been
    proposed for this purpose, all of them are based on more complex and expensive
    imaging modalities. Our results prove that our method can be easily applied
    to images taken using conventional scanners, with minor user intervention.
    A first data set, our results and a fully open source implementation are
    available online.
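
    As a rough illustration of the morphological part of such a pipeline (the CRF
    refinement is omitted, and the threshold, structuring element, and size cutoff
    below are hypothetical placeholders), a first pass might look like this:

```python
import numpy as np
from scipy import ndimage as ndi

def rough_root_mask(gray, threshold=0.5, min_size=200):
    """Sketch of a morphology-based first pass: threshold the scanned image,
    clean it with binary opening, and drop small connected components.
    Assumes roots are darker than the background; values are placeholders."""
    binary = gray < threshold
    binary = ndi.binary_opening(binary, structure=np.ones((3, 3)))
    labels, n = ndi.label(binary)
    sizes = ndi.sum(binary, labels, index=np.arange(1, n + 1))
    keep_labels = np.nonzero(sizes >= min_size)[0] + 1
    return np.isin(labels, keep_labels)
```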

    Joint Sequence Learning and Cross-Modality Convolution for 3D Biomedical Segmentation

    Kuan-Lun Tseng, Yen-Liang Lin, Winston Hsu, Chung-Yang Huang
    Comments: CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep learning models such as convolutional neural networks have been widely
    used in 3D biomedical segmentation and achieve state-of-the-art performance.
    However, most of them often adopt a single modality or stack multiple
    modalities as different input channels. To better leverage the
    multi-modalities, we propose a deep encoder-decoder structure with
    cross-modality convolution layers to incorporate different modalities of MRI
    data. In addition, we exploit convolutional LSTM to model a sequence of 2D
    slices, and jointly learn the multi-modalities and convolutional LSTM in an
    end-to-end manner. To avoid converging to certain labels, we adopt a
    re-weighting scheme and two-phase training to handle the label imbalance.
    Experimental results on BRATS-2015 show that our method outperforms
    state-of-the-art biomedical segmentation approaches.

    Speeding up Convolutional Neural Networks By Exploiting the Sparsity of Rectifier Units

    Shaohuai Shi, Xiaowen Chu
    Comments: 7 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Rectifier neuron units (ReLUs) have been widely used in deep convolutional
    networks. An ReLU converts negative values to zeros, and does not change
    positive values, which leads to a high sparsity of neurons. In this work, we
    first examine the sparsity of the outputs of ReLUs in some popular deep
    convolutional architectures. And then we use the sparsity property of ReLUs to
    accelerate the calculation of convolution by skipping calculations of
    zero-valued neurons. The proposed sparse convolution algorithm achieves
    significant speedup on CPUs compared to the traditional matrix-matrix
    multiplication algorithm for convolution.
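
    The idea can be illustrated with a direct convolution that accumulates
    contributions only from the non-zero activations left by the ReLU. The naive
    single-channel NumPy sketch below shows the skipping, not the optimized CPU
    kernel described in the paper.

```python
import numpy as np

def sparse_conv2d(x, w):
    """Naive sketch of convolution that skips zero activations.
    x: (H, W) post-ReLU feature map (mostly zeros); w: (kh, kw) kernel.
    Equivalent to a dense 'valid' cross-correlation, but multiply-accumulates
    are performed only for the non-zero inputs."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    ys, xs = np.nonzero(x)                    # positions of non-zero activations
    for y, xp in zip(ys, xs):
        val = x[y, xp]
        for i in range(kh):                   # each non-zero input contributes to
            for j in range(kw):               # a kh x kw neighbourhood of outputs
                oy, ox = y - i, xp - j
                if 0 <= oy < out.shape[0] and 0 <= ox < out.shape[1]:
                    out[oy, ox] += val * w[i, j]
    return out
```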

    Masked Signal Decomposition Using Subspace Representation and Its Applications

    Shervin Minaee, Yao Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Signal decomposition is a classical problem in signal processing, which aims
    to separate an observed signal into two or more components each with its own
    property. Usually each component is described by its own subspace or
    dictionary. Extensive research has been done for the case where the components
    are additive, but in real world applications, the components are often
    non-additive. For example, an image may consist of a foreground object overlaid
    on a background, where each pixel either belongs to the foreground or the
    background. In such a situation, to separate signal components, we need to find
    a binary mask which shows the location of each component. Therefore it requires
    to solve a binary optimization problem. Since most of the binary optimization
    problems are intractable, we relax this problem to the approximated continuous
    problem, and solve it by alternating optimization technique, which solves the
    mask and the subspace projection for each component, alternatingly and
    iteratively. We show the application of the proposed algorithm for three
    applications: separation of text from background in images, separation of
    moving objects from a background undergoing global camera motion in videos,
    separation of sinusoidal and spike components in one dimensional signals. We
    demonstrate in each case that considering the non-additive nature of the
    problem can lead to significant improvement.
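
    A toy version of the alternating scheme can be sketched as follows. This
    hard-assignment variant only illustrates the structure (fit each component on
    its assigned samples, then re-assign each sample to the component that
    explains it better); the paper instead relaxes the binary mask and solves a
    continuous problem.

```python
import numpy as np

def masked_decompose(y, A1, A2, iters=20, rng=np.random.default_rng(0)):
    """Simplified illustration of non-additive decomposition: every sample of y
    belongs to either the subspace spanned by A1 or the one spanned by A2.
    y: (n,) observed signal; A1, A2: (n, d) subspace bases."""
    n = len(y)
    mask = rng.random(n) < 0.5               # True -> component 2, False -> component 1
    c1 = np.zeros(A1.shape[1])
    c2 = np.zeros(A2.shape[1])
    for _ in range(iters):
        if (~mask).any():
            c1 = np.linalg.lstsq(A1[~mask], y[~mask], rcond=None)[0]
        if mask.any():
            c2 = np.linalg.lstsq(A2[mask], y[mask], rcond=None)[0]
        r1 = np.abs(y - A1 @ c1)             # per-sample residual under component 1
        r2 = np.abs(y - A2 @ c2)
        mask = r2 < r1                       # re-assign samples
    return mask, A1 @ c1, A2 @ c2
```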

    Inception Recurrent Convolutional Neural Network for Object Recognition

    Md Zahangir Alom, Mahmudul Hasan, Chris Yakopcic, Tarek M. Taha
    Comments: 11 pages, 10 figures, 2 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep convolutional neural networks (DCNNs) are an influential tool for
    solving various problems in the machine learning and computer vision fields. In
    this paper, we introduce a new deep learning model called an Inception-
    Recurrent Convolutional Neural Network (IRCNN), which utilizes the power of an
    inception network combined with recurrent layers in DCNN architecture. We have
    empirically evaluated the recognition performance of the proposed IRCNN model
    using different benchmark datasets such as MNIST, CIFAR-10, CIFAR-100, and
    SVHN. Experimental results show similar or higher recognition accuracy when
    compared to most of the popular DCNNs including the RCNN. Furthermore, we have
    investigated IRCNN performance against equivalent Inception Networks and
    Inception-Residual Networks using the CIFAR-100 dataset. We report about 3.5%,
    3.47% and 2.54% improvement in classification accuracy when compared to the
    RCNN, equivalent Inception Networks, and Inception-Residual Networks on the
    augmented CIFAR-100 dataset, respectively.

    Perivascular Spaces Segmentation in Brain MRI Using Optimal 3D Filtering

    Lucia Ballerini, Ruggiero Lovreglio, Maria del C. Valdes-Hernandez, Joel Ramirez, Bradley J. MacIntosh, Sandra E. Black, Joanna M. Wardlaw
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Perivascular Spaces (PVS) are a recently recognised feature of Small Vessel
    Disease (SVD), also indicating neuroinflammation, and are an important part of
    the brain’s circulation and glymphatic drainage system. Quantitative analysis
    of PVS on Magnetic Resonance Images (MRI) is important for understanding their
    relationship with neurological diseases. In this work, we propose a
    segmentation technique based on the 3D Frangi filtering for extraction of PVS
    from MRI. Based on prior knowledge from neuroradiological ratings of PVS, we
    used ordered logit models to optimise Frangi filter parameters in response to
    the variability in the scanner’s parameters and study protocols. We optimized
    and validated our proposed models on two independent cohorts, a dementia sample
    (N=20) and patients who previously had mild to moderate stroke (N=48). Results
    demonstrate the robustness and generalisability of our segmentation method.
    Segmentation-based PVS burden estimates correlated with neuroradiological
    assessments (Spearman’s ρ = 0.74, p < 0.001), suggesting the great
    potential of our proposed method.
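
    As a hedged illustration of the filtering step only (the parameter values
    below are placeholders, whereas the paper optimises the 3D filter parameters
    per cohort with ordered logit models against neuroradiological ratings), a
    slice-wise candidate map could be obtained with scikit-image's Frangi filter:

```python
import numpy as np
from skimage.filters import frangi

def pvs_candidates(mri_slice, sigmas=(0.5, 1.0, 1.5), alpha=0.5, beta=0.5, threshold=0.2):
    """Sketch: enhance tubular structures with the Frangi vesselness filter and
    threshold the response to get candidate PVS voxels.  All values are
    placeholders, not the cohort-optimised parameters from the paper."""
    vesselness = frangi(mri_slice, sigmas=sigmas, alpha=alpha, beta=beta,
                        black_ridges=False)     # detect bright (CSF-filled) structures
    return vesselness > threshold
```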

    Joint Layout Estimation and Global Multi-View Registration for Indoor Reconstruction

    Jeong-Kyun Lee, Jae-Won Yea, Min-Gyu Park, Kuk-Jin Yoon
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose a novel method to jointly solve scene layout
    estimation and global registration problems for accurate indoor 3D
    reconstruction. Given a sequence of range data, we first build a set of scene
    fragments using KinectFusion and register them through pose graph optimization.
    Afterwards, we alternate between layout estimation and layout-based global
    registration processes in an iterative fashion to complement each other. We
    extract the scene layout through hierarchical agglomerative clustering and
    energy-based multi-model fitting in consideration of noisy measurements. With
    the estimated scene layout in hand, we register all the range data through
    the global iterative closest point algorithm where the positions of 3D points
    that belong to the layout such as walls and a ceiling are constrained to be
    close to the layout. We experimentally verify the proposed method with the
    publicly available synthetic and real-world datasets in both quantitative and
    qualitative ways.

    Skeleton-based Action Recognition with Convolutional Neural Networks

    Chao Li, Qiaoyong Zhong, Di Xie, Shiliang Pu
    Comments: ICMEW 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Current state-of-the-art approaches to skeleton-based action recognition are
    mostly based on recurrent neural networks (RNN). In this paper, we propose a
    novel convolutional neural networks (CNN) based framework for both action
    classification and detection. Raw skeleton coordinates as well as skeleton
    motion are fed directly into CNN for label prediction. A novel skeleton
    transformer module is designed to rearrange and select important skeleton
    joints automatically. With a simple 7-layer network, we obtain 89.3% accuracy
    on the validation set of the NTU RGB+D dataset. For action detection in untrimmed
    videos, we develop a window proposal network to extract temporal segment
    proposals, which are further classified within the same network. On the recent
    PKU-MMD dataset, we achieve 93.7% mAP, surpassing the baseline by a large
    margin.

    Towards a quality metric for dense light fields

    Vamsi Kiran Adhikarla, Marek Vinkler, Denis Sumin, Rafał K. Mantiuk
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Light fields have become a popular representation of three-dimensional scenes, and
    there is interest in their processing, resampling, and compression. As those
    operations often result in loss of quality, there is a need to quantify it. In
    this work, we collect a new dataset of dense reference and distorted light
    fields as well as the corresponding quality scores which are scaled in
    perceptual units. The scores were acquired in a subjective experiment using an
    interactive light-field viewing setup. The dataset contains typical artifacts
    that occur in the light-field processing chain due to light-field reconstruction,
    multi-view compression, and limitations of automultiscopic displays. We test a
    number of existing objective quality metrics to determine how well they can
    predict the quality of light fields. We find that the existing image quality
    metrics provide good measures of light-field quality, but require dense
    reference light fields for optimal performance. For more complex tasks of
    comparing two distorted light fields, their performance drops significantly,
    which reveals the need for new, light-field-specific metrics.

    A Labeling-Free Approach to Supervising Deep Neural Networks for Retinal Blood Vessel Segmentation

    Yongliang Chen
    Comments: 10 pages, 8 figures, 2 tables, forbidden work
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Segmenting blood vessels in fundus imaging plays an important role in medical
    diagnosis. Many algorithms have been proposed. While deep neural networks have
    been attracting enormous attention from the computer vision community in
    recent years, and several novel works have applied them to retinal blood
    vessel segmentation, most of these are based on supervised learning, which
    requires large amounts of labeled data that are both scarce and expensive to
    obtain. In this work, we leverage the power of deep convolutional neural
    networks (DCNNs) in feature learning to achieve this goal. The highly
    efficient feature learning of DCNNs inspires our novel approach, which trains
    the networks with automatically generated samples to achieve desirable
    performance on real-world fundus images. For this, we design a set of rules
    abstracted from domain-specific prior knowledge to generate these samples. We
    argue that, with the high efficiency of DCNNs in feature learning, one can
    achieve this goal by constructing the training dataset with prior knowledge,
    so that no manual labeling is needed. This approach allows us to take
    advantage of supervised learning without labeling. We also build a naive DCNN
    model to test it. The results on standard benchmarks of fundus imaging show
    that it is competitive with state-of-the-art methods, which implies a
    potential way to leverage the power of DCNNs in feature learning.

    A Context Aware and Video-Based Risk Descriptor for Cyclists

    Miguel Costa, Beatriz Quintino Ferreira, Manuel Marques
    Comments: Submitted to ITSC2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Aiming to reduce pollutant emissions, bicycles are regaining popularity,
    especially in urban areas. However, the number of cyclists’ fatalities is not
    showing the same decreasing trend as the other traffic groups. Hence,
    monitoring cyclists’ data appears as a keystone to foster urban cyclists’
    safety by helping urban planners to design safer cyclist routes. In this work,
    we propose a fully image-based framework to assess the route risk from the
    cyclist perspective. From smartphone sequences of images, this generic
    framework is able to automatically identify events considering different risk
    criteria based on the cyclist’s motion and object detection. Moreover, since it
    is entirely based on images, our method provides context on the situation and
    is independent from the expertise level of the cyclist. Additionally, we build
    on an existing platform and introduce several improvements on its mobile app to
    acquire smartphone sensor data, including video. From the inertial sensor data,
    we automatically detect the route segments performed by bicycle, applying
    behavior analysis techniques. We test our methods on real data, attaining very
    promising results in terms of risk classification, according to two different
    criteria, and behavior analysis accuracy.

    Can Saliency Information Benefit Image Captioning Models?

    Hamed R. Tavakoli, Rakshith Shetty, Ali Borji, Jorma Laaksonen
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    To bridge the gap between humans and machines in image understanding and
    describing, we need further insight into how people describe a perceived scene.
    In this paper, we study the agreement between bottom-up saliency-based visual
    attention and object referrals in scene description constructs. We investigate
    the properties of human-written descriptions and machine-generated ones. We
    then propose a saliency-boosted image captioning model in order to investigate
    benefits from low-level cues in language models. We learn that (1) humans
    mention more salient objects earlier than less salient ones in their
    descriptions, (2) the better a captioning model performs, the better attention
    agreement it has with human descriptions, (3) the proposed saliency-boosted
    model, compared to its baseline form, does not improve significantly on the MS
    COCO database, indicating explicit bottom-up boosting does not help when the
    task is well learnt and tuned on a dataset, (4) a better generalization ability
    is, however, observed for the saliency-boosted model on unseen data.

    Prominent Object Detection and Recognition: A Saliency-based Pipeline

    Hamed R. Tavakoli, Jorma Laaksonen
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    This manuscript introduces the problem of prominent object detection and
    recognition. The problem deals with finding the most important region of
    interest, segmenting the relevant item/object in that area, and assigning it an
    object class label. In other words, we are solving the three problems of
    saliency modeling, saliency detection, and object recognition under one
    umbrella. The motivation behind such a problem formulation is 1) the benefits
    to the knowledge representation-based vision pipelines, and 2) the potential
    improvements in emulating bio-inspired vision systems by solving these three
    problems together. We propose a method as a baseline for further research. The
    proposed model predicts the most important area in the image, segments the
    associated objects, and labels them. The proposed problem and method are
    evaluated against human fixations, annotated segmentation masks, and object
    class categories. We define a chance level for each of the evaluation criteria
    to compare the proposed algorithm with. Despite the good performance of the
    proposed baseline, the overall evaluations indicate that the problem of
    prominent object detection and recognition is a challenging task that is still
    worth investigating further.

    Sharing deep generative representation for perceived image reconstruction from human brain activity

    Changde Du, Changying Du, Huiguang He
    Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)

    Decoding human brain activities via functional magnetic resonance imaging
    (fMRI) has gained increasing attention in recent years. While encouraging
    results have been reported in brain states classification tasks, reconstructing
    the details of human visual experience still remains difficult. Two main
    challenges that hinder the development of effective models are the perplexing
    fMRI measurement noise and the high dimensionality of limited data instances.
    Existing methods generally suffer from one or both of these issues and yield
    dissatisfactory results. In this paper, we tackle this problem by casting the
    reconstruction of visual stimulus as the Bayesian inference of missing view in
    a multiview latent variable model. Sharing a common latent representation, our
    joint generative model of external stimulus and brain response is not only
    “deep” in extracting nonlinear features from visual images, but also powerful
    in capturing correlations among voxel activities of fMRI recordings. The
    nonlinearity and deep structure endow our model with strong representation
    ability, while the correlations of voxel activities are critical for
    suppressing noise and improving prediction. We devise an efficient variational
    Bayesian method to infer the latent variables and the model parameters. To
    further improve the reconstruction accuracy, the latent representations of
    testing instances are enforced to be close to that of their neighbours from the
    training set via posterior regularization. Experiments on three fMRI recording
    datasets demonstrate that our approach can more accurately reconstruct visual
    stimuli.

    Multi-Task Video Captioning with Video and Entailment Generation

    Ramakanth Pasunuru, Mohit Bansal
    Comments: Accepted at ACL 2017 (13 pages w/ supplementary)
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    Video captioning, the task of describing the content of a video, has seen
    some promising improvements in recent years with sequence-to-sequence models,
    but accurately learning the temporal and logical dynamics involved in the task
    still remains a challenge, especially given the lack of sufficient annotated
    data. We improve video captioning by sharing knowledge with two related
    directed-generation tasks: a temporally-directed unsupervised video prediction
    task to learn richer context-aware video encoder representations, and a
    logically-directed language entailment generation task to learn better
    video-entailing caption decoder representations. For this, we present a
    many-to-many multi-task learning model that shares parameters across the
    encoders and decoders of the three tasks. We achieve significant improvements
    and the new state-of-the-art on several standard video captioning datasets
    using diverse automatic and human evaluations. We also show mutual multi-task
    improvements on the entailment generation task.

    Deeply-Supervised Nets

    Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, Zhuowen Tu
    Comments: Patent disclosure, UCSD Docket No. SD2014-313, filed on May 22, 2014
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Our proposed deeply-supervised nets (DSN) method simultaneously minimizes
    classification error while making the learning process of hidden layers direct
    and transparent. We make an attempt to boost the classification performance by
    studying a new formulation in deep networks. Three aspects of convolutional
    neural network (CNN)-style architectures are examined: (1) transparency
    of the intermediate layers to the overall classification; (2)
    discriminativeness and robustness of learned features, especially in the early
    layers; (3) effectiveness in training due to the presence of the exploding and
    vanishing gradients. We introduce a “companion objective” for each individual
    hidden layer, in addition to the overall objective at the output layer (a
    different strategy to layer-wise pre-training). We extend techniques from
    stochastic gradient methods to analyze our algorithm. The advantage of our
    method is evident and our experimental result on benchmark datasets shows
    significant performance gain over existing methods (e.g. all state-of-the-art
    results on MNIST, CIFAR-10, CIFAR-100, and SVHN).
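
    A minimal PyTorch-style sketch of the companion-objective idea is shown below
    (a toy two-block network, not the architecture used in the paper): each hidden
    block feeds an auxiliary classifier whose loss is added, with a weight, to the
    output-layer loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DSNSketch(nn.Module):
    """Toy network where each hidden block also feeds a 'companion' classifier,
    so intermediate layers receive direct supervision."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.aux1 = nn.Linear(32, num_classes)   # companion classifier on block1
        self.aux2 = nn.Linear(64, num_classes)   # companion classifier on block2
        self.head = nn.Linear(64, num_classes)   # output-layer classifier

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        p1 = self.aux1(h1.mean(dim=(2, 3)))      # global average pooling
        p2 = self.aux2(h2.mean(dim=(2, 3)))
        out = self.head(h2.mean(dim=(2, 3)))
        return out, (p1, p2)

def dsn_loss(out, companions, target, alpha=0.3):
    """Overall objective plus weighted companion objectives (alpha is a
    hypothetical weight, not a value from the paper)."""
    loss = F.cross_entropy(out, target)
    return loss + alpha * sum(F.cross_entropy(c, target) for c in companions)
```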


    Artificial Intelligence

    Taxonomy Induction using Hypernym Subsequences

    Amit Gupta, Rémi Lebret, Hamza Harkous, Karl Aberer
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    We propose a novel, semi-supervised approach towards domain taxonomy
    induction from an input vocabulary of seed terms. Unlike most previous
    approaches, which typically extract direct hypernym edges for terms, our
    approach utilizes a novel probabilistic framework to extract hypernym
    subsequences. Taxonomy induction from extracted subsequences is cast as an
    instance of the minimum-cost flow problem on a carefully designed directed
    graph. Through experiments, we demonstrate that our approach outperforms
    state-of-the-art taxonomy induction approaches across four languages.
    Furthermore, we show that our approach is robust to the presence of noise in
    the input vocabulary.
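
    To make the min-cost-flow formulation concrete, here is a toy example using
    networkx with a hypothetical four-term vocabulary; the actual graph
    construction, capacities, and costs in the paper are considerably more
    involved.

```python
import networkx as nx

# Toy illustration (not the paper's graph): candidate hypernym edges become a
# directed graph, and taxonomy induction is solved as minimum-cost flow.
G = nx.DiGraph()
G.add_node("source", demand=-2)          # supply 2 units of flow (one per seed term)
G.add_node("entity", demand=2)           # taxonomy root absorbs the flow
for term, parent, cost in [("apple", "fruit", 1), ("banana", "fruit", 1),
                           ("fruit", "food", 2), ("food", "entity", 1)]:
    G.add_edge(term, parent, capacity=2, weight=cost)
G.add_edge("source", "apple", capacity=1, weight=0)
G.add_edge("source", "banana", capacity=1, weight=0)

flow = nx.min_cost_flow(G)               # edges carrying flow form the induced taxonomy
print({u: {v: f for v, f in d.items() if f > 0} for u, d in flow.items()})
```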

    Sharing deep generative representation for perceived image reconstruction from human brain activity

    Changde Du, Changying Du, Huiguang He
    Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)

    Decoding human brain activities via functional magnetic resonance imaging
    (fMRI) has gained increasing attention in recent years. While encouraging
    results have been reported in brain states classification tasks, reconstructing
    the details of human visual experience still remains difficult. Two main
    challenges that hinder the development of effective models are the perplexing
    fMRI measurement noise and the high dimensionality of limited data instances.
    Existing methods generally suffer from one or both of these issues and yield
    dissatisfactory results. In this paper, we tackle this problem by casting the
    reconstruction of visual stimulus as the Bayesian inference of missing view in
    a multiview latent variable model. Sharing a common latent representation, our
    joint generative model of external stimulus and brain response is not only
    “deep” in extracting nonlinear features from visual images, but also powerful
    in capturing correlations among voxel activities of fMRI recordings. The
    nonlinearity and deep structure endow our model with strong representation
    ability, while the correlations of voxel activities are critical for
    suppressing noise and improving prediction. We devise an efficient variational
    Bayesian method to infer the latent variables and the model parameters. To
    further improve the reconstruction accuracy, the latent representations of
    testing instances are enforced to be close to that of their neighbours from the
    training set via posterior regularization. Experiments on three fMRI recording
    datasets demonstrate that our approach can more accurately reconstruct visual
    stimuli.

    Molecular De Novo Design through Deep Reinforcement Learning

    Marcus Olivecrona, Thomas Blaschke, Ola Engkvist, Hongming Chen
    Subjects: Artificial Intelligence (cs.AI)

    This work introduces a method to tune a sequence-based generative model for
    molecular de novo design that through augmented episodic likelihood can learn
    to generate structures with certain specified desirable properties. We
    demonstrate how this model can execute a range of tasks such as generating
    analogues to a query structure and generating compounds predicted to be active
    against a biological target. As a proof of principle, the model is first
    trained to generate molecules that do not contain sulphur. As a second example,
    the model is trained to generate analogues to the drug Celecoxib, a technique
    that could be used for scaffold hopping or library expansion starting from a
    single molecule. Finally, when tuning the model towards generating compounds
    predicted to be active against the dopamine receptor D2, the model generates
    structures of which more than 95% are predicted to be active, including
    experimentally confirmed actives that have not been included in either the
    generative model or the activity prediction model.

    Semi-supervised Bayesian Deep Multi-modal Emotion Recognition

    Changde Du, Changying Du, Jinpeng Li, Wei-long Zheng, Bao-liang Lu, Huiguang He
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    In emotion recognition, it is difficult to recognize human emotional states
    using just a single modality. Moreover, the annotation of physiological
    emotional data is particularly expensive. These two aspects make building an
    effective emotion recognition model challenging. In this paper, we first
    build a multi-view deep generative model to simulate the generative process of
    multi-modality emotional data. By imposing a mixture of Gaussians assumption on
    the posterior approximation of the latent variables, our model can learn the
    shared deep representation from multiple modalities. To solve the
    labeled-data-scarcity problem, we further extend our multi-view model to
    semi-supervised learning scenario by casting the semi-supervised classification
    problem as a specialized missing data imputation task. Our semi-supervised
    multi-view deep generative framework can leverage both labeled and unlabeled
    data from multiple modalities, where the weight factor for each modality can be
    learned automatically. Compared with previous emotion recognition methods, our
    method is more robust and flexible. The experiments conducted on two real
    multi-modal emotion datasets have demonstrated the superiority of our framework
    over a number of competitors.

    Path Planning with Kinematic Constraints for Robot Groups

    Wolfgang Hönig, T. K. Satish Kumar, Liron Cohen, Hang Ma, Sven Koenig, Nora Ayanian
    Comments: Published in Southern California Robotics Symposium 2016
    Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO)

    Path planning for multiple robots is well studied in the AI and robotics
    communities. For a given discretized environment, robots need to find
    collision-free paths to a set of specified goal locations. Robots can be fully
    anonymous, non-anonymous, or organized in groups. Although powerful solvers for
    this abstract problem exist, they make simplifying assumptions by ignoring
    kinematic constraints, making it difficult to use the resulting plans on actual
    robots. In this paper, we present a solution which takes kinematic constraints,
    such as maximum velocities, into account, while guaranteeing a user-specified
    minimum safety distance between robots. We demonstrate our approach in
    simulation and on real robots in 2D and 3D environments.

    Learning of Human-like Algebraic Reasoning Using Deep Feedforward Neural Networks

    Cheng-Hao Cai, Dengfeng Ke, Yanyan Xu, Kaile Su
    Comments: 8 pages, 7 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Logic in Computer Science (cs.LO)

    There is a wide gap between symbolic reasoning and deep learning. In this
    research, we explore the possibility of using deep learning to improve symbolic
    reasoning. Briefly, in a reasoning system, a deep feedforward neural network is
    used to guide rewriting processes after learning from algebraic reasoning
    examples produced by humans. To enable the neural network to recognise patterns
    of algebraic expressions with non-deterministic sizes, reduced partial trees
    are used to represent the expressions. Also, to represent both top-down and
    bottom-up information of the expressions, a centralisation technique is used to
    improve the reduced partial trees. Besides, symbolic association vectors and
    rule application records are used to improve the rewriting processes.
    Experimental results reveal that the algebraic reasoning examples can be
    accurately learnt only if the feedforward neural network has enough hidden
    layers. Also, the centralisation technique, the symbolic association vectors
    and the rule application records can reduce error rates of reasoning. In
    particular, the above approaches have led to a 4.6% error rate of reasoning on a
    dataset of linear equations, differentials and integrals.

    Leveraging Patient Similarity and Time Series Data in Healthcare Predictive Models

    Amin Morid, Olivia R. Liu Sheng, Samir Abdelrahman
    Comments: 10 pages, Proc. of 7th Annual Workshop on Health IT and Economics (WHITE) (Eds.). Washington, DC., 2016
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Patient time series classification faces challenges in high degrees of
    dimensionality and missingness. In light of patient similarity theory, this
    study explores effective temporal feature engineering and reduction, missing
    value imputation, and change point detection methods that can afford
    similarity-based classification models with desirable accuracy enhancement. We
    select a piecewise aggregation approximation method to extract fine-grain
    temporal features and propose a minimalist method to impute missing values in
    temporal features. For dimensionality reduction, we adopt a gradient descent
    search method for feature weight assignment. We propose new patient status and
    directional change definitions based on medical knowledge or clinical
    guidelines about the value ranges for different patient status levels, and
    develop a method to detect change points indicating positive or negative
    patient status changes. We evaluate the effectiveness of the proposed methods
    in the context of early Intensive Care Unit mortality prediction. The
    evaluation results show that the k-Nearest Neighbor algorithm that incorporates
    methods we select and propose significantly outperforms the relevant benchmarks
    for early ICU mortality prediction. This study makes contributions to time
    series classification and early ICU mortality prediction via identifying and
    enhancing temporal feature engineering and reduction methods for
    similarity-based time series classification.
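
    The piecewise aggregation approximation step mentioned above reduces a long,
    noisy time series to a handful of segment means; a minimal sketch (ignoring
    the study's imputation, weighting, and change-point steps) looks like this:

```python
import numpy as np

def paa(series, n_segments):
    """Piecewise Aggregate Approximation: split a patient's time series into
    (near) equal-length segments and represent each segment by its mean."""
    series = np.asarray(series, dtype=float)
    return np.array([seg.mean() for seg in np.array_split(series, n_segments)])

# Example: paa([5, 6, 7, 9, 9, 8, 4, 3], 4) -> array([5.5, 8. , 8.5, 3.5])
```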

    Learning from Ontology Streams with Semantic Concept Drift

    Freddy Lecue, Jiaoyan Chen, Jeff Pan, Huajun Chen
    Subjects: Artificial Intelligence (cs.AI)

    Data stream learning has been largely studied for extracting knowledge
    structures from continuous and rapid data records. In the semantic Web, data is
    interpreted in ontologies and its ordered sequence is represented as an
    ontology stream. Our work exploits the semantics of such streams to tackle the
    problem of concept drift, i.e., unexpected changes in data distribution that cause
    most models to become less accurate as time passes. To this end we revisited (i)
    semantic inference in the context of supervised stream learning, and (ii)
    models with semantic embeddings. The experiments show accurate prediction with
    data from Dublin and Beijing.

    Fine-Grained Entity Typing with High-Multiplicity Assignments

    Maxim Rabinovich, Dan Klein
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

    As entity type systems become richer and more fine-grained, we expect the
    number of types assigned to a given entity to increase. However, most
    fine-grained typing work has focused on datasets that exhibit a low degree of
    type multiplicity. In this paper, we consider the high-multiplicity regime
    inherent in data sources such as Wikipedia that have semi-open type systems. We
    introduce a set-prediction approach to this problem and show that our model
    outperforms unstructured baselines on a new Wikipedia-based fine-grained typing
    corpus.

    280 Birds with One Stone: Inducing Multilingual Taxonomies from Wikipedia using Character-level Classification

    Amit Gupta, Rémi Lebret, Hamza Harkous, Karl Aberer
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

    We propose a simple, yet effective, approach towards inducing multilingual
    taxonomies from Wikipedia. Given an English taxonomy, our approach leverages
    the interlanguage links of Wikipedia followed by character-level classifiers to
    induce high-precision, high-coverage taxonomies in other languages. Through
    experiments, we demonstrate that our approach significantly outperforms the
    state-of-the-art, heuristics-heavy approaches for six languages. As a
    consequence of our work, we release presumably the largest and the most
    accurate multilingual taxonomic resource spanning over 280 languages.

    Abstract Syntax Networks for Code Generation and Semantic Parsing

    Maxim Rabinovich, Mitchell Stern, Dan Klein
    Comments: ACL 2017. MR and MS contributed equally
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Tasks like code generation and semantic parsing require mapping unstructured
    (or partially structured) inputs to well-formed, executable outputs. We
    introduce abstract syntax networks, a modeling framework for these problems.
    The outputs are represented as abstract syntax trees (ASTs) and constructed by
    a decoder with a dynamically-determined modular structure paralleling the
    structure of the output tree. On the benchmark Hearthstone dataset for code
    generation, our model obtains 79.2 BLEU and 22.7% exact match accuracy,
    compared to previous state-of-the-art values of 67.1 and 6.1%. Furthermore, we
    perform competitively on the Atis, Jobs, and Geo semantic parsing datasets with
    no task-specific engineering.

    Multi-Task Video Captioning with Video and Entailment Generation

    Ramakanth Pasunuru, Mohit Bansal
    Comments: Accepted at ACL 2017 (13 pages w/ supplementary)
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    Video captioning, the task of describing the content of a video, has seen
    some promising improvements in recent years with sequence-to-sequence models,
    but accurately learning the temporal and logical dynamics involved in the task
    still remains a challenge, especially given the lack of sufficient annotated
    data. We improve video captioning by sharing knowledge with two related
    directed-generation tasks: a temporally-directed unsupervised video prediction
    task to learn richer context-aware video encoder representations, and a
    logically-directed language entailment generation task to learn better
    video-entailing caption decoder representations. For this, we present a
    many-to-many multi-task learning model that shares parameters across the
    encoders and decoders of the three tasks. We achieve significant improvements
    and the new state-of-the-art on several standard video captioning datasets
    using diverse automatic and human evaluations. We also show mutual multi-task
    improvements on the entailment generation task.

    GaKCo: a Fast GApped k-mer string Kernel using COunting

    Ritambhara Singh, Arshdeep Sekhon, Kamran Kowsari, Jack Lanchantin, Beilun Wang, Yanjun Qi
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC); Computation and Language (cs.CL); Data Structures and Algorithms (cs.DS)

    String Kernel (SK) techniques, especially those using gapped k-mers as
    features (gk), have obtained great success in classifying sequences like DNA,
    protein, and text. However, the state-of-the-art gk-SK runs extremely slowly
    when we increase the dictionary size (Σ) or allow more mismatches (M). This
    is because current gk-SK uses a trie-based algorithm to calculate the
    co-occurrence of mismatched substrings, resulting in a time cost proportional
    to O(Σ^M). We propose a fast algorithm for calculating the Gapped k-mer Kernel
    using Counting (GaKCo). GaKCo uses associative arrays to calculate the
    co-occurrence of substrings using cumulative counting. This algorithm is fast,
    scalable to larger Σ and M, and naturally parallelizable. We provide a rigorous
    asymptotic analysis that compares GaKCo with the state-of-the-art gk-SK.
    Theoretically, the time cost of GaKCo is independent of the Σ^M term that
    slows down the trie-based approach. Experimentally, we observe that GaKCo
    achieves the same accuracy as the state-of-the-art and outperforms its speed
    by factors of 2, 100, and 4 on classifying sequences of DNA (5 datasets),
    protein (12 datasets), and character-based English text (2 datasets),
    respectively.

    GaKCo is shared as an open source tool at this https URL.
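
    A drastically simplified counting-based gapped k-mer kernel can be sketched as
    below (enumerate gap-position patterns per k-mer, count them in an associative
    array, and take a dot product). The real GaKCo algorithm uses a more refined
    cumulative-counting scheme and is far more efficient; this only illustrates
    the feature space.

```python
from collections import Counter
from itertools import combinations

def gapped_kmer_counts(seq, k=5, g=2):
    """Toy feature map: for every length-k substring, mark g positions as gaps
    ('_' wildcards) and count the resulting patterns in a dictionary."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        for gaps in combinations(range(k), g):
            pattern = ''.join('_' if j in gaps else c for j, c in enumerate(kmer))
            counts[pattern] += 1
    return counts

def gapped_kmer_kernel(seq_a, seq_b, k=5, g=2):
    """Kernel value = dot product of the two gapped k-mer count vectors."""
    ca, cb = gapped_kmer_counts(seq_a, k, g), gapped_kmer_counts(seq_b, k, g)
    return sum(v * cb.get(p, 0) for p, v in ca.items())
```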

    Can Saliency Information Benefit Image Captioning Models?

    Hamed R. Tavakoli, Rakshith Shetty, Ali Borji, Jorma Laaksonen
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    To bridge the gap between humans and machines in image understanding and
    describing, we need further insight into how people describe a perceived scene.
    In this paper, we study the agreement between bottom-up saliency-based visual
    attention and object referrals in scene description constructs. We investigate
    the properties of human-written descriptions and machine-generated ones. We
    then propose a saliency-boosted image captioning model in order to investigate
    benefits from low-level cues in language models. We learn that (1) humans
    mention more salient objects earlier than less salient ones in their
    descriptions, (2) the better a captioning model performs, the better attention
    agreement it has with human descriptions, (3) the proposed saliency-boosted
    model, compared to its baseline form, does not improve significantly on the MS
    COCO database, indicating explicit bottom-up boosting does not help when the
    task is well learnt and tuned on a dataset, (4) a better generalization ability
    is, however, observed for the saliency-boosted model on unseen data.

    Prominent Object Detection and Recognition: A Saliency-based Pipeline

    Hamed R. Tavakoli, Jorma Laaksonen
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    This manuscript introduces the problem of prominent object detection and
    recognition. The problem deals with finding the most important region of
    interest, segmenting the relevant item/object in that area, and assigning it an
    object class label. In other words, we are solving the three problems of
    saliency modeling, saliency detection, and object recognition under one
    umbrella. The motivation behind such a problem formulation is 1) the benefits
    to the knowledge representation-based vision pipelines, and 2) the potential
    improvements in emulating bio-inspired vision systems by solving these three
    problems together. We propose a method as a baseline for further research. The
    proposed model predicts the most important area in the image, segments the
    associated objects, and labels them. The proposed problem and method are
    evaluated against human fixations, annotated segmentation masks, and object
    class categories. We define a chance level for each of the evaluation criteria
    to compare the proposed algorithm with. Despite the good performance of the
    proposed baseline, the overall evaluations indicate that the problem of
    prominent object detection and recognition is a challenging task that is still
    worth investigating further.


    Information Retrieval

    User Profile Based Research Paper Recommendation

    Harshita Sahijwani, Sourish Dasgupta
    Comments: Work in progress. arXiv admin note: text overlap with arXiv:1611.04822
    Subjects: Information Retrieval (cs.IR)

    We design a recommender system for research papers based on topic-modeling.
    The user’s feedback on the results is used to make the results more relevant the
    next time they issue a query. The user’s needs are understood by observing the
    change in the themes that the user shows a preference for over time.

    Fine-Grained Entity Typing with High-Multiplicity Assignments

    Maxim Rabinovich, Dan Klein
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

    As entity type systems become richer and more fine-grained, we expect the
    number of types assigned to a given entity to increase. However, most
    fine-grained typing work has focused on datasets that exhibit a low degree of
    type multiplicity. In this paper, we consider the high-multiplicity regime
    inherent in data sources such as Wikipedia that have semi-open type systems. We
    introduce a set-prediction approach to this problem and show that our model
    outperforms unstructured baselines on a new Wikipedia-based fine-grained typing
    corpus.

    Taxonomy Induction using Hypernym Subsequences

    Amit Gupta, Rémi Lebret, Hamza Harkous, Karl Aberer
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    We propose a novel, semi-supervised approach towards domain taxonomy
    induction from an input vocabulary of seed terms. Unlike most previous
    approaches, which typically extract direct hypernym edges for terms, our
    approach utilizes a novel probabilistic framework to extract hypernym
    subsequences. Taxonomy induction from extracted subsequences is cast as an
    instance of the minimum-cost flow problem on a carefully designed directed
    graph. Through experiments, we demonstrate that our approach outperforms
    state-of-the-art taxonomy induction approaches across four languages.
    Furthermore, we show that our approach is robust to the presence of noise in
    the input vocabulary.

    280 Birds with One Stone: Inducing Multilingual Taxonomies from Wikipedia using Character-level Classification

    Amit Gupta, Rémi Lebret, Hamza Harkous, Karl Aberer
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

    We propose a simple, yet effective, approach towards inducing multilingual
    taxonomies from Wikipedia. Given an English taxonomy, our approach leverages
    the interlanguage links of Wikipedia followed by character-level classifiers to
    induce high-precision, high-coverage taxonomies in other languages. Through
    experiments, we demonstrate that our approach significantly outperforms the
    state-of-the-art, heuristics-heavy approaches for six languages. As a
    consequence of our work, we release presumably the largest and the most
    accurate multilingual taxonomic resource spanning over 280 languages.


    Computation and Language

    Fine-Grained Entity Typing with High-Multiplicity Assignments

    Maxim Rabinovich, Dan Klein
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

    As entity type systems become richer and more fine-grained, we expect the
    number of types assigned to a given entity to increase. However, most
    fine-grained typing work has focused on datasets that exhibit a low degree of
    type multiplicity. In this paper, we consider the high-multiplicity regime
    inherent in data sources such as Wikipedia that have semi-open type systems. We
    introduce a set-prediction approach to this problem and show that our model
    outperforms unstructured baselines on a new Wikipedia-based fine-grained typing
    corpus.

    280 Birds with One Stone: Inducing Multilingual Taxonomies from Wikipedia using Character-level Classification

    Amit Gupta, Rémi Lebret, Hamza Harkous, Karl Aberer
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

    We propose a simple, yet effective, approach towards inducing multilingual
    taxonomies from Wikipedia. Given an English taxonomy, our approach leverages
    the interlanguage links of Wikipedia followed by character-level classifiers to
    induce high-precision, high-coverage taxonomies in other languages. Through
    experiments, we demonstrate that our approach significantly outperforms the
    state-of-the-art, heuristics-heavy approaches for six languages. As a
    consequence of our work, we release what is presumably the largest and most
    accurate multilingual taxonomic resource, spanning over 280 languages.

    Joint POS Tagging and Dependency Parsing with Transition-based Neural Networks

    Liner Yang, Meishan Zhang, Yang Liu, Nan Yu, Maosong Sun, Guohong Fu
    Subjects: Computation and Language (cs.CL)

    While part-of-speech (POS) tagging and dependency parsing are observed to be
    closely related, existing work on joint modeling with manually crafted feature
    templates suffers from the feature sparsity and incompleteness problems. In
    this paper, we propose an approach to joint POS tagging and dependency parsing
    using transition-based neural networks. Three neural network based classifiers
    are designed to resolve shift/reduce, tagging, and labeling conflicts.
    Experiments show that our approach significantly outperforms previous methods
    for joint POS tagging and dependency parsing across a variety of natural
    languages.

    Adversarial Multi-Criteria Learning for Chinese Word Segmentation

    Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang
    Subjects: Computation and Language (cs.CL)

    Different linguistic perspectives cause many diverse segmentation criteria
    for Chinese word segmentation (CWS). Most existing methods focus on improving
    the performance for each single criterion. However, it is interesting to
    exploit these different criteria and mine their common underlying knowledge.
    In this paper, we propose adversarial multi-criteria learning for CWS by
    integrating shared knowledge from multiple heterogeneous segmentation
    criteria. Experiments on eight corpora with heterogeneous segmentation
    criteria show that performance on each corpus improves significantly compared
    to single-criterion learning. The source code of this paper is available on
    GitHub.

    Abstract Syntax Networks for Code Generation and Semantic Parsing

    Maxim Rabinovich, Mitchell Stern, Dan Klein
    Comments: ACL 2017. MR and MS contributed equally
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Tasks like code generation and semantic parsing require mapping unstructured
    (or partially structured) inputs to well-formed, executable outputs. We
    introduce abstract syntax networks, a modeling framework for these problems.
    The outputs are represented as abstract syntax trees (ASTs) and constructed by
    a decoder with a dynamically-determined modular structure paralleling the
    structure of the output tree. On the benchmark Hearthstone dataset for code
    generation, our model obtains 79.2 BLEU and 22.7% exact match accuracy,
    compared to previous state-of-the-art values of 67.1 and 6.1%. Furthermore, we
    perform competitively on the Atis, Jobs, and Geo semantic parsing datasets with
    no task-specific engineering.

    Multi-Task Video Captioning with Video and Entailment Generation

    Ramakanth Pasunuru, Mohit Bansal
    Comments: Accepted at ACL 2017 (13 pages w/ supplementary)
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    Video captioning, the task of describing the content of a video, has seen
    some promising improvements in recent years with sequence-to-sequence models,
    but accurately learning the temporal and logical dynamics involved in the task
    still remains a challenge, especially given the lack of sufficient annotated
    data. We improve video captioning by sharing knowledge with two related
    directed-generation tasks: a temporally-directed unsupervised video prediction
    task to learn richer context-aware video encoder representations, and a
    logically-directed language entailment generation task to learn better
    video-entailing caption decoder representations. For this, we present a
    many-to-many multi-task learning model that shares parameters across the
    encoders and decoders of the three tasks. We achieve significant improvements
    and the new state-of-the-art on several standard video captioning datasets
    using diverse automatic and human evaluations. We also show mutual multi-task
    improvements on the entailment generation task.

    Streaming Word Embeddings with the Space-Saving Algorithm

    Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall
    Comments: 16 pages
    Subjects: Computation and Language (cs.CL)

    We develop a streaming (one-pass, bounded-memory) word embedding algorithm
    based on the canonical skip-gram with negative sampling algorithm implemented
    in word2vec. We compare our streaming algorithm to word2vec empirically by
    measuring the cosine similarity between word pairs under each algorithm and by
    applying each algorithm in the downstream task of hashtag prediction on a
    two-month interval of the Twitter sample stream. We then discuss the results of
    these experiments, concluding they provide partial validation of our approach
    as a streaming replacement for word2vec. Finally, we discuss potential failure
    modes and suggest directions for future work.
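
    As a point of reference for readers unfamiliar with the Space-Saving summary
    the title refers to, the following is a minimal, hypothetical Python sketch of
    that fixed-size counter (not the authors' embedding code): it tracks
    approximate token frequencies in one pass with bounded memory, which is the
    ingredient that allows a skip-gram vocabulary to be maintained over a stream.

        class SpaceSaving:
            """Fixed-size approximate frequency counter (Space-Saving summary)."""

            def __init__(self, capacity):
                self.capacity = capacity
                self.counts = {}  # token -> (count, overestimation error)

            def update(self, token):
                if token in self.counts:
                    count, err = self.counts[token]
                    self.counts[token] = (count + 1, err)
                elif len(self.counts) < self.capacity:
                    self.counts[token] = (1, 0)
                else:
                    # Evict the current minimum; the new token inherits its count.
                    victim = min(self.counts, key=lambda t: self.counts[t][0])
                    min_count, _ = self.counts.pop(victim)
                    self.counts[token] = (min_count + 1, min_count)

            def heavy_hitters(self):
                return sorted(self.counts.items(), key=lambda kv: -kv[1][0])

        ss = SpaceSaving(capacity=5)
        for tok in "the cat sat on the mat while the dog sat on the cat".split():
            ss.update(tok)
        print(ss.heavy_hitters())  # approximate top tokens with overestimation bounds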

    Detecting English Writing Styles For Non Native Speakers

    Yanqing Chen, Rami Al-Rfou', Yejin Choi
    Comments: 9 figures, 5 tables, 9 pages
    Subjects: Computation and Language (cs.CL)

    This paper presents the first attempt, to our knowledge, to classify English
    writing styles at this scale, with the challenge of classifying day-to-day
    language written by writers with different backgrounds covering various topic
    areas. The paper proposes simple machine learning algorithms and
    easy-to-generate features to solve this hard problem, relying on the scale of
    the data available from large knowledge sources such as Wikipedia. We believe
    such data sources are crucial for building robust, accurate solutions for the
    web that are easy to deploy in practice. The paper achieves 74% accuracy in
    classifying native versus non-native speakers' writing styles.

    Moreover, the paper presents some interesting observations on the similarity
    between different languages, measured by the similarity of their users'
    English writing styles. This technique could be used to confirm some
    well-known facts about languages, such as grouping them into families, which
    our experiments support.

    A Challenge Set Approach to Evaluating Machine Translation

    Pierre Isabelle, Colin Cherry, George Foster
    Comments: 26 pages, including appendix
    Subjects: Computation and Language (cs.CL)

    Neural machine translation represents an exciting leap forward in translation
    quality. But what longstanding weaknesses does it resolve, and which remain? We
    address these questions with a challenge set approach to translation evaluation
    and error analysis. A challenge set consists of a small set of sentences, each
    hand-designed to probe a system’s capacity to bridge a particular structural
    divergence between languages. To exemplify this approach, we present an
    English-French challenge set, and use it to analyze phrase-based and neural
    systems. The resulting analysis provides not only a more fine-grained picture
    of the strengths of neural systems, but also insight into which linguistic
    phenomena remain out of reach.

    Recognizing Descriptive Wikipedia Categories for Historical Figures

    Yanqing Chen, Steven Skiena
    Comments: 9 pages, 6 tables, 5 figures
    Subjects: Computation and Language (cs.CL)

    Wikipedia is a useful knowledge source that benefits many applications in
    language processing and knowledge representation. An important feature of
    Wikipedia is that of categories. Wikipedia pages are assigned different
    categories according to their contents as human-annotated labels which can be
    used in information retrieval, ad hoc search improvements, entity ranking and
    tag recommendations. However, important pages are usually assigned too many
    categories, which makes it difficult to recognize the most important ones that
    give the best descriptions.

    In this paper, we propose an approach to recognize the most descriptive
    Wikipedia categories. We observe that historical figures in a precise category
    presumably are mutually similar and such categorical coherence could be
    evaluated via texts or Wikipedia links of corresponding members in the
    category. We rank the descriptiveness of Wikipedia categories according to
    their coherence, and our ranking yields an overall agreement of 88.27% with
    human wisdom.

    Ruminating Reader: Reasoning with Gated Multi-Hop Attention

    Yichen Gong, Samuel R. Bowman
    Comments: 10 pages, 6 figures
    Subjects: Computation and Language (cs.CL)

    To answer the question in a machine comprehension (MC) task, models need to
    establish the interaction between the question and the context. To tackle the
    problem that a single-pass model cannot reflect on and correct its answer, we
    present the Ruminating Reader. The Ruminating Reader adds a second pass of
    attention and a novel information fusion component to the Bi-Directional
    Attention Flow model (BiDAF). We propose novel layer structures that construct
    a query-aware context vector representation and fuse the encoding
    representation with the intermediate representation on top of the BiDAF
    model. We show that a multi-hop
    attention mechanism can be applied to a bi-directional attention structure. In
    experiments on SQuAD, we find that the Reader outperforms the BiDAF baseline by
    a substantial margin, and matches or surpasses the performance of all other
    published systems.

    Predicting Native Language from Gaze

    Yevgeni Berzak, Chie Nakamura, Suzanne Flynn, Boris Katz
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL)

    A fundamental question in language learning concerns the role of a speaker’s
    first language in second language acquisition. We present a novel methodology
    for studying this question: analysis of eye-movement patterns in second
    language reading of free-form text. Using this methodology, we demonstrate for
    the first time that the native language of English learners can be predicted
    from their gaze fixations when reading English. We provide analysis of
    classifier uncertainty and learned features, which indicates that differences
    in English reading are likely to be rooted in linguistic divergences across
    native languages. The presented framework complements production studies and
    offers new ground for advancing research on multilingualism.

    Email Babel: Does Language Affect Criminal Activity in Compromised Webmail Accounts?

    Emeric Bernard-Jones, Jeremiah Onaolapo, Gianluca Stringhini
    Subjects: Computers and Society (cs.CY); Computation and Language (cs.CL)

    We set out to understand the effects of differing language on the ability of
    cybercriminals to navigate webmail accounts and locate sensitive information in
    them. To this end, we configured thirty Gmail honeypot accounts with English,
    Romanian, and Greek language settings. We populated the accounts with email
    messages in those languages by subscribing them to selected online newsletters.
    We hid email messages about fake bank accounts in fifteen of the accounts to
    mimic real-world webmail users that sometimes store sensitive information in
    their accounts. We then leaked credentials to the honey accounts via paste
    sites on the Surface Web and the Dark Web, and collected data for fifteen days.
    Our statistical analyses on the data show that cybercriminals are more likely
    to discover sensitive information (bank account information) in the Greek
    accounts than the remaining accounts, contrary to the expectation that Greek
    ought to constitute a barrier to the understanding of non-Greek visitors to the
    Greek accounts. We also extracted the important words among the emails that
    cybercriminals accessed (as an approximation of the keywords that they searched
    for within the honey accounts), and found that financial terms featured among
    the top words. In summary, we show that language plays a significant role in
    the ability of cybercriminals to access sensitive information hidden in
    compromised webmail accounts.

    DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning

    Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, Sunghun Kim
    Comments: Accepted at IJCAI 2017 (The 26th International Joint Conference on Artificial Intelligence)
    Subjects: Software Engineering (cs.SE); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)

    Computer programs written in one language are often required to be ported to
    other languages to support multiple devices and environments. When programs use
    language specific APIs (Application Programming Interfaces), it is very
    challenging to migrate these APIs to the corresponding APIs written in other
    languages. Existing approaches mine API mappings from projects that have
    corresponding versions in two languages. They rely on the sparse availability
    of bilingual projects, thus producing a limited number of API mappings. In this
    paper, we propose an intelligent system called DeepAM for automatically mining
    API mappings from a large-scale code corpus without bilingual projects. The key
    component of DeepAM is based on the multimodal sequence to sequence learning
    architecture that aims to learn joint semantic representations of bilingual API
    sequences from big source code data. Experimental results indicate that DeepAM
    significantly increases the accuracy of API mappings as well as the number of
    API mappings, when compared with the state-of-the-art approaches.

    Taxonomy Induction using Hypernym Subsequences

    Amit Gupta, Rémi Lebret, Hamza Harkous, Karl Aberer
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    We propose a novel, semi-supervised approach towards domain taxonomy
    induction from an input vocabulary of seed terms. Unlike most previous
    approaches, which typically extract direct hypernym edges for terms, our
    approach utilizes a novel probabilistic framework to extract hypernym
    subsequences. Taxonomy induction from extracted subsequences is cast as an
    instance of the minimum-cost flow problem on a carefully designed directed
    graph. Through experiments, we demonstrate that our approach outperforms
    state-of-the-art taxonomy induction approaches across four languages.
    Furthermore, we show that our approach is robust to the presence of noise in
    the input vocabulary.

    GaKCo: a Fast GApped k-mer string Kernel using COunting

    Ritambhara Singh, Arshdeep Sekhon, Kamran Kowsari, Jack Lanchantin, Beilun Wang, Yanjun Qi
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC); Computation and Language (cs.CL); Data Structures and Algorithms (cs.DS)

    String Kernel (SK) techniques, especially those using gapped k-mers as
    features (gk), have obtained great success in classifying sequences like DNA,
    protein, and text. However, the state-of-the-art gk-SK runs extremely slowly
    when we increase the dictionary size (Σ) or allow more mismatches (M). This
    is because current gk-SK uses a trie-based algorithm to calculate the
    co-occurrence of mismatched substrings, resulting in a time cost proportional
    to O(Σ^M). We propose a fast algorithm for calculating the Gapped k-mer
    Kernel using Counting (GaKCo). GaKCo uses associative arrays to calculate the
    co-occurrence of substrings using cumulative counting. This algorithm is
    fast, scalable to larger Σ and M, and naturally parallelizable. We provide a
    rigorous asymptotic analysis that compares GaKCo with the state-of-the-art
    gk-SK. Theoretically, the time cost of GaKCo is independent of the Σ^M term
    that slows down the trie-based approach. Experimentally, we observe that
    GaKCo achieves the same accuracy as the state-of-the-art and outperforms its
    speed by factors of 2, 100, and 4 on classifying sequences of DNA (5
    datasets), protein (12 datasets), and character-based English text (2
    datasets), respectively.

    GaKCo is shared as an open source tool at this https URL.
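
    As a rough illustration of the feature space involved (not GaKCo's fast
    cumulative-counting algorithm), the hypothetical sketch below enumerates
    gapped k-mers with a plain associative array and computes the kernel value as
    a dot product of count vectors; GaKCo's contribution is computing these
    statistics far more efficiently.

        from collections import Counter
        from itertools import combinations

        def gapped_kmer_counts(seq, k, m):
            """Count length-k substrings with m positions masked out ("gapped
            k-mers"), stored in an associative array instead of a trie."""
            counts = Counter()
            for i in range(len(seq) - k + 1):
                kmer = seq[i:i + k]
                for gaps in combinations(range(k), m):
                    feature = tuple(c if j not in gaps else "_"
                                    for j, c in enumerate(kmer))
                    counts[feature] += 1
            return counts

        def gapped_kmer_kernel(s1, s2, k=5, m=2):
            """Kernel value = dot product of the two gapped k-mer count vectors."""
            c1, c2 = gapped_kmer_counts(s1, k, m), gapped_kmer_counts(s2, k, m)
            return sum(v * c2[f] for f, v in c1.items())

        print(gapped_kmer_kernel("ACGTACGTAC", "ACGTTCGTAC"))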


    Distributed, Parallel, and Cluster Computing

    Fast Space Optimal Leader Election in Population Protocols

    Leszek Gasieniec, Grzegorz Stachowiak
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The model of population protocols refers to a theoretical framework, growing
    in popularity, that is suitable for studying pairwise interactions within a
    large collection of simple, indistinguishable entities, frequently called
    agents. In this paper the emphasis is on the space complexity of fast leader
    election via population protocols governed by the random scheduler, which
    uniformly at random selects pairwise interactions within the population of n
    agents.

    The main result of this paper is a new fast and space optimal leader election
    protocol. The new protocol utilises O(log^2 n) parallel time (which is
    equivalent to O(n log^2 n) sequential pairwise interactions), and each agent
    operates on O(log log n) states. This double logarithmic space usage matches
    asymptotically the lower bound 1/2 log log n on the minimal number of states
    required by agents in any leader election algorithm with the running time
    o(n/polylog n).

    Our solution takes advantage of the concept of phase clocks, a fundamental
    synchronisation and coordination tool in distributed computing. We propose a
    new fast and robust population protocol for initialisation of phase clocks to
    be run simultaneously in multiple modes and intertwined with the leader
    election process. We also provide the reader with the relevant formal
    argumentation indicating that our solution is always correct, and fast with
    high probability.

    A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates

    Zhi Li, Wei Shi, Ming Yan
    Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)

    This paper considers the problem of decentralized optimization with a
    composite objective containing smooth and non-smooth terms. To solve the
    problem, a proximal-gradient scheme is studied. Specifically, the smooth and
    nonsmooth terms are dealt with by gradient update and proximal update,
    respectively. The studied algorithm is closely related to a previous
    decentralized optimization algorithm, PG-EXTRA [37], but has a few advantages.
    First of all, in our new scheme, agents use uncoordinated step-sizes and the
    stable upper bounds on step-sizes are independent of network topology. The
    step-sizes depend on local objective functions, and they can be as large as
    that of the gradient descent. Secondly, for the special case without non-smooth
    terms, linear convergence can be achieved under the strong convexity
    assumption. The dependence of the convergence rate on the objective functions
    and the network are separated, and the convergence rate of our new scheme is as
    good as one of the two convergence rates that match the typical rates for the
    general gradient descent and the consensus averaging. We also provide some
    numerical experiments to demonstrate the efficacy of the introduced algorithms
    and validate our theoretical discoveries.
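
    For intuition only, here is a hypothetical decentralized proximal-gradient
    sketch on a toy l1-regularized least-squares problem. It follows the pattern
    the abstract describes (mix with neighbours, local gradient step on the smooth
    term, proximal step on the non-smooth term, purely local step-sizes), but it
    is a generic diffusion-style scheme, not the paper's algorithm, and it only
    reaches a neighbourhood of the optimum under constant step-sizes.

        import numpy as np

        def soft_threshold(x, tau):
            return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

        # n agents jointly minimize sum_i 0.5*||A_i x - b_i||^2 + lam*||x||_1.
        rng = np.random.default_rng(0)
        n_agents, dim = 4, 10
        x_true = np.zeros(dim); x_true[:3] = [1.0, -2.0, 0.5]
        A = [rng.standard_normal((20, dim)) for _ in range(n_agents)]
        b = [Ai @ x_true + 0.01 * rng.standard_normal(20) for Ai in A]
        lam = 0.1

        # Ring-topology mixing matrix (doubly stochastic).
        W = np.zeros((n_agents, n_agents))
        for i in range(n_agents):
            W[i, i] = 0.5
            W[i, (i - 1) % n_agents] = 0.25
            W[i, (i + 1) % n_agents] = 0.25

        X = np.zeros((n_agents, dim))
        # Uncoordinated step-sizes: each agent uses 1 / L_i of its own smooth term.
        steps = [1.0 / np.linalg.norm(Ai.T @ Ai, 2) for Ai in A]
        for _ in range(500):
            mixed = W @ X                      # exchange iterates with neighbours
            for i in range(n_agents):
                grad = A[i].T @ (A[i] @ X[i] - b[i])
                X[i] = soft_threshold(mixed[i] - steps[i] * grad,
                                      steps[i] * lam / n_agents)
        print(np.round(X.mean(axis=0), 2))     # roughly recovers x_true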


    Learning

    FWDA: a Fast Wishart Discriminant Analysis with its Application to Electronic Health Records Data Classification

    Haoyi Xiong, Wei Cheng, Wenqing Hu, Jiang Bian, Zhishan Guo
    Subjects: Learning (cs.LG)

    Linear Discriminant Analysis (LDA) on Electronic Health Records (EHR) data is
    widely-used for early detection of diseases. Classical LDA for EHR data
    classification, however, suffers from two handicaps: the ill-posed estimation
    of LDA parameters (e.g., covariance matrix), and the “linear inseparability” of
    EHR data. To handle these two issues, in this paper, we propose a novel
    classifier FWDA — Fast Wishart Discriminant Analysis, that makes predictions
    in an ensemble way. Specifically, FWDA first surrogates the distribution of
    inverse covariance matrices using a Wishart distribution estimated from the
    training data, then “weighted-averages” the classification results of multiple
    LDA classifiers parameterized by the sampled inverse covariance matrices via a
    Bayesian Voting scheme. The weights for voting are optimally updated to adapt
    to each new input, so as to enable nonlinear classification. Theoretical
    analysis indicates that FWDA possesses a fast convergence rate and a robust
    performance on high dimensional data. Extensive experiments on large-scale EHR
    dataset show that our approach outperforms state-of-the-art algorithms by a
    large margin.

    Automatic Anomaly Detection in the Cloud Via Statistical Learning

    Jordan Hochenbaum, Owen S. Vallis, Arun Kejariwal
    Comments: 13 pages, 12 figures
    Subjects: Learning (cs.LG)

    Performance and high availability have become increasingly important drivers,
    among others, of user retention in the context of web services such as social
    networks and web search. Exogenic and/or endogenic factors often
    give rise to anomalies, making it very challenging to maintain high
    availability, while also delivering high performance. Given that
    service-oriented architectures (SOA) typically have a large number of services,
    with each service having a large set of metrics, automatic detection of
    anomalies is non-trivial.

    Although there exists a large body of prior research in anomaly detection,
    existing techniques are not applicable in the context of social network data,
    owing to the inherent seasonal and trend components in the time series data.

    To this end, we developed two novel statistical techniques for automatically
    detecting anomalies in cloud infrastructure data. Specifically, the techniques
    employ statistical learning to detect anomalies in both application, and system
    metrics. Seasonal decomposition is employed to filter the trend and seasonal
    components of the time series, followed by the use of robust statistical
    metrics — median and median absolute deviation (MAD) — to accurately detect
    anomalies, even in the presence of seasonal spikes.

    We demonstrate the efficacy of the proposed techniques from three different
    perspectives, viz., capacity planning, user behavior, and supervised learning.
    In particular, we used production data for evaluation, and we report Precision,
    Recall, and F-measure in each case.
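
    A minimal, hypothetical sketch of the robust-statistics ingredient (median and
    MAD scores on deseasonalized residuals); the detectors in the paper are more
    elaborate, and the seasonal estimate used here is deliberately crude.

        import numpy as np
        import pandas as pd

        def mad_anomalies(series, period, threshold=3.5):
            """Flag points whose deseasonalized residual is far from the median,
            measured in MAD units (robust to the anomalies themselves)."""
            s = pd.Series(series, dtype=float)
            # Crude seasonal component: median value at each phase of the period.
            seasonal = s.groupby(np.arange(len(s)) % period).transform("median")
            resid = s - seasonal
            med = resid.median()
            mad = 1.4826 * (resid - med).abs().median()  # consistent for Gaussian noise
            scores = (resid - med).abs() / (mad + 1e-12)
            return np.flatnonzero(scores.to_numpy() > threshold)

        # Usage: hourly metric with daily seasonality and one injected spike.
        rng = np.random.default_rng(0)
        t = np.arange(14 * 24)
        metric = 10 + 5 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.3, t.size)
        metric[100] += 20
        print(mad_anomalies(metric, period=24))  # should report index 100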

    An All-Pair Approach for Big Data Multiclass Classification with Quantum SVM

    Arit Kumar Bishwas, Ashish Mani, Vasile Palade
    Subjects: Learning (cs.LG); Quantum Physics (quant-ph)

    In this paper we have discussed a quantum approach for the all-pair
    multiclass classification problem. We have shown that the multiclass support
    vector machine for big data classification with a quantum all-pair approach can
    be implemented in logarithm time complexity on a quantum computer. In an
    all-pair approach, there is one binary classification problem for each pair of
    classes, and so there are k (k-1)/2 classifiers for a k-class problem. As
    compared to the classical multiclass support vector machine that can be
    implemented with polynomial time complexity, our approach exhibits exponential
    speed up in the quantum version. The quantum all-pair algorithm can be used
    with other classification algorithms, and a speed up gain can be achieved as
    compared to their classical counterparts.
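
    The classical all-pair (one-vs-one) reduction that the quantum scheme builds
    on fits in a few lines; the sketch below (an assumed scikit-learn toy, not the
    paper's method) only illustrates the k(k-1)/2 pairwise classifiers and the
    majority vote, without any of the claimed quantum speed-up.

        from itertools import combinations
        import numpy as np
        from sklearn.datasets import load_iris
        from sklearn.svm import SVC

        X, y = load_iris(return_X_y=True)
        classes = np.unique(y)
        pairs = list(combinations(classes, 2))   # k classes -> k(k-1)/2 classifiers
        models = {}
        for a, b in pairs:
            mask = np.isin(y, [a, b])
            models[(a, b)] = SVC(kernel="linear").fit(X[mask], y[mask])

        def predict(x):
            votes = [int(models[p].predict([x])[0]) for p in pairs]
            return np.bincount(votes).argmax()   # majority vote over all pairs

        print(len(pairs), predict(X[0]), y[0])   # 3 binary SVMs for k = 3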

    Decision Stream: Cultivating Deep Decision Trees

    Dmitry Ignatov, Andrey Ignatov
    Subjects: Learning (cs.LG)

    Various modifications of decision trees have been extensively used during the
    past years due to their high efficiency and interpretability. Selection of
    relevant features for splitting the tree nodes is a key property of their
    architecture, at the same time being their major shortcoming: the recursive
    node partitioning leads to a geometric reduction of data quantity in the leaf
    nodes, which causes excessive model complexity and data overfitting. In this
    paper, we present a novel architecture, the Decision Stream, aimed at
    overcoming this problem. Instead of building an acyclic tree structure during
    the training process, we propose merging nodes from different branches based
    on their similarity, which is estimated with two-sample test statistics. To
    evaluate the proposed solution, we test it on several common machine learning
    problems: credit scoring, Twitter sentiment analysis, aircraft flight
    control, MNIST and
    CIFAR image classification, synthetic data classification and regression. Our
    experimental results reveal that the proposed approach significantly
    outperforms the standard decision tree method on both regression and
    classification tasks, yielding a prediction error decrease up to 35%.
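
    A hedged sketch of the merging criterion described above: two leaves are
    merged when a two-sample test cannot distinguish their target distributions.
    A Kolmogorov-Smirnov test is used here purely for illustration; the paper's
    choice of statistic may differ.

        import numpy as np
        from scipy.stats import ks_2samp

        def should_merge(targets_a, targets_b, alpha=0.05):
            """Merge two nodes when their target samples look statistically alike."""
            _, p_value = ks_2samp(targets_a, targets_b)
            return p_value > alpha

        rng = np.random.default_rng(1)
        leaf1 = rng.normal(0.0, 1.0, 200)   # similar leaves -> merge
        leaf2 = rng.normal(0.0, 1.0, 200)
        leaf3 = rng.normal(3.0, 1.0, 200)   # clearly different leaf -> keep separate
        print(should_merge(leaf1, leaf2), should_merge(leaf1, leaf3))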

    Deep Over-sampling Framework for Classifying Imbalanced Data

    Shin Ando, Chun Yuan Huang
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Class imbalance is a challenging issue in practical classification problems
    for deep learning models as well as traditional models. Traditionally
    successful countermeasures such as synthetic over-sampling have had limited
    success with complex, structured data handled by deep learning models. In this
    paper, we propose Deep Over-sampling (DOS), a framework for extending the
    synthetic over-sampling method to exploit the deep feature space acquired by a
    convolutional neural network (CNN). Its key feature is an explicit, supervised
    representation learning, for which the training data presents each raw input
    sample with a synthetic embedding target in the deep feature space, which is
    sampled from the linear subspace of in-class neighbors. We implement an
    iterative process of training the CNN and updating the targets, which induces
    smaller in-class variance among the embeddings, to increase the discriminative
    power of the deep representation. We present an empirical study using public
    benchmarks, which shows that the DOS framework not only counteracts class
    imbalance better than the existing method, but also improves the performance of
    the CNN in the standard, balanced settings.

    Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

    Ga Wu, Buser Say, Scott Sanner
    Comments: 8 pages
    Subjects: Learning (cs.LG)

    Given recent deep learning results that demonstrate the ability to
    effectively optimize high-dimensional non-convex functions with gradient
    descent optimization on GPUs, we ask in this paper whether symbolic gradient
    optimization tools such as Tensorflow can be effective for planning in hybrid
    (mixed discrete and continuous) nonlinear domains with high dimensional state
    and action spaces. To this end, we demonstrate that hybrid planning with
    Tensorflow and RMSProp gradient descent is competitive with mixed integer
    linear program (MILP) based optimization on piecewise linear planning domains
    (where we can compute optimal solutions) and substantially outperforms
    state-of-the-art interior point methods for nonlinear planning domains.
    Furthermore, we remark that Tensorflow is highly scalable, converging to a
    strong policy on a large-scale concurrent domain with a total of 576,000
    continuous actions over a horizon of 96 time steps in only 4 minutes. We
    provide a number of insights that clarify such strong performance including
    observations that despite long horizons, RMSProp avoids both the vanishing and
    exploding gradients problem. Together these results suggest a new frontier for
    highly scalable planning in nonlinear hybrid domains by leveraging GPUs and the
    power of recent advances in gradient descent with highly optimized toolkits like
    Tensorflow.

    Some Like it Hoax: Automated Fake News Detection in Social Networks

    Eugenio Tacchini, Gabriele Ballarin, Marco L. Della Vedova, Stefano Moret, Luca de Alfaro
    Subjects: Learning (cs.LG); Human-Computer Interaction (cs.HC); Social and Information Networks (cs.SI)

    In recent years, the reliability of information on the Internet has emerged
    as a crucial issue of modern society. Social network sites (SNSs) have
    revolutionized the way in which information is spread by allowing users to
    freely share content. As a consequence, SNSs are also increasingly used as
    vectors for the diffusion of misinformation and hoaxes. The amount of
    disseminated information and the rapidity of its diffusion make it practically
    impossible to assess reliability in a timely manner, highlighting the need for
    automatic hoax detection systems.

    As a contribution towards this objective, we show that Facebook posts can be
    classified with high accuracy as hoaxes or non-hoaxes on the basis of the users
    who “liked” them. We present two classification techniques, one based on
    logistic regression, the other on a novel adaptation of boolean crowdsourcing
    algorithms. On a dataset consisting of 15,500 Facebook posts and 909,236 users,
    we obtain classification accuracies exceeding 99% even when the training set
    contains less than 1% of the posts. We further show that our techniques are
    robust: they work even when we restrict our attention to the users who like
    both hoax and non-hoax posts. These results suggest that mapping the diffusion
    pattern of information can be a useful component of automatic hoax detection
    systems.
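
    A toy, fully synthetic sketch of the simpler of the two techniques: logistic
    regression on a sparse post-by-user "like" matrix. The data, features, and
    the crowdsourcing-based variant in the paper are of course different; the
    names below are made up for illustration.

        import numpy as np
        from scipy.sparse import csr_matrix
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        n_posts, n_users = 200, 500
        labels = rng.integers(0, 2, n_posts)        # 1 = hoax, 0 = non-hoax
        # Each user has a latent tendency to like one class more than the other.
        user_bias = rng.random(n_users)
        likes = np.zeros((n_posts, n_users), dtype=np.int8)
        for p in range(n_posts):
            prob = user_bias if labels[p] else 1 - user_bias
            likes[p] = rng.random(n_users) < 0.1 * prob
        X = csr_matrix(likes)                        # posts as sets of liking users

        clf = LogisticRegression(max_iter=1000).fit(X[:150], labels[:150])
        print("held-out accuracy:", clf.score(X[150:], labels[150:]))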

    PPMF: A Patient-based Predictive Modeling Framework for Early ICU Mortality Prediction

    Mohammad Amin Morid, Olivia R. Liu Sheng, Samir Abdelrahman
    Comments: 10 pages, Healthcare Analytics and Medical Decision Making, INFORMS Workshop. Nashville, Tennessee, 2016
    Subjects: Learning (cs.LG)

    To date, developing a good model for early intensive care unit (ICU)
    mortality prediction is still challenging. This paper presents a
    patient-based predictive modeling framework (PPMF) to improve the performance
    of ICU mortality prediction using data collected during the first 48 hours of
    ICU admission. PPMF consists of three main components verifying three related
    research hypotheses. The first component captures dynamic changes in patients'
    status in the ICU using their time series data (e.g., vital signs and
    laboratory tests). The second component is a local approximation algorithm
    that classifies patients based on their similarities. The third component is
    a Gradient Descent wrapper that updates feature weights according to the
    classification feedback. Experiments using data from MIMIC-III show that PPMF
    significantly outperforms: (1) the severity score systems, namely SAPS III,
    APACHE IV, and MPM0-III, (2) the aggregation-based classifiers that utilize
    summarized time series, and (3) baseline feature selection methods.

    Continuously Differentiable Exponential Linear Units

    Jonathan T. Barron
    Subjects: Learning (cs.LG)

    Exponential Linear Units (ELUs) are a useful rectifier for constructing deep
    learning architectures, as they may speed up and otherwise improve learning by
    virtue of not having vanishing gradients and by having mean activations near
    zero. However, the ELU activation as parametrized in [1] is not continuously
    differentiable with respect to its input when the shape parameter alpha is not
    equal to 1. We present an alternative parametrization which is C1 continuous
    for all values of alpha, making the rectifier easier to reason about and making
    alpha easier to tune. This alternative parametrization has several other useful
    properties that the original parametrization of ELU does not: 1) its derivative
    with respect to x is bounded, 2) it contains both the linear transfer function
    and ReLU as special cases, and 3) it is scale-similar with respect to alpha.
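
    A small sketch of the idea, assuming the fix is the 1/alpha scaling of the
    argument inside the exponential (the parametrization commonly referred to as
    CELU); shown next to the original ELU for comparison.

        import numpy as np

        def elu(x, alpha):
            """Original ELU: not C1 at x = 0 unless alpha == 1."""
            return np.where(x >= 0, x, alpha * (np.exp(x) - 1.0))

        def celu(x, alpha):
            """Scaling the argument by 1/alpha makes the left and right
            derivatives at x = 0 both equal 1, for every alpha."""
            return np.where(x >= 0, x, alpha * (np.exp(x / alpha) - 1.0))

        x = np.linspace(-3, 3, 7)
        print(celu(x, alpha=0.5))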

    GaKCo: a Fast GApped k-mer string Kernel using COunting

    Ritambhara Singh, Arshdeep Sekhon, Kamran Kowsari, Jack Lanchantin, Beilun Wang, Yanjun Qi
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC); Computation and Language (cs.CL); Data Structures and Algorithms (cs.DS)

    String Kernel (SK) techniques, especially those using gapped k-mers as
    features (gk), have obtained great success in classifying sequences like DNA,
    protein, and text. However, the state-of-the-art gk-SK runs extremely slowly
    when we increase the dictionary size (Σ) or allow more mismatches (M). This
    is because current gk-SK uses a trie-based algorithm to calculate the
    co-occurrence of mismatched substrings, resulting in a time cost proportional
    to O(Σ^M). We propose a fast algorithm for calculating the Gapped k-mer
    Kernel using Counting (GaKCo). GaKCo uses associative arrays to calculate the
    co-occurrence of substrings using cumulative counting. This algorithm is
    fast, scalable to larger Σ and M, and naturally parallelizable. We provide a
    rigorous asymptotic analysis that compares GaKCo with the state-of-the-art
    gk-SK. Theoretically, the time cost of GaKCo is independent of the Σ^M term
    that slows down the trie-based approach. Experimentally, we observe that
    GaKCo achieves the same accuracy as the state-of-the-art and outperforms its
    speed by factors of 2, 100, and 4 on classifying sequences of DNA (5
    datasets), protein (12 datasets), and character-based English text (2
    datasets), respectively.

    GaKCo is shared as an open source tool at this https URL.

    Introspective Generative Modeling: Decide Discriminatively

    Justin Lazarow, Long Jin, Zhuowen Tu
    Comments: 10 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We study unsupervised learning by developing introspective generative
    modeling (IGM) that attains a generator using progressively learned deep
    convolutional neural networks. The generator is itself a discriminator, capable
    of introspection: being able to self-evaluate the difference between its
    generated samples and the given training data. When followed by repeated
    discriminative learning, desirable properties of modern discriminative
    classifiers are directly inherited by the generator. IGM learns a cascade of
    CNN classifiers using a synthesis-by-classification algorithm. In the
    experiments, we observe encouraging results on a number of applications
    including texture modeling, artistic style transferring, face modeling, and
    semi-supervised learning.

    Introspective Classifier Learning: Empower Generatively

    Long Jin, Justin Lazarow, Zhuowen Tu
    Comments: 11 pages, 6 figure
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    In this paper we propose introspective classifier learning (ICL) that
    emphasizes the importance of having a discriminative classifier empowered with
    generative capabilities. We develop a reclassification-by-synthesis algorithm
    to perform training using a formulation stemmed from the Bayes theory. Our
    classifier is able to iteratively: (1) synthesize pseudo-negative samples in
    the synthesis step; and (2) enhance itself by improving the classification in
    the reclassification step. The single classifier learned is at the same time
    generative — being able to directly synthesize new samples within its own
    discriminative model. We conduct experiments on standard benchmark datasets
    including MNIST, CIFAR, and SVHN using state-of-the-art CNN architectures, and
    observe improved classification results.

    A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates

    Zhi Li, Wei Shi, Ming Yan
    Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)

    This paper considers the problem of decentralized optimization with a
    composite objective containing smooth and non-smooth terms. To solve the
    problem, a proximal-gradient scheme is studied. Specifically, the smooth and
    nonsmooth terms are dealt with by gradient update and proximal update,
    respectively. The studied algorithm is closely related to a previous
    decentralized optimization algorithm, PG-EXTRA [37], but has a few advantages.
    First of all, in our new scheme, agents use uncoordinated step-sizes and the
    stable upper bounds on step-sizes are independent of network topology. The
    step-sizes depend on local objective functions, and they can be as large as
    that of the gradient descent. Secondly, for the special case without non-smooth
    terms, linear convergence can be achieved under the strong convexity
    assumption. The dependence of the convergence rate on the objective functions
    and the network are separated, and the convergence rate of our new scheme is as
    good as one of the two convergence rates that match the typical rates for the
    general gradient descent and the consensus averaging. We also provide some
    numerical experiments to demonstrate the efficacy of the introduced algorithms
    and validate our theoretical discoveries.

    Fine-Grained Entity Typing with High-Multiplicity Assignments

    Maxim Rabinovich, Dan Klein
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

    As entity type systems become richer and more fine-grained, we expect the
    number of types assigned to a given entity to increase. However, most
    fine-grained typing work has focused on datasets that exhibit a low degree of
    type multiplicity. In this paper, we consider the high-multiplicity regime
    inherent in data sources such as Wikipedia that have semi-open type systems. We
    introduce a set-prediction approach to this problem and show that our model
    outperforms unstructured baselines on a new Wikipedia-based fine-grained typing
    corpus.

    Single-Pass PCA of Large High-Dimensional Data

    Wenjian Yu, Yu Gu, Jian Li, Shenghua Liu, Yaohang Li
    Comments: IJCAI 2017, 16 pages, 6 figures
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Numerical Analysis (math.NA)

    Principal component analysis (PCA) is a fundamental dimension reduction tool
    in statistics and machine learning. For large and high-dimensional data,
    computing the PCA (i.e., the singular vectors corresponding to a number of
    dominant singular values of the data matrix) becomes a challenging task. In
    this work, a single-pass randomized algorithm is proposed to compute PCA with
    only one pass over the data. It is suitable for processing extremely large and
    high-dimensional data stored in slow memory (hard disk) or the data generated
    in a streaming fashion. Experiments with synthetic and real data validate the
    algorithm’s accuracy, which has orders of magnitude smaller error than an
    existing single-pass algorithm. For a set of high-dimensional data stored as a
    150 GB file, the proposed algorithm is able to compute the first 50 principal
    components in just 24 minutes on a typical 24-core computer, with less than 1
    GB memory cost.
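
    The paper's algorithm is not reproduced here, but a generic single-pass
    randomized sketch (in the spirit of Halko et al.) conveys the idea: accumulate
    two small random sketches while streaming over the rows exactly once, then
    recover the leading principal directions from the sketches alone.
    Mean-centering is omitted for brevity, and all names below are illustrative.

        import numpy as np

        def single_pass_pca(row_stream, n_cols, k, oversample=10, seed=0):
            """One-pass randomized sketch of the top-k principal directions.
            Only Y = A @ Omega and H = A.T @ Psi are accumulated; the data A is
            never revisited."""
            rng = np.random.default_rng(seed)
            ell = k + oversample
            Omega = rng.standard_normal((n_cols, ell))
            Y_rows, Psi_rows = [], []
            H = np.zeros((n_cols, ell))
            for row in row_stream:                 # the single pass over the data
                psi = rng.standard_normal(ell)
                Y_rows.append(row @ Omega)
                Psi_rows.append(psi)
                H += np.outer(row, psi)
            Y, Psi = np.vstack(Y_rows), np.vstack(Psi_rows)
            Q, _ = np.linalg.qr(Y)                 # orthonormal range estimate of A
            # A ~ Q @ B, with B solved from (Psi.T @ Q) @ B ~ Psi.T @ A = H.T.
            B, *_ = np.linalg.lstsq(Psi.T @ Q, H.T, rcond=None)
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            return s[:k], Vt[:k]                   # leading singular values/directions

        rng_data = np.random.default_rng(1)
        A = rng_data.standard_normal((2000, 50)) * np.linspace(5, 0.1, 50)
        print(np.round(single_pass_pca(iter(A), n_cols=50, k=5)[0], 1))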

    Learning Agents in Black-Scholes Financial Markets: Consensus Dynamics and Volatility Smiles

    Tushar Vaidya, Carlos Murguia, Georgios Piliouras
    Comments: 12 pages, 7 figures
    Subjects: Mathematical Finance (q-fin.MF); Learning (cs.LG); Multiagent Systems (cs.MA)

    Black-Scholes (BS) is the standard mathematical model for option pricing in
    financial markets. Option prices are calculated using an analytical formula
    whose main inputs are strike (at which price to exercise) and volatility. The
    BS framework assumes that volatility remains constant across all strikes; in
    practice, however, it varies. How do traders come to learn these parameters?
    We introduce natural models of learning agents, in which they update their
    beliefs about the true implied volatility based on the opinions of other
    traders. We prove convergence of these opinion dynamics using techniques from
    control theory and leader-follower models, thus providing a resolution between
    theory and market practices. We allow for two different models, one with
    feedback and one with an unknown leader and no feedback. Both scalar and
    multidimensional cases are analyzed.

    Semi-supervised Bayesian Deep Multi-modal Emotion Recognition

    Changde Du, Changying Du, Jinpeng Li, Wei-long Zheng, Bao-liang Lu, Huiguang He
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    In emotion recognition, it is difficult to recognize a person's emotional
    state using just a single modality. Besides, the annotation of physiological
    emotional data is particularly expensive. These two aspects make building an
    effective emotion recognition model challenging. In this paper, we first
    build a multi-view deep generative model to simulate the generative process of
    multi-modality emotional data. By imposing a mixture of Gaussians assumption on
    the posterior approximation of the latent variables, our model can learn the
    shared deep representation from multiple modalities. To solve the
    labeled-data-scarcity problem, we further extend our multi-view model to
    semi-supervised learning scenario by casting the semi-supervised classification
    problem as a specialized missing data imputation task. Our semi-supervised
    multi-view deep generative framework can leverage both labeled and unlabeled
    data from multiple modalities, where the weight factor for each modality can be
    learned automatically. Compared with previous emotion recognition methods, our
    method is more robust and flexible. The experiments conducted on two real
    multi-modal emotion datasets have demonstrated the superiority of our framework
    over a number of competitors.

    Abstract Syntax Networks for Code Generation and Semantic Parsing

    Maxim Rabinovich, Mitchell Stern, Dan Klein
    Comments: ACL 2017. MR and MS contributed equally
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Tasks like code generation and semantic parsing require mapping unstructured
    (or partially structured) inputs to well-formed, executable outputs. We
    introduce abstract syntax networks, a modeling framework for these problems.
    The outputs are represented as abstract syntax trees (ASTs) and constructed by
    a decoder with a dynamically-determined modular structure paralleling the
    structure of the output tree. On the benchmark Hearthstone dataset for code
    generation, our model obtains 79.2 BLEU and 22.7% exact match accuracy,
    compared to previous state-of-the-art values of 67.1 and 6.1%. Furthermore, we
    perform competitively on the Atis, Jobs, and Geo semantic parsing datasets with
    no task-specific engineering.

    Dynamic Model Selection for Prediction Under a Budget

    Feng Nan, Venkatesh Saligrama
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We present a dynamic model selection approach for resource-constrained
    prediction. Given an input instance at test-time, a gating function identifies
    a prediction model for the input among a collection of models. Our objective is
    to minimize overall average cost without sacrificing accuracy. We learn gating
    and prediction models on fully labeled training data by means of a bottom-up
    strategy. Our novel bottom-up method is a recursive scheme whereby a
    high-accuracy complex model is first trained. Low-complexity gating and
    prediction models are then learnt to adaptively approximate the high-accuracy
    model in regions where low-cost models are capable of making highly accurate
    predictions. We pose an empirical loss minimization problem with cost
    constraints to jointly train gating and prediction models. On a number of
    benchmark datasets our method outperforms the state-of-the-art, achieving
    higher accuracy for the same cost.

    Learning of Human-like Algebraic Reasoning Using Deep Feedforward Neural Networks

    Cheng-Hao Cai, Dengfeng Ke, Yanyan Xu, Kaile Su
    Comments: 8 pages, 7 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Logic in Computer Science (cs.LO)

    There is a wide gap between symbolic reasoning and deep learning. In this
    research, we explore the possibility of using deep learning to improve symbolic
    reasoning. Briefly, in a reasoning system, a deep feedforward neural network is
    used to guide rewriting processes after learning from algebraic reasoning
    examples produced by humans. To enable the neural network to recognise patterns
    of algebraic expressions with non-deterministic sizes, reduced partial trees
    are used to represent the expressions. Also, to represent both top-down and
    bottom-up information of the expressions, a centralisation technique is used to
    improve the reduced partial trees. Besides, symbolic association vectors and
    rule application records are used to improve the rewriting processes.
    Experimental results reveal that the algebraic reasoning examples can be
    accurately learnt only if the feedforward neural network has enough hidden
    layers. Also, the centralisation technique, the symbolic association vectors
    and the rule application records can reduce error rates of reasoning. In
    particular, the above approaches have led to a 4.6% reasoning error rate on a
    dataset of linear equations, differentials and integrals.

    Leveraging Patient Similarity and Time Series Data in Healthcare Predictive Models

    Amin Morid, Olivia R. Liu Sheng, Samir Abdelrahman
    Comments: 10 pages, Proc. of 7th Annual Workshop on Health IT and Economics (WHITE) (Eds.). Washington, DC., 2016
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Patient time series classification faces challenges in high degrees of
    dimensionality and missingness. In light of patient similarity theory, this
    study explores effective temporal feature engineering and reduction, missing
    value imputation, and change point detection methods that can afford
    similarity-based classification models with desirable accuracy enhancement. We
    select a piecewise aggregation approximation method to extract fine-grain
    temporal features and propose a minimalist method to impute missing values in
    temporal features. For dimensionality reduction, we adopt a gradient descent
    search method for feature weight assignment. We propose new patient status and
    directional change definitions based on medical knowledge or clinical
    guidelines about the value ranges for different patient status levels, and
    develop a method to detect change points indicating positive or negative
    patient status changes. We evaluate the effectiveness of the proposed methods
    in the context of early Intensive Care Unit mortality prediction. The
    evaluation results show that the k-Nearest Neighbor algorithm that
    incorporates the methods we select and propose significantly outperforms the
    relevant benchmarks for early ICU mortality prediction. This study makes
    contributions to time
    series classification and early ICU mortality prediction via identifying and
    enhancing temporal feature engineering and reduction methods for
    similarity-based time series classification.

    Bootstrapping Graph Convolutional Neural Networks for Autism Spectrum Disorder Classification

    Rushil Anirudh, Jayaraman J. Thiagarajan
    Comments: Under review
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Using predictive models to identify patterns that can act as biomarkers for
    different neuropathological conditions is becoming highly prevalent. In this
    paper, we consider the problem of Autism Spectrum Disorder (ASD)
    classification. While non-invasive imaging measurements, such as resting-state
    fMRI, are typically used in this problem, it can be beneficial to incorporate
    a wide variety of non-imaging features, including personal and socio-cultural
    traits, into predictive modeling. We propose to employ a graph-based approach
    for combining both types of features, where a contextual graph encodes the
    traits of a larger population while the brain activity patterns are defined as
    a multivariate function at the nodes of the graph. Since the underlying graph
    dictates the performance of the resulting predictive models, we explore the use
    of different graph construction strategies. Furthermore, we develop a
    bootstrapped version of graph convolutional neural networks (G-CNNs) that
    utilizes an ensemble of weakly trained G-CNNs to avoid overfitting and also
    reduce the sensitivity of the models on the choice of graph construction. We
    demonstrate its effectiveness on the Autism Brain Imaging Data Exchange (ABIDE)
    dataset and show that the proposed approach outperforms state-of-the-art
    approaches for this problem.

    Active Bias: Training a More Accurate Neural Network by Emphasizing High Variance Samples

    Haw-Shiuan Chang, Erik Learned-Miller, Andrew McCallum
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    This paper addresses the limitations of previous training methods that
    emphasize either easy examples like self-paced learning or difficult examples
    like hard example mining. Inspired by active learning, we propose two
    alternatives to re-weight training samples based on lightweight estimates of
    sample uncertainty in stochastic gradient descent (SGD): the variance in
    predicted probability of the correct class across iterations of mini-batch SGD,
    and the proximity of the correct class probability to the decision threshold
    (or threshold closeness). Extensive experimental results on multiple datasets
    show that our methods reliably improve accuracy in various network
    architectures, including providing additional gains on top of other popular
    training tools, such as ADAM, dropout, and distillation.
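
    A rough sketch of the two re-weighting signals described above, computed from
    a per-sample history of predicted correct-class probabilities; the exact
    estimators, smoothing, and normalization in the paper may differ.

        import numpy as np

        def sample_weights(prob_history, threshold=0.5, eps=0.05, mode="variance"):
            """Turn a history of correct-class probabilities (one row per sample,
            one column per recent SGD snapshot) into sampling weights."""
            p = np.asarray(prob_history)
            if mode == "variance":
                w = p.std(axis=1) + eps          # emphasize high-variance samples
            else:                                # "threshold closeness"
                w = 1.0 - np.abs(p.mean(axis=1) - threshold) + eps
            return w / w.sum()                   # sampling probabilities for next epoch

        history = np.array([[0.95, 0.97, 0.96],  # easy, consistently correct
                            [0.40, 0.70, 0.55],  # uncertain -> up-weighted
                            [0.05, 0.04, 0.06]]) # consistently wrong (maybe noisy label)
        print(sample_weights(history))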


    Information Theory

    A lower bound on the differential entropy of log-concave random vectors with applications

    Arnaud Marsiglietti, Victoria Kostina
    Comments: 31 pages, 2 figures
    Subjects: Information Theory (cs.IT)

    We derive a lower bound on the differential entropy of a log-concave random
    variable X in terms of the p-th absolute moment of X. The new bound leads to
    a reverse entropy power inequality with an explicit constant, and to new
    bounds on the rate-distortion function and the channel capacity.

    Specifically, we study the rate-distortion function for log-concave sources
    and the distortion measure |x - x̂|^r, and we establish that the difference
    between the rate-distortion function and the Shannon lower bound is at most
    log(2·√(πe)) ≈ 2.5 bits, independently of r and the target distortion d. For
    mean-square error distortion, the difference is at most log(√(2πe)) ≈ 2 bits,
    regardless of d. The bounds can be further strengthened if the source, in
    addition to being log-concave, is symmetric. In particular, we establish that
    for mean-square error distortion, the difference is at most log(√(πe)) ≈ 1.5
    bits, regardless of d.

    We also provide bounds on the capacity of memoryless additive noise channels
    when the noise is log-concave. We show that the difference between the
    capacity of such channels and the capacity of the Gaussian channel with the
    same noise power is at most log(√(2πe)) ≈ 2 bits, and at most log(√(πe)) ≈ 1.5
    bits if the noise is symmetric log-concave.

    Our results generalize to the case of a vector X with possibly dependent
    coordinates, and to γ-concave random variables. Our proof technique leverages
    tools from convex geometry.

    Low-Complexity Robust MISO Downlink Precoder Design With Per-Antenna Power Constraints

    Mostafa Medra, Timothy N. Davidson
    Subjects: Information Theory (cs.IT)

    This paper considers the design of the beamformers for a multiple-input
    single-output (MISO) downlink system that seeks to mitigate the impact of the
    imperfections in the channel state information (CSI) that is available at the
    base station (BS). The goal of the design is to minimize the outage probability
    of specified signal-to-interference-and-noise ratio (SINR) targets, while
    satisfying per-antenna power constraints (PAPCs), and to do so at a low
    computational cost. Based on insights from the offset maximization technique
    for robust beamforming, and observations regarding the structure of the
    optimality conditions, low-complexity iterative algorithms that involve the
    evaluation of closed-form expressions are developed. To further reduce the
    computational cost, algorithms are developed for per-antenna power-constrained
    variants of the zero-forcing (ZF) and maximum ratio transmission (MRT)
    beamforming directions. In the MRT case, our low-complexity version for systems
    with a large number of antennas may be of independent interest. The proposed
    algorithms are extended to systems with both PAPCs and a total power
    constraint. Simulation results show that the proposed robust designs can
    provide substantial gains in the outage probability while satisfying the PAPCs.

    Secure Transmission with Large Numbers of Antennas and Finite Alphabet Inputs

    Yongpeng Wu, Jun-Bo Wang, Jue Wang, Robert Schober, Chengshan Xiao
    Comments: Accepted by IEEE Transactions on Communications. arXiv admin note: text overlap with arXiv:1612.08328
    Subjects: Information Theory (cs.IT)

    In this paper, we investigate secure transmission over the large-scale
    multiple-antenna wiretap channel with finite alphabet inputs. First, we
    investigate the case where instantaneous channel state information (CSI) of the
    eavesdropper is known at the transmitter. We show analytically that a
    generalized singular value decomposition (GSVD) based design, which is optimal
    for Gaussian inputs, may exhibit a severe performance loss for finite alphabet
    inputs in the high signal-to-noise ratio (SNR) regime. In light of this, we
    propose a novel Per-Group-GSVD (PG-GSVD) design which can effectively
    compensate for the performance loss caused by the GSVD design. More importantly,
    the computational complexity of the PG-GSVD design is orders of magnitude
    lower than that of the existing design for finite alphabet inputs in [1], while
    the resulting performance loss is minimal. Then, we extend the PG-GSVD design
    to the case where only statistical CSI of the eavesdropper is available at the
    transmitter. Numerical results indicate that the proposed PG-GSVD design can be
    efficiently implemented in large-scale multiple-antenna systems and achieves
    significant performance gains compared to the GSVD design.

    Coding for Arbitrarily Varying Remote Sources

    Amitalok J. Budkuley, Bikash Kumar Dey, Vinod M. Prabhakaran
    Comments: 11 pages
    Subjects: Information Theory (cs.IT)

    We study a lossy source coding problem for a memoryless remote source. The
    source data is broadcast over an arbitrarily varying channel (AVC) controlled
    by an adversary. One output of the AVC is received as input at the encoder, and
    another output is received as side information at the decoder. The adversary is
    assumed to know the source data non-causally, and can employ randomized jamming
    strategies arbitrarily correlated to the source data. The decoder reconstructs
    the source data from the encoded message and the side information. We prove
    upper and lower bounds on the adversarial rate distortion function for the
    source under randomized coding. Furthermore, we present some interesting
    special cases of our general setup where the above bounds coincide, thus
    providing a complete characterization of their rate-distortion function.

    Graph Sampling for Covariance Estimation

    Sundeep Prabhakar Chepuri, Geert Leus
    Comments: Under peer review for Jour. of Sel. Topics in Signal Proc. (special issue on graph signal processing), Nov. 2016
    Subjects: Information Theory (cs.IT)

    In this paper the focus is on subsampling as well as reconstructing the
    second-order statistics of signals residing on nodes of arbitrary undirected
    graphs. Second-order stationary graph signals may be obtained by graph
    filtering zero-mean white noise and they admit a well-defined power spectrum
    whose shape is determined by the frequency response of the graph filter.
    Estimating the graph power spectrum forms an important component of stationary
    graph signal processing and related inference tasks such as Wiener prediction
    or inpainting on graphs. The central result of this paper is that by sampling a
    significantly smaller subset of vertices and using simple least squares, we can
    reconstruct the second-order statistics of the graph signal from the subsampled
    observations, and more importantly, without any spectral priors. To this end,
    both a nonparametric approach as well as parametric approaches including moving
    average and autoregressive models for the graph power spectrum are considered.
    The results specialize for undirected circulant graphs in that the graph nodes
    leading to the best compression rates are given by the so-called minimal sparse
    rulers. A near-optimal greedy algorithm is developed to design the subsampling
    scheme for the non-parametric and the moving average models, whereas a
    particular subsampling scheme that allows linear estimation for the
    autoregressive model is proposed. Numerical experiments on synthetic as well as
    real datasets related to climatology and processing handwritten digits are
    provided to demonstrate the developed theory.
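
    The central idea of reconstructing second-order statistics from a subset of
    vertices by simple least squares can be illustrated with a small numpy sketch.
    The random graph, filter taps and uniform-at-random vertex selection below are
    illustrative assumptions; the paper instead designs the subsampling set with a
    greedy algorithm and also treats autoregressive models.

    import numpy as np

    rng = np.random.default_rng(0)

    # Small undirected graph: random adjacency and combinatorial Laplacian.
    N = 20
    A = (rng.random((N, N)) < 0.25).astype(float)
    A = np.triu(A, 1); A = A + A.T
    L = np.diag(A.sum(axis=1)) - A

    # Stationary graph signal: white noise shaped by a degree-2 polynomial graph filter.
    h = np.array([1.0, -0.3, 0.05])                              # filter taps (assumed)
    H = sum(c * np.linalg.matrix_power(L, k) for k, c in enumerate(h))
    R_true = H @ H.T                                             # true covariance

    # Observe only M < N vertices and form their sample covariance from T snapshots.
    M, T = 12, 5000
    S = rng.choice(N, size=M, replace=False)
    X = H @ rng.standard_normal((N, T))
    R_sub = (X[S] @ X[S].T) / T

    # Moving-average model: R = sum_k b_k L^k; fit the coefficients by least squares
    # on the subsampled covariance, then extrapolate to the full graph.
    K = 2 * (len(h) - 1) + 1
    basis = [np.linalg.matrix_power(L, k)[np.ix_(S, S)] for k in range(K)]
    Phi = np.stack([B.ravel() for B in basis], axis=1)
    b, *_ = np.linalg.lstsq(Phi, R_sub.ravel(), rcond=None)
    R_hat = sum(bk * np.linalg.matrix_power(L, k) for k, bk in enumerate(b))

    print("relative error:", np.linalg.norm(R_hat - R_true) / np.linalg.norm(R_true))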

    Wireless Surveillance of Two-Hop Communications

    Ganggang Ma, Jie Xu, Lingjie Duan, Rui Zhang
    Comments: Submitted for conference publication
    Subjects: Information Theory (cs.IT)

    Wireless surveillance is becoming increasingly important for protecting
    public security by legitimately eavesdropping on suspicious wireless
    communications. This paper studies the wireless surveillance of a two-hop
    suspicious communication link by a half-duplex legitimate monitor. By exploring
    the suspicious link’s two-hop nature, the monitor can adaptively choose among
    the following three eavesdropping modes to improve the eavesdropping
    performance: (I) passive eavesdropping, intercepting both hops to decode
    the message collectively; (II) proactive eavesdropping via noise jamming
    over the first hop; and (III) proactive eavesdropping via hybrid jamming
    over the second hop. In both proactive eavesdropping modes, the
    (noise/hybrid) jamming over one hop serves to reduce the end-to-end
    communication rate of the suspicious link, thereby making interception over
    the other hop easier. Under this setup, we maximize
    the eavesdropping rate at the monitor by jointly optimizing the eavesdropping
    mode selection as well as the transmit power for noise and hybrid jamming.
    Numerical results show that the eavesdropping mode selection significantly
    improves the eavesdropping rate as compared to each individual eavesdropping
    mode.

    Optical Non-Orthogonal Multiple Access for Visible Light Communication

    Hanaa Marshoud, Sami Muhaidat, Paschalis C. Sofotasios, Sajjad Hussain, Muhammad Ali Imran, Bayan S. Sharif
    Subjects: Information Theory (cs.IT)

    The proliferation of mobile Internet and connected devices, offering a
    variety of services at different levels of performance, represents a major
    challenge for the fifth generation wireless networks and beyond. This requires
    a paradigm shift towards the development of key enabling techniques for the
    next generation wireless networks. In this respect, visible light communication
    (VLC) has recently emerged as a new communication paradigm that is capable of
    providing ubiquitous connectivity by complementing radio frequency
    communications. One of the main challenges of VLC systems, however, is the low
    modulation bandwidth of the light-emitting-diodes, which is in the megahertz
    range. This article presents a promising technology, referred to as “optical
    non-orthogonal multiple access (O-NOMA)”, which is envisioned to address the
    key challenges in the next generation of wireless networks. We provide a
    detailed overview and analysis of the state-of-the-art integration of O-NOMA in
    VLC networks. Furthermore, we provide insights on the potential opportunities
    and challenges as well as some open research problems that are envisioned to
    pave the way for the future design and implementation of O-NOMA in VLC systems.

    Optimal Demand-Side Management for Joint Privacy-Cost Optimization with Energy Storage

    Giulio Giaconi, Deniz Gunduz, H. Vincent Poor
    Comments: 6 pages, 7 figures
    Subjects: Information Theory (cs.IT)

    The smart meter (SM) privacy problem is addressed together with the cost of
    energy for the consumer. It is assumed that a storage device, e.g., an
    electrical battery, is available to the consumer, which can be utilized both to
    achieve privacy and to reduce the energy cost by modifying the energy
    consumption profile. Privacy is measured via the squared-error distortion
    between the SM readings, which are reported to the utility provider (UP), and a
    target profile, whereas time-of-use pricing is considered for energy cost
    calculation. Extensive numerical results are presented to evaluate the
    performance of the proposed strategy. The trade-off between the achievable
    privacy and the energy cost is studied by taking into account the limited
    capacity of the battery as well as the capability to sell energy to the UP.
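
    A minimal convex-programming sketch of this privacy-cost trade-off is given below
    using cvxpy. It assumes an ideal lossless battery, no energy selling back to the
    UP, and illustrative demand, tariff and battery parameters; it is a toy instance
    of the same objective, not the strategy analyzed in the paper.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(1)
    T = 24
    demand = rng.uniform(0.2, 1.5, T)                      # hourly consumption (assumed)
    price = np.where((np.arange(T) >= 17) & (np.arange(T) < 21), 0.30, 0.10)  # ToU tariff (assumed)
    target = np.full(T, demand.mean())                     # flat target profile for privacy

    grid = cp.Variable(T, nonneg=True)                     # energy drawn from the grid (the SM reading)
    soc = cp.Variable(T + 1)                               # battery state of charge
    lam = 5.0                                              # privacy-cost trade-off weight (assumed)

    constraints = [soc[0] == 1.0, soc >= 0, soc <= 2.0,    # 2 kWh battery, initially half full
                   soc[1:] == soc[:-1] + grid - demand]    # ideal (lossless) battery dynamics
    privacy = cp.sum_squares(grid - target)                # squared-error distortion to the target
    cost = cp.sum(cp.multiply(price, grid))                # time-of-use energy cost
    cp.Problem(cp.Minimize(lam * privacy + cost), constraints).solve()

    print("distortion:", round(float(privacy.value), 3), " cost:", round(float(cost.value), 3))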

    Joint Transmit and Receive Filter Optimization for Sub-Nyquist Wireless Channel Estimation

    Andreas Lenz, Manuel S. Stein, A. Lee Swindlehurst
    Comments: 13 pages, 14 figures
    Subjects: Information Theory (cs.IT)

    In this article a framework is presented for the joint optimization of the
    analog transmit and receive filter with respect to a channel estimation
    problem. At the receiver, conventional signal processing systems restrict the
    bandwidth of the analog pre-filter \(B\) to the rate of the analog-to-digital
    converter \(f_s\) in order to comply with the well-known Nyquist sampling
    theorem. In contrast, here we consider a transceiver that by design violates
    the common paradigm \(B \leq f_s\). To this end, at the receiver we allow for a
    higher pre-filter bandwidth \(B > f_s\) and study the achievable channel estimation
    accuracy under a fixed sampling rate when the transmit and receive filter are
    jointly optimized with respect to the Bayesian Cramér-Rao lower bound. For
    the case of a channel with unknown delay-Doppler shift we show how to
    approximate the required Fisher information matrix and solve the transceiver
    design problem by an alternating optimization algorithm. The presented approach
    allows us to explore the Pareto-optimal region spanned by transmit and receive
    filters which are favorable under a weighted mean squared error criterion. We
    discuss the complexity of the obtained transceiver design by visualizing the
    resulting ambiguity function. Finally, we verify the achievable performance of
    the proposed designs by Monte-Carlo simulations of a likelihood-based channel
    estimator.

    A Survey on MIMO Transmission with Discrete Input Signals: Technical Challenges, Advances, and Future Trends

    Yongpeng Wu, Chengshan Xiao, Zhi Ding, Xiqi Gao, Shi Jin
    Comments: 110 pages, 512 references, submit to Proceedings of the IEEE
    Subjects: Information Theory (cs.IT)

    Multiple antennas have been exploited for spatial multiplexing and diversity
    transmission in a wide range of communication applications. However, most of
    the advances in the design of high speed wireless multiple-input multiple
    output (MIMO) systems are based on information-theoretic principles that
    demonstrate how to efficiently transmit signals conforming to Gaussian
    distribution. Although the Gaussian signal is capacity-achieving, signals
    conforming to discrete constellations are transmitted in practical
    communication systems. As a result, this paper is motivated to provide a
    comprehensive overview on MIMO transmission design with discrete input signals.
    We first summarize the existing fundamental results for MIMO systems with
    discrete input signals. Then, focusing on the basic point-to-point MIMO
    systems, we examine transmission schemes based on the three most important
    criteria for communication systems: mutual information driven designs, mean
    square error driven designs, and diversity driven designs. In particular, a
    unified framework which yields low-complexity transmission schemes applicable
    to massive MIMO systems in upcoming 5G wireless networks is provided for the
    first time. Moreover, adaptive transmission designs which switch among these
    criteria based on the channel conditions to formulate the best transmission
    strategy are discussed. Then, we provide a survey of the transmission designs
    with discrete input signals for multiuser MIMO scenarios, including MIMO uplink
    transmission, MIMO downlink transmission, MIMO interference channel, and MIMO
    wiretap channel. Additionally, we discuss the transmission designs with
    discrete input signals for other systems using MIMO technology. Finally,
    technical challenges which remain unresolved at the time of writing are
    summarized and the future trends of transmission designs with discrete input
    signals are addressed.
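
    As background for the mutual information driven designs surveyed above, the basic
    quantity involved, the mutual information of a finite-alphabet input over an AWGN
    channel, can be estimated by Monte Carlo. The sketch below does this for QPSK; it
    is a generic textbook computation, not a scheme from the survey.

    import numpy as np

    def awgn_mutual_info(constellation, snr_db, num_mc=20000, rng=None):
        """Monte-Carlo estimate of I(X;Y) in bits for a uniform discrete input
        over a complex AWGN channel y = x + n with n ~ CN(0, sigma^2)."""
        if rng is None:
            rng = np.random.default_rng(0)
        s = constellation / np.sqrt(np.mean(np.abs(constellation) ** 2))  # unit average power
        M = len(s)
        sigma2 = 10 ** (-snr_db / 10)
        n = np.sqrt(sigma2 / 2) * (rng.standard_normal(num_mc) + 1j * rng.standard_normal(num_mc))
        loss = 0.0
        for xi in s:   # E_n[ log2 sum_j exp(-(|x_i + n - x_j|^2 - |n|^2) / sigma^2) ]
            d = np.abs(xi + n[:, None] - s[None, :]) ** 2 - np.abs(n[:, None]) ** 2
            loss += np.mean(np.log2(np.sum(np.exp(-d / sigma2), axis=1)))
        return np.log2(M) - loss / M

    qpsk = np.exp(1j * np.pi * (2 * np.arange(4) + 1) / 4)
    for snr in [0, 5, 10, 15]:
        gauss = np.log2(1 + 10 ** (snr / 10))
        print(f"{snr:2d} dB: QPSK {awgn_mutual_info(qpsk, snr):.3f} bits, Gaussian capacity {gauss:.3f} bits")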

    Beyond WYSIWYG: Sharing Contextual Sensing Data Through mmWave V2V Communications

    Cristina Perfecto, Javier Del Ser, Mehdi Bennis, Miren Nekane Bilbao
    Comments: 6 pages, 4 figures
    Subjects: Information Theory (cs.IT)

    In vehicular scenarios context awareness is a key enabler for road safety.
    However, the amount of contextual information that can be collected by a
    vehicle is stringently limited by the sensor technology itself (e.g.,
    line-of-sight, coverage, weather robustness) and by the low bandwidths offered
    by current wireless vehicular technologies such as DSRC/802.11p. Motivated by
    the upsurge of research around millimeter-wave vehicle-to-anything (V2X)
    communications, in this work we propose a distributed vehicle-to-vehicle (V2V)
    association scheme that considers a quantitative measure of the potential value
    of the shared contextual information in the pairing process. First, we properly
    define the utility function of every vehicle balancing classical channel state
    and queuing state information (CSI/QSI) with context information i.e., sensing
    content resolution, timeliness and enhanced range of the sensing. Next we solve
    the problem via a distributed many-to-one matching game in a junction scenario
    with realistic vehicular mobility traces. It is shown that when receivers are
    able to leverage information from different sources, the average volume of
    collected extended sensed information under our proposed scheme is up to 71%
    more than that of distance and minimum delay-based matching baselines.
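
    To make the matching step concrete, below is a generic many-to-one
    deferred-acceptance sketch in Python with synthetic utilities standing in for the
    paper's CSI/QSI-plus-context valuations; the quota, the random utilities and the
    proposer/acceptor roles are illustrative assumptions rather than the authors'
    exact formulation.

    import numpy as np

    rng = np.random.default_rng(2)
    num_tx, num_rx, quota = 8, 3, 3        # transmitters, receivers, per-receiver quota (assumed)

    # Synthetic utilities in place of the CSI/QSI-plus-context valuations of the paper.
    u_tx = rng.random((num_tx, num_rx))    # how much each transmitter values each receiver
    u_rx = rng.random((num_rx, num_tx))    # how much each receiver values each transmitter

    prefs = [list(np.argsort(-u_tx[t])) for t in range(num_tx)]   # each tx's receivers, best first
    matched = {r: [] for r in range(num_rx)}
    free = list(range(num_tx))

    # Deferred acceptance: free transmitters propose down their preference lists;
    # each receiver keeps its `quota` best proposers and rejects the rest.
    while free:
        t = free.pop(0)
        if not prefs[t]:
            continue                        # exhausted its list, stays unmatched
        r = prefs[t].pop(0)
        matched[r].append(t)
        if len(matched[r]) > quota:
            worst = min(matched[r], key=lambda x: u_rx[r, x])
            matched[r].remove(worst)
            free.append(worst)

    for r, ts in matched.items():
        print(f"receiver {r}: transmitters {sorted(ts)}")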

    Investigation of Wireless Channel Asymmetry in Indoor Environments

    Ehab Salahat, Ahmed Kulaib, Nazar Ali, Raed Shubair
    Comments: Accepted in IEEE International Symposium on Antennas and Propagation (APS17), San Diego, California, 9-14 Jul. 2017. arXiv admin note: substantial text overlap with arXiv:1704.06874
    Subjects: Information Theory (cs.IT)

    Asymmetry is unquestionably an important characteristic of the wireless
    propagation channel, which needs to be accurately modeled for wireless and
    mobile communications, 5G networks, and associated applications such as
    indoor/outdoor localization. This paper reports on the potential causes of
    propagation asymmetry. Practical channel measurements at Khalifa University
    premises showed that wireless channels are asymmetric in realistic scenarios.
    Some important conclusions and recommendations are also summarized.

    Limitations on Transversal Computation through Quantum Homomorphic Encryption

    Michael Newman, Yaoyun Shi
    Comments: 22 pages, 2 figures, comments welcome!
    Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR); Information Theory (cs.IT)

    Transversality is a simple and effective method for implementing quantum
    computation fault-tolerantly. However, no quantum error-correcting code (QECC)
    can transversally implement a quantum universal gate set (Eastin and Knill,
    Phys. Rev. Lett., 102, 110502). Since reversible classical computation is often
    a dominating part of useful quantum computation, whether or not it can be
    implemented transversally is an important open problem. We show that, other
    than a small set of non-additive codes that we cannot rule out, no binary QECC
    can transversally implement a classical reversible universal gate set. In
    particular, no such QECC can implement the Toffoli gate transversally.

    We prove our result by constructing an information theoretically secure (but
    inefficient) quantum homomorphic encryption (ITS-QHE) scheme inspired by Ouyang
    et al. (arXiv:1508.00938). Homomorphic encryption allows the implementation of
    certain functions directly on encrypted data, i.e. homomorphically. Our scheme
    builds on almost any QECC, and implements that code’s transversal gate set
    homomorphically. We observe a restriction imposed by Nayak’s bound (FOCS 1999)
    on ITS-QHE, implying that any ITS quantum fully homomorphic scheme (ITS-QFHE)
    implementing the full set of classical reversible functions must be highly
    inefficient. While our scheme incurs exponential overhead, any such QECC
    implementing Toffoli transversally would still violate this lower bound through
    our scheme.

    Estimating Sparse Signals Using Integrated Wideband Dictionaries

    Maksim Butsenko, Johan Swärd, Andreas Jakobsson
    Subjects: Methodology (stat.ME); Information Theory (cs.IT)

    In this paper, we introduce a wideband dictionary framework for estimating
    sparse signals. By formulating integrated dictionary elements spanning bands of
    the considered parameter space, one may efficiently find and discard large
    parts of the parameter space not active in the signal. After each iteration,
    the zero-valued parts of the dictionary may be discarded to allow a refined
    dictionary to be formed around the active elements, resulting in a zoomed
    dictionary to be used in the following iterations. Implementing this scheme
    allows for more accurate estimates, at a much lower computational cost, as
    compared to directly forming a larger dictionary spanning the whole parameter
    space or performing a zooming procedure using standard dictionary elements.
    Different from traditional dictionaries, the wideband dictionary allows for the
    use of dictionaries with fewer elements than the number of available samples
    without loss of resolution. The technique may be used on both one- and
    multi-dimensional signals, and may be exploited to refine several traditional
    sparse estimators, here illustrated with the LASSO and the SPICE estimators.
    Numerical examples illustrate the improved performance.
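
    A rough sketch of the coarse-to-fine idea with the LASSO is given below: stage one
    fits wideband atoms (approximated here by averaging narrowband cosine atoms over
    each band), inactive bands are discarded, and stage two refits a zoomed dictionary
    only inside the surviving bands. The frequencies, band counts and regularization
    weights are illustrative choices, not the paper's integrated-dictionary
    construction.

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(3)
    N = 128
    n = np.arange(N)
    true_freqs = [0.109, 0.331]                        # normalized frequencies (assumed)
    y = sum(np.cos(2 * np.pi * f * n) for f in true_freqs) + 0.05 * rng.standard_normal(N)

    def cos_atoms(freqs):
        """Unit-norm cosine atoms, one column per frequency."""
        D = np.cos(2 * np.pi * np.outer(n, freqs))
        return D / np.linalg.norm(D, axis=0)

    # Stage 1: wideband atoms, each an averaged (numerically integrated) band of cosines.
    num_bands, per_band = 16, 32
    edges = np.linspace(0.0, 0.5, num_bands + 1)
    wide = np.column_stack([
        cos_atoms(np.linspace(lo, hi, per_band, endpoint=False)).mean(axis=1)
        for lo, hi in zip(edges[:-1], edges[1:])])
    wide /= np.linalg.norm(wide, axis=0)
    coarse = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(wide, y)
    active = np.flatnonzero(np.abs(coarse.coef_) > 0.1 * np.abs(coarse.coef_).max())
    print("active bands:", active)

    # Stage 2: zoomed dictionary of narrowband atoms inside the active bands only.
    fine_freqs = np.concatenate([np.linspace(edges[b], edges[b + 1], 40) for b in active])
    fine = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(cos_atoms(fine_freqs), y)
    print("dominant grid frequencies:", fine_freqs[np.abs(fine.coef_) > 0.5])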

    How Close Can I Be? – A Comprehensive Analysis of Cellular Interference on ATC Radar

    Neelakantan Nurani Krishnan, Ratnesh Kumbhkar, Narayan B. Mandayam, Ivan Seskar, Sastry Kompella
    Comments: This paper has been submitted to IEEE Globecom 2017
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    Increasing data traffic demands over wireless spectrum have necessitated
    spectrum sharing and coexistence between heterogeneous systems such as radar
    and cellular communications systems. In this context, we specifically
    investigate the co-channel coexistence between an air traffic control (ATC)
    radar and a wide area cellular communication (comms) system. We present a
    comprehensive characterization and analysis of interference caused by the comms
    system on the ATC radar with respect to multiple parameters such as radar
    range, protection radius around the radar, and radar antenna elevation angle.
    The analysis suggests that maintaining a protection radius of 50 km around the
    radar will ensure the required INR protection criterion of -10 dB at the radar
    receiver with ~0.9 probability, even when the radar beam is in the same horizon
    as the comms BS. Detailed evaluations of the radar target detection performance
    provide a framework to choose appropriate protection radii around the radar to
    meet specific performance requirements.

    Denoising Linear Models with Permuted Data

    Ashwin Pananjady, Martin J. Wainwright, Thomas A. Courtade
    Comments: To appear in part at ISIT 2017, Aachen
    Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Statistics Theory (math.ST)

    The multivariate linear regression model with shuffled data and additive
    Gaussian noise arises in various correspondence estimation and matching
    problems. Focusing on the denoising aspect of this problem, we provide a
    characterization of the minimax error rate that is sharp up to logarithmic
    factors. We also analyze the performance of two versions of a computationally
    efficient estimator, and establish their consistency for a large range of input
    parameters. Finally, we provide an exact algorithm for the noiseless problem
    and demonstrate its performance on an image point-cloud matching task. Our
    analysis also extends to datasets with outliers.



