
    arXiv Paper Daily: Mon, 10 Apr 2017

    Published by 我爱机器学习 (52ml.net) on 2017-04-10 00:00:00

    Neural and Evolutionary Computing

    Threat analysis of IoT networks Using Artificial Neural Network Intrusion Detection System

    Elike Hodo, Xavier Bellekens, Andrew Hamilton, Pierre-louis Dubouilh, Ephraim Iorkyase, Christos Tachtatzis, Robert Atkinson
    Comments: Published in The 2016 International Symposium on Networks, Computers and Communications (IEEE ISNCC’16) , Hammamet, Tunisia, 2016
    Subjects: Neural and Evolutionary Computing (cs.NE); Cryptography and Security (cs.CR)

    The Internet of things (IoT) is still in its infancy and has attracted much
    interest in many industrial sectors including medical fields, logistics
    tracking, smart cities and automobiles. However as a paradigm, it is
    susceptible to a range of significant intrusion threats. This paper presents a
    threat analysis of the IoT and uses an Artificial Neural Network (ANN) to
    combat these threats. A multi-layer perceptron, a type of supervised ANN, is
    trained using internet packet traces and then assessed on its ability to thwart
    Distributed Denial of Service (DDoS/DoS) attacks. This paper focuses on the
    classification of normal and threat patterns on an IoT network. The ANN
    procedure is validated against a simulated IoT network. The experimental
    results demonstrate 99.4% accuracy, with the model successfully detecting
    various DDoS/DoS attacks.
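
    A minimal sketch of the pipeline described above, using scikit-learn's MLP as
    a stand-in for the paper's supervised ANN. The per-flow features, network
    size, and train/test protocol are assumptions, not the paper's setup.

    ```python
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    def train_ids(X, y):
        """Train a binary intrusion detector.

        X: per-flow features extracted from packet traces (e.g., packet
        rates, byte counts); y: 0 = normal, 1 = DDoS/DoS traffic."""
        model = make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500),
        )
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3)
        model.fit(X_tr, y_tr)
        return model, model.score(X_te, y_te)  # accuracy on held-out traffic
    ```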

    The (1+λ) Evolutionary Algorithm with Self-Adjusting Mutation Rate

    Benjamin Doerr, Christian Gießen, Carsten Witt, Jing Yang
    Comments: An extended abstract of this report will appear in the proceedings of the 2017 Genetic and Evolutionary Computation Conference (GECCO 2017)
    Subjects: Neural and Evolutionary Computing (cs.NE)

    We propose a new way to self-adjust the mutation rate in population-based
    evolutionary algorithms. Roughly speaking, it consists of creating half the
    offspring with a mutation rate that is twice the current mutation rate and the
    other half with half the current rate. The mutation rate is then updated to the
    rate used in that subpopulation which contains the best offspring.
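
    The two-rate mechanism can be written down directly. Below is a minimal
    sketch on OneMax, based only on the description above; the initial rate, the
    elitist acceptance rule, and the clamping bounds on r are assumptions.

    ```python
    import random

    def onemax(x):
        return sum(x)

    def one_plus_lambda_two_rate(n=100, lam=10, r=2.0, seed=0):
        """(1+lambda) EA with the self-adjusting mutation rate sketched in
        the abstract: half the offspring mutate with rate 2r/n, the other
        half with rate r/(2n); r is then set to the rate of the subpopulation
        containing the best offspring."""
        rng = random.Random(seed)
        parent = [rng.randint(0, 1) for _ in range(n)]
        evals = 0
        while onemax(parent) < n:
            best, best_fit, best_r = None, -1, r
            for i in range(lam):
                ri = 2 * r if i < lam // 2 else r / 2
                p = ri / n  # per-bit mutation probability
                child = [b ^ (rng.random() < p) for b in parent]
                evals += 1
                fit = onemax(child)
                if fit > best_fit:
                    best, best_fit, best_r = child, fit, ri
            if best_fit >= onemax(parent):    # elitist (1+lambda) acceptance
                parent = best
            r = min(max(best_r, 2.0), n / 4)  # keep r in [2, n/4] (assumed)
        return evals
    ```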

    We analyze how the $(1+\lambda)$ evolutionary algorithm with this
    self-adjusting mutation rate optimizes the OneMax test function. We prove that
    this dynamic version of the $(1+\lambda)$ EA finds the optimum in an expected
    optimization time (number of fitness evaluations) of
    $O(n\lambda/\log\lambda + n\log n)$. This time is asymptotically smaller than
    the optimization time of the classic $(1+\lambda)$ EA. Previous work shows that
    this performance is best-possible among all $\lambda$-parallel mutation-based
    unbiased black-box algorithms.

    This result shows that the new way of adjusting the mutation rate can find
    optimal dynamic parameter values on the fly. Since our adjustment mechanism is
    simpler than the ones previously used for adjusting the mutation rate and does
    not have parameters itself, we are optimistic that it will find other
    applications.

    Evolution in Groups: A deeper look at synaptic cluster driven evolution of deep neural networks

    Mohammad Javad Shafiee, Elnaz Barshan, Alexander Wong
    Comments: 8 pages. arXiv admin note: substantial text overlap with arXiv:1609.01360
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    A promising paradigm for achieving highly efficient deep neural networks is
    the idea of evolutionary deep intelligence, which mimics biological evolution
    processes to progressively synthesize more efficient networks. A crucial design
    factor in evolutionary deep intelligence is the genetic encoding scheme used to
    simulate heredity and determine the architectures of offspring networks. In
    this study, we take a deeper look at the notion of synaptic cluster-driven
    evolution of deep neural networks which guides the evolution process towards
    the formation of a highly sparse set of synaptic clusters in offspring
    networks. Utilizing a synaptic cluster-driven genetic encoding, the
    probabilistic encoding of synaptic traits considers not only individual
    synaptic properties but also inter-synaptic relationships within a deep neural
    network. This process results in highly sparse offspring networks which are
    particularly tailored for parallel computational devices such as GPUs and deep
    neural network accelerator chips. Comprehensive experimental results using four
    well-known deep neural network architectures (LeNet-5, AlexNet, ResNet-56, and
    DetectNet) on two different tasks (object categorization and object detection)
    demonstrate the efficiency of the proposed method. The cluster-driven genetic
    encoding scheme synthesizes networks that can achieve state-of-the-art
    performance with a significantly smaller number of synapses than the original
    ancestor network ($\sim$125-fold decrease in synapses for MNIST).
    Furthermore, the improved cluster efficiency in the generated offspring
    networks ($\sim$9.71-fold decrease in clusters for MNIST and a $\sim$8.16-fold
    decrease in clusters for KITTI) is particularly useful for accelerated
    performance on parallel computing hardware architectures such as those in GPUs
    and deep neural network accelerator chips.
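
    As a rough illustration of cluster-driven probabilistic heredity, the sketch
    below samples an offspring's synapses in two stages, first at the cluster
    level and then at the synapse level. The strength-proportional survival
    probabilities and the alpha/beta scaling constants are assumptions; the
    paper's exact encoding may differ.

    ```python
    import numpy as np

    def sample_offspring(W, clusters, alpha=0.7, beta=0.7, rng=None):
        """W: flattened parent weights; clusters: list of index arrays into
        W (e.g., one array per filter or per neuron fan-in). A cluster
        survives with probability proportional to its mean synaptic
        strength; each synapse in a surviving cluster then survives with
        probability proportional to its own strength."""
        rng = rng or np.random.default_rng(0)
        W_child = np.zeros_like(W)
        strength = np.abs(W) / (np.abs(W).max() + 1e-12)
        for idx in clusters:
            if rng.random() < alpha * strength[idx].mean():   # cluster level
                keep = rng.random(len(idx)) < beta * strength[idx]
                W_child[idx[keep]] = W[idx[keep]]             # synapse level
        return W_child
    ```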

    A Software-equivalent SNN Hardware using RRAM-array for Asynchronous Real-time Learning

    Aditya Shukla, Vinay Kumar, Udayan Ganguly
    Comments: Eight pages, ten figures and two tables
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Spiking Neural Networks (SNNs) naturally invite hardware implementation, as
    they are based on biology. For learning, spike-timing dependent plasticity
    (STDP) may be implemented using an energy-efficient waveform superposition
    on memristor-based synapses. However, system-level implementation faces
    three challenges. First, a classic dilemma is that recognition requires
    current reading through short voltage spikes, which is disturbed by the
    large voltage waveforms simultaneously applied to the same memristor for
    real-time learning, i.e. the simultaneous read-write dilemma. Second, the
    hardware needs to exactly replicate the software implementation for easy
    adaptation of the algorithm to hardware. Third, the devices used in hardware
    simulations must be realistic. In this paper, we present an approach that
    addresses these concerns. First, learning and recognition occur in separate
    arrays simultaneously and in real time, asynchronously, avoiding complex,
    non-biomimetic clocking-based signal management. Second, we show that the
    hardware emulates the software at every stage, by comparing a SPICE
    (circuit-simulator) implementation with a MATLAB implementation of the
    mathematical SNN algorithm. As an example, the hardware shows 97.5 per cent
    classification accuracy on the Fisher Iris dataset, equivalent to software.
    Third, STDP is implemented using a model of a synaptic device based on an
    HfO2 memristor. We show that an increasingly realistic memristor model
    slightly reduces the hardware performance (85 per cent), which highlights
    the need to engineer RRAM characteristics specifically for SNNs.
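
    For reference, a common pair-based STDP window has the exponential form
    below; this is a generic software model of STDP, not the paper's memristor
    waveform-superposition implementation.

    ```python
    import numpy as np

    def stdp_dw(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20e-3):
        """Weight change for a pre/post spike pair: potentiate when the
        presynaptic spike precedes the postsynaptic one, depress otherwise.
        Amplitudes and the 20 ms time constant are assumed values."""
        dt = t_post - t_pre
        return np.where(dt >= 0,
                        a_plus * np.exp(-dt / tau),
                        -a_minus * np.exp(dt / tau))
    ```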

    A Constrained Sequence-to-Sequence Neural Model for Sentence Simplification

    Yaoyuan Zhang, Zhenxu Ye, Yansong Feng, Dongyan Zhao, Rui Yan
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Sentence simplification reduces semantic complexity to benefit people with
    language impairments. Previous simplification studies at the sentence level
    and at the word level have achieved promising results but also face great
    challenges. In sentence-level studies, sentences are fluent after
    simplification but sometimes not actually simpler. In word-level studies,
    words are simplified but may introduce grammatical errors, because word usage
    differs before and after simplification. In this paper, we propose a two-step
    simplification framework that combines word-level and sentence-level
    simplification, making use of their respective advantages. Based on this
    two-step framework, we implement a novel constrained neural generation model
    that simplifies sentences given simplified words. Results on aligned Wikipedia
    and Simple Wikipedia datasets indicate that our method yields better
    performance than various baselines.

    Associative content-addressable networks with exponentially many robust stable states

    Rishidev Chaudhuri, Ila Fiete
    Comments: 35 pages, 7 figures
    Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE)

    The brain must robustly store a large number of memories, corresponding to
    the many events and scenes a person encounters over a lifetime. However, the
    number of memory states in existing neural network models either grows weakly
    with network size or recall performance fails catastrophically with vanishingly
    little noise. Here we show that it is possible to construct an associative
    content-addressable memory (ACAM) with exponentially many stable states and
    robust error-correction. The network possesses expander graph connectivity on a
    restricted Boltzmann machine architecture. The expansion property allows simple
    neural network dynamics to perform on par with modern error-correcting codes.
    Appropriate networks can be constructed with sparse random connections combined
    with glomerular nodes and a local associative learning rule, using low
    dynamic-range weights. Thus, sparse quasi-random constraint structures —
    characteristic of an important class of modern error-correcting codes — may
    provide for high-performance computation in artificial neural networks and the
    brain.
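
    The analogy between expander connectivity and error-correcting codes can be
    made concrete with a classic bit-flipping decoder on a sparse bipartite
    constraint graph; this illustrates the style of simple, local dynamics the
    abstract invokes and is not the paper's ACAM model.

    ```python
    import numpy as np

    def bit_flip_recall(H, x, iters=50):
        """Gallager-style bit flipping. H: (m, n) 0/1 matrix whose rows are
        sparse constraint ('glomerular') nodes over n binary neurons; x:
        noisy binary state to clean up."""
        x = x.copy()
        for _ in range(iters):
            unsat = (H @ x) % 2                 # violated constraints
            votes = H.T @ unsat                 # violations touching each bit
            flip = votes > H.sum(axis=0) / 2    # flip majority-violated bits
            if not flip.any():
                break
            x[flip] ^= 1
        return x
    ```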


    Computer Vision and Pattern Recognition

    Adversarial Generator-Encoder Networks

    Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a new autoencoder-type architecture that is trainable in an
    unsupervised mode, sustains both generation and inference, and has the quality
    of conditional and unconditional samples boosted by adversarial learning.
    Unlike previous hybrids of autoencoders and adversarial networks, the
    adversarial game in our approach is set up directly between the encoder and the
    generator, and no external mappings are trained in the process of learning. The
    game objective compares the divergences of each of the real and the generated
    data distributions with the canonical distribution in the latent space. We show
    that the direct generator-vs-encoder game leads to a tight coupling of the two
    components, resulting in samples and reconstructions of a quality comparable
    to some recently proposed, more complex architectures.
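
    One way to instantiate such a latent-space game objective is a moment-based
    divergence of a batch of codes from the canonical unit-Gaussian prior. The
    estimator below and the game arrangement in the comments are illustrative
    assumptions, not the paper's exact formulation.

    ```python
    import numpy as np

    def kl_to_unit_gaussian(z):
        """KL(q || N(0, I)) assuming q is Gaussian with the batch's
        per-dimension mean and variance; z is an (N, d) batch of codes."""
        mu = z.mean(axis=0)
        var = z.var(axis=0) + 1e-8
        return 0.5 * np.sum(var + mu**2 - 1.0 - np.log(var))

    # Encoder e and generator g then play a minimax game on statistics like:
    #   loss_e = kl_to_unit_gaussian(e(x_real)) - kl_to_unit_gaussian(e(g(z)))
    #   loss_g = kl_to_unit_gaussian(e(g(z)))
    # (signs, weights and reconstruction terms per the paper are omitted).
    ```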

    Deep Unsupervised Similarity Learning using Partially Ordered Sets

    Miguel A. Bautista, Artsiom Sanakoyeu, Björn Ommer
    Comments: Accepted for publication at IEEE Computer Vision and Pattern Recognition 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Unsupervised learning of visual similarities is of paramount importance to
    computer vision, particularly due to the lack of training data for fine-grained
    similarities. Deep learning of similarities is often based on relationships
    between pairs or triplets of samples. Many of these relations are unreliable
    and mutually contradicting, implying inconsistencies when trained without
    supervision information that relates different tuples or triplets to each
    other. To overcome this problem, we use local estimates of reliable
    (dis-)similarities to initially group samples into compact surrogate classes
    and use local partial orders of samples to classes to link classes to each
    other. Similarity learning is then formulated as a partial ordering task with
    soft correspondences of all samples to classes. Adopting a strategy of
    self-supervision, a CNN is trained to optimally represent samples in a mutually
    consistent manner while updating the classes. The similarity learning and
    grouping procedure are integrated in a single model and optimized jointly. The
    proposed unsupervised approach shows competitive performance on detailed pose
    estimation and object classification.

    Learned Watershed: End-to-End Learning of Seeded Segmentation

    Steffen Wolf, Lukas Schott, Ullrich Köthe, Fred Hamprecht
    Comments: The first two authors contributed equally
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Learned boundary maps are known to outperform hand-crafted ones as a basis
    for the watershed algorithm. We show, for the first time, how to train
    watershed computation jointly with boundary map prediction. The estimator for
    the merging priorities is cast as a neural network that is convolutional
    (over space) and recurrent (over iterations). The latter allows learning of
    complex shape priors. The method gives the best known seeded segmentation
    results on the CREMI segmentation challenge.

    Clothing and People – A Social Signal Processing Perspective

    Maedeh Aghaei, Federico Parezzan, Mariella Dimiccoli, Petia Radeva, Marco Cristani
    Comments: To appear in the 12th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2017)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In our society and century, clothing is no longer used only as a means of
    body protection. Our paper builds upon the evidence, studied within the social
    sciences, that clothing brings a clear communicative message in terms of social
    signals, influencing the impression and behaviour of others towards a person.
    In fact, clothing correlates with personality traits, both in terms of
    self-assessment and assessments that unacquainted people give to an individual.
    The consequences of these facts are important: the influence of clothing on the
    decision making of individuals has been investigated in the literature, showing
    that it represents a discriminative factor to differentiate among diverse
    groups of people. Unfortunately, this has been observed after cumbersome and
    expensive manual annotations, on very restricted populations, limiting the
    scope of the resulting claims. With this position paper, we want to sketch the
    main steps of the very first systematic analysis, driven by social signal
    processing techniques, of the relationship between clothing and social signals,
    both sent and perceived. Thanks to human parsing technologies, which exhibit
    high robustness owing to deep learning architectures, we are now able to
    isolate visual patterns characterising a large variety of garments. These
    algorithms will be used to capture statistical relations on a large corpus of
    evidence to confirm the sociological findings and to go beyond the state of the
    art.

    Hand3D: Hand Pose Estimation using 3D Neural Network

    Xiaoming Deng, Shuo Yang, Yinda Zhang, Ping Tan, Liang Chang, Hongan Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a novel 3D neural network architecture for 3D hand pose estimation
    from a single depth image. Different from previous works that mostly operate
    in the 2D depth image domain and require intermediate or post-processing
    steps to bring in supervision from 3D space, we convert the depth map to a 3D
    volumetric representation and feed it into a 3D convolutional neural network
    (CNN) to directly produce the pose in 3D with no further processing. Our
    system does
    not require the ground truth reference point for initialization, and our
    network architecture naturally integrates both local feature and global context
    in 3D space. To increase the coverage of the hand pose space of the training
    data, we render synthetic depth images by transferring hand poses from
    existing real image datasets. We evaluate our algorithm on two public
    benchmarks and achieve state-of-the-art performance. The synthetic hand pose
    dataset will be made available.
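
    One plausible form of the depth-to-volume conversion is a binary occupancy
    grid, sketched below; whether the paper uses occupancy, a TSDF, or another
    volumetric encoding is not stated in the abstract.

    ```python
    import numpy as np

    def depth_to_occupancy(depth, d_min, d_max, grid=32):
        """Map an (h, w) depth image to a (grid, grid, grid) boolean volume:
        each valid pixel occupies one voxel, with depth quantized along z."""
        h, w = depth.shape
        vox = np.zeros((grid, grid, grid), dtype=bool)
        ys, xs = np.nonzero((depth > d_min) & (depth < d_max))
        zs = ((depth[ys, xs] - d_min) / (d_max - d_min) * (grid - 1)).astype(int)
        vox[xs * grid // w, ys * grid // h, zs] = True
        return vox
    ```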

    Investigating Natural Image Pleasantness Recognition using Deep Features and Eye Tracking for Loosely Controlled Human-computer Interaction

    Hamed R. Tavakoli, Jorma Laaksonen, Esa Rahtu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper revisits recognition of natural image pleasantness by employing
    deep convolutional neural networks and affordable eye trackers. There exist
    several approaches to recognize image pleasantness: (1) computer vision, and
    (2) psychophysical signals. For natural images, computer vision approaches
    have not been as successful as for abstract paintings and are lagging behind
    psychophysical signals like eye movements. Despite their better results, the
    scalability of eye movements is adversely affected by sensor cost. While the
    introduction of affordable sensors has helped the scalability issue by making
    the sensors more accessible, the application of such sensors in a loosely
    controlled human-computer interaction setup has not yet been studied for
    affective image tagging. On the other hand, deep convolutional neural networks
    have boosted the performance of vision-based techniques significantly in recent
    years. To investigate the current status in regard to affective image tagging,
    we (1) introduce a new eye movement dataset using an affordable eye tracker,
    (2) study the use of deep neural networks for pleasantness recognition, (3)
    investigate the gap between deep features and eye movements. To these ends,
    we record eye movements in a less controlled setup, akin to daily
    human-computer interaction. We assess features from eye movements, visual
    features, and their combination. Our results show that (1) recognizing natural
    image pleasantness from eye movements in a less restricted setup is difficult,
    and previously used techniques are prone to fail, and (2) visual class
    categories are strong cues for predicting pleasantness, due to their
    correlation with emotions, necessitating careful study of this phenomenon. The
    latter finding is alarming, as some deep learning approaches may fit to the
    class-category bias.

    DeepCoder: Semi-parametric Variational Autoencoders for Facial Action Unit Intensity Estimation

    Dieu Linh Tran, Robert Walecki, Ognjen (Oggi) Rudovic, Stefanos Eleftheriadis, Bjoern Schuller, Maja Pantic
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Variational (deep) parametric auto-encoders (VAEs) have shown great
    potential for unsupervised extraction of latent representations from large
    amounts of data. The human face exhibits an inherent hierarchy in its facial
    representations (encoded in facial action units (AUs) and their intensity).
    This makes VAEs a sophisticated method for learning facial features for AU
    intensity estimation. Yet, most existing methods apply classifiers learned
    separately from the encoded features. On the other hand, non-parametric
    (probabilistic) approaches, such as Gaussian Processes (GPs), typically
    outperform their parametric counterparts, but cannot deal easily with large
    amounts of data. In this paper, we propose a novel VAE semi-parametric modeling
    framework, named DeepCoder, which combines the modeling power of parametric
    (convolutional) and nonparametric (ordinal GPs) VAEs, for joint learning of (1)
    latent representations at multiple levels in a task hierarchy, and (2)
    classification of multiple ordinal outputs (AU intensities). We show on
    benchmark datasets for AU intensity estimation that the proposed DeepCoder
    significantly outperforms state-of-the-art approaches, and related parametric
    VAEs, deep learning and parametric models.

    High-Quality Correspondence and Segmentation Estimation for Dual-Lens Smart-Phone Portraits

    Xiaoyong Shen, Hongyun Gao, Xin Tao, Chao Zhou, Jiaya Jia
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Estimating correspondence between two images and extracting the foreground
    object are two challenges in computer vision. With dual-lens smartphones such
    as the iPhone 7 Plus and Huawei P9 coming onto the market, two images of
    slightly different views provide new information for unifying the two topics.
    We propose
    a joint method to tackle them simultaneously via a joint fully connected
    conditional random field (CRF) framework. The regional correspondence is used
    to handle textureless regions in matching and make our CRF system
    computationally efficient. Our method is evaluated over 2,000 new image pairs,
    and produces promising results on challenging portrait images.

    Privacy-Preserving Visual Learning Using Doubly Permuted Homomorphic Encryption

    Ryo Yonetani, Vishnu Naresh Boddeti, Kris M. Kitani, Yoichi Sato
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)

    We propose a privacy-preserving framework for learning visual classifiers by
    leveraging distributed private image data. This framework is designed to
    aggregate multiple classifiers updated locally using private data and to ensure
    that no private information about the data is exposed during its learning
    procedure. We utilize a homomorphic cryptosystem that can aggregate the local
    classifiers while they are encrypted and thus kept secret. To overcome the high
    computational cost of homomorphic encryption of high-dimensional classifiers,
    we (1) impose sparsity constraints on local classifier updates and (2) propose
    a novel efficient encryption scheme named doubly-permuted homomorphic
    encryption (DPHE) which is tailored to sparse high-dimensional data. DPHE (i)
    decomposes sparse data into its constituent non-zero values and their
    corresponding support indices, (ii) applies homomorphic encryption only to the
    non-zero values, and (iii) employs double permutations on the support indices
    to make them secret. Our experimental evaluation on several public datasets
    demonstrates that the proposed approach significantly outperforms other
    privacy-preserving methods and achieves performance comparable to
    state-of-the-art visual recognition methods without privacy preservation.
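
    The three DPHE steps listed above can be sketched directly: split a sparse
    vector into non-zero values plus support, encrypt only the values, and hide
    the support by permutation. Here `enc` is a stand-in for a real additively
    homomorphic scheme (e.g., Paillier), and the abstract's double permutation
    is simplified to a single one.

    ```python
    import numpy as np

    def dphe_encrypt_sketch(w, enc, perm):
        """w: sparse 1-D classifier update; enc: homomorphic encryption
        stand-in applied per value; perm: a secret permutation of
        range(len(w)), e.g. np.random.permutation(len(w))."""
        support = np.flatnonzero(w)               # (i) non-zeros and indices
        ct_values = [enc(v) for v in w[support]]  # (ii) encrypt values only
        hidden_support = perm[support]            # (iii) permute the support
        return ct_values, hidden_support
    ```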

    Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor

    Franziska Mueller, Dushyant Mehta, Oleksandr Sotnychenko, Srinath Sridhar, Dan Casas, Christian Theobalt
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present an approach for real-time, robust and accurate hand pose
    estimation from moving egocentric RGB-D cameras in cluttered real environments.
    Existing methods typically fail for hand-object interactions in cluttered
    scenes imaged from egocentric viewpoints, common for virtual or augmented
    reality applications. Our approach uses two subsequently applied Convolutional
    Neural Networks (CNNs) to localize the hand and regress 3D joint locations.
    Hand localization is achieved by using a CNN to estimate the 2D position of the
    hand center in the input, even in the presence of clutter and occlusions. The
    localized hand position, together with the corresponding input depth value, is
    used to generate a normalized cropped image that is fed into a second CNN to
    regress relative 3D hand joint locations in real time. For added accuracy,
    robustness and temporal stability, we refine the pose estimates using a
    kinematic pose tracking energy. To train the CNNs, we introduce a new
    photorealistic dataset that uses a merged reality approach to capture and
    synthesize large amounts of annotated data of natural hand interaction in
    cluttered scenes. Through quantitative and qualitative evaluation, we show that
    our method is robust to self-occlusion and occlusions by objects, particularly
    in moving egocentric perspectives.

    Could you guess an interesting movie from the posters?: An evaluation of vision-based features on movie poster database

    Yuta Matsuzaki, Kazushige Okayasu, Takaaki Imanari, Naomichi Kobayashi, Yoshihiro Kanehara, Ryousuke Takasawa, Akio Nakamura, Hirokatsu Kataoka
    Comments: 4 pages, 4 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we aim to estimate the winner of a world-wide film festival
    from the exhibited movie poster. The task is extremely challenging because
    the estimation must be done with only an exhibited movie poster, without any
    film ratings or box-office takings. To tackle this problem, we have created a
    new database which consists of all movie posters included in the four biggest
    film festivals. The movie poster database (MPDB) contains over 80 years of
    historic movies, each nominated for a movie award in its year. We apply
    several feature types, namely hand-crafted, mid-level, and deep features, to
    extract various information from a movie poster. Our experiments yield
    suggestive findings; for example, Academy Award estimation achieves a better
    rate with a color feature, and a facial emotion feature generally performs
    well on the MPDB. The paper suggests the possibility of modeling human taste
    for movie recommendation.

    Semi-Latent GAN: Learning to generate and modify facial images from attributes

    Weidong Yin, Yanwei Fu, Leonid Sigal, Xiangyang Xue
    Comments: 10 pages, submitted to ICCV 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Generating and manipulating human facial images using high-level attribute
    controls are important and interesting problems. The models proposed in
    previous work can solve one of these two problems (generation or
    manipulation), but not both coherently. This paper proposes a novel model
    that learns how to both generate and modify facial images from high-level
    semantic attributes. Our key idea is to formulate a Semi-Latent Facial
    Attribute Space (SL-FAS) to systematically learn the relationship between
    user-defined and latent attributes, as well as between those attributes and
    RGB imagery. As part of this newly formulated space, we propose a new model,
    SL-GAN, which is a specific form of Generative Adversarial Network. Finally,
    we present an iterative training algorithm for SL-GAN. Experiments on the
    recent CelebA and CASIA-WebFace datasets validate the effectiveness of our
    proposed framework. We will also make data, pre-trained models and code
    available.

    Egocentric Video Description based on Temporally-Linked Sequences

    Marc Bolaños, Álvaro Peris, Francisco Casacuberta, Sergi Soler, Petia Radeva
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Egocentric vision consists of acquiring images throughout the day from a
    first-person point of view using wearable cameras. The automatic analysis of
    this information makes it possible to discover daily patterns that can
    improve the quality of life of the user. A natural topic that arises in
    egocentric vision is storytelling, that is, how to understand and tell the
    story lying behind the pictures.

    In this paper, we tackle storytelling as an egocentric sequence description
    problem. We propose a novel methodology that exploits information from
    temporally neighboring events, matching the nature of egocentric sequences.
    Furthermore, we present a new method for multimodal data fusion consisting of
    a multi-input attention recurrent network. We also publish the first dataset
    for egocentric image sequence description, consisting of 1,339 events with
    3,991 descriptions, from 55 days acquired by 11 people. Finally, we show that
    our proposal outperforms classical attentional encoder-decoder methods for
    video description.

    ReLayNet: Retinal Layer and Fluid Segmentation of Macular Optical Coherence Tomography using Fully Convolutional Network

    Abhijit Guha Roy, Sailesh Conjeti, Sri Phani Krishna Karri, Debdoot Sheet, Amin Katouzian, Christian Wachinger, Nassir Navab
    Comments: Manuscript Under Review
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Optical coherence tomography (OCT) is extensively used for diagnosis of
    diabetic macular edema due to its non-invasive imaging based assessment of the
    retinal layers. In this paper, we propose a new fully convolutional deep
    learning architecture, termed ReLayNet, for segmentation of retinal layers and
    fluid masses in eye OCT scans. ReLayNet uses a contracting path of
    convolutional blocks (encoders) to learn a hierarchy of contextual features,
    followed by an expansive path of convolutional blocks (decoders) for semantic
    segmentation. Additionally, skip connections relaying encoder outputs to
    matched decoder inputs are introduced to recover spatial information lost
    during downsampling. ReLayNet is trained with stochastic gradient descent to
    optimize a joint loss function comprising both weighted logistic regression
    and Dice overlap losses. The framework is validated on a publicly available
    benchmark dataset with comparisons against five state-of-the-art segmentation
    methods, which include two deep learning based approaches. Additionally,
    eight incremental baselines are defined and compared with, to validate the
    individual contributions of the proposed framework. We demonstrate that
    ReLayNet can
    reliably segment the retinal layers and accumulated fluids with improved
    performance in retinal thickness estimation and contour delineation. With a
    segmentation time of 5s per volume, it is well suited for clinical
    applications.
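
    The joint loss named above (weighted logistic loss plus Dice overlap) can be
    written compactly as below; the per-pixel weighting scheme and the trade-off
    weight lam are assumptions.

    ```python
    import numpy as np

    def relaynet_style_loss(probs, onehot, pixel_w, lam=0.5, eps=1e-7):
        """probs: (N, C) softmax outputs over C layer/fluid classes;
        onehot: (N, C) ground-truth labels; pixel_w: (N,) per-pixel weights."""
        ce = -(pixel_w * np.sum(onehot * np.log(probs + eps), axis=1)).mean()
        inter = np.sum(probs * onehot, axis=0)            # per-class overlap
        dice = 1 - (2 * inter + eps) / (probs.sum(0) + onehot.sum(0) + eps)
        return ce + lam * dice.mean()
    ```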

    Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation

    Dan Xu, Elisa Ricci, Wanli Ouyang, Xiaogang Wang, Nicu Sebe
    Comments: Accepted as a spotlight paper at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper addresses the problem of depth estimation from a single still
    image. Inspired by recent works on multi-scale convolutional neural networks
    (CNNs), we propose a deep model which fuses complementary information derived
    from multiple CNN side outputs. Different from previous methods, the
    integration is obtained by means of continuous Conditional Random Fields
    (CRFs). In particular, we propose two different variations, one based on a
    cascade of multiple CRFs, the other on a unified graphical model. By designing
    a novel CNN implementation of mean-field updates for continuous CRFs, we show
    that both proposed models can be regarded as sequential deep networks and that
    training can be performed end-to-end. Through extensive experimental
    evaluation we demonstrate the effectiveness of the proposed approach and
    establish new state-of-the-art results on publicly available datasets.

    Partial Face Detection in the Mobile Domain

    Upal Mahbub, Sayantan Sarkar, Rama Chellappa
    Comments: 18 pages, 22 figures, 3 tables, submitted to IEEE Transactions on Image Processing
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Generic face detection algorithms do not perform well in the mobile domain
    due to significant presence of occluded and partially visible faces. One
    promising technique to handle the challenge of partial faces is to design face
    detectors based on facial segments. In this paper two different approaches of
    facial segment-based face detection are discussed, namely, proposal-based
    detection and detection by end-to-end regression. Methods that follow the first
    approach rely on generating face proposals that contain facial segment
    information. The three detectors following this approach, namely Facial
    Segment-based Face Detector (FSFD), SegFace and DeepSegFace, discussed in this
    paper, perform binary classification on each proposal based on features learned
    from facial segments. The process of proposal generation, however, needs to be
    handled separately, which can be very time consuming, and is not truly
    necessary given the nature of the active authentication problem. Hence a novel
    algorithm, Deep Regression-based User Image Detector (DRUID) is proposed, which
    shifts from the classification to the regression paradigm, thus obviating the
    need for proposal generation. DRUID has a unique network architecture with
    customized loss functions, is trained using a relatively small amount of data
    by utilizing a novel data augmentation scheme and is fast since it outputs the
    bounding boxes of a face and its segments in a single pass. Being robust to
    occlusion by design, the facial segment-based face detection methods,
    especially DRUID, show superior performance over other state-of-the-art face
    detectors in terms of precision-recall and ROC curve on two mobile face
    datasets.

    Generalized Rank Pooling for Activity Recognition

    Anoop Cherian, Basura Fernando, Mehrtash Harandi, Stephen Gould
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Most popular deep models for action recognition split video sequences into
    short sub-sequences consisting of a few frames; frame-based features are then
    pooled for recognizing the activity. Usually, this pooling step discards the
    temporal order of the frames, which could otherwise be used for better
    recognition. Towards this end, we propose a novel pooling method, generalized
    rank pooling (GRP), that takes as input features from the intermediate layers
    of a CNN that is trained on tiny sub-sequences, and produces as output the
    parameters of a subspace which (i) provides a low-rank approximation to the
    features and (ii) preserves their temporal order. We propose to use these
    parameters as a compact representation for the video sequence, which is then
    used in a classification setup. We formulate an objective for computing this
    subspace as a Riemannian optimization problem on the Grassmann manifold, and
    propose an efficient conjugate gradient scheme for solving it. Experiments on
    several activity recognition datasets show that our scheme leads to
    state-of-the-art performance.

    Supervised Deep Hashing for Hierarchical Labeled Data

    Dan Wang, Heyan Huang, Chi Lu, Bo-Si Feng, Liqiang Nie, Guihua Wen, Xian-Ling Mao
    Comments: 7 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recently, hashing methods have been widely used in large-scale image
    retrieval. However, most existing hashing methods do not consider the
    hierarchical relation of labels, which means that they ignore the rich
    information stored in the hierarchy. Moreover, most previous works treat
    each bit in a hash code equally, which does not suit the scenario of
    hierarchical labeled data. In this paper, we propose a novel deep hashing
    method, called supervised hierarchical deep hashing (SHDH), to perform hash
    code learning for hierarchical labeled data. Specifically, we define a novel
    similarity formula for hierarchical labeled data by weighting each layer, and
    design a deep convolutional neural network to obtain a hash code for each data
    point. Extensive experiments on several real-world public datasets show that
    the proposed method outperforms the state-of-the-art baselines in the image
    retrieval task.
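
    A toy reading of "weighting each layer" for hierarchical labels is sketched
    below; the paper's actual similarity formula is not given in the abstract.

    ```python
    def hierarchical_similarity(path_a, path_b, layer_w):
        """Paths are label sequences from root to leaf; each layer on which
        the two paths still agree contributes its weight."""
        s = 0.0
        for la, lb, w in zip(path_a, path_b, layer_w):
            if la != lb:
                break
            s += w
        return s

    # hierarchical_similarity(["animal", "dog", "terrier"],
    #                         ["animal", "dog", "poodle"],
    #                         [0.5, 0.3, 0.2])  -> 0.8
    ```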

    "RAPID" Regions-of-Interest Detection In Big Histopathological Images

    Li Sulimowicz, Ishfaq Ahmad
    Comments: 6 pages, 5 figures, ICME conference
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The sheer volume and size of histopathological images (e.g., 10^6 MPixels)
    underscore the need for faster and more accurate regions-of-interest (ROI)
    detection algorithms. In this paper, we propose such an algorithm, which has
    four main components that help achieve greater accuracy and faster speed:
    First, while using coarse-to-fine topology preserving segmentation as the
    baseline, the proposed algorithm uses a superpixel regularity optimization
    scheme to avoid irregular and extremely small superpixels. Second, the
    proposed technique employs a prediction strategy to focus only on important
    superpixels at finer image levels. Third, the algorithm reuses the information
    gained from the coarsest image level at other finer image levels. Both the
    second and the third components drastically lower the complexity. Fourth, the
    algorithm employs a highly effective parallelization scheme using adaptive
    data partitioning, which gains high speedup. Experimental results, conducted
    on the BSD500 [1] and 500 whole-slide histological images from the National
    Lung Screening Trial (NLST) dataset, confirm that the proposed algorithm
    achieves a 13-fold speedup compared with the baseline, and around 160-fold
    compared with SLIC [11], without losing accuracy.

    Convolutional Neural Pyramid for Image Processing

    Xiaoyong Shen, Ying-Cong Chen, Xin Tao, Jiaya Jia
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a principled convolutional neural pyramid (CNP) framework for
    general low-level vision and image processing tasks. It is based on the
    essential finding that many applications require large receptive fields for
    structure understanding. But corresponding neural networks for regression
    either stack many layers or apply large kernels to achieve it, which is
    computationally very costly. Our pyramid structure can greatly enlarge the
    receptive field without sacrificing computational efficiency. Extra benefits
    include adaptive network depth and progressive upsampling for quasi-realtime
    testing on VGA-size input. Our method benefits a broad set of applications,
    such as depth/RGB image restoration, completion, noise/artifact removal, edge
    refinement, image filtering, image enhancement and colorization.

    Evolution in Groups: A deeper look at synaptic cluster driven evolution of deep neural networks

    Mohammad Javad Shafiee, Elnaz Barshan, Alexander Wong
    Comments: 8 pages. arXiv admin note: substantial text overlap with arXiv:1609.01360
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    A promising paradigm for achieving highly efficient deep neural networks is
    the idea of evolutionary deep intelligence, which mimics biological evolution
    processes to progressively synthesize more efficient networks. A crucial design
    factor in evolutionary deep intelligence is the genetic encoding scheme used to
    simulate heredity and determine the architectures of offspring networks. In
    this study, we take a deeper look at the notion of synaptic cluster-driven
    evolution of deep neural networks which guides the evolution process towards
    the formation of a highly sparse set of synaptic clusters in offspring
    networks. Utilizing a synaptic cluster-driven genetic encoding, the
    probabilistic encoding of synaptic traits considers not only individual
    synaptic properties but also inter-synaptic relationships within a deep neural
    network. This process results in highly sparse offspring networks which are
    particularly tailored for parallel computational devices such as GPUs and deep
    neural network accelerator chips. Comprehensive experimental results using four
    well-known deep neural network architectures (LeNet-5, AlexNet, ResNet-56, and
    DetectNet) on two different tasks (object categorization and object detection)
    demonstrate the efficiency of the proposed method. The cluster-driven genetic
    encoding scheme synthesizes networks that can achieve state-of-the-art
    performance with a significantly smaller number of synapses than the original
    ancestor network ($\sim$125-fold decrease in synapses for MNIST).
    Furthermore, the improved cluster efficiency in the generated offspring
    networks ($\sim$9.71-fold decrease in clusters for MNIST and a $\sim$8.16-fold
    decrease in clusters for KITTI) is particularly useful for accelerated
    performance on parallel computing hardware architectures such as those in GPUs
    and deep neural network accelerator chips.


    Artificial Intelligence

    Recurrent Environment Simulators

    Silvia Chiappa, Sébastien Racaniere, Daan Wierstra, Shakir Mohamed
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Models that can simulate how environments change in response to actions can
    be used by agents to plan and act efficiently. We improve on previous
    environment simulators from high-dimensional pixel observations by introducing
    recurrent neural networks that are able to make temporally and spatially
    coherent predictions for hundreds of time-steps into the future. We present an
    in-depth analysis of the factors affecting performance, providing the most
    extensive attempt to advance the understanding of the properties of these
    models. We address the issue of computational inefficiency with a model that
    does not need to generate a high-dimensional image at each time-step. We show
    that our approach can be used to improve exploration and is adaptable to many
    diverse environments, namely 10 Atari games, a 3D car racing environment, and
    complex 3D mazes.

    Improving content marketing processes with the approaches by artificial intelligence

    Utku Kose, Selcuk Sert
    Comments: 8 pages, 6 figures
    Journal-ref: Ecoforum Journal, 6, 1 (10), 2017
    Subjects: Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)

    Content marketing is today one of the most remarkable approaches in the
    context of companies' marketing processes. The value of this kind of
    marketing has grown over time, thanks to the latest developments in computer
    and communication technologies. Nowadays, social media based platforms in
    particular are of great importance in enabling companies to design
    multimedia-oriented, interactive content. But on the other hand, there is
    still more to do for improved content marketing approaches. In this context,
    the objective of this study is to focus on intelligent content marketing,
    which can be done by using artificial intelligence. Artificial intelligence
    is today one of the most remarkable research fields, and it can easily be
    used in a multidisciplinary way. So, this study aims to discuss its potential
    for improving content marketing. In detail, the study enables readers to
    improve their awareness of the intersection of content marketing and
    artificial intelligence. Furthermore, the authors introduce some example
    models of intelligent content marketing, which can be achieved by using
    current Web technologies and artificial intelligence techniques.

    A Constrained Sequence-to-Sequence Neural Model for Sentence Simplification

    Yaoyuan Zhang, Zhenxu Ye, Yansong Feng, Dongyan Zhao, Rui Yan
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Sentence simplification reduces semantic complexity to benefit people with
    language impairments. Previous simplification studies at the sentence level
    and at the word level have achieved promising results but also face great
    challenges. In sentence-level studies, sentences are fluent after
    simplification but sometimes not actually simpler. In word-level studies,
    words are simplified but may introduce grammatical errors, because word usage
    differs before and after simplification. In this paper, we propose a two-step
    simplification framework that combines word-level and sentence-level
    simplification, making use of their respective advantages. Based on this
    two-step framework, we implement a novel constrained neural generation model
    that simplifies sentences given simplified words. Results on aligned Wikipedia
    and Simple Wikipedia datasets indicate that our method yields better
    performance than various baselines.

    Echantillonnage de signaux sur graphes via des processus déterminantaux (Sampling graph signals via determinantal processes)

    Nicolas Tremblay (CNRS, GIPSA-CICS), Simon Barthelme (CNRS, GIPSA-VIBS), Pierre-Olivier Amblard (CNRS, GIPSA-CICS)
    Comments: in French
    Subjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM); Learning (cs.LG)

    We consider the problem of sampling k-bandlimited graph signals, i.e., linear
    combinations of the first k graph Fourier modes. We know that a set of k nodes
    embedding all k-bandlimited signals always exists, thereby enabling their
    perfect reconstruction after sampling. Unfortunately, to exhibit such a set,
    one needs to partially diagonalize the graph Laplacian, which becomes
    prohibitive at large scale. We propose a novel strategy based on determinantal
    point processes that side-steps partial diagonalisation and enables
    reconstruction with only O(k) samples. While doing so, we exhibit a new
    general algorithm to sample determinantal processes, faster than the
    state-of-the-art algorithm by a factor of order k.
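
    For intuition about determinantal sampling, a greedy log-determinant
    heuristic (a common DPP-MAP surrogate) picks diverse nodes under a kernel K;
    note this is neither exact DPP sampling nor the paper's fast algorithm.

    ```python
    import numpy as np

    def greedy_logdet_selection(K, k):
        """Select k node indices greedily maximizing log det of the kernel
        submatrix; K is an (n, n) positive semi-definite kernel."""
        S = []
        for _ in range(k):
            best, best_val = None, -np.inf
            for j in range(K.shape[0]):
                if j in S:
                    continue
                idx = S + [j]
                val = np.linalg.slogdet(K[np.ix_(idx, idx)])[1]
                if val > best_val:
                    best, best_val = j, val
            S.append(best)
        return S
    ```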

    Evolution in Groups: A deeper look at synaptic cluster driven evolution of deep neural networks

    Mohammad Javad Shafiee, Elnaz Barshan, Alexander Wong
    Comments: 8 pages. arXiv admin note: substantial text overlap with arXiv:1609.01360
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    A promising paradigm for achieving highly efficient deep neural networks is
    the idea of evolutionary deep intelligence, which mimics biological evolution
    processes to progressively synthesize more efficient networks. A crucial design
    factor in evolutionary deep intelligence is the genetic encoding scheme used to
    simulate heredity and determine the architectures of offspring networks. In
    this study, we take a deeper look at the notion of synaptic cluster-driven
    evolution of deep neural networks which guides the evolution process towards
    the formation of a highly sparse set of synaptic clusters in offspring
    networks. Utilizing a synaptic cluster-driven genetic encoding, the
    probabilistic encoding of synaptic traits considers not only individual
    synaptic properties but also inter-synaptic relationships within a deep neural
    network. This process results in highly sparse offspring networks which are
    particularly tailored for parallel computational devices such as GPUs and deep
    neural network accelerator chips. Comprehensive experimental results using four
    well-known deep neural network architectures (LeNet-5, AlexNet, ResNet-56, and
    DetectNet) on two different tasks (object categorization and object detection)
    demonstrate the efficiency of the proposed method. The cluster-driven genetic
    encoding scheme synthesizes networks that can achieve state-of-the-art
    performance with a significantly smaller number of synapses than the original
    ancestor network ($\sim$125-fold decrease in synapses for MNIST).
    Furthermore, the improved cluster efficiency in the generated offspring
    networks ($\sim$9.71-fold decrease in clusters for MNIST and a $\sim$8.16-fold
    decrease in clusters for KITTI) is particularly useful for accelerated
    performance on parallel computing hardware architectures such as those in GPUs
    and deep neural network accelerator chips.

    Treatment-Response Models for Counterfactual Reasoning with Continuous-time, Continuous-valued Interventions

    Hossein Soleimani, Adarsh Subbaswamy, Suchi Saria
    Comments: The first two authors contributed equally to this work
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Treatment effects can be estimated from observational data as the difference
    in potential outcomes. In this paper, we address the challenge of estimating
    the potential outcome when treatment-dose levels can vary continuously over
    time. Further, the outcome variable may not be measured at a regular frequency.
    Our proposed solution represents the treatment response curves using linear
    time-invariant dynamical systems—this provides a flexible means for modeling
    response over time to highly variable dose curves. Moreover, for multivariate
    data, the proposed method: uncovers shared structure in treatment response and
    the baseline across multiple markers; and, flexibly models challenging
    correlation structure both across and within signals over time. For this, we
    build upon the framework of multiple-output Gaussian Processes. On simulated
    and a challenging clinical dataset, we show significant gains in accuracy over
    state-of-the-art models.

    An Automated Text Categorization Framework based on Hyperparameter Optimization

    Eric S. Tellez, Daniela Moctezuma, Sabino Miranda-Jímenez, Mario Graff
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    The amount of textual data generated in environments such as social media,
    blogs, and online newspapers has attracted the attention of the scientific
    community, with the aim of automating and improving several tasks that were
    previously performed manually, such as sentiment analysis, user profiling, or
    text categorization, just to mention a few. Fortunately, several of these
    activities can be posed as a classification problem, i.e., a problem where
    one is interested in developing a function, from a set of texts with
    associated labels, capable of predicting a label given an unseen text. In
    this contribution, we propose a text classifier named $\mu$TC. $\mu$TC is
    composed of a number of easy-to-implement text transformations, text
    representations, and a machine learning algorithm, which together produce a
    competitive classifier even over informal written text when these parts are
    correctly configured. We provide a detailed description of $\mu$TC along with
    an extensive experimental comparison with the relevant state-of-the-art
    methods. $\mu$TC was compared on 30 different datasets, obtaining the best
    performance (regarding accuracy) in 18 of them. The datasets include several
    problems, such as topic and polarity classification, spam detection, user
    profiling, and authorship attribution. Furthermore, it is worth noting that
    our approach makes the technology usable even for users without knowledge of
    machine learning and natural language processing.
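
    The framework's core loop, hyperparameter search over a text-classification
    pipeline, can be approximated in scikit-learn as below; the actual search
    space and text transformations of $\mu$TC are not given in the abstract.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    # Randomized search over a (representation, classifier) pipeline.
    pipe = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LinearSVC())])
    search = RandomizedSearchCV(
        pipe,
        param_distributions={
            "tfidf__ngram_range": [(1, 1), (1, 2), (2, 3)],
            "tfidf__lowercase": [True, False],
            "clf__C": [0.1, 1.0, 10.0],
        },
        n_iter=10, cv=3,
    )
    # search.fit(texts, labels); search.best_params_
    ```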


    Information Retrieval

    TransNets: Learning to Transform for Recommendation

    Rose Catherine, William Cohen
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG)

    Recently, deep learning methods have been shown to improve the performance of
    recommender systems over traditional methods, especially when review text is
    available. For example, a recent model, DeepCoNN, uses neural nets to learn one
    latent representation for the text of all reviews written by a target user, and
    a second latent representation for the text of all reviews for a target item,
    and then combines these latent representations to obtain state-of-the-art
    performance on recommendation tasks. We show that (unsurprisingly) much of the
    predictive value of review text comes from reviews of the target user for the
    target item. We then introduce a way in which this information can be used in
    recommendation, even when the target user’s review for the target item is not
    available. Our model, called TransNets, extends the DeepCoNN model by
    introducing an additional latent layer representing the target user-target item
    pair. We then regularize this layer, at training time, to be similar to another
    latent representation of the target user’s review of the target item. We show
    that TransNets and extensions of it improve substantially over the previous
    state-of-the-art.
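
    The training-time regularization described above can be condensed into a
    loss sketch; the squared-error forms and the trade-off weight mu are
    assumptions, not the paper's exact objective.

    ```python
    import numpy as np

    def transnets_style_loss(z_pair, z_target_review, pred, rating, mu=0.1):
        """z_pair: latent code computed from the (user, item) review texts;
        z_target_review: latent code of the target user's review of the
        target item (available only at training time)."""
        reg = np.sum((z_pair - z_target_review) ** 2)  # pull codes together
        mse = (pred - rating) ** 2                     # rating loss
        return mse + mu * reg
    ```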

    OBTAIN: Real-Time Beat Tracking in Audio Signals

    Ali Mottaghi, Kayhan Behdin, Ashkan Esmaeili, Mohammadreza Heydari, Farokh Marvasti
    Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Learning (cs.LG); Multimedia (cs.MM)

    In this paper, we design a system to perform real-time beat tracking of an
    audio signal. We use an Onset Strength Signal (OSS) to detect the onsets and
    estimate the tempos. Then, we form a Cumulative Beat Strength Signal (CBSS)
    by taking advantage of the OSS and estimated tempos. Next, we perform peak
    detection by extracting the periodic sequence of beats among all CBSS peaks.
    In simulations, our proposed algorithm, Online Beat TrAckINg (OBTAIN),
    outperforms state-of-the-art results in terms of prediction accuracy while
    maintaining comparable and practical computational complexity. The real-time
    performance is illustrated visually in the simulations.
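
    A common choice for an onset strength signal is half-wave-rectified spectral
    flux, sketched below; the paper's exact OSS, windowing, and filterbank
    details are not specified in the abstract.

    ```python
    import numpy as np

    def onset_strength(frames):
        """frames: (n_frames, frame_len) array of consecutive audio frames.
        Returns the per-frame increase in spectral magnitude, summed over
        frequency bins (length n_frames - 1)."""
        mags = np.abs(np.fft.rfft(frames, axis=1))
        flux = np.diff(mags, axis=0)
        return np.maximum(flux, 0).sum(axis=1)
    ```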

    Conceptualization Topic Modeling

    Yi-Kun Tang, Xian-Ling Mao, Heyan Huang, Guihua Wen
    Comments: 7 pages
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    Recently, topic modeling has been widely used to discover the abstract topics
    in text corpora. Most of the existing topic models are based on the assumption
    of three-layer hierarchical Bayesian structure, i.e. each document is modeled
    as a probability distribution over topics, and each topic is a probability
    distribution over words. However, this assumption is not optimal.
    Intuitively, it is more reasonable to assume that each topic is a probability
    distribution over concepts, and each concept is in turn a probability
    distribution over words, i.e. adding a latent concept layer between the topic
    layer and the word layer of the traditional three-layer assumption. In this
    paper, we verify the proposed assumption by incorporating it into two
    representative topic models, obtaining two novel topic models. Extensive
    experiments were conducted comparing the proposed models with the
    corresponding baselines, and the results show that the proposed models
    significantly outperform the baselines in terms of case study and perplexity,
    which means the new assumption is more reasonable than the traditional one.
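
    The proposed four-layer assumption corresponds to the following generative
    sketch, with a concept draw inserted between the topic draw and the word
    draw; the distributions here are toy stand-ins.

    ```python
    import numpy as np

    def generate_doc(theta, topic_concept, concept_word, length, rng):
        """theta: (K,) document-topic distribution; topic_concept: (K, C)
        rows of topic-over-concept probabilities; concept_word: (C, V) rows
        of concept-over-word probabilities."""
        words = []
        for _ in range(length):
            z = rng.choice(len(theta), p=theta)                         # topic
            c = rng.choice(topic_concept.shape[1], p=topic_concept[z])  # concept
            w = rng.choice(concept_word.shape[1], p=concept_word[c])    # word
            words.append(w)
        return words

    # rng = np.random.default_rng(0)
    ```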


    Computation and Language

    A Constrained Sequence-to-Sequence Neural Model for Sentence Simplification

    Yaoyuan Zhang, Zhenxu Ye, Yansong Feng, Dongyan Zhao, Rui Yan
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Sentence simplification reduces semantic complexity to benefit people with
    language impairments. Previous simplification studies on the sentence level and
    word level have achieved promising results but also meet great challenges. For
    sentence-level studies, sentences after simplification are fluent but sometimes
    are not really simplified. For word-level studies, words are simplified but
    also have potential grammar errors due to different usages of words before and
    after simplification. In this paper, we propose a two-step simplification
    framework by combining both the word-level and the sentence-level
    simplifications, making use of their corresponding advantages. Based on the
    two-step framework, we implement a novel constrained neural generation model to
    simplify sentences given simplified words. The final results on Wikipedia and
    Simple Wikipedia aligned datasets indicate that our method yields better
    performance than various baselines.

    Comparison of Global Algorithms in Word Sense Disambiguation

    Loïc Vial, Andon Tchechmedjiev, Didier Schwab
    Subjects: Computation and Language (cs.CL)

    This article compares four probabilistic algorithms (global algorithms) for
    Word Sense Disambiguation (WSD) in terms of the number of calls to the scorer
    (local algorithm) and the F1 score as determined by a gold-standard scorer. Two
    algorithms come from the WSD state of the art, a Simulated Annealing Algorithm
    (SAA) and a Genetic Algorithm (GA); we also adapt to WSD, for the first time,
    two state-of-the-art probabilistic search algorithms, namely a Cuckoo Search
    Algorithm (CSA) and a Bat Search algorithm (BS). As WSD requires evaluating
    exponentially many word sense combinations (with branching factors of up to 6
    or more), probabilistic algorithms allow approximate solutions to be found in
    tractable time by sampling the search space. We find that CSA, GA and SAA all
    eventually converge to similar results (0.98 F1 score), but CSA gets there
    faster (in fewer scorer calls) and reaches 0.95 F1 in fewer scorer calls than
    SAA. For BS, a strict convergence criterion prevents it from reaching above
    0.89 F1.
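
    For illustration, the sketch below gives the simulated-annealing skeleton that
    such a global algorithm follows: it anneals over complete sense assignments,
    each evaluated by a local scorer. The scorer, cooling schedule, and constants
    are placeholder assumptions, not those of the compared systems.

```python
import math
import random

def simulated_annealing_wsd(words, senses, score, t0=1.0, cooling=0.995, steps=5000):
    """Anneal over complete sense assignments.
    words:  list of ambiguous words
    senses: dict word -> list of candidate sense ids
    score:  callable(assignment) -> float, higher is better
            (placeholder for a real knowledge-based or gold-standard scorer)
    """
    current = {w: random.choice(senses[w]) for w in words}
    cur_score = score(current)
    best, best_score, t = dict(current), cur_score, t0
    for _ in range(steps):
        w = random.choice(words)
        candidate = dict(current)
        candidate[w] = random.choice(senses[w])
        cand_score = score(candidate)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if cand_score >= cur_score or random.random() < math.exp((cand_score - cur_score) / t):
            current, cur_score = candidate, cand_score
            if cur_score > best_score:
                best, best_score = dict(current), cur_score
        t *= cooling
    return best, best_score

# toy demo with a placeholder scorer that rewards using few distinct senses
words = ["bank", "bass", "mole"]
senses = {w: [0, 1, 2] for w in words}
print(simulated_annealing_wsd(words, senses, lambda a: -len(set(a.values()))))
```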

    NILC-USP at SemEval-2017 Task 4: A Multi-view Ensemble for Twitter Sentiment Analysis

    Edilson A. Corrêa Jr., Vanessa Queiroz Marinho, Leandro Borges dos Santos
    Comments: Published in Proceedings of SemEval-2017, 5 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    This paper describes our multi-view ensemble approach to SemEval-2017 Task 4
    on Sentiment Analysis in Twitter, specifically, the Message Polarity
    Classification subtask for English (subtask A). Our system is a voting
    ensemble, where each base classifier is trained in a different feature space.
    The first space is a bag-of-words model with a Linear SVM as the base
    classifier. The second and third spaces use two different strategies for
    combining word embeddings to represent sentences, with a Linear SVM and a
    Logistic Regression classifier as their base classifiers, respectively. The
    proposed system was ranked 18th out of 38 systems considering F1 score and
    20th considering recall.
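
    A minimal sketch of such a multi-view hard-voting ensemble is shown below; the
    toy word vectors and the single mean-embedding strategy are stand-ins for the
    pre-trained embeddings and the two combination strategies the system actually
    uses.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# toy pre-trained word vectors; a real system would load e.g. word2vec
dim = 5
vecs = {"good": np.ones(dim), "bad": -np.ones(dim)}

def mean_embedding(sentences):
    """One possible embedding-combination strategy: average word vectors."""
    return np.array([
        np.mean([vecs.get(w, np.zeros(dim)) for w in s.split()], axis=0)
        for s in sentences])

X = ["good movie", "bad film", "good good", "bad bad"]
y = np.array([1, 0, 1, 0])

bow = CountVectorizer().fit(X)
views = [
    (LinearSVC().fit(bow.transform(X), y), bow.transform),          # view 1
    (LinearSVC().fit(mean_embedding(X), y), mean_embedding),        # view 2
    (LogisticRegression().fit(mean_embedding(X), y), mean_embedding),  # view 3
]

def vote(sentences):
    """Hard majority vote across the three feature-space views."""
    preds = np.array([clf.predict(feat(sentences)) for clf, feat in views])
    return (preds.mean(axis=0) > 0.5).astype(int)

print(vote(["good film", "bad movie"]))  # expect [1, 0]
```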

    EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION

    Steffen Eger, Erik-Lân Do Dinh, Ilia Kuznetsov, Masoud Kiaeeha, Iryna Gurevych
    Subjects: Computation and Language (cs.CL)

    This paper describes our approach to SemEval 2017 Task 10: “Extracting
    Keyphrases and Relations from Scientific Publications”, specifically to Subtask
    (B): “Classification of identified keyphrases”. We explored three different
    deep learning approaches: a character-level convolutional neural network (CNN),
    a stacked learner with an MLP meta-classifier, and an attention-based Bi-LSTM.
    From these approaches, we created an ensemble of differently
    hyper-parameterized systems, achieving a micro-F1 score of 0.63 on the test
    data. Our approach ranks 2nd out of four according to this official score (the
    1st-placed system scored 0.64). However, we erroneously trained 2 out of 3
    neural nets (the stacker and the CNN) on only roughly 15% of the full data,
    namely the original development set. When trained on the full data
    (training+development), our ensemble achieves a micro-F1 score of 0.69. Our
    code is available from this https URL

    The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing

    Rik van Noord, Johan Bos
    Comments: To appear in Proceedings of SemEval, 2017
    Subjects: Computation and Language (cs.CL)

    We evaluate a semantic parser based on a character-based sequence-to-sequence
    model in the context of the SemEval-2017 shared task on semantic parsing for
    AMRs. With data augmentation, super characters, and POS-tagging we gain major
    improvements in performance compared to a baseline model. The overall
    accuracy, however, is still lower than that of a state-of-the-art AMR parser.
    An ensemble combining our neural semantic parser with an existing, traditional
    parser yields a small gain in performance.

    Adposition Supersenses v2

    Nathan Schneider, Jena D. Hwang, Archna Bhatia, Na-Rae Han, Vivek Srikumar, Tim O'Gorman, Omri Abend
    Subjects: Computation and Language (cs.CL)

    This document describes an inventory of 50 semantic labels designed to
    characterize the use of adpositions and case markers at a somewhat coarse level
    of granularity. Version 2 is a revision of the supersense inventory proposed
    for English by Schneider et al. (2015, 2016) and documented in PrepWiki
    (henceforth “v1”), which in turn was based on previous schemes. The present
    inventory was developed after extensive review of the v1 corpus annotations for
    English, as well as consideration of adposition and case phenomena in Hebrew,
    Hindi, and Korean. Examples in this document are limited to English; a
    multilingual and more detailed online lexical resource is forthcoming.

    Conceptualization Topic Modeling

    Yi-Kun Tang, Xian-Ling Mao, Heyan Huang, Guihua Wen
    Comments: 7 pages
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    Recently, topic modeling has been widely used to discover the abstract topics
    in text corpora. Most existing topic models are based on the assumption of a
    three-layer hierarchical Bayesian structure, i.e., each document is modeled
    as a probability distribution over topics, and each topic is a probability
    distribution over words. However, this assumption is not optimal. Intuitively,
    it is more reasonable to assume that each topic is a probability distribution
    over concepts, and each concept is in turn a probability distribution over
    words, i.e., to add a latent concept layer between the topic layer and the
    word layer of the traditional three-layer structure. In this paper, we verify
    this assumption by incorporating it into two representative topic models,
    obtaining two novel topic models. Extensive experiments were conducted
    comparing the proposed models with the corresponding baselines, and the results
    show that the proposed models significantly outperform the baselines in terms
    of case studies and perplexity, which indicates that the new assumption is more
    reasonable than the traditional one.

    Conversation Modeling on Reddit using a Graph-Structured LSTM

    Vicky Zayats, Mari Ostendorf
    Comments: Submitted to TACL
    Subjects: Computation and Language (cs.CL)

    This paper presents a novel approach for modeling threaded discussions on
    social media using a graph-structured bidirectional LSTM which represents both
    hierarchical and temporal conversation structure. In experiments with a task of
    predicting popularity of comments in Reddit discussions, the proposed model
    outperforms a node-independent architecture for different sets of input
    features. Analyses show a benefit to the model over the full course of the
    discussion, improving detection in both early and late stages. Further, the use
    of language cues with the bidirectional tree state updates helps with
    identifying controversial comments.

    An Automated Text Categorization Framework based on Hyperparameter Optimization

    Eric S. Tellez, Daniela Moctezuma, Sabino Miranda-Jímenez, Mario Graff
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    The amount of textual data generated in environments such as social media,
    blogs, and online newspapers has attracted the attention of the scientific
    community, with the aim of automating and improving several tasks that were
    previously performed manually, such as sentiment analysis, user profiling, or
    text categorization, to mention just a few. Fortunately, several of these
    activities can be posed as a classification problem, i.e., a problem where one
    is interested in developing a function, from a set of texts with associated
    labels, capable of predicting the label of an unseen text. In this
    contribution, we propose a text classifier, named (mu)TC. (mu)TC is composed
    of a number of easy-to-implement text transformations, text representations,
    and a machine learning algorithm, which together produce a competitive
    classifier even over informal written text when these parts are correctly
    configured. We provide a detailed description of (mu)TC along with an
    extensive experimental comparison with the relevant state-of-the-art methods.
    (mu)TC was compared on 30 different datasets, obtaining the best performance
    (regarding accuracy) on 18 of them. The datasets cover several problems, such
    as topic and polarity classification, spam detection, user profiling and
    authorship attribution. Furthermore, it is worth noting that our approach
    allows the technology to be used even by users without knowledge of machine
    learning and natural language processing.
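
    The toy sketch below conveys the general idea: treat the choice of text
    transformations and representations as a configuration to be searched, scoring
    each candidate by cross-validation. The configuration space, data, and random
    search here are illustrative assumptions, not (mu)TC's actual search space or
    optimizer.

```python
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X = ["great product", "terrible service", "loved it", "awful, do not buy",
     "really great", "really awful"]
y = [1, 0, 1, 0, 1, 0]

# A tiny stand-in for the configuration space: each sampled config picks a
# text transformation/representation, scored by cross-validation accuracy.
space = {
    "lowercase": [True, False],
    "analyzer": ["word", "char"],
    "ngram_range": [(1, 1), (1, 2), (2, 4)],
}

random.seed(0)
best_cfg, best_score = None, -1.0
for _ in range(10):                     # random search over the space
    cfg = {k: random.choice(v) for k, v in space.items()}
    model = make_pipeline(TfidfVectorizer(**cfg), LinearSVC())
    score = cross_val_score(model, X, y, cv=3).mean()
    if score > best_score:
        best_cfg, best_score = cfg, score
print(best_cfg, best_score)
```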

    TransNets: Learning to Transform for Recommendation

    Rose Catherine, William Cohen
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG)

    Recently, deep learning methods have been shown to improve the performance of
    recommender systems over traditional methods, especially when review text is
    available. For example, a recent model, DeepCoNN, uses neural nets to learn one
    latent representation for the text of all reviews written by a target user, and
    a second latent representation for the text of all reviews for a target item,
    and then combines these latent representations to obtain state-of-the-art
    performance on recommendation tasks. We show that (unsurprisingly) much of the
    predictive value of review text comes from reviews of the target user for the
    target item. We then introduce a way in which this information can be used in
    recommendation, even when the target user’s review for the target item is not
    available. Our model, called TransNets, extends the DeepCoNN model by
    introducing an additional latent layer representing the target user-target item
    pair. We then regularize this layer, at training time, to be similar to another
    latent representation of the target user’s review of the target item. We show
    that TransNets and extensions of it improve substantially over the previous
    state-of-the-art.


    Distributed, Parallel, and Cluster Computing

    GLoP: Enabling Massively Parallel Incident Response Through GPU Log Processing

    Xavier Bellekens, Christos Tachtatzis, Robert Atkinson, Craig Renfrew, Tony Kirkham
    Comments: Published in The 7th International Conference of Security of Information and Networks, SIN 2014, Glasgow, UK, September, 2014
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)

    Large industrial systems that combine services and applications have become
    targets for cyber criminals and are challenging from the security, monitoring
    and auditing perspectives. Security log analysis is a key step for uncovering
    anomalies, detecting intrusions, and enabling incident response. The constant
    increase in link speeds, threats and users produces large volumes of log data
    that are increasingly difficult to analyse on a Central Processing Unit
    (CPU). This paper presents GLoP, a massively parallel Graphics Processing Unit
    (GPU) LOg Processing library, which can also be used for Deep Packet Inspection
    (DPI), using a prefix matching technique and harvesting the full power of
    off-the-shelf technologies. GLoP implements two different algorithms using
    different types of GPU memory and is compared against counterpart CPU
    implementations. The library can be used for processing nodes with single or
    multiple GPUs as well as GPU cloud farms. The results show a throughput of
    20 Gbps and demonstrate that modern GPUs can be utilised to increase the
    operational speed of large-scale log processing scenarios, saving precious
    time before and after an intrusion has occurred.

    A Highly-Efficient Memory-Compression Scheme for GPU-Accelerated Intrusion Detection Systems

    Xavier Bellekens, Christos Tachtatzis, Robert Atkinson, Craig Renfrew, Tony Kirkham
    Comments: Published in The 7th International Conference of Security of Information and Networks, SIN 2014, Glasgow, UK, September, 2014
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)

    Pattern matching is a computationally intensive task used in many research
    fields and real-world applications. Due to the ever-growing volume of data to
    be processed and increasing link speeds, the number of patterns to be matched
    has risen significantly. In this paper we explore the parallel capabilities of
    modern General-Purpose Graphics Processing Units (GPGPU) for high-speed
    pattern matching. A highly compressed failure-less Aho-Corasick algorithm is
    presented for Intrusion Detection Systems on off-the-shelf hardware. This
    approach maximises the bandwidth for data transfers between the host and the
    Graphics Processing Unit (GPU). Experiments are performed on multiple alphabet
    sizes, demonstrating the capability of the library to be used in different
    research fields while sustaining an adequate throughput for intrusion
    detection systems or DNA sequencing. The work also explores the performance
    impact of adequate prefix matching for various alphabet sizes and pattern set
    sizes, achieving speeds of up to 8 Gbps and low memory consumption for
    intrusion detection systems.
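
    A serial sketch of the failure-less Aho-Corasick idea follows: an ordinary
    trie with no failure links, with one traversal started at every input offset
    (the role a GPU thread plays in the paper). The memory-compression scheme
    itself is not reflected here.

```python
def build_trie(patterns):
    """Ordinary trie; the failure-less variant needs no failure links
    because a fresh traversal is started at every input offset."""
    trie, out = [{}], [None]
    for p in patterns:
        node = 0
        for ch in p:
            if ch not in trie[node]:
                trie[node][ch] = len(trie)
                trie.append({})
                out.append(None)
            node = trie[node][ch]
        out[node] = p
    return trie, out

def search(text, trie, out):
    """One traversal per starting offset (the per-thread GPU kernel,
    serialized here)."""
    matches = []
    for start in range(len(text)):
        node = 0
        for ch in text[start:]:
            node = trie[node].get(ch)
            if node is None:
                break
            if out[node] is not None:
                matches.append((start, out[node]))
    return matches

trie, out = build_trie(["he", "she", "hers"])
print(search("ushers", trie, out))  # [(1, 'she'), (2, 'he'), (2, 'hers')]
```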

    Vectorization of Hybrid Breadth First Search on the Intel Xeon Phi

    Mireya Paredes, Graham Riley, Mikel Lujan
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The Breadth-First Search (BFS) algorithm is an important building block for
    graph analysis of large datasets. The BFS parallelisation has been shown to be
    challenging because of its inherent characteristics, including irregular memory
    access patterns, data dependencies and workload imbalance, that limit its
    scalability. We investigate the optimisation and vectorisation of the hybrid
    BFS (a combination of the top-down and bottom-up approaches to BFS) on the
    Xeon Phi, which has advanced vector processing capabilities. The results show
    that our new implementation improves on the state of the art by 33% for a
    graph with one million vertices.
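
    For reference, a plain (non-vectorised) sketch of the hybrid,
    direction-optimising BFS is given below; the switching threshold is an
    arbitrary illustrative choice and the graph is assumed undirected.

```python
def hybrid_bfs(adj, source, alpha=0.05):
    """Direction-optimising BFS. adj: list of adjacency lists (undirected).
    Switches to bottom-up once the frontier exceeds a fraction alpha
    (arbitrary choice here) of all vertices."""
    n = len(adj)
    parent = [-1] * n
    parent[source] = source
    frontier = [source]
    while frontier:
        if len(frontier) < alpha * n:
            # top-down: scan edges out of the frontier
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if parent[v] == -1:
                        parent[v] = u
                        nxt.append(v)
        else:
            # bottom-up: every unvisited vertex looks for a frontier parent
            in_frontier = [False] * n
            for u in frontier:
                in_frontier[u] = True
            nxt = []
            for v in range(n):
                if parent[v] == -1:
                    for u in adj[v]:
                        if in_frontier[u]:
                            parent[v] = u
                            nxt.append(v)
                            break
        frontier = nxt
    return parent

adj = [[1, 2], [0, 3], [0, 3], [1, 2]]
print(hybrid_bfs(adj, 0))  # [0, 0, 0, 1]
```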

    Probabilistic Recurrence Relations for Work and Span of Parallel Algorithms

    Joseph Tassarotti
    Comments: 12 pages
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper we present a method for obtaining tail-bounds for random
    variables satisfying certain probabilistic recurrences that arise in the
    analysis of randomized parallel divide and conquer algorithms. In such
    algorithms, some computation is initially done to process an input x, which is
    then randomly split into subproblems (h_1(x), …, h_n(x)), and the algorithm
    proceeds recursively in parallel on each subproblem. The total work on input x,
    W(x), then satisfies a probabilistic recurrence of the form (W(x) = a(x) +
    sum_{i=1}^n W (h_i(x))), and the span (the longest chain of sequential
    dependencies), satisfies (S(x) = b(x) + max_{i=1}^n S(h_i(x))), where a(x) and
    b(x) are the work and span to split x and combine the results of the recursive
    calls.

    Karp has previously presented methods for obtaining tail-bounds in the case
    when n = 1, and under certain stronger assumptions for the work-recurrence when
    n > 1, but left open the question of the span-recurrence. We first show how to
    extend his technique to handle the span-recurrence. We then show that in some
    cases, the work-recurrence can be bounded under simpler assumptions than Karp’s
    by transforming it into a related span-recurrence and applying our first
    result. We demonstrate our results by deriving tail bounds for the work and
    span of quicksort and the height of a randomly generated binary search tree.
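
    As a concrete instance, the small simulation below evaluates both recurrences
    for randomized quicksort, taking a(x) = b(x) = n for an input of size n (the
    partitioning cost).

```python
import random

def quicksort_work_span(n):
    """Simulate W(n) = n + W(L) + W(R) and S(n) = n + max(S(L), S(R))
    for randomized quicksort, where a(x) = b(x) = n is the cost of
    splitting an input of size n around a uniformly random pivot."""
    if n <= 1:
        return n, n
    left = random.randrange(n)           # sizes of the two subproblems
    w_l, s_l = quicksort_work_span(left)
    w_r, s_r = quicksort_work_span(n - 1 - left)
    return n + w_l + w_r, n + max(s_l, s_r)

random.seed(1)
works, spans = zip(*(quicksort_work_span(10_000) for _ in range(100)))
print("mean work:", sum(works) / len(works))   # around 2 n ln n
print("mean span:", sum(spans) / len(spans))   # Theta(n) under this cost model
```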

    A Comparison of Parallel Graph Processing Benchmarks

    Samuel Pollard, Boyana Norris
    Comments: 12 pages, 4 figures, Submitted to EuroPar 2017
    Subjects: Performance (cs.PF); Distributed, Parallel, and Cluster Computing (cs.DC)

    The rapidly growing number of large network analysis problems has led to the
    emergence of many parallel and distributed graph processing systems—one
    survey in 2014 identified over 80. Since then, the landscape has evolved; some
    packages have become inactive while more are being developed. Determining the
    best approach for a given problem is infeasible for most developers. To enable
    easy, rigorous, and repeatable comparison of the capabilities of such systems,
    we present an approach and associated software for analyzing the performance
    and scalability of parallel, open-source graph libraries. We demonstrate our
    approach on five graph frameworks: GraphMat, the Graph500, the Graph Algorithm
    Platform, GraphBIG, and PowerGraph using synthetic and real-world datasets. We
    examine previously overlooked aspects of parallel graph processing performance,
    such as phases of execution and energy usage, for three algorithms:
    breadth-first search, single-source shortest paths, and PageRank, and we
    compare our results to Graphalytics.


    Learning

    Thresholding Bandits with Augmented UCB

    Subhojyoti Mukherjee, K. P. Naveen, Nandan Sudarsanam, Balaraman Ravindran
    Subjects: Learning (cs.LG)

    In this paper we propose the Augmented-UCB (AugUCB) algorithm for a
    fixed-budget version of the thresholding bandit problem (TBP), where the
    objective is to identify a set of arms whose quality is above a threshold. A
    key feature of AugUCB is that it uses both mean and variance estimates to
    eliminate arms that have been sufficiently explored; to the best of our
    knowledge this is the first algorithm to employ such an approach for the
    considered TBP. Theoretically, we obtain an upper bound on the loss
    (probability of mis-classification) incurred by AugUCB. Although UCBEV in the
    literature provides a better guarantee, it is important to emphasize that
    UCBEV has access to the problem complexity (whose computation requires the
    arms' means and variances), and hence is not realistic in practice; this is
    in contrast to AugUCB, whose implementation does not require any such
    complexity inputs. We conduct extensive simulation experiments to validate the
    performance of AugUCB. Through our simulation work, we establish that AugUCB,
    owing to its utilization of variance estimates, performs significantly better
    than the state-of-the-art APT, CSAR and other non-variance-based algorithms.
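
    The sketch below conveys the flavour of variance-aware arm elimination for the
    fixed-budget TBP; the Bernstein-style confidence radius, constants, and
    sampling rule are illustrative assumptions, not AugUCB's actual schedule.

```python
import math
import random

def variance_based_tbp(arms, tau, budget):
    """Fixed-budget thresholding sketch: sample active arms uniformly and
    eliminate an arm once its variance-aware confidence interval falls
    entirely on one side of the threshold tau.  The radius below uses
    illustrative Bernstein-style constants, not AugUCB's."""
    stats = [[0.0, 0.0, 0] for _ in arms]       # sum, sum of squares, pulls
    active = set(range(len(arms)))
    for _ in range(budget):
        if not active:
            break
        a = random.choice(tuple(active))
        x = arms[a]()                           # pull arm a
        s = stats[a]
        s[0] += x; s[1] += x * x; s[2] += 1
        if s[2] >= 2:
            mean = s[0] / s[2]
            var = max(s[1] / s[2] - mean * mean, 0.0)
            rad = math.sqrt(2 * var * math.log(budget) / s[2]) + 3 * math.log(budget) / s[2]
            if mean + rad < tau or mean - rad > tau:
                active.discard(a)               # sufficiently explored
    return [a for a, (tot, _, n) in enumerate(stats) if n and tot / n >= tau]

random.seed(0)
arms = [lambda m=m: random.gauss(m, 0.5) for m in (0.1, 0.4, 0.6, 0.9)]
print(variance_based_tbp(arms, tau=0.5, budget=4000))  # likely [2, 3]
```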

    Rapid Mixing Swendsen-Wang Sampler for Stochastic Partitioned Attractive Models

    Sejun Park, Yunhun Jang, Andreas Galanis, Jinwoo Shin, Daniel Stefankovic, Eric Vigoda
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    The Gibbs sampler is a particularly popular Markov chain used for learning
    and inference problems in Graphical Models (GMs). These tasks are
    computationally intractable in general, and the Gibbs sampler often suffers
    from slow mixing. In this paper, we study the Swendsen-Wang dynamics which is a
    more sophisticated Markov chain designed to overcome bottlenecks that impede
    the Gibbs sampler. We prove O(log n) mixing time for attractive binary
    pairwise GMs (i.e., ferromagnetic Ising models) on stochastic partitioned
    graphs having n vertices, under some mild conditions, including low-temperature
    regions where the Gibbs sampler provably mixes exponentially slowly. Our
    experiments also confirm that the Swendsen-Wang sampler significantly
    outperforms the Gibbs sampler when they are used for learning parameters of
    attractive GMs.
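
    For reference, one step of the classic Swendsen-Wang dynamics for a
    ferromagnetic Ising model with unit couplings looks as follows: open each
    equal-spin edge with probability 1 - exp(-2*beta), then flip every resulting
    cluster with probability 1/2. This is the textbook sampler, not the paper's
    stochastic partitioned analysis.

```python
import math
import random

def swendsen_wang_step(adj, spins, beta):
    """One Swendsen-Wang update for the ferromagnetic Ising model.
    adj: dict node -> list of neighbours; spins: dict node -> +-1."""
    p_open = 1.0 - math.exp(-2.0 * beta)
    # 1) percolation step: open each equal-spin edge independently
    open_nbrs = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            if u < v and spins[u] == spins[v] and random.random() < p_open:
                open_nbrs[u].append(v)
                open_nbrs[v].append(u)
    # 2) cluster step: flip each connected component with probability 1/2
    seen = set()
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in open_nbrs[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
                    comp.append(v)
        if random.random() < 0.5:
            for u in comp:
                spins[u] = -spins[u]
    return spins

# toy 4-cycle
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
spins = {u: random.choice((-1, 1)) for u in adj}
for _ in range(100):
    swendsen_wang_step(adj, spins, beta=0.8)
print(spins)
```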

    Training Triplet Networks with GAN

    Maciej Zieba, Lei Wang
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Triplet networks are widely used models that are characterized by good
    performance in classification and retrieval tasks. In this work we propose to
    train a triplet network by using it as the discriminator in Generative
    Adversarial Nets (GANs). We make use of the discriminator's strong capability
    for representation learning to increase the predictive quality of the model.
    We evaluated our approach on the Cifar10 and MNIST datasets and observed
    significant improvement in classification performance using the simple k-NN
    method.

    Variance Based Moving K-Means Algorithm

    Vibin Vijay, Raghunath Vp, Amarjot Singh, SN Omar
    Comments: Accepted at the 7th IEEE International Advance Computing Conference (IACC-2017)
    Subjects: Learning (cs.LG)

    Clustering is a useful exploratory data analysis method with wide
    applicability in multiple fields. However, data clustering greatly relies on
    the initialization of cluster centers, which can result in large intra-cluster
    variance and dead centers, leading to sub-optimal solutions. This paper
    proposes a novel variance-based version of the conventional Moving K-Means
    (MKM) algorithm, called Variance Based Moving K-Means (VMKM), that can
    partition data into optimal homogeneous clusters irrespective of cluster
    initialization. The algorithm utilizes a novel distance metric and a unique
    data element selection criterion to transfer selected elements between
    clusters, achieving low intra-cluster variance and subsequently avoiding dead
    centers. Quantitative and qualitative comparisons with various clustering
    techniques are performed on four datasets selected from image processing,
    bioinformatics, remote sensing and the stock market. An extensive analysis
    highlights the superior performance of the proposed method over other
    techniques.

    Speech signals frequency modulation decoding via deep neural networks

    Dan Elbaz, Michael Zibulevsky
    Subjects: Learning (cs.LG); Sound (cs.SD)

    Long short-term memory (LSTM; Hochreiter et al.) is a powerful type of neural
    network that can capture long-range dependencies and nonlinear dynamics.
    Frequency modulation (FM) is a nonlinear encoding of information in a carrier
    wave. In this paper we present a novel application of software defined radio
    (SDR) based on LSTM recurrent neural networks to decode an audio signal from
    its noisy frequency-demodulated version.

    TransNets: Learning to Transform for Recommendation

    Rose Catherine, William Cohen
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Learning (cs.LG)

    Recently, deep learning methods have been shown to improve the performance of
    recommender systems over traditional methods, especially when review text is
    available. For example, a recent model, DeepCoNN, uses neural nets to learn one
    latent representation for the text of all reviews written by a target user, and
    a second latent representation for the text of all reviews for a target item,
    and then combines these latent representations to obtain state-of-the-art
    performance on recommendation tasks. We show that (unsurprisingly) much of the
    predictive value of review text comes from reviews of the target user for the
    target item. We then introduce a way in which this information can be used in
    recommendation, even when the target user’s review for the target item is not
    available. Our model, called TransNets, extends the DeepCoNN model by
    introducing an additional latent layer representing the target user-target item
    pair. We then regularize this layer, at training time, to be similar to another
    latent representation of the target user’s review of the target item. We show
    that TransNets and extensions of it improve substantially over the previous
    state-of-the-art.

    NILC-USP at SemEval-2017 Task 4: A Multi-view Ensemble for Twitter Sentiment Analysis

    Edilson A. Corrêa Jr., Vanessa Queiroz Marinho, Leandro Borges dos Santos
    Comments: Published in Proceedings of SemEval-2017, 5 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    This paper describes our multi-view ensemble approach to SemEval-2017 Task 4
    on Sentiment Analysis in Twitter, specifically, the Message Polarity
    Classification subtask for English (subtask A). Our system is a voting
    ensemble, where each base classifier is trained in a different feature space.
    The first space is a bag-of-words model and has a Linear SVM as base
    classifier. The second and third spaces are two different strategies of
    combining word embeddings to represent sentences and use a Linear SVM and a
    Logistic Regressor as base classifiers. The proposed system was ranked 18th out
    of 38 systems considering F1 score and 20th considering recall.

    Recurrent Environment Simulators

    Silvia Chiappa, Sébastien Racaniere, Daan Wierstra, Shakir Mohamed
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Models that can simulate how environments change in response to actions can
    be used by agents to plan and act efficiently. We improve on previous
    environment simulators from high-dimensional pixel observations by introducing
    recurrent neural networks that are able to make temporally and spatially
    coherent predictions for hundreds of time-steps into the future. We present an
    in-depth analysis of the factors affecting performance, providing the most
    extensive attempt to advance the understanding of the properties of these
    models. We address the issue of computational inefficiency with a model that
    does not need to generate a high-dimensional image at each time-step. We show
    that our approach can be used to improve exploration and is adaptable to many
    diverse environments, namely 10 Atari games, a 3D car racing environment, and
    complex 3D mazes.

    Sampling of Graph Signals via Determinantal Point Processes (Echantillonnage de signaux sur graphes via des processus déterminantaux)

    Nicolas Tremblay (CNRS, GIPSA-CICS), Simon Barthelme (CNRS, GIPSA-VIBS), Pierre-Olivier Amblard (CNRS, GIPSA-CICS)
    Comments: in French
    Subjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM); Learning (cs.LG)

    We consider the problem of sampling k-bandlimited graph signals, i.e., linear
    combinations of the first k graph Fourier modes. We know that a set of k nodes
    embedding all k-bandlimited signals always exists, thereby enabling their
    perfect reconstruction after sampling. Unfortunately, to exhibit such a set,
    one needs to partially diagonalize the graph Laplacian, which becomes
    prohibitive at large scale. We propose a novel strategy based on determinantal
    point processes that side-steps partial diagonalisation and enables
    reconstruction with only O(k) samples. While doing so, we exhibit a new
    general algorithm to sample determinantal point processes, faster than the
    state-of-the-art algorithm by a factor of order k.

    OBTAIN: Real-Time Beat Tracking in Audio Signals

    Ali Mottaghi, Kayhan Behdin, Ashkan Esmaeili, Mohammadreza Heydari, Farokh Marvasti
    Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Learning (cs.LG); Multimedia (cs.MM)

    In this paper, we design a system to perform real-time beat tracking for an
    audio signal. We use the Onset Strength Signal (OSS) to detect onsets and
    estimate tempos. Then, we form the Cumulative Beat Strength Signal (CBSS) by
    taking advantage of the OSS and the estimated tempos. Next, we perform peak
    detection by extracting the periodic sequence of beats among all CBSS peaks. In
    simulations, our proposed algorithm, Online Beat TrAckINg (OBTAIN), outperforms
    state-of-the-art results in terms of prediction accuracy while maintaining
    comparable and practical computational complexity. The real-time performance
    can be observed visually in the simulations.
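
    A minimal sketch of an OSS-style front end is given below, using spectral flux
    for the onset strength and an autocorrelation peak for a tempo estimate;
    window sizes and the synthetic click track are arbitrary assumptions, and the
    CBSS and peak-picking stages of OBTAIN are omitted.

```python
import numpy as np

def onset_strength(x, sr, n_fft=1024, hop=512):
    """Spectral-flux OSS: positive magnitude increases summed over bins."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    mag = np.abs(np.fft.rfft(frames, axis=1))
    flux = np.maximum(mag[1:] - mag[:-1], 0.0).sum(axis=1)
    return flux, sr / hop                       # OSS and its frame rate

def estimate_tempo(oss, oss_rate, lo=60, hi=180):
    """Pick the autocorrelation peak inside a plausible BPM range."""
    z = oss - oss.mean()
    ac = np.correlate(z, z, mode="full")[len(z) - 1:]
    lags = np.arange(1, len(ac))
    bpm = 60.0 * oss_rate / lags
    valid = (bpm >= lo) & (bpm <= hi)
    best_lag = lags[valid][np.argmax(ac[1:][valid])]
    return 60.0 * oss_rate / best_lag

# synthetic click track at 120 BPM (one click every 0.5 s)
sr = 8000
x = 0.01 * np.random.default_rng(0).standard_normal(8 * sr)
x[np.arange(0, len(x), sr // 2)] += 1.0
oss, rate = onset_strength(x, sr)
print(round(estimate_tempo(oss, rate)))         # ~120, quantised by hop size
```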

    Hierarchical Clustering: Objective Functions and Algorithms

    Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn, Claire Mathieu
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    Hierarchical clustering is a recursive partitioning of a dataset into
    clusters at an increasingly finer granularity. Motivated by the fact that most
    work on hierarchical clustering has focused on providing algorithms rather
    than optimizing a specific objective, Dasgupta framed similarity-based
    hierarchical clustering as a combinatorial optimization problem, where a
    'good' hierarchical clustering is one that minimizes some cost function. He
    showed that this cost function has certain desirable properties.

    We take an axiomatic approach to defining 'good' objective functions for both
    similarity- and dissimilarity-based hierarchical clustering. We characterize a
    set of "admissible" objective functions (which includes Dasgupta's) that have
    the property that, when the input admits a 'natural' hierarchical clustering,
    that clustering has an optimal value.

    Equipped with a suitable objective function, we analyze the performance of
    practical algorithms, as well as develop better algorithms. For
    similarity-based hierarchical clustering, Dasgupta showed that the divisive
    sparsest-cut approach achieves an (O(log^{3/2} n))-approximation. We give a
    refined analysis of the algorithm and show that it in fact achieves an
    (O(sqrt{log n}))-approximation (Charikar and Chatziafratis independently
    proved the same bound). This improves upon the LP-based
    (O(log n))-approximation of Roy and Pokutta. For dissimilarity-based
    hierarchical clustering, we show that the classic average-linkage algorithm
    gives a factor-2 approximation, and we provide a simple and better algorithm
    that gives a factor-3/2 approximation.

    Finally, we consider a 'beyond-worst-case' scenario through a generalisation
    of the stochastic block model for hierarchical clustering. We show that
    Dasgupta's cost function has desirable properties for these inputs and we
    provide a simple (1 + o(1))-approximation in this setting.
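
    For concreteness, Dasgupta's similarity-based cost charges each pair (i, j)
    its similarity times the number of leaves under the pair's least common
    ancestor; a small evaluation on nested-tuple trees is sketched below.

```python
def dasgupta_cost(tree, sim):
    """Dasgupta's objective: sum over pairs (i, j) of
    sim[i][j] * |leaves under the least common ancestor of i and j|.
    tree: nested tuples with integer leaves; sim: symmetric matrix."""
    def walk(node):
        if isinstance(node, int):
            return [node], 0.0
        left_leaves, left_cost = walk(node[0])
        right_leaves, right_cost = walk(node[1])
        leaves = left_leaves + right_leaves
        # pairs split at this node have their LCA here
        here = sum(sim[i][j] for i in left_leaves for j in right_leaves)
        return leaves, left_cost + right_cost + here * len(leaves)
    return walk(tree)[1]

sim = [[0, 9, 1, 1],
       [9, 0, 1, 1],
       [1, 1, 0, 9],
       [1, 1, 9, 0]]
good = ((0, 1), (2, 3))   # groups the two highly similar pairs early
bad = ((0, 2), (1, 3))
print(dasgupta_cost(good, sim), dasgupta_cost(bad, sim))  # 52 vs 84
```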

    Quantum ensembles of quantum classifiers

    Maria Schuld, Francesco Petruccione
    Comments: 19 pages, 9 figures
    Subjects: Quantum Physics (quant-ph); Learning (cs.LG); Statistics Theory (math.ST)

    Quantum machine learning witnesses an increasing amount of quantum algorithms
    for data-driven decision making, a problem with potential applications ranging
    from automated image recognition to medical diagnosis. Many of those algorithms
    are implementations of quantum classifiers, or models for the classification of
    data inputs with a quantum computer. Following the success of collective
    decision making with ensembles in classical machine learning, this paper
    introduces the concept of quantum ensembles of quantum classifiers. Creating
    the ensemble corresponds to a state preparation routine, after which the
    quantum classifiers are evaluated in parallel and their combined decision is
    accessed by a single-qubit measurement. This framework naturally allows for
    exponentially large ensembles in which — similar to Bayesian learning — the
    individual classifiers do not have to be trained. As an example, we analyse an
    exponentially large quantum ensemble in which each classifier is weighted
    according to its performance in classifying the training data, leading to new
    results for quantum as well as classical machine learning.

    Jet Constituents for Deep Neural Network Based Top Quark Tagging

    Jannicke Pearkes, Wojciech Fedorko, Alison Lister, Colin Gay
    Comments: 20 pages, 13 figures
    Subjects: High Energy Physics – Experiment (hep-ex); Learning (cs.LG); High Energy Physics – Phenomenology (hep-ph); Machine Learning (stat.ML)

    Recent literature on deep neural networks for tagging of highly energetic
    jets resulting from top quark decays has focused on image based techniques or
    multivariate approaches using high level jet substructure variables. Here a
    sequential approach to this task is taken by using an ordered sequence of jet
    constituents as training inputs. Unlike previous approaches, this strategy does
    not result in a loss of information during pixelisation or the calculation of
    high level features. New preprocessing methods that do not alter key physical
    quantities such as the jet mass are developed. The jet classification method
    achieves a background rejection of 45 at a 50% efficiency operating point for
    reconstruction-level jets in the transverse momentum range of 600 to 2500 GeV,
    and is insensitive to multiple proton-proton interactions at the levels
    expected throughout LHC Run 2.

    Restricted Isometry Property of Gaussian Random Projection for Finite Set of Subspaces

    Gen Li, Yuantao Gu
    Comments: 30 pages, 6 figures
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    Dimension reduction plays an essential role when decreasing the complexity of
    solving large-scale problems. The well-known Johnson-Lindenstrauss (JL) Lemma
    and Restricted Isometry Property (RIP) admit the use of random projection to
    reduce the dimension while keeping the Euclidean distance, which leads to the
    boom of Compressed Sensing and the field of sparsity related signal processing.
    Recently, successful applications of sparse models in computer vision and
    machine learning have increasingly hinted that the underlying structure of high
    dimensional data looks more like a union of subspaces (UoS). In this paper,
    motivated by JL Lemma and an emerging field of Compressed Subspace Clustering,
    we study for the first time the RIP of Gaussian random matrix for compressing
    two subspaces. We theoretically prove that, with high probability, the
    affinity or distance between two projected subspaces is concentrated around
    its estimate. When the ambient dimension after projection is sufficiently
    large, the affinity and distance between two subspaces remain almost unchanged
    after random projection. Numerical experiments verify the theoretical results.
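
    The effect is easy to probe numerically: draw two random subspaces, project
    them with a Gaussian matrix, and compare the affinity (computed from principal
    angles) before and after, as in the sketch below with arbitrarily chosen
    dimensions.

```python
import numpy as np

def affinity(U, V):
    """Subspace affinity from principal angles: the singular values of
    U^T V (orthonormal bases) are the cosines of the angles."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.sqrt((s ** 2).sum() / min(U.shape[1], V.shape[1]))

def orth(A):
    return np.linalg.qr(A)[0]

rng = np.random.default_rng(0)
N, n, d = 1000, 200, 5          # ambient dim, projected dim, subspace dims
U = orth(rng.standard_normal((N, d)))
V = orth(rng.standard_normal((N, d)))
Phi = rng.standard_normal((n, N)) / np.sqrt(n)   # Gaussian projection

print("affinity before:", affinity(U, V))
print("affinity after :", affinity(orth(Phi @ U), orth(Phi @ V)))
```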

    Treatment-Response Models for Counterfactual Reasoning with Continuous-time, Continuous-valued Interventions

    Hossein Soleimani, Adarsh Subbaswamy, Suchi Saria
    Comments: The first two authors contributed equally to this work
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Treatment effects can be estimated from observational data as the difference
    in potential outcomes. In this paper, we address the challenge of estimating
    the potential outcome when treatment-dose levels can vary continuously over
    time. Further, the outcome variable may not be measured at a regular frequency.
    Our proposed solution represents the treatment response curves using linear
    time-invariant dynamical systems—this provides a flexible means for modeling
    response over time to highly variable dose curves. Moreover, for multivariate
    data, the proposed method: uncovers shared structure in treatment response and
    the baseline across multiple markers; and, flexibly models challenging
    correlation structure both across and within signals over time. For this, we
    build upon the framework of multiple-output Gaussian Processes. On simulated
    data and a challenging clinical dataset, we show significant gains in accuracy
    over state-of-the-art models.

    Recognizing Multi-talker Speech with Permutation Invariant Training

    Dong Yu, Xuankai Chang, Yanmin Qian
    Comments: 5 pages, 6 figures, InterSpeech2017
    Subjects: Sound (cs.SD); Learning (cs.LG)

    In this paper, we propose a novel technique for the direct recognition of
    multiple speech streams given a single channel of mixed speech, without first
    separating them. Our technique is based on permutation invariant training (PIT)
    for automatic speech recognition (ASR). In PIT-ASR, we compute the average
    cross entropy (CE) over all frames in the whole utterance for each possible
    output-target assignment, pick the one with the minimum CE, and optimize for
    that assignment. PIT-ASR forces all the frames of the same speaker to be
    aligned with the same output layer. This strategy elegantly solves the label
    permutation problem and speaker tracing problem in one shot. Our experiments on
    artificially mixed AMI data showed that the proposed approach is very
    promising.
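
    A numpy sketch of the utterance-level PIT criterion described above: average
    the cross entropy over all frames for every output-target assignment and keep
    the minimum. Shapes and data are toy assumptions.

```python
import itertools
import numpy as np

def pit_cross_entropy(log_probs, targets):
    """Utterance-level permutation invariant training loss.
    log_probs: (speakers, frames, classes) log-posteriors, one output
               layer per speaker
    targets:   (speakers, frames) integer label targets
    Returns the minimum, over output-target assignments, of the CE
    averaged over all frames of the whole utterance."""
    S, T, _ = log_probs.shape
    best, best_perm = np.inf, None
    for perm in itertools.permutations(range(S)):
        # CE of output stream s against target stream perm[s]
        ce = -np.mean([log_probs[s, np.arange(T), targets[perm[s]]]
                       for s in range(S)])
        if ce < best:
            best, best_perm = ce, perm
    return best, best_perm

rng = np.random.default_rng(0)
S, T, C = 2, 50, 10
logits = rng.standard_normal((S, T, C))
log_probs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
targets = rng.integers(0, C, size=(S, T))
print(pit_cross_entropy(log_probs, targets))
```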

    Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction

    Chunting Zhou, Graham Neubig
    Comments: Accepted by ACL 2017
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Labeled sequence transduction is a task of transforming one sequence into
    another sequence that satisfies desiderata specified by a set of labels. In
    this paper we propose multi-space variational encoder-decoders, a new model for
    labeled sequence transduction with semi-supervised learning. The generative
    model can use neural networks to handle both discrete and continuous latent
    variables to exploit various features of data. Experiments show that our model
    provides not only a powerful supervised framework but also can effectively take
    advantage of the unlabeled data. On the SIGMORPHON morphological inflection
    benchmark, our model outperforms single-model state-of-the-art results by a large
    margin for the majority of languages.

    A Homological Theory of Functions

    Greg Yang
    Comments: 72 pages, 22 figures. Comments welcome
    Subjects: Commutative Algebra (math.AC); Computational Complexity (cs.CC); Discrete Mathematics (cs.DM); Learning (cs.LG); Combinatorics (math.CO)

    In computational complexity, a complexity class is given by a set of problems
    or functions, and a basic challenge is to show separations of complexity
    classes (A ≠ B), especially when (A) is known to be a subset of (B). In this
    paper we introduce a homological theory of functions that can be used to
    establish complexity separations, while also providing other interesting
    consequences. We propose to associate a topological space (S_A) to each class
    of functions (A), such that, to separate complexity classes (A subseteq B’),
    it suffices to observe a change in “the number of holes”, i.e. homology, in
    (S_A) as a subclass (B) of (B’) is added to (A). In other words, if the
    homologies of (S_A) and (S_{A cup B}) are different, then (A ≠ B’). We
    develop the underlying theory of functions based on combinatorial and
    homological commutative algebra and Stanley-Reisner theory, and recover Minsky
    and Papert’s 1969 result that parity cannot be computed by nonmaximal degree
    polynomial threshold functions. In the process, we derive a “maximal principle”
    for polynomial threshold functions that is used to extend this result further
    to arbitrary symmetric functions. A surprising coincidence is demonstrated,
    where the maximal dimension of “holes” in (S_A) upper bounds the VC dimension
    of (A), with equality for common computational cases such as the class of
    polynomial threshold functions or the class of linear functionals in (mathbb
    F_2), or common algebraic cases such as when the Stanley-Reisner ring of (S_A)
    is Cohen-Macaulay. As another interesting application of our theory, we prove a
    result that a priori has nothing to do with complexity separation: it
    characterizes when a vector subspace intersects the positive cone, in terms of
    homological conditions. By analogy to Farkas’ result doing the same with
    *linear conditions*, we call our theorem the Homological Farkas Lemma.


    Information Theory

    Spatial Alignment of Coding and Modulation Helps Content Delivery

    Dongzhu Liu, Kaibin Huang
    Comments: 30 pages, submitted to IEEE Trans. on Wireless Communication
    Subjects: Information Theory (cs.IT)

    Multimedia content, especially video, is expected to dominate data traffic in
    next-generation mobile networks. Caching popular content at the network edge
    has emerged as a solution for low-latency content delivery. Compared with
    traditional wireless communication, content delivery has a key characteristic:
    many signals coexisting in the air carry identical popular content. They can,
    however, interfere with each other at a receiver if their
    modulation-and-coding (MAC) schemes are adapted to individual channels
    following the classic approach. To address this issue, we present a novel idea
    of content adaptive MAC (CAMAC), where adapting MAC schemes to content ensures
    that all signals carrying identical content are encoded using an identical MAC
    scheme, achieving spatial MAC alignment. Consequently, interference can be
    harnessed as signals to improve the reliability of wireless delivery. In the
    remaining part of the paper, we focus on quantifying the gain CAMAC can bring
    to a content-delivery network using a stochastic-geometry model. Specifically,
    content helpers are distributed as a Poisson point process, each of which
    transmits a file from a content database based on a given popularity
    distribution. It is discovered that the successful content-delivery probability
    is closely related to the distribution of the ratio of two independent shot
    noise processes, named a shot-noise ratio. The distribution itself is an open
    mathematical problem that we tackle in this work. Using stable-distribution
    theory and tools from stochastic geometry, the distribution function is derived
    in closed form. Extending the result in the context of content-delivery
    networks with CAMAC yields the content-delivery probability in different closed
    forms. In addition, the gain in the probability due to CAMAC is shown to grow
    with the level of skewness in the content popularity distribution.

    A Converse Bound on Wyner-Ahlswede-Körner Network via Gray-Wyner Network

    Shun Watanabe
    Comments: 5 pages, 0 figure
    Subjects: Information Theory (cs.IT)

    We show a reduction method to construct a code for the Gray-Wyner (GW)
    network from a given code for the Wyner-Ahlswede-Körner (WAK) network. By
    combining this reduction with a converse bound on the GW network, we derive a
    converse bound on the WAK network. The derived bound gives an alternative proof
    of the strong converse theorem for the WAK network.

    Modulation in the Air: Backscatter Communication over Ambient OFDM Carrier

    Gang Yang, Ying-Chang Liang, Rui Zhang
    Comments: 32 pages, 10 figures, journal paper
    Subjects: Information Theory (cs.IT)

    Ambient backscatter communication (AmBC) enables radio-frequency (RF) powered
    backscatter devices (BDs) (e.g., sensors, tags) to modulate their information
    bits over ambient RF carriers in an over-the-air manner. This technology, also
    called “modulation in the air”, has thus emerged as a promising solution for
    achieving green communications for the future Internet-of-Things. This paper
    studies an AmBC system by leveraging ambient orthogonal frequency division
    multiplexing (OFDM) modulated signals in the air. We first model such an AmBC
    system from a spread-spectrum communication perspective, upon which a novel
    joint design for the BD waveform and receiver detector is proposed. The BD
    symbol period is designed to be, in general, an integer multiple of the OFDM
    symbol period; the waveform for BD bit ‘0’ maintains the same state within a
    BD symbol period, while the waveform for BD bit ‘1’ has a state transition in
    the middle of each OFDM symbol period within a BD symbol period. In the
    receiver detector design, we construct the test statistic that cancels out the
    direct-link interference by exploiting the repeating structure of the ambient
    OFDM signals due to the use of cyclic prefix. For the system with a
    single-antenna receiver, the maximum-likelihood detector is proposed to recover
    the BD bits, for which the optimal threshold is obtained in closed-form
    expression. For the system with a multi-antenna receiver, we propose a new test
    statistic, and derive the optimal detector. Moreover, practical timing
    synchronization algorithms are proposed, and we also analyze the effect of
    various system parameters on the system performance. Finally, extensive
    numerical results are provided to verify that the proposed transceiver design
    can improve the system bit-error-rate (BER) performance and the operating range
    significantly, and achieve much higher data rate, as compared to the
    conventional design.
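
    The described waveform design can be sketched directly: bit ‘0’ holds one
    backscatter state across the whole BD symbol, while bit ‘1’ toggles in the
    middle of every OFDM symbol period it spans. The sample counts below are
    arbitrary assumptions.

```python
import numpy as np

def bd_waveform(bits, ofdm_per_bd=2, samples_per_ofdm=8):
    """Backscatter-state sequence for a backscatter device (BD).
    bit 0: constant state over the whole BD symbol;
    bit 1: a state flip in the middle of each OFDM symbol it spans."""
    wave = []
    for b in bits:
        for _ in range(ofdm_per_bd):
            half = samples_per_ofdm // 2
            if b == 0:
                wave += [1] * samples_per_ofdm
            else:
                wave += [1] * half + [-1] * half
    return np.array(wave)

print(bd_waveform([0, 1], ofdm_per_bd=2, samples_per_ofdm=8))
```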

    Modeling and Analysis of HetNets with mm-Wave Multi-RAT Small Cells Deployed Along Roads

    Gourab Ghatak, Antonio De Domenico, Marceau Coupechoux
    Comments: A 7-page version is submitted to IEEE GLOBECOM 2017
    Subjects: Information Theory (cs.IT)

    We characterize a multi-tier network with classical macro cells and
    multi-radio access technology (RAT) small cells that are able to operate in
    both microwave and millimeter-wave (mm-wave) bands. The small cells are
    assumed to be deployed along roads modeled as a Poisson line process. This
    characterization is more realistic than the classical Poisson point processes
    typically used in the literature. In this context, we derive the association
    and RAT selection probabilities of the typical user under various system
    parameters, such as the small cell deployment density and mm-wave antenna
    gain, and with varying street densities. Finally, we calculate the
    signal-to-interference-plus-noise ratio (SINR) coverage probability for the
    typical user, considering a tractable dominant-interference-based model for
    mm-wave interference. Our analysis reveals the need to deploy more small cells
    per street in cities with more streets in order to maintain coverage, and
    highlights that the mm-wave RAT in small cells can help to improve the SINR
    performance of the users.

    Non-linear maximum rank distance codes in the cyclic model for the field reduction of finite geometries

    Nicola Durante, Alessandro Siciliano
    Comments: submitted; 22 pages
    Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

    In this paper we construct infinite families of non-linear maximum rank
    distance codes by using the setting of bilinear forms of a finite vector space.
    We also give a geometric description of such codes by using the cyclic model
    for the field reduction of finite geometries and we show that these families
    contain the non-linear maximum rank distance codes recently provided by
    Cossidente, Marino and Pavese.

    Restricted Isometry Property of Gaussian Random Projection for Finite Set of Subspaces

    Gen Li, Yuantao Gu
    Comments: 30 pages, 6 figures
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    Dimension reduction plays an essential role when decreasing the complexity of
    solving large-scale problems. The well-known Johnson-Lindenstrauss (JL) Lemma
    and Restricted Isometry Property (RIP) admit the use of random projection to
    reduce the dimension while keeping the Euclidean distance, which leads to the
    boom of Compressed Sensing and the field of sparsity related signal processing.
    Recently, successful applications of sparse models in computer vision and
    machine learning have increasingly hinted that the underlying structure of high
    dimensional data looks more like a union of subspaces (UoS). In this paper,
    motivated by JL Lemma and an emerging field of Compressed Subspace Clustering,
    we study for the first time the RIP of Gaussian random matrix for compressing
    two subspaces. We theoretically prove that, with high probability, the
    affinity or distance between two projected subspaces is concentrated around
    its estimate. When the ambient dimension after projection is sufficiently
    large, the affinity and distance between two subspaces remain almost unchanged
    after random projection. Numerical experiments verify the theoretical results.

    Total Variation Minimization in Compressed Sensing

    Felix Krahmer, Christian Kruschel, Michael Sandbichler
    Comments: 21 pages, 1 figure
    Subjects: Information Theory (cs.IT)

    This chapter gives an overview of recovery guarantees for total variation
    minimization in compressed sensing under different measurement scenarios. In
    addition to summarizing the results in the area, we illustrate why an approach
    that is common for synthesis-sparse signals fails and different techniques are
    necessary. Lastly, we discuss generalizations of recent results for Gaussian
    measurements to the subgaussian case.

    Codes with Unequal Disjoint Local Erasure Correction Constraints

    Geonu Kim, Jungwoo Lee
    Comments: 15 pages. arXiv admin note: text overlap with arXiv:1701.07340
    Subjects: Information Theory (cs.IT)

    Recently, locally repairable codes (LRCs) with local erasure correction
    constraints that are unequal and disjoint have been proposed. In this work, we
    study the same topic and provide some improved and additional results.

    Link Flow Correction For Inconsistent Traffic Flow Data Via (ell_1)-Minimization

    Penghang Yin, Zhe Sun, Wenlong Jin, Jack Xin
    Subjects: Information Theory (cs.IT); Physics and Society (physics.soc-ph)

    A computational method, based on (ell_1)-norm minimization, is proposed for
    the problem of traffic flow correction. Without extra information, the problem
    is generally ill-posed when a large portion of the link sensors are unhealthy.
    It is possible, however, to accurately correct the corruptions with the
    proposed method if there are only a few bad sensors, located at certain links.
    We mathematically quantify the links that are robust to miscounts and relate
    them to the geometric structure of the traffic network. In a more realistic
    setting where, besides the unhealthy link sensors, small measurement noise is
    present at the other sensors, our method is guaranteed to give an estimated
    traffic flow fairly close to the ground truth. Both toy and real-world examples
    are provided to demonstrate the effectiveness of the proposed method.
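
    A toy instance of the (ell_1) correction, using the standard
    linear-programming reformulation with scipy; the network, measurements, and
    corrupted sensor are hand-made assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network: link 0 enters a node that splits into links 1 and 2,
# and link 1 enters a node that continues into link 3.
# Flow conservation A x = 0 encodes x0 = x1 + x2 and x1 = x3.
A = np.array([[1.0, -1.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])

true_x = np.array([100.0, 60.0, 40.0, 60.0])
m = true_x.copy()
m[1] = 10.0                     # the sensor on link 1 badly under-counts

# min ||x - m||_1 s.t. A x = 0.  Write x = m + e_plus - e_minus with
# e_plus, e_minus >= 0 and minimize their sum (standard LP reformulation).
n = len(m)
res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]), b_eq=-A @ m,
              bounds=[(0, None)] * (2 * n))
e = res.x[:n] - res.x[n:]
print("corrected flows:", m + e)   # recovers [100, 60, 40, 60]
```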

    Secure Transmission of Delay-Sensitive Data over Wireless Fading Channels

    Mustafa Ozmen, M. Cenk Gursoy
    Subjects: Information Theory (cs.IT)

    In this paper, throughput and energy efficiency of secure wireless
    transmission of delay sensitive data generated by random sources is studied. A
    fading broadcast model in which the transmitter sends confidential and common
    messages to two receivers is considered. It is assumed that the common and
    confidential data, generated from Markovian sources, is stored in buffers prior
    to transmission, and the transmitter operates under constraints on buffer/delay
    violation probability. Under such statistical quality of service (QoS)
    constraints, effective capacity of time-varying wireless transmissions and
    effective bandwidth of Markovian sources are employed to determine the
    throughput. In particular, secrecy capacity is used to describe the service
    rate of buffers containing confidential messages. Moreover, energy per bit is
    used as the energy efficiency metric and energy efficiency is studied in the
    low signal-to-noise ratio (SNR) regime. Specifically, the minimum energy per bit required
    for the reliable communication of common and confidential messages is
    determined and wideband slope expressions are identified. The impact of
    buffer/delay constraints, correlation between channels, source
    characteristics/burstiness, channel knowledge at the transmitter, power
    allocation, and secrecy requirements on the throughput and energy efficiency of
    common and confidential message transmissions is identified.

    An efficient algorithm for compression-based compressed sensing

    Sajjad Beygi, Shirin Jalali, Arian Maleki, Urbashi Mitra
    Subjects: Information Theory (cs.IT)

    Modern image and video compression codes employ elaborate structures existing
    in such signals to encode them into a small number of bits. Compressed sensing
    recovery algorithms, on the other hand, use such signals' structures to
    recover them from a few linear observations. Despite the steady progress in the field of
    compressed sensing, structures that are often used for signal recovery are
    still much simpler than those employed by state-of-the-art compression codes.
    The main goal of this paper is to bridge this gap through answering the
    following question: Can one employ a given compression code to build an
    efficient (polynomial time) compressed sensing recovery algorithm? In response
    to this question, the compression-based gradient descent (C-GD) algorithm is
    proposed. C-GD, which is a low-complexity iterative algorithm, is able to
    employ a generic compression code for compressed sensing and therefore elevates
    the scope of structures used in compressed sensing to those used by compression
    codes. The convergence performance of C-GD and its required number of
    measurements in terms of the rate-distortion performance of the compression
    code are theoretically analyzed. It is also shown that C-GD is robust to
    additive white Gaussian noise. Finally, the presented simulation results show
    that combining C-GD with commercial image compression codes such as JPEG2000
    yields state-of-the-art performance in imaging applications.
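
    A sketch of the C-GD-style iteration under a toy "compression code" (keeping
    the k largest DCT coefficients, standing in for a codec such as JPEG2000): a
    gradient step on ||y - Ax||^2 followed by compressing and decompressing the
    iterate. Step size and problem sizes are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

def toy_code(x, k):
    """Stand-in compression code: keep the k largest DCT coefficients."""
    c = dct(x, norm="ortho")
    out = np.zeros_like(c)
    keep = np.argsort(np.abs(c))[-k:]
    out[keep] = c[keep]
    return idct(out, norm="ortho")

def c_gd(y, A, k, eta=0.3, iters=500):
    """C-GD-style iteration: gradient step on ||y - Ax||^2, then
    'project' the iterate through the compression code."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + eta * A.T @ (y - A @ x)
        x = toy_code(x, k)
    return x

rng = np.random.default_rng(0)
n, m, k = 256, 120, 10
coeffs = np.zeros(n)
coeffs[rng.choice(n, k, replace=False)] = 5 * rng.standard_normal(k)
x_true = idct(coeffs, norm="ortho")              # exactly compressible signal
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_hat = c_gd(A @ x_true, A, k)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```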

    Random Access Analysis for Massive IoT Networks under A New Spatio-Temporal Model: A Stochastic Geometry Approach

    Nan Jiang, Yansha Deng, Xin Kang, Arumugam Nallanathan
    Comments: This manuscript has been submitted to IEEE Transactions on Wireless Communications
    Subjects: Information Theory (cs.IT)

    Massive Internet of Things (mIoT) has provided an auspicious opportunity to
    build powerful and ubiquitous connections, but faces a plethora of new
    challenges, for which cellular networks are potential solutions due to their
    scalability, reliability, and efficiency. The contention-based random access
    procedure (RACH) is the first step of connection establishment between IoT
    devices and Base Stations (BSs) in the cellular-based mIoT network, where
    modelling the interactions between the static properties of the physical-layer
    network and the dynamic properties of the queue evolving in each IoT device is
    challenging. To tackle this, we provide a novel traffic-aware spatio-temporal
    model to analyze RACH in cellular-based mIoT networks, where the
    physical-layer network is modelled and analyzed based on stochastic geometry
    in the spatial domain, and the queue evolution is analyzed based on
    probability theory in the time domain. For performance evaluation, we derive
    the exact expressions for the preamble transmission success probabilities of a
    randomly chosen IoT device with different RACH transmission schemes in each
    time slot, which offer insights into the effectiveness of each transmission
    scheme. Our derived analytical results are verified by realistic simulations
    capturing the evolution of packets in each IoT device. This mathematical model
    and analytical framework can be applied to evaluate the performance of other
    types of RACH transmission schemes in cellular-based networks by simply
    integrating their packet evolution principles.

    A Delay-Aware Caching Algorithm for Wireless D2D Caching Networks

    Yi Li, M. Cenk Gursoy, Senem Velipasalar
    Subjects: Information Theory (cs.IT)

    Recently, wireless caching techniques have been studied to satisfy lower
    delay requirements and to offload traffic from peak periods. By storing parts
    of the popular files at the mobile users, users can locate some of their
    requested files in their own caches or in the caches of their neighbors. In
    the latter case, when a user receives files from its neighbors,
    device-to-device (D2D) communication is enabled. D2D communication underlaid
    with cellular networks is also a new paradigm for the upcoming 5G wireless
    systems. By allowing a pair of adjacent D2D users to communicate directly, D2D
    communication can achieve higher throughput, better energy efficiency and
    lower traffic delay. In this work, we propose a very efficient caching
    algorithm for D2D-enabled cellular networks to minimize the average
    transmission delay. Instead of searching over all possible solutions, our
    algorithm finds the best <file, user> pairs, which provide the greatest delay
    improvement in each loop, to form a caching policy
    with very low transmission delay and high throughput. This algorithm is also
    extended to address a more general scenario, in which the distributions of
    fading coefficients and values of system parameters potentially change over
    time. Via numerical results, the superiority of the proposed algorithm is
    verified by comparing it with a naive algorithm, in which all users simply
    cache their favorite files.
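
    The greedy structure described above can be sketched as follows; the delay
    function here is a deliberately crude placeholder for the network model and
    fading statistics in the paper.

```python
def greedy_cache(users, files, cache_size, delay):
    """Each loop adds the <file, user> pair with the largest delay
    improvement until no addition helps or caches are full.
    delay(placement) -> average transmission delay (network model)."""
    placement = {u: set() for u in users}
    while True:
        base = delay(placement)
        best_pair, best_delay = None, base
        for u in users:
            if len(placement[u]) >= cache_size:
                continue
            for f in files:
                if f in placement[u]:
                    continue
                placement[u].add(f)
                d = delay(placement)
                placement[u].remove(f)
                if d < best_delay:
                    best_pair, best_delay = (u, f), d
        if best_pair is None:
            return placement
        placement[best_pair[0]].add(best_pair[1])

# crude delay model: local hit = 1, D2D from a neighbour = 3, base station = 10
popularity = {"A": 0.7, "B": 0.2, "C": 0.1}
users = [0, 1]
def toy_delay(placement):
    total = 0.0
    for u in users:
        for f, p in popularity.items():
            if f in placement[u]:
                total += p * 1
            elif any(f in placement[v] for v in users if v != u):
                total += p * 3
            else:
                total += p * 10
    return total / len(users)

print(greedy_cache(users, list(popularity), cache_size=1, delay=toy_delay))
# caches complement each other: {0: {'A'}, 1: {'B'}}
```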

    Precise Real-Time Navigation of LEO Satellites Using a Single-Frequency GPS Receiver and Ultra-Rapid Ephemerides

    Xiucong Sun, Chao Han, Pei Chen
    Comments: 25 pages, 6 figures, Ready for publication in Aerospace Science and Technology, 2017
    Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Information Theory (cs.IT)

    Precise (sub-meter level) real-time navigation using a space-capable
    single-frequency global positioning system (GPS) receiver and ultra-rapid
    (real-time) ephemerides from the international global navigation satellite
    systems service is proposed for low-Earth-orbiting (LEO) satellites. The C/A
    code and L1 carrier phase measurements are combined and single-differenced to
    eliminate first-order ionospheric effects and receiver clock offsets. A
    random-walk process is employed to model the phase ambiguities in order to
    absorb the time-varying and satellite-specific higher-order measurement errors
    as well as the GPS clock correction errors. A sequential Kalman filter that
    incorporates the known orbital dynamics model is developed to estimate the orbital
    states and phase ambiguities without matrix inversion. Real flight data from
    the single-frequency GPS receiver onboard China’s SJ-9A small satellite are
    processed to evaluate the orbit determination accuracy. Statistics from
    internal orbit assessments indicate that three-dimensional accuracies of better
    than 0.50 m and 0.55 mm/s are achieved for position and velocity, respectively.
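
    A generic sketch of the "no matrix inversion" point: a sequential Kalman
    filter processes scalar measurements one at a time, so each gain needs only a
    scalar division. The constant-velocity model below is an assumed stand-in for
    the orbital dynamics.

```python
import numpy as np

def sequential_update(x, P, H, z, r):
    """Sequential Kalman update: scalar measurements one at a time,
    so the 'inversion' is just a scalar division."""
    for i in range(len(z)):
        h = H[i]
        s = h @ P @ h + r[i]            # innovation variance (scalar)
        k = P @ h / s                   # gain
        x = x + k * (z[i] - h @ x)
        P = P - np.outer(k, h @ P)
    return x, P

# toy constant-velocity model standing in for the orbital dynamics
F = np.array([[1.0, 1.0], [0.0, 1.0]])     # state: [position, velocity]
Q = np.diag([1e-4, 1e-4])
H = np.array([[1.0, 0.0]])                 # position-only measurement
R = [0.01]

rng = np.random.default_rng(0)
truth = np.array([0.0, 0.5])
x, P = np.zeros(2), np.eye(2)
for _ in range(50):
    truth = F @ truth
    z = [truth[0] + 0.1 * rng.standard_normal()]
    x, P = F @ x, F @ P @ F.T + Q            # predict
    x, P = sequential_update(x, P, H, z, R)  # update, no matrix inverse
print("estimate:", x, "truth:", truth)
```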



