
    arXiv Paper Daily: Tue, 1 Nov 2016

    Published by 我爱机器学习 (52ml.net) on 2016-11-01 00:00:00

    Neural and Evolutionary Computing

    Tensor Switching Networks

    Chuan-Yung Tsai, Andrew Saxe, David Cox
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG); Machine Learning (stat.ML)

    We present a novel neural network algorithm, the Tensor Switching (TS)
    network, which generalizes the Rectified Linear Unit (ReLU) nonlinearity to
    tensor-valued hidden units. The TS network copies its entire input vector to
    different locations in an expanded representation, with the location determined
    by its hidden unit activity. In this way, even a simple linear readout from the
    TS representation can implement a highly expressive deep-network-like function.
    The TS network hence avoids the vanishing gradient problem by construction, at
    the cost of larger representation size. We develop several methods to train the
    TS network, including equivalent kernels for infinitely wide and deep TS
    networks, a one-pass linear learning algorithm, and two
    backpropagation-inspired representation learning algorithms. Our experimental
    results demonstrate that the TS network is indeed more expressive and
    consistently learns faster than standard ReLU networks.
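
    As a rough illustration of the copying mechanism described above, here is a
    minimal NumPy sketch of one TS layer (our reading of the abstract, not the
    authors' code): each hidden unit's ReLU gate decides whether a full copy of
    the input lands in that unit's slot of the expanded representation.

    ```python
    import numpy as np

    def ts_layer(x, W):
        # Binary hidden-unit activities: the ReLU *gate*, without the usual
        # scalar multiplication by the pre-activation value.
        gates = (W @ x > 0).astype(float)
        # Slot i of the expanded (tensor-valued) representation holds a full
        # copy of x if unit i is active, and zeros otherwise.
        return np.outer(gates, x)

    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)
    W = rng.standard_normal((8, 16))
    Z = ts_layer(x, W)                           # (8, 16) tensor-valued hidden layer
    y = rng.standard_normal(Z.size) @ Z.ravel()  # even a linear readout is expressive
    ```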

    Building Energy Load Forecasting using Deep Neural Networks

    Daniel L. Marino, Kasun Amarasinghe, Milos Manic
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Ensuring sustainability demands more efficient energy management with
    minimized energy wastage. Therefore, the power grid of the future should
    provide an unprecedented level of flexibility in energy management. To that
    end, intelligent decision making requires accurate predictions of future energy
    demand/load, both at aggregate and individual site level. Thus, energy load
    forecasting has received increased attention in the recent past; however, it
    has proven to be a difficult problem. This paper presents a novel energy load
    forecasting methodology based on Deep Neural Networks, specifically Long Short
    Term Memory (LSTM) algorithms. The presented work investigates two variants of
    the LSTM: 1) standard LSTM and 2) LSTM-based Sequence to Sequence (S2S)
    architecture. Both methods were implemented on a benchmark data set of
    electricity consumption data from one residential customer. Both architectures
    were trained and tested on one-hour and one-minute time-step resolution
    datasets. Experimental results showed that the standard LSTM failed on
    one-minute resolution data while performing well on one-hour resolution data.
    It was shown that the S2S architecture performed well on both datasets. Further, it
    was shown that the presented methods produced results comparable to other
    deep learning methods for energy forecasting in the literature.
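
    As a concrete illustration, here is a minimal PyTorch sketch of the LSTM-based
    S2S variant described above (layer sizes, horizon and input shapes are
    illustrative, not the paper's settings):

    ```python
    import torch
    import torch.nn as nn

    class S2SForecaster(nn.Module):
        """Encoder summarizes past consumption; decoder rolls the forecast forward."""
        def __init__(self, hidden=64):
            super().__init__()
            self.encoder = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            self.decoder = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, history, horizon):
            _, state = self.encoder(history)      # encode the observed load sequence
            step = history[:, -1:, :]             # seed decoding with the last reading
            outputs = []
            for _ in range(horizon):              # autoregressive decoding
                out, state = self.decoder(step, state)
                step = self.head(out)
                outputs.append(step)
            return torch.cat(outputs, dim=1)

    model = S2SForecaster()
    past = torch.randn(8, 168, 1)                 # e.g. one week of hourly readings
    forecast = model(past, horizon=24)            # the next 24 hours
    ```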

    Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition

    Hagen Soltau, Hank Liao, Hasim Sak
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We present results that show it is possible to build a competitive, greatly
    simplified, large vocabulary continuous speech recognition system with whole
    words as acoustic units. We model the output vocabulary of about 100,000 words
    directly using deep bi-directional LSTM RNNs with CTC loss. The model is
    trained on 125,000 hours of semi-supervised acoustic training data, which
    enables us to alleviate the data sparsity problem for word models. We show that
    the CTC word models work very well as an end-to-end all-neural speech
    recognition model without the use of traditional context-dependent sub-word
    phone units that require a pronunciation lexicon, and without any language
    model, removing the need to decode. We demonstrate that the CTC word models
    perform better than a strong, more complex, state-of-the-art baseline with
    sub-word units.
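
    The training objective is standard CTC applied over a word vocabulary; a
    hedged PyTorch sketch of that loss computation (toy dimensions, not the
    paper's 100,000-word setup) looks like this:

    ```python
    import torch
    import torch.nn as nn

    T, N, V = 50, 4, 100                 # frames, batch size, word vocabulary
    logits = torch.randn(T, N, V + 1, requires_grad=True)   # +1 for the CTC blank
    log_probs = logits.log_softmax(dim=-1)

    targets = torch.randint(1, V + 1, (N, 7))               # word-label sequences
    input_lengths = torch.full((N,), T, dtype=torch.long)
    target_lengths = torch.full((N,), 7, dtype=torch.long)

    ctc = nn.CTCLoss(blank=0)            # the blank symbol absorbs the alignment
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
    loss.backward()
    ```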

    Depth Separation in ReLU Networks for Approximating Smooth Non-Linear Functions

    Itay Safran, Ohad Shamir
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    We provide a depth-based separation result for feed-forward ReLU neural
    networks, showing that a wide family of non-linear, twice-differentiable
    functions on \([0,1]^d\), which can be approximated to accuracy \(\epsilon\) by
    ReLU networks of depth and width \(\mathcal{O}(\mathrm{poly}(\log(1/\epsilon)))\),
    cannot be approximated to similar accuracy by constant-depth ReLU networks,
    unless their width is at least \(\Omega(1/\epsilon)\).

    A Survey of Brain Inspired Technologies for Engineering

    Jarryd Son, Amit Kumar Mishra
    Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Cognitive engineering is a multi-disciplinary field and hence it is difficult
    to find a review article consolidating the leading developments in the field.
    The incredible pace at which technology is advancing pushes the boundaries of
    what is achievable in cognitive engineering. There are also differing
    approaches to cognitive engineering brought about from the multi-disciplinary
    nature of the field and the vastness of possible applications. Thus research
    communities require more frequent reviews to keep up to date with the latest
    trends. In this paper we shall discuss some of the approaches to cognitive
    engineering holistically to clarify the reasoning behind the different
    approaches and to highlight their strengths and weaknesses. We shall then show
    how developments from seemingly disjointed views could be integrated to achieve
    the same goal of creating cognitive machines. By reviewing the major
    contributions in the different fields and showing the potential for a combined
    approach, this work intends to assist the research community in devising more
    unified methods and techniques for developing cognitive machines.

    Feature-Augmented Neural Networks for Patient Note De-identification

    Ji Young Lee, Franck Dernoncourt, Ozlem Uzuner, Peter Szolovits
    Comments: Accepted as a conference paper at COLING ClinicalNLP 2016. The first two authors contributed equally to this work
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Patient notes contain a wealth of information of potentially great interest
    to medical investigators. However, to protect patients’ privacy, Protected
    Health Information (PHI) must be removed from the patient notes before they can
    be legally released, a process known as patient note de-identification. The
    main objective for a de-identification system is to have the highest possible
    recall. Recently, the first neural-network-based de-identification system has
    been proposed, yielding state-of-the-art results. Unlike other systems, it does
    not rely on human-engineered features, which allows it to be quickly deployed,
    but does not leverage knowledge from human experts or from electronic health
    records (EHRs). In this work, we explore a method to incorporate
    human-engineered features as well as features derived from EHRs into a
    neural-network-based de-identification system. Our results show that the
    addition of features, especially the EHR-derived features, further improves the
    state-of-the-art in patient note de-identification, including for some of the
    most sensitive PHI types such as patient names. Since in a real-life setting
    patient notes typically come with EHRs, we recommend that developers of
    de-identification systems leverage the information EHRs contain.

    Compact Deep Convolutional Neural Networks With Coarse Pruning

    Sajid Anwar, Wonyong Sung
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    The learning capability of a neural network improves with increasing depth at
    higher computational costs. Wider layers with dense kernel connectivity
    patterns further increase this cost and may hinder real-time inference. We
    propose feature map and kernel level pruning for reducing the computational
    complexity of a deep convolutional neural network. Pruning feature maps reduces
    the width of a layer and hence does not need any sparse representation.
    Further, kernel pruning converts the dense connectivity pattern into a sparse
    one. Due to their coarse nature, these pruning granularities can be exploited
    by GPU and VLSI based implementations. We propose a simple and generic strategy
    to choose the least adversarial pruning masks for both granularities. The pruned
    networks are retrained, which compensates for the loss in accuracy. We obtain the
    best pruning ratios when we prune a network with both granularities.
    Experiments with the CIFAR-10 dataset show that more than 85% sparsity can be
    induced in the convolution layers with less than a 1% increase in the
    misclassification rate of the baseline network.
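
    A minimal NumPy sketch of feature-map pruning in this coarse-grained spirit
    (the saliency criterion here, kernel L1 norm, is one plausible choice; the
    paper's exact mask-selection strategy may differ):

    ```python
    import numpy as np

    def prune_feature_maps(W, keep_ratio=0.5):
        # W: conv weights of shape (out_maps, in_maps, kh, kw). Rank each output
        # feature map by the L1 norm of its kernels and keep the strongest ones;
        # the layer simply becomes narrower, so no sparse storage is needed.
        saliency = np.abs(W).sum(axis=(1, 2, 3))
        keep = np.sort(np.argsort(saliency)[-int(len(saliency) * keep_ratio):])
        return W[keep]

    W = np.random.randn(64, 32, 3, 3)
    W_pruned = prune_feature_maps(W)   # (32, 32, 3, 3): a narrower, still-dense layer
    # Kernel-level pruning would instead zero entire (out, in) kernels, turning
    # the dense connectivity pattern into a sparse one.
    ```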

    Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

    Keyu Lu, Jian Li, Xiangjing An, Hangen He
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Vision-based object detection is one of the fundamental functions in numerous
    traffic scene applications such as self-driving vehicle systems and advanced
    driver assistance systems (ADAS). However, it is also a challenging task due to
    the diversity of traffic scenes and the storage, power and computing resource
    limitations of the platforms for traffic scene applications. This paper
    presents a generalized Haar filter based deep network which is suitable for
    object detection tasks in traffic scenes. In this approach, we first decompose
    an object detection task into several easier local regression tasks. Then, we
    handle the local regression tasks by using several tiny deep networks which
    simultaneously output the bounding boxes, categories and confidence scores of
    detected objects. To reduce the consumption of storage and computing resources,
    the weights of the deep networks are constrained to the form of generalized
    Haar filters in the training phase. Additionally, we introduce a sparse window
    generation strategy to improve the efficiency of the algorithm. Finally, we
    perform several experiments to validate the performance of our proposed
    approach. Experimental results demonstrate that the proposed approach is both
    efficient and effective in traffic scenes compared with the state of the art.
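
    For intuition, here is a toy Haar-like kernel of the kind such weight
    constraints build on (a classic two-box filter; the paper's generalized
    parameterization is not reproduced here):

    ```python
    import numpy as np

    def haar_like_kernel(h, w, split=0.5, horizontal=True):
        # A classic two-box Haar-like filter: -1 on one region, +1 on the other.
        k = np.ones((h, w))
        if horizontal:
            k[:, : int(w * split)] = -1.0
        else:
            k[: int(h * split), :] = -1.0
        return k / np.abs(k).sum()       # normalize the response scale

    kernel = haar_like_kernel(8, 8)      # representable by two box sums and a sign
    ```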

    A Theoretical Study of the Relationship Between a Whole ELM Network and Its Subnetworks

    Enmei Tu, Guanghao Zhang, Lily Rachmawati, Eshan Rajabally, Guang-Bin Huang
    Comments: 3 figures
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    A biological neural network is constituted by numerous subnetworks and
    modules with different functionalities. For an artificial neural network, the
    relationship between a network and its subnetworks is also important and useful
    for both theoretical and algorithmic research, e.g., it can be exploited to
    develop incremental or parallel network training algorithms. In this paper we
    explore the relationship between an ELM neural
    network and its subnetworks. To the best of our knowledge, we are the first to
    prove a theorem showing that an ELM neural network can be decomposed into
    subnetworks and its optimal solution can be constructed recursively by the
    optimal solutions of these subnetworks. Based on the theorem we also present
    two algorithms to train a large ELM neural network efficiently: one is a
    parallel network training algorithm and the other is an incremental network
    training algorithm. The experimental results demonstrate the usefulness of the
    theorem and the validity of the developed algorithms.
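
    For background, standard ELM training is a closed-form least-squares fit over
    random hidden features, which is what makes a recursive construction over
    subnetworks attractive; a minimal NumPy sketch of that baseline (not the
    paper's new algorithms):

    ```python
    import numpy as np

    def train_elm(X, Y, n_hidden=100, reg=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((X.shape[1], n_hidden))   # random, never trained
        b = rng.standard_normal(n_hidden)
        H = np.tanh(X @ W + b)                            # random hidden features
        # Ridge-regularized least squares for the output weights.
        beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ Y)
        return W, b, beta

    X, Y = np.random.randn(500, 10), np.random.randn(500, 1)
    W, b, beta = train_elm(X, Y)
    prediction = np.tanh(X @ W + b) @ beta
    ```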

    Machine learning methods for accurate delineation of tumors in PET images

    Jakub Czakon, Filip Drapejkowski, Grzegorz Zurek, Piotr Giedziun, Jacek Zebrowski, Witold Dyrka
    Comments: 19th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2016) – PETSEG challenge, Athens, Greece, 21/10/2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    In oncology, Positron Emission Tomography imaging is widely used in the
    diagnosis of cancer metastases, in monitoring progress over the course of
    cancer treatment, and in planning radiotherapeutic interventions. Accurate and
    reproducible delineation of the tumor in Positron Emission Tomography scans
    remains a difficult task, despite being crucial for delivering an appropriate
    radiation dose, minimizing adverse side-effects of the therapy, and reliably
    evaluating treatment. In this work, we attempt to solve the problem of
    automated delineation of the tumor using 3D implementations of spatial
    distance weighted fuzzy c-means, a deep convolutional neural network and a
    dictionary model. The methods, in diverse ways, combine intensity and
    spatial information.
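
    Of the three approaches, fuzzy c-means is the most compact to sketch; below
    is plain FCM in NumPy (the textbook algorithm; the paper's 3D spatial
    distance weighting is omitted):

    ```python
    import numpy as np

    def fuzzy_cmeans(X, c=3, m=2.0, iters=50, seed=0):
        rng = np.random.default_rng(seed)
        U = rng.random((len(X), c))
        U /= U.sum(axis=1, keepdims=True)          # initial fuzzy memberships
        for _ in range(iters):
            Um = U ** m
            centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
            U = d ** (-2.0 / (m - 1.0))            # standard FCM membership update
            U /= U.sum(axis=1, keepdims=True)
        return centers, U

    voxels = np.random.rand(500, 1)                # stand-in PET intensities
    centers, memberships = fuzzy_cmeans(voxels)
    ```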


    Computer Vision and Pattern Recognition

    Bi-modal First Impressions Recognition using Temporally Ordered Deep Audio and Stochastic Visual Features

    Arulkumar Subramaniam, Vismay Patel, Ashish Mishra, Prashanth Balasubramanian, Anurag Mittal
    Comments: to be published in: ECCV 2016 Workshops proceedings (Apparent Personality Analysis)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a novel approach for First Impressions Recognition in terms of the
    Big Five personality traits from short videos. The Big Five personality traits
    model describes human personality using five broad categories: Extraversion,
    Agreeableness, Conscientiousness, Neuroticism and Openness. We train two
    bi-modal end-to-end deep neural network architectures using temporally ordered
    audio and novel stochastic visual features from a few frames, without
    over-fitting. We empirically show that the trained models perform
    exceptionally well, even when trained on small sub-portions of the inputs. Our
    method was evaluated in the ChaLearn LAP 2016 Apparent Personality Analysis
    (APA) competition on the ChaLearn LAP APA2016 dataset and achieved excellent
    performance.

    ConfocalGN: a minimalistic confocal image simulator

    Serge Dmitrieff, François Nédélec
    Comments: The source code is available on this https URL, and is subject to the GPL 3.0 license. The source code is written in Matlab and requires no additional toolboxes
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)

    Image analysis is of central importance in applied and fundamental science.
    Robustness is a much needed feature of image analysis tools, but is hard to
    evaluate in the absence of knowledge of the ground truth. We developed a very
    simple confocal image generator to simulate confocal microscopy images from a
    known ground truth. The software can analyze a sample image to derive noise
    parameters and use them directly in the image simulation. This provides a
    fast and realistic way to assess the quality and robustness of an image
    analysis procedure.
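
    The simulation recipe lends itself to a compact sketch: blur a known ground
    truth with a PSF, then add shot and read noise. A hedged Python stand-in
    (parameter names and values are ours; ConfocalGN itself is Matlab):

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def simulate_confocal(ground_truth, psf_sigma=(1.0, 1.0, 3.0),
                          gain=50.0, offset=100.0, read_noise=5.0, seed=0):
        rng = np.random.default_rng(seed)
        blurred = gaussian_filter(ground_truth.astype(float), psf_sigma)  # PSF blur
        photons = rng.poisson(blurred * gain)                 # shot (Poisson) noise
        return photons + offset + rng.normal(0.0, read_noise, ground_truth.shape)

    truth = np.zeros((64, 64, 16))
    truth[20:40, 20:40, 6:10] = 1.0      # a synthetic fluorescent object
    image = simulate_confocal(truth)     # run any analysis tool against `image`
    ```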

    A Detailed Rubric for Motion Segmentation

    Pia Bideau, Erik Learned-Miller
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Motion segmentation is currently an active area of research in computer
    vision. The task of comparing different methods of motion segmentation is
    complicated by the fact that researchers may use subtly different definitions
    of the problem. Questions such as “Which objects are moving?”, “What is
    background?”, and “How can we use motion of the camera to segment objects,
    whether they are static or moving?” are clearly related to each other, but lead
    to different algorithms, and imply different versions of the ground truth. This
    report has two goals. The first is to offer a precise definition of motion
    segmentation so that the intent of an algorithm is as well-defined as possible.
    The second is to report on new versions of three previously existing data sets
    that are compatible with this definition. We hope that this more detailed
    definition, and the three data sets that go with it, will allow more meaningful
    comparisons of certain motion segmentation methods.

    Joint Large-Scale Motion Estimation and Image Reconstruction

    Hendrik Dirks
    Comments: 21 pages, 1 figure
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Optimization and Control (math.OC)

    This article describes the implementation of the joint motion estimation and
    image reconstruction framework presented by Burger, Dirks and Schönlieb, and
    extends this framework to large-scale motion between consecutive image frames.
    The variational framework uses displacements between consecutive frames based
    on the optical flow approach to improve the image reconstruction quality on the
    one hand and the motion estimation quality on the other. The energy functional
    consists of a data-fidelity term with a general operator that connects the
    input sequence to the solution, it has a total variation term for the image
    sequence and is connected to the underlying flow using an optical flow term.
    Additional spatial regularity for the flow is modeled by a total variation
    regularizer for both components of the flow. The numerical minimization is
    performed in an alternating manner using primal-dual techniques. The resulting
    schemes are presented as pseudo-code together with a short numerical
    evaluation.
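
    One plausible form of the functional described above, reconstructed from the
    abstract (\(A\) is the general operator, \(f\) the input sequence, \(u\) the
    image sequence, \(v = (v_1, v_2)\) the flow, and \(\alpha, \beta, \gamma\)
    regularization weights; the authors' exact formulation may differ):

    ```latex
    E(u, v) = \tfrac{1}{2}\,\lVert A u - f \rVert_2^2
            + \alpha\,\mathrm{TV}(u)
            + \beta\,\lVert \partial_t u + \nabla u \cdot v \rVert_1
            + \gamma\,\bigl(\mathrm{TV}(v_1) + \mathrm{TV}(v_2)\bigr)
    ```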

    Robust Gait Recognition by Integrating Inertial and RGBD Sensors

    Qin Zou, Lihao Ni, Qian Wang, Qingquan Li, Song Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Gait has been considered as a promising and unique biometric for person
    identification. Traditionally, gait data are collected using either color
    sensors, such as a CCD camera, depth sensors, such as a Microsoft Kinect, or
    inertial sensors, such as an accelerometer. However, a single type of sensor
    may capture only part of the dynamic gait features and makes gait
    recognition sensitive to complex covariate conditions, leading to fragile
    gait-based person identification systems. In this paper, we propose to combine
    all three types of sensors for gait data collection and gait recognition, which
    can be used for important identification applications, such as identity
    recognition to access a restricted building or area. We propose two new
    algorithms, namely EigenGait and TrajGait, to extract gait features from the
    inertial data and the RGBD (color and depth) data, respectively. Specifically,
    EigenGait extracts general gait dynamics from the accelerometer readings in the
    eigenspace and TrajGait extracts more detailed sub-dynamics by analyzing 3D
    dense trajectories. Finally, both extracted features are fed into a supervised
    classifier for gait recognition and person identification. Experiments on 50
    subjects, with comparisons to several other state-of-the-art gait-recognition
    approaches, show that the proposed approach can achieve higher recognition
    accuracy and robustness.
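
    As a sketch of the EigenGait idea (our reading of the abstract: PCA over
    accelerometer windows, projecting onto the top principal directions of the
    "eigenspace"; window sizes are illustrative):

    ```python
    import numpy as np

    def eigen_gait_features(windows, k=8):
        # windows: (n_samples, window_len) flattened accelerometer segments.
        X = windows - windows.mean(axis=0)
        _, _, Vt = np.linalg.svd(X, full_matrices=False)   # principal directions
        return X @ Vt[:k].T                                # k-dim gait descriptors

    acc_windows = np.random.randn(200, 128)   # stand-in accelerometer windows
    features = eigen_gait_features(acc_windows)
    # `features` would then be fed to the supervised classifier, alongside the
    # TrajGait descriptors extracted from the RGBD stream.
    ```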

    A New Distance Measure for Non-Identical Data with Application to Image Classification

    Muthukaruppan Swaminathan, Pankaj Kumar Yadav, Obdulio Piloto, Tobias Sjöblom, Ian Cheong
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Distance measures are part and parcel of many computer vision algorithms. The
    underlying assumption in all existing distance measures is that feature
    elements are independent and identically distributed. However, in real-world
    settings, data generally originate from heterogeneous sources even if they do
    possess a common data-generating mechanism. Since these sources are not
    necessarily identically distributed, the assumption of identical distribution
    is inappropriate. Here, we use statistical analysis to show that feature
    elements of local image descriptors are indeed non-identically distributed. To
    test the effect of omitting the unified distribution assumption, we created a
    new distance measure called the Poisson-Binomial Radius (PBR). PBR is a
    bin-to-bin distance which accounts for the dispersion of bin-to-bin
    information. PBR’s performance was evaluated on twelve benchmark data sets
    covering six different classification and recognition applications: texture,
    material, leaf, scene, ear biometrics and category-level image classification.
    Results from these experiments demonstrate that PBR outperforms
    state-of-the-art distance measures for most of the data sets and achieves
    comparable performance on the rest, suggesting that accounting for different
    distributions in distance measures can improve performance in classification
    and recognition tasks.

    WaveNet: a deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction

    Eunhee Kan, Junhong Min, Jong Chul Ye
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Due to the potential risk of inducing cancers, the radiation dose of X-ray CT
    should be reduced for routine patient scanning. However, in low-dose X-ray CT,
    severe artifacts usually occur due to photon starvation, beam hardening, etc.,
    which decrease the reliability of diagnosis. Thus, high quality reconstruction
    from low-dose X-ray CT data has become one of the important research topics in
    the CT community. Conventional model-based denoising approaches are, however,
    computationally very expensive, and image domain denoising approaches can
    hardly deal with CT-specific noise patterns. To address these issues, we
    propose an algorithm using a deep convolutional neural network (CNN), which is
    applied to the wavelet transform coefficients of low-dose CT images.
    Specifically, by using a directional wavelet transform to extract the
    directional components of artifacts and exploiting the intra- and inter-band
    correlations, our deep network can effectively suppress CT-specific noise.
    Moreover, our CNN is designed with various types of residual learning
    architecture for faster network training and better denoising. Experimental
    results confirm that the proposed algorithm effectively removes complex noise
    patterns of CT images originating from the reduced X-ray dose. In addition, we
    show that a wavelet domain CNN is more efficient at removing noise from
    low-dose CT than an image domain CNN. Our results were rigorously evaluated by
    several radiologists and won the second place award in the 2016 AAPM Low-Dose
    CT Grand Challenge. To the best of our knowledge, this work is the first deep
    learning architecture for low-dose CT reconstruction that has been rigorously
    evaluated and proven effective.
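
    The preprocessing step is easy to picture: transform each slice into wavelet
    subbands and let the CNN operate on the coefficients. A hedged Python sketch
    (a separable DWT via PyWavelets stands in for the paper's directional
    transform):

    ```python
    import numpy as np
    import pywt

    def wavelet_bands(ct_slice):
        # One-level 2D DWT; the paper uses a *directional* transform, for which
        # this separable decomposition is only a stand-in.
        cA, (cH, cV, cD) = pywt.dwt2(ct_slice, "db2")
        # The detail subbands carry the directional artifact components the
        # denoising CNN operates on; cA keeps the low-frequency anatomy.
        return cA, np.stack([cH, cV, cD])

    noisy_slice = np.random.randn(256, 256)      # stand-in for a low-dose CT slice
    approx, details = wavelet_bands(noisy_slice)
    # After the CNN cleans `details`, pywt.idwt2((approx, tuple(details)), "db2")
    # reassembles the image.
    ```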

    Real-Time Image Distortion Correction: Analysis and Evaluation of FPGA-Compatible Algorithms

    Paolo Di Febbo, Stefano Mattoccia, Carlo Dal Mutto
    Comments: To be published in Proceedings of the International Conference on Reconfigurable Computing and FPGAs, Cancun, Mexico, 30 November - 2 December 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Image distortion correction is a critical pre-processing step for a variety
    of computer vision and image processing algorithms. Standard real-time software
    implementations are generally not suited for direct hardware porting, so
    adapted versions need to be designed in order to obtain implementations
    deployable on FPGAs. In this paper, hardware-compatible techniques for image
    distortion correction are introduced and analyzed in detail. The considered
    solutions are compared in terms of output quality by using a
    geometrical-error-based approach, with particular emphasis on robustness with
    respect to increasing lens distortion. The required amount of hardware
    resources is also estimated for each considered approach.
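
    For reference, the software baseline such hardware designs start from is the
    standard remap-based correction, e.g. in OpenCV (the intrinsics and
    distortion coefficients below are made up; real values come from calibration
    with cv2.calibrateCamera):

    ```python
    import cv2
    import numpy as np

    # Hypothetical camera intrinsics and distortion coefficients.
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
    dist = np.array([-0.3, 0.1, 0.0, 0.0, 0.0])   # k1, k2, p1, p2, k3

    img = cv2.imread("frame.png")
    h, w = img.shape[:2]

    # Precompute the undistortion lookup tables once; the per-pixel remap is
    # the part an FPGA pipeline implements in hardware.
    map1, map2 = cv2.initUndistortRectifyMap(K, dist, None, K, (w, h), cv2.CV_32FC1)
    corrected = cv2.remap(img, map1, map2, cv2.INTER_LINEAR)
    ```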

    Visual Tracking via Boolean Map Representations

    Kaihua Zhang, Qingshan Liu, Ming-Hsuan Yang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we present a simple yet effective Boolean map based
    representation that exploits connectivity cues for visual tracking. We describe
    a target object with histograms of oriented gradients and raw color features,
    each of which is characterized by a set of Boolean maps generated by uniformly
    thresholding their values. The Boolean maps effectively encode multi-scale
    connectivity cues of the target with different granularities. The fine-grained
    Boolean maps capture structural details that are effective for precise target
    localization, while the coarse-grained ones encode global shape information
    that is robust to large target appearance variations. Finally, all the Boolean
    maps together form a robust representation that can be approximated by an
    explicit feature map of the intersection kernel, which is fed into a logistic
    regression classifier with online update, and the target location is estimated
    within a particle filter framework. The proposed representation scheme is
    computationally efficient and achieves favorable accuracy and robustness
    compared with state-of-the-art tracking methods on a large benchmark dataset
    of 50 image sequences.
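
    The map-generation step is simple enough to sketch in NumPy (uniform
    thresholding of one feature channel; the channel choice and threshold count
    are illustrative):

    ```python
    import numpy as np

    def boolean_maps(feature, n_thresholds=8):
        # Uniformly spaced interior thresholds over the channel's value range.
        levels = np.linspace(feature.min(), feature.max(), n_thresholds + 2)[1:-1]
        return np.stack([feature > t for t in levels])

    channel = np.random.rand(32, 32)     # stand-in for one HOG or raw-color channel
    maps = boolean_maps(channel)         # (8, 32, 32) stack of Boolean maps
    ```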

    Accurate Deep Representation Quantization with Gradient Snapping Layer for Similarity Search

    Shicong Liu, Hongtao Lu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recent advances in large scale similarity search involve using deeply learned
    representations to improve search accuracy and vector quantization methods to
    increase search speed. However, how to learn deep representations that
    strongly preserve similarities between data pairs and can be accurately
    quantized via vector quantization remains a challenging task. Existing methods
    simply combine a quantization loss and a similarity loss, which results in
    unexpectedly biased back-propagated gradients and hurts search performance. To
    this end, we propose a novel gradient snapping layer (GSL) to directly
    regularize the back-propagated gradient towards a neighboring codeword; the
    generated gradients remain unbiased for reducing the similarity loss and also
    propel the learned representations toward accurate quantization. Joint deep
    representation and vector quantization learning can be easily performed by
    alternately optimizing the quantization codebook and the deep neural network.
    The proposed framework is compatible with various existing vector quantization
    approaches. Experimental results demonstrate that the proposed framework is
    effective, flexible and outperforms state-of-the-art large scale similarity
    search methods.

    Compressed Learning: A Deep Neural Network Approach

    Amir Adler, Michael Elad, Michael Zibulevsky
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Compressed Learning (CL) is a joint signal processing and machine learning
    framework for inference from a signal, using a small number of measurements
    obtained by linear projections of the signal. In this paper we present an
    end-to-end deep learning approach for CL, in which a network composed of
    fully-connected layers followed by convolutional layers performs the linear
    sensing and non-linear inference stages. During the training phase, the sensing
    matrix and the non-linear inference operator are jointly optimized, and the
    proposed approach outperforms the state of the art for the task of image
    classification. For example, at a sensing rate of 1% (only 8 measurements of
    28×28-pixel images), the classification error on the MNIST handwritten digits
    dataset is 6.46%, compared to 41.06% for the state of the art.
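
    A minimal PyTorch sketch of the end-to-end pipeline (the 784 -> 8 sensing
    layer matches the quoted 1% rate, since 0.01 x 784 is roughly 8; our
    inference stage is a plain MLP, whereas the paper's is convolutional):

    ```python
    import torch
    import torch.nn as nn

    class CompressedLearner(nn.Module):
        def __init__(self, n_measurements=8):
            super().__init__()
            # Learned linear sensing: 784 pixels -> 8 measurements (the 1% rate).
            self.sensing = nn.Linear(28 * 28, n_measurements, bias=False)
            # Non-linear inference stage; an MLP stands in for the paper's
            # convolutional layers.
            self.inference = nn.Sequential(
                nn.Linear(n_measurements, 28 * 28), nn.ReLU(),
                nn.Linear(28 * 28, 10),
            )

        def forward(self, x):
            return self.inference(self.sensing(x.flatten(1)))

    model = CompressedLearner()
    logits = model(torch.randn(16, 1, 28, 28))   # sensing + inference train jointly
    ```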

    Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

    Keyu Lu, Jian Li, Xiangjing An, Hangen He
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Vision-based object detection is one of the fundamental functions in numerous
    traffic scene applications such as self-driving vehicle systems and advanced
    driver assistance systems (ADAS). However, it is also a challenging task due to
    the diversity of traffic scenes and the storage, power and computing resource
    limitations of the platforms for traffic scene applications. This paper
    presents a generalized Haar filter based deep network which is suitable for
    object detection tasks in traffic scenes. In this approach, we first decompose
    an object detection task into several easier local regression tasks. Then, we
    handle the local regression tasks by using several tiny deep networks which
    simultaneously output the bounding boxes, categories and confidence scores of
    detected objects. To reduce the consumption of storage and computing resources,
    the weights of the deep networks are constrained to the form of generalized
    Haar filters in the training phase. Additionally, we introduce a sparse window
    generation strategy to improve the efficiency of the algorithm. Finally, we
    perform several experiments to validate the performance of our proposed
    approach. Experimental results demonstrate that the proposed approach is both
    efficient and effective in traffic scenes compared with the state of the art.

    A Scalable and Robust Framework for Intelligent Real-time Video Surveillance

    Shreenath Dutt, Ankita Kalra
    Comments: 4 pages, 3 figures, Presented in International Conference on Advances in Computing, Communications and Informatics (ICACCI-2016), September 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper, we present an intelligent, reliable and storage-efficient
    video surveillance system using Apache Storm and OpenCV. As a Storm topology,
    we have added multiple information extraction modules that only write important
    content to the disk. Our topology is extensible, capable of adding novel
    algorithms as per the use case without affecting the existing ones, since all
    the processing stages are independent of each other. This framework is also
    highly scalable and fault tolerant, which makes it an excellent option for organisations
    that need to monitor a large network of surveillance cameras.

    Diversity Promoting Online Sampling for Streaming Video Summarization

    Rushil Anirudh, Ahnaf Masroor, Pavan Turaga
    Comments: Published at ICIP 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Many applications benefit from sampling algorithms where a small number of
    well-chosen samples are used to capture different properties of a large
    dataset. In this paper, we use diverse sampling for streaming video
    summarization. Several emerging applications support streaming video, but
    existing summarization algorithms need access to the entire video, which
    requires a lot of memory and computational power. We propose a
    memory-efficient, computationally fast, online algorithm that uses competitive
    learning for diverse sampling. Our algorithm is a generalization of online
    K-means such that the cost function reduces clustering error, while also
    ensuring a diverse set of samples. The diversity is measured as the volume of
    a convex hull around the samples. Finally, the performance of the proposed
    algorithm is measured against human users for 50 videos in the VSUMM dataset.
    The algorithm performs better than batch mode summarization, while having
    significantly lower memory and computational requirements.
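
    The competitive-learning core is ordinary online K-means, sketched below in
    NumPy; the paper's additional convex-hull diversity term is omitted here:

    ```python
    import numpy as np

    def online_summary(frames, k=10, lr=0.05):
        # Seed the centers with the first k streamed frames.
        centers = frames[:k].copy()
        for x in frames[k:]:
            winner = np.argmin(((centers - x) ** 2).sum(axis=1))
            centers[winner] += lr * (x - centers[winner])   # competitive update
        return centers                                      # candidate key frames

    features = np.random.randn(1000, 64)   # stand-in streaming frame descriptors
    summary = online_summary(features)
    ```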

    FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras

    Lan Xu, Lu Fang, Wei Cheng, Kaiwen Guo, Guyue Zhou, Qionghai Dai, Yebin Liu
    Comments: 13 pages, 17 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Robotics (cs.RO)

    Aiming at automatic, convenient and non-intrusive motion capture, this paper
    presents a new-generation markerless motion capture technique, the FlyCap
    system, which captures surface motions of moving characters using multiple
    autonomous flying cameras (autonomous unmanned aerial vehicles (UAVs), each
    integrated with an RGBD video camera). During data capture, three cooperative
    flying cameras automatically track and follow the moving target, who performs
    large-scale motions in a wide space. We propose a novel non-rigid surface
    registration method to track and fuse the depth from the three flying cameras
    for surface motion tracking of the moving target, and simultaneously calculate
    the pose of each flying camera. We leverage the visual-odometry information
    provided by the UAV platform and formulate the surface tracking problem as a
    non-linear objective function that can be linearized and effectively minimized
    through a quasi-Newton method. Quantitative and qualitative experimental
    results demonstrate competent and plausible surface and motion reconstruction
    results.

    Multi-Camera Occlusion and Sudden-Appearance-Change Detection Using Hidden Markovian Chains

    Xudong Ma
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper was originally submitted to Xinova as a response to a Request for
    Invention (RFI) on new event monitoring methods. In this paper, a new object
    tracking algorithm using multiple cameras for surveillance applications is
    proposed. The proposed system can detect sudden-appearance-changes and
    occlusions using a hidden Markovian statistical model. The experimental results
    confirm that our system detects sudden appearance changes and occlusions
    reliably.

    A MAP-MRF filter for phase-sensitive coil combination in autocalibrating partially parallel susceptibility weighted MRI

    Sreekanth Madhusoodhanan, Joseph Suresh Paul
    Comments: Submitted to IEEE TMI. The rebuttal is added at the end of the document; comments from other researchers are welcome
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A statistical approach for combination of channel phases is developed for
    optimizing the Contrast-to-Noise Ratio (CNR) in Susceptibility Weighted Images
    (SWI) acquired using autocalibrating partially parallel techniques. The
    unwrapped phase images of each coil are filtered using local random field based
    probabilistic weights, derived using energy functions representative of noisy
    sensitivity and tissue information pertaining to venous structure in the
    individual channel phase images. The channel energy functions are obtained as
    functions of local image intensities, first or second order clique phase
    difference and a threshold scaling parameter dependent on the input noise
    level. Whereas the expectation of the individual energy functions with respect
    to the noise distribution in clique phase differences is to be maximized for
    optimal filtering, the expectation of the tissue energy function decreases and
    that of the noise energy function increases as the threshold scale parameter
    increases. The optimum scaling parameter is shown to occur at the point where
    the expectations of both energy functions contribute to the largest possible
    extent. It is shown that implementing the filter along the same lines as the
    Iterated Conditional Modes (ICM) algorithm provides structural enhancement in
    the coil-combined phase, with reduced noise amplification. Application to
    simulated and in vivo multi-channel SWI shows that the CNR of the combined
    phase obtained using the MAP-MRF filter is higher than that of coil
    combination using a weighted average.

    Machine learning methods for accurate delineation of tumors in PET images

    Jakub Czakon, Filip Drapejkowski, Grzegorz Zurek, Piotr Giedziun, Jacek Zebrowski, Witold Dyrka
    Comments: 19th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2016) – PETSEG challenge, Athens, Greece, 21/10/2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    In oncology, Positron Emission Tomography imaging is widely used in the
    diagnosis of cancer metastases, in monitoring progress over the course of
    cancer treatment, and in planning radiotherapeutic interventions. Accurate and
    reproducible delineation of the tumor in Positron Emission Tomography scans
    remains a difficult task, despite being crucial for delivering an appropriate
    radiation dose, minimizing adverse side-effects of the therapy, and reliably
    evaluating treatment. In this work, we attempt to solve the problem of
    automated delineation of the tumor using 3D implementations of spatial
    distance weighted fuzzy c-means, a deep convolutional neural network and a
    dictionary model. The methods, in diverse ways, combine intensity and
    spatial information.

    Selective De-noising of Sparse-Coloured Images

    Arjun Chaudhuri
    Comments: 4 pages, 5 figures, International Journal of Computer Science and Information Technologies, ISSN: 0975-9646, March-April, 2016, Website: this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Since time immemorial, noise has been a constant source of disturbance to the
    various entities known to mankind. Noise models of different kinds have been
    developed to study noise in more detail over the years. Image processing, in
    particular, has implemented several algorithms to reduce noise in photographs
    and pictorial documents. Images with sparse colours (a smaller number of
    distinct colours in them) are common nowadays, especially in astronomy and
    astrophysics, where black and white form the main components. Additive noise
    of the Gaussian type is the most common form of noise studied and analysed in
    the majority of communication channels, such as satellite links and the
    channel from a mobile base station to a local cellular tower. Most of the
    time, images from astronomical sources are heavily distorted by noise as they
    travel long distances from telescopes in outer space to Earth. Considering
    Additive White Gaussian Noise (AWGN) to be the common noise in these
    long-distance channels, this paper provides an insight and an algorithmic
    approach to pixel-specific de-noising of sparse-coloured images affected by
    AWGN. The paper concludes with some promising future avenues and applications
    of this de-noising method in industry and academia.

    Learning Adaptive Parameter Tuning for Image Processing

    Jingming Dong, Iuri Frosio, Jan Kautz
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The non-stationary nature of image characteristics calls for adaptive
    processing, based on the local image content. We propose a simple and flexible
    method to learn local tuning of parameters in adaptive image processing: we
    extract simple local features from an image and learn the relation between
    these features and the optimal filtering parameters. Learning is performed by
    optimizing a user-defined cost function (any image quality metric) on a
    training set. We apply our method to three classical problems (denoising,
    demosaicing and deblurring) and we show the effectiveness of the learned
    parameter modulation strategies. We also show that these strategies are
    consistent with theoretical results from the literature.

    Detecting Breast Cancer using a Compressive Sensing Unmixing Algorithm

    Richard Obermeier, Jose Angel Martinez-Lorenzo
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Optimization and Control (math.OC)

    Traditional breast cancer imaging methods using microwave Nearfield Radar
    Imaging (NRI) seek to recover the complex permittivity of the tissues at each
    voxel in the imaging region. This approach is suboptimal, in that it does not
    directly consider the permittivity values that healthy and cancerous breast
    tissues typically have. In this paper, we describe a novel unmixing algorithm
    for detecting breast cancer. In this approach, the breast tissue is separated
    into three components, low water content (LWC), high water content (HWC), and
    cancerous tissues, and the goal of the optimization procedure is to recover the
    mixture proportions for each component. By utilizing this approach in a hybrid
    DBT / NRI system, the unmixing reconstruction process can be posed as a sparse
    recovery problem, such that compressive sensing (CS) techniques can be
    employed. A numerical analysis is performed, which demonstrates that cancerous
    lesions can be detected from their mixture proportion under the appropriate
    conditions.

    Discovering containment: from infants to machines

    Shimon Ullman, Nimrod Dorfman, Daniel Harari
    Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Current artificial learning systems can recognize thousands of visual
    categories, or play Go at a champion's level, but cannot explain infant
    learning, in particular the ability to learn complex concepts without guidance,
    in a specific order. A notable example is the category of ‘containers’ and the
    notion of containment, one of the earliest spatial relations to be learned,
    starting already at 2.5 months, and preceding other common relations (e.g.,
    support). Such spontaneous unsupervised learning stands in contrast with
    current highly successful computational models, which learn in a supervised
    manner, that is, by using large data sets of labeled examples. How can
    meaningful concepts be learned without guidance, and what determines the
    trajectory of infant learning, making some notions appear consistently earlier
    than others?

    Conditional Image Synthesis With Auxiliary Classifier GANs

    Augustus Odena, Christopher Olah, Jonathon Shlens
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV)

    Synthesizing high resolution photorealistic images has been a long-standing
    challenge in machine learning. In this paper we introduce new methods for the
    improved training of generative adversarial networks (GANs) for image
    synthesis. We construct a variant of GANs employing label conditioning that
    results in 128×128 resolution image samples exhibiting global coherence. We
    expand on previous work for image quality assessment to provide two new
    analyses for assessing the discriminability and diversity of samples from
    class-conditional image synthesis models. These analyses demonstrate that high
    resolution samples provide class information not present in low resolution
    samples. Across 1000 ImageNet classes, 128×128 samples are more than twice as
    discriminable as artificially resized 32×32 samples. In addition, 84.7% of the
    classes have samples exhibiting diversity comparable to real ImageNet data.
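
    The label-conditioning scheme pairs a real/fake source head with an auxiliary
    class head on the discriminator; a hedged PyTorch sketch of that two-head
    loss (the network bodies are stubs, not the paper's architecture):

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Discriminator(nn.Module):
        """Two heads: a source head (real vs. fake) plus the auxiliary class
        head that provides the label-conditioning signal."""
        def __init__(self, n_classes=1000):
            super().__init__()
            # Stub feature extractor; the paper uses a conv net on 128x128 images.
            self.features = nn.Sequential(nn.Linear(128 * 128 * 3, 256), nn.ReLU())
            self.source = nn.Linear(256, 1)
            self.classifier = nn.Linear(256, n_classes)

        def forward(self, x):
            h = self.features(x.flatten(1))
            return self.source(h), self.classifier(h)

    def discriminator_loss(d, real, real_labels, fake, fake_labels):
        s_real, c_real = d(real)
        s_fake, c_fake = d(fake)
        # Source term: tell real samples from generated ones.
        l_source = (F.binary_cross_entropy_with_logits(s_real, torch.ones_like(s_real))
                    + F.binary_cross_entropy_with_logits(s_fake, torch.zeros_like(s_fake)))
        # Class term: recover the conditioning label for both sample sets.
        l_class = F.cross_entropy(c_real, real_labels) + F.cross_entropy(c_fake, fake_labels)
        return l_source + l_class
    ```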


    Artificial Intelligence

    Ontology Verbalization using Semantic-Refinement

    Vinu E.V, P Sreenivasa Kumar
    Comments: Currently this paper is under review in the Semantic Web Journal. arXiv admin note: text overlap with arXiv:1607.07027
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    We propose a rule-based technique to generate redundancy-free NL descriptions
    of OWL entities. The existing approaches which address the problem of
    verbalizing OWL ontologies generate NL text segments which are close to their
    counterpart OWL statements. Some of these approaches also perform grouping and
    aggregation of these NL text segments to generate a more fluent and
    comprehensive form of the content. Restricting our attention to descriptions
    of individuals and concepts, we find that the approach currently followed in
    the available tools is to determine the set of all logical conditions that are
    satisfied by the given individual/concept name and translate these conditions
    verbatim into corresponding NL descriptions. Human-understandability of such
    descriptions is affected by the presence of repetitions and redundancies, as
    they have high fidelity to their OWL representation. In the literature, no
    efforts had been made to remove redundancies and repetitions at the logical
    level before generating the NL descriptions of entities, and we find this to
    be the main reason for the lack of readability of the generated text. Herein,
    we propose a technique called semantic-refinement (SR) to generate meaningful
    and easily understandable descriptions of individuals and concepts of a given
    OWL ontology. We identify the combinations of OWL/DL constructs that lead to
    repetitive/redundant descriptions and propose a series of refinement rules to
    rewrite the conditions that are satisfied by an individual/concept in a
    meaning-preserving manner. The reduced set of conditions is then employed for
    generating NL descriptions. Our experiments show that SR leads to
    significantly improved descriptions of ontology entities. We also test the
    effectiveness and usefulness of the generated descriptions for the purpose of
    validating the ontologies and find that the proposed technique is indeed
    helpful in this context.

    Inference Compilation and Universal Probabilistic Programming

    Tuan Anh Le, Atilim Gunes Baydin, Frank Wood
    Comments: 10 pages, 6 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We introduce a method for using deep neural networks to amortize the cost of
    inference in models from the family induced by universal probabilistic
    programming languages, establishing a framework that combines the strengths of
    probabilistic programming and deep learning methods. We call what we do
    “compilation of inference” because our method transforms a denotational
    specification of an inference problem in the form of a probabilistic program
    written in a universal programming language into a trained neural network
    denoted in a neural network specification language. When at test time this
    neural network is fed observational data and executed, it performs approximate
    inference in the original model specified by the probabilistic program. Our
    training objective and learning procedure are designed to allow the trained
    neural network to be used as a proposal distribution in a sequential importance
    sampling inference engine. We illustrate our method on mixture models and
    Captcha solving and show significant speedups in the efficiency of inference.
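
    The role of the compiled network is easiest to see in an importance-sampling
    skeleton: it fills the `propose` slot below (all densities here are
    caller-supplied stand-ins, and the paper's engine is sequential rather than
    this single-step version):

    ```python
    import numpy as np

    def importance_sample(observations, propose, prior_logpdf, lik_logpdf, n=1000):
        """`propose(obs)` returns a latent sample and its log-density under the
        proposal; the trained neural network would fill exactly this slot."""
        samples, log_w = [], []
        for _ in range(n):
            z, q_logp = propose(observations)
            log_w.append(prior_logpdf(z) + lik_logpdf(observations, z) - q_logp)
            samples.append(z)
        log_w = np.asarray(log_w)
        w = np.exp(log_w - log_w.max())      # stabilized importance weights
        return np.asarray(samples), w / w.sum()
    ```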

    A Survey of Brain Inspired Technologies for Engineering

    Jarryd Son, Amit Kumar Mishra
    Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Cognitive engineering is a multi-disciplinary field and hence it is difficult
    to find a review article consolidating the leading developments in the field.
    The incredible pace at which technology is advancing pushes the boundaries of
    what is achievable in cognitive engineering. There are also differing
    approaches to cognitive engineering brought about from the multi-disciplinary
    nature of the field and the vastness of possible applications. Thus research
    communities require more frequent reviews to keep up to date with the latest
    trends. In this paper we shall discuss some of the approaches to cognitive
    engineering holistically to clarify the reasoning behind the different
    approaches and to highlight their strengths and weaknesses. We shall then show
    how developments from seemingly disjointed views could be integrated to achieve
    the same goal of creating cognitive machines. By reviewing the major
    contributions in the different fields and showing the potential for a combined
    approach, this work intends to assist the research community in devising more
    unified methods and techniques for developing cognitive machines.

    Probabilistic Model Checking for Complex Cognitive Tasks — A case study in human-robot interaction

    Sebastian Junges, Nils Jansen, Joost-Pieter Katoen, Ufuk Topcu
    Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO)

    This paper proposes to use probabilistic model checking to synthesize optimal
    robot policies in multi-tasking autonomous systems that are subject to
    human-robot interaction. Given the convincing empirical evidence that human
    behavior can be related to reinforcement models, we take as input a
    well-studied Q-table model of the human behavior for flexible scenarios. We
    first describe an automated procedure to distill a Markov decision process
    (MDP) for the human in an arbitrary but fixed scenario. The distinctive issue
    is that — in contrast to existing models — under-specification of the human
    behavior is included. Probabilistic model checking is used to predict the
    human’s behavior. Finally, the MDP model is extended with a robot model.
    Optimal robot policies are synthesized by analyzing the resulting two-player
    stochastic game. Experimental results with a prototypical implementation using
    PRISM show promising results.

    Selfishness as the most probable selection for higher efficiency

    Rasoul Kheiri
    Comments: 17 pages, 11 figures
    Subjects: Artificial Intelligence (cs.AI)

    We develop a two-defender (Alice and Bob) invasion game using the method of
    projective simulation. The agent, say Alice, encounters 'attack' symbols coming
    from the right-hand attacker, which she can learn to prevent. However, some of
    these signs are invisible to her. Instead, she perceives some other signs that
    are related to Bob's task. We elaborate an example in which an agent perceives
    an equal portion of percepts from both attackers. Alice can choose to
    concentrate on her job, though she then misses some attacks. Alternatively,
    she can collaborate with Bob to give and receive help. It is concluded that
    the maximum blocking efficiency under concentration is just the minimum
    blocking efficiency under collaboration. Furthermore, Alice can select two
    different forgetting factors, one for her own task and one for helping; she
    can therefore choose between herself and the other. As the main result, we
    conclude that if Alice selects to be selfish, she likely earns more blocking
    in her own task and also higher efficiency in collective blocking, regardless
    of Bob's selection. In addition, it turns out that when both partners select
    selfishness, individual efficiency is shared most fairly and collective
    blocking efficiency is also maximized. Finally, we propose some further
    questions that can be pursued following the present study.

    From Node Embedding To Community Embedding

    Vincent W. Zheng, Sandro Cavallari, Hongyun Cai, Kevin Chen-Chuan Chang, Erik Cambria
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI)

    Most of the existing graph embedding methods focus on nodes, which aim to
    output a vector representation for each node in the graph such that two nodes
    being “close” on the graph are also close in the low-dimensional space. Despite
    the success of embedding individual nodes for graph analytics, we notice that
    an important concept of embedding communities (i.e., groups of nodes) is
    missing. Embedding communities is useful, not only for supporting various
    community-level applications, but also to help preserve community structure in
    graph embedding. In fact, we see community embedding as providing a
    higher-order proximity to define the node closeness, whereas most of the
    popular graph embedding methods focus on first-order and/or second-order
    proximities. To learn the community embedding, we hinge upon the insight that
    community embedding and node embedding reinforce each other. As a result,
    we propose ComEmbed, the first community embedding method, which jointly
    optimizes the community embedding and node embedding together. We evaluate
    ComEmbed on real-world data sets. We show it outperforms the state-of-the-art
    baselines in both tasks of node classification and community prediction.

    Mining Social Media for Open Innovation in Transportation Systems

    Daniela Ulloa, Pedro Saleiro, Rosaldo J. F. Rossetti, Elis Regina Silva
    Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)

    This work proposes a novel framework for the development of new products and
    services in transportation through an open innovation approach based on
    automatic content analysis of social media data. The framework is able to
    extract users' comments from Online Social Networks (OSNs), to process and
    analyze text through information extraction and sentiment analysis techniques
    to obtain relevant information about product reception on the market. A use
    case was developed using the mobile application Uber, which is today one of the
    fastest growing technology companies in the world. We measured how a
    controversial, highly diffused event influences the volume of tweets about Uber
    and the perception of its users. While there was no change in the image of
    Uber, a large increase in the number of tweets mentioning the company was
    observed, which amounted to free and significant diffusion of its product.

    Chinese Poetry Generation with Planning based Neural Network

    Zhe Wang, Wei He, Hua Wu, Haiyang Wu, Wei Li, Haifeng Wang, Enhong Chen
    Comments: Accepted at COLING 2016
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Chinese poetry generation is a very challenging task in natural language
    processing. In this paper, we propose a novel two-stage poetry generating
    method which first plans the sub-topics of the poem according to the user’s
    writing intent, and then generates each line of the poem sequentially, using a
    modified recurrent neural network encoder-decoder framework. The proposed
    planning-based method can ensure that the generated poem is coherent and
    semantically consistent with the user’s intent. A comprehensive evaluation with
    human judgments demonstrates that our proposed approach outperforms the
    state-of-the-art poetry generating methods and that the poem quality is
    somewhat comparable to that of human poets.

    Edward: A library for probabilistic modeling, inference, and criticism

    Dustin Tran, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, David M. Blei
    Subjects: Computation (stat.CO); Artificial Intelligence (cs.AI); Programming Languages (cs.PL); Applications (stat.AP); Machine Learning (stat.ML)

    Probabilistic modeling is a powerful approach for analyzing empirical
    information. We describe Edward, a library for probabilistic modeling. Edward’s
    design reflects an iterative process pioneered by George Box: build a model of
    a phenomenon, make inferences about the model given data, and criticize the
    model’s fit to the data. Edward supports a broad class of probabilistic models,
    efficient algorithms for inference, and many techniques for model criticism.
    The library builds on top of TensorFlow to support distributed training and
    hardware such as GPUs. Edward enables the development of complex probabilistic
    models and their algorithms at a massive scale.
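
    A small example of Edward's build/infer/criticize loop, following its
    standard getting-started usage (Edward 1.x on TensorFlow 1.x; the data here
    is synthetic):

    ```python
    import numpy as np
    import tensorflow as tf
    import edward as ed
    from edward.models import Normal

    # Synthetic regression data.
    X_train = np.random.randn(50, 3).astype(np.float32)
    true_w = np.array([1.0, -2.0, 0.5], dtype=np.float32)
    y_train = (X_train @ true_w + 0.1 * np.random.randn(50)).astype(np.float32)

    # Build: Bayesian linear regression.
    X = tf.placeholder(tf.float32, [50, 3])
    w = Normal(loc=tf.zeros(3), scale=tf.ones(3))
    b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
    y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(50))

    # Infer: fit a variational approximation with KLqp.
    qw = Normal(loc=tf.Variable(tf.zeros(3)),
                scale=tf.nn.softplus(tf.Variable(tf.zeros(3))))
    qb = Normal(loc=tf.Variable(tf.zeros(1)),
                scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))
    inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
    inference.run(n_iter=500)

    # Criticize: posterior-predictive check of the model's fit.
    y_post = ed.copy(y, {w: qw, b: qb})
    print(ed.evaluate('mean_squared_error', data={X: X_train, y_post: y_train}))
    ```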

    DPPred: An Effective Prediction Framework with Concise Discriminative Patterns

    Jingbo Shang, Meng Jiang, Wenzhu Tong, Jinfeng Xiao, Jian Peng, Jiawei Han
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    In the literature, two series of models have been proposed to address
    prediction problems including classification and regression. Simple models,
    such as generalized linear models, have ordinary performance but strong
    interpretability on a set of simple features. The other series, including
    tree-based models, organize numerical, categorical and high dimensional
    features into a comprehensive structure with rich interpretable information in
    the data.

    In this paper, we propose a novel Discriminative Pattern-based Prediction
    framework (DPPred) to accomplish prediction tasks by combining the
    advantages of both: effectiveness and interpretability. Specifically, DPPred
    adopts the concise discriminative patterns that are on the prefix paths from
    the root to leaf nodes in the tree-based models. DPPred selects a limited
    number of the useful discriminative patterns by searching for the most
    effective pattern combination to fit generalized linear models. Extensive
    experiments show that in many scenarios, DPPred provides competitive accuracy
    with the state-of-the-art as well as the valuable interpretability for
    developers and experts. In particular, taking a clinical application dataset as
    a case study, our DPPred outperforms the baselines by using only 40 concise
    discriminative patterns out of a potentially exponentially large set of
    patterns.
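
    As a rough illustration of the pattern-plus-GLM idea (a scikit-learn sketch,
    not the authors' implementation; treating each leaf's root-to-leaf path as one
    binary pattern feature is a simplification of the paper's pattern selection):

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.preprocessing import OneHotEncoder

        X, y = load_breast_cancer(return_X_y=True)
        forest = RandomForestClassifier(n_estimators=10, max_depth=3,
                                        random_state=0).fit(X, y)

        # The leaf a sample reaches encodes one root-to-leaf prefix path, so
        # one-hot leaf indicators stand in for discriminative pattern features.
        patterns = OneHotEncoder().fit_transform(forest.apply(X))

        # A sparse GLM on the patterns keeps only a small, interpretable subset.
        glm = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
        glm.fit(patterns, y)
        print("patterns kept:", np.count_nonzero(glm.coef_))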

    Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

    Keyu Lu, Jian Li, Xiangjing An, Hangen He
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Vision-based object detection is one of the fundamental functions in numerous
    traffic scene applications such as self-driving vehicle systems and advanced
    driver assistance systems (ADAS). However, it is also a challenging task due
    to the diversity of traffic scenes and the storage, power and computing
    resource limitations of the platforms for traffic scene applications. This
    paper presents a generalized Haar filter based deep network which is suitable
    for object detection tasks in traffic scenes. In this approach, we first
    decompose an object detection task into several easier local regression
    tasks. Then, we handle the local regression tasks by using several tiny deep
    networks which simultaneously output the bounding boxes, categories and
    confidence scores of detected objects. To reduce the consumption of storage
    and computing resources, the weights of the deep networks are constrained to
    the form of generalized Haar filters in the training phase. Additionally, we
    introduce a sparse-window generation strategy to improve the efficiency of
    the algorithm. Finally, we perform several experiments to validate the
    performance of the proposed approach. Experimental results demonstrate that
    the proposed approach is both efficient and effective in traffic scenes
    compared with the state-of-the-art.

    Beyond Exchangeability: The Chinese Voting Process

    Moontae Lee, Seok Hyun Jin, David Mimno
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG)

    Many online communities present user-contributed responses such as reviews of
    products and answers to questions. User-provided helpfulness votes can
    highlight the most useful responses, but voting is a social process that can
    gain momentum based on the popularity of responses and the polarity of existing
    votes. We propose the Chinese Voting Process (CVP) which models the evolution
    of helpfulness votes as a self-reinforcing process dependent on position and
    presentation biases. We evaluate this model on Amazon product reviews and more
    than 80 StackExchange forums, measuring the intrinsic quality of individual
    responses and behavioral coefficients of different communities.
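
    The self-reinforcing dynamic can be illustrated with a small rich-get-richer
    simulation (all constants and the exact form of the position bias below are
    assumptions for illustration, not the paper's model):

        import numpy as np

        rng = np.random.default_rng(0)
        quality = rng.uniform(0.2, 1.0, size=5)     # latent quality per response
        votes = np.ones(5)                          # pseudo-count prior

        for _ in range(1000):
            order = np.argsort(-votes)              # higher-voted shown first
            bias = np.empty(5)
            bias[order] = 1.0 / (1 + np.arange(5))  # top positions seen more
            p = quality * votes * bias              # popularity x presentation
            p /= p.sum()
            votes[rng.choice(5, p=p)] += 1

        print(np.round(votes / votes.sum(), 2))     # vote shares concentrate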


    Information Retrieval

    Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

    Leonid Boytsov, David Novak, Yury Malkov, Eric Nyberg
    Subjects: Information Retrieval (cs.IR)

    Retrieval pipelines commonly rely on a term-based search to obtain candidate
    records, which are subsequently re-ranked. Some candidates are missed by this
    approach, e.g., due to a vocabulary mismatch. We address this issue by
    replacing the term-based search with a generic k-NN retrieval algorithm, where
    a similarity function can take into account subtle term associations. While an
    exact brute-force k-NN search using this similarity function is slow, we
    demonstrate that an approximate algorithm can be nearly two orders of magnitude
    faster at the expense of only a small loss in accuracy. A retrieval pipeline
    using an approximate k-NN search can be more effective and efficient than the
    term-based pipeline. This opens up new possibilities for designing effective
    retrieval pipelines. Our software (including data-generating code) and
    derivative data based on the Stack Overflow collection are available online.
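
    The proposed swap can be sketched in a few lines: index candidate records as
    vectors and retrieve by nearest-neighbor search instead of term matching. The
    snippet below uses TF-IDF vectors and brute-force cosine k-NN as stand-ins
    for the richer similarity function and the approximate index used in the
    paper:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.neighbors import NearestNeighbors

        docs = ["how to parse json in python",
                "deserialize a json string",
                "sort a list of tuples"]
        query = ["read json data"]

        vec = TfidfVectorizer().fit(docs)
        index = NearestNeighbors(n_neighbors=2, metric="cosine")
        index.fit(vec.transform(docs))
        dist, idx = index.kneighbors(vec.transform(query))
        print(idx[0])   # candidate records handed to the re-ranker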

    Numerical Range Facets Partition: Evaluation Metric and Methods

    Xueqing Liu, Chengxiang Zhai, Wei Han, Onur Gungor
    Comments: Under review for WWW’17
    Subjects: Information Retrieval (cs.IR); Human-Computer Interaction (cs.HC); Learning (cs.LG)

    Faceted browsing is a very useful interface component provided in many of
    today’s search engines. It is especially useful when the user has an
    exploratory information need or a clear preference for certain attribute
    values. Existing work has tried to optimize faceted browsing systems in many
    aspects, but little work has been done on optimizing the partitions of
    numerical facet ranges (e.g., price ranges of products). In this paper, we
    introduce the new problem of numerical facet range partition and formally frame
    it as an optimization problem. To enable quantitative evaluation and reuse of
    search log data, we propose an evaluation metric based on user’s browsing cost
    when using the suggested ranges for navigation. We further propose and study
    multiple methods to computationally optimize the partition by leveraging search
    logs. Experimental results on a two-month search log from a major e-Commerce
    engine show that learning can significantly improve performance over the
    baseline.
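
    To make the optimization view concrete, the sketch below partitions sorted
    prices into k contiguous ranges by interval dynamic programming; the
    per-bucket cost function here is a made-up stand-in for the paper's
    browsing-cost metric:

        import math

        prices = sorted([3, 4, 5, 9, 10, 11, 30, 35, 90])
        k = 3

        def cost(bucket):                  # assumed stand-in browsing cost
            return (max(bucket) - min(bucket)) * len(bucket)

        n = len(prices)
        dp = [[math.inf] * (k + 1) for _ in range(n + 1)]
        cut = [[0] * (k + 1) for _ in range(n + 1)]
        dp[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, k + 1):
                for t in range(j - 1, i):
                    c = dp[t][j - 1] + cost(prices[t:i])
                    if c < dp[i][j]:
                        dp[i][j], cut[i][j] = c, t

        buckets, i, j = [], n, k           # recover the optimal ranges
        while j:
            buckets.append(prices[cut[i][j]:i])
            i, j = cut[i][j], j - 1
        print(list(reversed(buckets)))     # e.g. [[3..11], [30, 35], [90]]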

    Sentiment Analysis of Review Datasets Using Naive Bayes and K-NN Classifier

    Lopamudra Dey, Sanjay Chakraborty, Anuraag Biswas, Beepa Bose, Sweta Tiwari
    Comments: Volume-8, Issue-4, pp.54-62, 2016
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    The advent of Web 2.0 has led to an increase in the amount of sentimental
    content available in the Web. Such content is often found in social media web
    sites in the form of movie or product reviews, user comments, testimonials,
    messages in discussion forums etc. Timely discovery of the sentimental or
    opinionated web content has a number of advantages, the most important of all
    being monetization. Understanding of the sentiments of human masses towards
    different entities and products enables better services for contextual
    advertisements, recommendation systems and analysis of market trends. The
    focus of our project is a sentiment-focused web crawling framework to
    facilitate the quick discovery of sentimental content in movie and hotel
    reviews, and the analysis of the same. We use statistical methods to capture
    elements of subjective style and sentence polarity. The paper discusses in
    detail two supervised machine learning algorithms, K-Nearest Neighbour (K-NN)
    and Naive Bayes, and compares their overall accuracy, precision and recall
    values. In the case of movie reviews, Naive Bayes gave far better results
    than K-NN, while for hotel reviews both algorithms yielded lower, nearly
    identical accuracies.
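
    The comparison setup can be reproduced in miniature with scikit-learn (toy
    data below; the paper uses movie and hotel review corpora):

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline

        texts = ["great movie", "loved it", "awful plot", "terrible acting",
                 "clean room", "rude staff", "wonderful stay", "dirty bathroom"]
        labels = [1, 1, 0, 0, 1, 0, 1, 0]

        for clf in (MultinomialNB(), KNeighborsClassifier(n_neighbors=3)):
            pipe = make_pipeline(CountVectorizer(), clf)
            acc = cross_val_score(pipe, texts, labels, cv=2).mean()
            print(type(clf).__name__, round(acc, 2))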

    Finding Street Gang Members on Twitter

    Lakshika Balasuriya, Sanjaya Wijeratne, Derek Doran, Amit Sheth
    Comments: 8 pages, 9 figures, 2 tables, Published as a full paper at 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2016)
    Journal-ref: The 2016 IEEE/ACM Int. Conf. on Advances in Social Networks
    Analysis and Mining. vol. 8, pp. 685-692. San Francisco, CA, USA (2016)
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Computers and Society (cs.CY); Information Retrieval (cs.IR)

    Most street gang members use Twitter to intimidate others, to present
    outrageous images and statements to the world, and to share recent illegal
    activities. Their tweets may thus be useful to law enforcement agencies to
    discover clues about recent crimes or to anticipate ones that may occur.
    Finding these posts, however, requires a method to discover gang member Twitter
    profiles. This is a challenging task since gang members represent a very small
    population of the 320 million Twitter users. This paper studies the problem of
    automatically finding gang members on Twitter. It outlines a process to curate
    one of the largest sets of verifiable gang member profiles that have ever been
    studied. A review of these profiles establishes differences in the language,
    images, YouTube links, and emojis gang members use compared to the rest of the
    Twitter population. Features from this review are used to train a series of
    supervised classifiers. Our classifier achieves a promising F1 score with a low
    false positive rate.

    Beyond Exchangeability: The Chinese Voting Process

    Moontae Lee, Seok Hyun Jin, David Mimno
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG)

    Many online communities present user-contributed responses such as reviews of
    products and answers to questions. User-provided helpfulness votes can
    highlight the most useful responses, but voting is a social process that can
    gain momentum based on the popularity of responses and the polarity of existing
    votes. We propose the Chinese Voting Process (CVP) which models the evolution
    of helpfulness votes as a self-reinforcing process dependent on position and
    presentation biases. We evaluate this model on Amazon product reviews and more
    than 80 StackExchange forums, measuring the intrinsic quality of individual
    responses and behavioral coefficients of different communities.


    Computation and Language

    Neural Machine Translation in Linear Time

    Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu
    Comments: 11 pages
    Subjects: Computation and Language (cs.CL)

    We present a neural architecture for sequence processing. The ByteNet is a
    stack of two dilated convolutional neural networks, one to encode the source
    sequence and one to decode the target sequence, where the target network
    unfolds dynamically to generate variable length outputs. The ByteNet has two
    core properties: it runs in time that is linear in the length of the sequences
    and it preserves the sequences’ temporal resolution. The ByteNet decoder
    attains state-of-the-art performance on character-level language modelling and
    outperforms the previous best results obtained with recurrent neural networks.
    The ByteNet also achieves a performance on raw character-level machine
    translation that approaches that of the best neural translation models that run
    in quadratic time. The implicit structure learnt by the ByteNet mirrors the
    expected alignments between the sequences.
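
    The core building block can be sketched in PyTorch as a residual stack of
    dilated 1D convolutions whose output length equals its input length (sizes
    and the dilation schedule are illustrative, and the causal masking the
    decoder needs is omitted):

        import torch
        import torch.nn as nn

        class DilatedStack(nn.Module):
            def __init__(self, channels=64, dilations=(1, 2, 4, 8)):
                super().__init__()
                self.convs = nn.ModuleList(
                    nn.Conv1d(channels, channels, kernel_size=3,
                              padding=d, dilation=d)   # length-preserving
                    for d in dilations)

            def forward(self, x):                      # x: (batch, channels, time)
                for conv in self.convs:
                    x = torch.relu(conv(x)) + x        # residual connection
                return x

        x = torch.randn(2, 64, 100)
        print(DilatedStack()(x).shape)                 # torch.Size([2, 64, 100])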

    End-to-End Reading Comprehension with Dynamic Answer Chunk Ranking

    Yang Yu, Wei Zhang, Kazi Hasan, Mo Yu, Bing Xiang, Bowen Zhou
    Comments: Submitted to AAAI
    Subjects: Computation and Language (cs.CL)

    This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading
    comprehension (RC) model that is able to extract and rank a set of answer
    candidates from a given document to answer questions. DCR is able to predict
    answers of variable lengths, whereas previous neural RC models primarily
    focused on predicting single tokens or entities. DCR encodes a document and an
    input question with recurrent neural networks, and then applies a word-by-word
    attention mechanism to acquire question-aware representations for the document,
    followed by the generation of chunk representations and a ranking module to
    propose the top-ranked chunk as the answer. Experimental results show that DCR
    achieves state-of-the-art exact match and F1 scores on the SQuAD dataset.

    Generating Sentiment Lexicons for German Twitter

    Uladzimir Sidarenka, Manfred Stede
    Comments: This paper is the first in a planned series of articles on an automatic generation of sentiment lexicons for non-English Twitter. It will be presented as a poster at the PEOPLES workshop (this https URL)
    Subjects: Computation and Language (cs.CL)

    Despite substantial progress made in developing new sentiment lexicon
    generation (SLG) methods for English, the task of transferring these approaches
    to other languages and domains in a sound way still remains open. In this
    paper, we contribute to the solution of this problem by systematically
    comparing semi-automatic translations of common English polarity lists with the
    results of the original automatic SLG algorithms, which were applied directly
    to German data. We evaluate these lexicons on a corpus of 7,992 manually
    annotated tweets. In addition to that, we also collate the results of
    dictionary- and corpus-based SLG methods in order to find out which of these
    paradigms is better suited for the inherently noisy domain of social media. Our
    experiments show that semi-automatic translations notably outperform automatic
    systems (reaching a macro-averaged F1-score of 0.589), and that
    dictionary-based techniques produce much better polarity lists as compared to
    corpus-based approaches (whose best F1-scores run up to 0.479 and 0.419
    respectively) even for the non-standard Twitter genre.

    Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition

    Hagen Soltau, Hank Liao, Hasim Sak
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We present results that show it is possible to build a competitive, greatly
    simplified, large vocabulary continuous speech recognition system with whole
    words as acoustic units. We model the output vocabulary of about 100,000 words
    directly using deep bi-directional LSTM RNNs with CTC loss. The model is
    trained on 125,000 hours of semi-supervised acoustic training data, which
    enables us to alleviate the data sparsity problem for word models. We show that
    the CTC word models work very well as an end-to-end all-neural speech
    recognition model without the use of traditional context-dependent sub-word
    phone units that require a pronunciation lexicon, and without any language
    model, removing the need to decode. We demonstrate that the CTC word models
    perform better than a strong, more complex, state-of-the-art baseline with
    sub-word units.
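
    A hedged sketch of the training objective in PyTorch (shapes and the
    vocabulary size are illustrative; the actual models are much larger and
    trained on 125,000 hours of audio):

        import torch
        import torch.nn as nn

        T, N, V = 50, 4, 1000                  # frames, batch size, word vocabulary
        lstm = nn.LSTM(input_size=40, hidden_size=128, bidirectional=True)
        proj = nn.Linear(256, V + 1)           # +1 output for the CTC blank
        ctc = nn.CTCLoss(blank=V)

        feats = torch.randn(T, N, 40)          # acoustic features
        out, _ = lstm(feats)
        log_probs = proj(out).log_softmax(-1)  # (T, N, V+1)

        targets = torch.randint(0, V, (N, 8))  # 8 word ids per utterance
        loss = ctc(log_probs, targets,
                   torch.full((N,), T, dtype=torch.long),
                   torch.full((N,), 8, dtype=torch.long))
        loss.backward()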

    Knowledge Questions from Knowledge Graphs

    Dominic Seyler, Mohamed Yahya, Klaus Berberich
    Subjects: Computation and Language (cs.CL)

    We address the novel problem of automatically generating quiz-style knowledge
    questions from a knowledge graph such as DBpedia. Questions of this kind have
    ample applications, for instance, to educate users about or to evaluate their
    knowledge in a specific domain. To solve the problem, we propose an end-to-end
    approach. The approach first selects a named entity from the knowledge graph as
    an answer. It then generates a structured triple-pattern query, which yields
    the answer as its sole result. If a multiple-choice question is desired, the
    approach selects alternative answer options. Finally, our approach uses a
    template-based method to verbalize the structured query and yield a natural
    language question. A key challenge is estimating how difficult the generated
    question is to human users. To do this, we make use of historical data from the
    Jeopardy! quiz show and a semantically annotated Web-scale document collection,
    engineer suitable features, and train a logistic regression classifier to
    predict question difficulty. Experiments demonstrate the viability of our
    overall approach.
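
    The pipeline can be illustrated end to end in a few lines (the entity,
    predicate, and template below are made up; a real query would carry extra
    patterns to guarantee the answer is the sole result):

        # Made-up entity/predicate/template for illustration.
        answer = "dbr:Marie_Curie"
        pattern = ("?x", "dbo:award", "dbr:Nobel_Prize_in_Chemistry")

        sparql = "SELECT ?x WHERE {{ {} {} {} . }}".format(*pattern)

        templates = {"dbo:award": "Which person received the {obj}?"}
        question = templates[pattern[1]].format(
            obj=pattern[2].split(":")[1].replace("_", " "))

        print(sparql)    # query whose sole result should be the answer entity
        print(question)  # verbalized question; distractor options would be
                         # drawn from entities satisfying similar patterns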

    Named Entity Recognition for Novel Types by Transfer Learning

    Lizhen Qu, Gabriela Ferraro, Liyuan Zhou, Weiwei Hou, Timothy Baldwin
    Journal-ref: EMNLP 2016
    Subjects: Computation and Language (cs.CL)

    In named entity recognition, we often don’t have a large in-domain training
    corpus or a knowledge base with adequate coverage to train a model directly. In
    this paper, we propose a method where, given training data in a related domain
    with similar (but not identical) named entity (NE) types and a small amount of
    in-domain training data, we use transfer learning to learn a domain-specific NE
    model. That is, the novelty in the task setup is that we assume not just domain
    mismatch, but also label mismatch.

    LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

    Xiang Li, Tao Qin, Jian Yang, Tie-Yan Liu
    Comments: NIPS 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Recurrent neural networks (RNNs) have achieved state-of-the-art performances
    in many natural language processing tasks, such as language modeling and
    machine translation. However, when the vocabulary is large, the RNN model will
    become very big (e.g., possibly beyond the memory capacity of a GPU device) and
    its training will become very inefficient. In this work, we propose a novel
    technique to tackle this challenge. The key idea is to use 2-Component (2C)
    shared embedding for word representations. We allocate every word in the
    vocabulary into a table, each row of which is associated with a vector, and
    each column associated with another vector. Depending on its position in the
    table, a word is jointly represented by two components: a row vector and a
    column vector. Since the words in the same row share the row vector and the
    words in the same column share the column vector, we only need \(2\sqrt{|V|}\)
    vectors to represent a vocabulary of \(|V|\) unique words, which is far fewer
    than the \(|V|\) vectors required by existing approaches. Based on the
    2-Component shared embedding, we design a new RNN algorithm and evaluate it
    using the language modeling task on several benchmark datasets. The results
    show that our algorithm significantly reduces the model size and speeds up the
    training process, without sacrifice of accuracy (it achieves similar, if not
    better, perplexity as compared to state-of-the-art language models).
    Remarkably, on the One-Billion-Word benchmark Dataset, our algorithm achieves
    comparable perplexity to previous language models, whilst reducing the model
    size by a factor of 40-100, and speeding up the training process by a factor of
    2. We name our proposed algorithm LightRNN to reflect its very small
    model size and very high training speed.
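
    The table allocation is easy to sketch (illustrative sizes; the actual model
    interleaves row and column predictions inside the RNN):

        import torch
        import torch.nn as nn

        V, dim = 10000, 64
        side = int(V ** 0.5)            # a 100 x 100 word table
        row_emb = nn.Embedding(side, dim)
        col_emb = nn.Embedding(side, dim)

        word_ids = torch.tensor([0, 37, 9999])
        rows, cols = word_ids // side, word_ids % side   # table allocation
        x = torch.cat([row_emb(rows), col_emb(cols)], dim=-1)
        print(x.shape)   # torch.Size([3, 128]); 200 stored vectors, not 10000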

    Chinese Poetry Generation with Planning based Neural Network

    Zhe Wang, Wei He, Hua Wu, Haiyang Wu, Wei Li, Haifeng Wang, Enhong Chen
    Comments: Accepted at COLING 2016
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Chinese poetry generation is a very challenging task in natural language
    processing. In this paper, we propose a novel two-stage poetry generating
    method which first plans the sub-topics of the poem according to the user’s
    writing intent, and then generates each line of the poem sequentially, using a
    modified recurrent neural network encoder-decoder framework. The proposed
    planning-based method can ensure that the generated poem is coherent and
    semantically consistent with the user’s intent. A comprehensive evaluation with
    human judgments demonstrates that our proposed approach outperforms the
    state-of-the-art poetry generation methods and that the quality of the
    generated poems is somewhat comparable to that of human poets.

    Experiments with POS Tagging Code-mixed Indian Social Media Text

    Prakash B. Pimpale, Raj Nath Patel
    Comments: 3 Pages, Published in the Proceedings of the Tool Contest on POS Tagging for Code-mixed Indian Social Media (Facebook, Twitter, and Whatsapp) Text
    Journal-ref: In the Proceedings of the 12th International Conference on Natural
    Language Processing (ICON 2015)
    Subjects: Computation and Language (cs.CL)

    This paper presents Centre for Development of Advanced Computing Mumbai’s
    (CDACM) submission to the NLP Tools Contest on Part-Of-Speech (POS) Tagging For
    Code-mixed Indian Social Media Text (POSCMISMT) 2015 (collocated with ICON
    2015). We submitted results for Hindi (hi), Bengali (bn), and Telugu (te)
    languages mixed with English (en). In this paper, we describe the POS tagging
    approaches we exploited for this task. Machine learning was used to POS tag
    the mixed-language text: we tried distributed representations of words in
    vector space (word2vec) for feature extraction, together with log-linear
    models. We report our work on all three languages, hi, bn, and te, mixed
    with en.

    Towards Deep Learning in Hindi NER: An approach to tackle the Labelled Data Sparsity

    Vinayak Athavale, Shreenivas Bharadwaj, Monik Pamecha, Ameya Prabhu, Manish Shrivastava
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    In this paper we describe an end-to-end neural model for Named Entity
    Recognition (NER) based on bi-directional RNN-LSTMs. Almost all NER systems
    for Hindi use language-specific features and handcrafted rules with
    gazetteers. Our model is language independent and uses no domain-specific
    features or any handcrafted rules. Our models rely on semantic information in
    the form of word vectors, which are learnt by an unsupervised learning
    algorithm on an unannotated corpus. Our model attained state-of-the-art
    performance on both English and Hindi, a morphologically rich language,
    without using any morphological analysis or gazetteers of any sort.

    Represent, Aggregate, and Constrain: A Novel Architecture for Machine Reading from Noisy Sources

    Jason Naradowsky, Sebastian Riedel
    Subjects: Computation and Language (cs.CL)

    In order to extract event information from text, a machine reading model must
    learn to accurately read and interpret the ways in which that information is
    expressed. But it must also, as the human reader must, aggregate numerous
    individual value hypotheses into a single coherent global analysis, applying
    global constraints which reflect prior knowledge of the domain.

    In this work we focus on the task of extracting plane crash event information
    from clusters of related news articles whose labels are derived via distant
    supervision. Unlike previous machine reading work, we assume that while most
    target values will occur frequently in most clusters, they may also be missing
    or incorrect.

    We introduce a novel neural architecture to explicitly model the noisy nature
    of the data and to deal with these aforementioned learning issues. Our models
    are trained end-to-end and achieve an improvement of more than 12.1 F1 over
    previous work, despite using far less linguistic annotation. We apply factor
    graph constraints to promote more coherent event analyses, with belief
    propagation inference formulated within the transitions of a recurrent neural
    network. We show this technique additionally improves maximum F1 by up to
    2.8 points, resulting in a relative improvement of 50% over the previous
    state-of-the-art.

    Feature-Augmented Neural Networks for Patient Note De-identification

    Ji Young Lee, Franck Dernoncourt, Ozlem Uzuner, Peter Szolovits
    Comments: Accepted as a conference paper at COLING ClinicalNLP 2016. The first two authors contributed equally to this work
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Patient notes contain a wealth of information of potentially great interest
    to medical investigators. However, to protect patients’ privacy, Protected
    Health Information (PHI) must be removed from the patient notes before they can
    be legally released, a process known as patient note de-identification. The
    main objective for a de-identification system is to have the highest possible
    recall. Recently, the first neural-network-based de-identification system has
    been proposed, yielding state-of-the-art results. Unlike other systems, it does
    not rely on human-engineered features, which allows it to be quickly deployed,
    but does not leverage knowledge from human experts or from electronic health
    records (EHRs). In this work, we explore a method to incorporate
    human-engineered features as well as features derived from EHRs to a
    neural-network-based de-identification system. Our results show that the
    addition of features, especially the EHR-derived features, further improves the
    state-of-the-art in patient note de-identification, including for some of the
    most sensitive PHI types such as patient names. Since in a real-life setting
    patient notes typically come with EHRs, we recommend that developers of
    de-identification systems leverage the information EHRs contain.

    Sequence-to-sequence neural network models for transliteration

    Mihaela Rosca, Thomas Breuel
    Subjects: Computation and Language (cs.CL)

    Transliteration is a key component of machine translation systems and
    software internationalization. This paper demonstrates that neural
    sequence-to-sequence models obtain state of the art or close to state of the
    art results on existing datasets. In an effort to make machine transliteration
    accessible, we open source a new Arabic to English transliteration dataset and
    our trained models.

    Sentiment Analysis of Review Datasets Using Naive Bayes and K-NN Classifier

    Lopamudra Dey, Sanjay Chakraborty, Anuraag Biswas, Beepa Bose, Sweta Tiwari
    Comments: Volume-8, Issue-4, pp.54-62, 2016
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    The advent of Web 2.0 has led to an increase in the amount of sentimental
    content available in the Web. Such content is often found in social media web
    sites in the form of movie or product reviews, user comments, testimonials,
    messages in discussion forums etc. Timely discovery of the sentimental or
    opinionated web content has a number of advantages, the most important of all
    being monetization. Understanding of the sentiments of human masses towards
    different entities and products enables better services for contextual
    advertisements, recommendation systems and analysis of market trends. The
    focus of our project is a sentiment-focused web crawling framework to
    facilitate the quick discovery of sentimental content in movie and hotel
    reviews, and the analysis of the same. We use statistical methods to capture
    elements of subjective style and sentence polarity. The paper discusses in
    detail two supervised machine learning algorithms, K-Nearest Neighbour (K-NN)
    and Naive Bayes, and compares their overall accuracy, precision and recall
    values. In the case of movie reviews, Naive Bayes gave far better results
    than K-NN, while for hotel reviews both algorithms yielded lower, nearly
    identical accuracies.

    Ontology Verbalization using Semantic-Refinement

    Vinu E.V, P Sreenivasa Kumar
    Comments: Currently this paper is under review in the Semantic Web Journal. arXiv admin note: text overlap with arXiv:1607.07027
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    We propose a rule-based technique to generate redundancy-free NL descriptions
    of OWL entities. The existing approaches which address the problem of
    verbalizing OWL ontologies generate NL text segments which are close to their
    counterpart OWL statements. Some of these approaches also perform grouping
    and aggregation of these NL text segments to generate a more fluent and
    comprehensive form of the content. Restricting our attention to descriptions
    of individuals and concepts, we find that the approach currently followed in
    the available tools is to determine the set of all logical conditions that
    are satisfied by the given individual/concept name and translate these
    conditions verbatim into corresponding NL descriptions.
    Human-understandability of such descriptions is affected by the presence of
    repetitions and redundancies, as they have high fidelity to their OWL
    representation. In the literature, no efforts have been made to remove
    redundancies and repetitions at the logical level before generating the NL
    descriptions of entities, and we find this to be the main reason for the lack
    of readability of the generated text. Herein, we propose a technique called
    semantic refinement (SR) to generate meaningful and easily understandable
    descriptions of individuals and concepts of a given OWL ontology. We identify
    the combinations of OWL/DL constructs that lead to repetitive/redundant
    descriptions and propose a series of refinement rules to rewrite the
    conditions that are satisfied by an individual/concept in a
    meaning-preserving manner. The reduced set of conditions is then employed for
    generating NL descriptions. Our experiments show that SR leads to
    significantly improved descriptions of ontology entities. We also test the
    effectiveness and usefulness of the generated descriptions for the purpose of
    validating the ontologies and find that the proposed technique is indeed
    helpful in this context.

    Finding Street Gang Members on Twitter

    Lakshika Balasuriya, Sanjaya Wijeratne, Derek Doran, Amit Sheth
    Comments: 8 pages, 9 figures, 2 tables, Published as a full paper at 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2016)
    Journal-ref: The 2016 IEEE/ACM Int. Conf. on Advances in Social Networks
    Analysis and Mining. vol. 8, pp. 685-692. San Francisco, CA, USA (2016)
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Computers and Society (cs.CY); Information Retrieval (cs.IR)

    Most street gang members use Twitter to intimidate others, to present
    outrageous images and statements to the world, and to share recent illegal
    activities. Their tweets may thus be useful to law enforcement agencies to
    discover clues about recent crimes or to anticipate ones that may occur.
    Finding these posts, however, requires a method to discover gang member Twitter
    profiles. This is a challenging task since gang members represent a very small
    population of the 320 million Twitter users. This paper studies the problem of
    automatically finding gang members on Twitter. It outlines a process to curate
    one of the largest sets of verifiable gang member profiles that have ever been
    studied. A review of these profiles establishes differences in the language,
    images, YouTube links, and emojis gang members use compared to the rest of the
    Twitter population. Features from this review are used to train a series of
    supervised classifiers. Our classifier achieves a promising F1 score with a low
    false positive rate.


    Distributed, Parallel, and Cluster Computing

    A GPU-Based Genetic Algorithm for the P-Median Problem

    Bader F. AlBdaiwi, Hosam M.F. AboElFotoh
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The p-median problem is a well-known NP-hard problem. Many heuristics have
    been proposed in the literature for this problem. In this paper, we exploit a
    GPGPU parallel computing platform to present a new genetic algorithm
    implemented in CUDA and based on a pseudo-Boolean formulation of the p-median
    problem. We have tested the effectiveness of our algorithm using a Tesla K40
    (2880 CUDA cores) on 290 different benchmark instances obtained from
    OR-Library, the discrete location problems benchmark library, and benchmarks
    introduced in recent publications. The algorithm succeeded in finding optimal
    solutions for all instances except two OR-Library instances, namely pmed30
    and pmed40, for which better-than-99.9% approximations were obtained.
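
    For reference, the objective the GA optimizes is compact enough to state in
    code (an illustration of the problem on random data, not the paper's CUDA
    implementation):

        import numpy as np

        rng = np.random.default_rng(0)
        d = rng.random((30, 30))
        d = (d + d.T) / 2                     # symmetric distance matrix
        np.fill_diagonal(d, 0.0)
        p = 4

        def cost(medians):                    # p-median objective
            return d[:, medians].min(axis=1).sum()

        # Random-restart baseline that a GA chromosome encoding would improve on.
        best = min(cost(rng.choice(30, p, replace=False)) for _ in range(200))
        print(round(best, 3))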

    Big Data Frameworks: A Comparative Study

    Wissem Inoubli, Sabeur Aridhi, Haithem Mezni, Alexander Jung
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Recently, increasingly large amounts of data are generated from a variety of
    sources. Existing data processing technologies are not suitable to cope with
    the huge amounts of generated data. Yet, many research works focus on Big Data,
    a buzzword referring to the processing of massive volumes of (unstructured)
    data. Recently proposed frameworks for Big Data applications help to store,
    analyze and process the data. In this paper, we discuss the challenges of Big
    Data and we survey existing Big Data frameworks. We also present an
    experimental evaluation and a comparative study of the most popular Big Data
    frameworks. This survey is concluded with a presentation of best practices
    related to the use of studied frameworks in several application domains such as
    machine learning, graph processing and real-world applications.

    On the Power of Weaker Pairwise Interaction: Fault-Tolerant Simulation of Population Protocols

    Giuseppe Antonio Di Luna, Paola Flocchini, Taisuke Izumi, Tomoko Izumi, Nicola Santoro, Giovanni Viglietta
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper we investigate the computational power of Population Protocols
    (PP) under some unreliable and/or weaker interaction models. More precisely, we
    focus on two features related to the power of interactions: omission failures
    and one-way communications. An omission failure, a notion that this paper
    introduces for the first time in the context of PP, is the loss by one or both
    parties of the information transmitted in an interaction. The failure may or
    may not be detected by either party. On the other hand, in one-way models,
    communication happens only in one direction: only one of the two agents can
    change its state depending on both agents’ states, and the other agent may or
    may not be aware of the interaction. These notions can be combined, obtaining
    one-way protocols with (possibly detectable) omission failures.

    A general question is what additional power is necessary and sufficient to
    completely overcome the weakness of one-way protocols and enable them to
    simulate two-way protocols, with and without omission failures. As a basic
    feature, a simulator needs to implement an atomic communication of states
    between two agents; this task is further complicated by the anonymity of the
    agents, their lack of knowledge of the system, and the limited amount of memory
    that they may have.

    We provide the first answers to these questions by presenting and analyzing
    several simulators, i.e., wrapper protocols converting any protocol for the
    standard two-way model into one running on a weaker one.

    A Scalable and Robust Framework for Intelligent Real-time Video Surveillance

    Shreenath Dutt, Ankita Kalra
    Comments: 4 pages, 3 figures, Presented in International Conference on Advances in Computing, Communications and Informatics (ICACCI-2016), September 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper, we present an intelligent, reliable and storage-efficient
    video surveillance system using Apache Storm and OpenCV. As a Storm topology,
    we have added multiple information extraction modules that only write important
    content to the disk. Our topology is extensible, capable of adding novel
    algorithms as per the use case without affecting the existing ones, since all
    the processing stages are independent of each other. This framework is also
    highly scalable and fault tolerant, which makes it an ideal option for
    organisations that need to monitor a large network of surveillance cameras.

    KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics

    Evan R. Sparks, Shivaram Venkataraman, Tomer Kaftan, Michael J. Franklin, Benjamin Recht
    Subjects: Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)

    Modern advanced analytics applications make use of machine learning
    techniques and contain multiple steps of domain-specific and general-purpose
    processing with high resource requirements. We present KeystoneML, a system
    that captures and optimizes end-to-end large-scale machine learning
    applications for high-throughput training in a distributed environment with a
    high-level API. This approach offers increased ease of use and higher
    performance over existing systems for large scale learning. We demonstrate the
    effectiveness of KeystoneML in achieving high quality statistical accuracy and
    scalable training using real world datasets in several domains. By optimizing
    execution KeystoneML achieves up to 15x training throughput over unoptimized
    execution on a real image classification application.


    Learning

    Learning Runtime Parameters in Computer Systems with Delayed Experience Injection

    Michael Schaarschmidt, Felix Gessert, Valentin Dalibard, Eiko Yoneki
    Comments: Deep Reinforcement Learning Workshop, NIPS 2016
    Subjects: Learning (cs.LG)

    Learning effective configurations in computer systems without hand-crafting
    models for every parameter is a long-standing problem. This paper investigates
    the use of deep reinforcement learning for runtime parameters of cloud
    databases under latency constraints. Cloud services serve up to thousands of
    concurrent requests per second and can adjust critical parameters by leveraging
    performance metrics. In this work, we use continuous deep reinforcement
    learning to learn optimal cache expirations for HTTP caching in content
    delivery networks. To this end, we introduce a technique for asynchronous
    experience management called delayed experience injection, which facilitates
    delayed reward and next-state computation in concurrent environments where
    measurements are not immediately available. Evaluation results show that our
    approach based on normalized advantage functions and asynchronous CPU-only
    training outperforms a statistical estimator.

    Depth Separation in ReLU Networks for Approximating Smooth Non-Linear Functions

    Itay Safran, Ohad Shamir
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    We provide a depth-based separation result for feed-forward ReLU neural
    networks, showing that a wide family of non-linear, twice-differentiable
    functions on \([0,1]^d\), which can be approximated to accuracy \(\epsilon\)
    by ReLU networks of depth and width \(\mathcal{O}(\text{poly}(\log(1/\epsilon)))\),
    cannot be approximated to similar accuracy by constant-depth ReLU networks,
    unless their width is at least \(\Omega(1/\epsilon)\).

    DPPred: An Effective Prediction Framework with Concise Discriminative Patterns

    Jingbo Shang, Meng Jiang, Wenzhu Tong, Jinfeng Xiao, Jian Peng, Jiawei Han
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    In the literature, two series of models have been proposed to address
    prediction problems including classification and regression. Simple models,
    such as generalized linear models, have ordinary performance but strong
    interpretability on a set of simple features. The other series, including
    tree-based models, organize numerical, categorical and high dimensional
    features into a comprehensive structure with rich interpretable information in
    the data.

    In this paper, we propose a novel Discriminative Pattern-based Prediction
    framework (DPPred) to accomplish prediction tasks by combining the
    advantages of both: effectiveness and interpretability. Specifically, DPPred
    adopts the concise discriminative patterns that are on the prefix paths from
    the root to leaf nodes in the tree-based models. DPPred selects a limited
    number of the useful discriminative patterns by searching for the most
    effective pattern combination to fit generalized linear models. Extensive
    experiments show that in many scenarios, DPPred provides competitive accuracy
    with the state-of-the-art as well as the valuable interpretability for
    developers and experts. In particular, taking a clinical application dataset as
    a case study, our DPPred outperforms the baselines by using only 40 concise
    discriminative patterns out of a potentially exponentially large set of
    patterns.

    Active Learning from Imperfect Labelers

    Songbai Yan, Kamalika Chaudhuri, Tara Javidi
    Comments: To appear in NIPS 2016
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We study active learning where the labeler can not only return incorrect
    labels but also abstain from labeling. We consider different noise and
    abstention conditions of the labeler. We propose an algorithm which utilizes
    abstention responses, and analyze its statistical consistency and query
    complexity under fairly natural assumptions on the noise and abstention rate of
    the labeler. This algorithm is adaptive in the sense that it can automatically
    request fewer queries with a more informed or less noisy labeler. We couple our
    algorithm with lower bounds to show that under some technical conditions, it
    achieves nearly optimal query complexity.

    The Multi-fidelity Multi-armed Bandit

    Kirthevasan Kandasamy, Gautam Dasarathy, Jeff Schneider, Barnabás Póczos
    Comments: To appear at NIPS 2016
    Subjects: Learning (cs.LG)

    We study a variant of the classical stochastic \(K\)-armed bandit where
    observing the outcome of each arm is expensive, but cheap approximations to
    this outcome are available. For example, in online advertising the performance
    of an ad can be approximated by displaying it for shorter time periods or to
    narrower audiences. We formalise this task as a multi-fidelity bandit, where,
    at each time step, the forecaster may choose to play an arm at any one of \(M\)
    fidelities. The highest fidelity (desired outcome) expends cost
    \(\lambda^{(M)}\). The \(m^{\text{th}}\) fidelity (an approximation) expends
    \(\lambda^{(m)} < \lambda^{(M)}\) and returns a biased estimate of the highest
    fidelity.
    setting and prove that it naturally adapts to the sequence of available
    approximations and costs thus attaining better regret than naive strategies
    which ignore the approximations. For instance, in the above online advertising
    example, MF-UCB would use the lower fidelities to quickly eliminate suboptimal
    ads and reserve the larger expensive experiments on a small set of promising
    candidates. We complement this result with a lower bound and show that MF-UCB
    is nearly optimal under certain conditions.
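
    A condensed sketch of the strategy (the switching rule and threshold below
    are simplifications of the paper's MF-UCB conditions):

        import numpy as np

        rng = np.random.default_rng(1)
        means = np.array([[0.40, 0.50],
                          [0.45, 0.70]])   # arms x fidelities; last column is the truth
        costs = np.array([1.0, 10.0])      # cheap approximation vs. full experiment
        counts = np.zeros((2, 2))
        sums = np.zeros((2, 2))
        spend = 0.0

        for t in range(1, 1001):
            ucb = np.zeros(2)
            fid = np.zeros(2, dtype=int)
            for a in range(2):
                for m in range(2):
                    fid[a] = m
                    if counts[a, m] == 0:
                        ucb[a] = np.inf    # unexplored fidelity: try it
                        break
                    bonus = np.sqrt(2 * np.log(t) / counts[a, m])
                    ucb[a] = sums[a, m] / counts[a, m] + bonus
                    if bonus > 0.3:        # estimate still loose: stay cheap
                        break
            a = int(np.argmax(ucb))
            m = int(fid[a])
            spend += costs[m]
            counts[a, m] += 1
            sums[a, m] += rng.normal(means[a, m], 0.1)

        print("total cost:", spend)
        print("plays per (arm, fidelity):", counts)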

    Doubly Convolutional Neural Networks

    Shuangfei Zhai, Yu Cheng, Weining Lu, Zhongfei Zhang
    Comments: To appear in NIPS 2016
    Subjects: Learning (cs.LG)

    Building large models with parameter sharing accounts for most of the success
    of deep convolutional neural networks (CNNs). In this paper, we propose doubly
    convolutional neural networks (DCNNs), which significantly improve the
    performance of CNNs by further exploring this idea. Instead of allocating a
    set of convolutional filters that are independently learned, a DCNN maintains
    groups of filters where filters within each group are translated versions of
    each other. Practically, a DCNN can be easily implemented by a two-step
    convolution procedure, which is supported by most modern deep learning
    libraries. We perform extensive experiments on three image classification
    benchmarks: CIFAR-10, CIFAR-100 and ImageNet, and show that DCNNs consistently
    outperform other competing architectures. We have also verified that replacing
    a convolutional layer with a doubly convolutional layer at any depth of a CNN
    can improve its performance. Moreover, various design choices of DCNNs are
    demonstrated, which shows that DCNN can serve the dual purpose of building more
    accurate models and/or reducing the memory footprint without sacrificing the
    accuracy.

    Deep Model Compression: Distilling Knowledge from Noisy Teachers

    Bharat Bhusan Sau, Vineeth N. Balasubramanian
    Comments: Submitted to WACV,2017
    Subjects: Learning (cs.LG)

    The remarkable successes of deep learning models across various applications
    have resulted in the design of deeper networks that can solve complex problems.
    However, the increasing depth of such models also results in a higher storage
    and runtime complexity, which restricts the deployability of such very deep
    models on mobile and portable devices, which have limited storage and battery
    capacity. While many methods have been proposed for deep model compression in
    recent years, almost all of them have focused on reducing storage complexity.
    In this work, we extend the teacher-student framework for deep model
    compression, since it has the potential to address runtime and train time
    complexity too. We propose a simple methodology to include a noise-based
    regularizer while training the student from the teacher, which provides a
    healthy improvement in the performance of the student network. Our experiments
    on the CIFAR-10, SVHN and MNIST datasets show promising improvement, with the
    best performance on the CIFAR-10 dataset. We also conduct a comprehensive
    empirical evaluation of the proposed method under related settings on the
    CIFAR-10 dataset to show the promise of the proposed approach.
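
    The noise-based regularizer can be sketched as a distillation loss in which
    the teacher's logits are perturbed on a random subset of batches
    (temperature, noise level and perturbation probability below are
    assumptions):

        import torch
        import torch.nn.functional as F

        def distill_loss(student_logits, teacher_logits, T=4.0,
                         noise_prob=0.5, sigma=0.1):
            if torch.rand(()) < noise_prob:        # perturb the teacher
                teacher_logits = teacher_logits + sigma * torch.randn_like(teacher_logits)
            p_teacher = F.softmax(teacher_logits / T, dim=-1)
            log_p_student = F.log_softmax(student_logits / T, dim=-1)
            return F.kl_div(log_p_student, p_teacher,
                            reduction="batchmean") * T * T

        s = torch.randn(8, 10, requires_grad=True)   # student batch logits
        t = torch.randn(8, 10)                       # teacher batch logits
        distill_loss(s, t).backward()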

    Compact Deep Convolutional Neural Networks With Coarse Pruning

    Sajid Anwar, Wonyong Sung
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    The learning capability of a neural network improves with increasing depth at
    higher computational costs. Wider layers with dense kernel connectivity
    patterns further increase this cost and may hinder real-time inference. We
    propose feature map and kernel level pruning for reducing the computational
    complexity of a deep convolutional neural network. Pruning feature maps
    reduces the width of a layer and hence does not need any sparse
    representation. Further, kernel pruning converts the dense connectivity
    pattern into a sparse one. Due to their coarse nature, these pruning
    granularities can be exploited by GPU- and VLSI-based implementations. We
    propose a simple and generic strategy to choose the least adversarial pruning
    masks for both granularities. The pruned networks are retrained, which
    compensates for the loss in accuracy. We obtain the best pruning ratios when
    we prune a network at both granularities. Experiments with the CIFAR-10
    dataset show that more than 85% sparsity can be induced in the convolution
    layers with less than a 1% increase in the misclassification rate of the
    baseline network.
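
    Feature-map pruning is straightforward to sketch in PyTorch: rank a layer's
    filters and zero out the weakest, then retrain. The L1-norm criterion below
    is a common stand-in, not necessarily the paper's least-adversarial mask
    selection:

        import torch
        import torch.nn as nn

        conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        prune_ratio = 0.5

        with torch.no_grad():
            norms = conv.weight.abs().sum(dim=(1, 2, 3))   # one score per filter
            n_prune = int(prune_ratio * conv.out_channels)
            weakest = torch.argsort(norms)[:n_prune]
            conv.weight[weakest] = 0.0                     # drop feature maps
            conv.bias[weakest] = 0.0

        y = conv(torch.randn(1, 16, 8, 8))
        dead = (y.abs().sum(dim=(0, 2, 3)) == 0).sum().item()
        print(dead, "of", conv.out_channels, "feature maps pruned")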

    A Theoretical Study of the Relationship Between a Whole ELM Network and Its Subnetworks

    Enmei Tu, Guanghao Zhang, Lily Rachmawati, Eshan Rajabally, Guang-Bin Huang
    Comments: 3 figures
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    A biological neural network is constituted by numerous subnetworks and
    modules with different functionalities. For an artificial neural network, the
    relationship between a network and its subnetworks is also important and useful
    for both theoretical and algorithmic research; e.g., it can be exploited to
    develop incremental or parallel network training algorithms. In this paper we
    explore the relationship between an ELM neural
    network and its subnetworks. To the best of our knowledge, we are the first to
    prove a theorem that shows an ELM neural network can be scattered into
    subnetworks and its optimal solution can be constructed recursively by the
    optimal solutions of these subnetworks. Based on the theorem we also present
    two algorithms to train a large ELM neural network efficiently: one is a
    parallel network training algorithm and the other is an incremental network
    training algorithm. The experimental results demonstrate the usefulness of the
    theorem and the validity of the developed algorithms.

    Rawlsian Fairness for Machine Learning

    Matthew Joseph, Michael Kearns, Jamie Morgenstern, Seth Neel, Aaron Roth
    Subjects: Learning (cs.LG)

    Motivated by concerns that automated decision-making procedures can
    unintentionally lead to discriminatory behavior, we study a technical
    definition of fairness modeled after John Rawls’ notion of “fair equality of
    opportunity”. In the context of a simple model of online decision making, we
    give an algorithm that satisfies this fairness constraint, while still being
    able to learn at a rate that is comparable to (but necessarily worse than) that
    of the best algorithms absent a fairness constraint. We prove a regret bound
    for fair algorithms in the linear contextual bandit framework that is a
    significant improvement over our technical companion paper [16], which gives
    black-box reductions in a more general setting. We analyze our algorithms both
    theoretically and experimentally. Finally, we introduce the notion of a
    “discrimination index”, and show that standard algorithms for our problem
    exhibit structured discriminatory behavior, whereas the “fair” algorithms we
    develop do not.

    TensorLy: Tensor Learning in Python

    Jean Kossaifi, Yannis Panagakis, Maja Pantic
    Subjects: Learning (cs.LG)

    Tensor methods are gaining increasing traction in machine learning. However,
    there are few to no resources available to perform tensor learning and
    decomposition in Python. To address this need we developed TensorLy. TensorLy
    is a state-of-the-art, general-purpose library for tensor learning. Written in
    Python, it aims at following the same standards adopted by the main projects of
    the Python scientific community and fully integrating with these. It allows for
    fast and straightforward tensor decomposition and learning and comes with
    exhaustive tests, thorough documentation and minimal dependencies. It can be
    easily extended and its BSD licence makes it suitable for both academic and
    commercial applications. TensorLy is available at
    this https URL
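
    A short usage sketch (written against the library's early releases; the
    return convention of the decomposition routines has changed across versions,
    so treat the details as assumptions):

        import numpy as np
        import tensorly as tl
        from tensorly.decomposition import parafac

        tensor = tl.tensor(np.random.rand(4, 5, 6))
        factors = parafac(tensor, rank=3)          # CP/Kruskal factor matrices
        approx = tl.kruskal_to_tensor(factors)     # reconstruct from factors
        print(tl.norm(tensor - approx))            # reconstruction error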

    Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences

    Daniel Neil, Michael Pfeiffer, Shih-Chii Liu
    Comments: Selected for an oral presentation at NIPS, 2016
    Subjects: Learning (cs.LG)

    Recurrent Neural Networks (RNNs) have become the state-of-the-art choice for
    extracting patterns from temporal sequences. However, current RNN models are
    ill-suited to process irregularly sampled data triggered by events generated in
    continuous time by sensors or other neurons. Such data can occur, for example,
    when the input comes from novel event-driven artificial sensors that generate
    sparse, asynchronous streams of events or from multiple conventional sensors
    with different update intervals. In this work, we introduce the Phased LSTM
    model, which extends the LSTM unit by adding a new time gate. This gate is
    controlled by a parametrized oscillation with a frequency range that produces
    updates of the memory cell only during a small percentage of the cycle. Even
    with the sparse updates imposed by the oscillation, the Phased LSTM network
    achieves faster convergence than regular LSTMs on tasks which require learning
    of long sequences. The model naturally integrates inputs from sensors of
    arbitrary sampling rates, thereby opening new areas of investigation for
    processing asynchronous sensory events that carry timing information. It also
    greatly improves the performance of LSTMs in standard RNN applications, and
    does so with an order-of-magnitude fewer computes at runtime.
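
    The time gate itself is simple to transcribe (a numpy sketch following the
    paper's piecewise definition, with tau the oscillation period, s the phase
    shift, r_on the open fraction of the cycle, and alpha the small leak used
    during training):

        import numpy as np

        def time_gate(t, tau=10.0, s=0.0, r_on=0.1, alpha=1e-3):
            phi = ((t - s) % tau) / tau                    # phase in [0, 1)
            return np.where(phi < r_on / 2, 2 * phi / r_on,
                   np.where(phi < r_on, 2 - 2 * phi / r_on,
                            alpha * phi))                  # leaky closed phase

        t = np.linspace(0.0, 20.0, 9)
        print(np.round(time_gate(t), 3))  # gate opens only briefly each cycle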

    Contextual Decision Processes with Low Bellman Rank are PAC-Learnable

    Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire
    Comments: 43 pages, 1 figure
    Subjects: Learning (cs.LG)

    This paper studies systematic exploration for reinforcement learning with
    rich observations and function approximation. We introduce a new formulation,
    called contextual decision processes, that unifies and generalizes most prior
    settings. Our first contribution is a new complexity measure, the Bellman Rank,
    that we show enables tractable learning of near-optimal behavior in these
    processes and is naturally small for many well-studied reinforcement learning
    settings. Our second contribution is a new reinforcement learning algorithm
    that engages in systematic exploration to learn contextual decision processes
    with low Bellman Rank. The algorithm provably learns near-optimal behavior with
    a number of samples that is polynomial in all relevant parameters but
    independent of the number of unique observations. The algorithm uses Bellman
    error minimization with optimistic exploration and provides new insights into
    efficient exploration for reinforcement learning with function approximation.

    SDP Relaxation with Randomized Rounding for Energy Disaggregation

    Kiarash Shaloudegi, András György, Csaba Szepesvári, Wilsun Xu
    Subjects: Learning (cs.LG)

    We develop a scalable, computationally efficient method for the task of
    energy disaggregation for home appliance monitoring. In this problem the goal
    is to estimate the energy consumption of each appliance over time based on the
    total energy-consumption signal of a household. The current state of the art is
    to model the problem as inference in factorial HMMs, and use quadratic
    programming to find an approximate solution to the resulting quadratic integer
    program. Here we take a more principled approach, better suited to integer
    programming problems, and find an approximate optimum by combining convex
    semidefinite relaxations and randomized rounding with a scalable ADMM method
    that exploits the special structure of the resulting semidefinite program.
    Simulation results both in synthetic and real-world datasets demonstrate the
    superiority of our method.

    Fast Learning with Nonconvex L1-2 Regularization

    Quanming Yao, James T. Kwok
    Subjects: Learning (cs.LG); Numerical Analysis (cs.NA); Numerical Analysis (math.NA)

    Convex regularizers are often used for sparse learning. They are easy to
    optimize, but can lead to inferior prediction performance. The difference of
    \(\ell_1\) and \(\ell_2\) (\(\ell_{1-2}\)) regularizer has been recently
    proposed as a nonconvex regularizer. It yields better recovery than both
    \(\ell_0\) and \(\ell_1\) regularizers on compressed sensing. However, how to
    efficiently optimize its learning problem is still challenging. The main
    difficulty is that both the \(\ell_1\) and \(\ell_2\) norms in \(\ell_{1-2}\)
    are not differentiable, and existing optimization algorithms cannot be
    applied. In this paper, we show that a closed-form solution can be derived
    for the proximal step associated with this regularizer. We further extend the
    result to low-rank matrix learning and the total variation model. Experiments
    on both synthetic and real data sets show that the resultant accelerated
    proximal gradient algorithm is more efficient than other nonconvex
    optimization algorithms.

    KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics

    Evan R. Sparks, Shivaram Venkataraman, Tomer Kaftan, Michael J. Franklin, Benjamin Recht
    Subjects: Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)

    Modern advanced analytics applications make use of machine learning
    techniques and contain multiple steps of domain-specific and general-purpose
    processing with high resource requirements. We present KeystoneML, a system
    that captures and optimizes end-to-end large-scale machine learning
    applications for high-throughput training in a distributed environment with a
    high-level API. This approach offers increased ease of use and higher
    performance over existing systems for large scale learning. We demonstrate the
    effectiveness of KeystoneML in achieving high quality statistical accuracy and
    scalable training using real world datasets in several domains. By optimizing
    execution KeystoneML achieves up to 15x training throughput over unoptimized
    execution on a real image classification application.

    Asynchronous Doubly Stochastic Proximal Optimization with Variance Reduction

    Bin Gu, Zhouyuan Huo, Heng Huang
    Subjects: Learning (cs.LG)

    In the big data era, both the sample size and the dimension can be huge at
    the same time. Asynchronous parallel technology was recently proposed to handle
    big data. Specifically, asynchronous stochastic gradient descent algorithms
    were proposed to scale with the sample size, and asynchronous stochastic
    coordinate descent algorithms were proposed to scale with the dimension. However, few
    existing asynchronous parallel algorithms can scale well in sample size and
    dimension simultaneously. In this paper, we focus on a composite objective
    function consisting of a smooth convex function f and a separable convex function
    g. We propose an asynchronous doubly stochastic proximal optimization algorithm
    with variance reduction (AsyDSPOVR) to scale well with the sample size and
    dimension simultaneously. We prove that AsyDSPOVR achieves a linear convergence
    rate when the function f satisfies the optimal strong convexity property, and a
    sublinear rate when f is merely generally convex.
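
    For orientation, a serial single-threaded simplification of the underlying
    update is the proximal SVRG step sketched below, with g taken to be an
    (\ell_1) norm for concreteness. The paper's algorithm additionally samples
    coordinates and runs such updates asynchronously, which this sketch does
    not attempt to model.

    import numpy as np

    def prox_l1(x, lam):
        """Proximal operator of lam * ||x||_1 (a separable convex g)."""
        return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

    def prox_svrg(grad_i, n, x0, step, lam, epochs=10, seed=None):
        """Serial prox-SVRG for min_x (1/n) sum_i f_i(x) + lam * ||x||_1.

        grad_i(x, i) returns the gradient of the i-th smooth component f_i.
        """
        rng = np.random.default_rng(seed)
        x = x0.copy()
        for _ in range(epochs):
            x_ref = x.copy()   # snapshot used for variance reduction
            full_grad = np.mean([grad_i(x_ref, i) for i in range(n)], axis=0)
            for _ in range(n):
                i = rng.integers(n)
                v = grad_i(x, i) - grad_i(x_ref, i) + full_grad
                x = prox_l1(x - step * v, step * lam)
        return x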

    Discriminative Gaifman Models

    Mathias Niepert
    Comments: NIPS 2016
    Subjects: Learning (cs.LG)

    We present discriminative Gaifman models, a novel family of relational
    machine learning models. Gaifman models learn feature representations bottom up
    from representations of locally connected and bounded-size regions of knowledge
    bases (KBs). Considering local and bounded-size neighborhoods of knowledge
    bases renders logical inference and learning tractable, mitigates the problem
    of overfitting, and facilitates weight sharing. Gaifman models sample
    neighborhoods of knowledge bases so as to make the learned relational models
    more robust to missing objects and relations, which is a common situation in
    open-world KBs. We present the core ideas of Gaifman models and apply them to
    large-scale relational learning problems. We also discuss the ways in which
    Gaifman models relate to some existing relational machine learning approaches.

    Tensor Switching Networks

    Chuan-Yung Tsai, Andrew Saxe, David Cox
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG); Machine Learning (stat.ML)

    We present a novel neural network algorithm, the Tensor Switching (TS)
    network, which generalizes the Rectified Linear Unit (ReLU) nonlinearity to
    tensor-valued hidden units. The TS network copies its entire input vector to
    different locations in an expanded representation, with the location determined
    by its hidden unit activity. In this way, even a simple linear readout from the
    TS representation can implement a highly expressive deep-network-like function.
    The TS network hence avoids the vanishing gradient problem by construction, at
    the cost of larger representation size. We develop several methods to train the
    TS network, including equivalent kernels for infinitely wide and deep TS
    networks, a one-pass linear learning algorithm, and two
    backpropagation-inspired representation learning algorithms. Our experimental
    results demonstrate that the TS network is indeed more expressive and
    consistently learns faster than standard ReLU networks.

    Optimization for Large-Scale Machine Learning with Distributed Features and Observations

    Alexandros Nathan, Diego Klabjan
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    As the size of modern data sets exceeds the disk and memory capacities of a
    single computer, machine learning practitioners have resorted to parallel and
    distributed computing. Given that optimization is one of the pillars of machine
    learning and predictive modeling, distributed optimization methods have
    recently garnered ample attention in the literature. Although previous research
    has mostly focused on settings where either the observations, or features of
    the problem at hand are stored in distributed fashion, the situation where both
    are partitioned across the nodes of a computer cluster (doubly distributed) has
    barely been studied. In this work we propose two doubly distributed
    optimization algorithms. The first one falls under the umbrella of distributed
    dual coordinate ascent methods, while the second one belongs to the class of
    stochastic gradient/coordinate descent hybrid methods. We conduct numerical
    experiments in Spark using real-world and simulated data sets and study the
    scaling properties of our methods. Our empirical evaluation shows that the
    proposed algorithms outperform a block distributed ADMM method, which, to
    the best of our knowledge, is the only other existing doubly distributed
    optimization algorithm.

    Numerical Range Facets Partition: Evaluation Metric and Methods

    Xueqing Liu, Chengxiang Zhai, Wei Han, Onur Gungor
    Comments: Under review for WWW’17
    Subjects: Information Retrieval (cs.IR); Human-Computer Interaction (cs.HC); Learning (cs.LG)

    Faceted browsing is a very useful interface component provided in many of
    today’s search engines. It is especially useful when the user has an
    exploratory information need or has a clear preference for certain attribute
    values. Existing work has tried to optimize faceted browsing systems in many
    aspects, but little work has been done on optimizing the partitions of
    numerical facet ranges (e.g., price ranges of products). In this paper, we
    introduce the new problem of numerical facet range partition and formally frame
    it as an optimization problem. To enable quantitative evaluation and reuse of
    search log data, we propose an evaluation metric based on user’s browsing cost
    when using the suggested ranges for navigation. We further propose and study
    multiple methods to computationally optimize the partition by leveraging search
    logs. Experimental results on a two-month search log from a major e-Commerce
    engine show that learning can significantly improve the performance over the
    baseline.

    Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition

    Hagen Soltau, Hank Liao, Hasim Sak
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We present results that show it is possible to build a competitive, greatly
    simplified, large vocabulary continuous speech recognition system with whole
    words as acoustic units. We model the output vocabulary of about 100,000 words
    directly using deep bi-directional LSTM RNNs with CTC loss. The model is
    trained on 125,000 hours of semi-supervised acoustic training data, which
    enables us to alleviate the data sparsity problem for word models. We show that
    the CTC word models work very well as an end-to-end all-neural speech
    recognition model without the use of traditional context-dependent sub-word
    phone units that require a pronunciation lexicon, and without any language
    model, removing the need to decode. We demonstrate that the CTC word models
    perform better than a strong, more complex, state-of-the-art baseline with
    sub-word units.

    Support Vector Machines and Generalisation in HEP

    A. Bethani, A. J. Bevan, J. Hays, T. J. Stevenson
    Comments: 5 pages, 6 figures. Contribution to the proceedings of the 17th International workshop on Advanced Computing and Analysis Techniques in physics research – ACAT 2016, 18 – 22 January 2016, Valparaíso, Chile
    Subjects: Data Analysis, Statistics and Probability (physics.data-an); Learning (cs.LG); High Energy Physics – Experiment (hep-ex)

    We review the concept of support vector machines (SVMs) and discuss examples
    of their use. One of the benefits of SVM algorithms, compared with neural
    networks and decision trees, is that they can be less susceptible to
    overfitting than those other algorithms are to overtraining. This issue is
    related to the generalisation of a multivariate algorithm (MVA), a problem that
    has often been overlooked in particle physics. We discuss cross validation and
    how it can be used to improve the generalisation of an MVA in the context of High
    Energy Physics analyses. The examples presented use the Toolkit for
    Multivariate Analysis (TMVA) based on ROOT and describe our improvements to the
    SVM functionality and new tools introduced for cross validation within this
    framework.

    Complex-Valued Kernel Methods for Regression

    Rafael Boloix-Tortosa, Juan José Murillo-Fuentes, Irene Santos Velázquez, Fernando Pérez-Cruz
    Comments: 8 pages, 9 figures
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Usually, complex-valued RKHSs are presented as a straightforward application
    of the real-valued case. In this paper we prove that this procedure yields a
    limited solution for regression. We show that another kernel, here denoted the
    pseudo-kernel, is needed to learn any function in complex-valued fields.
    Accordingly, we derive a novel RKHS that includes it, the widely RKHS (WRKHS).
    When the pseudo-kernel cancels, WRKHS reduces to the complex-valued RKHS of
    previous approaches. We address the kernel and pseudo-kernel design, paying
    attention to the case where the kernel and the pseudo-kernel are
    complex-valued. In the experiments we report remarkable improvements in simple
    scenarios where the real and imaginary parts have different similitude
    relations for given inputs, and in cases where the real and imaginary parts
    are correlated. In the context of these novel results we revisit the problem
    of non-linear channel equalization, to show that WRKHS helps to design more
    efficient solutions.

    Inference Compilation and Universal Probabilistic Programming

    Tuan Anh Le, Atilim Gunes Baydin, Frank Wood
    Comments: 10 pages, 6 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We introduce a method for using deep neural networks to amortize the cost of
    inference in models from the family induced by universal probabilistic
    programming languages, establishing a framework that combines the strengths of
    probabilistic programming and deep learning methods. We call what we do
    “compilation of inference” because our method transforms a denotational
    specification of an inference problem in the form of a probabilistic program
    written in a universal programming language into a trained neural network
    denoted in a neural network specification language. When at test time this
    neural network is fed observational data and executed, it performs approximate
    inference in the original model specified by the probabilistic program. Our
    training objective and learning procedure are designed to allow the trained
    neural network to be used as a proposal distribution in a sequential importance
    sampling inference engine. We illustrate our method on mixture models and
    Captcha solving and show significant speedups in the efficiency of inference.
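
    Schematically, training data for the proposal network comes from running
    the generative program forward, and the network is fit to predict the
    latents from the observations. A toy PyTorch version for a one-dimensional
    Gaussian model might look as follows; the model, architecture and names are
    illustrative assumptions, not the authors' system.

    import torch
    import torch.nn as nn

    def simulate(batch):
        """Generative program: z ~ N(0, 1); x ~ N(z, 0.5^2)."""
        z = torch.randn(batch, 1)
        return z, z + 0.5 * torch.randn(batch, 1)

    # Proposal network: maps an observation x to a Gaussian over z.
    net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 2))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)

    for step in range(2000):
        z, x = simulate(256)                 # (z, x) drawn from the model itself
        mu, log_std = net(x).chunk(2, dim=1)
        # maximize log q(z | x): minimize the Gaussian negative log-density
        loss = (0.5 * ((z - mu) / log_std.exp()) ** 2 + log_std).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    # At test time q(z | x_obs) serves as the proposal in sequential importance
    # sampling: draw z_k ~ q, weight w_k = p(x_obs, z_k) / q(z_k | x_obs).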

    LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

    Xiang Li, Tao Qin, Jian Yang, Tie-Yan Liu
    Comments: NIPS 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Recurrent neural networks (RNNs) have achieved state-of-the-art performances
    in many natural language processing tasks, such as language modeling and
    machine translation. However, when the vocabulary is large, the RNN model will
    become very big (e.g., possibly beyond the memory capacity of a GPU device) and
    its training will become very inefficient. In this work, we propose a novel
    technique to tackle this challenge. The key idea is to use 2-Component (2C)
    shared embedding for word representations. We allocate every word in the
    vocabulary into a table, each row of which is associated with a vector, and
    each column associated with another vector. Depending on its position in the
    table, a word is jointly represented by two components: a row vector and a
    column vector. Since the words in the same row share the row vector and the
    words in the same column share the column vector, we only need (2\sqrt{|V|})
    vectors to represent a vocabulary of (|V|) unique words, far fewer
    than the (|V|) vectors required by existing approaches. Based on the
    2-Component shared embedding, we design a new RNN algorithm and evaluate it
    using the language modeling task on several benchmark datasets. The results
    show that our algorithm significantly reduces the model size and speeds up the
    training process, without sacrifice of accuracy (it achieves similar, if not
    better, perplexity as compared to state-of-the-art language models).
    Remarkably, on the One-Billion-Word benchmark Dataset, our algorithm achieves
    comparable perplexity to previous language models, whilst reducing the model
    size by a factor of 40-100, and speeding up the training process by a factor of
    2. We name our proposed algorithm \emph{LightRNN} to reflect its very small
    model size and very high training speed.
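
    A minimal sketch of the 2-Component shared embedding lookup, assuming for
    illustration a fixed divmod allocation of words to table cells (in the
    paper the allocation itself is optimized during training):

    import numpy as np

    V, d = 10000, 128                  # vocabulary size, embedding dimension
    side = int(np.ceil(np.sqrt(V)))    # the word table is side x side

    row_vecs = np.random.randn(side, d)   # one shared vector per table row
    col_vecs = np.random.randn(side, d)   # one shared vector per table column

    def embed(word_id):
        """A word is represented by its cell's (row vector, column vector)."""
        r, c = divmod(word_id, side)
        return row_vecs[r], col_vecs[c]

    # Only 2 * side ~ 2 * sqrt(|V|) vectors are stored for |V| words,
    # instead of the |V| vectors of a conventional embedding table.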

    Meta-Path Guided Embedding for Similarity Search in Large-Scale Heterogeneous Information Networks

    Jingbo Shang, Meng Qu, Jialu Liu, Lance M. Kaplan, Jiawei Han, Jian Peng
    Subjects: Social and Information Networks (cs.SI); Learning (cs.LG)

    Most real-world data can be modeled as heterogeneous information networks
    (HINs) consisting of vertices of multiple types and their relationships. Search
    for similar vertices of the same type in large HINs, such as bibliographic
    networks and business-review networks, is a fundamental problem with broad
    applications. Although similarity search in HINs has been studied previously,
    most existing approaches neither explore rich semantic information embedded in
    the network structures nor take the user’s preference as guidance.

    In this paper, we re-examine similarity search in HINs and propose a novel
    embedding-based framework. It models vertices as low-dimensional vectors to
    explore network structure-embedded similarity. To accommodate user preferences
    in defining similarity semantics, our proposed framework, ESim, accepts
    user-defined meta-paths as guidance to learn vertex vectors in a user-preferred
    embedding space. Moreover, an efficient and parallel sampling-based
    optimization algorithm has been developed to learn embeddings in large-scale
    HINs. Extensive experiments on real-world large-scale HINs demonstrate a
    significant improvement on the effectiveness of ESim over several
    state-of-the-art algorithms as well as its scalability.

    Towards Deep Learning in Hindi NER: An approach to tackle the Labelled Data Sparsity

    Vinayak Athavale, Shreenivas Bharadwaj, Monik Pamecha, Ameya Prabhu, Manish Shrivastava
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    In this paper we describe an end-to-end neural model for Named Entity
    Recognition (NER) based on bi-directional RNN-LSTMs. Almost all NER
    systems for Hindi use language-specific features and handcrafted rules with
    gazetteers. Our model is language independent and uses no domain-specific
    features or handcrafted rules. It relies on semantic information in the
    form of word vectors, which are learnt by an unsupervised learning
    algorithm on an unannotated corpus. Our model attains state-of-the-art
    performance in both English and Hindi, a morphologically rich language,
    without using any morphological analysis or gazetteers.
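
    A skeletal PyTorch version of such a tagger is sketched below; the layer
    sizes, tag count and frozen pretrained embeddings are placeholder
    assumptions, not the authors' exact configuration.

    import torch
    import torch.nn as nn

    class BiLSTMTagger(nn.Module):
        """Bidirectional LSTM tagger over pretrained word vectors."""
        def __init__(self, emb, hidden, n_tags):
            super().__init__()
            # word vectors learnt beforehand on an unannotated corpus
            self.emb = nn.Embedding.from_pretrained(emb, freeze=True)
            self.lstm = nn.LSTM(emb.size(1), hidden,
                                bidirectional=True, batch_first=True)
            self.out = nn.Linear(2 * hidden, n_tags)

        def forward(self, token_ids):            # (batch, seq_len)
            h, _ = self.lstm(self.emb(token_ids))
            return self.out(h)                   # per-token tag scores

    # Usage sketch: 50k-word vocabulary, 100-dim vectors, 9 entity tags.
    model = BiLSTMTagger(torch.randn(50000, 100), hidden=128, n_tags=9)
    scores = model(torch.randint(0, 50000, (4, 20)))   # shape (4, 20, 9)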

    Discovering containment: from infants to machines

    Shimon Ullman, Nimrod Dorfman, Daniel Harari
    Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Current artificial learning systems can recognize thousands of visual
    categories, or play Go at a champion's level, but cannot explain infant
    learning, in particular the ability to learn complex concepts without guidance,
    in a specific order. A notable example is the category of ‘containers’ and the
    notion of containment, one of the earliest spatial relations to be learned,
    starting already at 2.5 months, and preceding other common relations (e.g.,
    support). Such spontaneous unsupervised learning stands in contrast with
    current highly successful computational models, which learn in a supervised
    manner, that is, by using large data sets of labeled examples. How can
    meaningful concepts be learned without guidance, and what determines the
    trajectory of infant learning, making some notions appear consistently earlier
    than others?

    FEAST: An Automated Feature Selection Framework for Compilation Tasks

    Pai-Shun Ting, Chun-Chen Tu, Pin-Yu Chen, Ya-Yun Lo, Shin-Ming Cheng
    Subjects: Programming Languages (cs.PL); Learning (cs.LG); Software Engineering (cs.SE)

    The success of the application of machine-learning techniques to compilation
    tasks can be largely attributed to the recent development and advancement of
    program characterization, a process that numerically or structurally quantifies
    a target program. While great achievements have been made in identifying key
    features to characterize programs, choosing a correct set of features for a
    specific compiler task remains an ad hoc procedure. In order to guarantee a
    comprehensive coverage of features, compiler engineers usually need to select
    an excessive number of features. This, unfortunately, can lead to a
    selection of multiple similar features, which in turn could create a new
    problem of bias that emphasizes certain aspects of a program’s characteristics,
    hence reducing the accuracy and performance of the target compiler task. In
    this paper, we propose FEAture Selection for compilation Tasks (FEAST), an
    efficient and automated framework for determining the most relevant and
    representative features from a feature pool. Specifically, FEAST utilizes
    widely used statistics and machine-learning tools, including LASSO, sequential
    forward and backward selection, for automatic feature selection, and can in
    general be applied to any numerical feature set. This paper further proposes an
    automated approach to compiler parameter assignment for assessing the
    performance of FEAST. Intensive experimental results demonstrate that, under
    the compiler parameter assignment task, FEAST can achieve comparable results
    with about 18% of features that are automatically selected from the entire
    feature pool. We also inspect these selected features and discuss their roles
    in program execution.

    Sparse Signal Recovery for Binary Compressed Sensing by Majority Voting Neural Networks

    Daisuke Ito, Tadashi Wadayama
    Comments: 6 pages
    Subjects: Information Theory (cs.IT); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose majority voting neural networks for sparse signal
    recovery in binary compressed sensing. The majority voting neural network is
    composed of several independently trained feedforward neural networks employing
    the sigmoid function as an activation function. Our empirical study shows that
    the choice of the loss function used in the training process is of
    prime importance. We found a loss function suitable for sparse signal recovery,
    which includes a cross entropy-like term and an (L_1) regularized term. From
    the experimental results, we observed that the majority voting neural network
    achieves excellent recovery performance, which approaches the optimal
    performance as the number of component nets grows. The simple architecture of
    the majority voting neural networks would be beneficial for both software and
    hardware implementations.
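
    A rough NumPy sketch of the voting rule and the loss described above; the
    reading of the (L_1) term as a sparsity penalty on the network output is an
    assumption of this sketch, and all names are illustrative.

    import numpy as np

    def majority_vote(nets, y, threshold=0.5):
        """Combine independently trained nets by per-position voting.

        Each net maps the received word y to sigmoid outputs in (0, 1);
        outputs are hardened to bits and combined by majority vote.
        """
        votes = np.stack([(net(y) > threshold).astype(int) for net in nets])
        return (2 * votes.sum(axis=0) > len(nets)).astype(int)

    def training_loss(p, x_true, lam):
        """Cross-entropy-like term plus an L1 term promoting sparse outputs."""
        eps = 1e-12
        ce = -(x_true * np.log(p + eps)
               + (1 - x_true) * np.log(1 - p + eps)).sum()
        return ce + lam * np.abs(p).sum()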

    Beyond Exchangeability: The Chinese Voting Process

    Moontae Lee, Seok Hyun Jin, David Mimno
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Learning (cs.LG)

    Many online communities present user-contributed responses such as reviews of
    products and answers to questions. User-provided helpfulness votes can
    highlight the most useful responses, but voting is a social process that can
    gain momentum based on the popularity of responses and the polarity of existing
    votes. We propose the Chinese Voting Process (CVP) which models the evolution
    of helpfulness votes as a self-reinforcing process dependent on position and
    presentation biases. We evaluate this model on Amazon product reviews and more
    than 80 StackExchange forums, measuring the intrinsic quality of individual
    responses and behavioral coefficients of different communities.

    Dynamic matrix recovery from incomplete observations under an exact low-rank constraint

    Liangbei Xu, Mark A. Davenport
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Low-rank matrix factorizations arise in a wide variety of applications —
    including recommendation systems, topic models, and source separation, to name
    just a few. In these and many other applications, it has been widely noted that
    by incorporating temporal information and allowing for the possibility of
    time-varying models, significant improvements are possible in practice.
    However, despite the reported superior empirical performance of these dynamic
    models over their static counterparts, there is limited theoretical
    justification for introducing these more complex models. In this paper we aim
    to address this gap by studying the problem of recovering a dynamically
    evolving low-rank matrix from incomplete observations. First, we propose the
    locally weighted matrix smoothing (LOWEMS) framework as one possible approach
    to dynamic matrix recovery. We then establish error bounds for LOWEMS in both
    the {\em matrix sensing} and {\em matrix completion} observation models. Our
    results quantify the potential benefits of exploiting dynamic constraints both
    in terms of recovery accuracy and sample complexity. To illustrate these
    benefits we provide both synthetic and real-world experimental results.


    Information Theory

    Maximal Packing with Interference Constraints

    Rakshith Jagannath, Radha Krishna Ganti, Neelesh S Upadhye
    Subjects: Information Theory (cs.IT); Applications (stat.AP)

    In this work, we study the problem of scheduling a maximal set of
    transmitters subject to an interference constraint across all the nodes.
    Given a set of nodes, the problem reduces to finding the maximum cardinality of
    a subset of nodes that can concurrently transmit without violating interference
    constraints. The resulting packing problem is a binary optimization problem and
    is NP hard. We propose a semi-definite relaxation (SDR) for this problem and
    provide bounds on the relaxation.

    Joint Transceiver and Power Splitter Design Over Two-Way Relaying Channel with Lattice Codes and Energy Harvesting

    Zhigang Wen, Shuai Wang, Chunxiao Fan, Weidong Xiang
    Comments: This version corrects some errors in the published letter
    Journal-ref: IEEE Communications Letters, vol. 18, no. 11, pp. 2039-2042
    Subjects: Information Theory (cs.IT)

    This letter considers a compute-and-forward two-way relaying channel with
    simultaneous wireless information and power transfer. Specifically, two
    single-antenna users exchange information via a multi-antenna relay station
    based on nested lattice codes. Meanwhile, wireless energies flow from the relay
    to users for circuit consumption and uplink transmission. Based on this model,
    an optimization problem is formulated to minimize the transmit power at relay,
    while guaranteeing the minimal transmission rate at each user. To solve the
    problem, we propose an efficient iterative algorithm to jointly optimize the
    transmitter, receiver and power splitter, based on semi-definite relaxation and
    semi-definite programming. Numerical results of relay transmission powers
    validate our analysis.

    Massive Machine-Type Communication (mMTC) Access with Integrated Authentication

    Nuno K. Pratas, Sarath Pattathil, Cedomir Stefanovic, Petar Popovski
    Subjects: Information Theory (cs.IT)

    We present a connection establishment protocol with integrated
    authentication, suited for Massive Machine-Type Communications (mMTC). The
    protocol is contention-based and its main feature is that a device contends
    with a unique signature that also enables the authentication of the device
    towards the network. The signatures are inspired by Bloom filters and are
    created based on the output of the MILENAGE authentication and encryption
    algorithm set, which is used in the authentication and security procedures in
    the LTE protocol family. We show that our method utilizes the system resources
    more efficiently, achieves lower latency of connection establishment for
    Poisson arrivals and allows an (86\%) signalling overhead reduction. An
    important conclusion is that the mMTC traffic benefits profoundly from
    integration of security features into the connection establishment/access
    protocols, instead of addressing them post-hoc, which has been a common
    practice.

    Generalized Solution for the Demodulation of Reaction Shift Keying Signals in Molecular Communication Networks

    Hamdan Awan, Chun Tung Chou
    Comments: 34 Pages , 9 Figures
    Subjects: Information Theory (cs.IT)

    This paper considers a diffusion-based molecular communication system where
    the transmitter uses Reaction Shift Keying (RSK) as the modulation scheme. We
    focus on the demodulation of RSK signal at the receiver. The receiver consists
    of a front-end molecular circuit and a back-end demodulator. The front-end
    molecular circuit is a set of chemical reactions consisting of multiple
    chemical species. The optimal demodulator computes the posterior probability
    of the transmitted symbols given the history of observations. The derivation
    of the optimal demodulator requires the solution to a specific Bayesian
    filtering problem. The solution to this Bayesian filtering problem had
    previously been derived only for a few specific molecular circuits and
    specific choices of observed chemical species, and its derivation is
    lengthy. The
    key contribution of this paper is to present a general solution to this
    Bayesian filtering problem which can be applied to any molecular circuit and
    any choice of observed species.

    Fast Construction of Polar Codes

    Wei Wang, Liping Li
    Comments: 5 pages, 3 figures, submitted to WCNC 2017
    Subjects: Information Theory (cs.IT)

    The construction of polar codes for channels other than BECs requires sorting
    of all bit channels and then selecting the best (K) of them for a block length
    (N=2^n). The sorting algorithms, be it density evolution or Tal-Vardy's
    algorithm, typically require intense computations. In this paper, two types of
    partial orders (PO) of polar codes are incorporated in the construction process
    to decrease the required computations. Three sets, corresponding to the good
    bit channels ((\mathcal{I})), the frozen bit channels ((\mathcal{F})), and the
    undetermined bit channels ((\mathcal{U})), are selected by applying PO
    relations. A new process, called Dimension Reduction (DR), is proposed in this
    paper to further reduce the size of (\mathcal{U}). Our studies show that for
    (N=2^{10}) and the code rate (R=0.5) (being the worst code rate), incorporating
    PO relations alone can determine 50\% of the bit channels ((|\mathcal{I}| +
    |\mathcal{F}| \approx N/2)), resulting in only 50\% of the sorting
    calculations. With our proposed DR, the fraction of determined bit channels
    goes up to 75\%, which brings a significant reduction of computations in the
    construction of polar codes.

    On Sequential Locally Repairable Codes

    Wentu Song, Kai Cai, Chau Yuen
    Subjects: Information Theory (cs.IT)

    We consider locally repairable codes (LRC), aiming at the sequential
    recovery of multiple erasures. We define an (n,k,r,t)-SLRC (Sequential Locally
    Repairable Code) as an [n,k] linear code where any t' (t' <= t) erasures can be
    sequentially recovered, each one by r (2 <= r < k) other code symbols.
    Sequential recovering means that the erased symbols are recovered one by one,
    and an already recovered symbol can be used to recover the remaining erased
    symbols. This important recovery method, in contrast with the vastly studied
    parallel recovery, is currently far from being well understood; in particular,
    codes constructed for arbitrary t >= 3 erasures and bounds to evaluate the
    performance of such codes are lacking.

    We first derive a tight upper bound on the code rate of (n, k, r, t)-SLRCs
    for t=3 and r >= 2. We then propose two constructions of binary (n, k, r,
    t)-SLRCs for general r, t >= 2 (existing constructions only deal with t <= 7
    erasures). The first construction generalizes the method of direct product
    construction. The second construction is based on resolvable configurations
    and yields SLRCs for any r >= 2 and odd t >= 3. For both constructions, the
    rates are optimal for t in {2,3} and are higher than most of the existing LRC
    families for arbitrary t >= 4.

    Energy-Efficient Heterogeneous Cellular Networks with Spectrum Underlay and Overlay Access

    Jie Tang, Daniel K. C. So, Emad Alsusa, Khairi Ashour Hamdi, Arman Shojaeifard, Kai-Kit Wong
    Subjects: Information Theory (cs.IT)

    In this paper, we provide joint subcarrier assignment and power allocation
    schemes for quality-of-service (QoS)-constrained energy-efficiency (EE)
    optimization in the downlink of an orthogonal frequency division multiple
    access (OFDMA)-based two-tier heterogeneous cellular network (HCN). Considering
    underlay transmission, where spectrum-efficiency (SE) is fully exploited, the
    EE solution involves tackling a complex mixed-combinatorial and non-convex
    optimization problem. With appropriate decomposition of the original problem
    and leveraging on the quasi-concavity of the EE function, we propose a
    dual-layer resource allocation approach and provide a complete solution using
    difference-of-two-concave-functions approximation, successive convex
    approximation, and gradient-search methods. On the other hand, the inherent
    inter-tier interference from spectrum underlay access may degrade EE
    particularly under dense small-cell deployment and large bandwidth utilization.
    We therefore develop a novel resource allocation approach based on the concepts
    of spectrum overlay access and resource efficiency (RE) (normalized EE-SE
    trade-off). Specifically, the optimization procedure is separated in this case
    such that the macro-cell optimal RE and corresponding bandwidth is first
    determined, then the EE of small-cells utilizing the remaining spectrum is
    maximized. Simulation results confirm the theoretical findings and demonstrate
    that the proposed resource allocation schemes can approach the optimal EE with
    each strategy being superior under certain system settings.

    Hybrid Analog and Digital Precoding: From Practical RF System Models to Information Theoretic Bounds

    Vijay Venkateswaran, Rajet Krishnan
    Comments: Accepted in Globecom 16 WS
    Subjects: Information Theory (cs.IT)

    Hybrid analog-digital precoding is a key millimeter wave access technology,
    where an antenna array with a reduced number of radio frequency (RF) chains is
    used with an RF precoding matrix to increase antenna gain at a reasonable cost.
    However, digital and RF precoder algorithms must be accompanied by a detailed
    system model of the RF precoder. In this work, we provide fundamental RF system
    models for these precoders, and show their impact on achievable rates. We show
    that hybrid precoding systems suffer from significant degradation once the
    limitations of the RF precoding network are accounted for. We subsequently quantify
    this performance degradation, and use it as a reference for comparing the
    performance of different precoding methods. These results indicate that hybrid
    precoders must be redesigned (and their rates recomputed) to account for
    practical factors.

    Downlink Achievable Rate Analysis in Massive MIMO Systems with One-Bit DACs

    Yongzhi Li, Cheng Tao, A. Lee Swindlehurst, Amine Mezghani, Liu Liu
    Comments: 5 pages, 4 figures
    Subjects: Information Theory (cs.IT)

    In this letter, we investigate the downlink performance of massive
    multiple-input multiple-output (MIMO) systems where the base station is
    equipped with one-bit analog-to-digital/digital-to-analog converters
    (ADC/DACs). Considering training-based transmission, we assume the base station
    (BS) employs the linear minimum mean-squared-error (LMMSE) channel estimator
    and treats the channel estimate as the true channel to precode the data
    symbols. We derive an expression for the downlink achievable rate for
    matched-filter (MF) precoding. A detailed analysis of the resulting power
    efficiency is pursued using our expression of the achievable rate. Numerical
    results are presented to verify our analysis. In particular it is shown that,
    compared with conventional massive MIMO systems, the performance loss in
    one-bit massive MIMO systems can be compensated for by deploying 2.5 times more
    antennas at the BS.

    Mobile Millimeter Wave Channel Acquisition, Tracking, and Abrupt Change Detection

    Chuang Zhang, Dongning Guo, Pingyi Fan
    Comments: This paper is submitted
    Subjects: Information Theory (cs.IT)

    Millimeter wave provides a promising approach for meeting the ever-growing
    traffic demand in next generation wireless networks. It is crucial to obtain
    relatively accurate channel state information so that beamforming/combining can
    be performed to compensate for severe path loss in this band. In contrast to
    lower frequencies, a typical mobile millimeter wave channel consists of a few
    dominant paths. It is generally sufficient to estimate the path gains, angles
    of departure (AoD), and angles of arrival (AoA) of those paths. In this paper,
    multiple transmit and receive antennas and beamforming with a single baseband
    processing chain are assumed. We propose a framework for estimating millimeter
    wave channels with intermittent abrupt changes (e.g., blockage or emergence of
    dominant paths) and slow variations of AoDs and AoAs. The solution consists of
    three components: tracking of the slow channel variations, detection of abrupt
    changes, followed by (re-)acquisition of channel (and back to the tracking
    stage). For acquisition, we formulate a least squares problem and find its
    solution based on the Levenberg-Marquardt algorithm. To track slow variations
    of AoDs and AoAs, we propose a new approach using Kalman filtering. Finally, an
    algorithm based on a likelihood test is devised for detecting abrupt changes.
    Simulation results show that, with moderate signal-to-noise ratios, the
    proposed scheme can achieve more than 8 dB higher estimation accuracy than
    several other methods using the same number of pilots.
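
    The tracking component can be pictured with a scalar Kalman filter under a
    random-walk angle model, as in the sketch below. This is a deliberate
    simplification: the paper tracks AoDs and AoAs jointly from beamformed
    observations, which this sketch does not model.

    import numpy as np

    def track_angle(measurements, q=1e-4, r=1e-2, theta0=0.0, p0=1.0):
        """Scalar Kalman filter for a slowly drifting angle.

        State:       theta_t = theta_{t-1} + w_t,  w_t ~ N(0, q)
        Observation: y_t     = theta_t + v_t,      v_t ~ N(0, r)
        """
        theta, p, est = theta0, p0, []
        for y in measurements:
            p = p + q                          # predict
            k = p / (p + r)                    # Kalman gain
            theta = theta + k * (y - theta)    # correct with the innovation
            p = (1.0 - k) * p
            est.append(theta)
        return np.array(est)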

    Bit Allocation for Increased Power Efficiency in 5G Receivers with Variable-Resolution ADCs

    Waqas bin Abbas, Felipe Gomez-Cuba, Michele Zorzi
    Subjects: Information Theory (cs.IT)

    In future high-capacity wireless systems based on mmWave or massive multiple
    input multiple output (MIMO), the power consumption of receiver Analog to
    Digital Converters (ADC) is a concern. Although hybrid or analog systems with
    fewer ADCs have been proposed, fully digital receivers with many lower
    resolution ADCs (and lower power) may be a more versatile solution. In this
    paper, focusing on an uplink scenario, we propose to take the optimization of
    ADC resolution one step further by enabling variable resolutions in the ADCs
    that sample the signal received at each antenna. This makes it possible to give more bits
    to the antennas that capture the strongest incoming signal and fewer bits to
    the antennas that capture little signal energy and mostly noise. Simulation
    results show that, depending on the unquantized link SNR, a power saving on the
    order of 20-80% can be obtained by our variable resolution proposal in
    comparison with a reference fully digital receiver with a fixed low number of
    bits in all its ADCs.
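
    One possible greedy realization of this idea is sketched below. The
    2^(-2b) quantization-distortion model and the greedy rule are illustrative
    assumptions, not the paper's optimization procedure.

    import numpy as np

    def allocate_bits(snr, total_bits, b_min=1, b_max=8):
        """Allocate ADC bits across antennas under a total bit budget.

        Assumes the distortion of a b-bit ADC scales like 2^(-2b) and
        greedily gives the next bit wherever it helps most, so antennas
        with stronger incoming signal end up with more bits.
        """
        n = len(snr)
        bits = np.full(n, b_min)
        for _ in range(total_bits - n * b_min):
            gain = snr * (2.0 ** (-2 * bits) - 2.0 ** (-2 * (bits + 1)))
            gain[bits >= b_max] = -np.inf      # respect the resolution cap
            bits[np.argmax(gain)] += 1
        return bits

    print(allocate_bits(np.array([10.0, 1.0, 0.1, 5.0]), total_bits=12))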

    Solving Large-scale Systems of Random Quadratic Equations via Stochastic Truncated Amplitude Flow

    Gang Wang, Georgios B. Giannakis, Jie Chen
    Comments: 21 pages, 12 figures
    Subjects: Information Theory (cs.IT); Optimization and Control (math.OC); Machine Learning (stat.ML)

    A novel approach termed \emph{stochastic truncated amplitude flow} (STAF) is
    developed to reconstruct an unknown (n)-dimensional real-/complex-valued signal
    (\bm{x}) from (m) `phaseless' quadratic equations of the form
    (\psi_i=|\langle\bm{a}_i,\bm{x}\rangle|). This problem, also known as phase
    retrieval from magnitude-only information, is \emph{NP-hard} in general.
    Adopting an amplitude-based nonconvex formulation, STAF leads to an iterative
    solver comprising two stages: s1) Orthogonality-promoting initialization
    through a stochastic variance reduced gradient algorithm; and, s2) A series of
    iterative refinements of the initialization using stochastic truncated gradient
    iterations. Both stages involve a single equation per iteration, thus rendering
    STAF a simple, scalable, and fast approach amenable to large-scale
    implementations that is useful when (n) is large. When (\{\bm{a}_i\}_{i=1}^m)
    are independent Gaussian, STAF provably recovers exactly any
    (\bm{x}\in\mathbb{R}^n) exponentially fast based on order of (n) quadratic
    equations. STAF is also robust in the presence of additive noise of bounded
    support. Simulated tests involving real Gaussian (\{\bm{a}_i\}) vectors
    demonstrate that STAF empirically reconstructs any (\bm{x}\in\mathbb{R}^n)
    exactly from about (2.3n) magnitude-only measurements, outperforming
    state-of-the-art approaches and narrowing the gap from the
    information-theoretic number of equations (m=2n-1). Extensive experiments using
    synthetic data and real images corroborate markedly improved performance of
    STAF over existing alternatives.
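
    For intuition, one stochastic update of stage s2) on the amplitude-based
    least-squares loss could look like the sketch below. The magnitude test
    stands in for the paper's truncation rule, and the step size is left to
    the caller; both are assumptions of this sketch.

    import numpy as np

    def amplitude_flow_step(x, a_i, psi_i, mu, trunc=0.1):
        """One stochastic gradient step on f_i(x) = (|a_i^T x| - psi_i)^2 / 2.

        The gradient involves sign(a_i^T x); equations whose current inner
        product is too small to give a reliable sign are skipped.
        """
        s = a_i @ x
        if np.abs(s) < trunc * psi_i:          # unreliable sign: truncate
            return x
        return x - mu * (s - psi_i * np.sign(s)) * a_i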

    Power-Efficient Multi-Quality Multicast Beamforming based on Scalable Video Coding and Superposition Coding

    Chengjun Guo, Ying Cui, Derrick Wing Kwan Ng
    Comments: 26 pages, submitted to ICC 2017
    Subjects: Information Theory (cs.IT)

    In this paper, we consider multi-quality multicast of a video stream from a
    multi-antenna base station (BS) to multiple groups of single-antenna users
    requiring different qualities of the video stream, using scalable video coding
    (SVC). Leveraging the layered structure of SVC and exploiting superposition
    coding (SC) as well as successive interference cancelation (SIC), we propose a
    power-efficient layer-based multi-quality multicast beamforming scheme and a
    power-efficient quality-based multi-quality multicast beamforming scheme. For
    each scheme, we formulate the corresponding optimal beamforming design as a
    non-convex power minimization problem, and obtain a globally optimal solution
    in a special case as well as a locally optimal solution in the general case.
    Then, we show that the minimum total transmission power of the quality-based
    optimization problem is the same as that of the layer-based optimization
    problem, but the computational complexity for solving the quality-based
    optimization problem is lower than that for solving the layer-based optimization
    problem. Finally, numerical results show that the proposed solutions achieve
    better performance than existing solutions.

    Network Coding-based Caching in Large-Scale SIC-Enabled Wireless Networks

    Dongdong Jiang, Ying Cui
    Comments: 6 pages, submitted to ICC 2017
    Subjects: Information Theory (cs.IT)

    Network coding-based caching at base stations (BSs) is a promising caching
    approach to support massive content delivery over wireless networks. However,
    existing network coding-based caching designs do not fully explore and exploit
    the potential advantages. In this paper, we consider the analysis and
    optimization of a random linear network coding-based caching design in
    large-scale successive interference cancelation (SIC)-enabled wireless
    networks. By utilizing tools from stochastic geometry, we derive a tractable
    expression for the successful transmission probability in the general file size
    regime. To further obtain design insights, we also derive closed-form
    expressions for the successful transmission probability in the small and large
    file size regimes, respectively. Then, we consider the successful transmission
    probability maximization by optimizing a design parameter, which is a complex
    discrete optimization problem. We propose a two-stage optimization framework
    and obtain a near optimal solution with superior performance and manageable
    complexity. The analysis and optimization results provide valuable design
    insights for practical cache- and SIC-enabled wireless networks. Finally, by
    numerical results, we show that the proposed near optimal caching design
    achieves a significant performance gain over some baseline caching designs.

    Resource Management in Non-orthogonal Multiple Access Systems: State of the Art and Research Challenges

    Lingyang Song, Yonghui Li, Zhiguo Ding, H. Vincent Poor
    Comments: 15 pages, 3 figures, IEEE Network, 2017
    Subjects: Information Theory (cs.IT)

    Non-orthogonal multiple access (NOMA) schemes have been proposed for the next
    generation of mobile communication systems to improve the access efficiency by
    allowing multiple users to share the same spectrum in a non-orthogonal way.
    The strong co-channel interference among mobile users introduced by NOMA
    poses significant challenges for system design and resource management. This
    article reviews resource management issues in NOMA systems. The main taxonomy
    of NOMA is presented by focusing on the following two categories: power-domain
    NOMA and code-domain NOMA. Then a novel radio resource management framework is
    presented based on game-theoretic models for uplink and downlink transmissions.
    Finally, potential applications and open research directions in the area of
    resource management for NOMA are provided.

    Sparse Signal Recovery for Binary Compressed Sensing by Majority Voting Neural Networks

    Daisuke Ito, Tadashi Wadayama
    Comments: 6 pages
    Subjects: Information Theory (cs.IT); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose majority voting neural networks for sparse signal
    recovery in binary compressed sensing. The majority voting neural network is
    composed of several independently trained feedforward neural networks employing
    the sigmoid function as an activation function. Our empirical study shows that
    the choice of the loss function used in the training process is of
    prime importance. We found a loss function suitable for sparse signal recovery,
    which includes a cross entropy-like term and an (L_1) regularized term. From
    the experimental results, we observed that the majority voting neural network
    achieves excellent recovery performance, which approaches the optimal
    performance as the number of component nets grows. The simple architecture of
    the majority voting neural networks would be beneficial for both software and
    hardware implementations.

    Degrees of Freedom in Wireless Interference Networks with Cooperative Transmission and Backhaul Load Constraints

    Meghana Bande, Aly El Gamal, Venugopal V. Veeravalli
    Comments: submitted to IEEE Transactions on Information Theory
    Subjects: Information Theory (cs.IT)

    Degrees of freedom (DoF) gains are studied in wireless networks with
    cooperative transmission under a backhaul load constraint that limits the
    average number of messages that can be delivered from a centralized controller
    to the base station transmitters. The backhaul load is defined as the sum of
    all the messages available at all the transmitters per channel use, normalized
    by the number of users. For Wyner’s linear interference network, where each
    transmitter is connected to the receiver having the same index as well as one
    succeeding receiver, the per user DoF is characterized and the optimal scheme
    is presented. When the backhaul load is constrained to an integer level B, the
    asymptotic per user DoF is shown to equal (4B-1)/(4B). Furthermore, it is shown
    that the optimal assignment of messages to transmitters is asymmetric and
    satisfies a local cooperation constraint, and that the optimal coding scheme
    relies only on one-shot zero-forcing transmit beamforming. The results are then
    extended to locally connected linear interference networks and the more
    practical hexagonal sectored cellular networks. Our main conclusion for all the
    studied network models is that by allowing for cooperative transmission and a
    flexible message assignment that is constrained only by an average backhaul
    load, one can deliver the rate gains promised by information-theoretic upper
    bounds with practical one-shot schemes that incur little or no additional load
    on the backhaul.

    Spatial-Temporal BEM and Channel Estimation Strategy for Massive MIMO Time-Varying Systems

    Hongxiang Xie, Feifei Gao, Shun Zhang, Shi Jin
    Comments: 6 pages, 6 figures, accepted by IEEE GLOBECOM2016
    Subjects: Information Theory (cs.IT)

    This paper proposes a new channel estimation scheme for the multiuser massive
    multiple-input multiple-output (MIMO) systems in time-varying environment. We
    introduce a discrete Fourier transform (DFT) aided spatial-temporal basis
    expansion model (ST-BEM) to reduce the effective dimensions of uplink/downlink
    channels, such that training overhead and feedback cost could be greatly
    decreased. The newly proposed ST-BEM is suitable for both time division duplex
    (TDD) systems and frequency division duplex (FDD) systems thanks to the angle
    reciprocity, and can be efficiently deployed by fast Fourier transform (FFT).
    Various numerical results corroborate the proposed scheme.
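
    The spatial part of the dimension reduction can be pictured as projecting
    the array channel onto a few dominant DFT directions, as in the NumPy
    sketch below. This is illustrative only: ST-BEM combines the spatial basis
    with a temporal one, which is omitted here.

    import numpy as np

    def dft_compress(h, num_coeffs):
        """Represent an M-dimensional array channel with a few DFT coefficients.

        For a large uniform linear array, channel energy concentrates on a
        few DFT columns (spatial directions); keeping only those shrinks the
        dimension that training and feedback must cover.
        """
        M = h.shape[0]
        F = np.fft.fft(np.eye(M)) / np.sqrt(M)      # unitary DFT basis
        g = F.conj().T @ h                          # coefficients in the basis
        idx = np.argsort(np.abs(g))[-num_coeffs:]   # dominant directions
        return F[:, idx] @ g[idx], idx              # low-dim reconstruction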

    On Achievability for Downlink Cloud Radio Access Networks with Base Station Cooperation

    Chien-Yi Wang, Michèle Wigger, Abdellatif Zaidi
    Comments: 25 pages, 18 figures, submitted to IEEE Transactions on Information Theory
    Subjects: Information Theory (cs.IT)

    This work investigates the downlink of a cloud radio access network (C-RAN)
    in which a central processor communicates with two mobile users through two
    base stations (BSs). The BSs act as relay nodes and cooperate with each other
    through error-free rate-limited links. We develop and analyze two coding
    schemes for this scenario. The first coding scheme is based on the Liu-Kang
    scheme for C-RANs without BS cooperation and extends it to scenarios allowing
    conferencing between the BSs. Among other features, our new coding scheme
    enables arbitrary correlation among the auxiliary codewords that are recovered
    by the BSs. It also introduces common codewords to be described to both BSs.
    For the analysis of this coding scheme, we extend the multivariate covering
    lemma to non-Cartesian product sets, thereby correcting an erroneous
    application of this lemma in Liu-Kang’s related work. We highlight key aspects
    of this scheme by studying three important instances of it. The second coding
    scheme extends the so-called compression scheme that was originally developed
    for memoryless Gaussian C-RANs without BS cooperation to general discrete
    memoryless C-RANs with BS cooperation. We show that this scheme subsumes the
    original compression scheme when applied to memoryless Gaussian C-RAN models.
    In the analysis of this scheme, we also highlight important connections with
    the so-called distributed decode–forward scheme, and refine the approximate
    capacity of a general (N)-BS (L)-user C-RAN model in the memoryless Gaussian
    case.

    Gaussian states minimize the output entropy of one-mode quantum Gaussian channels

    Giacomo De Palma, Dario Trevisan, Vittorio Giovannetti
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Mathematical Physics (math-ph); Probability (math.PR)

    We prove that Gaussian thermal input states minimize the output von Neumann
    entropy of any one-mode phase-covariant quantum Gaussian channel among all the
    input states with a given entropy. The starting point of the proof is the
    recent result stating that Gaussian thermal input states saturate the p->q
    norms of one-mode quantum-limited Gaussian channels. Quantum Gaussian channels
    model in the quantum regime the attenuation and the noise that affect any
    electromagnetic signal. This result is crucial to prove the converse theorems
    for both the triple trade-off region and the capacity region for broadcast
    communication of the Gaussian quantum-limited amplifier.

    Cognitive Access Protocol for Alleviating Sensing Errors in Cognitive Multiple-Access Systems

    Ahmed El Shafie
    Comments: Published in IEEE Wireless Communications Letters: IEEE Wireless Communications Letters, Vol. 3, No. 3, June 2014
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    This letter studies a time-slotted multiple-access system with a primary user
    (PU) and a secondary user (SU) sharing the same channel resource. We propose a
    novel secondary access protocol which alleviates sensing errors and detects the
    availability of primary channels with the highest ability of detection. Under
    the proposed protocol, the SU may access the channel at one of a predefined
    instants within the time slot each of which associated with a certain access
    probability that changes based on the sensing outcome. There is also a
    possibility of accessing the channel at the beginning of the time slot without
    channel sensing. The optimization problem is stated such that the secondary
    throughput is maximized under stability of the primary queue and a constraint
    on the primary queueing delay. Numerical results demonstrate the beneficial
    gains of the proposed protocol in terms of secondary throughput.



