
    arXiv Paper Daily: Tue, 4 Oct 2016

    Published by 我爱机器学习 (52ml.net) on 2016-10-04 00:00:00

    Neural and Evolutionary Computing

    Superconducting optoelectronic circuits for neuromorphic computing

    Jeffrey M. Shainline, Sonia M. Buckley, Richard P. Mirin, Sae Woo Nam
    Comments: 34 pages, 22 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Superconductivity (cond-mat.supr-con); Optics (physics.optics)

    We propose a hybrid semiconductor-superconductor hardware platform for the
    implementation of neural networks and large-scale neuromorphic computing. The
    platform combines semiconducting few-photon light-emitting diodes with
    superconducting-nanowire single-photon detectors to behave as spiking neurons.
    These processing units are connected via a network of optical waveguides, and
    variable weights of connection can be implemented using several approaches. The
    use of light as a signaling mechanism overcomes the requirement for
    time-multiplexing that has limited the event rates of purely electronic
    platforms. The proposed processing units can operate at $20$ MHz with fully
    asynchronous activity, light-speed-limited latency, and power densities on the
    order of 1 mW/cm$^2$ for neurons with 700 connections operating at full speed
    at 2 K. The processing units achieve an energy efficiency of $\approx 20$ aJ
    per synapse event. By leveraging multilayer photonics with
    low-temperature-deposited waveguides and superconductors with feature sizes $>$
    100 nm, this approach could scale to massive interconnectivity near that of the
    human brain, and could surpass the brain in speed and energy efficiency.
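
    As a quick plausibility check of the quoted figures, the energy, fan-out,
    and rate numbers above pin down the power per neuron and an implied areal
    neuron density (a back-of-envelope sketch; the density is our own
    inference, not a number from the abstract):

    ```python
    # Values taken from the abstract: 20 aJ per synapse event, 700 connections
    # per neuron, 20 MHz full-speed event rate, 1 mW/cm^2 power density.
    E_SYNAPSE = 20e-18      # J per synapse event
    CONNECTIONS = 700       # synapses per neuron
    RATE = 20e6             # Hz, full-speed event rate
    P_DENSITY = 1e-3        # W/cm^2

    p_neuron = E_SYNAPSE * CONNECTIONS * RATE       # ~2.8e-7 W per neuron
    neurons_per_cm2 = P_DENSITY / p_neuron          # implied neuron density
    print(f"power per neuron : {p_neuron * 1e6:.2f} uW")
    print(f"implied density  : {neurons_per_cm2:.0f} neurons/cm^2")  # ~3600
    ```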

    Sentiment Analysis on Bangla and Romanized Bangla Text (BRBT) using Deep Recurrent models

    A. Hassan, N. Mohammed, A. K. A. Azad
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Sentiment Analysis (SA) is an active research area in the digital age. With
    the rapid and constant growth of online social media sites and services, and
    the increasing amount of textual data available in them, such as statuses,
    comments, and reviews, the application of automatic SA is on the rise.
    However, most research on SA in natural language processing (NLP) is based
    on the English language. Despite being the sixth most widely spoken language in the
    world, Bangla still does not have a large and standard dataset. Because of
    this, recent research works in Bangla have failed to produce results that can
    be both comparable to works done by others and reusable as stepping stones for
    future researchers to progress in this field. Therefore, we first provide a
    textual dataset that includes not just Bangla but also Romanized Bangla
    texts, and that is substantial, post-processed, validated multiple times, and
    ready to be used in SA experiments. We tested this dataset with a deep
    recurrent model, specifically Long Short-Term Memory (LSTM), using two types
    of loss functions, binary cross-entropy and categorical cross-entropy, and
    also did some experimental pre-training, using data from one validation set
    to pre-train the other and vice versa. Lastly, we document the results,
    which were promising, along with some analysis.
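
    A minimal PyTorch sketch of the kind of model the abstract describes: an
    LSTM classifier over token ids trained with binary cross-entropy. The
    vocabulary size, dimensions, and dummy batch are illustrative assumptions,
    not the paper's configuration.

    ```python
    import torch
    import torch.nn as nn

    class LSTMSentiment(nn.Module):
        def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, 1)

        def forward(self, token_ids):
            x = self.embed(token_ids)              # (batch, seq, embed)
            _, (h_n, _) = self.lstm(x)             # final hidden state
            return self.head(h_n[-1]).squeeze(-1)  # logits, shape (batch,)

    model = LSTMSentiment()
    loss_fn = nn.BCEWithLogitsLoss()            # binary cross-entropy on logits
    batch = torch.randint(1, 20000, (4, 50))    # 4 dummy sequences of 50 tokens
    labels = torch.tensor([1., 0., 1., 1.])
    loss = loss_fn(model(batch), labels)
    loss.backward()
    ```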

    Accelerating Deep Convolutional Networks using low-precision and sparsity

    Ganesh Venkatesh, Eriko Nurvitadhi, Debbie Marr
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We explore techniques to significantly improve the compute efficiency and
    performance of Deep Convolutional Networks without impacting their accuracy. To
    improve the compute efficiency, we focus on achieving high accuracy with
    extremely low-precision (2-bit) weight networks, and to accelerate the
    execution time, we aggressively skip operations on zero-values. We achieve the
    highest reported accuracy of 76.6% Top-1/93% Top-5 on the ImageNet object
    classification challenge with a low-precision network (GitHub release of
    the source code coming soon) while reducing the compute requirement by ~3x
    compared to a full-precision network that achieves similar accuracy.
    Furthermore, to fully exploit the benefits of our low-precision networks, we
    build a deep learning accelerator core, dLAC, that can achieve up to 1
    TFLOP/mm^2 equivalent for single-precision floating-point operations (~2
    TFLOP/mm^2 for half-precision).
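
    The two ideas, 2-bit weights and skipping zero operands, can be illustrated
    in a few lines of NumPy. The ternary threshold rule below is a common
    quantization scheme assumed for illustration, not necessarily the paper's
    exact quantizer.

    ```python
    import numpy as np

    def quantize_2bit(w, thresh_ratio=0.7):
        """Map weights to the 2-bit-representable set {-a, 0, +a}."""
        t = thresh_ratio * np.abs(w).mean()
        big = np.abs(w) > t
        a = np.abs(w[big]).mean() if big.any() else 0.0
        return np.where(w > t, a, np.where(w < -t, -a, 0.0))

    def sparse_dot(w_q, x):
        """Dot product touching only positions where both operands are nonzero."""
        nz = (w_q != 0) & (x != 0)
        return float(np.dot(w_q[nz], x[nz])), int(nz.sum())

    w = np.random.randn(1024)
    x = np.maximum(np.random.randn(1024), 0)   # ReLU activations: many zeros
    y, mults = sparse_dot(quantize_2bit(w), x)
    print(f"used {mults}/{w.size} multiplies")  # skipped ops are the savings
    ```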

    Very Deep Convolutional Neural Networks for Raw Waveforms

    Wei Dai, Chia Dai, Shuhui Qu, Juncheng Li, Samarjit Das
    Comments: 5 pages, 2 figures, under submission to International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017
    Subjects: Sound (cs.SD); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Learning acoustic models directly from the raw waveform data with minimal
    processing is challenging. Current waveform-based models have generally used
    very few (~2) convolutional layers, which might be insufficient for building
    high-level discriminative features. In this work, we propose very deep
    convolutional neural networks (CNNs) that directly use time-domain waveforms as
    inputs. Our CNNs, with up to 34 weight layers, are efficient to optimize over
    very long sequences (e.g., vector of size 32000), necessary for processing
    acoustic waveforms. This is achieved through batch normalization, residual
    learning, and a careful design of down-sampling in the initial layers. Our
    networks are fully convolutional, without the use of fully connected layers and
    dropout, to maximize representation learning. We use a large receptive field in
    the first convolutional layer to mimic bandpass filters, but very small
    receptive fields subsequently to control the model capacity. We demonstrate the
    performance gains with the deeper models. Our evaluation shows that the CNN
    with 18 weight layers outperforms the CNN with 3 weight layers by over 15% in
    absolute accuracy for an environmental sound recognition task and matches the
    performance of models using log-mel features.
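
    A compact PyTorch sketch of the stated design principles: a wide first
    filter, early down-sampling, batch normalization, residual blocks, and a
    fully convolutional body with global pooling. Depths and widths here are
    illustrative, far smaller than the paper's 34-layer models.

    ```python
    import torch
    import torch.nn as nn

    class ResBlock1d(nn.Module):
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv1d(ch, ch, 3, padding=1), nn.BatchNorm1d(ch), nn.ReLU(),
                nn.Conv1d(ch, ch, 3, padding=1), nn.BatchNorm1d(ch))
        def forward(self, x):
            return torch.relu(x + self.body(x))    # residual learning

    model = nn.Sequential(
        nn.Conv1d(1, 32, kernel_size=80, stride=4),  # wide first receptive field
        nn.BatchNorm1d(32), nn.ReLU(),
        nn.MaxPool1d(4),                             # early down-sampling
        ResBlock1d(32), ResBlock1d(32),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),       # global pooling, no dropout
        nn.Linear(32, 10))                           # 10 sound classes (assumed)

    wave = torch.randn(2, 1, 32000)                  # raw 32000-sample waveforms
    print(model(wave).shape)                         # torch.Size([2, 10])
    ```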


    Computer Vision and Pattern Recognition

    Kernel Selection using Multiple Kernel Learning and Domain Adaptation in Reproducing Kernel Hilbert Space, for Face Recognition under Surveillance Scenario

    Samik Banerjee, Sukhendu Das
    Comments: 13 pages, 15 figures, 4 tables. Kernel Selection, Surveillance, Multiple Kernel Learning, Domain Adaptation, RKHS, Hallucination
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Learning (cs.LG)

    Face Recognition (FR) has been of interest to several researchers over the
    past few decades due to its passive nature as a means of biometric authentication. Despite
    high accuracy achieved by face recognition algorithms under controlled
    conditions, achieving the same performance for face images obtained in
    surveillance scenarios is a major hurdle. Some attempts have been made to
    super-resolve the low-resolution face images and improve the contrast, without
    a considerable degree of success. The technique proposed in this paper tries to
    cope with the very low resolution and low contrast face images obtained from
    surveillance cameras, for FR under surveillance conditions. For Support Vector
    Machine classification, the selection of an appropriate kernel has been a widely
    discussed issue in the research community. In this paper, we propose a novel
    kernel selection technique termed MFKL (Multi-Feature Kernel Learning) to
    obtain the best feature-kernel pairing. Our proposed technique performs
    effective kernel selection by the Multiple Kernel Learning (MKL) method, to
    choose the optimal kernel to be used along with an unsupervised domain
    adaptation method in the Reproducing Kernel Hilbert Space (RKHS), as a solution to the problem.
    Rigorous experimentation has been performed on three real-world surveillance
    face datasets: FR_SURV, SCface and ChokePoint. Results have been shown using
    Rank-1 Recognition Accuracy, ROC and CMC measures. Our proposed method
    outperforms all other recent state-of-the-art techniques by a considerable
    margin.
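
    As a much-simplified stand-in for the feature-kernel pairing idea, the loop
    below scores several candidate kernels by cross-validated SVM accuracy on
    precomputed Gram matrices and keeps the best; the paper's MFKL additionally
    learns kernel combinations and performs domain adaptation in the RKHS. The
    feature matrix and labels are placeholders.

    ```python
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score
    from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel

    X = np.random.randn(200, 64)              # placeholder face features
    y = np.random.randint(0, 2, 200)          # placeholder identities

    kernels = {
        "linear": linear_kernel(X),
        "rbf": rbf_kernel(X, gamma=0.1),
        "poly": polynomial_kernel(X, degree=3),
    }
    # score each candidate kernel with a precomputed-kernel SVM
    scores = {name: cross_val_score(SVC(kernel="precomputed"), K, y, cv=5).mean()
              for name, K in kernels.items()}
    best = max(scores, key=scores.get)
    print(best, scores[best])
    ```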

    Video Pixel Networks

    Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
    Comments: 16 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a probabilistic video model, the Video Pixel Network (VPN), that
    estimates the discrete joint distribution of the raw pixel values in a video.
    The model and the neural architecture reflect the time, space and color
    structure of video tensors and encode it as a four-dimensional dependency
    chain. The VPN approaches the best possible performance on the Moving MNIST
    benchmark, a leap over the previous state of the art, and the generated videos
    show only minor deviations from the ground truth. The VPN also produces
    detailed samples on the action-conditional Robotic Pushing benchmark and
    generalizes to the motion of novel objects.

    Rain structure transfer using an exemplar rain image for synthetic rain image generation

    Chang-Hwan Son, Xiao-Ping Zhang
    Comments: 6 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This letter proposes a simple method of transferring rain structures of a
    given exemplar rain image into a target image. Given the exemplar rain image
    and its corresponding masked rain image, rain patches including rain structures
    are extracted randomly, and then residual rain patches are obtained by
    subtracting those rain patches from their mean patches. Next, residual rain
    patches are selected randomly, and then added to the given target image along a
    raster scanning direction. To decrease boundary artifacts around the added
    patches on the target image, minimum error boundary cuts are found using
    dynamic programming, and then blending is conducted between overlapping
    patches. Our experiment shows that the proposed method can generate realistic
    rain images with rain structures similar to those in the exemplar images. Moreover,
    it is expected that the proposed method can be used for rain removal. More
    specifically, natural images and synthetic rain images generated via the
    proposed method can be used to learn classifiers, for example, deep neural
    networks, in a supervised manner.
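
    The minimum error boundary cut mentioned above is the classic
    image-quilting dynamic program; a sketch over the overlap strip between an
    existing patch and an incoming rain patch (placeholder data):

    ```python
    import numpy as np

    def min_error_boundary_cut(err):
        """err: (H, W) error surface over the overlap strip; returns one column
        index per row forming the minimum-cost top-to-bottom seam."""
        H, W = err.shape
        cost = err.copy()
        for i in range(1, H):                       # accumulate seam costs
            for j in range(W):
                lo, hi = max(j - 1, 0), min(j + 2, W)
                cost[i, j] += cost[i - 1, lo:hi].min()
        seam = np.zeros(H, dtype=int)               # backtrack the best seam
        seam[-1] = int(np.argmin(cost[-1]))
        for i in range(H - 2, -1, -1):
            j = seam[i + 1]
            lo, hi = max(j - 1, 0), min(j + 2, W)
            seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
        return seam

    overlap_a = np.random.rand(32, 8)   # existing patch strip (placeholder)
    overlap_b = np.random.rand(32, 8)   # incoming rain patch strip
    seam = min_error_boundary_cut((overlap_a - overlap_b) ** 2)
    # pixels left of the seam keep patch A, pixels right of it take patch B
    ```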

    On the Empirical Effect of Gaussian Noise in Under-sampled MRI Reconstruction

    Patrick Virtue, Michael Lustig
    Comments: 24 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

    In Fourier-based medical imaging, sampling below the Nyquist rate results in
    an underdetermined system, in which linear reconstructions will exhibit
    artifacts. Another consequence of under-sampling is lower signal to noise ratio
    (SNR) due to fewer acquired measurements. Even if an oracle provided the
    information to perfectly disambiguate the underdetermined system, the
    reconstructed image could still have lower image quality than a corresponding
    fully sampled acquisition because of the reduced measurement time. The effects
    of lower SNR and the underdetermined system are coupled during reconstruction,
    making it difficult to isolate the impact of lower SNR on image quality. To
    this end, we present an image quality prediction process that reconstructs
    fully sampled, fully determined data with noise added to simulate the loss of
    SNR induced by a given under-sampling pattern. The resulting prediction image
    empirically shows the effect of noise in under-sampled image reconstruction
    without any effect from an underdetermined system.

    We discuss how our image quality prediction process can simulate the
    distribution of noise for a given under-sampling pattern, including variable
    density sampling that produces colored noise in the measurement data. An
    interesting consequence of our prediction model is that we can show that
    recovery from underdetermined non-uniform sampling is equivalent to a weighted
    least squares optimization that accounts for heterogeneous noise levels across
    measurements.

    Through a series of experiments with synthetic and in vivo datasets, we
    demonstrate the efficacy of the image quality prediction process and show that
    it provides a better estimation of reconstruction image quality than the
    corresponding fully-sampled reference image.
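
    A minimal sketch of the prediction idea: keep the data fully sampled, so
    the system stays fully determined, and add complex Gaussian noise in
    k-space to mimic the SNR loss of a given sampling fraction. The uniform
    1/sqrt(fraction) scaling rule below is a simplifying assumption for uniform
    sampling; variable-density patterns would weight the noise per sample.

    ```python
    import numpy as np

    def predict_quality_image(image, sampling_fraction, base_sigma=0.01):
        kspace = np.fft.fft2(image)
        # extra image-domain noise std implied by acquiring fewer measurements
        extra = base_sigma * np.sqrt(1.0 / sampling_fraction - 1.0)
        noise = (np.random.randn(*kspace.shape)
                 + 1j * np.random.randn(*kspace.shape))
        noise *= extra * np.sqrt(image.size / 2.0)  # undo ifft2's 1/N scaling
        return np.abs(np.fft.ifft2(kspace + noise))

    img = np.random.rand(128, 128)        # placeholder fully sampled image
    pred = predict_quality_image(img, sampling_fraction=0.25)
    ```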

    Seeing into Darkness: Scotopic Visual Recognition

    Bo Chen, Pietro Perona
    Comments: 23 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Images are formed by counting how many photons traveling from a given set of
    directions hit an image sensor during a given time interval. When photons are
    few and far between, the concept of an 'image' breaks down and it is best to
    consider directly the flow of photons. Computer vision in this regime, which we
    call 'scotopic', is radically different from the classical image-based paradigm
    in that visual computations (classification, control, search) have to take
    place while the stream of photons is captured and decisions may be taken as
    soon as enough information is available. The scotopic regime is important for
    biomedical imaging, security, astronomy and many other fields. Here we develop
    a framework that allows a machine to classify objects with as few photons as
    possible, while maintaining the error rate below an acceptable threshold. A
    dynamic and asymptotically optimal speed-accuracy tradeoff is a key feature of
    this framework. We propose and study an algorithm to optimize the tradeoff of a
    convolutional network directly from low-light images and evaluate it on simulated
    images from standard datasets. Surprisingly, scotopic systems can achieve
    comparable classification performance as traditional vision systems while using
    less than 0.1% of the photons in a conventional image. In addition, we
    demonstrate that our algorithms work even when the illuminance of the
    environment is unknown and varying. Last, we outline a spiking neural network
    coupled with photon-counting sensors as a power-efficient hardware realization
    of scotopic algorithms.
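
    A toy sequential-decision sketch in the spirit of this framework: photons
    arrive over pixels with class-conditional rates, log-likelihoods are
    accumulated photon by photon, and the decision is taken as soon as one
    class leads by a margin (a Wald-style sequential test; the two-pixel,
    two-class rate maps are invented).

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    rates = {0: np.array([0.8, 0.2]), 1: np.array([0.3, 0.7])}  # per-pixel rates
    true_class = 1
    threshold = 5.0                           # log-odds stopping margin

    loglik = np.zeros(2)
    for n_photons in range(1, 10000):
        # draw the pixel the next photon lands on
        pix = rng.choice(2, p=rates[true_class] / rates[true_class].sum())
        for c in (0, 1):                      # update both class hypotheses
            p = rates[c] / rates[c].sum()
            loglik[c] += np.log(p[pix])
        if abs(loglik[0] - loglik[1]) > threshold:
            break                             # decide as soon as evidence suffices
    print(f"decided class {int(np.argmax(loglik))} after {n_photons} photons")
    ```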

    Rain Removal via Shrinkage-Based Sparse Coding and Learned Rain Dictionary

    Chang-Hwan Son, Xiao-Ping Zhang
    Comments: 17 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper introduces a new rain removal model based on the shrinkage of the
    sparse codes for a single image. Recently, dictionary learning and sparse
    coding have been widely used for image restoration problems. These methods can
    also be applied to rain removal by learning two types of dictionaries, rain
    and non-rain, and forcing the sparse codes of the rain dictionary to be zero
    vectors. However, this approach can generate unwanted edge artifacts and detail
    loss in the non-rain regions. Based on this observation, a new approach for
    shrinking the sparse codes is presented in this paper. To effectively shrink
    the sparse codes in the rain and non-rain regions, an error map between the
    input rain image and the reconstructed rain image is generated by using the
    learned rain dictionary. Based on this error map, both the sparse codes of rain
    and non-rain dictionaries are used jointly to represent the image structures of
    objects and avoid the edge artifacts in the non-rain regions. In the rain
    regions, the correlation matrix between the rain and non-rain dictionaries is
    calculated. Then, the sparse codes corresponding to the highly correlated
    signal-atoms in the rain and non-rain dictionaries are shrunk jointly to
    improve the removal of the rain structures. The experimental results show that
    the proposed shrinkage-based sparse coding can preserve image structures and
    avoid the edge artifacts in the non-rain regions, and it can remove the rain
    structures in the rain regions. Also, visual quality evaluation confirms that
    the proposed method outperforms the conventional texture and rain removal
    methods.

    Near-Infrared Coloring via a Contrast-Preserving Mapping Model

    Chang-Hwan Son, Xiao-Ping Zhang
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Near-infrared gray images captured together with corresponding visible color
    images have recently proven useful for image restoration and classification.
    This paper introduces a new coloring method to add colors to near-infrared gray
    images based on a contrast-preserving mapping model. A naive coloring method
    directly adds the colors from the visible color image to the near-infrared gray
    image; however, this method results in an unrealistic image because of the
    discrepancies in brightness and image structure between the captured
    near-infrared gray image and the visible color image. To solve the discrepancy
    problem, first we present a new contrast-preserving mapping model to create a
    new near-infrared gray image with a similar appearance in the luminance plane
    to the visible color image, while preserving the contrast and details of the
    captured near-infrared gray image. Then based on the proposed
    contrast-preserving mapping model, we develop a method to derive realistic
    colors that can be added to the newly created near-infrared gray image.
    Experimental results show that the proposed method can not only preserve the
    local contrasts and details of the captured near-infrared gray image, but also
    transfer realistic colors from the visible color image to the newly
    created near-infrared gray image. Experimental results also show that the
    proposed approach can be applied to near-infrared denoising.

    Stacked Autoencoders for Medical Image Search

    S. Sharma, I. Umar, L. Ospina, D. Wong, H.R. Tizhoosh
    Comments: To appear in proceedings of the 12th International Symposium on Visual Computing, December 12-14, 2016, Las Vegas, Nevada, USA
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Medical images can be a valuable resource for reliable information to support
    medical diagnosis. However, the large volume of medical images makes it
    challenging to retrieve relevant information given a particular scenario. To
    solve this challenge, content-based image retrieval (CBIR) attempts to
    characterize images (or image regions) with invariant content information in
    order to facilitate image search. This work presents a feature extraction
    technique for medical images using stacked autoencoders, which encode images to
    binary vectors. The technique is applied to the IRMA dataset, a collection of
    14,410 x-ray images, in order to demonstrate the ability of autoencoders to
    retrieve similar x-rays given test queries. Using the IRMA dataset as a
    benchmark, it was found that stacked autoencoders gave excellent results, with
    a retrieval error of 376 for 1,733 test images at a compression of 74.61%.
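
    The retrieval side reduces to Hamming search over binary codes; a short
    sketch in which random codes stand in for thresholded stacked-autoencoder
    outputs:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    codes = rng.random((14410, 64)) > 0.5    # database of 64-bit binary codes
    query = rng.random(64) > 0.5             # encoded query image

    hamming = np.count_nonzero(codes != query, axis=1)
    top10 = np.argsort(hamming)[:10]         # indices of the 10 nearest x-rays
    print(top10, hamming[top10])
    ```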

    MinMax Radon Barcodes for Medical Image Retrieval

    H.R. Tizhoosh, Shujin Zhu, Hanson Lo, Varun Chaudhari, Tahmid Mehdi
    Comments: To appear in proceedings of the 12th International Symposium on Visual Computing, December 12-14, 2016, Las Vegas, Nevada, USA
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Content-based medical image retrieval can support diagnostic decisions by
    clinical experts. Examining similar images may provide clues to the expert to
    remove uncertainties in his/her final diagnosis. Beyond conventional feature
    descriptors, binary features in different ways have been recently proposed to
    encode the image content. A recent proposal is “Radon barcodes” that employ
    binarized Radon projections to tag/annotate medical images with content-based
    binary vectors, called barcodes. In this paper, MinMax Radon barcodes are
    introduced, which are superior to the “local thresholding” scheme suggested in
    the literature. Using the IRMA dataset with 14,410 x-ray images from 193
    different classes, the advantage of using MinMax Radon barcodes over
    thresholded Radon barcodes is demonstrated. The retrieval error for direct
    search drops by more than 15%. As well, SURF, as a well-established non-binary
    approach, and BRISK, as a recent binary method, are examined to compare their
    results with MinMax Radon barcodes when retrieving images from the IRMA
    dataset. The results demonstrate that MinMax Radon barcodes are faster and
    more accurate when applied to IRMA images.
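
    For orientation, here is a sketch of the baseline thresholded Radon barcode
    that the paper improves on: project the image at a few angles, binarize
    each projection against its median, and concatenate. The MinMax variant
    replaces this median rule with an encoding based on the projection's local
    extrema, which is not reproduced here.

    ```python
    import numpy as np
    from skimage.transform import radon

    def radon_barcode(image, angles=(0, 45, 90, 135), bins=32):
        code = []
        for p in radon(image, theta=list(angles), circle=False).T:
            # resample each projection to a fixed length, then binarize
            p = np.interp(np.linspace(0, len(p) - 1, bins),
                          np.arange(len(p)), p)
            code.append(p > np.median(p))
        return np.concatenate(code)

    img = np.random.rand(64, 64)             # placeholder x-ray
    print(radon_barcode(img).astype(int))    # 4 angles x 32 bits = 128-bit code
    ```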

    Plug-and-Play CNN for Crowd Motion Analysis: An Application in Abnormal Event Detection

    Mahdyar Ravanbakhsh, Moin Nabi, Hossein Mousavi, Enver Sangineto, Nicu Sebe
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Most crowd abnormal event detection methods rely on complex
    hand-crafted features to represent the crowd motion and appearance.
    Convolutional Neural Networks (CNN) have been shown to be a powerful tool with
    excellent representational capacities, which can alleviate the need for
    hand-crafted features. In this paper, we show that keeping track of the changes
    in the CNN features across time can facilitate capturing the local abnormality.
    We specifically propose a novel measure-based method which allows measuring the
    local abnormality in a video by combining semantic information (inherited from
    existing CNN models) with low-level optical flow. One of the advantages of this
    method is that it can be used without fine-tuning costs. The proposed
    method is validated on challenging abnormality detection datasets and the
    results show the superiority of our method compared to the state-of-the-art
    methods.

    Deep Feature Consistent Variational Autoencoder

    Xianxu Hou, Linlin Shen, Ke Sun, Guoping Qiu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a novel method for constructing Variational Autoencoder (VAE).
    Instead of using a pixel-by-pixel loss, we enforce deep feature consistency
    between the input and the output of a VAE, which ensures that the VAE's output
    preserves the spatial correlation characteristics of the input, thus leading the
    output to have a more natural visual appearance and better perceptual quality.
    Based on recent deep learning works such as style transfer, we employ a
    pre-trained deep convolutional neural network (CNN) and use its hidden features
    to define a feature perceptual loss for VAE training. Evaluated on the CelebA
    face dataset, we show that our model produces better results than other methods
    in the literature. We also show that our method can produce latent vectors that
    can capture the semantic information of face expressions and can be used to
    achieve state-of-the-art performance in facial attribute prediction.
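
    A sketch of a feature perceptual loss of this kind: compare hidden
    activations of a frozen pretrained CNN between the VAE input and its
    reconstruction. The VGG-16 layer choice and equal weighting here are
    assumptions, not the paper's exact configuration.

    ```python
    import torch
    import torch.nn as nn
    from torchvision import models

    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
    for p in vgg.parameters():
        p.requires_grad_(False)               # the loss network stays frozen
    layers = {3, 8, 15}                       # relu1_2, relu2_2, relu3_3 (assumed)

    def feature_perceptual_loss(x, x_recon):
        loss, a, b = 0.0, x, x_recon
        for i, layer in enumerate(vgg):       # run both images through VGG
            a, b = layer(a), layer(b)
            if i in layers:                   # compare selected hidden features
                loss = loss + nn.functional.mse_loss(a, b)
        return loss

    x = torch.rand(2, 3, 64, 64)              # batch of inputs
    x_hat = torch.rand(2, 3, 64, 64)          # VAE reconstructions (placeholder)
    print(feature_perceptual_loss(x, x_hat))
    ```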

    Deep Learning Algorithms for Signal Recognition in Long Perimeter Monitoring Distributed Fiber Optic Sensors

    A.V. Makarenko
    Comments: 11 pages, 7 figures, 2 tables. Slightly extended preprint of paper accepted for IEEE MLSP 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    In this paper, we show an approach to build deep learning algorithms for
    recognizing signals in distributed fiber optic monitoring and security systems
    for long perimeters. Synthesizing such detection algorithms poses a non-trivial
    research and development challenge, because these systems face stringent error
    (type I and II) requirements and operate in difficult signal-jamming
    environments, with intensive signal-like jamming and a variety of changing
    possible signal portraits of the events to be recognized. To address these
    issues, we have developed a two-level event detection architecture, where the
    primary classifier, based on an ensemble of deep convolutional networks, can
    recognize 7 classes of signals and receives time-space data frames as input.
    Using real-life data, we have shown that the applied methods result in
    efficient and robust multiclass detection algorithms that have a high degree of
    adaptability.

    Near-Infrared Image Dehazing Via Color Regularization

    Chang-Hwan Son, Xiao-Ping Zhang
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Near-infrared imaging can capture haze-free near-infrared gray images and
    visible color images, according to physical scattering models, e.g., Rayleigh
    or Mie models. However, there exist serious discrepancies in brightness and
    image structures between the near-infrared gray images and the visible color
    images. The direct use of the near-infrared gray images brings about another
    color distortion problem in the dehazed images. Therefore, the color distortion
    should also be considered for near-infrared dehazing. To reflect this point,
    this paper presents an approach that adds a new color regularization to a
    conventional dehazing framework. The proposed color regularization can model
    the color prior for unknown haze-free images from two captured images. Thus,
    natural-looking colors and fine details can be induced on the dehazed images.
    The experimental results show that the proposed color regularization model can
    help remove the color distortion and the haze at the same time. Also, the
    effectiveness of the proposed color regularization is verified by comparing
    with other conventional regularizations. It is also shown that the proposed
    color regularization can remove the edge artifacts which arise from the use of
    the conventional dark prior model.

    How Transferable are CNN-based Features for Age and Gender Classification?

    Gökhan Özbulak, Yusuf Aytar, Hazım Kemal Ekenel
    Comments: 12 pages, 3 figures, 2 tables, International Conference of the Biometrics Special Interest Group (BIOSIG) 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Age and gender are complementary soft biometric traits for face recognition.
    Successful estimation of age and gender from facial images taken under
    real-world conditions can contribute to improving the identification results in
    the wild. In this study, in order to achieve robust age and gender
    classification in the wild, we have benefited from deep convolutional neural
    network based representations. We have explored the transferability of existing
    deep convolutional neural network (CNN) models for age and gender
    classification. The generic AlexNet-like architecture and the domain-specific
    VGG-Face CNN model are employed and fine-tuned with the Adience dataset,
    prepared for age and gender classification in uncontrolled environments. In
    addition, the task-specific GilNet CNN model has also been utilized as a
    baseline method to compare with the transferred models. Experimental
    results show that both transferred deep CNN models outperform the GilNet CNN
    model, which is the state-of-the-art age and gender classification approach on
    the Adience dataset, by an absolute increase of 7% and 4.5% in accuracy,
    respectively. This outcome indicates that transferring a deep CNN model can
    provide better classification performance than a task-specific CNN model, which
    has a limited number of layers and is trained from scratch using a limited amount
    of data, as in the case of GilNet. The domain-specific VGG-Face CNN model has
    been found to be more useful and provided better performance for both age and
    gender classification tasks when compared with the generic AlexNet-like model,
    which shows that transferring from a closer domain is more useful.
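
    The transfer setup itself is standard; a short sketch with torchvision's
    AlexNet standing in for the paper's AlexNet-like and VGG-Face models:

    ```python
    import torch.nn as nn
    from torchvision import models

    # load a CNN pretrained on a source task, swap the classifier head for the
    # 8 Adience age groups (or 2 genders), and fine-tune on the target data
    model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    model.classifier[6] = nn.Linear(4096, 8)   # new head: 8 Adience age groups

    # optionally freeze the convolutional features and train only the head
    for p in model.features.parameters():
        p.requires_grad_(False)
    ```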

    Microscopic Pedestrian Flow Characteristics: Development of an Image Processing Data Collection and Simulation Model

    Kardi Teknomo
    Comments: 140 pages, Teknomo, Kardi, Microscopic Pedestrian Flow Characteristics: Development of an Image Processing Data Collection and Simulation Model, Ph.D. Dissertation, Tohoku University Japan, Sendai, 2002
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Microscopic pedestrian studies consider detailed interaction of pedestrians
    to control their movement in pedestrian traffic flow. The tools to collect
    microscopic data and to analyze microscopic pedestrian flow are still very much
    in their infancy, and the microscopic pedestrian flow characteristics need to
    be understood. Manual, semi-manual and automatic image processing data
    collection systems were developed. It was found that the microscopic speed
    resembles a normal distribution with a mean of 1.38 m/second and a standard
    deviation of 0.37 m/second. The acceleration distribution also bears a
    resemblance to the normal distribution with an average of 0.68 m/square
    second. A physically based microscopic pedestrian simulation model was also
    developed. Both the Microscopic Video Data Collection and the Microscopic
    Pedestrian Simulation Model generate a
    database called the NTXY database. The formulations of the flow performance or
    microscopic pedestrian characteristics are explained. Sensitivity of the
    simulation and relationship between the flow performances are described.
    Validation of the simulation using real world data is then explained through
    the comparison between average instantaneous speed distributions of the real
    world data with the result of the simulations. The simulation model is then
    applied to some experiments on hypothetical situations to gain more
    understanding of pedestrian behavior in one-way and two-way situations, to
    learn the behavior of the system if the number of elderly pedestrians
    increases, and to evaluate a policy of lane-like segregation for pedestrian
    crossings and inspect the performance of the crossing. It was revealed that
    the microscopic pedestrian studies have been successfully applied to give more
    understanding of the behavior of microscopic pedestrian flow, to predict
    theoretical and practical situations, and to evaluate design policies before
    their implementation.
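
    The NTXY database records tuples of (pedestrian id N, time T, position
    X, Y); deriving instantaneous speed from consecutive records of one
    pedestrian is then a one-liner (the sample rows below are invented for
    illustration):

    ```python
    import numpy as np

    ntxy = np.array([            # columns: N, T (s), X (m), Y (m)
        [1, 0.0, 0.0, 0.0],
        [1, 0.5, 0.7, 0.1],
        [1, 1.0, 1.4, 0.1],
    ])
    dt = np.diff(ntxy[:, 1])
    dist = np.hypot(np.diff(ntxy[:, 2]), np.diff(ntxy[:, 3]))
    speeds = dist / dt
    print(speeds, speeds.mean())  # compare against the reported ~1.38 m/s mean
    ```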

    Deep Visual Foresight for Planning Robot Motion

    Chelsea Finn, Sergey Levine
    Comments: Supplementary video: this https URL
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    A key challenge in scaling up robot learning to many skills and environments
    is removing the need for human supervision, so that robots can collect their
    own data and improve their own performance without being limited by the cost of
    requesting human feedback. Model-based reinforcement learning holds the promise
    of enabling an agent to learn to predict the effects of its actions, which
    could provide flexible predictive models for a wide range of tasks and
    environments, without detailed human supervision. We develop a method for
    combining deep action-conditioned video prediction models with model-predictive
    control that uses entirely unlabeled training data. Our approach does not
    require a calibrated camera, an instrumented training set-up, or precise
    sensing and actuation. Our results show that our method enables a real robot to
    perform nonprehensile manipulation — pushing objects — and can handle novel
    objects not seen during training.
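
    A sketch of the planning loop implied by the abstract: sample candidate
    action sequences, roll each through a learned action-conditioned predictor,
    score against a goal, and execute the first action of the best sequence.
    Both `video_predictor` and `goal_cost` below are hypothetical stand-ins for
    the learned model and objective.

    ```python
    import numpy as np

    def video_predictor(frame, actions):
        return frame + 0.01 * actions.sum()       # placeholder dynamics model

    def goal_cost(predicted, goal):
        return float(np.mean((predicted - goal) ** 2))

    def plan_action(frame, goal, horizon=5, n_candidates=100, rng=None):
        rng = rng or np.random.default_rng()
        # random-shooting MPC: sample action sequences, keep the cheapest
        candidates = rng.uniform(-1, 1, size=(n_candidates, horizon, 2))
        costs = [goal_cost(video_predictor(frame, a), goal) for a in candidates]
        return candidates[int(np.argmin(costs))][0]  # execute first action only

    frame = np.zeros((64, 64))
    goal = np.ones((64, 64)) * 0.05
    print(plan_action(frame, goal))
    ```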

    Low-dose CT denoising with convolutional neural network

    Hu Chen, Yi Zhang, Weihua Zhang, Peixi Liao, Ke Li, Jiliu Zhou, Ge Wang
    Comments: arXiv admin note: substantial text overlap with arXiv:1609.08508
    Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)

    To reduce the potential radiation risk, low-dose CT has attracted much
    attention. However, simply lowering the radiation dose will lead to significant
    deterioration of the image quality. In this paper, we propose a noise reduction
    method for low-dose CT via a deep neural network, without accessing the original
    projection data. A deep convolutional neural network is trained to transform
    low-dose CT images towards normal-dose CT images, patch by patch. Visual and
    quantitative evaluation demonstrates the competitive performance of the proposed
    method.

    X-CNN: Cross-modal Convolutional Neural Networks for Sparse Datasets

    Petar Veličković, Duo Wang, Nicholas D. Lane, Pietro Liò
    Comments: To appear in the 7th IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2016), 8 pages, 6 figures
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    In this paper we propose cross-modal convolutional neural networks (X-CNNs),
    a novel biologically inspired type of CNN architectures, treating gradient
    descent-specialised CNNs as individual units of processing in a larger-scale
    network topology, while allowing for unconstrained information flow and/or
    weight sharing between analogous hidden layers of the network—thus
    generalising the already well-established concept of neural network ensembles
    (where information typically may flow only between the output layers of the
    individual networks). The constituent networks are individually designed to
    learn the output function on their own subset of the input data, after which
    cross-connections between them are introduced after each pooling operation to
    periodically allow for information exchange between them. This injection of
    knowledge into a model (by prior partition of the input data through domain
    knowledge or unsupervised methods) is expected to yield greatest returns in
    sparse data environments, which are typically less suitable for training CNNs.
    For evaluation purposes, we have compared a standard four-layer CNN as well as
    a sophisticated FitNet4 architecture against their cross-modal variants on the
    CIFAR-10 and CIFAR-100 datasets with differing percentages of the training data
    being removed, and find that at lower levels of data availability, the X-CNNs
    significantly outperform their baselines (typically providing a 2–6% benefit,
    depending on the dataset size and whether data augmentation is used), while
    still maintaining an edge on all of the full dataset tests.
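
    A sketch of the cross-connection idea: two CNN streams, each on its own
    input modality, exchange information after a pooling stage through 1x1
    convolutions merged into the other stream. Channel counts and the
    luminance/chrominance split are assumptions for illustration.

    ```python
    import torch
    import torch.nn as nn

    class XCNNBlock(nn.Module):
        def __init__(self, ch):
            super().__init__()
            self.conv_a = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
            self.conv_b = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
            self.pool = nn.MaxPool2d(2)
            self.cross_ab = nn.Conv2d(ch, ch, 1)   # stream A -> stream B
            self.cross_ba = nn.Conv2d(ch, ch, 1)   # stream B -> stream A
            self.merge_a = nn.Conv2d(2 * ch, ch, 1)
            self.merge_b = nn.Conv2d(2 * ch, ch, 1)

        def forward(self, a, b):
            # each stream convolves and pools on its own ...
            a, b = self.pool(self.conv_a(a)), self.pool(self.conv_b(b))
            # ... then receives the other stream's features after pooling
            a2 = self.merge_a(torch.cat([a, self.cross_ba(b)], dim=1))
            b2 = self.merge_b(torch.cat([b, self.cross_ab(a)], dim=1))
            return a2, b2

    block = XCNNBlock(16)
    lum = torch.randn(1, 16, 32, 32)     # features from the luminance stream
    chroma = torch.randn(1, 16, 32, 32)  # features from the chrominance stream
    a, b = block(lum, chroma)
    print(a.shape, b.shape)              # both (1, 16, 16, 16)
    ```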

    Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity

    Zu-Zhen Huang, Jia Xu, Zhi-Rui Wang, Li Xiao, Xiang-Gen Xia, Teng Long
    Comments: 14 double-column pages, 11 figures, 4 tables
    Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV)

    In this paper, for multichannel synthetic aperture radar (SAR) systems we
    first formulate the effects of Doppler ambiguities on the radial velocity (RV)
    estimation of a ground moving target in range-compressed domain, range-Doppler
    domain and image domain, respectively, where cascaded time-space Doppler
    ambiguity (CTSDA) may occur, that is, time domain Doppler ambiguity (TDDA) in
    each channel occurs at first and then spatial domain Doppler ambiguity (SDDA)
    among multi-channels occurs subsequently. Accordingly, the multichannel SAR
    systems with different parameters are divided into three cases with different
    Doppler ambiguity properties, i.e., only TDDA occurs in Case I, and CTSDA
    occurs in Cases II and III, while the CTSDA in Case II can be simply seen as
    the SDDA. Then, a multi-frequency SAR is proposed to obtain the RV estimation
    by solving the ambiguity problem based on Chinese remainder theorem (CRT). For
    Cases I and II, the ambiguity problem can be solved by the existing closed-form
    robust CRT. For Case III, we show that the problem is different from the
    conventional CRT problem and we call it a double remaindering problem. We then
    propose a sufficient condition under which the double remaindering problem,
    i.e., the CTSDA, can be solved by the closed-form robust CRT. When the
    sufficient condition is not satisfied, a searching based method is proposed.
    Finally, some numerical experiments are provided to demonstrate the
    effectiveness of the proposed methods.
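
    The classical CRT step underlying the de-ambiguation can be sketched
    directly; the paper's closed-form robust CRT additionally tolerates
    remainder errors, which this error-free version does not.

    ```python
    from math import gcd

    def crt(remainders, moduli):
        """Recover x from its remainders modulo pairwise coprime moduli."""
        x, m = 0, 1
        for r, n in zip(remainders, moduli):
            assert gcd(m, n) == 1, "moduli must be pairwise coprime"
            # solve x' = x (mod m) and x' = r (mod n) simultaneously
            t = ((r - x) * pow(m, -1, n)) % n
            x, m = x + m * t, m * n
        return x % m

    # e.g. an ambiguous Doppler number known modulo three PRF-like moduli
    print(crt([3, 4, 5], [7, 11, 13]))   # 213, unique modulo 7*11*13 = 1001
    ```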


    Artificial Intelligence

    Phase-Mapper: An AI Platform to Accelerate High Throughput Materials Discovery

    Yexiang Xue, Junwen Bai, Ronan Le Bras, Richard Bernstein, Johan Bjorck, Liane Longpre, Santosh K. Suram, John Gregoire, Carla P. Gomes
    Subjects: Artificial Intelligence (cs.AI)

    High-throughput materials discovery involves the rapid synthesis,
    measurement, and characterization of many different but structurally-related
    materials. A key problem in materials discovery, the phase map identification
    problem, involves the determination of the crystal phase diagram from the
    materials’ composition and structural characterization data. We present
    Phase-Mapper, a novel AI platform to solve the phase map identification problem
    that allows humans to interact with both the data and products of AI
    algorithms, including the incorporation of human feedback to constrain or
    initialize solutions. Phase-Mapper affords incorporation of any spectral
    demixing algorithm, including our novel solver, AgileFD, which is based on a
    convolutive non-negative matrix factorization algorithm. AgileFD can
    incorporate constraints to capture the physics of the materials as well as
    human feedback. We compare three solver variants with previously proposed
    methods in a large-scale experiment involving 20 synthetic systems,
    demonstrating the efficacy of imposing physical constraints using AgileFD.
    Phase-Mapper has also been used by materials scientists to solve a wide variety
    of phase diagrams, including the previously unsolved Nb-Mn-V oxide system,
    which is provided here as an illustrative example.
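
    At AgileFD's core is a convolutive variant of non-negative matrix
    factorization; a sketch of the plain (non-convolutive) form with
    multiplicative updates, decomposing measured spectra D into basis patterns
    W and per-sample activations H (the convolutive variant adds shifted copies
    of the basis patterns):

    ```python
    import numpy as np

    def nmf(D, K, iters=200, eps=1e-9):
        """Frobenius NMF by multiplicative updates: D ~= W @ H, all >= 0."""
        rng = np.random.default_rng(0)
        W = rng.random((D.shape[0], K))
        H = rng.random((K, D.shape[1]))
        for _ in range(iters):
            H *= (W.T @ D) / (W.T @ W @ H + eps)
            W *= (D @ H.T) / (W @ H @ H.T + eps)
        return W, H

    D = np.abs(np.random.rand(300, 50))    # 300 spectral channels x 50 samples
    W, H = nmf(D, K=5)
    print(np.linalg.norm(D - W @ H) / np.linalg.norm(D))  # relative residual
    ```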

    A Probability Distribution Strategy with Efficient Clause Selection for Hard Max-SAT Formulas

    Sixue Liu, Yulong Ceng, Gerard de Melo
    Comments: 11 pages, 3 tables
    Subjects: Artificial Intelligence (cs.AI)

    Many real-world problems involving constraints can be regarded as instances
    of the Max-SAT problem, which is the optimization variant of the classic
    satisfiability problem. In this paper, we propose a novel probabilistic
    approach for Max-SAT called ProMS. Our algorithm relies on a stochastic local
    search strategy using a novel probability distribution function with two
    strategies for picking variables, one based on available information and
    another purely random one. Moreover, while most previous algorithms based on
    WalkSAT choose unsatisfied clauses randomly, we introduce a novel clause
    selection strategy to improve our algorithm. Experimental results illustrate
    that ProMS outperforms many state-of-the-art stochastic local search solvers on
    hard unweighted random Max-SAT benchmarks.
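
    A minimal WalkSAT-style local search with a probabilistic variable pick, in
    the spirit of the abstract; the exact ProMS distribution and
    clause-selection strategy are not reproduced here.

    ```python
    import random

    def walksat(clauses, n_vars, p_random=0.3, max_flips=100000, rng=random):
        assign = [rng.random() < 0.5 for _ in range(n_vars + 1)]  # index 0 unused
        sat = lambda cl: any(assign[abs(l)] == (l > 0) for l in cl)
        for _ in range(max_flips):
            unsat = [cl for cl in clauses if not sat(cl)]
            if not unsat:
                return assign
            clause = rng.choice(unsat)          # (ProMS biases this choice)
            if rng.random() < p_random:         # purely random pick ...
                var = abs(rng.choice(clause))
            else:                               # ... or greedy, informed pick
                def breaks(v):
                    assign[v] = not assign[v]
                    b = sum(not sat(cl) for cl in clauses)
                    assign[v] = not assign[v]
                    return b
                var = min((abs(l) for l in clause), key=breaks)
            assign[var] = not assign[var]
        return None

    # clauses as lists of signed 1-based literals: (x1 or not x2), (x2 or x3)
    print(walksat([[1, -2], [2, 3]], n_vars=3))
    ```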

    Improving Accuracy and Scalability of the PC Algorithm by Maximizing P-value

    Joseph Ramsey
    Comments: 11 pages, 4 figures, 2 tables, technical report
    Subjects: Artificial Intelligence (cs.AI)

    A number of attempts have been made to improve accuracy and/or scalability of
    the PC (Peter and Clark) algorithm, some well known (Buhlmann, et al., 2010;
    Kalisch and Buhlmann, 2007; 2008; Zhang, 2012, to give some examples). We add
    here one more tool to the toolbox: the simple observation that if one is forced
    to choose between a variety of possible conditioning sets for a pair of
    variables, one should choose the one with the highest p-value. One can use the
    CPC (Conservative PC, Ramsey et al., 2012) algorithm as a guide to possible
    sepsets for a pair of variables. However, whereas CPC uses a voting rule to
    classify colliders versus noncolliders, our proposed algorithm, PC-Max, picks
    the conditioning set with the highest p-value, so that there are no
    ambiguities. We combine this with two other optimizations: (a) avoiding
    bidirected edges in the orientation of colliders, and (b) parallelization. For
    (b) we borrow ideas from the PC-Stable algorithm (Colombo and Maathuis, 2014).
    The result is an algorithm that scales quite well both in terms of accuracy and
    time, with no risk of bidirected edges.
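
    The core PC-Max choice can be sketched with a Fisher-z
    conditional-independence test: among candidate conditioning sets for a
    pair, keep the one with the highest p-value. The toy linear-Gaussian data
    below (X0 -> X1 -> X2, with X3 independent) is invented for illustration.

    ```python
    import numpy as np
    from itertools import combinations
    from scipy import stats

    def fisher_z_pvalue(data, i, j, cond):
        """p-value for X_i independent of X_j given X_cond (Fisher's z)."""
        idx = [i, j] + list(cond)
        P = np.linalg.inv(np.corrcoef(data[:, idx].T))   # precision matrix
        r = -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])        # partial correlation
        z = 0.5 * np.log((1 + r) / (1 - r))
        stat = np.sqrt(data.shape[0] - len(cond) - 3) * abs(z)
        return 2 * (1 - stats.norm.cdf(stat))

    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 4))
    X[:, 1] += X[:, 0]                   # X0 -> X1
    X[:, 2] += X[:, 1]                   # X1 -> X2
    candidates = [s for k in range(3) for s in combinations([1, 3], k)]
    best = max(candidates, key=lambda s: fisher_z_pvalue(X, 0, 2, s))
    print("highest-p-value conditioning set for (X0, X2):", best)  # contains X1
    ```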

    Funneled Bayesian Optimization for Design, Tuning and Control of Autonomous Systems

    Ruben Martinez-Cantin
    Subjects: Artificial Intelligence (cs.AI)

    Bayesian optimization has become a fundamental global optimization algorithm
    in many problems where sample efficiency is of paramount importance. Recently,
    a large number of new applications have been proposed in fields such as
    robotics, machine learning, experimental design, and simulation. In this
    paper, we focus on several problems that appear in robotics and autonomous
    systems: algorithm tuning, automatic control and intelligent design. All those
    problems can be mapped to global optimization problems. However, they become
    hard optimization problems. Bayesian optimization internally uses a
    probabilistic surrogate model (e.g.: Gaussian process) to learn from the
    process and reduce the number of samples required. In order to generalize to
    unknown functions in a black-box fashion, the common assumption is that the
    underlying function can be modeled with a stationary process. Nonstationary
    Gaussian process regression cannot generalize easily and it typically requires
    prior knowledge of the function. Some works have designed techniques to
    generalize Bayesian optimization to nonstationary functions in an indirect way,
    but using techniques originally designed for regression, where the objective is
    to improve the quality of the surrogate model everywhere. Instead, optimization
    should focus on improving the surrogate model near the optimum. In this paper,
    we present a novel kernel function specially designed for Bayesian
    optimization, that allows nonstationary behavior of the surrogate model in an
    adaptive local region. In our experiments, we found that this new kernel
    results in an improved local search (exploitation), without penalizing the
    global search (exploration). We provide results in well-known benchmarks and
    real applications. The new method outperforms the state of the art in Bayesian
    optimization both in stationary and nonstationary problems.

    Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction

    Junbo Zhang, Yu Zheng, Dekang Qi
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Forecasting the flow of crowds is of great importance to traffic management
    and public safety, yet a very challenging task affected by many complex
    factors, such as inter-region traffic, events and weather. In this paper, we
    propose a deep-learning-based approach, called ST-ResNet, to collectively
    forecast the inflow and outflow of crowds in each and every region of a
    city. We design an end-to-end structure of ST-ResNet based on the unique properties
    of spatio-temporal data. More specifically, we employ the framework of the
    residual neural networks to model the temporal closeness, period, and trend
    properties of the crowd traffic, respectively. For each property, we design a
    branch of residual convolutional units, each of which models the spatial
    properties of the crowd traffic. ST-ResNet learns to dynamically aggregate the
    output of the three residual neural networks based on data, assigning different
    weights to different branches and regions. The aggregation is further combined
    with external factors, such as weather and day of the week, to predict the
    final traffic of crowds in each and every region. We evaluate ST-ResNet based
    on two types of crowd flows in Beijing and NYC, finding that its performance
    exceeds that of six well-known methods.
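
    A sketch of the fusion step described above: three small residual branches
    (closeness, period, trend) over 2-channel inflow/outflow maps are combined
    with learned per-branch element-wise weights and merged with an
    external-factor vector. Grid size, depths, and the external feature
    dimension are assumptions.

    ```python
    import torch
    import torch.nn as nn

    class Branch(nn.Module):
        def __init__(self, in_ch=2, ch=16):
            super().__init__()
            self.inp = nn.Conv2d(in_ch, ch, 3, padding=1)
            self.res = nn.Sequential(nn.ReLU(), nn.Conv2d(ch, ch, 3, padding=1),
                                     nn.ReLU(), nn.Conv2d(ch, ch, 3, padding=1))
            self.out = nn.Conv2d(ch, 2, 3, padding=1)
        def forward(self, x):
            h = self.inp(x)
            return self.out(h + self.res(h))        # one residual unit

    class STResNetFusion(nn.Module):
        def __init__(self, grid=(32, 32), ext_dim=8):
            super().__init__()
            self.branches = nn.ModuleList(Branch() for _ in range(3))
            self.weights = nn.ParameterList(        # learned element-wise weights
                nn.Parameter(torch.rand(2, *grid)) for _ in range(3))
            self.ext = nn.Linear(ext_dim, 2 * grid[0] * grid[1])
            self.grid = grid
        def forward(self, closeness, period, trend, external):
            maps = [closeness, period, trend]
            fused = sum(w * b(m) for w, b, m in
                        zip(self.weights, self.branches, maps))
            e = self.ext(external).view(-1, 2, *self.grid)
            return torch.tanh(fused + e)            # in/out flow per region

    model = STResNetFusion()
    x = torch.randn(4, 2, 32, 32)                   # channels: inflow/outflow
    ext = torch.randn(4, 8)                         # weather, day of week, ...
    print(model(x, x, x, ext).shape)                # (4, 2, 32, 32)
    ```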

    Outlier Detection from Network Data with Subnetwork Interpretation

    Xuan-Hong Dang, Arlei Silva, Ambuj Singh, Ananthram Swami, Prithwish Basu
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Detecting a small number of outliers from a set of data observations is
    always challenging. This problem is more difficult in the setting of multiple
    network samples, where computing the anomalous degree of a network sample is
    generally not sufficient. In fact, explaining why the network is exceptional,
    expressed in the form of a subnetwork, is equally important. In this paper,
    we develop a novel algorithm to address these two key problems. We treat each
    network sample as a potential outlier and identify subnetworks that mostly
    discriminate it from nearby regular samples. The algorithm is developed in the
    framework of network regression combined with the constraints on both network
    topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
    goes beyond subspace/subgraph discovery and we show that it converges to a
    global optimum. Evaluation on various real-world network datasets demonstrates
    that our algorithm not only outperforms baselines in both network and
    high-dimensional settings, but also discovers highly relevant and interpretable local
    subnetworks, further enhancing our understanding of anomalous networks.

    Deep Visual Foresight for Planning Robot Motion

    Chelsea Finn, Sergey Levine
    Comments: Supplementary video: this https URL
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    A key challenge in scaling up robot learning to many skills and environments
    is removing the need for human supervision, so that robots can collect their
    own data and improve their own performance without being limited by the cost of
    requesting human feedback. Model-based reinforcement learning holds the promise
    of enabling an agent to learn to predict the effects of its actions, which
    could provide flexible predictive models for a wide range of tasks and
    environments, without detailed human supervision. We develop a method for
    combining deep action-conditioned video prediction models with model-predictive
    control that uses entirely unlabeled training data. Our approach does not
    require a calibrated camera, an instrumented training set-up, or precise
    sensing and actuation. Our results show that our method enables a real robot to
    perform nonprehensile manipulation — pushing objects — and can handle novel
    objects not seen during training.

    Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search

    Ali Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine
    Comments: Submitted to the IEEE International Conference on Robotics and Automation 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)

    In principle, reinforcement learning and policy search methods can enable
    robots to learn highly complex and general skills that may allow them to
    function amid the complexity and diversity of the real world. However, training
    a policy that generalizes well across a wide range of real-world conditions
    requires far greater quantity and diversity of experience than is practical to
    collect with a single robot. Fortunately, it is possible for multiple robots to
    share their experience with one another, and thereby, learn a policy
    collectively. In this work, we explore distributed and asynchronous policy
    learning as a means to achieve generalization and improved training times on
    challenging, real-world manipulation tasks. We propose a distributed and
    asynchronous version of Guided Policy Search and use it to demonstrate
    collective policy learning on a vision-based door opening task using four
    robots. We show that it achieves better generalization, utilization, and
    training times than the single robot alternative.

    Kernel Selection using Multiple Kernel Learning and Domain Adaptation in Reproducing Kernel Hilbert Space, for Face Recognition under Surveillance Scenario

    Samik Banerjee, Sukhendu Das
    Comments: 13 pages, 15 figures, 4 tables. Kernel Selection, Surveillance, Multiple Kernel Learning, Domain Adaptation, RKHS, Hallucination
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Learning (cs.LG)

    Face Recognition (FR) has been of interest to several researchers over the
    past few decades due to its passive nature as a means of biometric authentication. Despite
    high accuracy achieved by face recognition algorithms under controlled
    conditions, achieving the same performance for face images obtained in
    surveillance scenarios is a major hurdle. Some attempts have been made to
    super-resolve the low-resolution face images and improve the contrast, without
    a considerable degree of success. The technique proposed in this paper tries to
    cope with the very low resolution and low contrast face images obtained from
    surveillance cameras, for FR under surveillance conditions. For Support Vector
    Machine classification, the selection of an appropriate kernel has been a widely
    discussed issue in the research community. In this paper, we propose a novel
    kernel selection technique termed MFKL (Multi-Feature Kernel Learning) to
    obtain the best feature-kernel pairing. Our proposed technique performs
    effective kernel selection by the Multiple Kernel Learning (MKL) method, to
    choose the optimal kernel to be used along with an unsupervised domain
    adaptation method in the Reproducing Kernel Hilbert Space (RKHS), as a solution to the problem.
    Rigorous experimentation has been performed on three real-world surveillance
    face datasets: FR_SURV, SCface and ChokePoint. Results have been shown using
    Rank-1 Recognition Accuracy, ROC and CMC measures. Our proposed method
    outperforms all other recent state-of-the-art techniques by a considerable
    margin.

    Deep Reinforcement Learning for Robotic Manipulation

    Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Reinforcement learning holds the promise of enabling autonomous robots to
    learn large repertoires of behavioral skills with minimal human intervention.
    However, robotic applications of reinforcement learning often compromise the
    autonomy of the learning process in favor of achieving training times that are
    practical for real physical systems. This typically involves introducing
    hand-engineered policy representations and human-supplied demonstrations. Deep
    reinforcement learning alleviates this limitation by training general-purpose
    neural network policies, but applications of direct deep reinforcement learning
    algorithms have so far been restricted to simulated settings and relatively
    simple tasks, due to their apparent high sample complexity. In this paper, we
    demonstrate that a recent deep reinforcement learning algorithm based on
    off-policy training of deep Q-functions can scale to complex 3D manipulation
    tasks and can learn deep neural network policies efficiently enough to train on
    real physical robots. We demonstrate that the training times can be further
    reduced by parallelizing the algorithm across multiple robots which pool their
    policy updates asynchronously. Our experimental evaluation shows that our
    method can learn a variety of 3D manipulation skills in simulation and a
    complex door opening skill on real robots without any prior demonstrations or
    manually designed representations.

    Deep unsupervised learning through spatial contrasting

    Elad Hoffer, Itay Hubara, Nir Ailon
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Convolutional networks have marked their place over the last few years as the
    best performing model for various visual tasks. They are, however, most suited
    for supervised learning from large amounts of labeled data. Previous attempts
    have been made to use unlabeled data to improve model performance by applying
    unsupervised techniques. These attempts require different architectures and
    training methods. In this work we present a novel approach for unsupervised
    training of Convolutional networks that is based on contrasting between spatial
    regions within images. This criterion can be employed within conventional
    neural networks and trained using standard techniques such as SGD and
    back-propagation, thus complementing supervised methods.
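
    A toy version of a spatial-contrasting criterion: embed two patches from
    the same image and one from a different image, then score the same-image
    pair above the cross-image pair with a standard softmax loss. The loss
    shape is one common instantiation, not necessarily the paper's exact
    criterion, and the embeddings below are placeholders for CNN outputs.

    ```python
    import torch
    import torch.nn.functional as F

    def spatial_contrasting_loss(anchor, positive, negative):
        sim_pos = (anchor * positive).sum(dim=1)    # same-image patch pair
        sim_neg = (anchor * negative).sum(dim=1)    # cross-image patch pair
        logits = torch.stack([sim_pos, sim_neg], dim=1)
        targets = torch.zeros(anchor.size(0), dtype=torch.long)
        return F.cross_entropy(logits, targets)     # favor the same-image pair

    emb = torch.randn(8, 128, requires_grad=True)   # anchor patch embeddings
    pos = emb + 0.1 * torch.randn(8, 128)           # other patch, same image
    neg = torch.randn(8, 128)                       # patch from another image
    loss = spatial_contrasting_loss(emb, pos, neg)
    loss.backward()                                 # trains with plain SGD
    print(loss.item())
    ```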

    X-CNN: Cross-modal Convolutional Neural Networks for Sparse Datasets

    Petar Veličković, Duo Wang, Nicholas D. Lane, Pietro Liò
    Comments: To appear in the 7th IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2016), 8 pages, 6 figures
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    In this paper we propose cross-modal convolutional neural networks (X-CNNs),
    a novel biologically inspired type of CNN architectures, treating gradient
    descent-specialised CNNs as individual units of processing in a larger-scale
    network topology, while allowing for unconstrained information flow and/or
    weight sharing between analogous hidden layers of the network—thus
    generalising the already well-established concept of neural network ensembles
    (where information typically may flow only between the output layers of the
    individual networks). The constituent networks are individually designed to
    learn the output function on their own subset of the input data, after which
    cross-connections between them are introduced after each pooling operation to
    periodically allow for information exchange between them. This injection of
    knowledge into a model (by prior partition of the input data through domain
    knowledge or unsupervised methods) is expected to yield greatest returns in
    sparse data environments, which are typically less suitable for training CNNs.
    For evaluation purposes, we have compared a standard four-layer CNN as well as
    a sophisticated FitNet4 architecture against their cross-modal variants on the
    CIFAR-10 and CIFAR-100 datasets with differing percentages of the training data
    being removed, and find that at lower levels of data availability, the X-CNNs
    significantly outperform their baselines (typically providing a 2–6% benefit,
    depending on the dataset size and whether data augmentation is used), while
    still maintaining an edge on all of the full dataset tests.

    Consistency Ensuring in Social Web Services Based on Commitments Structure

    Marzieh Adelnia, Mohammad Reza Khayyambashi
    Comments: International Journal of Computer Science and Information Security (IJCSIS), Vol. 14, No. 8, August 2016
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI)

    Web services are one of the most significant current topics in information
    sharing technologies and one example of service-oriented processing. To
    ensure accurate execution of a web service's operations, it must be adaptable
    to the policies of the social networks in which it signs up. This adaptation
    is implemented using controls called 'commitments'. This paper describes the
    commitment structure and existing research on commitments and social web
    services, then suggests an algorithm for the consistency of commitments in
    social web services. Since commitments may be executed concurrently, a key
    challenge in web service execution based on the commitment structure is
    ensuring consistency at execution time. The purpose of this research is to
    provide an algorithm for ensuring consistency between web service operations
    based on the commitment structure.

    Bacterial Foraging Optimized STATCOM for Stability Assessment in Power System

    Shiba R. Paital, Prakash K. Ray, Asit Mohanty, Sandipan Patra, Harishchandra Dubey
    Comments: 5 pages, 7 figures, 2016 IEEE Students’ Technology Symposium (TechSym 2016), At IIT Kharagpur, India
    Subjects: Systems and Control (cs.SY); Artificial Intelligence (cs.AI)

    This paper presents a study of stability improvement in a single-machine
    infinite-bus (SMIB) power system using a static compensator (STATCOM). The
    gains of the Proportional-Integral-Derivative (PID) controller in the STATCOM
    are optimized by a heuristic technique based on Particle Swarm Optimization
    (PSO). Further, Bacterial Foraging Optimization (BFO) is also applied as an
    alternative heuristic method to select the optimal gains of the PID
    controller. The performance of the STATCOM with the above soft-computing
    techniques is studied and compared with that of the conventional PID
    controller under various scenarios. The simulation results are accompanied by
    a quantitative analysis based on performance indices. The analysis clearly
    shows the robustness of the new scheme in terms of stability and voltage
    regulation compared with the conventional PID controller.
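
    As a rough illustration of how such a heuristic tunes controller gains, the
    sketch below runs a generic PSO over (Kp, Ki, Kd). Here evaluate_pid is a
    placeholder for a simulation of the SMIB system returning a performance
    index; every constant is an illustrative assumption, not a value from the
    paper.

    import numpy as np

    def evaluate_pid(gains):
        # Placeholder cost: in practice, simulate the SMIB system with the
        # STATCOM and return a performance index (e.g., ITAE) for these gains.
        kp, ki, kd = gains
        return (kp - 2.0) ** 2 + (ki - 0.5) ** 2 + (kd - 0.1) ** 2

    def pso(cost, dim=3, n_particles=20, iters=100, lo=0.0, hi=5.0, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.uniform(lo, hi, (n_particles, dim))   # positions (Kp, Ki, Kd)
        v = np.zeros_like(x)                          # velocities
        pbest = x.copy()
        pcost = np.array([cost(p) for p in x])
        gbest = pbest[pcost.argmin()]
        for _ in range(iters):
            r1, r2 = rng.random((2, n_particles, dim))
            v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
            x = np.clip(x + v, lo, hi)
            c = np.array([cost(p) for p in x])
            better = c < pcost
            pbest[better], pcost[better] = x[better], c[better]
            gbest = pbest[pcost.argmin()]
        return gbest, pcost.min()

    gains, best_cost = pso(evaluate_pid)
    print("PID gains:", gains, "cost:", best_cost)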

    Learning real manipulation tasks from virtual demonstrations using LSTM

    Rouhollah Rahmatizadeh, Pooya Abolghasemi, Aman Behal, Ladislau Bölöni
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Robots assisting disabled or elderly people in activities of daily living
    must perform complex manipulation tasks. These tasks are dependent on the
    user’s environment and preferences. Thus, learning from demonstration (LfD) is
    a promising choice that would allow the non-expert user to teach the robot
    different tasks. Unfortunately, learning general solutions from raw
    demonstrations requires a significant amount of data. Performing this number of
    physical demonstrations is infeasible for a disabled user. In this paper we
    propose an approach where the user demonstrates the manipulation task in a
    virtual environment. The collected demonstrations are used to train an LSTM
    recurrent neural network that can act as the controller for the robot. We show
    that the controller learned from virtual demonstrations can be used to
    successfully perform the manipulation tasks on a physical robot.
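
    A minimal behavioural-cloning sketch of this idea, assuming demonstrations
    are logged as (state, action) trajectories; this is illustrative PyTorch,
    not the authors' implementation, and all dimensions are made up:

    import torch
    import torch.nn as nn

    class LSTMController(nn.Module):
        """Maps a history of robot states to the next action command."""
        def __init__(self, state_dim=7, action_dim=7, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(state_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, action_dim)

        def forward(self, states):          # states: (batch, time, state_dim)
            out, _ = self.lstm(states)
            return self.head(out)           # one action per time step

    # Behavioural cloning on recorded virtual demonstrations (placeholders).
    model = LSTMController()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    states = torch.randn(8, 50, 7)          # 8 demo trajectories, 50 steps each
    actions = torch.randn(8, 50, 7)         # demonstrated actions
    for _ in range(100):
        loss = nn.functional.mse_loss(model(states), actions)
        opt.zero_grad(); loss.backward(); opt.step()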


    Information Retrieval

    A large scale study of SVM based methods for abstract screening in systematic reviews

    Tanay Kumar Saha, Mourad Ouzzani, Ahmed K. Elmagarmid
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG)

    A major task in systematic reviews is abstract screening, i.e., excluding
    the often hundreds or thousands of irrelevant citations returned from a
    database search based on titles and abstracts. Thus, a systematic review platform that
    can automate the abstract screening process is of huge importance. Several
    methods have been proposed for this task. However, it is very hard to clearly
    understand the applicability of these methods in a systematic review platform
    because of the following challenges: (1) the use of non-overlapping metrics for
    the evaluation of the proposed methods, (2) usage of features that are very
    hard to collect, (3) using a small set of reviews for the evaluation, and (4)
    no solid statistical testing or equivalence grouping of the methods. In this
    paper, we use feature representation that can be extracted per citation. We
    evaluate SVM-based methods (commonly used) on a large set of reviews ($61$) and
    metrics ($11$) to provide equivalence grouping of methods based on a solid
    statistical test. Our analysis also accounts for the strong variability of the
    metrics using $500$x$2$ cross-validation. While some methods shine for
    different metrics and for different datasets, there is no single method that
    dominates the pack. Furthermore, we observe that in some cases relevant
    (included) citations can be found after screening only 15-20% of them via a
    certainty based sampling. A few included citations present outlying
    characteristics and can only be found after a very large number of screening
    steps. Finally, we present an ensemble algorithm for producing a $5$-star
    rating of citations based on their relevance. This algorithm combines the
    best methods from our evaluation and, through its $5$-star rating, outputs an
    easier-to-consume prediction.
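
    For reference, a typical SVM-based screening pipeline of the kind evaluated
    here can be sketched with scikit-learn: per-citation features are TF-IDF of
    the title and abstract, and ranking by decision value supports the
    certainty-based sampling mentioned above (all data below is illustrative).

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    # texts: title + abstract per citation; labels: 1 = include, 0 = exclude
    texts = ["effect of drug A ...", "unrelated hardware paper ...", "drug A trial ..."]
    labels = [1, 0, 1]

    vec = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
    X = vec.fit_transform(texts)
    clf = LinearSVC(C=1.0).fit(X, labels)

    # Rank unscreened citations by decision value (most confident "include"
    # first), so reviewers can stop after a fraction of the ranked list.
    new = vec.transform(["randomized trial of drug A", "GPU scheduling"])
    scores = clf.decision_function(new)
    order = scores.argsort()[::-1]
    print(order, scores[order])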

    Cosine Similarity Search with Multi Index Hashing

    Sepehr Eghbali, Ladan Tahvildari
    Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); Learning (cs.LG)

    Due to rapid development of the Internet, recent years have witnessed an
    explosion in the rate of data generation. Dealing with data at current scales
    brings up unprecedented challenges. From the algorithmic view point, executing
    existing linear algorithms in information retrieval and machine learning on
    such tremendous amounts of data incur intolerable computational and storage
    costs. To address this issue, there is a growing interest to map data points in
    large-scale datasets to binary codes. This can significantly reduce the storage
    complexity of large-scale datasets. However, one of the most compelling reasons
    for using binary codes or any discrete representation is that they can be used
    as direct indices into a hash table. Incorporating a hash table offers fast query
    execution; one can look up the nearby buckets in a hash table populated with
    binary codes to retrieve similar items. Nonetheless, if binary codes are
    compared in terms of the cosine similarity rather than the Hamming distance,
    there is no fast exact sequential procedure to find the $K$ closest items to
    the query other than the exhaustive search. Given a large dataset of binary
    codes and a binary query, the problem that we address is to efficiently find
    $K$ closest codes in the dataset that yield the largest cosine similarities to
    the query. To handle this issue, we first elaborate on the relation between the
    Hamming distance and the cosine similarity. This allows finding the sequence of
    buckets to check in the hash table. Having this sequence, we propose a
    multi-index hashing approach that can increase the search speed up to orders of
    magnitude in comparison to the exhaustive search and even approximation methods
    such as LSH. We empirically evaluate the performance of the proposed algorithm
    on real world datasets.
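
    The Hamming-cosine relation underlying the bucket ordering can be stated
    directly for 0/1 codes: with weights $w_x = |x|$ and $w_y = |y|$ and Hamming
    distance $d$, the inner product is $(w_x + w_y - d)/2$, so for a fixed query
    the cosine similarity is a function of $(d, w_y)$ alone. A small sketch under
    these assumptions:

    import numpy as np

    def cosine_from_hamming(wx, wy, d):
        """Cosine similarity of two 0/1 codes with weights wx, wy at Hamming distance d."""
        inner = (wx + wy - d) / 2.0            # |x AND y|
        return inner / np.sqrt(wx * wy)

    x = np.array([1, 0, 1, 1, 0, 1], dtype=np.uint8)
    y = np.array([1, 1, 1, 0, 0, 1], dtype=np.uint8)
    d = int(np.count_nonzero(x != y))
    print(cosine_from_hamming(x.sum(), y.sum(), d))   # via the identity
    print(x @ y / np.sqrt(x.sum() * y.sum()))         # direct check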

    An Arabic-Hebrew parallel corpus of TED talks

    Mauro Cettolo
    Comments: To appear in Proceedings of the AMTA 2016 Workshop on Semitic Machine Translation (SeMaT)
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    We describe an Arabic-Hebrew parallel corpus of TED talks built upon WIT3,
    the Web inventory that repurposes the original content of the TED website in a
    way which is more convenient for MT researchers. The benchmark consists of
    about 2,000 talks, whose subtitles in Arabic and Hebrew have been accurately
    aligned and rearranged in sentences, for a total of about 3.5M tokens per
    language. Talks have been partitioned into train, development and test sets
    similarly in all respects to the MT tasks of the IWSLT 2016 evaluation
    campaign. In addition to describing the benchmark, we list the problems
    encountered in preparing it and the novel methods designed to solve them.
    Baseline MT results and some measures on sentence length are provided as an
    extrinsic evaluation of the quality of the benchmark.

    Sentiment Analysis on Bangla and Romanized Bangla Text (BRBT) using Deep Recurrent models

    A. Hassan, N. Mohammed, A. K. A. Azad
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Sentiment Analysis (SA) is an active research area in the digital age. With
    the rapid and constant growth of online social media sites and services, and
    the increasing amount of textual data available on them, such as statuses,
    comments, and reviews, the application of automatic SA is on the rise.
    However, most research on SA in natural language processing (NLP) is based on
    the English language. Despite being the sixth most widely spoken language in
    the world, Bangla still does not have a large, standard dataset. Because of
    this, recent research on Bangla has failed to produce results that are both
    comparable to work done by others and reusable as stepping stones for future
    researchers in this field. Therefore, we first provide a textual dataset that
    includes not just Bangla but Romanized Bangla texts as well, and that is
    substantial, post-processed, validated multiple times, and ready to be used in
    SA experiments. We tested this dataset with a deep recurrent model,
    specifically Long Short-Term Memory (LSTM), using two loss functions (binary
    cross-entropy and categorical cross-entropy), and also performed some
    experimental pre-training by using data from one validation to pre-train the
    other and vice versa. Lastly, we document the results, which were promising,
    along with some analysis.

    Battling the Digital Forensic Backlog through Data Deduplication

    Mark Scanlon
    Comments: Scanlon, M., Battling the Digital Forensic Backlog through Data Deduplication, 6th IEEE International Conference on Innovative Computing Technology (INTECH 2016), Dublin, Ireland, August 2016
    Subjects: Computers and Society (cs.CY); Cryptography and Security (cs.CR); Information Retrieval (cs.IR)

    Technological advancement can be found in many facets of everyday life,
    including personal computers, mobile devices, wearables, cloud services,
    video gaming, web-powered messaging, social media, Internet-connected devices,
    etc. This pervasive influence has resulted in these technologies being
    employed by criminals to conduct a range of crimes — both online and offline.
    Both the number of cases requiring digital forensic analysis and the sheer
    volume of information to be processed in each case has increased rapidly in
    recent years. As a result, the requirement for digital forensic investigation
    has ballooned, and law enforcement agencies throughout the world are scrambling
    to address this demand. While more and more members of law enforcement are
    being trained to perform the required investigations, the supply is not keeping
    up with the demand. Current digital forensic techniques are arduously
    time-consuming and require a significant amount of manpower to execute. This
    paper discusses a novel solution to combat the digital forensic backlog. This
    solution leverages a deduplication-based paradigm to eliminate the
    reacquisition, redundant storage, and reanalysis of previously processed data.

    Text Network Exploration via Heterogeneous Web of Topics

    Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang
    Comments: 8 pages
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    A text network is a data type in which each vertex is associated with a text
    document and relationships between documents are represented by edges.
    The proliferation of text networks such as hyperlinked webpages and academic
    citation networks has led to an increasing demand for quickly developing a
    general sense of a new text network, namely text network exploration. In this
    paper, we address the problem of text network exploration through constructing
    a heterogeneous web of topics, which allows people to investigate a text
    network associating word level with document level. To achieve this, a
    probabilistic generative model for text and links is proposed, where three
    different relationships in the heterogeneous topic web are quantified. We also
    develop a prototype demo system named TopicAtlas to exhibit this heterogeneous
    topic web, and demonstrate how the system can facilitate the task of text
    network exploration. Extensive qualitative analyses are included to verify the
    effectiveness of the heterogeneous topic web. In addition, we validate our
    model on real-life text networks, showing that it achieves good performance on
    objective evaluation metrics.


    Computation and Language

    Orthographic Syllable as basic unit for SMT between Related Languages

    Anoop Kunchukuttan, Pushpak Bhattacharyya
    Comments: 7 pages, 1 figure, compiled with XeTex, to be published at the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
    Subjects: Computation and Language (cs.CL)

    We explore the use of the orthographic syllable, a variable-length
    consonant-vowel sequence, as a basic unit of translation between related
    languages which use abugida or alphabetic scripts. We show that orthographic
    syllable level translation significantly outperforms models trained over other
    basic units (word, morpheme and character) when training over small parallel
    corpora.
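
    For alphabetic scripts, an orthographic syllable can be approximated as a
    maximal consonant sequence followed by a vowel group. The toy segmenter below
    illustrates the unit (our simplification for English letters, not the
    authors' exact segmentation rules):

    import re

    def orthographic_syllables(word):
        # C*V+ units; a trailing consonant cluster is kept as its own unit.
        pattern = re.compile(r"[^aeiou]*[aeiou]+|[^aeiou]+$", re.IGNORECASE)
        return pattern.findall(word)

    print(orthographic_syllables("translation"))   # ['tra', 'nsla', 'tio', 'n']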

    Multimodal Semantic Simulations of Linguistically Underspecified Motion Events

    Nikhil Krishnaswamy, James Pustejovsky
    Subjects: Computation and Language (cs.CL)

    In this paper, we describe a system for generating three-dimensional visual
    simulations of natural language motion expressions. We use a rich formal model
    of events and their participants to generate simulations that satisfy the
    minimal constraints entailed by the associated utterance, relying on semantic
    knowledge of physical objects and motion events. This paper outlines technical
    considerations and discusses implementing the aforementioned semantic models
    into such a system.

    An Arabic-Hebrew parallel corpus of TED talks

    Mauro Cettolo
    Comments: To appear in Proceedings of the AMTA 2016 Workshop on Semitic Machine Translation (SeMaT)
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    We describe an Arabic-Hebrew parallel corpus of TED talks built upon WIT3,
    the Web inventory that repurposes the original content of the TED website in a
    way which is more convenient for MT researchers. The benchmark consists of
    about 2,000 talks, whose subtitles in Arabic and Hebrew have been accurately
    aligned and rearranged in sentences, for a total of about 3.5M tokens per
    language. Talks have been partitioned into train, development and test sets
    similarly in all respects to the MT tasks of the IWSLT 2016 evaluation
    campaign. In addition to describing the benchmark, we list the problems
    encountered in preparing it and the novel methods designed to solve them.
    Baseline MT results and some measures on sentence length are provided as an
    extrinsic evaluation of the quality of the benchmark.

    FPGA-Based Low-Power Speech Recognition with Recurrent Neural Networks

    Minjae Lee, Kyuyeon Hwang, Jinhwan Park, Sungwook Choi, Sungho Shin, Wonyong Sung
    Comments: Accepted to SiPS 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    In this paper, a neural network based real-time speech recognition (SR)
    system is developed using an FPGA for very low-power operation. The implemented
    system employs two recurrent neural networks (RNNs); one is a
    speech-to-character RNN for acoustic modeling (AM) and the other is for
    character-level language modeling (LM). The system also employs a statistical
    word-level LM to improve the recognition accuracy. The results of the AM, the
    character-level LM, and the word-level LM are combined using a fairly simple
    N-best search algorithm instead of the hidden Markov model (HMM) based network.
    The RNNs are implemented using massively parallel processing elements (PEs) for
    low latency and high throughput. The weights are quantized to 6 bits to store
    all of them in the on-chip memory of an FPGA. The proposed algorithm is
    implemented on a Xilinx XC7Z045, and the system can operate much faster than
    real-time.

    Nonsymbolic Text Representation

    Hinrich Schuetze
    Subjects: Computation and Language (cs.CL)

    We introduce the first generic text representation model that is completely
    nonsymbolic, i.e., it does not require the availability of a segmentation or
    tokenization method that attempts to identify words or other symbolic units in
    text. This applies to training the parameters of the model on a training corpus
    as well as to applying it when computing the representation of a new text. We
    show that our model performs better than prior work on an information
    extraction and a text denoising task.

    Learning to Translate in Real-time with Neural Machine Translation

    Jiatao Gu, Graham Neubig, Kyunghyun Cho, Victor O.K. Li
    Comments: 9 pages, 8 figures
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Translating in real-time, a.k.a. simultaneous translation, outputs
    translation words before the input sentence ends, which is a challenging
    problem for conventional machine translation methods. We propose a neural
    machine translation (NMT) framework for simultaneous translation in which an
    agent learns to make decisions on when to translate from the interaction with a
    pre-trained NMT environment. To trade off quality and delay, we extensively
    explore various targets for delay and design a method for beam-search
    applicable in the simultaneous MT setting. Experiments against state-of-the-art
    baselines on two language pairs demonstrate the efficacy of the proposed
    framework both quantitatively and qualitatively.
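
    Schematically, such an agent interleaves READ actions (consume one more
    source word) with WRITE actions (commit a target word). The sketch below
    replaces the learned policy with a trivial confidence threshold; nmt_step is
    a hypothetical stand-in for a pre-trained NMT model, and nothing here
    reproduces the paper's actual agent.

    def simultaneous_translate(source_words, nmt_step, threshold=0.6, max_len=100):
        """Interleave READ/WRITE actions. `nmt_step` stands in for a pre-trained
        NMT model: given (source prefix, target prefix) it returns the best next
        target word and the model's confidence in it."""
        read, target = 0, []
        while len(target) < max_len:
            word, conf = nmt_step(source_words[:read], target)
            if conf >= threshold or read == len(source_words):
                if word == "</s>":
                    break
                target.append(word)      # WRITE: commit a translation word now
            else:
                read += 1                # READ: wait for one more source word
        return target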

    Sentiment Analysis on Bangla and Romanized Bangla Text (BRBT) using Deep Recurrent models

    A. Hassan, N. Mohammed, A. K. A. Azad
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Sentiment Analysis (SA) is an active research area in the digital age. With
    the rapid and constant growth of online social media sites and services, and
    the increasing amount of textual data available on them, such as statuses,
    comments, and reviews, the application of automatic SA is on the rise.
    However, most research on SA in natural language processing (NLP) is based on
    the English language. Despite being the sixth most widely spoken language in
    the world, Bangla still does not have a large, standard dataset. Because of
    this, recent research on Bangla has failed to produce results that are both
    comparable to work done by others and reusable as stepping stones for future
    researchers in this field. Therefore, we first provide a textual dataset that
    includes not just Bangla but Romanized Bangla texts as well, and that is
    substantial, post-processed, validated multiple times, and ready to be used in
    SA experiments. We tested this dataset with a deep recurrent model,
    specifically Long Short-Term Memory (LSTM), using two loss functions (binary
    cross-entropy and categorical cross-entropy), and also performed some
    experimental pre-training by using data from one validation to pre-train the
    other and vice versa. Lastly, we document the results, which were promising,
    along with some analysis.

    Syntactic Structures and Code Parameters

    Kevin Shu, Matilde Marcolli
    Comments: 14 pages, LaTeX, 12 png figures
    Subjects: Computation and Language (cs.CL)

    We assign binary and ternary error-correcting codes to the data of syntactic
    structures of world languages and we study the distribution of code points in
    the space of code parameters. We show that, while most codes populate the lower
    region approximating a superposition of Thomae functions, there is a
    substantial presence of codes above the Gilbert-Varshamov bound and even above
    the asymptotic bound and the Plotkin bound. We investigate the dynamics induced
    on the space of code parameters by spin glass models of language change, and
    show that, in the presence of entailment relations between syntactic parameters
    the dynamics can sometimes improve the code. For large sets of languages and
    syntactic data, one can gain information on the spin glass dynamics from the
    induced dynamics in the space of code parameters.
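
    To recall the quantities involved: a code with $M$ codewords of length $n$
    and minimum distance $d$ has rate $R = \log_2(M)/n$ and relative distance
    $\delta = d/n$, and the binary Gilbert-Varshamov curve is $R = 1 - H(\delta)$
    with $H$ the binary entropy. A small sketch computing these for a toy set of
    binary syntactic-parameter vectors (the data is illustrative):

    import numpy as np
    from itertools import combinations

    def code_parameters(codewords):
        """Rate R and relative minimum distance delta of a binary code."""
        M, n = codewords.shape
        d = min(int(np.count_nonzero(a != b)) for a, b in combinations(codewords, 2))
        return np.log2(M) / n, d / n

    def gv_bound(delta):
        """Binary Gilbert-Varshamov curve R = 1 - H(delta), for 0 < delta < 1/2."""
        h = -delta * np.log2(delta) - (1 - delta) * np.log2(1 - delta)
        return 1.0 - h

    # Toy "syntactic parameter" vectors for three languages (illustrative only).
    C = np.array([[0, 1, 1, 0, 1], [1, 1, 0, 0, 1], [0, 0, 1, 1, 0]])
    R, delta = code_parameters(C)
    print(R, delta, "above GV:", R > gv_bound(delta))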

    Very Deep Convolutional Neural Networks for Robust Speech Recognition

    Yanmin Qian, Philip C Woodland
    Comments: accepted by SLT 2016
    Subjects: Computation and Language (cs.CL)

    This paper describes the extension and optimization of our previous work on
    very deep convolutional neural networks (CNNs) for effective recognition of
    noisy speech in the Aurora 4 task. The appropriate number of convolutional
    layers, the sizes of the filters, pooling operations and input feature maps are
    all modified: the filter and pooling sizes are reduced and dimensions of input
    feature maps are extended to allow adding more convolutional layers.
    Furthermore, appropriate input padding and input feature map selection
    strategies are developed. In addition, an adaptation framework using joint
    training of very deep CNN with auxiliary features i-vector and fMLLR features
    is developed. These modifications give substantial word error rate reductions
    over the standard CNN used as baseline. Finally the very deep CNN is combined
    with an LSTM-RNN acoustic model and it is shown that state-level weighted log
    likelihood score combination in a joint acoustic model decoding scheme is very
    effective. On the Aurora 4 task, the very deep CNN achieves a WER of 8.81%,
    which improves to 7.99% with auxiliary-feature joint training and to 7.09%
    with LSTM-RNN joint decoding.

    Sentence Segmentation in Narrative Transcripts from Neuropsychological Tests using Recurrent Convolutional Neural Networks

    Marcos Vinícius Treviso, Christopher Shulby, Sandra Maria Aluísio
    Comments: 10 pages
    Subjects: Computation and Language (cs.CL)

    Automated discourse analysis tools based on Natural Language Processing (NLP)
    aiming at the diagnosis of language-impairing dementias generally extract
    several textual metrics of narrative transcripts. However, the absence of
    sentence boundary segmentation in the transcripts prevents the direct
    application of NLP methods which rely on these marks in order to function
    properly, such as taggers and parsers. We present the first steps taken towards
    automatic neuropsychological evaluation based on narrative discourse analysis,
    presenting a new automatic sentence segmentation method for impaired speech.
    Our model uses recurrent convolutional neural networks with prosodic and
    Part-of-Speech (PoS) features, as well as word embeddings. It was evaluated intrinsically on
    impaired, spontaneous speech as well as normal, prepared speech. The results
    suggest that our model is robust for impaired speech and can be used in
    automated discourse analysis tools to differentiate narratives produced by
    patients with Mild Cognitive Impairment from those of healthy elderly people.

    Vocabulary Selection Strategies for Neural Machine Translation

    Gurvan L'Hostis, David Grangier, Michael Auli
    Subjects: Computation and Language (cs.CL)

    Classical translation models constrain the space of possible outputs by
    selecting a subset of translation rules based on the input sentence. Recent
    work on improving the efficiency of neural translation models adopted a similar
    strategy by restricting the output vocabulary to a subset of likely candidates
    given the source. In this paper we experiment with context and embedding-based
    selection methods and extend previous work by examining speed and accuracy
    trade-offs in more detail. We show that decoding time on CPUs can be reduced by
    up to 90% and training time by 25% on the WMT15 English-German and WMT16
    English-Romanian tasks with the same or only a negligible change in accuracy. This
    brings the time to decode with a state of the art neural translation system to
    just over 140 msec per sentence on a single CPU core for English-German.
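
    The selection step itself can be sketched simply: before decoding a sentence,
    restrict the output vocabulary to the union of the top candidate translations
    of each source word plus a short list of frequent target words. The
    co-occurrence table below is a toy stand-in for the context- and
    embedding-based selectors studied in the paper.

    def select_vocabulary(source_tokens, cooc_topk, common_words, k=3):
        """Union of the k most likely translations of each source word plus
        a short list of always-allowed frequent target words."""
        vocab = set(common_words)
        for w in source_tokens:
            vocab.update(cooc_topk.get(w, [])[:k])
        return vocab

    cooc_topk = {"Haus": ["house", "home", "building"],
                 "grün": ["green", "verdant"]}
    common = ["the", "a", "is", ".", "</s>"]
    print(select_vocabulary("das Haus ist grün".split(), cooc_topk, common))

    Decoding then computes the output softmax only over this reduced set, which
    is where the speed-up comes from.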

    Discriminating Similar Languages: Evaluations and Explorations

    Cyril Goutte, Serge Léger, Shervin Malmasi, Marcos Zampieri
    Comments: Proceedings of Language Resources and Evaluation (LREC)
    Journal-ref: Proceedings of Language Resources and Evaluation (LREC). Portoroz,
    Slovenia. pp 1800-1807 (2016)
    Subjects: Computation and Language (cs.CL)

    We present an analysis of the performance of machine learning classifiers on
    discriminating between similar languages and language varieties. We carried out
    a number of experiments using the results of the two editions of the
    Discriminating between Similar Languages (DSL) shared task. We investigate the
    progress made between the two tasks, estimate an upper bound on possible
    performance using ensemble and oracle combination, and provide learning curves
    to help us understand which languages are more challenging. A number of
    difficult sentences are identified and investigated further with human
    annotation.

    Modeling Language Change in Historical Corpora: The Case of Portuguese

    Marcos Zampieri, Shervin Malmasi, Mark Dras
    Comments: Proceedings of Language Resources and Evaluation (LREC)
    Journal-ref: Proceedings of Language Resources and Evaluation (LREC). Portoroz,
    Slovenia. pp. 4098-4104 (2016)
    Subjects: Computation and Language (cs.CL)

    This paper presents a number of experiments to model changes in a historical
    Portuguese corpus composed of literary texts for the purpose of temporal text
    classification. Algorithms were trained to classify texts with respect to their
    publication date taking into account lexical variation represented as word
    n-grams, and morphosyntactic variation represented by part-of-speech (POS)
    distribution. We report results of 99.8% accuracy using word unigram features
    with a Support Vector Machines classifier to predict the publication date of
    documents in time intervals of both one century and half a century. A feature
    analysis is performed to investigate the most informative features for this
    task and how they are linked to language change.

    Semi-supervised Learning with Sparse Autoencoders in Phone Classification

    Akash Kumar Dhaka, Giampiero Salvi
    Comments: 5 pages, 1 figure, 2 tables
    Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)

    We propose the application of a semi-supervised learning method to improve
    the performance of acoustic modelling for automatic speech recognition based on
    deep neural networks. As opposed to unsupervised initialisation followed by
    supervised fine-tuning, our method takes advantage of both unlabelled and
    labelled data simultaneously through mini-batch stochastic gradient descent.
    We tested the method with varying proportions of labelled vs unlabelled
    observations in frame-based phoneme classification on the TIMIT database. Our
    experiments show that the method outperforms standard supervised training for
    an equal amount of labelled data and provides competitive error rates compared
    to state-of-the-art graph-based semi-supervised learning techniques.
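
    In spirit, every mini-batch mixes a supervised classification loss on
    labelled frames with an autoencoder reconstruction loss and sparsity penalty
    on unlabelled frames. A hedged PyTorch sketch, with the architecture,
    weighting and sizes as assumptions rather than the paper's settings:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    enc = nn.Linear(40, 128)                 # 40-dim acoustic frame -> hidden
    dec = nn.Linear(128, 40)                 # reconstruction head
    clf = nn.Linear(128, 48)                 # 48 phone classes
    opt = torch.optim.SGD(list(enc.parameters()) + list(dec.parameters())
                          + list(clf.parameters()), lr=0.1)

    x_lab = torch.randn(32, 40); y_lab = torch.randint(0, 48, (32,))
    x_unl = torch.randn(96, 40)              # more unlabelled than labelled

    h_lab, h_unl = torch.sigmoid(enc(x_lab)), torch.sigmoid(enc(x_unl))
    loss = (F.cross_entropy(clf(h_lab), y_lab)       # supervised term
            + F.mse_loss(dec(h_unl), x_unl)          # reconstruction term
            + 1e-3 * h_unl.abs().mean())             # sparsity penalty
    opt.zero_grad(); loss.backward(); opt.step()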

    Text Network Exploration via Heterogeneous Web of Topics

    Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang
    Comments: 8 pages
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    A text network is a data type in which each vertex is associated with a text
    document and relationships between documents are represented by edges.
    The proliferation of text networks such as hyperlinked webpages and academic
    citation networks has led to an increasing demand for quickly developing a
    general sense of a new text network, namely text network exploration. In this
    paper, we address the problem of text network exploration through constructing
    a heterogeneous web of topics, which allows people to investigate a text
    network associating word level with document level. To achieve this, a
    probabilistic generative model for text and links is proposed, where three
    different relationships in the heterogeneous topic web are quantified. We also
    develop a prototype demo system named TopicAtlas to exhibit this heterogeneous
    topic web, and demonstrate how the system can facilitate the task of text
    network exploration. Extensive qualitative analyses are included to verify the
    effectiveness of the heterogeneous topic web. In addition, we validate our
    model on real-life text networks, showing that it achieves good performance on
    objective evaluation metrics.


    Distributed, Parallel, and Cluster Computing

    CDSFA Stochastic Frontier Analysis Approach to Revenue Modeling in Large Cloud Data Centers

    Jyotirmoy Sarkar, Bidisha Goswami, Snehanshu Saha, Saibal Kar
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Enterprises are investing heavily in cloud data centers to meet the ever
    surging business demand. Data Center is a facility, which houses computer
    systems and associated components, such as telecommunications and storage
    systems. It generally includes power supply equipment, communication
    connections and cooling equipment. A large data center can use as much
    electricity as a small town. Due to the emergence of data center based
    computing services, it has become necessary to examine how the costs associated
    with data centers evolve over time, mainly in view of efficiency issues. We
    present a quasi form of the Cobb-Douglas model, which addresses revenue and
    profit issues in running large data centers. The stochastic form is
    introduced and explored along with the quasi Cobb-Douglas model to understand
    the behavior of the model in depth. Harrod neutrality and Solow neutrality are
    incorporated in the model to identify technological progress in cloud data
    centers. This allows us to shed light on the stochastic uncertainty of cloud
    data center operations. A general approach to optimizing the revenue cost of
    data centers using Cobb-Douglas Stochastic Frontier Analysis (CDSFA) is
    presented. Next, we develop the optimization model for large data centers. The
    mathematical basis of CDSFA has been utilized for cost optimization and profit
    maximization in data centers. The results are found to be quite useful in view
    of production reorganization in large data centers around the world.
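
    For orientation, the standard machinery these methods build on (our gloss,
    not an excerpt from the paper) is the Cobb-Douglas production function
    $y = A \prod_j x_j^{\beta_j}$, whose log-linear stochastic frontier form is

    $\ln y_i = \beta_0 + \sum_j \beta_j \ln x_{ij} + v_i - u_i$

    where the $x_{ij}$ are cost/input factors of data center $i$, $v_i$ is
    symmetric noise, and $u_i \ge 0$ captures inefficiency; the elasticities
    $\beta_j$ indicate which inputs drive revenue, which is what the cost
    optimization exploits.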

    Energy Efficient Restoring of Barrier Coverage in Wireless Sensor Networks Using Limited Mobility Sensors

    Dinesh Dash, Anurag Dasgupta
    Comments: 20 pages, 8 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    In Wireless Sensor Networks, sensors are used for tracking objects,
    monitoring health and observing a region/territory for different environmental
    parameters. The coverage problem in sensor networks concerns the quality of
    monitoring of a given region. Different measures of coverage exist depending
    on the application. Barrier coverage is a type of coverage that ensures all
    paths crossing the boundary of a region intersect at least one sensor’s
    sensing region. The goal of the sensors is to detect intruders as they cross
    the boundary or as they penetrate a protected area. The sensors are limited
    by their battery life. Restoring barrier coverage on sensor failure using
    mobile sensors with minimum total displacement is the primary objective of
    this paper. A centralized barrier coverage restoration scheme is proposed to
    increase the robustness of the network. We formulate restoring barrier
    coverage as a bipartite matching problem. A distributed barrier coverage
    restoration algorithm is also proposed, which first searches for an existing
    alternate barrier; if none is found, an alternate barrier is reconstructed by
    shifting existing sensors in a cascaded manner. Detailed simulation results
    are presented to evaluate the performance of our algorithms.
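
    The centralized formulation maps naturally onto minimum-cost bipartite
    matching between available mobile sensors and the positions needed to
    re-close the barrier; a sketch with SciPy's Hungarian solver (the coordinates
    and the Euclidean displacement cost are illustrative assumptions):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    sensors = np.array([[0.0, 2.0], [3.0, 4.0], [7.0, 1.0]])   # mobile sensors (x, y)
    gaps = np.array([[1.0, 0.0], [6.0, 0.0]])                  # barrier positions to fill

    # cost[i, j] = displacement of sensor i if moved to gap position j
    cost = np.linalg.norm(sensors[:, None, :] - gaps[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)                   # min total displacement
    for i, j in zip(rows, cols):
        print(f"move sensor {i} to gap {j}, distance {cost[i, j]:.2f}")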

    Dithen: A Computation-as-a-Service Cloud Platform For Large-Scale Multimedia Processing

    Joseph Doyle, Vasileios Giotsas, Mohammad Ashraful Anam, Yiannis Andreopoulos
    Comments: to appear in IEEE Transactions on Cloud Computing. arXiv admin note: substantial text overlap with arXiv:1604.04804
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We present Dithen, a novel computation-as-a-service (CaaS) cloud platform
    specifically tailored to the parallel execution of large-scale multimedia
    tasks. Dithen handles the upload/download of both multimedia data and
    executable items, the assignment of compute units to multimedia workloads, and
    the reactive control of the available compute units to minimize the cloud
    infrastructure cost under deadline-abiding execution. Dithen combines three key
    properties: (i) the reactive assignment of individual multimedia tasks to
    available computing units according to availability and predetermined
    time-to-completion constraints; (ii) optimal resource estimation based on
    Kalman-filter estimates; (iii) the use of additive increase multiplicative
    decrease (AIMD) algorithms, well known as the congestion-control mechanism of
    the Transmission Control Protocol (TCP), for the control of the number of units servicing
    workloads. The deployment of Dithen over Amazon EC2 spot instances is shown to
    be capable of processing more than 80,000 video transcoding, face detection and
    image processing tasks (equivalent to the processing of more than 116 GB of
    compressed data) for less than $1 in billing cost from EC2. Moreover, the
    proposed AIMD-based control mechanism, in conjunction with the Kalman
    estimates, is shown to provide for more than 27% reduction in EC2 spot instance
    cost against methods based on reactive resource estimation. Finally, Dithen is
    shown to offer a 38% to 500% reduction of the billing cost against the current
    state-of-the-art in CaaS platforms on Amazon EC2 (Amazon Lambda and Amazon
    Autoscale). A baseline version of Dithen is currently available at
    this http URL under the “AutoScale” option.
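
    The AIMD rule itself is simple: add capacity while deadlines are at risk,
    and cut it multiplicatively once the backlog is comfortably serviced. An
    illustrative control step (the constants and the deadline_at_risk probe are
    assumptions, not Dithen's API):

    def aimd_controller(units, deadline_at_risk, alpha=2, beta=0.5,
                        min_units=1, max_units=1000):
        """One AIMD control step for the number of active compute units."""
        if deadline_at_risk:
            units = min(units + alpha, max_units)      # additive increase
        else:
            units = max(int(units * beta), min_units)  # multiplicative decrease
        return units

    units = 10
    for risk in [True, True, True, False, True, False]:
        units = aimd_controller(units, risk)
        print(units)   # 12 14 16 8 10 5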

    Exploiting Universal Redundancy

    Ali Shoker
    Comments: 10 pages
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Fault tolerance is essential for building reliable services; however, it
    comes at the price of redundancy, mainly the “replication factor” and
    “diversity”. With the increasing reliance on Internet-based services, more
    machines (mainly servers) are needed to scale out, multiplied with the extra
    expense of replication. This paper revisits the very fundamentals of fault
    tolerance and presents “artificial redundancy”: a formal generalization of
    “exact copy” redundancy in which new sources of redundancy are exploited to
    build fault tolerant systems. On this concept, we show how to build “artificial
    replication” and design “artificial fault tolerance” (AFT). We discuss the
    properties of these new techniques showing that AFT extends current fault
    tolerant approaches to use other forms of redundancy aiming at reduced cost and
    high diversity.

    A Study of Revenue Cost Dynamics in Large Data Centers: A Factorial Design Approach

    Gambhire Swati Sampatrao, Sudeepa Roy Dey, Bidisha Goswami, Sai Prasanna M.S, Snehanshu Saha
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Revenue optimization of large data centers is an open and challenging
    problem. The intricacy of the problem is due to the presence of too many
    parameters posing as costs or investment. This paper proposes a model to
    optimize the revenue in cloud data center and analyzes the model, revenue and
    different investment or cost commitments of organizations investing in data
    centers. The model uses the Cobb-Douglas production function to quantify the
    boundaries and the most significant factors to generate the revenue. The
    dynamics between revenue and cost is explored through a design of experiments
    (DoE) that interprets revenue as a function of cost/investment factors with
    different levels/fluctuations. The optimal elasticities associated with these
    factors for maximum revenue are computed and verified. The model
    response is interpreted in light of the business scenario of data centers.

    Darwini: Generating realistic large-scale social graphs

    Sergey Edunov, Dionysios Logothetis, Cheng Wang, Avery Ching, Maja Kabiljo
    Subjects: Social and Information Networks (cs.SI); Distributed, Parallel, and Cluster Computing (cs.DC)

    Synthetic graph generators facilitate research in graph algorithms and
    processing systems by providing access to data, for instance, graphs resembling
    social networks, while circumventing privacy and security concerns.
    Nevertheless, their practical value lies in their ability to capture important
    metrics of real graphs, such as degree distribution and clustering properties.
    Graph generators must also be able to produce such graphs at the scale of
    real-world industry graphs, that is, hundreds of billions or trillions of
    edges.

    In this paper, we propose Darwini, a graph generator that captures a number
    of core characteristics of real graphs. Importantly, given a source graph, it
    can reproduce the degree distribution and, unlike existing approaches, the
    local clustering coefficient and joint-degree distributions. Furthermore,
    Darwini maintains metrics such as node PageRank, eigenvalues and the K-core
    decomposition of a source graph. Comparing Darwini with state-of-the-art
    generative models, we show that it can reproduce these characteristics more
    accurately. Finally, we provide an open source implementation of our approach
    on the vertex-centric Apache Giraph model that allows us to create synthetic
    graphs with one trillion edges.

    Flocking Virtual Machines in Quest for Responsive IoT Cloud Services

    Sherif Abdelwahab, Bechir Hamdaoui
    Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC)

    We propose Flock, a simple and scalable protocol that enables live migration
    of Virtual Machines (VMs) across heterogeneous edge and conventional cloud
    platforms to improve the responsiveness of cloud services. Flock is designed
    with properties that are suitable for the use cases of the Internet of Things
    (IoT). We describe the properties of regularized latency measurements that
    Flock can use for asynchronous and autonomous migration decisions. Such
    decisions allow communicating VMs to follow a flocking-like behavior that
    consists of three simple rules: separation, alignment, and cohesion. Using game
    theory, we derive analytical bounds on Flock’s Price of Anarchy (PoA), and
    prove that flocking VMs converge to a Nash Equilibrium while settling in the
    best possible cloud platforms. We verify the effectiveness of Flock through
    simulations and discuss how its generic objective can simply be tweaked to
    achieve other objectives, such as cloud load balancing and energy consumption
    minimization.


    Learning

    Deep Visual Foresight for Planning Robot Motion

    Chelsea Finn, Sergey Levine
    Comments: Supplementary video: this https URL
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    A key challenge in scaling up robot learning to many skills and environments
    is removing the need for human supervision, so that robots can collect their
    own data and improve their own performance without being limited by the cost of
    requesting human feedback. Model-based reinforcement learning holds the promise
    of enabling an agent to learn to predict the effects of its actions, which
    could provide flexible predictive models for a wide range of tasks and
    environments, without detailed human supervision. We develop a method for
    combining deep action-conditioned video prediction models with model-predictive
    control that uses entirely unlabeled training data. Our approach does not
    require a calibrated camera, an instrumented training set-up, or precise
    sensing and actuation. Our results show that our method enables a real robot to
    perform nonprehensile manipulation — pushing objects — and can handle novel
    objects not seen during training.

    Network structures and fast distributed MMSE estimation

    Muhammed O. Sayin, Suleyman S. Kozat
    Comments: Submitted to Digital Signal Processing
    Subjects: Learning (cs.LG)

    We construct optimal estimation algorithms over distributed networks for
    state estimation in the mean-square error (MSE) sense. Here, we have a
    distributed collection of agents with processing and cooperation capabilities.
    These agents continually observe a noisy version of a desired state of the
    nature through a linear model and seek to learn this state by interacting with
    each other. Although this problem has attracted significant attention and
    has been studied extensively in several different fields, from machine learning
    theory to signal processing, all the well-known strategies achieve suboptimal
    learning performance in the MSE sense. To this end, we provide algorithms that
    achieve distributed minimum MSE (MMSE) performance over an arbitrary network
    topology based on the aggregation of information at each agent. This approach
    differs from the diffusion of information across network, i.e., exchange of
    local estimates per time instance. Importantly, we show that exchange of local
    estimates is sufficient only over certain network topologies. By inspecting
    these network structures, we also propose strategies that achieve the
    distributed MMSE performance also through the diffusion of information such
    that we can substantially reduce the communication load while achieving the
    best possible MSE performance. For practical implementations we provide
    approaches to reduce the complexity of the algorithms through the
    time-windowing of the observations. Finally, in the numerical examples, we
    demonstrate the superior performance of the introduced algorithms in the MSE
    sense due to optimal estimation.

    Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search

    Ali Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine
    Comments: Submitted to the IEEE International Conference on Robotics and Automation 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)

    In principle, reinforcement learning and policy search methods can enable
    robots to learn highly complex and general skills that may allow them to
    function amid the complexity and diversity of the real world. However, training
    a policy that generalizes well across a wide range of real-world conditions
    requires far greater quantity and diversity of experience than is practical to
    collect with a single robot. Fortunately, it is possible for multiple robots to
    share their experience with one another, and thereby, learn a policy
    collectively. In this work, we explore distributed and asynchronous policy
    learning as a means to achieve generalization and improved training times on
    challenging, real-world manipulation tasks. We propose a distributed and
    asynchronous version of Guided Policy Search and use it to demonstrate
    collective policy learning on a vision-based door opening task using four
    robots. We show that it achieves better generalization, utilization, and
    training times than the single robot alternative.

    Flint Water Crisis: Data-Driven Risk Assessment Via Residential Water Testing

    Jacob Abernethy (University of Michigan), Cyrus Anderson (University of Michigan), Chengyu Dai (University of Michigan), Arya Farahi (University of Michigan), Linh Nguyen (University of Michigan), Adam Rauh (University of Michigan), Eric Schwartz (University of Michigan), Wenbo Shen (University of Michigan), Guangsha Shi (University of Michigan), Jonathan Stroud (University of Michigan), Xinyu Tan (University of Michigan), Jared Webb (University of Michigan), Sheng Yang (University of Michigan)
    Comments: Presented at the Data For Good Exchange 2016
    Subjects: Learning (cs.LG); Applications (stat.AP)

    Recovery from the Flint Water Crisis has been hindered by uncertainty in both
    the water testing process and the causes of contamination. In this work, we
    develop an ensemble of predictive models to assess the risk of lead
    contamination in individual homes and neighborhoods. To train these models, we
    utilize a wide range of data sources, including voluntary residential water
    tests, historical records, and city infrastructure data. Additionally, we use
    our models to identify the most prominent factors that contribute to a high
    risk of lead contamination. In this analysis, we find that lead service lines
    are not the only factor that is predictive of the risk of lead contamination of
    water. These results could be used to guide the long-term recovery efforts in
    Flint, minimize the immediate damages, and improve resource-allocation
    decisions for similar water infrastructure crises.

    Quantifying Urban Traffic Anomalies

    Zhengyi Zhou (AT&T Labs Research), Philipp Meerkamp (Bloomberg LP), Chris Volinsky (AT&T Labs Research)
    Comments: Presented at the Data For Good Exchange 2016
    Subjects: Learning (cs.LG)

    Detecting and quantifying anomalies in urban traffic is critical for
    real-time alerting or re-routing in the short run and urban planning in the
    long run. We describe a two-step framework that achieves these two goals in a
    robust, fast, online, and unsupervised manner. First, we adapt stable principal
    component pursuit to detect anomalies for each road segment. This allows us to
    pinpoint traffic anomalies early and precisely in space. Then we group the
    road-level anomalies across time and space into meaningful anomaly events using
    a simple graph expansion procedure. These events can be easily clustered,
    visualized, and analyzed by urban planners. We demonstrate the effectiveness of
    our system using 7 weeks of anonymized and aggregated cellular location data in
    Dallas-Fort Worth. We suggest potential opportunities for urban planners and
    policy makers to use our methodology to make informed changes. These
    applications include real-time re-routing of traffic in response to abnormally
    high traffic, or identifying candidates for high-impact infrastructure
    projects.

    End-to-End Radio Traffic Sequence Recognition with Deep Recurrent Neural Networks

    Timothy J. O'Shea, Seth Hitefield, Johnathan Corgan
    Subjects: Learning (cs.LG); Networking and Internet Architecture (cs.NI)

    We investigate sequence machine learning techniques on raw radio signal
    time-series data. By applying deep recurrent neural networks we learn to
    discriminate between several application layer traffic types on top of a
    constant envelope modulation without using an expert demodulation algorithm. We
    show that complex protocol sequences can be learned and used for both
    classification and generation tasks using this approach.

    Can Evolutionary Sampling Improve Bagged Ensembles?

    Harsh Nisar, Bhanu Pratap Singh Rawat
    Comments: 3 pages, 1 table, Data Efficient Machine Learning Workshop (DEML’16), ICML
    Subjects: Learning (cs.LG)

    The Perturb and Combine (P&C) group of methods generates multiple versions of
    a predictor by perturbing the training set or its construction and then
    combining them into a single predictor (Breiman, 1996b). The motive is to
    improve accuracy in unstable classification and regression methods. One of
    the best-known methods in this group is Bagging; Arcing (Adaptive Resampling
    and Combining) methods like AdaBoost are smarter variants of P&C methods. In this
    extended abstract, we lay the groundwork for a new family of methods under the
    P&C umbrella, known as Evolutionary Sampling (ES). We employ Evolutionary
    algorithms to suggest smarter sampling in both the feature space (sub-spaces)
    as well as training samples. We discuss multiple fitness functions to assess
    ensembles and empirically compare our performance against randomized sampling
    of training data and feature sub-spaces.

    Accelerating Deep Convolutional Networks using low-precision and sparsity

    Ganesh Venkatesh, Eriko Nurvitadhi, Debbie Marr
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We explore techniques to significantly improve the compute efficiency and
    performance of Deep Convolution Networks without impacting their accuracy. To
    improve the compute efficiency, we focus on achieving high accuracy with
    extremely low-precision (2-bit) weight networks, and to accelerate the
    execution time, we aggressively skip operations on zero-values. We achieve the
    highest reported accuracy of 76.6% Top-1/93% Top-5 on the ImageNet object
    classification challenge with a low-precision network (a GitHub release of
    the source code is coming soon), while reducing the compute requirement by ~3x
    compared to a full-precision network that achieves similar accuracy.
    Furthermore, to fully exploit the benefits of our low-precision networks, we
    build a deep learning accelerator core, dLAC, that can achieve up to 1
    TFLOP/mm^2 equivalent for single-precision floating-point operations (~2
    TFLOP/mm^2 for half-precision).
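
    One common way to realize 2-bit weights, shown here as a generic sketch
    rather than the paper's exact scheme, is to round each weight to one of four
    integer levels under a per-tensor scale; note that the zeros this produces
    are exactly what zero-skipping exploits:

    import numpy as np

    def quantize_2bit(w):
        """Quantize weights to 4 levels {-2, -1, 0, 1} * scale (2-bit storage)."""
        scale = np.abs(w).mean() + 1e-8              # simple per-tensor scale
        q = np.clip(np.round(w / scale), -2, 1)      # 4 representable integers
        return q * scale, q.astype(np.int8)          # dequantized weights, codes

    w = np.random.randn(6)
    wq, codes = quantize_2bit(w)
    print(w, wq, codes, sep="\n")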

    Deep unsupervised learning through spatial contrasting

    Elad Hoffer, Itay Hubara, Nir Ailon
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Convolutional networks have marked their place over the last few years as the
    best performing model for various visual tasks. They are, however, most suited
    for supervised learning from large amounts of labeled data. Previous attempts
    have been made to use unlabeled data to improve model performance by applying
    unsupervised techniques. These attempts require different architectures and
    training methods. In this work we present a novel approach for unsupervised
    training of Convolutional networks that is based on contrasting between spatial
    regions within images. This criterion can be employed within conventional
    neural networks and trained using standard techniques such as SGD and
    back-propagation, thus complementing supervised methods.
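
    The contrastive criterion can be sketched as follows: features of two
    patches sampled from the same image should be closer than features of a
    patch from a different image. This is a generic rendering of the idea
    (feature extractor omitted, constants assumed), not the authors' exact loss.

    import torch

    def spatial_contrasting_loss(f_anchor, f_pos, f_neg):
        """f_anchor/f_pos: features of two patches from the same image,
        f_neg: features of a patch from another image (shape: batch x dim)."""
        d_pos = ((f_anchor - f_pos) ** 2).sum(dim=1)
        d_neg = ((f_anchor - f_neg) ** 2).sum(dim=1)
        # probability that the positive patch is the true match
        p = torch.exp(-d_pos) / (torch.exp(-d_pos) + torch.exp(-d_neg))
        return -torch.log(p + 1e-8).mean()

    f = torch.randn(4, 128, requires_grad=True)
    loss = spatial_contrasting_loss(f, f + 0.1 * torch.randn(4, 128),
                                    torch.randn(4, 128))
    loss.backward()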

    Latent Tree Analysis

    Nevin L. Zhang, Leonard K. M. Poon
    Comments: 7 pages, 5 figures
    Subjects: Learning (cs.LG)

    Latent tree analysis seeks to model the correlations among a set of random
    variables using a tree of latent variables. It was proposed as an improvement
    to latent class analysis — a method widely used in social sciences and
    medicine to identify homogeneous subgroups in a population. It provides new and
    fruitful perspectives on a number of machine learning areas, including cluster
    analysis, topic detection, and deep probabilistic modeling. This paper gives an
    overview of the research on latent tree analysis and various ways it is used in
    practice.

    Faster Kernels for Graphs with Continuous Attributes via Hashing

    Christopher Morris, Nils M. Kriege, Kristian Kersting, Petra Mutzel
    Comments: IEEE ICDM 2016
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    While state-of-the-art kernels for graphs with discrete labels scale well to
    graphs with thousands of nodes, the few existing kernels for graphs with
    continuous attributes, unfortunately, do not scale well. To overcome this
    limitation, we present hash graph kernels, a general framework to derive
    kernels for graphs with continuous attributes from discrete ones. The idea is
    to iteratively turn continuous attributes into discrete labels using randomized
    hash functions. We illustrate hash graph kernels for the Weisfeiler-Lehman
    subtree kernel and for the shortest-path kernel. The resulting novel graph
    kernels are shown to be both able to handle graphs with continuous attributes
    and scalable to large graphs and data sets. This is supported by our
    theoretical analysis and demonstrated by an extensive experimental evaluation.
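
    The core trick, iteratively turning continuous attributes into discrete
    labels, can be illustrated with a randomized quantized projection in the
    style of LSH for L2 distances (a stand-in; the paper's hash families may
    differ):

    import numpy as np

    def hash_attributes(X, w=1.0, seed=0):
        """Map continuous node attributes (n_nodes x dim) to discrete labels
        via a randomized quantized projection."""
        rng = np.random.default_rng(seed)
        r = rng.normal(size=X.shape[1])
        b = rng.uniform(0, w)
        return np.floor((X @ r + b) / w).astype(int)   # one label per node

    X = np.random.randn(5, 3)                          # 5 nodes, 3-dim attributes
    for it in range(3):                                # several hash iterations
        print(hash_attributes(X, seed=it))

    Each hash iteration yields a discretely labelled copy of the graph, on which
    any discrete-label kernel (e.g., the Weisfeiler-Lehman subtree kernel) can be
    run and the results averaged.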

    Kernel Selection using Multiple Kernel Learning and Domain Adaptation in Reproducing Kernel Hilbert Space, for Face Recognition under Surveillance Scenario

    Samik Banerjee, Sukhendu Das
    Comments: 13 pages, 15 figures, 4 tables. Kernel Selection, Surveillance, Multiple Kernel Learning, Domain Adaptation, RKHS, Hallucination
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Learning (cs.LG)

    Face Recognition (FR) has been of interest to many researchers over the past
    few decades due to its passive nature as a biometric authentication method. Despite the
    high accuracy achieved by face recognition algorithms under controlled
    conditions, achieving the same performance for face images obtained in
    surveillance scenarios, is a major hurdle. Some attempts have been made to
    super-resolve the low-resolution face images and improve the contrast, without
    a considerable degree of success. The technique proposed in this paper tries to
    cope with the very low resolution and low contrast face images obtained from
    surveillance cameras, for FR under surveillance conditions. For Support Vector
    Machine classification, the selection of appropriate kernel has been a widely
    discussed issue in the research community. In this paper, we propose a novel
    kernel selection technique termed as MFKL (Multi-Feature Kernel Learning) to
    obtain the best feature-kernel pairing. Our proposed technique employs
    effective kernel selection via a Multiple Kernel Learning (MKL) method to
    choose the optimal kernel, which is used along with an unsupervised domain
    adaptation method in the Reproducing Kernel Hilbert Space (RKHS), to solve the problem.
    Rigorous experimentation has been performed on three real-world surveillance
    face datasets : FR\_SURV, SCface and ChokePoint. Results have been shown using
    Rank-1 Recognition Accuracy, ROC and CMC measures. Our proposed method
    outperforms all other recent state-of-the-art techniques by a considerable
    margin.

    Deep Reinforcement Learning for Robotic Manipulation

    Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Reinforcement learning holds the promise of enabling autonomous robots to
    learn large repertoires of behavioral skills with minimal human intervention.
    However, robotic applications of reinforcement learning often compromise the
    autonomy of the learning process in favor of achieving training times that are
    practical for real physical systems. This typically involves introducing
    hand-engineered policy representations and human-supplied demonstrations. Deep
    reinforcement learning alleviates this limitation by training general-purpose
    neural network policies, but applications of direct deep reinforcement learning
    algorithms have so far been restricted to simulated settings and relatively
    simple tasks, due to their apparent high sample complexity. In this paper, we
    demonstrate that a recent deep reinforcement learning algorithm based on
    off-policy training of deep Q-functions can scale to complex 3D manipulation
    tasks and can learn deep neural network policies efficiently enough to train on
    real physical robots. We demonstrate that the training times can be further
    reduced by parallelizing the algorithm across multiple robots which pool their
    policy updates asynchronously. Our experimental evaluation shows that our
    method can learn a variety of 3D manipulation skills in simulation and a
    complex door opening skill on real robots without any prior demonstrations or
    manually designed representations.
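
    As a rough illustration of the asynchronous pooling described above (and only
    that; the paper's method is an off-policy deep Q-learning algorithm, whereas
    env, policy and q_update below are user-supplied stand-ins), each robot can
    push transitions into a shared replay buffer and apply updates computed from
    the pooled data:

        import random
        import threading
        from collections import deque

        replay = deque(maxlen=100_000)        # replay buffer pooled across robots
        lock = threading.Lock()

        def robot_worker(env, policy, q_update, episodes=100):
            """Collect experience and run off-policy updates from pooled data."""
            for _ in range(episodes):
                state, done = env.reset(), False
                while not done:
                    action = policy(state)
                    next_state, reward, done = env.step(action)
                    with lock:
                        replay.append((state, action, reward, next_state, done))
                        batch = random.sample(replay, min(64, len(replay)))
                    q_update(batch)           # asynchronous off-policy Q update
                    state = next_state

        # One thread per robot, all pooling into the same buffer:
        # for env in robot_envs:
        #     threading.Thread(target=robot_worker, args=(env, policy, q_update)).start()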

    Cosine Similarity Search with Multi Index Hashing

    Sepehr Eghbali, Ladan Tahvildari
    Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); Learning (cs.LG)

    Due to the rapid development of the Internet, recent years have witnessed an
    explosion in the rate of data generation. Dealing with data at current scales
    brings up unprecedented challenges. From the algorithmic viewpoint, executing
    existing linear algorithms in information retrieval and machine learning on
    such tremendous amounts of data incurs intolerable computational and storage
    costs. To address this issue, there is growing interest in mapping data
    points in large-scale datasets to binary codes, which can significantly
    reduce the storage complexity of large-scale datasets. Moreover, one of the
    most compelling reasons for using binary codes, or any discrete
    representation, is that they can be used as direct indices into a hash table.
    Incorporating a hash table offers fast query execution: one can look up the
    nearby buckets in a hash table populated with binary codes to retrieve
    similar items. Nonetheless, if binary codes are compared in terms of the
    cosine similarity rather than the Hamming distance, there is no fast exact
    sequential procedure to find the $K$ closest items to the query other than
    exhaustive search. Given a large dataset of binary codes and a binary query,
    the problem that we address is to efficiently find the $K$ codes in the
    dataset that yield the largest cosine similarities to the query. To handle
    this problem, we first elaborate on the relation between the Hamming distance
    and the cosine similarity, which determines the sequence of buckets to check
    in the hash table. Given this sequence, we propose a multi-index hashing
    approach that can increase the search speed by up to orders of magnitude
    compared to exhaustive search and even approximate methods such as LSH. We
    empirically evaluate the performance of the proposed algorithm on real-world
    datasets.
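
    The key relation the paper elaborates on can be made concrete: for binary
    codes viewed as 0/1 vectors, the inner product is determined by the codes'
    Hamming weights and their Hamming distance, so cosine similarity can be
    computed (and buckets ordered) from popcounts alone. A minimal sketch:

        import math

        def popcount(x):
            """Number of set bits in an integer-packed binary code."""
            return bin(x).count("1")

        def cosine_binary(a, b):
            """Cosine similarity between two binary codes stored as Python ints."""
            wa, wb = popcount(a), popcount(b)
            if wa == 0 or wb == 0:
                return 0.0
            return popcount(a & b) / math.sqrt(wa * wb)   # <a,b> = |a AND b|

        def cosine_from_hamming(wa, wb, d_h):
            """Same quantity via the identity <a,b> = (|a| + |b| - d_H(a,b)) / 2."""
            return (wa + wb - d_h) / (2.0 * math.sqrt(wa * wb))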

    FPGA-Based Low-Power Speech Recognition with Recurrent Neural Networks

    Minjae Lee, Kyuyeon Hwang, Jinhwan Park, Sungwook Choi, Sungho Shin, Wonyong Sung
    Comments: Accepted to SiPS 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    In this paper, a neural network based real-time speech recognition (SR)
    system is developed using an FPGA for very low-power operation. The implemented
    system employs two recurrent neural networks (RNNs); one is a
    speech-to-character RNN for acoustic modeling (AM) and the other is for
    character-level language modeling (LM). The system also employs a statistical
    word-level LM to improve the recognition accuracy. The results of the AM, the
    character-level LM, and the word-level LM are combined using a fairly simple
    N-best search algorithm instead of the hidden Markov model (HMM) based network.
    The RNNs are implemented using massively parallel processing elements (PEs) for
    low latency and high throughput. The weights are quantized to 6 bits to store
    all of them in the on-chip memory of an FPGA. The proposed algorithm is
    implemented on a Xilinx XC7Z045, and the system can operate much faster than
    real-time.
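
    The abstract does not spell out the quantizer; a generic uniform 6-bit weight
    quantizer of the kind commonly used for on-chip storage looks like the sketch
    below (an assumption on our part, ignoring the retraining usually needed to
    recover accuracy after quantization):

        import numpy as np

        def quantize_weights(w, bits=6):
            """Uniform symmetric quantization of a weight array to `bits` bits."""
            levels = 2 ** (bits - 1) - 1        # 31 positive levels for 6 bits
            scale = np.max(np.abs(w)) / levels
            codes = np.round(w / scale).astype(np.int8)   # integers in [-31, 31]
            return codes, scale                 # reconstruct with codes * scale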

    Path Integral Guided Policy Search

    Yevgen Chebotar, Mrinal Kalakrishnan, Ali Yahya, Adrian Li, Stefan Schaal, Sergey Levine
    Comments: Under review at the International Conference on Robotics and Automation (ICRA), 2017
    Subjects: Robotics (cs.RO); Learning (cs.LG)

    We present a policy search method for learning complex feedback control
    policies that map from high-dimensional sensory inputs to motor torques, for
    manipulation tasks with discontinuous contact dynamics. We build on a prior
    technique called guided policy search (GPS), which iteratively optimizes a set
    of local policies for specific instances of a task, and uses these to train a
    complex, high-dimensional global policy that generalizes across task instances.
    We extend GPS in the following ways: (1) we propose the use of a model-free
    local optimizer based on path integral stochastic optimal control (PI2), which
    enables us to learn local policies for tasks with highly discontinuous contact
    dynamics; and (2) we enable GPS to train on a new set of task instances in
    every iteration by using on-policy sampling: this increases the diversity of
    the instances that the policy is trained on, and is crucial for achieving good
    generalization. We show that these contributions enable us to learn deep neural
    network policies that can directly perform torque control from visual input. We
    validate the method on a challenging door opening task and a pick-and-place
    task, and we demonstrate that our approach substantially outperforms the prior
    LQR-based local policy optimizer on these tasks. Furthermore, we show that
    on-policy sampling significantly increases the generalization ability of these
    policies.
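
    The PI2 local optimizer behind contribution (1) is model-free: it perturbs
    the policy parameters, rolls out each perturbation, and averages the samples
    weighted by the exponentiated negative (normalized) cost. A generic
    single-update sketch, not the exact variant used inside GPS:

        import numpy as np

        def pi2_update(sampled_params, costs, temperature=1.0):
            """Return new parameters as a softmax(-cost)-weighted average of samples."""
            s = np.asarray(costs, dtype=float)
            s = (s - s.min()) / max(np.ptp(s), 1e-12)   # normalize costs to [0, 1]
            w = np.exp(-s / temperature)
            w /= w.sum()                                 # softmax over rollouts
            return (w[:, None] * np.asarray(sampled_params)).sum(axis=0)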

    Semi-supervised Learning with Sparse Autoencoders in Phone Classification

    Akash Kumar Dhaka, Giampiero Salvi
    Comments: 5 pages, 1 figure, 2 tables
    Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)

    We propose the application of a semi-supervised learning method to improve
    the performance of acoustic modelling for automatic speech recognition based on
    deep neural networks. As opposed to unsupervised initialisation followed by
    supervised fine tuning, our method takes advantage of both unlabelled and
    labelled data simultaneously through mini-batch stochastic gradient descent.
    We tested the method with varying proportions of labelled vs unlabelled
    observations in frame-based phoneme classification on the TIMIT database. Our
    experiments show that the method outperforms standard supervised training for
    an equal amount of labelled data and provides competitive error rates compared
    to state-of-the-art graph-based semi-supervised learning techniques.
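
    The per-minibatch objective implied by this description combines a
    supervised term over labelled frames with an unsupervised reconstruction
    term over all frames; the trade-off weight $\alpha$ and sparsity penalty
    $\beta$ below are our notation, not necessarily the paper's:

    $$\mathcal{L}(\theta) = \sum_{(x,y)\in B_{\mathrm{lab}}} \ell_{\mathrm{CE}}\big(f_\theta(x),y\big) + \alpha \sum_{x\in B} \lVert x - g_\theta(h_\theta(x))\rVert_2^2 + \beta\,\Omega_{\mathrm{sparse}}\big(h_\theta(B)\big),$$

    where $h_\theta$ and $g_\theta$ are the encoder and decoder of the sparse
    autoencoder and $f_\theta$ is the phone classifier sharing the encoder.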

    Learning to Translate in Real-time with Neural Machine Translation

    Jiatao Gu, Graham Neubig, Kyunghyun Cho, Victor O.K. Li
    Comments: 9 pages, 8 figures
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Translating in real-time, a.k.a. simultaneous translation, outputs
    translation words before the input sentence ends, which is a challenging
    problem for conventional machine translation methods. We propose a neural
    machine translation (NMT) framework for simultaneous translation in which an
    agent learns to make decisions on when to translate from the interaction with a
    pre-trained NMT environment. To trade off quality and delay, we extensively
    explore various targets for delay and design a method for beam-search
    applicable in the simultaneous MT setting. Experiments against state-of-the-art
    baselines on two language pairs demonstrate the efficacy of the proposed
    framework both quantitatively and qualitatively.
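
    The agent-environment split can be pictured as a simple READ/WRITE loop: the
    learned agent decides whether to consume the next source word or to commit a
    target word predicted by the pre-trained NMT model. The callables below are
    illustrative stand-ins for those two components:

        def simultaneous_decode(source_words, decide_action, predict_word, max_len=100):
            """Greedy simultaneous decoding driven by an agent's READ/WRITE choices."""
            read, output = [], []
            src = iter(source_words)
            while len(output) < max_len:
                if decide_action(read, output) == "READ":
                    word = next(src, None)
                    if word is not None:
                        read.append(word)
                        continue
                    # source exhausted: forced to WRITE
                output.append(predict_word(read, output))
                if output[-1] == "</s>":      # end-of-sentence token
                    break
            return output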

    Sentiment Analysis on Bangla and Romanized Bangla Text (BRBT) using Deep Recurrent models

    A. Hassan, N. Mohammed, A. K. A. Azad
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Sentiment Analysis (SA) is an active research area in the digital age. With
    the rapid and constant growth of online social media sites and services, and
    the increasing amount of textual data available on them, such as statuses,
    comments and reviews, the application of automatic SA is on the rise. However,
    most research on SA in natural language processing (NLP) is based on the
    English language. Despite being the sixth most widely spoken language in the
    world, Bangla still does not have a large, standard dataset. Because of this,
    recent research works in Bangla have failed to produce results that are both
    comparable to work done by others and reusable as stepping stones for future
    researchers in this field. Therefore, we first provide a textual dataset that
    includes not just Bangla but Romanized Bangla texts as well, and that is
    substantial, post-processed and validated multiple times, ready to be used in
    SA experiments. We tested this dataset with deep recurrent models,
    specifically Long Short-Term Memory (LSTM), using two loss functions, binary
    cross-entropy and categorical cross-entropy, and also performed experimental
    pre-training by using data from one validation to pre-train the other and
    vice versa. Lastly, we document the results, along with some analysis, which
    were promising.

    Sparsity-driven weighted ensemble classifier

    Atilla Özgür, Hamit Erdem, Fatih Nar
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    In this letter, a novel weighted ensemble classifier is proposed that
    improves classification accuracy and minimizes the number of classifiers.
    The ensemble weight finding problem is modeled as a cost function with the
    following terms: (a) a data fidelity term aiming to decrease the
    misclassification rate, (b) a sparsity term aiming to decrease the number of
    classifiers, and (c) a non-negativity constraint on the weights of the
    classifiers. The proposed cost function is non-convex and hard to solve;
    thus, convex relaxation techniques and novel approximations are employed to
    obtain a numerically efficient solution. The proposed method achieves better
    or similar performance compared to state-of-the-art classifier ensemble
    methods, while using a smaller number of classifiers.
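
    A convex-relaxed version of the cost described above (squared loss as the
    data fidelity term, an $\ell_1$ penalty for sparsity, and a non-negativity
    constraint) can be minimized with a projected ISTA iteration. The prediction
    matrix H and the step-size rule below are our assumptions, not the authors'
    exact formulation:

        import numpy as np

        def sparse_ensemble_weights(H, y, lam=0.1, iters=500):
            """min_w ||H w - y||^2 + lam * ||w||_1  subject to  w >= 0.

            H[i, j] is the prediction of classifier j on sample i.
            """
            step = 1.0 / (2.0 * np.linalg.norm(H, 2) ** 2)   # 1 / Lipschitz constant
            w = np.zeros(H.shape[1])
            for _ in range(iters):
                w = w - step * 2.0 * (H.T @ (H @ w - y))     # gradient step
                w = np.maximum(w - step * lam, 0.0)          # soft-threshold, clip at 0
            return w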

    HNP3: A Hierarchical Nonparametric Point Process for Modeling Content Diffusion over Social Media

    Seyed Abbas Hosseini, Ali Khodadadi, Soheil Arabzade, Hamid R. Rabiee
    Comments: Accepted in IEEE International Conference on Data Mining (ICDM) 2016, Barcelona
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Social and Information Networks (cs.SI)

    This paper introduces a novel framework for modeling temporal events with
    complex longitudinal dependency that are generated by dependent sources. This
    framework takes advantage of multidimensional point processes for modeling time
    of events. The intensity function of the proposed process is a mixture of
    intensities, and its complexity grows with the complexity of temporal patterns
    of data. Moreover, it utilizes a hierarchical dependent nonparametric approach
    to model marks of events. These capabilities allow the proposed model to adapt
    its temporal and topical complexity according to the complexity of data, which
    makes it a suitable candidate for real world scenarios. An online inference
    algorithm is also proposed that makes the framework applicable to a vast range
    of applications. The framework is applied to a real world application, modeling
    the diffusion of contents over networks. Extensive experiments reveal the
    effectiveness of the proposed framework in comparison with state-of-the-art
    methods.

    A large scale study of SVM based methods for abstract screening in systematic reviews

    Tanay Kumar Saha, Mourad Ouzzani, Ahmed K. Elmagarmid
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG)

    A major task in systematic reviews is abstract screening, i.e., excluding the
    often hundreds or thousands of irrelevant citations returned from a database
    search, based on titles and abstracts. Thus, a systematic review platform that
    can automate the abstract screening process is of huge importance. Several
    methods have been proposed for this task. However, it is very hard to clearly
    understand the applicability of these methods in a systematic review platform
    because of the following challenges: (1) the use of non-overlapping metrics for
    the evaluation of the proposed methods, (2) usage of features that are very
    hard to collect, (3) using a small set of reviews for the evaluation, and (4)
    no solid statistical testing or equivalence grouping of the methods. In this
    paper, we use feature representation that can be extracted per citation. We
    evaluate SVM-based methods (commonly used) on a large set of reviews ($61$) and
    metrics ($11$) to provide equivalence grouping of methods based on a solid
    statistical test. Our analysis also includes a study of the strong variability
    of the metrics using $500$x$2$ cross-validation. While some methods shine for
    different metrics and for different datasets, there is no single method that
    dominates the pack. Furthermore, we observe that in some cases relevant
    (included) citations can be found after screening only 15-20% of them via a
    certainty based sampling. A few included citations present outlying
    characteristics and can only be found after a very large number of screening
    steps. Finally, we present an ensemble algorithm for producing a $5$-star
    rating of citations based on their relevance. This algorithm combines the best
    methods from our evaluation and, through its $5$-star rating, outputs an
    easier-to-consume prediction.

    Very Deep Convolutional Neural Networks for Raw Waveforms

    Wei Dai, Chia Dai, Shuhui Qu, Juncheng Li, Samarjit Das
    Comments: 5 pages, 2 figures, under submission to International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017
    Subjects: Sound (cs.SD); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Learning acoustic models directly from the raw waveform data with minimal
    processing is challenging. Current waveform-based models have generally used
    very few (~2) convolutional layers, which might be insufficient for building
    high-level discriminative features. In this work, we propose very deep
    convolutional neural networks (CNNs) that directly use time-domain waveforms as
    inputs. Our CNNs, with up to 34 weight layers, are efficient to optimize over
    very long sequences (e.g., vectors of size 32000), as necessary for processing
    acoustic waveforms. This is achieved through batch normalization, residual
    learning, and a careful design of down-sampling in the initial layers. Our
    networks are fully convolutional, without the use of fully connected layers and
    dropout, to maximize representation learning. We use a large receptive field in
    the first convolutional layer to mimic bandpass filters, but very small
    receptive fields subsequently to control the model capacity. We demonstrate the
    performance gains with the deeper models. Our evaluation shows that the CNN
    with 18 weight layers outperforms the CNN with 3 weight layers by over 15% in
    absolute accuracy for an environmental sound recognition task and matches the
    performance of models using log-mel features.
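
    The described design maps naturally onto a fully convolutional 1-D network: a
    wide first filter, aggressive early down-sampling, small-kernel convolutions
    with batch normalization, and global pooling instead of fully connected
    layers. Layer counts and sizes below are illustrative, not the paper's exact
    18- or 34-layer configurations:

        import torch.nn as nn

        class RawWaveformCNN(nn.Module):
            def __init__(self, n_classes=10):
                super().__init__()
                layers = [nn.Conv1d(1, 32, kernel_size=80, stride=4),  # bandpass-like first layer
                          nn.BatchNorm1d(32), nn.ReLU(), nn.MaxPool1d(4)]
                ch = 32
                for out_ch in (32, 64, 128):                 # small receptive fields after that
                    layers += [nn.Conv1d(ch, out_ch, kernel_size=3, padding=1),
                               nn.BatchNorm1d(out_ch), nn.ReLU(), nn.MaxPool1d(4)]
                    ch = out_ch
                self.features = nn.Sequential(*layers)
                self.classifier = nn.Conv1d(ch, n_classes, kernel_size=1)  # no FC layers

            def forward(self, x):            # x: (batch, 1, 32000) raw waveform
                h = self.classifier(self.features(x))
                return h.mean(dim=2)         # global average pooling over time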

    Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction

    Junbo Zhang, Yu Zheng, Dekang Qi
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Forecasting the flow of crowds is of great importance to traffic management
    and public safety, yet a very challenging task affected by many complex
    factors, such as inter-region traffic, events and weather. In this paper, we
    propose a deep-learning-based approach, called ST-ResNet, to collectively
    forecast the in-flow and out-flow of crowds in each and every region of a
    city. We design an end-to-end structure of ST-ResNet based on unique properties
    of spatio-temporal data. More specifically, we employ the framework of the
    residual neural networks to model the temporal closeness, period, and trend
    properties of the crowd traffic, respectively. For each property, we design a
    branch of residual convolutional units, each of which models the spatial
    properties of the crowd traffic. ST-ResNet learns to dynamically aggregate the
    output of the three residual neural networks based on data, assigning different
    weights to different branches and regions. The aggregation is further combined
    with external factors, such as weather and day of the week, to predict the
    final traffic of crowds in each and every region. We evaluate ST-ResNet based
    on two types of crowd flows in Beijing and NYC, finding that its performance
    exceeds that of six well-known methods.
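
    The aggregation step can be sketched as a parametric fusion: each branch
    output is scaled element-wise by a learned weight map (so different regions
    can favour closeness, period or trend), and external features are added
    before the output activation. A schematic only, with tanh assumed as the
    final activation:

        import numpy as np

        def fuse_branches(closeness, period, trend, W_c, W_p, W_t, external):
            """Element-wise weighted fusion of the three residual branches."""
            fused = W_c * closeness + W_p * period + W_t * trend
            return np.tanh(fused + external)   # external factors, e.g. weather features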

    Outlier Detection from Network Data with Subnetwork Interpretation

    Xuan-Hong Dang, Arlei Silva, Ambuj Singh, Ananthram Swami, Prithwish Basu
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG)

    Detecting a small number of outliers from a set of data observations is
    always challenging. This problem is more difficult in the setting of multiple
    network samples, where computing the anomalous degree of a network sample is
    generally not sufficient. In fact, explaining why a network is exceptional,
    expressed in the form of a subnetwork, is equally important. In this paper,
    we develop a novel algorithm to address these two key problems. We treat each
    network sample as a potential outlier and identify subnetworks that mostly
    discriminate it from nearby regular samples. The algorithm is developed in the
    framework of network regression combined with the constraints on both network
    topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
    goes beyond subspace/subgraph discovery and we show that it converges to a
    global optimum. Evaluation on various real-world network datasets demonstrates
    that our algorithm not only outperforms baselines in both the network and
    high-dimensional settings, but also discovers highly relevant and interpretable local
    subnetworks, further enhancing our understanding of anomalous networks.

    Learning real manipulation tasks from virtual demonstrations using LSTM

    Rouhollah Rahmatizadeh, Pooya Abolghasemi, Aman Behal, Ladislau Bölöni
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Robots assisting disabled or elderly people in activities of daily living
    must perform complex manipulation tasks. These tasks are dependent on the
    user’s environment and preferences. Thus, learning from demonstration (LfD) is
    a promising choice that would allow the non-expert user to teach the robot
    different tasks. Unfortunately, learning general solutions from raw
    demonstrations requires a significant amount of data. Performing this number of
    physical demonstrations is infeasible for a disabled user. In this paper we
    propose an approach where the user demonstrates the manipulation task in a
    virtual environment. The collected demonstrations are used to train an LSTM
    recurrent neural network that can act as the controller for the robot. We show
    that the controller learned from virtual demonstrations can be used to
    successfully perform the manipulation tasks on a physical robot.


    Information Theory

    Privacy-guaranteed Two-Agent Interactions Using Information-Theoretic Mechanisms

    Bahman Moraffah, Lalitha Sankar
    Comments: 33 pages
    Subjects: Information Theory (cs.IT)

    This paper introduces a multi-round interaction problem with privacy
    constraints between two agents that observe correlated data. The agents
    alternately share data with one another for a total of K rounds such that each
    agent initiates sharing over K/2 rounds. The interactions are modeled as a
    collection of K random mechanisms (mappings), one for each round. The goal is
    to jointly design the K private mechanisms to determine the set of all
    achievable distortion-leakage pairs at each agent. Arguing that a mutual
    information-based leakage metric can be appropriate for streaming data
    settings, this paper: (i) determines the set of all achievable
    distortion-leakage tuples; (ii) shows that the K mechanisms allow for
    precisely composing the total privacy budget over K rounds without loss; and
    (iii) develops conditions under which interaction reduces the net leakage at both agents and
    illustrates it for a specific class of sources. The paper then focuses on
    log-loss distortion to better understand the effect on leakage of using a
    commonly used utility metric in learning theory. The resulting interaction
    problem leads to a non-convex sum-leakage-distortion optimization problem that
    can be viewed as an interactive version of the information bottleneck problem.
    A new merge-and-search algorithm that extends the classical agglomerative
    information bottleneck algorithm to the interactive setting is introduced to
    determine a provable locally optimal solution. Finally, the benefit of
    interaction under log-loss is illustrated for specific source classes and the
    optimality of one-shot is proved for Gaussian sources under both mean-square
    and log-loss distortions constraints.

    Wireless Vehicular Networks in Emergencies: A Single Frequency Network Approach

    Andrea Tassi, Malcolm Egan, Robert J. Piechocki, Andrew Nix
    Comments: The invited paper will be presented in the Telecommunications Systems and Networks symposium of SigTelCom
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI); Performance (cs.PF)

    Obtaining high quality sensor information is critical in vehicular
    emergencies. However, existing standards such as IEEE 802.11p/DSRC and LTE-A
    cannot support either the required data rates or the latency requirements. One
    solution to this problem is for municipalities to invest in dedicated base
    stations to ensure that drivers have the information they need to make safe
    decisions in or near accidents. In this paper we further propose that these
    municipality-owned base stations form a Single Frequency Network (SFN). In
    order to ensure that transmissions are reliable, we derive tight bounds on the
    outage probability when the SFN is overlaid on an existing cellular network.
    Using our bounds, we propose a transmission power allocation algorithm. We show
    that our power allocation model can reduce the total instantaneous SFN
    transmission power up to $20$ times compared to a static uniform power
    allocation solution, for the considered scenarios. The result is particularly
    important when base stations rely on an off-grid power source (i.e.,
    batteries).

    Secure Massive MIMO Systems with Limited RF Chains

    Jun Zhu, Wei Xu, Ning Wang
    Comments: Accepted by IEEE Transactions on Vehicular Technology
    Subjects: Information Theory (cs.IT)

    In future practical deployments of massive multi-input multi-output (MIMO)
    systems, the number of radio frequency (RF) chains at the base stations (BSs)
    may be much smaller than the number of BS antennas to reduce the overall
    expenditure. In this paper, we propose a novel design framework for joint data
    and artificial noise (AN) precoding in a multiuser massive MIMO system with
    a limited number of RF chains, which improves the wireless security performance.
    With imperfect channel state information (CSI), we analytically derive an
    achievable lower bound on the ergodic secrecy rate of any mobile terminal (MT),
    for both analog and hybrid precoding schemes. The closed-form lower bound is
    used to determine optimal power splitting between data and AN that maximizes
    the secrecy rate through simple one-dimensional search. Analytical and
    numerical results together reveal that the proposed hybrid precoder, although
    it suffers from a reduced secrecy rate compared with the theoretical
    full-dimensional precoder, is free of the high computational complexity of large-scale matrix
    inversion and null-space calculations, and largely reduces the hardware cost.

    Covert Single-hop Communication in a Wireless Network with Distributed Artificial Noise Generation

    Ramin Soltani, Boulat Bash, Dennis Goeckel, Saikat Guha, Don Towsley
    Comments: submitted to Allerton 2014
    Subjects: Information Theory (cs.IT)

    Covert communication, also known as low probability of detection (LPD)
    communication, prevents the adversary from knowing that a communication is
    taking place. Recent work has demonstrated that, in a three-party scenario with
    a transmitter (Alice), intended recipient (Bob), and adversary (Warden Willie),
    the maximum number of bits that can be transmitted reliably from Alice to Bob
    without detection by Willie, when additive white Gaussian noise (AWGN) channels
    exist between all parties, is on the order of the square root of the number of
    channel uses. In this paper, we begin consideration of network scenarios by
    studying the case where there are additional “friendly” nodes present in the
    environment that can produce artificial noise to aid in hiding the
    communication. We establish achievability results by considering constructions
    where the system node closest to the warden produces artificial noise and
    demonstrate a significant improvement in the throughput achieved covertly,
    without requiring close coordination between Alice and the noise-generating
    node. Conversely, under mild restrictions on the communication strategy, we
    demonstrate that no higher covert throughput is possible. Extensions to the
    consideration of the achievable covert throughput when multiple wardens
    randomly located in the environment collaborate to attempt detection of the
    transmitter are also considered.

    Covert Communications on Poisson Packet Channels

    Ramin Soltani, Dennis Goeckel, Don Towsley, Amir Houmansadr
    Comments: Allerton 2015 submission, minor edits
    Subjects: Information Theory (cs.IT)

    Consider a channel where authorized transmitter Jack sends packets to
    authorized receiver Steve according to a Poisson process with rate $\lambda$
    packets per second for a time period $T$. Suppose that covert transmitter Alice
    wishes to communicate information to covert receiver Bob on the same channel
    without being detected by a watchful adversary Willie. We consider two
    scenarios. In the first scenario, we assume that warden Willie cannot look at
    packet contents but rather can only observe packet timings, and Alice must send
    information by inserting her own packets into the channel. We show that the
    number of packets that Alice can covertly transmit to Bob is on the order of
    the square root of the number of packets that Jack transmits to Steve;
    conversely, if Alice transmits more than that, she will be detected by Willie
    with high probability. In the second scenario, we assume that Willie can look
    at packet contents but that Alice can communicate across an $M/M/1$ queue to
    Bob by altering the timings of the packets going from Jack to Steve. First,
    Alice builds a codebook, with each codeword consisting of a sequence of packet
    timings to be employed for conveying the information associated with that
    codeword. However, to successfully employ this codebook, Alice must always have
    a packet to send at the appropriate time. Hence, leveraging our result from the
    first scenario, we propose a construction where Alice covertly slows down the
    packet stream so as to buffer packets to use during a succeeding codeword
    transmission phase. Using this approach, Alice can covertly and reliably
    transmit $\mathcal{O}(\lambda T)$ covert bits to Bob in time period $T$ over an
    $M/M/1$ queue with service rate $\mu > \lambda$.

    Covert Communications on Renewal Packet Channels

    Ramin Soltani, Dennis Goeckel, Don Towsley, Amir Houmansadr
    Comments: Contains details of an Allerton 2016 submission
    Subjects: Information Theory (cs.IT)

    Security and privacy are major concerns in modern communication networks. In
    recent years, the information theory of covert communications, where the very
    presence of the communication is undetectable to a watchful and determined
    adversary, has been of great interest. This emerging body of work has focused
    on additive white Gaussian noise (AWGN), discrete memoryless channels (DMCs),
    and optical channels. In contrast, our recent work introduced the
    information-theoretic limits for covert communications over packet channels
    whose packet timings are governed by a Poisson point process. However, actual
    network packet arrival times do not generally conform to the Poisson process
    assumption, and thus here we consider the extension of our work to timing
    channels characterized by more general renewal processes of rate $\lambda$. We
    consider two scenarios. In the first scenario, the source of the packets on the
    channel cannot be authenticated by Willie, and therefore Alice can insert
    packets into the channel. We show that if the total number of transmitted
    packets by Jack is $N$, Alice can covertly insert
    $\mathcal{O}\left(\sqrt{N}\right)$ packets and, if she transmits more, she will
    be detected by Willie. In the second scenario, packets are authenticated by
    Willie but we assume that Alice and Bob share a secret key; hence, Alice alters
    the timings of the packets according to a pre-shared codebook with Bob to send
    information to him over a $G/M/1$ queue with service rate $\mu > \lambda$. We
    show that Alice can covertly and reliably transmit $\mathcal{O}(N)$ bits to Bob
    when the total number of packets sent from Jack to Steve is $N$.

    Approximate Gram-Matrix Interpolation for Wideband Massive MU-MIMO

    Zequn Li, Charles Jeon, Christoph Studer
    Comments: Charles Jeon and Zequn Li contributed equally to this work
    Subjects: Information Theory (cs.IT)

    A broad range of linear and non-linear equalization and precoding algorithms
    for wideband massive multi-user (MU) multiple-input multiple-output (MIMO)
    wireless systems that rely on orthogonal frequency-division multiplexing (OFDM)
    or single-carrier frequency-division multiple access (SC-FDMA) require the
    computation of the Gram matrix for each active subcarrier, which results in
    excessively high computational complexity. In this paper, we propose novel,
    approximate algorithms that reduce the complexity of Gram-matrix computation
    for linear equalization and precoding by exploiting correlation across
    subcarriers. We analytically show that a small fraction of Gram-matrix
    computations in combination with approximate interpolation schemes are
    sufficient to achieve near-optimal error-rate performance at low computational
    complexity in wideband massive MU-MIMO systems. We furthermore demonstrate that
    the proposed methods exhibit improved robustness against channel-estimation
    errors compared to exact Gram-matrix interpolation algorithms that typically
    require high computational complexity.
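
    The core idea can be illustrated with the simplest scheme: compute exact Gram
    matrices on a sparse grid of "base" subcarriers and linearly interpolate in
    between. The paper's algorithms and error analysis are more refined; this
    sketch assumes base_idx contains the first and last subcarriers:

        import numpy as np

        def interpolated_grams(H, base_idx):
            """H: (n_subcarriers, n_rx, n_tx) channel matrices.

            Returns approximate Gram matrices G_k ~= H_k^H H_k for all k.
            """
            n, _, n_tx = H.shape
            G = np.empty((n, n_tx, n_tx), dtype=complex)
            base = sorted(base_idx)
            exact = {k: H[k].conj().T @ H[k] for k in base}  # exact Grams on the grid
            for a, b in zip(base[:-1], base[1:]):
                for k in range(a, b + 1):
                    t = (k - a) / (b - a)
                    G[k] = (1 - t) * exact[a] + t * exact[b] # linear interpolation
            return G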

    Effective Capacity in MIMO Channels with Arbitrary Inputs

    Marwan Hammouda, Sami Akin, M. Cenk Gursoy, Jürgen Peissig
    Comments: Submitted to the IEEE transaction on vehicular technology
    Subjects: Information Theory (cs.IT)

    Recently, communication systems that are both spectrum and energy efficient
    have attracted significant attention. Different from the existing research, we
    investigate the throughput and energy efficiency of a general class of
    multiple-input and multiple-output systems with arbitrary inputs when they are
    subject to statistical quality-of-service (QoS) constraints, which are imposed
    as limits on the delay violation and buffer overflow probabilities. We employ
    the effective capacity as the performance metric. We obtain the optimal input
    covariance matrix that maximizes the effective capacity under a short-term
    average power constraint. Following that, we perform an asymptotic analysis of
    the effective capacity in the low signal-to-noise ratio and large-scale antenna
    regimes. In the low signal-to-noise ratio regime analysis, we utilize the first
    and second derivatives of the effective capacity when the signal-to-noise ratio
    approaches zero in order to determine the minimum energy-per-bit and also the
    slope of the effective capacity versus energy-per-bit curve at the minimum
    energy-per-bit. We observe that the minimum energy-per-bit is independent of
    the input distribution, whereas the slope depends on the input distribution. In
    the large-scale antenna analysis, we show that the effective capacity
    approaches the average transmission rate in the channel with the increasing
    number of transmit and/or receive antennas. Particularly, the gap between the
    effective capacity and the average transmission rate in the channel, which is
    caused by the QoS constraints, is minimized with the number of antennas. In
    addition, we put forward the non-asymptotic backlog and delay violation bounds
    by utilizing the effective capacity. Finally, we substantiate our analytical
    results through numerical illustrations.

    Low Complexity Channel Estimation for Millimeter Wave Systems with Hybrid A/D Antenna Processing

    George C. Alexandropoulos, Symeon Chouvardas
    Comments: 6 pages, 3 figures, IEEE GLOBECOM Workshops 2016
    Subjects: Information Theory (cs.IT)

    The availability of large bandwidth at millimeter wave (mmWave) frequencies
    is one of the major factors that have rendered very high frequencies a
    promising candidate enabler for fifth generation (5G) mobile communication
    networks. To cope with the intrinsic characteristics of signal propagation at
    frequencies of tens of GHz, and to achieve data rates on the order of
    gigabits per second, mmWave systems are expected to employ large antenna
    arrays that implement highly directional beamforming. In this paper, we
    consider mmWave wireless systems comprising nodes equipped with large antenna
    arrays and capable of performing hybrid analog and digital (A/D) processing.
    Aiming at channel-aware transmit and receive beamforming, we focus on
    designing low-complexity compressed-sensing channel estimation. In
    particular, by adopting a temporally correlated mmWave channel model, we
    present two compressed-sensing algorithms that exploit the temporal
    correlation to reduce the complexity of sparse channel estimation, one greedy
    and the other iterative. Our representative performance evaluation results
    offer useful insights into the interplay between system and operation
    parameters and the accuracy of channel estimation.

    Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity

    Zu-Zhen Huang, Jia Xu, Zhi-Rui Wang, Li Xiao, Xiang-Gen Xia, Teng Long
    Comments: 14 double-column pages, 11 figures, 4 tables
    Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV)

    In this paper, for multichannel synthetic aperture radar (SAR) systems we
    first formulate the effects of Doppler ambiguities on the radial velocity (RV)
    estimation of a ground moving target in range-compressed domain, range-Doppler
    domain and image domain, respectively, where cascaded time-space Doppler
    ambiguity (CTSDA) may occur, that is, time domain Doppler ambiguity (TDDA) in
    each channel occurs first, and then spatial domain Doppler ambiguity (SDDA)
    across channels occurs subsequently. Accordingly, the multichannel SAR
    systems with different parameters are divided into three cases with different
    Doppler ambiguity properties, i.e., only TDDA occurs in Case I, and CTSDA
    occurs in Cases II and III, while the CTSDA in Case II can be simply seen as
    the SDDA. Then, a multi-frequency SAR is proposed to obtain the RV estimation
    by solving the ambiguity problem based on Chinese remainder theorem (CRT). For
    Cases I and II, the ambiguity problem can be solved by the existing closed-form
    robust CRT. For Case III, we show that the problem is different from the
    conventional CRT problem and we call it a double remaindering problem. We then
    propose a sufficient condition under which the double remaindering problem,
    i.e., the CTSDA, can be solved by the closed-form robust CRT. When the
    sufficient condition is not satisfied, a searching based method is proposed.
    Finally, some numerical experiments are provided to demonstrate the
    effectiveness of the proposed methods.
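
    For intuition on the CRT step, the error-free textbook reconstruction from
    remainders with pairwise-coprime moduli is sketched below; note that the
    paper requires the closed-form robust CRT, which additionally tolerates
    errors in the remainders:

        def egcd(a, b):
            """Extended Euclid: returns (g, x, y) with a*x + b*y = g = gcd(a, b)."""
            if b == 0:
                return a, 1, 0
            g, x, y = egcd(b, a % b)
            return g, y, x - (a // b) * y

        def crt(remainders, moduli):
            """Recover x modulo prod(moduli) from x mod m_i, for coprime m_i."""
            x, M = 0, 1
            for r, m in zip(remainders, moduli):
                g, p, _ = egcd(M, m)          # p = M^{-1} mod m
                assert g == 1, "moduli must be pairwise coprime"
                x = (x + (r - x) * p % m * M) % (M * m)
                M *= m
            return x

        # Example: crt([2, 3, 2], [3, 5, 7]) == 23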

    Review of Buffer-Aided Distributed Space-Time Coding Schemes and Algorithms for Cooperative Wireless Systems

    J. Gu, R. C. de Lamare
    Comments: 20 pages, 8 figures. arXiv admin note: substantial text overlap with arXiv:1608.04439
    Subjects: Information Theory (cs.IT)

    In this work, we propose buffer-aided distributed space-time coding (DSTC)
    schemes and relay selection algorithms for cooperative direct-sequence
    code-division multiple access (DS-CDMA) systems. We first devise a relay pair
    selection algorithm that can form relay pairs and then select the optimum set
    of relays in both the source-relay phase and the relay-destination phase
    according to the signal-to-interference-plus-noise ratio (SINR) criterion.
    Multiple relays equipped with dynamic buffers are then introduced in the
    network, which allows the relays to store data received from the sources and
    wait until the most appropriate time for transmission. A greedy relay pair
    selection algorithm is then developed to reduce the high cost of the exhaustive
    search required when a large number of relays are involved. The proposed
    techniques effectively improve the quality of the transmission with an
    acceptable delay as the buffer size is adjustable. An analysis of the
    computational complexity of the proposed algorithms, the delay and a study of
    the greedy algorithm are then carried out. Simulation results show that the
    proposed dynamic buffer-aided DSTC schemes and algorithms outperform prior art.

    BER Performance of Polar Coded OFDM in Multipath Fading

    David R. Wasserman, Ahsen U. Ahmed, David W. Chi
    Comments: 6 pages, 4 figures. Submitted to IEEE WCNC ’17
    Subjects: Information Theory (cs.IT)

    Orthogonal Frequency Division Multiplexing (OFDM) has gained considerable
    popularity over the years and has been adopted as a standard in cellular
    technology and Wireless Local Area Network (WLAN) communication systems. To
    improve the bit error rate (BER) performance, forward
    error correction (FEC) codes are often utilized to protect signals against
    unknown interference and channel degradations. In this paper, we apply
    soft-decision FEC, more specifically polar codes and a convolutional code, to
    an OFDM system in a quasi-static multipath fading channel, and compare BER
    performance in various channels. We investigate the effect of interleaving bits
    within a polar codeword. Finally, the simulation results for each case are
    presented in the paper.

    Codes for distributed storage from 3-regular graphs

    Shuhong Gao, Fiona Knoll, Felice Manganiello, Gretchen Matthews
    Comments: 13 pages, 4 figures, 1 table
    Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

    This paper considers distributed storage systems (DSSs) from a graph
    theoretic perspective. A DSS is constructed by means of the path decomposition
    of a 3-regular graph into P4 paths. The paths represent the disks of the DSS
    and the edges of the graph act as the blocks of storage. We deduce the
    properties of the DSS from a related graph and show their optimality.

    On Optimal Latency of Communications

    Minh Au, Francois Gagnon
    Subjects: Information Theory (cs.IT)

    In this paper we investigate the optimal latency of communications. Focusing
    on fixed rate communication without any feedback channel, this paper
    encompasses low-latency strategies with which one-hop and multi-hop
    communication issues are treated from an information-theoretic perspective. By
    defining the latency as the time required to make decisions, we prove that if
    short messages can be transmitted in parallel Gaussian channels, for example,
    via orthogonal frequency-division multiplexing (OFDM)-like signals, there
    exists an optimal low-latency strategy for every code. This can be achieved via
    early-detection schemes or asynchronous detections. We first provide the
    optimal achievable latency in additive white Gaussian noise (AWGN) channels for
    every channel code given a block error probability $\epsilon$. This can be
    obtained via sequential ratio tests or a “genie”-aided scheme, e.g.,
    error-detecting codes. Results demonstrate the effectiveness of the approach.
    Next, we show how early-detection can be effective with OFDM signals while
    maintaining its spectral efficiency via random coding or pre-coding random
    matrices. Finally, we explore the optimal low-latency strategy in multi-hop
    relaying schemes. For amplify-and-forward (AF) and decode-and-forward (DF)
    relaying schemes there exists an optimal achievable latency. In particular, we
    first show that there exists a better low-latency strategy, in which AF relays
    could transmit while receiving. This can be achieved by using amplify and
    forward combined with early detection.

    On the Empirical Effect of Gaussian Noise in Under-sampled MRI Reconstruction

    Patrick Virtue, Michael Lustig
    Comments: 24 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

    In Fourier-based medical imaging, sampling below the Nyquist rate results in
    an underdetermined system, in which linear reconstructions will exhibit
    artifacts. Another consequence of under-sampling is lower signal to noise ratio
    (SNR) due to fewer acquired measurements. Even if an oracle provided the
    information to perfectly disambiguate the underdetermined system, the
    reconstructed image could still have lower image quality than a corresponding
    fully sampled acquisition because of the reduced measurement time. The effects
    of lower SNR and the underdetermined system are coupled during reconstruction,
    making it difficult to isolate the impact of lower SNR on image quality. To
    this end, we present an image quality prediction process that reconstructs
    fully sampled, fully determined data with noise added to simulate the loss of
    SNR induced by a given under-sampling pattern. The resulting prediction image
    empirically shows the effect of noise in under-sampled image reconstruction
    without any effect from an underdetermined system.

    We discuss how our image quality prediction process can simulate the
    distribution of noise for a given under-sampling pattern, including variable
    density sampling that produces colored noise in the measurement data. An
    interesting consequence of our prediction model is that we can show that
    recovery from underdetermined non-uniform sampling is equivalent to a weighted
    least squares optimization that accounts for heterogeneous noise levels across
    measurements.

    Through a series of experiments with synthetic and in vivo datasets, we
    demonstrate the efficacy of the image quality prediction process and show that
    it provides a better estimation of reconstruction image quality than the
    corresponding fully-sampled reference image.
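
    Under our reading, the stated equivalence can be written as the weighted
    least-squares problem

    $$\hat{x} = \arg\min_x \big\lVert W^{1/2} (F_u x - y) \big\rVert_2^2,$$

    where $F_u$ is the under-sampled Fourier operator, $y$ the measured k-space
    data, and $W$ a diagonal matrix whose entries reflect the per-measurement
    noise levels induced by the (possibly variable-density) sampling; the exact
    weighting used in the paper may differ.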

    Iterative Null-space Projection Method with Adaptive Thresholding in Sparse Signal Recovery and Matrix Completion

    Ehsan Asadi, Ashkan Esmaeili, Farokh Marvasti
    Subjects: Methodology (stat.ME); Information Theory (cs.IT)

    Adaptive thresholding methods have proved to yield high SNRs and fast
    convergence in finding the solution to the Compressed Sensing (CS) problems.
    Recently, a class of iterative sparse recovery algorithms such as the
    Iterative Method with Adaptive Thresholding (IMAT) has been observed to
    outperform the well-known LASSO algorithm in terms of reconstruction
    quality, convergence speed, and sensitivity to noise. In this paper, we
    introduce a new method towards solving the CS problem. The logic of this method
    is based on iterative projections of the thresholded signal onto the null-space
    of the sensing matrix. The thresholding is carried out by recovering the
    support of the desired signal by projection on thresholding subspaces. The
    simulations reveal that the proposed method has the capability of yielding
    noticeable output SNR values with about as many samples as twice the sparsity
    number, while other methods fail to recover the signals when approaching the
    algebraic bound for the number of samples required. The computational
    complexity of our method is also comparable to other methods as observed in the
    simulations. We have also extended our algorithm to Matrix Completion (MC)
    scenarios and compared its efficiency to other well-reputed approaches for MC
    in the literature.
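
    A minimal sketch of the iterative null-space projection with a shrinking
    threshold (the exponential threshold schedule and parameter names are
    illustrative assumptions, not the authors' exact algorithm; A is assumed to
    have full row rank):

        import numpy as np

        def null_space_imat(A, y, iters=100, alpha=0.95):
            """Recover a sparse x with A x = y by alternating hard thresholding
            and projection back onto the affine solution set {x : A x = y}."""
            A_pinv = np.linalg.pinv(A)
            x = A_pinv @ y                        # minimum-norm feasible start
            thr = np.max(np.abs(x))
            for _ in range(iters):
                x_t = x * (np.abs(x) >= thr)      # keep only the large entries
                x = x_t + A_pinv @ (y - A @ x_t)  # re-project onto {x : A x = y}
                thr *= alpha                      # adaptive, decaying threshold
            return x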



