IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Tue, 14 Mar 2017

    我爱机器学习(52ml.net)发表于 2017-03-14 00:00:00
    love 0

    Neural and Evolutionary Computing

    Comparison of echo state network output layer classification methods on noisy data

    Ashley Prater
    Comments: 8 pages. International Joint Conference on Neural Networks (IJCNN 2017)
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Echo state networks are a recently developed type of recurrent neural network
    where the internal layer is fixed with random weights, and only the output
    layer is trained on specific data. Echo state networks are increasingly being
    used to process spatiotemporal data in real-world settings, including speech
    recognition, event detection, and robot control. A strength of echo state
    networks is the simple method used to train the output layer – typically a
    collection of linear readout weights found using a least squares approach.
    Although straightforward to train and having a low computational cost to use,
    this method may not yield acceptable accuracy performance on noisy data.

    This study compares the performance of three echo state network output layer
    methods to perform classification on noisy data: using trained linear weights,
    using sparse trained linear weights, and using trained low-rank approximations
    of reservoir states. The methods are investigated experimentally on both
    synthetic and natural datasets. The experiments suggest that using regularized
    least squares to train linear output weights is superior on data with low
    noise, but using the low-rank approximations may significantly improve accuracy
    on datasets contaminated with higher noise levels.

    Convolutional Spike Timing Dependent Plasticity based Feature Learning in Spiking Neural Networks

    Priyadarshini Panda, Gopalakrishnan Srinivasan, Kaushik Roy
    Comments: 11 pages, 10 figures, Under Consideration in Scientific Reports
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    Brain-inspired learning models attempt to mimic the cortical architecture and
    computations performed in the neurons and synapses constituting the human brain
    to achieve its efficiency in cognitive tasks. In this work, we present
    convolutional spike timing dependent plasticity based feature learning with
    biologically plausible leaky-integrate-and-fire neurons in Spiking Neural
    Networks (SNNs). We use shared weight kernels that are trained to encode
    representative features underlying the input patterns thereby improving the
    sparsity as well as the robustness of the learning model. We demonstrate that
    the proposed unsupervised learning methodology learns several visual categories
    for object recognition with fewer number of examples and outperforms
    traditional fully-connected SNN architectures while yielding competitive
    accuracy. Additionally, we observe that the learning model performs out-of-set
    generalization further making the proposed biologically plausible framework a
    viable and efficient architecture for future neuromorphic applications.

    End-to-End Learning of Geometry and Context for Deep Stereo Regression

    Alex Kendall, Hayk Martirosyan, Saumitro Dasgupta, Peter Henry, Ryan Kennedy, Abraham Bachrach, Adam Bry
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    The globalization of trade and the organization of work are currently causing
    a large migratory flow towards the cities. This growth of cities requires new
    urban planning where digital tools take a preponderant place to capture data
    and understand and decide in face of changes. These tools however hardly resist
    to natural disasters, terrorism, accidents, etc. Based on the expertise of the
    CITI laboratory of INSA Lyon and SC3 of the Industrial University of Santander,
    we propose to create the ALERT project – Autonomous Liable Emergency service in
    Real Time – with decentralized, reliable and efficient services, physically
    close to the citizens, taking decisions locally, in a relevant manner without
    risk of disconnection with a central authority. These information gathering and
    decision-making will involve the population with participatory and social
    approaches.

    Robustness from structure: Inference with hierarchical spiking networks on analog neuromorphic hardware

    Mihai A. Petrovici, Anna Schroeder, Oliver Breitwieser, Andreas Grübl, Johannes Schemmel, Karlheinz Meier
    Comments: accepted at IJCNN 2017
    Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    How spiking networks are able to perform probabilistic inference is an
    intriguing question, not only for understanding information processing in the
    brain, but also for transferring these computational principles to neuromorphic
    silicon circuits. A number of computationally powerful spiking network models
    have been proposed, but most of them have only been tested, under ideal
    conditions, in software simulations. Any implementation in an analog, physical
    system, be it in vivo or in silico, will generally lead to distorted dynamics
    due to the physical properties of the underlying substrate. In this paper, we
    discuss several such distortive effects that are difficult or impossible to
    remove by classical calibration routines or parameter training. We then argue
    that hierarchical networks of leaky integrate-and-fire neurons can offer the
    required robustness for physical implementation and demonstrate this with both
    software simulations and emulation on an accelerated analog neuromorphic
    device.

    A norm knockout method on indirect reciprocity to reveal indispensable norms

    Hitoshi Yamamoto, Isamu Okada, Satoshi Uchida, Tatsuya Sasaki
    Comments: 15 pages (incl. supplementary materials), 6 figures, 7 table
    Journal-ref: Scientific Reports 7, 44146 (2017)
    Subjects: Physics and Society (physics.soc-ph); Multiagent Systems (cs.MA); Neural and Evolutionary Computing (cs.NE); Populations and Evolution (q-bio.PE)

    Although various norms for reciprocity-based cooperation have been suggested
    that are evolutionarily stable against invasion from free riders, the process
    of alternation of norms and the role of diversified norms remain unclear in the
    evolution of cooperation. We clarify the co-evolutionary dynamics of norms and
    cooperation in indirect reciprocity and also identify the indispensable norms
    for the evolution of cooperation. Inspired by the gene knockout method, a
    genetic engineering technique, we developed the norm knockout method and
    clarified the norms necessary for the establishment of cooperation. The results
    of numerical investigations revealed that the majority of norms gradually
    transitioned to tolerant norms after defectors are eliminated by strict norms.
    Furthermore, no cooperation emerges when specific norms that are intolerant to
    defectors are knocked out.

    Ask Me Even More: Dynamic Memory Tensor Networks (Extended Model)

    Govardana Sachithanandam Ramachandran, Ajay Sohmshetty
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We examine Memory Networks for the task of question answering (QA), under
    common real world scenario where training examples are scarce and under weakly
    supervised scenario, that is only extrinsic labels are available for training.
    We propose extensions for the Dynamic Memory Network (DMN), specifically within
    the attention mechanism, we call the resulting Neural Architecture as Dynamic
    Memory Tensor Network (DMTN). Ultimately, we see that our proposed extensions
    results in over 80% improvement in the number of task passed against the
    baselined standard DMN and 20% more task passed compared to state-of-the-art
    End-to-End Memory Network for Facebook’s single task weakly trained 1K bAbi
    dataset.

    Evolution Strategies as a Scalable Alternative to Reinforcement Learning

    Tim Salimans, Jonathan Ho, Xi Chen, Ilya Sutskever
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We explore the use of Evolution Strategies, a class of black box optimization
    algorithms, as an alternative to popular RL techniques such as Q-learning and
    Policy Gradients. Experiments on MuJoCo and Atari show that ES is a viable
    solution strategy that scales extremely well with the number of CPUs available:
    By using hundreds to thousands of parallel workers, ES can solve 3D humanoid
    walking in 10 minutes and obtain competitive results on most Atari games after
    one hour of training time. In addition, we highlight several advantages of ES
    as a black box optimization technique: it is invariant to action frequency and
    delayed rewards, tolerant of extremely long horizons, and does not need
    temporal discounting or value function approximation.

    Stochastic inference with spiking neurons in the high-conductance state

    Mihai A. Petrovici, Johannes Bill, Ilja Bytschok, Johannes Schemmel, Karlheinz Meier
    Journal-ref: Phys. Rev. E 94, 042312 (2016)
    Subjects: Neurons and Cognition (q-bio.NC); Disordered Systems and Neural Networks (cond-mat.dis-nn); Neural and Evolutionary Computing (cs.NE); Biological Physics (physics.bio-ph); Machine Learning (stat.ML)

    The highly variable dynamics of neocortical circuits observed in vivo have
    been hypothesized to represent a signature of ongoing stochastic inference but
    stand in apparent contrast to the deterministic response of neurons measured in
    vitro. Based on a propagation of the membrane autocorrelation across spike
    bursts, we provide an analytical derivation of the neural activation function
    that holds for a large parameter space, including the high-conductance state.
    On this basis, we show how an ensemble of leaky integrate-and-fire neurons with
    conductance-based synapses embedded in a spiking environment can attain the
    correct firing statistics for sampling from a well-defined target distribution.
    For recurrent networks, we examine convergence toward stationarity in computer
    simulations and demonstrate sample-based Bayesian inference in a mixed
    graphical model. This points to a new computational role of high-conductance
    states and establishes a rigorous link between deterministic neuron models and
    functional stochastic dynamics on the network level.


    Computer Vision and Pattern Recognition

    Detailed, accurate, human shape estimation from clothed 3D scan sequences

    Chao Zhang, Sergi Pujades, Michael Black, Gerard Pons-Moll
    Comments: 10 pages, 11 figures, conference
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We address the problem of estimating human body shape from 3D scans over
    time. Reliable estimation of 3D body shape is necessary for many applications
    including virtual try-on, health monitoring, and avatar creation for virtual
    reality. Scanning bodies in minimal clothing, however, presents a practical
    barrier to these applications. We address this problem by estimating body shape
    under clothing from a sequence of 3D scans. Previous methods that have
    exploited statistical models of body shape produce overly smooth shapes lacking
    personalized details. In this paper we contribute a new approach to recover not
    only an approximate shape of the person, but also their detailed shape. Our
    approach allows the estimated shape to deviate from a parametric model to fit
    the 3D scans. We demonstrate the method using high quality 4D data as well as
    sequences of visual hulls extracted from multi-view images. We also make
    available a new high quality 4D dataset that enables quantitative evaluation.
    Our method outperforms the previous state of the art, both qualitatively and
    quantitatively.

    Guetzli: Perceptually Guided JPEG Encoder

    Jyrki Alakuijala, Robert Obryk, Ostap Stoliarchuk, Zoltan Szabadka, Lode Vandevenne, Jan Wassenberg
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)

    Guetzli is a new JPEG encoder that aims to produce visually indistinguishable
    images at a lower bit-rate than other common JPEG encoders. It optimizes both
    the JPEG global quantization tables and the DCT coefficient values in each JPEG
    block using a closed-loop optimizer. Guetzli uses Butteraugli, our perceptual
    distance metric, as the source of feedback in its optimization process. We
    reach a 29-45% reduction in data size for a given perceptual distance,
    according to Butteraugli, in comparison to other compressors we tried.
    Guetzli’s computation is currently extremely slow, which limits its
    applicability to compressing static content and serving as a proof- of-concept
    that we can achieve significant reductions in size by combining advanced
    psychovisual models with lossy compression techniques.

    Improving LBP and its variants using anisotropic diffusion

    Mariane B. Neiva, Patrick Guidotti, Odemir M. Bruno
    Comments: 14 pages, 10 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The main purpose of this paper is to propose a new preprocessing step in
    order to improve local feature descriptors and texture classification.
    Preprocessing is implemented by using transformations which help highlight
    salient features that play a significant role in texture recognition. We
    evaluate and compare four different competing methods: three different
    anisotropic diffusion methods including the classical anisotropic Perona-Malik
    diffusion and two subsequent regularizations of it and the application of a
    Gaussian kernel, which is the classical multiscale approach in texture
    analysis. The combination of the transformed images and the original ones are
    analyzed. The results show that the use of the preprocessing step does lead to
    improved texture recognition.

    Users prefer Guetzli JPEG over same-sized libjpeg

    Jyrki Alakuijala, Robert Obryk, Zoltan Szabadka, Jan Wassenberg
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)

    We report on pairwise comparisons by human raters of JPEG images from libjpeg
    and our new Guetzli encoder. Although both files are size-matched, 75% of
    ratings are in favor of Guetzli. This implies the Butteraugli psychovisual
    image similarity metric which guides Guetzli is reasonably close to human
    perception at high quality levels. We provide access to the raw ratings and
    source images for further analysis and study.

    Zero-Shot Learning – The Good, the Bad and the Ugly

    Yongqin Xian, Bernt Schiele, Zeynep Akata
    Comments: Accepted as a full paper in IEEE Computer Vision and Pattern Recognition (CVPR) 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Although various norms for reciprocity-based cooperation have been suggested
    that are evolutionarily stable against invasion from free riders, the process
    of alternation of norms and the role of diversified norms remain unclear in the
    evolution of cooperation. We clarify the co-evolutionary dynamics of norms and
    cooperation in indirect reciprocity and also identify the indispensable norms
    for the evolution of cooperation. Inspired by the gene knockout method, a
    genetic engineering technique, we developed the norm knockout method and
    clarified the norms necessary for the establishment of cooperation. The results
    of numerical investigations revealed that the majority of norms gradually
    transitioned to tolerant norms after defectors are eliminated by strict norms.
    Furthermore, no cooperation emerges when specific norms that are intolerant to
    defectors are knocked out.

    Randomized Iterative Reconstruction for Sparse View X-ray Computed Tomography

    D. Trinca, Y. Zhong
    Comments: 23 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    With the availability of more powerful computers, iterative reconstruction
    algorithms are the subject of an ongoing work in the design of more efficient
    reconstruction algorithms for X-ray computed tomography. In this work, we show
    how two analytical reconstruction algorithms can be improved by correcting the
    corresponding reconstructions using a randomized iterative reconstruction
    algorithm. The combined analytical reconstruction followed by randomized
    iterative reconstruction can also be viewed as a reconstruction algorithm
    which, in the experiments we have conducted, uses up to (35\%) less projection
    angles as compared to the analytical reconstruction algorithms and produces the
    same results in terms of quality of reconstruction, without increasing the
    execution time significantly.

    Enhancement of human color vision by breaking the binocular redundancy

    Bradley S. Gundlach, Alireza Shahsafi, Gregory Vershbow, Chenghao Wan, Jad Salman, Bas Rokers, Laurent Lessard, Mikhail A. Kats
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Biological Physics (physics.bio-ph)

    To see color, the human visual system combines the responses of three types
    of cone cells in the retina – a process that discards a significant amount of
    spectral information. We present an approach that can enhance human color
    vision by breaking the inherent redundancy in binocular vision, providing
    different spectral content to each eye. Using a psychophysical color model and
    thin-film optimization, we designed a wearable passive multispectral device
    that uses two distinct transmission filters, one for each eye, to enhance the
    user’s ability to perceive spectral information. We fabricated and tested a
    design that “splits” the response of the short-wavelength cone of individuals
    with typical trichromatic vision, effectively simulating the presence of four
    distinct cone types between the two eyes (“tetrachromacy”). Users of this
    device were able to differentiate metamers (distinct spectra that resolve to
    the same perceived color in typical observers) without apparent adverse effects
    to vision. The increase in the number of effective cones from the typical three
    reduces the number of possible metamers that can be encountered, enhancing the
    ability to discriminate objects based on their emission, reflection, or
    transmission spectra. This technique represents a significant enhancement of
    the spectral perception of typical humans, and may have applications ranging
    from camouflage detection and anti-counterfeiting to art and data
    visualization.

    Extrinsic Calibration of 3D Range Finder and Camera without Auxiliary Object or Human Intervention

    Qinghai Liao, Ming Liu, Lei Tai, Haoyang Ye
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Fusion of heterogeneous extroceptive sensors is the most effient and
    effective way to representing the environment precisely, as it overcomes
    various defects of each homogeneous sensor. The rigid transformation (aka.
    extrinsic parameters) of heterogeneous sensory systems should be available
    before precisely fusing the multisensor information. Researchers have proposed
    several approaches to estimating the extrinsic parameters. These approaches
    require either auxiliary objects, like chessboards, or extra help from human to
    select correspondences. In this paper, we proposed a novel extrinsic
    calibration approach for the extrinsic calibration of range and image sensors.
    As far as we know, it is the first automatic approach with no requirement of
    auxiliary objects or any human interventions. First, we estimate the initial
    extrinsic parameters from the individual motion of the range finder and the
    camera. Then we extract lines in the image and point-cloud pairs, to refine the
    line feature associations by the initial extrinsic parameters. At the end, we
    discussed the degenerate case which may lead to the algorithm failure and
    validate our approach by simulation. The results indicate high-precision
    extrinsic calibration results against the ground-truth.

    Deep Learning for Skin Lesion Classification

    P. Mirunalini, Aravindan Chandrabose, Vignesh Gokul, S. M. Jaisakthi
    Comments: 3 pages with 3 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Melanoma, a malignant form of skin cancer is very threatening to life.
    Diagnosis of melanoma at an earlier stage is highly needed as it has a very
    high cure rate. Benign and malignant forms of skin cancer can be detected by
    analyzing the lesions present on the surface of the skin using dermoscopic
    images. In this work, an automated skin lesion detection system has been
    developed which learns the representation of the image using Google’s
    pretrained CNN model known as Inception-v3 cite{cnn}. After obtaining the
    representation vector for our input dermoscopic images we have trained two
    layer feed forward neural network to classify the images as malignant or
    benign. The system also classifies the images based on the cause of the cancer
    either due to melanocytic or non-melanocytic cells using a different neural
    network. These classification tasks are part of the challenge organized by
    International Skin Imaging Collaboration (ISIC) 2017. Our system learns to
    classify the images based on the model built using the training images given in
    the challenge and the experimental results were evaluated using validation and
    test sets. Our system has achieved an overall accuracy of 65.8\% for the
    validation set.

    A Localisation-Segmentation Approach for Multi-label Annotation of Lumbar Vertebrae using Deep Nets

    Anjany Sekuboyina, Alexander Valentinitsch, Jan S. Kirschke, Bjoern H. Menze
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Multi-class segmentation of vertebrae is a non-trivial task mainly due to the
    high correlation in the appearance of adjacent vertebrae. Hence, such a task
    calls for the consideration of both global and local context. Based on this
    motivation, we propose a two-staged approach that, given a computed tomography
    dataset of the spine, segments the five lumbar vertebrae and simultaneously
    labels them. The first stage employs a multi-layered perceptron performing
    non-linear regression for locating the lumbar region using the global context.
    The second stage, comprised of a fully-convolutional deep network, exploits the
    local context in the localised lumbar region to segment and label the lumbar
    vertebrae in one go. Aided with practical data augmentation for training, our
    approach is highly generalisable, capable of successfully segmenting both
    healthy and abnormal vertebrae (fractured and scoliotic spines). We
    consistently achieve an average Dice coefficient of over 90 percent on a
    publicly available dataset of the xVertSeg segmentation challenge of MICCAI
    2016. This is particularly noteworthy because the xVertSeg dataset is beset
    with severe deformities in the form of vertebral fractures and scoliosis.

    What You Expect is NOT What You Get! Questioning Reconstruction/Classification Correlation of Stacked Convolutional Auto-Encoder Features

    Michele Alberti, Mathias Seuret, Rolf Ingold, Marcus Liwicki (Document Image Voice Analysis Group, University Of Fribourg, Fribourg, Switzerland)
    Comments: Submitted to the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR 2017) 6 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we thoroughly investigate the quality of features produced by
    deep neural network architectures obtained by stacking and convolving
    Auto-Encoders. In particular, we are interested into the relation of their
    reconstruction score with their performance on document layout analysis. When
    using Auto-Encoders, intuitively one could assume that features which are good
    for reconstruction will also lead to high classification accuracies. However,
    we prove that this is not always the case. We examine the reconstruction score,
    training error and the results obtained if we were to use the same features for
    both input reconstruction and a classification task. We show that the
    reconstruction score is not a good metric because it is biased by the decoder
    quality. Furthermore, experimental results suggest that there is no correlation
    between the reconstruction score and the quality of features for a
    classification task and that given the network size and configuration it is not
    possible to make assumptions on its training error magnitude. Therefore we
    conclude that both, reconstruction score and training error should not be used
    jointly to evaluate the quality of the features produced by a Stacked
    Convolutional Auto-Encoders for a classification task. Consequently one should
    independently investigate the network classification abilities directly.

    End-to-End Learning of Geometry and Context for Deep Stereo Regression

    Alex Kendall, Hayk Martirosyan, Saumitro Dasgupta, Peter Henry, Ryan Kennedy, Abraham Bachrach, Adam Bry
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    We propose a novel deep learning architecture for regressing disparity from a
    rectified pair of stereo images. We leverage knowledge of the problem’s
    geometry to form a cost volume using deep feature representations. We learn to
    incorporate contextual information using 3-D convolutions over this volume.
    Disparity values are regressed from the cost volume using a proposed
    differentiable soft argmin operation, which allows us to train our method
    end-to-end to sub-pixel accuracy without any additional post-processing or
    regularization. We evaluate our method on the Scene Flow and KITTI datasets and
    on KITTI we set a new state-of-the-art benchmark, while being significantly
    faster than competing approaches.

    Automatic Skin Lesion Segmentation using Semi-supervised Learning Technique

    S. M. Jaisakthi, Aravindan Chandrabose, P. Mirunalini
    Comments: 4 pages with 1 figure
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Skin cancer is the most common of all cancers and each year million cases of
    skin cancer are treated. Treating and curing skin cancer is easy, if it is
    diagnosed and treated at an early stage. In this work we propose an automatic
    technique for skin lesion segmentation in dermoscopic images which helps in
    classifying the skin cancer types. The proposed method comprises of two major
    phases (1) preprocessing and (2) segmentation using semi-supervised learning
    algorithm. In the preprocessing phase noise are removed using filtering
    technique and in the segmentation phase skin lesions are segmented based on
    clustering technique. K-means clustering algorithm is used to cluster the
    preprocessed images and skin lesions are filtered from these clusters based on
    the color feature. Color of the skin lesions are learned from the training
    images using histograms calculations in RGB color space. The training images
    were downloaded from the ISIC 2017 challenge website and the experimental
    results were evaluated using validation and test sets.

    Poisson multi-Bernoulli mixture filter: direct derivation and implementation

    Ángel F. García-Fernández, Jason L. Williams, Karl Granström, Lennart Svensson
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Methodology (stat.ME)

    We provide a derivation of the Poisson multi-Bernoulli mixture (PMBM) filter
    for multi-target tracking with the standard point target measurements without
    using probability generating functionals or functional derivatives. We also
    establish the connection with the delta-generalised labelled multi-Bernoulli
    (delta-GLMB) filter, showing that a delta-GLMB density represents a
    multi-Bernoulli mixture with labelled targets so it can be seen as a special
    case of PMBM. In addition, we propose an implementation for linear/Gaussian
    dynamic and measurement models and how to efficiently obtain typical estimators
    in the literature from the PMBM. The PMBM filter is shown to outperform other
    filters in the literature in a challenging scenario

    GUN: Gradual Upsampling Network for single image super-resolution

    Yang Zhao, Ronggang Wang, Weisheng Dong, Wei Jia, Jianchao Yang, Xiaoping Liu, Wen Gao
    Comments: 11 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose an efficient super-resolution (SR) method based on
    deep convolutional neural network (CNN), namely gradual upsampling network
    (GUN). Recent CNN based SR methods either preliminarily magnify the low
    resolution (LR) input to high resolution (HR) and then reconstruct the HR
    input, or directly reconstruct the LR input and then recover the HR result at
    the last layer. The proposed GUN utilizes a gradual process instead of these
    two kinds of frameworks. The GUN consists of an input layer, multistep
    upsampling and convolutional layers, and an output layer. By means of the
    gradual process, the proposed network can simplify the difficult direct SR
    problem to multistep easier upsampling tasks with very small magnification
    factor in each step. Furthermore, a gradual training strategy is presented for
    the GUN. In the proposed training process, an initial network can be easily
    trained with edge-like samples, and then the weights are gradually tuned with
    more complex samples. The GUN can recover fine and vivid results, and is easy
    to be trained. The experimental results on several image sets demonstrate the
    effectiveness of the proposed network.

    Automatic Skin Lesion Analysis using Large-scale Dermoscopy Images and Deep Residual Networks

    Lei Bi, Jinman Kim, Euijoon Ahn, Dagan Feng
    Comments: Submission for ISIC17 Challenge
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Malignant melanoma has one of the most rapidly increasing incidences in the
    world and has a considerable mortality rate. Early diagnosis is particularly
    important since melanoma can be cured with prompt excision. Dermoscopy images
    play an important role in the non-invasive early detection of melanoma [1].
    However, melanoma detection using human vision alone can be subjective,
    inaccurate and poorly reproducible even among experienced dermatologists. This
    is attributed to the challenges in interpreting images with diverse
    characteristics including lesions of varying sizes and shapes, lesions that may
    have fuzzy boundaries, different skin colors and the presence of hair [2].
    Therefore, the automatic analysis of dermoscopy images is a valuable aid for
    clinical decision making and for image-based diagnosis to identify diseases
    such as melanoma [1-4]. Deep residual networks (ResNets) has achieved
    state-of-the-art results in image classification and detection related problems
    [5-8]. In this ISIC 2017 skin lesion analysis challenge [9], we propose to
    exploit the deep ResNets for robust visual features learning and
    representations.

    Hardware-Driven Nonlinear Activation for Stochastic Computing Based Deep Convolutional Neural Networks

    Ji Li, Zihao Yuan, Zhe Li, Caiwen Ding, Ao Ren, Qinru Qiu, Jeffrey Draper, Yanzhi Wang
    Comments: This paper is accepted by 2017 International Joint Conference on Neural Networks (IJCNN)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recently, Deep Convolutional Neural Networks (DCNNs) have made unprecedented
    progress, achieving the accuracy close to, or even better than human-level
    perception in various tasks. There is a timely need to map the latest software
    DCNNs to application-specific hardware, in order to achieve orders of magnitude
    improvement in performance, energy efficiency and compactness. Stochastic
    Computing (SC), as a low-cost alternative to the conventional binary computing
    paradigm, has the potential to enable massively parallel and highly scalable
    hardware implementation of DCNNs. One major challenge in SC based DCNNs is
    designing accurate nonlinear activation functions, which have a significant
    impact on the network-level accuracy but cannot be implemented accurately by
    existing SC computing blocks. In this paper, we design and optimize SC based
    neurons, and we propose highly accurate activation designs for the three most
    frequently used activation functions in software DCNNs, i.e, hyperbolic
    tangent, logistic, and rectified linear units. Experimental results on LeNet-5
    using MNIST dataset demonstrate that compared with a binary ASIC hardware DCNN,
    the DCNN with the proposed SC neurons can achieve up to 61X, 151X, and 2X
    improvement in terms of area, power, and energy, respectively, at the cost of
    small precision degradation.In addition, the SC approach achieves up to 21X and
    41X of the area, 41X and 72X of the power, and 198200X and 96443X of the
    energy, compared with CPU and GPU approaches, respectively, while the error is
    increased by less than 3.07%. ReLU activation is suggested for future SC based
    DCNNs considering its superior performance under a small bit stream length.

    Co-occurrence Filter

    Roy J Jevnisek, Shai Avidan
    Comments: accepted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Co-occurrence Filter (CoF) is a boundary preserving filter. It is based on
    the Bilateral Filter (BF) but instead of using a Gaussian on the range values
    to preserve edges it relies on a co-occurrence matrix. Pixel values that
    co-occur frequently in the image (i.e., inside textured regions) will have a
    high weight in the co-occurrence matrix. This, in turn, means that such pixel
    pairs will be averaged and hence smoothed, regardless of their intensity
    differences. On the other hand, pixel values that rarely co-occur (i.e., across
    texture boundaries) will have a low weight in the co-occurrence matrix. As a
    result, they will not be averaged and the boundary between them will be
    preserved. The CoF therefore extends the BF to deal with boundaries, not just
    edges. It learns co-occurrences directly from the image. We can achieve various
    filtering results by directing it to learn the co-occurrence matrix from a part
    of the image, or a different image. We give the definition of the filter,
    discuss how to use it with color images and show several use cases.

    Combining Residual Networks with LSTMs for Lipreading

    Themos Stafylakis, Georgios Tzimiropoulos
    Comments: Submitted to Interspeech 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose an end-to-end deep learning architecture for word-level visual
    speech recognition. The system is a combination of spatiotemporal
    convolutional, residual and bidirectional Long Short-Term Memory networks. We
    trained and evaluated it on the Lipreading In-The-Wild benchmark, a challenging
    database of 500-size vocabulary consisting of video excerpts from BBC TV
    broadcasts. The proposed network attains word accuracy equal to 83.0%, yielding
    6.8% absolute improvement over the current state-of-the-art.

    Detection of Human Rights Violations in Images: Can Convolutional Neural Networks help?

    Grigorios Kalliatakis (1), Shoaib Ehsan (1), Maria Fasli (1), Ales Leonardis (2), Juergen Gall (3), Klaus D. McDonald-Maier (1) ((1) School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK, (2) School of Computer Science, University of Birmingham, Birmingham, UK, (3) Institute of Computer Science, University of Bonn, Bonn, Germany)
    Comments: In Proceedings of the 12th International Conference on Computer Vision Theory and Applications (VISAPP 2017)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    After setting the performance benchmarks for image, video, speech and audio
    processing, deep convolutional networks have been core to the greatest advances
    in image recognition tasks in recent times. This raises the question of whether
    there are any benefit in targeting these remarkable deep architectures with the
    unattempted task of recognising human rights violations through digital images.
    Under this perspective, we introduce a new, well-sampled human rights-centric
    dataset called Human Rights Understanding (HRUN). We conduct a rigorous
    evaluation on a common ground by combining this dataset with different
    state-of-the-art deep convolutional architectures in order to achieve
    recognition of human rights violations. Experimental results on the HRUN
    dataset have shown that the best performing CNN architectures can achieve up to
    88.10\% mean average precision. Additionally, our experiments demonstrate that
    increasing the size of the training samples is crucial for achieving an
    improvement on mean average precision principally when utilising very deep
    networks.

    Evaluating Deep Convolutional Neural Networks for Material Classification

    Grigorios Kalliatakis (1), Georgios Stamatiadis (1), Shoaib Ehsan (1), Ales Leonardis (2), Juergen Gall (3), Anca Sticlaru (1), Klaus D. McDonald-Maier (1) ((1) School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK, (2) School of Computer Science, University of Birmingham, Birmingham, UK, (3) Institute of Computer Science, University of Bonn, Bonn, Germany)
    Comments: In Proceedings of the 12th International Conference on Computer Vision Theory and Applications (VISAPP 2017)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Determining the material category of a surface from an image is a demanding
    task in perception that is drawing increasing attention. Following the recent
    remarkable results achieved for image classification and object detection
    utilising Convolutional Neural Networks (CNNs), we empirically study material
    classification of everyday objects employing these techniques. More
    specifically, we conduct a rigorous evaluation of how state-of-the art CNN
    architectures compare on a common ground over widely used material databases.
    Experimental results on three challenging material databases show that the best
    performing CNN architectures can achieve up to 94.99\% mean average precision
    when classifying materials.

    Improving Interpretability of Deep Neural Networks with Semantic Information

    Yinpeng Dong, Hang Su, Jun Zhu, Bo Zhang
    Comments: To appear in CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Interpretability of deep neural networks (DNNs) is essential since it enables
    users to understand the overall strengths and weaknesses of the models, conveys
    an understanding of how the models will behave in the future, and how to
    diagnose and correct potential problems. However, it is challenging to reason
    about what a DNN actually does due to its opaque or black-box nature. To
    address this issue, we propose a novel technique to improve the
    interpretability of DNNs by leveraging the rich semantic information embedded
    in human descriptions. By concentrating on the video captioning task, we first
    extract a set of semantically meaningful topics from the human descriptions
    that cover a wide range of visual concepts, and integrate them into the model
    with an interpretive loss. We then propose a prediction difference maximization
    algorithm to interpret the learned features of each neuron. Experimental
    results demonstrate its effectiveness in video captioning using the
    interpretable features, which can also be transferred to video action
    recognition. By clearly understanding the learned features, users can easily
    revise false predictions via a human-in-the-loop procedure.

    Local Patch Classification Based Framework for Single Image Super-Resolution

    Yang Zhao, Ronggang Wang, Wei Jia, Jianchao Yang, Wenmin Wang, Wen Gao
    Comments: 11 pages, 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recent learning-based super-resolution (SR) methods often focus on the
    dictionary learning or network training. In this paper, we detailedly discuss a
    new SR framework based on local classification instead of traditional
    dictionary learning. The proposed efficient and extendible SR framework is
    named as local patch classification (LPC) based framework. The LPC framework
    consists of a learning stage and a reconstructing stage. In the learning stage,
    image patches are classified into different classes by means of the proposed
    local patch encoding (LPE), and then a projection matrix is computed for each
    class by utilizing a simple constraint. In the reconstructing stage, an input
    LR patch can be simply reconstructed by computing its LPE code and then
    multiplying corresponding projection matrix. Furthermore, we establish the
    relationship between the proposed method and the anchored neighborhood
    regression based methods; and we also analyze the extendibility of the proposed
    framework. The experimental results on several image sets demonstrate the
    effectiveness of the proposed framework.

    SurfNet: Generating 3D shape surfaces using deep residual networks

    Ayan Sinha, Asim Unmesh, Qixing Huang, Karthik Ramani
    Comments: CVPR 2017 paper
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG)

    3D shape models are naturally parameterized using vertices and faces, ie,
    composed of polygons forming a surface. However, current 3D learning paradigms
    for predictive and generative tasks using convolutional neural networks focus
    on a voxelized representation of the object. Lifting convolution operators from
    the traditional 2D to 3D results in high computational overhead with little
    additional benefit as most of the geometry information is contained on the
    surface boundary. Here we study the problem of directly generating the 3D shape
    surface of rigid and non-rigid shapes using deep convolutional neural networks.
    We develop a procedure to create consistent `geometry images’ representing the
    shape surface of a category of 3D objects. We then use this consistent
    representation for category-specific shape surface generation from a parametric
    representation or an image by developing novel extensions of deep residual
    networks for the task of geometry image generation. Our experiments indicate
    that our network learns a meaningful representation of shape surfaces allowing
    it to interpolate between shape orientations and poses, invent new shape
    surfaces and reconstruct 3D shape surfaces from previously unseen images.

    Prostate Cancer Diagnosis using Deep Learning with 3D Multiparametric MRI

    Saifeng Liu, Huaixiu Zheng, Yesu Feng, Wei Li
    Comments: 4 pages, 4 figures, Proc. SPIE 10134, Medical Imaging 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    A novel deep learning architecture (XmasNet) based on convolutional neural
    networks was developed for the classification of prostate cancer lesions, using
    the 3D multiparametric MRI data provided by the PROSTATEx challenge. End-to-end
    training was performed for XmasNet, with data augmentation done through 3D
    rotation and slicing, in order to incorporate the 3D information of the lesion.
    XmasNet outperformed traditional machine learning models based on engineered
    features, for both train and test data. For the test data, XmasNet outperformed
    69 methods from 33 participating groups and achieved the second highest AUC
    (0.84) in the PROSTATEx challenge. This study shows the great potential of deep
    learning for cancer imaging.

    A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation

    Chunpeng Wu, Wei Wen, Tariq Afzal, Yongmei Zhang, Yiran Chen, Hai Li
    Comments: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’17)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    Recently, DNN model compression based on network architecture design, e.g.,
    SqueezeNet, attracted a lot attention. No accuracy drop on image classification
    is observed on these extremely compact networks, compared to well-known models.
    An emerging question, however, is whether these model compression techniques
    hurt DNN’s learning ability other than classifying images on a single dataset.
    Our preliminary experiment shows that these compression methods could degrade
    domain adaptation (DA) ability, though the classification performance is
    preserved. Therefore, we propose a new compact network architecture and
    unsupervised DA method in this paper. The DNN is built on a new basic module
    Conv-M which provides more diverse feature extractors without significantly
    increasing parameters. The unified framework of our DA method will
    simultaneously learn invariance across domains, reduce divergence of feature
    representations, and adapt label prediction. Our DNN has 4.1M parameters, which
    is only 6.7% of AlexNet or 59% of GoogLeNet. Experiments show that our DNN
    obtains GoogLeNet-level accuracy both on classification and DA, and our DA
    method slightly outperforms previous competitive ones. Put all together, our DA
    strategy based on our DNN achieves state-of-the-art on sixteen of total
    eighteen DA tasks on popular Office-31 and Office-Caltech datasets.

    Multi-Pose Face Recognition Using Hybrid Face Features Descriptor

    I Gede Pasek Suta Wijaya, Keiichi Uchimura, Gou Koutaki
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a multi-pose face recognition approach using hybrid face
    features descriptors (HFFD). The HFFD is a face descriptor containing of rich
    discriminant information that is created by fusing some frequency-based
    features extracted using both wavelet and DCT analysis of several different
    poses of 2D face images. The main aim of this method is to represent the
    multi-pose face images using a dominant frequency component with still having
    reasonable achievement compared to the recent multi-pose face recognition
    methods. The HFFD based face recognition tends to achieve better performance
    than that of the recent 2D-based face recognition method. In addition, the
    HFFD-based face recognition also is sufficiently to handle large face
    variability due to face pose variations .

    Colorization as a Proxy Task for Visual Understanding

    Gustav Larsson, Michael Maire, Gregory Shakhnarovich
    Comments: To appear in CVPR 2017 (project page: this http URL)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We investigate and improve self-supervision as a drop-in replacement for
    ImageNet pretraining, focusing on automatic colorization as the proxy task.
    Self-supervised training has been shown to be more promising for utilizing
    unlabeled data than other, traditional unsupervised learning methods. We show
    the ability of our self-supervised network in several contexts. On VOC
    segmentation and classification tasks, we present results that are
    state-of-the-art among methods not using ImageNet labels for pretraining
    representations.

    Moreover, we present the first in-depth analysis of self-supervision via
    colorization, concluding that formulation of the loss, training details and
    network architecture play important roles in its effectiveness. This
    investigation is further expanded by revisiting the ImageNet pretraining
    paradigm, asking questions such as: How much training data is needed? How many
    labels are needed? How much do features change when fine-tuned? We relate these
    questions back to self-supervision by showing that self-supervised colorization
    provides a similarly powerful supervision as various flavors of ImageNet
    pretraining.

    Negentropic Planar Symmetry Detector

    Agata Migalska, JP Lewis
    Comments: 25 pages, 18 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

    In this paper we observe that information theoretical concepts are valuable
    tools for extracting information from images and, in particular, information on
    image symmetries. It is shown that the problem of detecting reflectional and
    rotational symmetries in a two-dimensional image can be reduced to the problem
    of detecting point-symmetry and periodicity in one-dimensional negentropy
    functions. Based on these findings a detector of reflectional and rotational
    global symmetries in greyscale images is constructed. We discuss the importance
    of high precision in symmetry detection in applications arising from quality
    control and illustrate how the proposed method satisfies this requirement.
    Finally, a superior performance of our method to other existing methods,
    demonstrated by the results of a rigorous experimental verification, is an
    indication that our approach rooted in information theory is a promising
    direction in a development of a robust and widely applicable symmetry detector.

    Neural method for Explicit Mapping of Quasi-curvature Locally Linear Embedding in image retrieval

    Shenglan Liu, Jun Wu, Lin Feng, Feilong Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposed a new explicit nonlinear dimensionality reduction using
    neural networks for image retrieval tasks. We first proposed a Quasi-curvature
    Locally Linear Embedding (QLLE) for training set. QLLE guarantees the linear
    criterion in neighborhood of each sample. Then, a neural method (NM) is
    proposed for out-of-sample problem. Combining QLLE and NM, we provide a
    explicit nonlinear dimensionality reduction approach for efficient image
    retrieval. The experimental results in three benchmark datasets illustrate that
    our method can get better performance than other state-of-the-art out-of-sample
    methods.

    Web-based visualisation of head pose and facial expressions changes: monitoring human activity using depth data

    Grigorios Kalliatakis (1), Nikolaos Vidakis (2), Georgios Triantafyllidis (3) ((1) School of Computer Science and Electronic Engineering, University of Essex, UK, (2) Department of Informatics Engineering, Technological Educational Institute of Crete, Greece, (3) Mediology Section, AD: MT, Aalborg University, Copenhagen, Denmark)
    Comments: 8th Computer Science and Electronic Engineering, (CEEC 2016), University of Essex, UK
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Despite significant recent advances in the field of head pose estimation and
    facial expression recognition, raising the cognitive level when analysing human
    activity presents serious challenges to current concepts. Motivated by the need
    of generating comprehensible visual representations from different sets of
    data, we introduce a system capable of monitoring human activity through head
    pose and facial expression changes, utilising an affordable 3D sensing
    technology (Microsoft Kinect sensor). An approach build on discriminative
    random regression forests was selected in order to rapidly and accurately
    estimate head pose changes in unconstrained environment. In order to complete
    the secondary process of recognising four universal dominant facial expressions
    (happiness, anger, sadness and surprise), emotion recognition via facial
    expressions (ERFE) was adopted. After that, a lightweight data exchange format
    (JavaScript Object Notation-JSON) is employed, in order to manipulate the data
    extracted from the two aforementioned settings. Such mechanism can yield a
    platform for objective and effortless assessment of human activity within the
    context of serious gaming and human-computer interaction.

    Viraliency: Pooling Local Virality

    Xavier Alameda-Pineda, Andrea Pilzer, Dan Xu, Nicu Sebe, Elisa Ricci
    Comments: Accepted at IEEE CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In our overly-connected world, the automatic recognition of virality – the
    quality of an image or video to be rapidly and widely spread in social networks
    – is of crucial importance, and has recently awaken the interest of the
    computer vision community. Concurrently, recent progress in deep learning
    architectures showed that global pooling strategies allow the extraction of
    activation maps, which highlight the parts of the image most likely to contain
    instances of a certain class. We extend this concept by introducing a pooling
    layer that learns the size of the support area to be averaged: the learned
    top-N average (LENA) pooling. We hypothesize that the latent concepts (feature
    maps) describing virality may require such a rich pooling strategy. We assess
    the effectiveness of the LENA layer by appending it on top of a convolutional
    siamese architecture and evaluate its performance on the task of predicting and
    localizing virality. We report experiments on two publicly available datasets
    annotated for virality and show that our method outperforms state-of-the-art
    approaches.

    Gait Pattern Recognition Using Accelerometers

    Vahid Alizadeh
    Comments: 6 pages, project report
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Motion ability is one of the most important human properties, including gait
    as a basis of human transitional movement. Gait, as a biometric for recognizing
    human identities, can be non-intrusively captured signals using wearable or
    portable smart devices. In this study gait patterns is collected using a
    wireless platform of two sensors located at chest and right ankle of the
    subjects. Then the raw data has undergone some preprocessing methods and
    segmented into 5 seconds windows. Some time and frequency domain features is
    extracted and the performance evaluated by 5 different classifiers. Decision
    Tree (with all features) and K-Nearest Neighbors (with 10 selected features)
    classifiers reached 99.4% and 100% respectively.

    Segmentation of skin lesions based on fuzzy classification of pixels and histogram thresholding

    Jose Luis Garcia-Arroyo, Begonya Garcia-Zapirain
    Comments: 5 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    This paper proposes an innovative method for segmentation of skin lesions in
    dermoscopy images developed by the authors, based on fuzzy classification of
    pixels and histogram thresholding.

    Deep Image Matting

    Ning Xu, Brian Price, Scott Cohen, Thomas Huang
    Comments: Computer Vision and Pattern Recognition 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Image matting is a fundamental computer vision problem and has many
    applications. Previous algorithms have poor performance when an image has
    similar foreground and background colors or complicated textures. The main
    reasons are prior methods 1) only use low-level features and 2) lack high-level
    context. In this paper, we propose a novel deep learning based algorithm that
    can tackle both these problems. Our deep model has two parts. The first part is
    a deep convolutional encoder-decoder network that takes an image and the
    corresponding trimap as inputs and predict the alpha matte of the image. The
    second part is a small convolutional network that refines the alpha matte
    predictions of the first network to have more accurate alpha values and sharper
    edges. In addition, we also create a large-scale image matting dataset
    including 49300 training images and 1000 testing images. We evaluate our
    algorithm on the image matting benchmark, our testing set, and a wide variety
    of real images. Experimental results clearly demonstrate the superiority of our
    algorithm over previous methods.

    Depth from Monocular Images using a Semi-Parallel Deep Neural Network (SPDNN) Hybrid Architecture

    S. Bazrafkan (1), H. Javidnia (1), J. Lemley (1), P. Corcoran (1) ((1) Department of Electrical & Electronic Engineering, College of Engineering and Informatics, National University of Ireland, Galway)
    Comments: 15 pages, 24 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Computing pixel depth values provide a basis for understanding the 3D
    geometrical structure of an image. As it has been presented in recent research,
    using stereo images provides an accurate depth due to the advantage of having
    local correspondences; however, the processing time of these methods are still
    an open issue. To solve this problem, it has been suggested to use single
    images to compute the depth values but extracting depth from monocular images
    requires extracting a large number of cues from the global and local
    information in the image. This challenge has been studied for a decade and it
    is still an open problem. Recently the idea of using neural networks to solve
    this problem has attracted attention. In this paper, we tackle this challenge
    by employing a Deep Neural Network (DNN) equipped with semantic pixel-wise
    segmentation utilizing our recently published disparity post-processing method.
    Four models are trained in this study and they have been evaluated at 2 stages
    on KITTI dataset. The ground truth images in the first part of the experiment
    come from the benchmark and for the second part, the ground truth images are
    considered to be the disparity results from applying a state-of-art stereo
    matching method. The results of this evaluation demonstrate that using
    post-processing techniques to refine the target of the network increases the
    accuracy of depth estimation on individual mono images. The second evaluation
    shows that using segmentation data as the input can improve the depth
    estimation results to a point where performance is comparable with stereo depth
    matching.

    Development of An Android Application for Object Detection Based on Color, Shape, or Local Features

    Lamiaa A. Elrefaei, Mona Omar Al-musawa, Norah Abdullah Al-gohany
    Journal-ref: The International Journal of Multimedia & Its Applications (IJMA)
    Vol.9, No.1, February 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Object detection and recognition is an important task in many computer vision
    applications. In this paper an Android application was developed using Eclipse
    IDE and OpenCV3 Library. This application is able to detect objects in an image
    that is loaded from the mobile gallery, based on its color, shape, or local
    features. The image is processed in the HSV color domain for better color
    detection. Circular shapes are detected using Circular Hough Transform and
    other shapes are detected using Douglas-Peucker algorithm. BRISK (binary robust
    invariant scalable keypoints) local features were applied in the developed
    Android application for matching an object image in another scene image. The
    steps of the proposed detection algorithms are described, and the interfaces of
    the application are illustrated. The application is ported and tested on Galaxy
    S3, S6, and Note1 Smartphones. Based on the experimental results, the
    application is capable of detecting eleven different colors, detecting two
    dimensional geometrical shapes including circles, rectangles, triangles, and
    squares, and correctly match local features of object and scene images for
    different conditions. The application could be used as a standalone
    application, or as a part of another application such as Robot systems, traffic
    systems, e-learning applications, information retrieval and many others.

    Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

    Michael Gygli, Mohammad Norouzi, Anelia Angelova
    Comments: Submitted to ICML 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    We approach structured output prediction by learning a deep value network
    (DVN) that evaluates different output structures for a given input. For
    example, when applied to image segmentation, the value network takes an image
    and a segmentation mask as inputs and predicts a scalar score evaluating the
    mask quality and its correspondence with the image. Once the value network is
    optimized, at inference, it finds output structures that maximize the score of
    the value net via gradient descent on continuous relaxations of structured
    outputs. Thus DVN takes advantage of the joint modeling of the inputs and
    outputs. Our framework applies to a wide range of structured output prediction
    problems. We conduct experiments on multi-label classification based on text
    data and on image segmentation problems. DVN outperforms several strong
    baselines and the state-of-the-art results on these benchmarks. In addition, on
    image segmentation, the proposed deep value network learns complex shape priors
    and effectively combines image information with the prior to obtain competitive
    segmentation results.

    A 3D Object Detection and Pose Estimation Pipeline Using RGB-D Images

    Ruotao He, Juan Rojas, Yisheng Guan
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    3D object detection and pose estimation has been studied extensively in
    recent decades for its potential applications in robotics. However, there still
    remains challenges when we aim at detecting multiple objects while retaining
    low false positive rate in cluttered environments. This paper proposes a robust
    3D object detection and pose estimation pipeline based on RGB-D images, which
    can detect multiple objects simultaneously while reducing false positives.
    Detection begins with template matching and yields a set of template matches. A
    clustering algorithm then groups templates of similar spatial location and
    produces multiple-object hypotheses. A scoring function evaluates the
    hypotheses using their associated templates and non-maximum suppression is
    adopted to remove duplicate results based on the scores. Finally, a combination
    of point cloud processing algorithms are used to compute objects’ 3D poses.
    Existing object hypotheses are verified by computing the overlap between model
    and scene points. Experiments demonstrate that our approach provides
    competitive results comparable to the state-of-the-arts and can be applied to
    robot random bin-picking.

    Convolutional Spike Timing Dependent Plasticity based Feature Learning in Spiking Neural Networks

    Priyadarshini Panda, Gopalakrishnan Srinivasan, Kaushik Roy
    Comments: 11 pages, 10 figures, Under Consideration in Scientific Reports
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    Brain-inspired learning models attempt to mimic the cortical architecture and
    computations performed in the neurons and synapses constituting the human brain
    to achieve its efficiency in cognitive tasks. In this work, we present
    convolutional spike timing dependent plasticity based feature learning with
    biologically plausible leaky-integrate-and-fire neurons in Spiking Neural
    Networks (SNNs). We use shared weight kernels that are trained to encode
    representative features underlying the input patterns thereby improving the
    sparsity as well as the robustness of the learning model. We demonstrate that
    the proposed unsupervised learning methodology learns several visual categories
    for object recognition with fewer number of examples and outperforms
    traditional fully-connected SNN architectures while yielding competitive
    accuracy. Additionally, we observe that the learning model performs out-of-set
    generalization further making the proposed biologically plausible framework a
    viable and efficient architecture for future neuromorphic applications.


    Artificial Intelligence

    Cost-Based Intuitionist Probabilities on Spaces of Graphs, Hypergraphs and Theorems

    Ben Goertzel
    Subjects: Artificial Intelligence (cs.AI)

    A novel partial order is defined on the space of digraphs or hypergraphs,
    based on assessing the cost of producing a graph via a sequence of elementary
    transformations. Leveraging work by Knuth and Skilling on the foundations of
    inference, and the structure of Heyting algebras on graph space, this partial
    order is used to construct an intuitionistic probability measure that applies
    to either digraphs or hypergraphs. As logical inference steps can be
    represented as transformations on hypergraphs representing logical statements,
    this also yields an intuitionistic probability measure on spaces of theorems.
    The central result is also extended to yield intuitionistic probabilities based
    on more general weighted rule systems defined over bicartesian closed
    categories.

    Symbol Grounding via Chaining of Morphisms

    Ruiting Lian, Ben Goertzel, Linas Vepstas, David Hanson, Changle Zhou
    Subjects: Artificial Intelligence (cs.AI)

    A new model of symbol grounding is presented, in which the structures of
    natural language, logical semantics, perception and action are represented
    categorically, and symbol grounding is modeled via the composition of morphisms
    between the relevant categories. This model gives conceptual insight into the
    fundamentally systematic nature of symbol grounding, and also connects
    naturally to practical real-world AI systems in current research and commercial
    use. Specifically, it is argued that the structure of linguistic syntax can be
    modeled as a certain asymmetric monoidal category, as e.g. implicit in the link
    grammar formalism; the structure of spatiotemporal relationships and action
    plans can be modeled similarly using “image grammars” and “action grammars”;
    and common-sense logical semantic structure can be modeled using
    dependently-typed lambda calculus with uncertain truth values. Given these
    formalisms, the grounding of linguistic descriptions in spatiotemporal
    perceptions and coordinated actions consists of following morphisms from
    language to logic through to spacetime and body (for comprehension), and vice
    versa (for generation). The mapping is indicated between the spatial
    relationships in the Region Connection Calculus and Allen Interval Algebra and
    corresponding entries in the link grammar syntax parsing dictionary. Further,
    the abstractions introduced here are shown to naturally model the structures
    and systems currently being deployed in the context of using the OpenCog
    cognitive architecture to control Hanson Robotics humanoid robots.

    Toward a Formal Model of Cognitive Synergy

    Ben Goertzel
    Subjects: Artificial Intelligence (cs.AI)

    “Cognitive synergy” refers to a dynamic in which multiple cognitive
    processes, cooperating to control the same cognitive system, assist each other
    in overcoming bottlenecks encountered during their internal processing.
    Cognitive synergy has been posited as a key feature of real-world general
    intelligence, and has been used explicitly in the design of the OpenCog
    cognitive architecture. Here category theory and related concepts are used to
    give a formalization of the cognitive synergy concept.

    A series of formal models of intelligent agents is proposed, with increasing
    specificity and complexity: simple reinforcement learning agents; “cognit”
    agents with an abstract memory and processing model; hypergraph-based agents
    (in which “cognit” operations are carried out via hypergraphs); hypergraph
    agents with a rich language of nodes and hyperlinks (such as the OpenCog
    framework provides); “PGMC” agents whose rich hypergraphs are endowed with
    cognitive processes guided via Probabilistic Growth and Mining of Combinations;
    and finally variations of the PrimeAGI design, which is currently being built
    on top of OpenCog.

    A notion of cognitive synergy is developed for cognitive processes acting
    within PGMC agents, based on developing a formal notion of “stuckness,” and
    defining synergy as a relationship between cognitive processes in which they
    can help each other out when they get stuck. It is proposed that cognitive
    processes relating to each other synergetically, associate in a certain way
    with functors that map into each other via natural transformations. Cognitive
    synergy is proposed to correspond to a certain inequality regarding the
    relative costs of different paths through certain commutation diagrams.

    Applications of this notion of cognitive synergy to particular cognitive
    phenomena, and specific cognitive processes in the PrimeAGI design, are
    discussed.

    Numerical Integration and Dynamic Discretization in Heuristic Search Planning over Hybrid Domains

    Miquel Ramirez, Enrico Scala, Patrik Haslum, Sylvie Thiebaux
    Comments: 17 pages
    Subjects: Artificial Intelligence (cs.AI)

    In this paper we look into the problem of planning over hybrid domains, where
    change can be both discrete and instantaneous, or continuous over time. In
    addition, it is required that each state on the trajectory induced by the
    execution of plans complies with a given set of global constraints. We approach
    the computation of plans for such domains as the problem of searching over a
    deterministic state model. In this model, some of the successor states are
    obtained by solving numerically the so-called initial value problem over a set
    of ordinary differential equations (ODE) given by the current plan prefix.
    These equations hold over time intervals whose duration is determined
    dynamically, according to whether zero crossing events take place for a set of
    invariant conditions. The resulting planner, FS+, incorporates these features
    together with effective heuristic guidance. FS+ does not impose any of the
    syntactic restrictions on process effects often found on the existing
    literature on Hybrid Planning. A key concept of our approach is that a clear
    separation is struck between planning and simulation time steps. The former is
    the time allowed to observe the evolution of a given dynamical system before
    committing to a future course of action, whilst the later is part of the model
    of the environment. FS+ is shown to be a robust planner over a diverse set of
    hybrid domains, taken from the existing literature on hybrid planning and
    systems.

    Any-Angle Pathfinding for Multiple Agents Based on SIPP Algorithm

    Konstantin Yakovlev, Anton Andreychuk
    Comments: Final version as submitted to ICAPS-2017 (main track); 8 pages; 4 figures; 1 algorithm; 2 tables
    Subjects: Artificial Intelligence (cs.AI)

    The problem of finding conflict-free trajectories for multiple agents of
    identical circular shape, operating in shared 2D workspace, is addressed in the
    paper and decoupled, e.g., prioritized, approach is used to solve this problem.
    Agents’ workspace is tessellated into the square grid on which any-angle moves
    are allowed, e.g. each agent can move into an arbitrary direction as long as
    this move follows the straight line segment whose endpoints are tied to the
    distinct grid elements. A novel any-angle planner based on Safe Interval Path
    Planning (SIPP) algorithm is proposed to find trajectories for an agent moving
    amidst dynamic obstacles (other agents) on a grid. This algorithm is then used
    as part of a prioritized multi-agent planner AA-SIPP(m). On the theoretical,
    side we show that AA-SIPP(m) is complete under well-defined conditions. On the
    experimental side, in simulation tests with up to 200 agents involved, we show
    that our planner finds much better solutions in terms of cost (up to 20%)
    compared to the planners relying on cardinal moves only.

    BetaRun 2017 Team Description Paper: Variety, Complexity, and Learning

    Olivia Michael, Oliver Obst
    Comments: Submitted as Team Description Paper for RoboCup 2017
    Subjects: Artificial Intelligence (cs.AI)

    RoboCup offers a set of benchmark problems for Artificial Intelligence in
    form of official world championships since 1997. The most tactical advanced and
    richest in terms of behavioural complexity of these is the 2D Soccer Simulation
    League, a simulated robotic soccer competition. BetaRun is a new attempt
    combining both machine learning and manual programming approaches, with the
    ultimate goal to arrive at a team that is trained entirely from observing and
    playing games, and a successor of the World Champion team Gliders 2016.

    Micro-Objective Learning : Accelerating Deep Reinforcement Learning through the Discovery of Continuous Subgoals

    Sungtae Lee, Sang-Woo Lee, Jinyoung Choi, Dong-Hyun Kwak, Byoung-Tak Zhang
    Subjects: Artificial Intelligence (cs.AI)

    Recently, reinforcement learning has been successfully applied to the logical
    game of Go, various Atari games, and even a 3D game, Labyrinth, though it
    continues to have problems in sparse reward settings. It is difficult to
    explore, but also difficult to exploit, a small number of successes when
    learning policy. To solve this issue, the subgoal and option framework have
    been proposed. However, discovering subgoals online is too expensive to be used
    to learn options in large state spaces. We propose Micro-objective learning
    (MOL) to solve this problem. The main idea is to estimate how important a state
    is while training and to give an additional reward proportional to its
    importance. We evaluated our algorithm in two Atari games: Montezuma’s Revenge
    and Seaquest. With three experiments to each game, MOL significantly improved
    the baseline scores. Especially in Montezuma’s Revenge, MOL achieved two times
    better results than the previous state-of-the-art model.

    Axioms in Model-based Planners

    Shuwa Miura, Alex Fukunaga
    Subjects: Artificial Intelligence (cs.AI)

    Axioms can be used to model derived predicates in domain- independent
    planning models. Formulating models which use axioms can sometimes result in
    problems with much smaller search spaces and shorter plans than the original
    model. Previous work on axiom-aware planners focused solely on state- space
    search planners. We propose axiom-aware planners based on answer set
    programming and integer programming. We evaluate them on PDDL domains with
    axioms and show that they can exploit additional expressivity of axioms.

    Front-to-End Bidirectional Heuristic Search with Near-Optimal Node Expansions

    Jingwei Chen, Robert C. Holte, Sandra Zilles, Nathan R. Sturtevant
    Subjects: Artificial Intelligence (cs.AI)

    It is well-known that any admissible unidirectional heuristic search
    algorithm must expand all states whose (f)-value is smaller than the optimal
    solution cost when using a consistent heuristic. Such states are called “surely
    expanded” (s.e.). A recent study characterized s.e. pairs of states for
    bidirectional search with consistent heuristics: if a pair of states is s.e.
    then at least one of the two states must be expanded. In this paper, we derive
    a lower bound, VC, on the minimum possible number of expansions required to
    cover all s.e. pairs, and present a new admissible front-to-end bidirectional
    heuristic search algorithm, Near-Optimal Bidirectional Search (NBS), that is
    guaranteed to do no more than 2VC expansions. We further prove that no
    front-to-end algorithm has a worst case better than 2VC. Experimental results
    demonstrate that NBS competes with or outperforms existing bidirectional search
    algorithms, and often outperforms A* as well.

    Task-based End-to-end Model Learning

    Priya L. Donti, Brandon Amos, J. Zico Kolter
    Comments: Submitted to ICML 2017. Code available at this https URL
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    As machine learning techniques have become more ubiquitous, it has become
    common to see machine learning prediction algorithms operating within some
    larger process. However, the criteria by which we train machine learning
    algorithms often differ from the ultimate criteria on which we evaluate them.
    This paper proposes an end-to-end approach for learning probabilistic machine
    learning models within the context of stochastic programming, in a manner that
    directly captures the ultimate task-based objective for which they will be
    used. We then present two experimental evaluations of the proposed approach,
    one as applied to a generic inventory stock problem and the second to a
    real-world electrical grid scheduling task. In both cases, we show that the
    proposed approach can outperform both a traditional modeling approach and a
    purely black-box policy optimization approach.

    High-Throughput and Language-Agnostic Entity Disambiguation and Linking on User Generated Data

    Preeti Bhargava, Nemanja Spasojevic, Guoning Hu
    Comments: 10 pages, 7 figures, 5 tables, WWW2017, Linked Data on the Web workshop 2017, LDOW’17
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    The Entity Disambiguation and Linking (EDL) task matches entity mentions in
    text to a unique Knowledge Base (KB) identifier such as a Wikipedia or Freebase
    id. It plays a critical role in the construction of a high quality information
    network, and can be further leveraged for a variety of information retrieval
    and NLP tasks such as text categorization and document tagging. EDL is a
    complex and challenging problem due to ambiguity of the mentions and real world
    text being multi-lingual. Moreover, EDL systems need to have high throughput
    and should be lightweight in order to scale to large datasets and run on
    off-the-shelf machines. More importantly, these systems need to be able to
    extract and disambiguate dense annotations from the data in order to enable an
    Information Retrieval or Extraction task running on the data to be more
    efficient and accurate. In order to address all these challenges, we present
    the Lithium EDL system and algorithm – a high-throughput, lightweight,
    language-agnostic EDL system that extracts and correctly disambiguates 75% more
    entities than state-of-the-art EDL systems and is significantly faster than
    them.

    Reinforcement Learning for Transition-Based Mention Detection

    Georgiana Dinu, Wael Hamza, Radu Florian
    Comments: Deep Reinforcement Learning Workshop, NIPS 2016
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper describes an application of reinforcement learning to the mention
    detection task. We define a novel action-based formulation for the mention
    detection task, in which a model can flexibly revise past labeling decisions by
    grouping together tokens and assigning partial mention labels. We devise a
    method to create mention-level episodes and we train a model by rewarding
    correctly labeled complete mentions, irrespective of the inner structure
    created. The model yields results which are on par with a competitive
    supervised counterpart while being more flexible in terms of achieving targeted
    behavior through reward modeling and generating internal mention structure,
    especially on longer mentions.

    Bayesian Optimization with Gradients

    Jian Wu, Matthias Poloczek, Andrew Gordon Wilson, Peter I. Frazier
    Comments: 16 pages, 5 figures
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Optimization and Control (math.OC)

    In recent years, Bayesian optimization has proven successful for global
    optimization of expensive-to-evaluate multimodal objective functions. However,
    unlike most optimization methods, Bayesian optimization typically does not use
    derivative information. In this paper we show how Bayesian optimization can
    exploit derivative information to decrease the number of objective function
    evaluations required for good performance. In particular, we develop a novel
    Bayesian optimization algorithm, the derivative-enabled knowledge-gradient
    (dKG), for which we show one-step Bayes-optimality, asymptotic consistency, and
    greater one-step value of information than is possible in the derivative-free
    setting. Our procedure accommodates noisy and incomplete derivative
    information, and comes in both sequential and batch forms. We show dKG provides
    state-of-the-art performance compared to a wide range of optimization
    procedures with and without gradients, on benchmarks including logistic
    regression, kernel learning, and k-nearest neighbors.

    Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

    Michael Gygli, Mohammad Norouzi, Anelia Angelova
    Comments: Submitted to ICML 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    We approach structured output prediction by learning a deep value network
    (DVN) that evaluates different output structures for a given input. For
    example, when applied to image segmentation, the value network takes an image
    and a segmentation mask as inputs and predicts a scalar score evaluating the
    mask quality and its correspondence with the image. Once the value network is
    optimized, at inference, it finds output structures that maximize the score of
    the value net via gradient descent on continuous relaxations of structured
    outputs. Thus DVN takes advantage of the joint modeling of the inputs and
    outputs. Our framework applies to a wide range of structured output prediction
    problems. We conduct experiments on multi-label classification based on text
    data and on image segmentation problems. DVN outperforms several strong
    baselines and the state-of-the-art results on these benchmarks. In addition, on
    image segmentation, the proposed deep value network learns complex shape priors
    and effectively combines image information with the prior to obtain competitive
    segmentation results.

    A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning

    Ning Liu, Zhe Li, Zhiyuan Xu, Jielong Xu, Sheng Lin, Qinru Qiu, Jian Tang, Yanzhi Wang
    Comments: accepted by 37th IEEE International Conference on Distributed Computing (ICDCS 2017)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)

    Automatic decision-making approaches, such as reinforcement learning (RL),
    have been applied to (partially) solve the resource allocation problem
    adaptively in the cloud computing system. However, a complete cloud resource
    allocation framework exhibits high dimensions in state and action spaces, which
    prohibit the usefulness of traditional RL techniques. In addition, high power
    consumption has become one of the critical concerns in design and control of
    cloud computing systems, which degrades system reliability and increases
    cooling cost. An effective dynamic power management (DPM) policy should
    minimize power consumption while maintaining performance degradation within an
    acceptable level. Thus, a joint virtual machine (VM) resource allocation and
    power management framework is critical to the overall cloud computing system.
    Moreover, novel solution framework is necessary to address the even higher
    dimensions in state and action spaces. In this paper, we propose a novel
    hierarchical framework for solving the overall resource allocation and power
    management problem in cloud computing systems. The proposed hierarchical
    framework comprises a global tier for VM resource allocation to the servers and
    a local tier for distributed power management of local servers. The emerging
    deep reinforcement learning (DRL) technique, which can deal with complicated
    control problems with large state space, is adopted to solve the global tier
    problem. Furthermore, an autoencoder and a novel weight sharing structure are
    adopted to handle the high-dimensional state space and accelerate the
    convergence speed. On the other hand, the local tier of distributed server
    power managements comprises an LSTM based workload predictor and a model-free
    RL based power manager, operating in a distributed manner.

    Real-Time Machine Learning: The Missing Pieces

    Robert Nishihara, Philipp Moritz, Stephanie Wang, Alexey Tumanov, William Paul, Johann Schleier-Smith, Richard Liaw, Michael I. Jordan, Ion Stoica
    Comments: 6 pages, 3 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Machine learning applications are increasingly deployed not only to serve
    predictions using static models, but also as tightly-integrated components of
    feedback loops involving dynamic, real-time decision making. These applications
    pose a new set of requirements, none of which are difficult to achieve in
    isolation, but the combination of which creates a challenge for existing
    distributed execution frameworks: computation with millisecond latency at high
    throughput, adaptive construction of arbitrary task graphs, and execution of
    heterogeneous kernels over diverse sets of resources. We assert that a new
    distributed execution framework is needed for such ML applications and propose
    a candidate approach with a proof-of-concept architecture that achieves a 63x
    performance improvement over a state-of-the-art execution framework for a
    representative application.

    The Curse of Correlation in Security Games and Principle of Max-Entropy

    Haifeng Xu, Milind Tambe, Shaddin Dughmi, Venil Loyd Noronha
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)

    In this paper, we identify and study a fundamental, yet underexplored,
    phenomenon in security games, which we term the Curse of Correlation (CoC).
    Specifically, we observe that there is inevitable correlation among the
    protection status of different targets. Such correlation is a crucial concern,
    especially in spatio-temporal domains like conservation area patrolling, where
    attackers can monitor patrollers at certain areas and then infer their
    patrolling routes using such correlation. To mitigate this issue, we introduce
    the principle of max-entropy to security games, and focus on designing
    entropy-maximizing defending strategies for the spatio-temporal security game
    — a major victim of CoC. We prove that the problem is #P-hard in general, but
    propose efficient algorithms in well-motivated special settings. Our
    experiments show significant advantages of the max-entropy algorithms against
    previous algorithms.

    Segmentation of skin lesions based on fuzzy classification of pixels and histogram thresholding

    Jose Luis Garcia-Arroyo, Begonya Garcia-Zapirain
    Comments: 5 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    This paper proposes an innovative method for segmentation of skin lesions in
    dermoscopy images developed by the authors, based on fuzzy classification of
    pixels and histogram thresholding.

    Evolution Strategies as a Scalable Alternative to Reinforcement Learning

    Tim Salimans, Jonathan Ho, Xi Chen, Ilya Sutskever
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We explore the use of Evolution Strategies, a class of black box optimization
    algorithms, as an alternative to popular RL techniques such as Q-learning and
    Policy Gradients. Experiments on MuJoCo and Atari show that ES is a viable
    solution strategy that scales extremely well with the number of CPUs available:
    By using hundreds to thousands of parallel workers, ES can solve 3D humanoid
    walking in 10 minutes and obtain competitive results on most Atari games after
    one hour of training time. In addition, we highlight several advantages of ES
    as a black box optimization technique: it is invariant to action frequency and
    delayed rewards, tolerant of extremely long horizons, and does not need
    temporal discounting or value function approximation.

    Convolutional Spike Timing Dependent Plasticity based Feature Learning in Spiking Neural Networks

    Priyadarshini Panda, Gopalakrishnan Srinivasan, Kaushik Roy
    Comments: 11 pages, 10 figures, Under Consideration in Scientific Reports
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    Brain-inspired learning models attempt to mimic the cortical architecture and
    computations performed in the neurons and synapses constituting the human brain
    to achieve its efficiency in cognitive tasks. In this work, we present
    convolutional spike timing dependent plasticity based feature learning with
    biologically plausible leaky-integrate-and-fire neurons in Spiking Neural
    Networks (SNNs). We use shared weight kernels that are trained to encode
    representative features underlying the input patterns thereby improving the
    sparsity as well as the robustness of the learning model. We demonstrate that
    the proposed unsupervised learning methodology learns several visual categories
    for object recognition with fewer number of examples and outperforms
    traditional fully-connected SNN architectures while yielding competitive
    accuracy. Additionally, we observe that the learning model performs out-of-set
    generalization further making the proposed biologically plausible framework a
    viable and efficient architecture for future neuromorphic applications.


    Information Retrieval

    High-Throughput and Language-Agnostic Entity Disambiguation and Linking on User Generated Data

    Preeti Bhargava, Nemanja Spasojevic, Guoning Hu
    Comments: 10 pages, 7 figures, 5 tables, WWW2017, Linked Data on the Web workshop 2017, LDOW’17
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    The Entity Disambiguation and Linking (EDL) task matches entity mentions in
    text to a unique Knowledge Base (KB) identifier such as a Wikipedia or Freebase
    id. It plays a critical role in the construction of a high quality information
    network, and can be further leveraged for a variety of information retrieval
    and NLP tasks such as text categorization and document tagging. EDL is a
    complex and challenging problem due to ambiguity of the mentions and real world
    text being multi-lingual. Moreover, EDL systems need to have high throughput
    and should be lightweight in order to scale to large datasets and run on
    off-the-shelf machines. More importantly, these systems need to be able to
    extract and disambiguate dense annotations from the data in order to enable an
    Information Retrieval or Extraction task running on the data to be more
    efficient and accurate. In order to address all these challenges, we present
    the Lithium EDL system and algorithm – a high-throughput, lightweight,
    language-agnostic EDL system that extracts and correctly disambiguates 75% more
    entities than state-of-the-art EDL systems and is significantly faster than
    them.

    A Visual Representation of Wittgenstein's Tractatus Logico-Philosophicus

    Anca Bucur, Sergiu Nisioi
    Comments: Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    In this paper we present a data visualization method together with its
    potential usefulness in digital humanities and philosophy of language. We
    compile a multilingual parallel corpus from different versions of
    Wittgenstein’s Tractatus Logico-Philosophicus, including the original in German
    and translations into English, Spanish, French, and Russian. Using this corpus,
    we compute a similarity measure between propositions and render a visual
    network of relations for different languages.

    DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

    Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Learning sophisticated feature interactions behind user behaviors is critical
    in maximizing CTR for recommender systems. Despite great progress, existing
    methods seem to have a strong bias towards low- or high-order interactions, or
    require expertise feature engineering. In this paper, we show that it is
    possible to derive an end-to-end learning model that emphasizes both low- and
    high-order feature interactions. The proposed model, DeepFM, combines the power
    of factorization machines for recommendation and deep learning for feature
    learning in a new neural network architecture. Compared to the latest Wide &
    Deep model from Google, DeepFM has a shared input to its “wide” and “deep”
    parts, with no need of feature engineering besides raw features. Comprehensive
    experiments are conducted to demonstrate the effectiveness and efficiency of
    DeepFM over the existing models for CTR prediction, on both benchmark data and
    commercial data.

    Cognitive Inference of Demographic Data by User Ratings

    Jinliang Xu, Shangguang Wang, Fangchun Yang, Rong N. Chang
    Comments: 12 pages, 6 figures
    Subjects: Information Retrieval (cs.IR); Distributed, Parallel, and Cluster Computing (cs.DC)

    Cognitive inference of user demographics, such as gender and age, plays an
    important role in creating user profiles for adjusting marketing strategies and
    generating personalized recommendations because user demographic data is
    usually not available due to data privacy concerns. At present, users can
    readily express feedback regarding products or services that they have
    purchased. During this process, user demographics are concealed, but the data
    has never yet been successfully utilized to contribute to the cognitive
    inference of user demographics. In this paper, we investigate the inference
    power of user ratings data, and propose a simple yet general cognitive
    inference model, called rating to profile (R2P), to infer user demographics
    from user provided ratings. In particular, the proposed R2P model can achieve
    the following: 1. Correctly integrate user ratings into model training. 2.Infer
    multiple demographic attributes of users simultaneously, capturing the
    underlying relevance between different demographic attributes. 3. Train its two
    components, i.e. feature extractor and classifier, in an integrated manner
    under a supervised learning paradigm, which effectively helps to discover
    useful hidden patterns from highly sparse ratings data. We introduce how to
    incorporate user ratings data into the research field of cognitive inference of
    user demographic data, and detail the model development and optimization
    process for the proposed R2P. Extensive experiments are conducted on two
    real-world ratings datasets against various compared state-of-the-art methods,
    and the results from multiple aspects demonstrate that our proposed R2P model
    can significantly improve on the cognitive inference performance of user
    demographic data.

    Multiple User Context Inference by Fusing Data Sources

    Jinliang Xu, Shangguang Wang, Fangchun Yang, Jie Tang
    Comments: 13 pages, 8 figures
    Subjects: Information Retrieval (cs.IR); Distributed, Parallel, and Cluster Computing (cs.DC)

    Inference of user context information, including user’s gender, age, marital
    status, location and so on, has been proven to be valuable for building context
    aware recommender system. However, prevalent existing studies on user context
    inference have two shortcommings: 1. focusing on only a single data source
    (e.g. Internet browsing logs, or mobile call records), and 2. ignoring the
    interdependence of multiple user contexts (e.g. interdependence between age and
    marital status), which have led to poor inference performance. To solve this
    problem, in this paper, we first exploit tensor outer product to fuse multiple
    data sources in the feature space to obtain an extensional user feature
    representation. Following this, by taking this extensional user feature
    representation as input, we propose a multiple attribute probabilistic model
    called MulAProM to infer user contexts that can take advantage of the
    interdependence between them. Our study is based on large telecommunication
    datasets from the local mobile operator of Shanghai, China, and consists of two
    data sources, 4.6 million call detail records and 7.5 million data traffic
    records of (8,000) mobile users, collected in the course of six months. The
    experimental results show that our model can outperform other models in terms
    of emph{recall}, emph{precision}, and the emph{F1-measure}.

    A German Corpus for Text Similarity Detection Tasks

    Juan-Manuel Torres-Moreno, Gerardo Sierra, Peter Peinl
    Comments: 1 figure; 13 pages
    Journal-ref: Preprint of International Journal of Computational Linguistics and
    Applications, vol. 5, no. 2, 2014, pp. 9-24
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Text similarity detection aims at measuring the degree of similarity between
    a pair of texts. Corpora available for text similarity detection are designed
    to evaluate the algorithms to assess the paraphrase level among documents. In
    this paper we present a textual German corpus for similarity detection. The
    purpose of this corpus is to automatically assess the similarity between a pair
    of texts and to evaluate different similarity measures, both for whole
    documents or for individual sentences. Therefore we have calculated several
    simple measures on our corpus based on a library of similarity functions.

    Building automated vandalism detection tools for Wikidata

    Amir Sarabadani, Aaron Halfaker, Dario Taraborelli
    Subjects: Information Retrieval (cs.IR); Computers and Society (cs.CY)

    Wikidata, like Wikipedia, is a knowledge base that anyone can edit. This open
    collaboration model is powerful in that it reduces barriers to participation
    and allows a large number of people to contribute. However, it exposes the
    knowledge base to the risk of vandalism and low-quality contributions. In this
    work, we build on past work detecting vandalism in Wikipedia to detect
    vandalism in Wikidata. This work is novel in that identifying damaging changes
    in a structured knowledge-base requires substantially different feature
    engineering work than in a text-based wiki like Wikipedia. We also discuss the
    utility of these classifiers for reducing the overall workload of vandalism
    patrollers in Wikidata. We describe a machine classification strategy that is
    able to catch 89% of vandalism while reducing patrollers’ workload by 98%, by
    drawing lightly from contextual features of an edit and heavily from the
    characteristics of the user making the edit.

    Locality-sensitive hashing of curves

    Anne Driemel, Francesco Silvestri
    Comments: Proc. of 33rd International Symposium on Computational Geometry (SoCG), 2017
    Subjects: Computational Geometry (cs.CG); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR)

    We study data structures for storing a set of polygonal curves in ({
    m R}^d)
    such that, given a query curve, we can efficiently retrieve similar curves from
    the set, where similarity is measured using the discrete Fr’echet distance or
    the dynamic time warping distance. To this end we devise the first
    locality-sensitive hashing schemes for these distance measures. A major
    challenge is posed by the fact that these distance measures internally optimize
    the alignment between the curves. We give solutions for different types of
    alignments including constrained and unconstrained versions. For unconstrained
    alignments, we improve over a result by Indyk from 2002 for short curves. Let
    (n) be the number of input curves and let (m) be the maximum complexity of a
    curve in the input. In the particular case where (m leq frac{alpha}{4d} log
    n), for some fixed (alpha>0), our solutions imply an approximate near-neighbor
    data structure for the discrete Fr’echet distance that uses space in
    (O(n^{1+alpha}log n)) and achieves query time in (O(n^{alpha}log^2 n)) and
    constant approximation factor. Furthermore, our solutions provide a trade-off
    between approximation quality and computational performance: for any parameter
    (k in [m]), we can give a data structure that uses space in (O(2^{2k}m^{k-1} n
    log n + nm)), answers queries in (O( 2^{2k} m^{k}log n)) time and achieves
    approximation factor in (O(m/k)).


    Computation and Language

    Reinforcement Learning for Transition-Based Mention Detection

    Georgiana Dinu, Wael Hamza, Radu Florian
    Comments: Deep Reinforcement Learning Workshop, NIPS 2016
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper describes an application of reinforcement learning to the mention
    detection task. We define a novel action-based formulation for the mention
    detection task, in which a model can flexibly revise past labeling decisions by
    grouping together tokens and assigning partial mention labels. We devise a
    method to create mention-level episodes and we train a model by rewarding
    correctly labeled complete mentions, irrespective of the inner structure
    created. The model yields results which are on par with a competitive
    supervised counterpart while being more flexible in terms of achieving targeted
    behavior through reward modeling and generating internal mention structure,
    especially on longer mentions.

    Geometrical morphology

    John Goldsmith, Eric Rosen
    Comments: 42 pages
    Subjects: Computation and Language (cs.CL)

    We explore inflectional morphology as an example of the relationship of the
    discrete and the continuous in linguistics. The grammar requests a form of a
    lexeme by specifying a set of feature values, which corresponds to a corner M
    of a hypercube in feature value space. The morphology responds to that request
    by providing a morpheme, or a set of morphemes, whose vector sum is
    geometrically closest to the corner M. In short, the chosen morpheme (mu) is
    the morpheme (or set of morphemes) that maximizes the inner product of (mu)
    and M.

    DRAGNN: A Transition-based Framework for Dynamically Connected Neural Networks

    Lingpeng Kong, Chris Alberti, Daniel Andor, Ivan Bogatyy, David Weiss
    Comments: 10 pages; Submitted for review to ACL2017
    Subjects: Computation and Language (cs.CL)

    In this work, we present a compact, modular framework for constructing novel
    recurrent neural architectures. Our basic module is a new generic unit, the
    Transition Based Recurrent Unit (TBRU). In addition to hidden layer
    activations, TBRUs have discrete state dynamics that allow network connections
    to be built dynamically as a function of intermediate activations. By
    connecting multiple TBRUs, we can extend and combine commonly used
    architectures such as sequence-to-sequence, attention mechanisms, and
    re-cursive tree-structured models. A TBRU can also serve as both an encoder for
    downstream tasks and as a decoder for its own task simultaneously, resulting in
    more accurate multi-task learning. We call our approach Dynamic Recurrent
    Acyclic Graphical Neural Networks, or DRAGNN. We show that DRAGNN is
    significantly more accurate and efficient than seq2seq with attention for
    syntactic dependency parsing and yields more accurate multi-task learning for
    extractive summarization tasks.

    El Lenguaje Natural como Lenguaje Formal

    Franco M. Luque
    Comments: survey, 14 pages, 6 figures, in Spanish
    Subjects: Computation and Language (cs.CL); Formal Languages and Automata Theory (cs.FL)

    Formal languages theory is useful for the study of natural language. In
    particular, it is of interest to study the adequacy of the grammatical
    formalisms to express syntactic phenomena present in natural language. First,
    it helps to draw hypothesis about the nature and complexity of the
    speaker-hearer linguistic competence, a fundamental question in linguistics and
    other cognitive sciences. Moreover, from an engineering point of view, it
    allows the knowledge of practical limitations of applications based on those
    formalisms. In this article I introduce the adequacy problem of grammatical
    formalisms for natural language, also introducing some formal language theory
    concepts required for this discussion. Then, I review the formalisms that have
    been proposed in history, and the arguments that have been given to support or
    reject their adequacy.

    —–

    La teor’ia de lenguajes formales es ‘util para el estudio de los lenguajes
    naturales. En particular, resulta de inter’es estudiar la adecuaci’on de los
    formalismos gramaticales para expresar los fen’omenos sint’acticos presentes
    en el lenguaje natural. Primero, ayuda a trazar hip’otesis acerca de la
    naturaleza y complejidad de las competencias ling”u’isticas de los
    hablantes-oyentes del lenguaje, un interrogante fundamental de la
    ling”u’istica y otras ciencias cognitivas. Adem’as, desde el punto de vista
    de la ingenier’ia, permite conocer limitaciones pr’acticas de las
    aplicaciones basadas en dichos formalismos. En este art’iculo hago una
    introducci’on al problema de la adecuaci’on de los formalismos gramaticales
    para el lenguaje natural, introduciendo tambi’en algunos conceptos de la
    teor’ia de lenguajes formales necesarios para esta discusi’on. Luego, hago un
    repaso de los formalismos que han sido propuestos a lo largo de la historia, y
    de los argumentos que se han dado para sostener o refutar su adecuaci’on.

    Nematus: a Toolkit for Neural Machine Translation

    Rico Sennrich, Orhan Firat, Kyunghyun Cho, Alexandra Birch, Barry Haddow, Julian Hitschler, Marcin Junczys-Dowmunt, Samuel Läubli, Antonio Valerio Miceli Barone, Jozef Mokry, Maria Nădejde
    Comments: EACL 2017 demo track
    Subjects: Computation and Language (cs.CL)

    We present Nematus, a toolkit for Neural Machine Translation. The toolkit
    prioritizes high translation accuracy, usability, and extensibility. Nematus
    has been used to build top-performing submissions to shared translation tasks
    at WMT and IWSLT, and has been used to train systems for production
    environments.

    Story Cloze Ending Selection Baselines and Data Examination

    Todor Mihaylov, Anette Frank
    Comments: Submission for the LSDSem 2017 – Linking Models of Lexical, Sentential and Discourse-level Semantics – Shared Task
    Subjects: Computation and Language (cs.CL)

    This paper describes two supervised baseline systems for the Story Cloze Test
    Shared Task (Mostafazadeh et al., 2016a). We first build a classifier using
    features based on word embeddings and semantic similarity computation. We
    further implement a neural LSTM system with different encoding strategies that
    try to model the relation between the story and the provided endings. Our
    experiments show that a model using representation features based on average
    word embedding vectors over the given story words and the candidate ending
    sentences words, joint with similarity features between the story and candidate
    ending representations performed better than the neural models. Our best model
    achieves an accuracy of 72.42, ranking 3rd in the official evaluation.

    MetaPAD: Meta Pattern Discovery from Massive Text Corpora

    Meng Jiang, Jingbo Shang, Taylor Cassidy, Xiang Ren, Lance M. Kaplan, Timothy P. Hanratty, Jiawei Han
    Comments: 9 pages
    Subjects: Computation and Language (cs.CL)

    Mining textual patterns in news, tweets, papers, and many other kinds of text
    corpora has been an active theme in text mining and NLP research. Previous
    studies adopt a dependency parsing-based pattern discovery approach. However,
    the parsing results lose rich context around entities in the patterns, and the
    process is costly for a corpus of large scale. In this study, we propose a
    novel typed textual pattern structure, called meta pattern, which is extended
    to a frequent, informative, and precise subsequence pattern in certain context.
    We propose an efficient framework, called MetaPAD, which discovers meta
    patterns from massive corpora with three techniques: (1) it develops a
    context-aware segmentation method to carefully determine the boundaries of
    patterns with a learnt pattern quality assessment function, which avoids costly
    dependency parsing and generates high-quality patterns; (2) it identifies and
    groups synonymous meta patterns from multiple facets—their types, contexts,
    and extractions; and (3) it examines type distributions of entities in the
    instances extracted by each group of patterns, and looks for appropriate type
    levels to make discovered patterns precise. Experiments demonstrate that our
    proposed framework discovers high-quality typed textual patterns efficiently
    from different genres of massive corpora and facilitates information
    extraction.

    Why we have switched from building full-fledged taxonomies to simply detecting hypernymy relations

    Jose Camacho-Collados
    Comments: Discussion paper. 6 pages, 1 figure
    Subjects: Computation and Language (cs.CL)

    The study of taxonomies and hypernymy relations has been extensive on the
    Natural Language Processing (NLP) literature. However, the evaluation of
    taxonomy learning approaches has traditionally troublesome, as it mainly relies
    on ad-hoc experiments, which are hardly reproducible and manually expensive.
    Partly because of this, current research has been lately focusing on the
    hypernymy detection task. In this paper we reflect on this trend, analyzing
    issues related to current evaluation procedures. Finally, we propose three
    potential avenues for future work so that is-a relations and resources based on
    them play a more important role in downstream NLP applications.

    Automated Hate Speech Detection and the Problem of Offensive Language

    Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber
    Comments: To appear in the Proceedings of ICWSM 2017. Please cite that version
    Subjects: Computation and Language (cs.CL)

    A key challenge for automatic hate-speech detection on social media is the
    separation of hate speech from other instances of offensive language. Lexical
    detection methods tend to have low precision because they classify all messages
    containing particular terms as hate speech and previous work using supervised
    learning has failed to distinguish between the two categories. We used a
    crowd-sourced hate speech lexicon to collect tweets containing hate speech
    keywords. We use crowd-sourcing to label a sample of these tweets into three
    categories: those containing hate speech, only offensive language, and those
    with neither. We train a multi-class classifier to distinguish between these
    different categories. Close analysis of the predictions and the errors shows
    when we can reliably separate hate speech from other offensive language and
    when this differentiation is more difficult. We find that racist and homophobic
    tweets are more likely to be classified as hate speech but that sexist tweets
    are generally classified as offensive. Tweets without explicit hate keywords
    are also more difficult to classify.

    Language Use Matters: Analysis of the Linguistic Structure of Question Texts Can Characterize Answerability in Quora

    Suman Kalyan Maity, Aman Kharb, Animesh Mukherjee
    Comments: 1 figure, 3 tables, ICWSM 2017 as poster
    Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    Quora is one of the most popular community Q&A sites of recent times.
    However, many question posts on this Q&A site often do not get answered. In
    this paper, we quantify various linguistic activities that discriminates an
    answered question from an unanswered one. Our central finding is that the way
    users use language while writing the question text can be a very effective
    means to characterize answerability. This characterization helps us to predict
    early if a question remaining unanswered for a specific time period t will
    eventually be answered or not and achieve an accuracy of 76.26% (t = 1 month)
    and 68.33% (t = 3 months). Notably, features representing the language use
    patterns of the users are most discriminative and alone account for an accuracy
    of 74.18%. We also compare our method with some of the similar works (Dror et
    al., Yang et al.) achieving a maximum improvement of ~39% in terms of accuracy.

    Ask Me Even More: Dynamic Memory Tensor Networks (Extended Model)

    Govardana Sachithanandam Ramachandran, Ajay Sohmshetty
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We examine Memory Networks for the task of question answering (QA), under
    common real world scenario where training examples are scarce and under weakly
    supervised scenario, that is only extrinsic labels are available for training.
    We propose extensions for the Dynamic Memory Network (DMN), specifically within
    the attention mechanism, we call the resulting Neural Architecture as Dynamic
    Memory Tensor Network (DMTN). Ultimately, we see that our proposed extensions
    results in over 80% improvement in the number of task passed against the
    baselined standard DMN and 20% more task passed compared to state-of-the-art
    End-to-End Memory Network for Facebook’s single task weakly trained 1K bAbi
    dataset.

    Massive Exploration of Neural Machine Translation Architectures

    Denny Britz, Anna Goldie, Thang Luong, Quoc Le
    Comments: 9 pages, 2 figures, 8 tables, submitted to ACL 2017, opensource code at this https URL
    Subjects: Computation and Language (cs.CL)

    Neural Machine Translation (NMT) has shown remarkable progress over the past
    few years with production systems now being deployed to end-users. One major
    drawback of current architectures is that they are expensive to train,
    typically requiring days to weeks of GPU time to converge. This makes
    exhaustive hyperparameter search, as is commonly done with other neural network
    architectures, prohibitively expensive. In this work, we present the first
    large-scale analysis of NMT architecture hyperparameters. We report empirical
    results and variance numbers for several hundred experimental runs,
    corresponding to over 250,000 GPU hours on the standard WMT English to German
    translation task. Our experiments lead to novel insights and practical advice
    for building and extending NMT architectures. As part of this contribution, we
    release an open-source NMT framework that enables researchers to easily
    experiment with novel techniques and reproduce state of the art results.

    Effects of Limiting Memory Capacity on the Behaviour of Exemplar Dynamics

    B. Goodman, P. F. Tupper
    Comments: 14 pages
    Subjects: Computation and Language (cs.CL)

    Exemplar models are a popular class of models used to describe language
    change. Here we study how limiting the memory capacity of an individual in
    these models affects the system’s behaviour. In particular we demonstrate the
    effect this change has on the extinction of categories. Previous work in
    exemplar dynamics has not addressed this question. In order to investigate
    this, we will inspect a simplified exemplar model. We will prove for the
    simplified model that all the sound categories but one will always become
    extinct, whether memory storage is limited or not. However, computer
    simulations show that changing the number of stored memories alters how fast
    categories become extinct.

    Normalisation de la langue et de lecriture arabe : enjeux culturels regionaux et mondiaux

    Henri Hudrisier, Ben Henda Mokhtar
    Comments: 17 p, in French
    Subjects: Computers and Society (cs.CY); Computation and Language (cs.CL)

    Arabic language and writing are now facing a resurgence of international
    normative solutions that challenge most of their local or network based
    operating principles. Even if the multilingual digital coding solutions,
    especially those proposed by Unicode, have solved many difficulties of Arabic
    writing, the linguistic aspect is still in search of more adapted solutions.
    Terminology is one of the sectors in which the Arabic language requires a deep
    modernization of its classical productivity models. The normative approach, in
    particular that of the ISO TC37, is proposed as one of the solutions that would
    allow it to combine with international standards to better integrate the
    knowledge society under construction.

    La langue et lecriture arabe sont aujourdhui confrontees a une recrudescence
    de solutions normatives internationales qui remettent en cause la plupart de
    leurs principes de fonctionnement en site ou sur les reseaux. Meme si les
    solutions du codage numerique multilingue, notamment celles proposees par
    Unicode, ont resolu beaucoup de difficultes de lecriture arabe, le volet
    linguistique est encore en quete de solutions plus adaptees. La terminologie
    est lun des secteurs dans lequel la langue arabe necessite une modernisation
    profonde de ses modeles classiques de production. La voie normative, notamment
    celle du TC37 de ISO, est proposee comme une des solutions qui lui permettrait
    de se mettre en synergie avec les referentiels internationaux pour mieux
    integrer la societe du savoir en voie de construction.

    High-Throughput and Language-Agnostic Entity Disambiguation and Linking on User Generated Data

    Preeti Bhargava, Nemanja Spasojevic, Guoning Hu
    Comments: 10 pages, 7 figures, 5 tables, WWW2017, Linked Data on the Web workshop 2017, LDOW’17
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    The Entity Disambiguation and Linking (EDL) task matches entity mentions in
    text to a unique Knowledge Base (KB) identifier such as a Wikipedia or Freebase
    id. It plays a critical role in the construction of a high quality information
    network, and can be further leveraged for a variety of information retrieval
    and NLP tasks such as text categorization and document tagging. EDL is a
    complex and challenging problem due to ambiguity of the mentions and real world
    text being multi-lingual. Moreover, EDL systems need to have high throughput
    and should be lightweight in order to scale to large datasets and run on
    off-the-shelf machines. More importantly, these systems need to be able to
    extract and disambiguate dense annotations from the data in order to enable an
    Information Retrieval or Extraction task running on the data to be more
    efficient and accurate. In order to address all these challenges, we present
    the Lithium EDL system and algorithm – a high-throughput, lightweight,
    language-agnostic EDL system that extracts and correctly disambiguates 75% more
    entities than state-of-the-art EDL systems and is significantly faster than
    them.

    DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

    Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Learning sophisticated feature interactions behind user behaviors is critical
    in maximizing CTR for recommender systems. Despite great progress, existing
    methods seem to have a strong bias towards low- or high-order interactions, or
    require expertise feature engineering. In this paper, we show that it is
    possible to derive an end-to-end learning model that emphasizes both low- and
    high-order feature interactions. The proposed model, DeepFM, combines the power
    of factorization machines for recommendation and deep learning for feature
    learning in a new neural network architecture. Compared to the latest Wide &
    Deep model from Google, DeepFM has a shared input to its “wide” and “deep”
    parts, with no need of feature engineering besides raw features. Comprehensive
    experiments are conducted to demonstrate the effectiveness and efficiency of
    DeepFM over the existing models for CTR prediction, on both benchmark data and
    commercial data.

    Feature overwriting as a finite mixture process: Evidence from comprehension data

    Shravan Vasishth, Lena A. Jaeger, Bruno Nicenboim
    Comments: 6 pages, 2 figures, 1 table, submitted to MathPsych/ICCM 2017, Warwick, UK
    Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Applications (stat.AP)

    The ungrammatical sentence “The key to the cabinets are on the table” is
    known to lead to an illusion of grammaticality. As discussed in the
    meta-analysis by Jaeger et al., 2017, faster reading times are observed at the
    verb are in the agreement-attraction sentence above compared to the equally
    ungrammatical sentence “The key to the cabinet are on the table”. One
    explanation for this facilitation effect is the feature percolation account:
    the plural feature on cabinets percolates up to the head noun key, leading to
    the illusion. An alternative account is in terms of cue-based retrieval (Lewis
    & Vasishth, 2005), which assumes that the non-subject noun cabinets is
    misretrieved due to a partial feature-match when a dependency completion
    process at the auxiliary initiates a memory access for a subject with plural
    marking. We present evidence for yet another explanation for the observed
    facilitation. Because the second sentence has two nouns with identical number,
    it is possible that these are, in some proportion of trials, more difficult to
    keep distinct, leading to slower reading times at the verb in the first
    sentence above; this is the feature overwriting account of Nairne, 1990. We
    show that the feature overwriting proposal can be implemented as a finite
    mixture process. We reanalysed ten published data-sets, fitting hierarchical
    Bayesian mixture models to these data assuming a two-mixture distribution. We
    show that in nine out of the ten studies, a mixture distribution corresponding
    to feature overwriting furnishes a superior fit over both the feature
    percolation and the cue-based retrieval accounts.

    A German Corpus for Text Similarity Detection Tasks

    Juan-Manuel Torres-Moreno, Gerardo Sierra, Peter Peinl
    Comments: 1 figure; 13 pages
    Journal-ref: Preprint of International Journal of Computational Linguistics and
    Applications, vol. 5, no. 2, 2014, pp. 9-24
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Text similarity detection aims at measuring the degree of similarity between
    a pair of texts. Corpora available for text similarity detection are designed
    to evaluate the algorithms to assess the paraphrase level among documents. In
    this paper we present a textual German corpus for similarity detection. The
    purpose of this corpus is to automatically assess the similarity between a pair
    of texts and to evaluate different similarity measures, both for whole
    documents or for individual sentences. Therefore we have calculated several
    simple measures on our corpus based on a library of similarity functions.


    Distributed, Parallel, and Cluster Computing

    A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning

    Ning Liu, Zhe Li, Zhiyuan Xu, Jielong Xu, Sheng Lin, Qinru Qiu, Jian Tang, Yanzhi Wang
    Comments: accepted by 37th IEEE International Conference on Distributed Computing (ICDCS 2017)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)

    Automatic decision-making approaches, such as reinforcement learning (RL),
    have been applied to (partially) solve the resource allocation problem
    adaptively in the cloud computing system. However, a complete cloud resource
    allocation framework exhibits high dimensions in state and action spaces, which
    prohibit the usefulness of traditional RL techniques. In addition, high power
    consumption has become one of the critical concerns in design and control of
    cloud computing systems, which degrades system reliability and increases
    cooling cost. An effective dynamic power management (DPM) policy should
    minimize power consumption while maintaining performance degradation within an
    acceptable level. Thus, a joint virtual machine (VM) resource allocation and
    power management framework is critical to the overall cloud computing system.
    Moreover, novel solution framework is necessary to address the even higher
    dimensions in state and action spaces. In this paper, we propose a novel
    hierarchical framework for solving the overall resource allocation and power
    management problem in cloud computing systems. The proposed hierarchical
    framework comprises a global tier for VM resource allocation to the servers and
    a local tier for distributed power management of local servers. The emerging
    deep reinforcement learning (DRL) technique, which can deal with complicated
    control problems with large state space, is adopted to solve the global tier
    problem. Furthermore, an autoencoder and a novel weight sharing structure are
    adopted to handle the high-dimensional state space and accelerate the
    convergence speed. On the other hand, the local tier of distributed server
    power managements comprises an LSTM based workload predictor and a model-free
    RL based power manager, operating in a distributed manner.

    Big Data in HEP: A comprehensive use case study

    Oliver Gutsche (1), Matteo Cremonesi (1), Peter Elmer (2), Bo Jayatilaka (1), Jim Kowalkowski (1), Jim Pivarski (2), Saba Sehrish (1), Cristina Mantilla Surez (3), Alexey Svyatkovskiy (2), Nhan Tran (1) ((1) Fermi National Accelerator Laboratory, (2) Princeton University, (3) Fermi National Accelerator Laboratory now Johns Hopkins University)
    Comments: Proceedings for 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2016)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Experimental Particle Physics has been at the forefront of analyzing the
    worlds largest datasets for decades. The HEP community was the first to develop
    suitable software and computing tools for this task. In recent times, new
    toolkits and systems collectively called Big Data technologies have emerged to
    support the analysis of Petabyte and Exabyte datasets in industry. While the
    principles of data analysis in HEP have not changed (filtering and transforming
    experiment-specific data formats), these new technologies use different
    approaches and promise a fresh look at analysis of very large datasets and
    could potentially reduce the time-to-physics with increased interactivity. In
    this talk, we present an active LHC Run 2 analysis, searching for dark matter
    with the CMS detector, as a testbed for Big Data technologies. We directly
    compare the traditional NTuple-based analysis with an equivalent analysis using
    Apache Spark on the Hadoop ecosystem and beyond. In both cases, we start the
    analysis with the official experiment data formats and produce publication
    physics plots. We will discuss advantages and disadvantages of each approach
    and give an outlook on further studies needed.

    Real-Time Machine Learning: The Missing Pieces

    Robert Nishihara, Philipp Moritz, Stephanie Wang, Alexey Tumanov, William Paul, Johann Schleier-Smith, Richard Liaw, Michael I. Jordan, Ion Stoica
    Comments: 6 pages, 3 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Machine learning applications are increasingly deployed not only to serve
    predictions using static models, but also as tightly-integrated components of
    feedback loops involving dynamic, real-time decision making. These applications
    pose a new set of requirements, none of which are difficult to achieve in
    isolation, but the combination of which creates a challenge for existing
    distributed execution frameworks: computation with millisecond latency at high
    throughput, adaptive construction of arbitrary task graphs, and execution of
    heterogeneous kernels over diverse sets of resources. We assert that a new
    distributed execution framework is needed for such ML applications and propose
    a candidate approach with a proof-of-concept architecture that achieves a 63x
    performance improvement over a state-of-the-art execution framework for a
    representative application.

    DotDFS: A Grid-based high-throughput file transfer system

    Alireza Poshtkohia, M.B. Ghaznavi-Ghoushchi
    Comments: 28 pages, 21 figures
    Journal-ref: Parallel Computing, 37 (2011) 114-136
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    DotGrid platform is a Grid infrastructure integrated with a set of open and
    standard protocols recently implemented on the top of Microsoft .NET in Windows
    and MONO .NET in UNIX/Linux. DotGrid infrastructure along with its proposed
    protocols provides a right and solid approach to targeting other platforms,
    e.g., the native C/C++ runtime. In this paper, we propose a new file transfer
    protocol called DotDFS as a high-throughput distributed file transfer component
    for DotGrid. DotDFS introduces some open binary protocols for efficient file
    transfers on current Grid infrastructures. DotDFS protocol also provides
    mechanisms for multiple file streams to gain high-throughput file transfer
    similar to GridFTP protocol, but by proposing and implementing a new parallel
    TCP connection-oriented paradigm. In our LAN tests, we have achieved better
    results than Globus GridFTP implementation particularly in multiple TCP streams
    and directory tree transfers. Our LAN experiences in memory-to-memory tests
    show that DotDFS accesses to the 94% bottleneck bandwidth while GridFTP is
    accessing 91%. In LAN disk-to-disk tests, comparing DotDFS protocol with
    GridFTP protocol unveils a set of interesting and technical problems in GridFTP
    for both the nature of the protocol and its implementation by Globus. In the
    WAN experimental studies, we propose a new idea for analytical modeling of file
    transfer protocols like DotDFS inspired by sampling, experimentation and
    mathematical interpolation approaches. The cross-platform and open
    standard-based features of DotDFS provide a substantial framework for unifying
    data access and resource sharing in real heterogeneous Grid environments.

    DotGrid: a .NET-based cross-platform software for desktop grids

    Alireza Poshtkohi, Ali Haj Abutalebi, Shaahin Hessabi
    Comments: 20 pages, 14 figures
    Journal-ref: Int. J. Web and Grid Services, Vol. 3, No. 3, pp.313-332, 2007
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Grid infrastructures that have provided wide integrated use of resources are
    becoming the de-facto computing platform for solving large-scale problems in
    science, engineering and commerce. In this evolution, desktop grid technologies
    allow the grid communities to exploit the idle cycles of pervasive desktop PC
    systems to increase the available computing power. In this paper we present
    DotGrid, a cross-platform grid software. DotGrid is the first comprehensive
    desktop grid software utilising Microsoft’s .NET Framework in Windows-based
    environments and MONO .NET in Unix-class operating systems to operate. Using
    DotGrid services and APIs, grid desktop middleware and applications can be
    implemented conveniently. We evaluated our DotGrid’s performance by
    implementing a set of grid-based applications.

    Decentralized, Robust and Efficient Services for an Autonomous and Real-time Urban Crisis Management

    Frédéric Le Mouël (CITI), Carlos Barrios Hernández (UIS), Oscar Carrillo (CITI), Gabriel Pedraza (UIS)
    Comments: in French. Colloque international interdisciplinaire Colombie — France ” La Ville-R’egion durable comme projet: d’efis actuels. Regards crois’es et perspectives “, Mar 2017, Bogota, Colombie
    Subjects: Computers and Society (cs.CY); Distributed, Parallel, and Cluster Computing (cs.DC); Social and Information Networks (cs.SI)

    The globalization of trade and the organization of work are currently causing
    a large migratory flow towards the cities. This growth of cities requires new
    urban planning where digital tools take a preponderant place to capture data
    and understand and decide in face of changes. These tools however hardly resist
    to natural disasters, terrorism, accidents, etc. Based on the expertise of the
    CITI laboratory of INSA Lyon and SC3 of the Industrial University of Santander,
    we propose to create the ALERT project – Autonomous Liable Emergency service in
    Real Time – with decentralized, reliable and efficient services, physically
    close to the citizens, taking decisions locally, in a relevant manner without
    risk of disconnection with a central authority. These information gathering and
    decision-making will involve the population with participatory and social
    approaches.

    On the Transformation Capability of Feasible Mechanisms for Programmable Matter

    Othon Michail, George Skretas, Paul G. Spirakis
    Comments: 48 pages, 32 figures
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC); Robotics (cs.RO)

    In this work, we study theoretical models of emph{programmable matter}
    systems. The systems under consideration consist of spherical modules, kept
    together by magnetic forces and able to perform two minimal mechanical
    operations (or movements): emph{rotate} around a neighbor and emph{slide}
    over a line. In terms of modeling, there are (n) nodes arranged in a
    2-dimensional grid and forming some initial emph{shape}. The goal is for the
    initial shape (A) to emph{transform} to some target shape (B) by a sequence of
    movements. Most of the paper focuses on emph{transformability} questions,
    meaning whether it is in principle feasible to transform a given shape to
    another. We first consider the case in which only rotation is available to the
    nodes. Our main result is that deciding whether two given shapes (A) and (B)
    can be transformed to each other, is in (mathbf{P}). We then insist on
    rotation only and impose the restriction that the nodes must maintain global
    connectivity throughout the transformation. We prove that the corresponding
    transformability question is in (mathbf{PSPACE}) and study the problem of
    determining the minimum emph{seeds} that can make feasible, otherwise
    infeasible transformations. Next we allow both rotations and slidings and prove
    universality: any two connected shapes (A,B) of the same order, can be
    transformed to each other without breaking connectivity. The worst-case number
    of movements of the generic strategy is (Omega(n^2)). We improve this to
    (O(n)) parallel time, by a pipelining strategy, and prove optimality of both by
    matching lower bounds. In the last part of the paper, we turn our attention to
    distributed transformations. The nodes are now distributed processes able to
    perform communicate-compute-move rounds. We provide distributed algorithms for
    a general type of transformations.

    Towards Efficient Verification of Population Protocols

    Michael Blondin, Stefan Jaax, Javier Esparza, Philipp J. Meyer
    Comments: 27 pages, 1 figure
    Subjects: Logic in Computer Science (cs.LO); Distributed, Parallel, and Cluster Computing (cs.DC)

    Population protocols are a well established model of computation by
    anonymous, identical finite state agents. A protocol is well-specified if from
    every initial configuration, all fair executions reach a common consensus. The
    central verification question for population protocols is the
    well-specification problem: deciding if a given protocol is well-specified.
    Esparza et al. have recently shown that this problem is decidable, but with
    very high complexity: it is at least as hard as the Petri net reachability
    problem, which is EXPSPACE-hard, and for which only algorithms of non-primitive
    recursive complexity are currently known.

    In this paper we introduce the class WS3 of well-specified strongly-silent
    protocols and we prove that it is suitable for automatic verification. More
    precisely, we show that WS3 has the same computational power as general
    well-specified protocols, and captures standard protocols from the literature.
    Moreover, we show that the membership problem for WS3 reduces to solving
    boolean combinations of linear constraints over N. This allowed us to develop
    the first software able to automatically prove well-specification for all of
    the infinitely many possible inputs.

    Cognitive Inference of Demographic Data by User Ratings

    Jinliang Xu, Shangguang Wang, Fangchun Yang, Rong N. Chang
    Comments: 12 pages, 6 figures
    Subjects: Information Retrieval (cs.IR); Distributed, Parallel, and Cluster Computing (cs.DC)

    Cognitive inference of user demographics, such as gender and age, plays an
    important role in creating user profiles for adjusting marketing strategies and
    generating personalized recommendations because user demographic data is
    usually not available due to data privacy concerns. At present, users can
    readily express feedback regarding products or services that they have
    purchased. During this process, user demographics are concealed, but the data
    has never yet been successfully utilized to contribute to the cognitive
    inference of user demographics. In this paper, we investigate the inference
    power of user ratings data, and propose a simple yet general cognitive
    inference model, called rating to profile (R2P), to infer user demographics
    from user provided ratings. In particular, the proposed R2P model can achieve
    the following: 1. Correctly integrate user ratings into model training. 2.Infer
    multiple demographic attributes of users simultaneously, capturing the
    underlying relevance between different demographic attributes. 3. Train its two
    components, i.e. feature extractor and classifier, in an integrated manner
    under a supervised learning paradigm, which effectively helps to discover
    useful hidden patterns from highly sparse ratings data. We introduce how to
    incorporate user ratings data into the research field of cognitive inference of
    user demographic data, and detail the model development and optimization
    process for the proposed R2P. Extensive experiments are conducted on two
    real-world ratings datasets against various compared state-of-the-art methods,
    and the results from multiple aspects demonstrate that our proposed R2P model
    can significantly improve on the cognitive inference performance of user
    demographic data.

    Multiple User Context Inference by Fusing Data Sources

    Jinliang Xu, Shangguang Wang, Fangchun Yang, Jie Tang
    Comments: 13 pages, 8 figures
    Subjects: Information Retrieval (cs.IR); Distributed, Parallel, and Cluster Computing (cs.DC)

    Inference of user context information, including user’s gender, age, marital
    status, location and so on, has been proven to be valuable for building context
    aware recommender system. However, prevalent existing studies on user context
    inference have two shortcommings: 1. focusing on only a single data source
    (e.g. Internet browsing logs, or mobile call records), and 2. ignoring the
    interdependence of multiple user contexts (e.g. interdependence between age and
    marital status), which have led to poor inference performance. To solve this
    problem, in this paper, we first exploit tensor outer product to fuse multiple
    data sources in the feature space to obtain an extensional user feature
    representation. Following this, by taking this extensional user feature
    representation as input, we propose a multiple attribute probabilistic model
    called MulAProM to infer user contexts that can take advantage of the
    interdependence between them. Our study is based on large telecommunication
    datasets from the local mobile operator of Shanghai, China, and consists of two
    data sources, 4.6 million call detail records and 7.5 million data traffic
    records of (8,000) mobile users, collected in the course of six months. The
    experimental results show that our model can outperform other models in terms
    of emph{recall}, emph{precision}, and the emph{F1-measure}.

    BLOCKBENCH: A Framework for Analyzing Private Blockchains

    Tien Tuan Anh Dinh, Ji Wang, Gang Chen, Rui Liu, Beng Chin Ooi, Kian-Lee Tan
    Comments: 16 pages
    Subjects: Databases (cs.DB); Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

    Blockchain technologies are taking the world by storm. Public blockchains,
    such as Bitcoin and Ethereum, enable secure peer-to-peer applications like
    crypto-currency or smart contracts. Their security and performance are well
    studied. This paper concerns recent private blockchain systems designed with
    stronger security (trust) assumption and performance requirement. These systems
    target and aim to disrupt applications which have so far been implemented on
    top of database systems, for example banking, finance applications. Multiple
    platforms for private blockchains are being actively developed and fine tuned.
    However, there is a clear lack of a systematic framework with which different
    systems can be analyzed and compared against each other. Such a framework can
    be used to assess blockchains’ viability as another distributed data processing
    platform, while helping developers to identify bottlenecks and accordingly
    improve their platforms.

    In this paper, we first describe BlockBench, the first evaluation framework
    for analyzing private blockchains. It serves as a fair means of comparison for
    different platforms and enables deeper understanding of different system design
    choices. Any private blockchain can be integrated to BlockBench via simple APIs
    and benchmarked against workloads that are based on real and synthetic smart
    contracts. BlockBench measures overall and component-wise performance in terms
    of throughput, latency, scalability and fault-tolerance. Next, we use
    BlockBench to conduct comprehensive evaluation of three major private
    blockchains: Ethereum, Parity and Hyperledger Fabric. The results demonstrate
    that these systems are still far from displacing current database systems in
    traditional data processing workloads. Furthermore, there are gaps in
    performance among the three systems which are attributed to the design choices
    at different layers of the software stack.


    Learning

    Task-based End-to-end Model Learning

    Priya L. Donti, Brandon Amos, J. Zico Kolter
    Comments: Submitted to ICML 2017. Code available at this https URL
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    As machine learning techniques have become more ubiquitous, it has become
    common to see machine learning prediction algorithms operating within some
    larger process. However, the criteria by which we train machine learning
    algorithms often differ from the ultimate criteria on which we evaluate them.
    This paper proposes an end-to-end approach for learning probabilistic machine
    learning models within the context of stochastic programming, in a manner that
    directly captures the ultimate task-based objective for which they will be
    used. We then present two experimental evaluations of the proposed approach,
    one as applied to a generic inventory stock problem and the second to a
    real-world electrical grid scheduling task. In both cases, we show that the
    proposed approach can outperform both a traditional modeling approach and a
    purely black-box policy optimization approach.

    Langevin Dynamics with Continuous Tempering for High-dimensional Non-convex Optimization

    Nanyang Ye, Zhanxing Zhu, Rafal K. Mantiuk
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Minimizing non-convex and high-dimensional objective functions are
    challenging, especially when training modern deep neural networks. In this
    paper, a novel approach is proposed which divides the training process into two
    consecutive phases to obtain better generalization performance: Bayesian
    sampling and stochastic optimization. The first phase is to explore the energy
    landscape and to capture the “fat” modes; and the second one is to fine-tune
    the parameter learned from the first phase. In the Bayesian learning phase, we
    apply continuous tempering and stochastic approximation into the Langevin
    dynamics to create an efficient and effective sampler, in which the temperature
    is adjusted automatically according to the designed “temperature dynamics”.
    These strategies can overcome the challenge of early trapping into bad local
    minima and have achieved remarkable improvements in various types of neural
    networks as shown in our theoretical analysis and empirical experiments.

    Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

    Michael Gygli, Mohammad Norouzi, Anelia Angelova
    Comments: Submitted to ICML 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    We approach structured output prediction by learning a deep value network
    (DVN) that evaluates different output structures for a given input. For
    example, when applied to image segmentation, the value network takes an image
    and a segmentation mask as inputs and predicts a scalar score evaluating the
    mask quality and its correspondence with the image. Once the value network is
    optimized, at inference, it finds output structures that maximize the score of
    the value net via gradient descent on continuous relaxations of structured
    outputs. Thus DVN takes advantage of the joint modeling of the inputs and
    outputs. Our framework applies to a wide range of structured output prediction
    problems. We conduct experiments on multi-label classification based on text
    data and on image segmentation problems. DVN outperforms several strong
    baselines and the state-of-the-art results on these benchmarks. In addition, on
    image segmentation, the proposed deep value network learns complex shape priors
    and effectively combines image information with the prior to obtain competitive
    segmentation results.

    Blocking Transferability of Adversarial Examples in Black-Box Learning Systems

    Hossein Hosseini, Yize Chen, Sreeram Kannan, Baosen Zhang, Radha Poovendran
    Subjects: Learning (cs.LG)

    Advances in Machine Learning (ML) have led to its adoption as an integral
    component in many applications, including banking, medical diagnosis, and
    driverless cars. To further broaden the use of ML models, cloud-based services
    offered by Microsoft, Amazon, Google, and others have developed ML-as-a-service
    tools as black-box systems. However, ML classifiers are vulnerable to
    adversarial examples: inputs that are maliciously modified can cause the
    classifier to provide adversary-desired outputs. Moreover, it is known that
    adversarial examples generated on one classifier are likely to cause another
    classifier to make the same mistake, even if the classifiers have different
    architectures or are trained on disjoint datasets. This property, which is
    known as transferability, opens up the possibility of attacking black-box
    systems by generating adversarial examples on a substitute classifier and
    transferring the examples to the target classifier. Therefore, the key to
    protect black-box learning systems against the adversarial examples is to block
    their transferability. To this end, we propose a training method that, as the
    input is more perturbed, the classifier smoothly outputs lower confidence on
    the original label and instead predicts that the input is “invalid”. In
    essence, we augment the output class set with a NULL label and train the
    classifier to reject the adversarial examples by classifying them as NULL. In
    experiments, we apply a wide range of attacks based on adversarial examples on
    the black-box systems. We show that a classifier trained with the proposed
    method effectively resists against the adversarial examples, while maintaining
    the accuracy on clean data.

    Online Learning with Local Permutations and Delayed Feedback

    Ohad Shamir, Liran Szlak
    Subjects: Learning (cs.LG)

    We propose an Online Learning with Local Permutations (OLLP) setting, in
    which the learner is allowed to slightly permute the emph{order} of the loss
    functions generated by an adversary. On one hand, this models natural
    situations where the exact order of the learner’s responses is not crucial, and
    on the other hand, might allow better learning and regret performance, by
    mitigating highly adversarial loss sequences. Also, with random permutations,
    this can be seen as a setting interpolating between adversarial and stochastic
    losses. In this paper, we consider the applicability of this setting to convex
    online learning with delayed feedback, in which the feedback on the prediction
    made in round (t) arrives with some delay ( au). With such delayed feedback,
    the best possible regret bound is well-known to be (O(sqrt{ au T})). We prove
    that by being able to permute losses by a distance of at most (M) (for (Mgeq
    au)), the regret can be improved to (O(sqrt{T}(1+sqrt{ au^2/M}))), using a
    Mirror-Descent based algorithm which can be applied for both Euclidean and
    non-Euclidean geometries. We also prove a lower bound, showing that for
    (M< au/3), it is impossible to improve the standard (O(sqrt{ au T})) regret
    bound by more than constant factors. Finally, we provide some experiments
    validating the performance of our algorithm.

    Conjugate-Computation Variational Inference : Converting Variational Inference in Non-Conjugate Models to Inferences in Conjugate Models

    Mohammad Emtiyaz Khan, Wu Lin
    Comments: Published in AI-Stats 2017. This version contains a short paragraph in the conclusions section which we could not add in the conference version due to space constraints. The last line in Section 5 has also been modified accordingly
    Subjects: Learning (cs.LG)

    Variational inference is computationally challenging in models that contain
    both conjugate and non-conjugate terms. Methods specifically designed for
    conjugate models, even though computationally efficient, find it difficult to
    deal with non-conjugate terms. On the other hand, stochastic-gradient methods
    can handle the non-conjugate terms but they usually ignore the conjugate
    structure of the model which might result in slow convergence. In this paper,
    we propose a new algorithm called Conjugate-computation Variational Inference
    (CVI) which brings the best of the two worlds together — it uses conjugate
    computations for the conjugate terms and employs stochastic gradients for the
    rest. We derive this algorithm by using a stochastic mirror-descent method in
    the mean-parameter space, and then expressing each gradient step as a
    variational inference in a conjugate model. We demonstrate our algorithm’s
    applicability to a large class of models and establish its convergence. Our
    experimental results show that our method converges much faster than the
    methods that ignore the conjugate structure of the model.

    SPARTan: Scalable PARAFAC2 for Large & Sparse Data

    Ioakeim Perros, Evangelos E. Papalexakis, Fei Wang, Richard Vuduc, Elizabeth Searles, Michael Thompson, Jimeng Sun
    Subjects: Learning (cs.LG); Numerical Analysis (cs.NA)

    In exploratory tensor mining, a common problem is how to analyze a set of
    variables across a set of subjects whose observations do not align naturally.
    For example, when modeling medical features across a set of patients, the
    number and duration of treatments may vary widely in time, meaning there is no
    meaningful way to align their clinical records across time points for analysis
    purposes. To handle such data, the state-of-the-art tensor model is the
    so-called PARAFAC2, which yields interpretable and robust output and can
    naturally handle sparse data. However, its main limitation up to now has been
    the lack of efficient algorithms that can handle large-scale datasets.

    In this work, we fill this gap by developing a scalable method to compute the
    PARAFAC2 decomposition of large and sparse datasets, called SPARTan. Our method
    exploits special structure within PARAFAC2, leading to a novel algorithmic
    reformulation that is both fast (in absolute time) and more memory-efficient
    than prior work. We evaluate SPARTan on both synthetic and real datasets,
    showing 22X performance gains over the best previous implementation and also
    handling larger problem instances for which the baseline fails. Furthermore, we
    are able to apply SPARTan to the mining of temporally-evolving phenotypes on
    data taken from real and medically complex pediatric patients. The clinical
    meaningfulness of the phenotypes identified in this process, as well as their
    temporal evolution over time for several patients, have been endorsed by
    clinical experts.

    Improved multitask learning through synaptic intelligence

    Friedemann Zenke, Ben Poole, Surya Ganguli
    Subjects: Learning (cs.LG); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)

    Deep learning has led to remarkable advances when applied to problems where
    the data distribution does not change over the course of learning. In stark
    contrast, biological neural networks continually adapt to changing domains, and
    solve a diversity of tasks simultaneously. Furthermore, synapses in biological
    neurons are not simply real-valued scalars, but possess complex molecular
    machinery enabling non-trivial learning dynamics. In this study, we take a
    first step toward bringing this biological complexity into artificial neural
    networks. We introduce a model of intelligent synapses that accumulate task
    relevant information over time, and exploit this information to efficiently
    consolidate memories of old tasks to protect them from being overwritten as new
    tasks are learned. We apply our framework to learning sequences of related
    classification problems, and show that it dramatically reduces catastrophic
    forgetting while maintaining computational efficiency.

    Leak Event Identification in Water Systems Using High Order CRF

    Qing Han, Wentao Zhu, Yang Shi
    Subjects: Learning (cs.LG)

    Today, detection of anomalous events in civil infrastructures (e.g. water
    pipe breaks and leaks) is time consuming and often takes hours or days. Pipe
    breakage as one of the most frequent types of failure of water networks often
    causes community disruptions ranging from temporary interruptions in services
    to extended loss of business and relocation of residents. In this project, we
    design and implement a two-phase approach for leak event identification, which
    leverages dynamic data from multiple information sources including IoT sensing
    data (pressure values and/or flow rates), geophysical data (water systems), and
    human inputs (tweets posted on Twitter). In the approach, a high order
    Conditional Random Field (CRF) is constructed that enforces predictions based
    on IoT observations consistent with human inputs to improve the performance of
    event identifications.

    Considering the physical water network as a graph, a CRF model is built and
    learned by the Structured Support Vector Machine (SSVM) using node features
    such as water pressure and flow rate. After that, we built the high order CRF
    system by enforcing twitter leakage detection information. An optimal inference
    algorithm is proposed for the adapted high order CRF model. Experimental
    results show the effectiveness of our system.

    Multiscale Hierarchical Convolutional Networks

    Jörn-Henrik Jacobsen, Edouard Oyallon, Stéphane Mallat, Arnold W.M. Smeulders
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Deep neural network algorithms are difficult to analyze because they lack
    structure allowing to understand the properties of underlying transforms and
    invariants. Multiscale hierarchical convolutional networks are structured deep
    convolutional networks where layers are indexed by progressively higher
    dimensional attributes, which are learned from training data. Each new layer is
    computed with multidimensional convolutions along spatial and attribute
    variables. We introduce an efficient implementation of such networks where the
    dimensionality is progressively reduced by averaging intermediate layers along
    attribute indices. Hierarchical networks are tested on CIFAR image data bases
    where they obtain comparable precisions to state of the art networks, with much
    fewer parameters. We study some properties of the attributes learned from these
    databases.

    Autoregressive Convolutional Neural Networks for Asynchronous Time Series

    Mikolaj Binkowski, Gautier Marti, Philippe Donnat
    Comments: Under review by the International Conference on Machine Learning (ICML)
    Subjects: Learning (cs.LG)

    We propose ‘Significance-Offset Convolutional Neural Network’, a deep
    convolutional network architecture for multivariate time series regression. The
    model is inspired by standard autoregressive (AR) models and gating mechanisms
    used in recurrent neural networks. It involves an AR-like weighting system,
    where the final predictor is obtained as a weighted sum of sub-predictors while
    the weights are data-dependent functions learnt through a convolutional
    network.The architecture was designed for applications on asynchronous time
    series with low signal-to-noise ratio and hence is evaluated on such datasets:
    a hedge fund proprietary dataset of over2 million quotes for a credit
    derivative index andan artificially generated noisy autoregressive series. The
    proposed architecture achieves promising results compared to convolutional and
    recur-rent neural networks. The code for the numerical experiments and the
    architecture implementation will be shared online to make the research
    reproducible.

    Sequential Local Learning for Latent Graphical Models

    Sejun Park, Eunho Yang, Jinwoo Shin
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Learning parameters of latent graphical models (GM) is inherently much harder
    than that of no-latent ones since the latent variables make the corresponding
    log-likelihood non-concave. Nevertheless, expectation-maximization schemes are
    popularly used in practice, but they are typically stuck in local optima. In
    the recent years, the method of moments have provided a refreshing angle for
    resolving the non-convex issue, but it is applicable to a quite limited class
    of latent GMs. In this paper, we aim for enhancing its power via enlarging such
    a class of latent GMs. To this end, we introduce two novel concepts, coined
    marginalization and conditioning, which can reduce the problem of learning a
    larger GM to that of a smaller one. More importantly, they lead to a sequential
    learning framework that repeatedly increases the learning portion of given
    latent GM, and thus covers a significantly broader and more complicated class
    of loopy latent GMs which include convolutional and random regular models.

    Prediction and Control with Temporal Segment Models

    Nikhil Mishra, Pieter Abbeel, Igor Mordatch
    Subjects: Learning (cs.LG); Robotics (cs.RO)

    We introduce a method for learning the dynamics of complex nonlinear systems
    based on deep generative models over temporal segments of states and actions.
    Unlike dynamics models that operate over individual discrete timesteps, we
    learn the distribution over future state trajectories conditioned on past
    state, past action, and planned future action trajectories, as well as a latent
    prior over action trajectories. Our approach is based on convolutional
    autoregressive models and variational autoencoders. It makes stable and
    accurate predictions over long horizons for complex, stochastic systems,
    effectively expressing uncertainty and modeling the effects of collisions,
    sensory noise, and action delays. The learned dynamics model and action prior
    can be used for end-to-end, fully differentiable trajectory optimization and
    model-based policy optimization, which we use to evaluate the performance and
    sample-efficiency of our method.

    Recruiting from the network: discovering Twitter users who can help combat Zika epidemics

    Paolo Missier, Callum McClean, Jonathan Carlton, Diego Cedrim, Leonardo Silva, Alessandro Garcia, Alexandre Plastino, Alexander Romanovsky
    Subjects: Learning (cs.LG); Social and Information Networks (cs.SI)

    Tropical diseases like extit{Chikungunya} and extit{Zika} have come to
    prominence in recent years as the cause of serious, long-lasting,
    population-wide health problems. In large countries like Brasil, traditional
    disease prevention programs led by health authorities have not been
    particularly effective. We explore the hypothesis that monitoring and analysis
    of social media content streams may effectively complement such efforts.
    Specifically, we aim to identify selected members of the public who are likely
    to be sensitive to virus combat initiatives that are organised in local
    communities. Focusing on Twitter and on the topic of Zika, our approach
    involves (i) training a classifier to select topic-relevant tweets from the
    Twitter feed, and (ii) discovering the top users who are actively posting
    relevant content about the topic. We may then recommend these users as the
    prime candidates for direct engagement within their community. In this short
    paper we describe our analytical approach and prototype architecture, discuss
    the challenges of dealing with noisy and sparse signal, and present encouraging
    preliminary results.

    Deep Learning in Customer Churn Prediction: Unsupervised Feature Learning on Abstract Company Independent Feature Vectors

    Philip Spanoudes, Thomson Nguyen
    Comments: 23 pages, 14 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    As companies increase their efforts in retaining customers, being able to
    predict accurately ahead of time, whether a customer will churn in the
    foreseeable future is an extremely powerful tool for any marketing team. The
    paper describes in depth the application of Deep Learning in the problem of
    churn prediction. Using abstract feature vectors, that can generated on any
    subscription based company’s user event logs, the paper proves that through the
    use of the intrinsic property of Deep Neural Networks (learning secondary
    features in an unsupervised manner), the complete pipeline can be applied to
    any subscription based company with extremely good churn predictive
    performance. Furthermore the research documented in the paper was performed for
    Framed Data (a company that sells churn prediction as a service for other
    companies) in conjunction with the Data Science Institute at Lancaster
    University, UK. This paper is the intellectual property of Framed Data.

    Social Fingerprinting: detection of spambot groups through DNA-inspired behavioral modeling

    Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi, Maurizio Tesconi
    Subjects: Social and Information Networks (cs.SI); Cryptography and Security (cs.CR); Learning (cs.LG)

    Spambot detection in online social networks is a long-lasting challenge
    involving the study and design of detection techniques capable of efficiently
    identifying ever-evolving spammers. Recently, a new wave of social spambots has
    emerged, with advanced human-like characteristics that allow them to go
    undetected even by current state-of-the-art algorithms. In this paper, we show
    that efficient spambots detection can be achieved via an in-depth analysis of
    their collective behaviors exploiting the digital DNA technique for modeling
    the behaviors of social network users. Inspired by its biological counterpart,
    in the digital DNA representation the behavioral lifetime of a digital account
    is encoded in a sequence of characters. Then, we define a similarity measure
    for such digital DNA sequences. We build upon digital DNA and the similarity
    between groups of users to characterize both genuine accounts and spambots.
    Leveraging such characterization, we design the Social Fingerprinting
    technique, which is able to discriminate among spambots and genuine accounts in
    both a supervised and an unsupervised fashion. We finally evaluate the
    effectiveness of Social Fingerprinting and we compare it with three
    state-of-the-art detection algorithms. Among the peculiarities of our approach
    is the possibility to apply off-the-shelf DNA analysis techniques to study
    online users behaviors and to efficiently rely on a limited number of
    lightweight account characteristics.

    Bayesian Optimization with Gradients

    Jian Wu, Matthias Poloczek, Andrew Gordon Wilson, Peter I. Frazier
    Comments: 16 pages, 5 figures
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Optimization and Control (math.OC)

    In recent years, Bayesian optimization has proven successful for global
    optimization of expensive-to-evaluate multimodal objective functions. However,
    unlike most optimization methods, Bayesian optimization typically does not use
    derivative information. In this paper we show how Bayesian optimization can
    exploit derivative information to decrease the number of objective function
    evaluations required for good performance. In particular, we develop a novel
    Bayesian optimization algorithm, the derivative-enabled knowledge-gradient
    (dKG), for which we show one-step Bayes-optimality, asymptotic consistency, and
    greater one-step value of information than is possible in the derivative-free
    setting. Our procedure accommodates noisy and incomplete derivative
    information, and comes in both sequential and batch forms. We show dKG provides
    state-of-the-art performance compared to a wide range of optimization
    procedures with and without gradients, on benchmarks including logistic
    regression, kernel learning, and k-nearest neighbors.

    Learning Large-Scale Bayesian Networks with the sparsebn Package

    Bryon Aragam, Jiaying Gu, Qing Zhou
    Comments: 33 pages, 7 figures
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Computation (stat.CO); Methodology (stat.ME)

    Learning graphical models from data is an important problem with wide
    applications, ranging from genomics to the social sciences. Nowadays datasets
    typically have upwards of thousands—sometimes tens or hundreds of
    thousands—of variables and far fewer samples. To meet this challenge, we
    develop a new R package called sparsebn for learning the structure of large,
    sparse graphical models with a focus on Bayesian networks. While there are many
    existing packages for this task within the R ecosystem, this package focuses on
    the unique setting of learning large networks from high-dimensional data,
    possibly with interventions. As such, the methods provided place a premium on
    scalability and consistency in a high-dimensional setting. Furthermore, in the
    presence of interventions, the methods implemented here achieve the goal of
    learning a causal network from data. The sparsebn package is open-source and
    available on CRAN.

    Ask Me Even More: Dynamic Memory Tensor Networks (Extended Model)

    Govardana Sachithanandam Ramachandran, Ajay Sohmshetty
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We examine Memory Networks for the task of question answering (QA), under
    common real world scenario where training examples are scarce and under weakly
    supervised scenario, that is only extrinsic labels are available for training.
    We propose extensions for the Dynamic Memory Network (DMN), specifically within
    the attention mechanism, we call the resulting Neural Architecture as Dynamic
    Memory Tensor Network (DMTN). Ultimately, we see that our proposed extensions
    results in over 80% improvement in the number of task passed against the
    baselined standard DMN and 20% more task passed compared to state-of-the-art
    End-to-End Memory Network for Facebook’s single task weakly trained 1K bAbi
    dataset.

    Real-Time Machine Learning: The Missing Pieces

    Robert Nishihara, Philipp Moritz, Stephanie Wang, Alexey Tumanov, William Paul, Johann Schleier-Smith, Richard Liaw, Michael I. Jordan, Ion Stoica
    Comments: 6 pages, 3 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Machine learning applications are increasingly deployed not only to serve
    predictions using static models, but also as tightly-integrated components of
    feedback loops involving dynamic, real-time decision making. These applications
    pose a new set of requirements, none of which are difficult to achieve in
    isolation, but the combination of which creates a challenge for existing
    distributed execution frameworks: computation with millisecond latency at high
    throughput, adaptive construction of arbitrary task graphs, and execution of
    heterogeneous kernels over diverse sets of resources. We assert that a new
    distributed execution framework is needed for such ML applications and propose
    a candidate approach with a proof-of-concept architecture that achieves a 63x
    performance improvement over a state-of-the-art execution framework for a
    representative application.

    Evolution Strategies as a Scalable Alternative to Reinforcement Learning

    Tim Salimans, Jonathan Ho, Xi Chen, Ilya Sutskever
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    We explore the use of Evolution Strategies, a class of black box optimization
    algorithms, as an alternative to popular RL techniques such as Q-learning and
    Policy Gradients. Experiments on MuJoCo and Atari show that ES is a viable
    solution strategy that scales extremely well with the number of CPUs available:
    By using hundreds to thousands of parallel workers, ES can solve 3D humanoid
    walking in 10 minutes and obtain competitive results on most Atari games after
    one hour of training time. In addition, we highlight several advantages of ES
    as a black box optimization technique: it is invariant to action frequency and
    delayed rewards, tolerant of extremely long horizons, and does not need
    temporal discounting or value function approximation.

    Joint Embedding of Graphs

    Shangsi Wang, Joshua T. Vogelstein, Carey E. Priebe
    Comments: 14 pages
    Subjects: Applications (stat.AP); Learning (cs.LG); Machine Learning (stat.ML)

    Feature extraction and dimension reduction for networks is critical in a wide
    variety of domains. Efficiently and accurately learning features for multiple
    graphs has important applications in statistical inference on graphs. We
    propose a method to jointly embed multiple undirected graphs. Given a set of
    graphs, the joint embedding method identifies a linear subspace spanned by rank
    one symmetric matrices and projects adjacency matrices of graphs into this
    subspace. The projection coefficients can be treated as features of the graphs.
    We also propose a random graph model which generalizes classical random graph
    model and can be used to model multiple graphs. We show through theory and
    numerical experiments that under the model, the joint embedding method produces
    estimates of parameters with small errors. Via simulation experiments, we
    demonstrate that the joint embedding method produces features which lead to
    state of the art performance in classifying graphs. Applying the joint
    embedding method to human brain graphs, we find it extract interpretable
    features that can be used to predict individual composite creativity index.

    Markov Chain Lifting and Distributed ADMM

    Guilherme França, José Bento
    Comments: This work was also selected for a talk at NIPS 2016, Optimization for Machine Learning Workshop (OPT 2016)
    Journal-ref: IEEE Signal Processing Letters (Volume: 24, Issue: 3, March 2017)
    Subjects: Machine Learning (stat.ML); Data Structures and Algorithms (cs.DS); Information Theory (cs.IT); Learning (cs.LG); Optimization and Control (math.OC)

    The time to converge to the steady state of a finite Markov chain can be
    greatly reduced by a lifting operation, which creates a new Markov chain on an
    expanded state space. For a class of quadratic objectives, we show an analogous
    behavior where a distributed ADMM algorithm can be seen as a lifting of
    Gradient Descent algorithm. This provides a deep insight for its faster
    convergence rate under optimal parameter tuning. We conjecture that this gain
    is always present, as opposed to the lifting of a Markov chain which sometimes
    only provides a marginal speedup.


    Information Theory

    Information geometry, simulation and complexity in Gaussian random fields

    Alexandre L. M. Levada
    Journal-ref: Monte Carlo Methods and Applications, Volume 22, Issue 2, Pages
    81-107, 2016
    Subjects: Information Theory (cs.IT)

    Random fields are useful mathematical objects in the characterization of
    non-deterministic complex systems. A fundamental issue in the evolution of
    dynamical systems is how intrinsic properties of such structures change in
    time. In this paper, we propose to quantify how changes in the spatial
    dependence structure affect the Riemannian metric tensor that equips the
    model’s parametric space. Defining Fisher curves, we measure the variations in
    each component of the metric tensor when visiting different entropic states of
    the system. Simulations show that the geometric deformations induced by the
    metric tensor in case of a decrease in the inverse temperature are not
    reversible for an increase of the same amount, provided there is significant
    variation in the system entropy: the process of taking a system from a lower
    entropy state A to a higher entropy state B and then bringing it back to A,
    induces a natural intrinsic one-way direction of evolution. In this context,
    Fisher curves resemble mathematical models of hysteresis in which the natural
    orientation is pointed by an arrow of time.

    Dynamic-SCFlip Decoding of Polar Codes

    Ludovic Chandesris, Valentin Savin, David Declercq
    Comments: submitted to IEEE Transactions on Communications
    Subjects: Information Theory (cs.IT)

    This paper proposes a generalization of the recently introduced Successive
    Cancellation Flip (SCFlip) decoding of polar codes, characterized by a number
    of extra decoding attempts, where one or several positions are flipped from the
    standard Successive Cancellation (SC) decoding. To make such an approach
    effective, we first introduce the concept of higher-order bit-flips, and
    propose a new metric to determine the bit-flips that are more likely to correct
    the trajectory of the SC decoding. We then propose a generalized SCFlip
    decoding algorithm, referred to as Dynamic-SCFlip (D-SCFlip), which dynamically
    builds a list of candidate bit-flips, while guaranteeing that extra decoding
    attempts are performed by decreasing probability of success. Simulation results
    show that D-SCFlip is an effective alternative to SC-List decoding of polar
    codes, by providing very good error correcting performance, with an average
    computation complexity close to the one of the SC decoder.

    Frequency Synchronization for Uplink Massive MIMO Systems

    Weile Zhang, Feifei Gao, Shi Jin, Hai Lin
    Subjects: Information Theory (cs.IT)

    In this paper, we propose a frequency synchronization scheme for multiuser
    orthogonal frequency division multiplexing (OFDM) uplink with a large-scale
    uniform linear array (ULA) at base station (BS) by exploiting the angle
    information of users. Considering that the incident signal at BS from each user
    can be restricted within a certain angular spread, the proposed scheme could
    perform carrier frequency offset (CFO) estimation for each user individually
    through a extit{joint spatial-frequency alignment} procedure and can be
    completed efficiently with the aided of fast Fourier transform (FFT). A
    multi-branch receive beamforming is further designed to yield an equivalent
    single user transmission model for which the conventional single-user channel
    estimation and data detection can be carried out. To make the study complete,
    the theoretical performance analysis of the CFO estimation is also conducted.
    We further develop a user grouping scheme to deal with the unexpected scenarios
    that some users may not be separated well from the spatial domain. Finally,
    various numerical results are provided to verify the proposed studies.

    Analytical Model of Wireless Cell with Superposition Coding

    Jean-Marc Kelif
    Comments: 5 pages, 1 figure
    Subjects: Information Theory (cs.IT)

    Focusing on the downlink, we consider a base station (BS) and the cell it
    covers. The Superposition Coding (SC), also referred to as Non Orthogonal
    Multiple Access (NOMA), is implemented. We propose an analytical model of the
    wireless cell covered by the BS. Based on this model, we establish a closed
    form expression of the minimum transmit power of the base station, needed to
    achieve a given SINR (signal to interference plus noise ratio), whatever the
    users locations, on the area covered by the base station. The closed form
    expression of the BS transmit power allows to establish quality of service
    (QoS) and coverage values, in a simple and quick way.

    Interference Networks with Caches at Both Ends

    Joan S. Pujol Roig, Filippo Tosato, Deniz Gündüz
    Comments: 6 pages, 4 figures, conference
    Subjects: Information Theory (cs.IT)

    A (K_T imes K_R) cache-aided wireless interference network, in which both
    the transmitters and the receivers are equipped with cache memories is studied.
    Each user requests one file from a library of (N) popular files. The goal is to
    design the cache contents without the knowledge of the particular user demands,
    such that all possible demand combinations can be satisfied reliably over the
    interference channel. The achievable sum degrees-of-freedom ((mathrm{sDoF}))
    and the normalized delivery time (NDT) are studied for centralized and
    decentralized network architectures, respectively. First, using a combination
    of interference alignment (IA), zero-forcing (ZF) and interference cancellation
    (IC) techniques, a novel caching and transmission scheme for centralized
    networks is introduced, and it is shown to improve the (mathrm{sDoF}) upon the
    state-of-the-art. Then, the NDT is studied when the content placement at the
    receiver caches is carried out in a decentralized manner. Our results indicate
    that, for this particular network architecture, caches located at the receiver
    side are more effective than those at the transmitter side in order to reduce
    the NDT.

    Linear codes over Fq which are equivalent to LCD codes

    Claude Carlet, Sihem Mesnager, Chunming Tang, Yanfeng Qi
    Subjects: Information Theory (cs.IT)

    Linear codes with complementary duals (abbreviated LCD) are linear codes
    whose intersection with their dual are trivial. When they are binary, they play
    an important role in armoring implementations against side-channel attacks and
    fault injection attacks. Non-binary LCD codes in characteristic 2 can be
    transformed into binary LCD codes by expansion. In this paper, we introduce a
    general construction of LCD codes from any linear codes. Further, we show that
    any linear code over (mathbb F_{q} (q>3)) is equivalent to an Euclidean LCD
    code and any linear code over (mathbb F_{q^2} (q>2)) is equivalent to a
    Hermitian LCD code. Consequently an ([n,k,d]) Euclidean LCD code over (mathbb
    F_q) with (q>3) exists if there is an ([n,k,d])-linear code over (mathbb F_q)
    and an ([n,k,d]) Hermitian LCD code over (mathbb F_{q^2}) with (q>2) exists if
    there is an ([n,k,d])-linear code over (mathbb F_{q^2}). Hence, when (q>3)
    (resp.(q>2)) (q)-ary Euclidean (resp. (q^2)-ary Hermitian) LCD codes are as
    good as (q)-ary linear codes (resp. (q^2)-ary linear codes).

    Virtual Reality over Wireless Networks: Quality-of-Service Model and Learning-Based Resource Management

    Mingzhe Chen, Walid Saad, Changchuan Yin
    Subjects: Information Theory (cs.IT)

    In this paper, the problem of resource management is studied for a network of
    wireless virtual reality (VR) users communicating over heterogeneous small cell
    networks (SCNs). In order to capture the VR users’ quality-of-service (QoS) in
    SCNs, a novel VR model, based on multi-attribute utility theory, is proposed.
    This model jointly accounts for VR metrics such as tracking accuracy,
    processing delay, and transmission delay. In this model, the small base
    stations (SBSs) act as the VR control centers that collect the tracking
    information from VR users over the cellular uplink. Once this information is
    collected, the SBSs will then send the three dimensional images and
    accompanying surround stereo audio to the VR users over the downlink.
    Therefore, the resource allocation problem in VR wireless networks must jointly
    consider both the uplink and downlink. This problem is then formulated as a
    noncooperative game and a distributed algorithm based on the machine learning
    framework of echo state networks (ESNs) is proposed to find the solution of
    this game. The use of the proposed ESN algorithm enables the SBSs to predict
    the VR QoS of each SBS and guarantees the convergence to a mixed-strategy Nash
    equilibrium. The analytical result shows that each user’s VR QoS jointly
    depends on both VR tracking accuracy and wireless resource allocation.
    Simulation results show that the proposed algorithm yields significant gains,
    in terms of total utility value of VR QoS, that reach up to 22% and 38.5%,
    respectively, compared to Q-learning and a baseline proportional fair
    algorithm. The results also show that the proposed algorithm has a faster
    convergence time than Q-learning and can guarantee low delays for VR services.

    Maximizing the Mutual Information of Multi-Antenna Links Under an Interfered Receiver Power Constraint

    George K. Papageorgiou, Konstantinos Ntougias, Constantinos B. Papadias
    Comments: 12 pages, 5 figures, conference paper
    Subjects: Information Theory (cs.IT)

    Single-user multiple-input / multiple-output (SU-MIMO) communication systems
    have been successfully used over the years and have provided a significant
    increase on a wireless link’s capacity by enabling the transmission of multiple
    data streams. Assuming channel knowledge at the transmitter, the maximization
    of the mutual information of a MIMO link is achieved by finding the optimal
    power allocation under a given sum-power constraint, which is in turn obtained
    by the water-filling (WF) algorithm. However, in spectrum sharing setups, such
    as Licensed Shared Access (LSA), where a primary link (PL) and a secondary link
    (SL) coexist, the power transmitted by the SL transmitter may induce harmful
    interference to the PL receiver. While such co-existing links have been
    considered extensively in various spectrum sharing setups, the mutual
    information of the SL under a constraint on the interference it may cause to
    the PL receiver has, quite astonishingly, not been evaluated so far. In this
    paper, we solve this problem, find its unique optimal solution and provide the
    power allocation policy and corresponding precoding solution that achieves the
    optimal capacity under the imposed constraint. The performance of the optimal
    solution and the penalty due to the interference constraint are evaluated over
    some indicative Rayleigh fading channel conditions and interference thresholds.
    We believe that the obtained results are of general nature and that they may
    apply, beyond spectrum sharing, to a variety of applications that admit a
    similar setup.

    Wireless Bidirectional Relaying using Physical Layer Network Coding with Heterogeneous PSK Modulation

    Chinmayananda A., Saket D. Buch, B. Sundar Rajan
    Comments: 10 pages, 10 figures and 3 tables
    Subjects: Information Theory (cs.IT)

    In bidirectional relaying using Physical Layer Network Coding (PLNC), it is
    generally assumed that users employ same modulation schemes in the Multiple
    Access phase. However, as observed by Zhang et al., it may not be desirable for
    the users to always use the same modulation schemes, particularly when
    user-relay channels are not equally strong. Such a scheme is called
    Heterogeneous PLNC. However, the approach in [1] uses the computationally
    intensive Closest Neighbour Clustering (CNC) algorithm to find the network
    coding maps to be applied at the relay. Also, the treatment is specific to
    certain cases of heterogeneous modulations. In this paper, we show that, when
    users employ heterogeneous but symmetric PSK modulations, the network coding
    maps and the mapping regions in the fade state plane can be obtained
    analytically. Performance results are provided in terms of Relay Error Rate
    (RER) and Bit Error Rate (BER).

    Performance Analysis of Physical Layer Network Coding for Two-way Relaying over Non-regenerative Communication Satellites

    Saket D. Buch, B. Sundar Rajan
    Comments: 9 pages and 13 figures
    Subjects: Information Theory (cs.IT)

    Two-way relaying is one of the major applications of broadband communication
    satellites, for which an efficient technique is Physical Layer Network Coding
    (PLNC). Earlier studies have considered satellites employing PLNC with onboard
    processing. This paper investigates the performance of PLNC over
    non-regenerative satellites, as a majority of the operational and planned
    satellites have no onboard processing. Assuming that the channel magnitudes of
    the two users are equal, two operating conditions are considered with
    uncoded-QPSK relaying. In the first condition, both users are completely
    synchronized in phase and transmit power, and in the second condition, phase is
    not synchronized. The peak power constraint imposed by the satellite amplifier
    is considered and the error performance bounds are derived for both the
    conditions. The simulation results for end-to-end Bit Error Rate (BER) and
    throughput are provided. These results shall enable communication system
    designers to decide system parameters like power and linearity, and perform
    tradeoff analysis between different relaying schemes.

    Weight Spectrum of Quasi-Perfect Binary Codes with Distance 4

    Valentine B. Afanassiev, Alexander A. Davydov
    Comments: 5 pages, 11 references, 2 tables
    Subjects: Information Theory (cs.IT)

    We consider the weight spectrum of a class of quasi-perfect binary linear
    codes with code distance 4. For example, extended Hamming code and Panchenko
    code are the known members of this class and it is known that Panchenko code
    has the minimal number of weight 4 codewords. We give exact recursive formulas
    for the weight spectrum of quasi-perfect codes and their dual codes. As an
    example of application of the weight spectrum we derive a lower estimate for
    the conditional probability of correction of erasure patterns of high weights
    (equal to or greater than code distance).

    Resource Allocation for a Full-Duplex Base Station Aided OFDMA System

    Yang You, Chong Qin, Yi Gong
    Subjects: Information Theory (cs.IT)

    Exploiting full-duplex (FD) technology on base stations (BSs) is a promising
    solution to enhancing the system performance. Motivated by this, we revisit a
    full-duplex base station (FD-BS) aided OFDMA system, which consists of one BS,
    several uplink/downlink users and multiple subcarriers. A joint 3-dimensional
    (3D) mapping scheme among subcarriers, down-link users (DUEs), uplink users
    (UUEs) is considered as well as an associated power allocation optimization. In
    detail, we first decompose the complex 3D mapping problem into three
    2-dimensional sub ones and solve them by using the iterative Hungarian method,
    respectively. Then based on the Lagrange dual method, we sequentially solve the
    power allocation and 3- dimensional mapping problem by fixing a dual point.
    Finally, the optimal solution can be obtained by utilizing the sub-gradient
    method. Unlike existing work that only solves either 3D mapping or power
    allocation problem but with a high computation complexity, we tackle both of
    them and have successfully reduced computation complexity from exponential to
    polynomial order. Numerical simulations are conducted to verify the proposed
    scheme.

    An Improved Receiving Scheme for Layered ACO-FOFDM in IM/DD Systems

    Mengqi Guo, Ji Zhou, Xizi Tang, Fan Hu, Jia Qi, Yaojun Qiao, Aiying Yang, Yueming Lu
    Comments: 7 pages, 6 figures
    Subjects: Information Theory (cs.IT)

    In this paper, an improved receiver is proposed to improve the bit error rate
    (BER) performance in layered asymmetrically clipped optical fast orthogonal
    frequency division multiplexing (ACO-FOFDM) for intensity-modulated and
    direct-detected (IM/DD) optical transmission systems. Layered ACO-FOFDM can
    compensate the weakness of traditional ACO-FOFDM in low spectral efficiency,
    the utilization of discrete cosine transform (DCT) in FOFDM system instead of
    fast Fourier transform (FFT) in OFDM system can reduce the computational
    complexity without any influence on BER performance. The BER performances of
    layered ACO-FOFDM system with improved receiver and DC-offset FOFDM (DCO-FOFDM)
    system with optimal DC-bias are compared at the same spectral efficiency.
    Simulation results show that under different optical bit energy to noise power
    ratios, layered ACO-FOFDM system with improved receiver has 2.86dB, 5.26dB and
    5.72dB BER performance advantages at forward error correction (FEC) limit over
    DCO-FOFDM system when the spectral efficiencies are 1 bit/s/Hz, 2 bits/s/Hz and
    3 bits/s/Hz, respectively. Therefore, the proposed scheme has low-cost property
    through the use of DCT, and is suitable for application in the adaptive IM/DD
    systems with zero DC-bias.

    Multi-user Precoding and Channel Estimation for Hybrid Millimeter Wave Systems

    Lou Zhao, Derrick Wing Kwan Ng, Jinhong Yuan
    Comments: 15 pages, accepted for publication, JSAC 2017
    Subjects: Information Theory (cs.IT)

    In this paper, we develop a low-complexity channel estimation for hybrid
    millimeter wave (mmWave) systems, where the number of radio frequency (RF)
    chains is much less than the number of antennas equipped at each transceiver.
    The proposed mmWave channel estimation algorithm first exploits multiple
    frequency tones to estimate the strongest angle-of-arrivals (AoAs) at both base
    station (BS) and user sides for the design of analog beamforming matrices. Then
    all the users transmit orthogonal pilot symbols to the BS along the directions
    of the estimated strongest AoAs in order to estimate the channel. The estimated
    channel will be adopted to design the digital zero-forcing (ZF) precoder at the
    BS for the multi-user downlink transmission. The proposed channel estimation
    algorithm is applicable to both nonsparse and sparse mmWave channel
    environments. Furthermore, we derive a tight achievable rate upper bound of the
    digital ZF precoding with the proposed channel estimation algorithm scheme. Our
    analytical and simulation results show that the proposed scheme obtains a
    considerable achievable rate of fully digital systems, where the number of RF
    chains equipped at each transceiver is equal to the number of antennas.
    Besides, by taking into account the effect of various types of errors, i.e.,
    random phase errors, transceiver analog beamforming errors, and equivalent
    channel estimation errors, we derive a closed-form approximation for the
    achievable rate of the considered scheme. We illustrate the robustness of the
    proposed channel estimation and multi-user downlink precoding scheme against
    the system imperfection.

    Waveform Optimization for Radio-Frequency Wireless Power Transfer

    Mohammad R. Vedady Moghadam, Yong Zeng, Rui Zhang
    Comments: Submitted for publication
    Subjects: Information Theory (cs.IT)

    In this paper, we study the waveform design problem for a single-input
    single-output (SISO) radio-frequency (RF) wireless power transfer (WPT) system
    in frequency-selective channels. First, based on the actual non-linear
    current-voltage model of the diode at the energy receiver, we derive a
    semi-closed-form expression for the deliverable DC voltage in terms of the
    incident RF signal and hence obtain the average harvested power. Next, by
    adopting a multisine waveform structure for the transmit signal of the energy
    transmitter, we jointly design the multisine signal amplitudes and phases
    overall frequency tones according to the channel state information (CSI) to
    maximize the deliverable DC voltage or harvested power. Although our formulated
    problem is non-convex and difficult to solve, we propose two suboptimal
    solutions to it, based on the frequency-domain maximal ratio transmission (MRT)
    principle and the sequential convex optimization (SCP) technique, respectively.
    Using various simulations, the performance gain of our solutions over the
    existing waveform designs is shown.

    Capacity Enhancement with Meta-Multiplexing

    Chunlin Ji, Ruopeng Liu, Daoben Li
    Subjects: Information Theory (cs.IT)

    Multiplexing services as a key communication technique to effectively combine
    multiple signals into one signal and transmit over a shared medium.
    Multiplexing can increase the channel capacity by requiring more resources on
    the transmission medium. For instance, the space-division multiplexing
    accomplished through the multiple-input multiple-output (MIMO) scheme achieves
    significant capacity increase by the realized parallel channel, but it requires
    expensive hardware resources. Here, we present a novel multiplexing
    methodology, named meta-multiplexing, which allows ordinary modulated signals
    overlap together to form a set of “artificial” parallel channels, meanwhile, it
    only requires similar resources as ordinary modulation schemes. We prove the
    capacity law for the meta-multiplexing system and disclose that under broad
    conditions, the capacity of a single channel increases linearly with the signal
    to noise ratio (SNR), which breaks the conventional logarithmic growth of the
    capacity over SNR. Numerous simulation studies verify the capacity law and
    demonstrate the high efficiency of meta-multiplexing. Through proof-of-concept
    hardware experiments, we tested the proposed method in communication practices
    and achieved a spectral efficiency of 81.7 bits/s/Hz over a single channel,
    which is significantly higher than the efficiency of any existing communication
    system.

    Generalized Rao Test for Decentralized Detection of an Uncooperative Target

    D. Ciuonzo, P. Salvo Rossi, P. Willett
    Comments: extended version of IEEE Signal Processing Letters
    Subjects: Information Theory (cs.IT)

    We tackle distributed detection of a non-cooperative target with a Wireless
    Sensor Network (WSN). When the target is present, sensors observe an (unknown)
    deterministic signal with attenuation depending on the distance between the
    sensor and the (unknown) target positions, embedded in symmetricand unimodal
    noise. The Fusion Center (FC) receives quantized sensor observations through
    error-prone Binary Symmetric Channels (BSCs) and is in charge of performing a
    more-accurate global decision. The resulting problem is a two-sided parameter
    testing with nuisance parameters (i.e. the target position) present only under
    the alternative hypothesis. After introducing the Generalized Likelihood Ratio
    Test (GLRT) for the problem, we develop a novel fusion rule corresponding to a
    Generalized Rao (G-Rao) test, based on Davies’ framework, to reduce the
    computational complexity. Also, a rationale for threshold-optimization is
    proposed and confirmed by simulations. Finally, the aforementioned rules are
    compared in terms of performance and computational complexity.

    GRAAD: Group Anonymous and Accountable D2D Communication in Mobile Networks

    Ruei-Hau Hsu, Jemin Lee, Tony Q.S. Quek, Jyh-Cheng Chen
    Comments: under submission to possible journal publications
    Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Device-to-Device (D2D) communication is mainly launched by the transmission
    requirements between devices for specific applications such as Proximity
    Services in Long-Term Evolution Advanced (LTE-A) networks, and each application
    will form a group of devices for the network-covered and network-absent D2D
    communications. Although there are many privacy concerns in D2D communication,
    they have not been well-addressed in current communication standards. This work
    introduces network-covered and network-absent authenticated key exchange
    protocols for D2D communications to guarantee accountable group anonymity,
    end-to-end security to network operators, as well as traceability and
    revocability for accounting and management requirements. We formally prove the
    security of those protocols, and also develop an analytic model to evaluate the
    quality of authentication protocols by authentication success rate in D2D
    communications. Besides, we implement the proposed protocols on android mobile
    devices to evaluate the computation costs of the protocols. We also evaluate
    the authentication success rate by the proposed analytic model and prove the
    correctness of the analytic model via simulation. Those evaluations show that
    the proposed protocols are feasible to the performance requirements of D2D
    communications.

    Negentropic Planar Symmetry Detector

    Agata Migalska, JP Lewis
    Comments: 25 pages, 18 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

    In this paper we observe that information theoretical concepts are valuable
    tools for extracting information from images and, in particular, information on
    image symmetries. It is shown that the problem of detecting reflectional and
    rotational symmetries in a two-dimensional image can be reduced to the problem
    of detecting point-symmetry and periodicity in one-dimensional negentropy
    functions. Based on these findings a detector of reflectional and rotational
    global symmetries in greyscale images is constructed. We discuss the importance
    of high precision in symmetry detection in applications arising from quality
    control and illustrate how the proposed method satisfies this requirement.
    Finally, a superior performance of our method to other existing methods,
    demonstrated by the results of a rigorous experimental verification, is an
    indication that our approach rooted in information theory is a promising
    direction in a development of a robust and widely applicable symmetry detector.

    Markov Chain Lifting and Distributed ADMM

    Guilherme França, José Bento
    Comments: This work was also selected for a talk at NIPS 2016, Optimization for Machine Learning Workshop (OPT 2016)
    Journal-ref: IEEE Signal Processing Letters (Volume: 24, Issue: 3, March 2017)
    Subjects: Machine Learning (stat.ML); Data Structures and Algorithms (cs.DS); Information Theory (cs.IT); Learning (cs.LG); Optimization and Control (math.OC)

    The time to converge to the steady state of a finite Markov chain can be
    greatly reduced by a lifting operation, which creates a new Markov chain on an
    expanded state space. For a class of quadratic objectives, we show an analogous
    behavior where a distributed ADMM algorithm can be seen as a lifting of
    Gradient Descent algorithm. This provides a deep insight for its faster
    convergence rate under optimal parameter tuning. We conjecture that this gain
    is always present, as opposed to the lifting of a Markov chain which sometimes
    only provides a marginal speedup.

    Maximum entropy sampling in complex networks

    Filippo Radicchi, Claudio Castellano
    Comments: 6 pages, 2 figures + Appendix
    Subjects: Physics and Society (physics.soc-ph); Information Theory (cs.IT)

    Many real-world systems are characterized by stochastic dynamical rules where
    a complex network of dependencies among individual elements probabilistically
    determines their state. Even with full knowledge of the network structure and
    of the stochastic rules of the dynamical process, the ability to predict system
    configurations is generally characterized by large uncertainty. Sampling a
    fraction of the nodes and deterministically observing their state may help to
    reduce the uncertainty about the unobserved nodes. However, choosing these
    points of observation with the goal of maximizing predictive power is a highly
    nontrivial task, depending on the nature of the stochastic process and on the
    structure of the underlying network. Here, we introduce a computationally
    efficient algorithm to determine quasi-optimal solutions for arbitrary
    stochastic processes defined on generic sparse topologies. We show that the
    method is effective for various processes on different substrates. We further
    show how the method can be fruitfully used to identify the best nodes to label
    in semi-supervised probabilistic classification algorithms.




沪ICP备19023445号-2号
友情链接