    arXiv Paper Daily: Wed, 14 Dec 2016

    我爱机器学习 (52ml.net), published 2016-12-14 00:00:00

    Neural and Evolutionary Computing

    Understanding the Impact of Precision Quantization on the Accuracy and Energy of Neural Networks

    Soheil Hashemi, Nicholas Anthony, Hokchhay Tann, R. Iris Bahar, Sherief Reda
    Comments: Accepted for conference proceedings in DATE17
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Deep neural networks are gaining in popularity as they are used to generate
    state-of-the-art results for a variety of computer vision and machine learning
    applications. At the same time, these networks have grown in depth and
    complexity in order to solve harder problems. Given the limitations in power
    budgets dedicated to these networks, the importance of low-power, low-memory
    solutions has been stressed in recent years. While a large number of dedicated
    hardware designs using different precisions have recently been proposed, there exists no
    comprehensive study of different bit precisions and arithmetic in both inputs
    and network parameters. In this work, we address this issue and perform a study
    of different bit-precisions in neural networks (from floating-point to
    fixed-point, powers of two, and binary). In our evaluation, we consider and
    analyze the effect of precision scaling on both network accuracy and hardware
    metrics including memory footprint, power and energy consumption, and design
    area. We also investigate training-time methodologies to compensate for the
    reduction in accuracy due to limited bit precision and demonstrate that in most
    cases, precision scaling can deliver significant benefits in design metrics at
    the cost of very modest decreases in network accuracy. In addition, we propose
    that a small portion of the benefits achieved when using lower precisions can
    be forfeited to increase the network size and therefore the accuracy. We
    evaluate our approach on three well-recognized networks and datasets to
    show its generality. We investigate the trade-offs and highlight the benefits
    of using lower precisions in terms of energy and memory footprint.
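
    A minimal numpy sketch (an editor's illustration, not the authors' code) of
    the three precision-scaling schemes surveyed above; the bit widths are
    arbitrary example values:

```python
import numpy as np

def quantize_fixed_point(w, int_bits=2, frac_bits=6):
    # Round to a signed fixed-point grid (sign + int_bits + frac_bits bits).
    scale = 2.0 ** frac_bits
    limit = 2.0 ** int_bits
    return np.clip(np.round(w * scale) / scale, -limit, limit)

def quantize_power_of_two(w):
    # Snap each weight to the nearest signed power of two (cheap shifts in hardware).
    return np.sign(w) * 2.0 ** np.round(np.log2(np.abs(w) + 1e-12))

def quantize_binary(w):
    # Binarize, keeping the mean magnitude as a per-tensor scale.
    return np.sign(w) * np.abs(w).mean()

w = np.random.randn(4, 4).astype(np.float32)
for q in (quantize_fixed_point, quantize_power_of_two, quantize_binary):
    print(q.__name__, "mean error:", np.abs(q(w) - w).mean())
```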

    Stacked Generative Adversarial Networks

    Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, Serge Belongie
    Comments: Under review
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    In this paper we aim to leverage the powerful bottom-up discriminative
    representations to guide a top-down generative model. We propose a novel
    generative model named Stacked Generative Adversarial Networks (SGAN), which is
    trained to invert the hierarchical representations of a discriminative
    bottom-up deep network. Our model consists of a top-down stack of GANs, each
    trained to generate “plausible” lower-level representations, conditioned on
    higher-level representations. A representation discriminator is introduced at
    each feature hierarchy to encourage the representation manifold of the
    generator to align with that of the bottom-up discriminative network, providing
    intermediate supervision. In addition, we introduce a conditional loss that
    encourages the use of conditional information from the layer above, and a novel
    entropy loss that maximizes a variational lower bound on the conditional
    entropy of generator outputs. To the best of our knowledge, the entropy loss is
    the first attempt to tackle the conditional model collapse problem that is
    common in conditional GANs. We first train each GAN of the stack independently,
    and then we train the stack end-to-end. Unlike the original GAN that uses a
    single noise vector to represent all the variations, our SGAN decomposes
    variations into multiple levels and gradually resolves uncertainties in the
    top-down generative process. Experiments demonstrate that SGAN is able to
    generate diverse and high-quality images, as well as being more interpretable
    than a vanilla GAN.

    Memcomputing Numerical Inversion with Self-Organizing Logic Gates

    Haik Manukian, Fabio L. Traversa, Massimiliano Di Ventra
    Subjects: Emerging Technologies (cs.ET); Neural and Evolutionary Computing (cs.NE)

    We propose to use Digital Memcomputing Machines (DMMs), implemented with
    self-organizing logic gates (SOLGs), to solve the problem of numerical
    inversion. Starting from fixed-point scalar inversion we describe the
    generalization to solving linear systems and matrix inversion. This method,
    when realized in hardware, will output the result in only one computational
    step. As an example, we perform simulations of the scalar case using a 5-bit
    logic circuit made of SOLGs, and show that the circuit successfully performs
    the inversion. Since this type of numerical inversion can be implemented by DMM
    units in hardware, it is scalable, and thus of great benefit to any real-time
    computing application.

    An Artificial Neural Networks based Temperature Prediction Framework for Network-on-Chip based Multicore Platform

    Sandeep Aswath Narayana
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE)

    Continuous improvement in silicon process technologies has made possible the
    integration of hundreds of cores on a single chip. However, power and heat have
    become dominant constraints in designing these massive multicore chips causing
    issues with reliability, timing variations and reduced lifetime of the chips.
    Dynamic Thermal Management (DTM) is a solution to avoid high temperatures on
    the die. Typical DTM schemes only address core level thermal issues. However,
    the Network-on-Chip (NoC) paradigm, which has emerged as an enabling
    methodology for integrating hundreds to thousands of cores on the same die,
    can contribute significantly to the thermal issues. Moreover, typical DTM
    is triggered reactively, based on temperature measurements from on-chip
    thermal sensors, and thus requires long reaction times, whereas predictive
    DTM estimates future temperature in advance, eliminating the chance of
    temperature overshoot. Artificial Neural Networks (ANNs) have been used in
    various domains for modeling and prediction with high accuracy due to their
    ability to learn and adapt. This thesis concentrates on designing an ANN
    prediction engine to
    predict the thermal profile of the cores and Network-on-Chip elements of the
    chip. This thermal profile of the chip is then used by the predictive DTM that
    combines both core level and network level DTM techniques. On-chip wireless
    interconnect which is recently envisioned to enable energy-efficient data
    exchange between cores in a multicore environment, will be used to provide a
    broadcast-capable medium to efficiently distribute thermal control messages to
    trigger and manage the DTM schemes.
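
    The prediction engine can be pictured with a small, purely hypothetical
    sketch: a feed-forward ANN that maps a window of recent temperature
    readings to the temperature several sampling intervals ahead. Synthetic
    data stands in for the NoC thermal traces; window and horizon values are
    assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Synthetic core-temperature trace standing in for on-chip sensor readings.
T = 60 + 10 * np.sin(np.linspace(0, 40, 2000)) + rng.normal(0, 0.5, 2000)

H, LEAD = 8, 4   # history window and prediction horizon (assumed values)
X = np.stack([T[i:i + H] for i in range(len(T) - H - LEAD)])
y = T[H + LEAD:]

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X[:1500], y[:1500])
print("test MAE (deg C):", np.abs(model.predict(X[1500:]) - y[1500:]).mean())
```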

    Theory and Tools for the Conversion of Analog to Spiking Convolutional Neural Networks

    Bodo Rueckauer, Iulia-Alexandra Lungu, Yuhuang Hu, Michael Pfeiffer
    Comments: 9 pages, 2 figures, presented at the workshop “Computing with Spikes” at NIPS 2016, Barcelona, Spain
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Deep convolutional neural networks (CNNs) have shown great potential for
    numerous real-world machine learning applications, but performing inference in
    large CNNs in real-time remains a challenge. We have previously demonstrated
    that traditional CNNs can be converted into deep spiking neural networks
    (SNNs), which exhibit similar accuracy while reducing both latency and
    computational load as a consequence of their data-driven, event-based style of
    computing. Here we provide a novel theory that explains why this conversion is
    successful, and derive from it several new tools to convert a larger and more
    powerful class of deep networks into SNNs. We identify the main sources of
    approximation errors in previous conversion methods, and propose simple
    mechanisms to fix these issues. Furthermore, we develop spiking implementations
    of common CNN operations such as max-pooling, softmax, and batch-normalization,
    which allow almost loss-less conversion of arbitrary CNN architectures into the
    spiking domain. Empirical evaluation of different network architectures on the
    MNIST and CIFAR10 benchmarks leads to the best SNN results reported to date.
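
    One standard ingredient of such ANN-to-SNN conversions is data-based weight
    normalization, which rescales each layer so that firing rates stay inside
    the simulator's dynamic range. The sketch below is an illustration of that
    idea, not the authors' toolbox; the 99.9th-percentile "robust max" is one
    variant discussed in the SNN-conversion literature.

```python
import numpy as np

def normalize_weights(weights, activations):
    """Rescale each layer of an analog (ReLU) network for spiking conversion.

    weights: list of per-layer weight arrays.
    activations: list of activation samples recorded per layer on training data.
    """
    normed, prev_scale = [], 1.0
    for W, act in zip(weights, activations):
        scale = np.percentile(act, 99.9)  # robust estimate of the layer's max activation
        normed.append(W * prev_scale / scale)
        prev_scale = scale
    return normed
```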

    Online Sequence-to-Sequence Reinforcement Learning for Open-Domain Conversational Agents

    Nabiha Asghar, Pascal Poupart, Jiang Xin, Hang Li
    Comments: 8 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    We propose an online, end-to-end, deep reinforcement learning technique to
    develop generative conversational agents for open-domain dialogue. We use a
    unique combination of offline two-phase supervised learning and online
    reinforcement learning with human users to train our agent. While most existing
    research proposes hand-crafted and developer-defined reward functions for
    reinforcement, we devise a novel reward mechanism based on a variant of Beam
    Search and one-character user-feedback at each step. Experiments show that our
    model, when trained on a small and shallow Seq2Seq network, successfully
    promotes the generation of meaningful, diverse and interesting responses, and
    can be used to train agents with customized personas and conversational styles.


    Computer Vision and Pattern Recognition

    Stacked Generative Adversarial Networks

    Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, Serge Belongie
    Comments: Under review
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    In this paper we aim to leverage the powerful bottom-up discriminative
    representations to guide a top-down generative model. We propose a novel
    generative model named Stacked Generative Adversarial Networks (SGAN), which is
    trained to invert the hierarchical representations of a discriminative
    bottom-up deep network. Our model consists of a top-down stack of GANs, each
    trained to generate “plausible” lower-level representations, conditioned on
    higher-level representations. A representation discriminator is introduced at
    each feature hierarchy to encourage the representation manifold of the
    generator to align with that of the bottom-up discriminative network, providing
    intermediate supervision. In addition, we introduce a conditional loss that
    encourages the use of conditional information from the layer above, and a novel
    entropy loss that maximizes a variational lower bound on the conditional
    entropy of generator outputs. To the best of our knowledge, the entropy loss is
    the first attempt to tackle the conditional model collapse problem that is
    common in conditional GANs. We first train each GAN of the stack independently,
    and then we train the stack end-to-end. Unlike the original GAN that uses a
    single noise vector to represent all the variations, our SGAN decomposes
    variations into multiple levels and gradually resolves uncertainties in the
    top-down generative process. Experiments demonstrate that SGAN is able to
    generate diverse and high-quality images, as well as being more interpretable
    than a vanilla GAN.

    Fast Patch-based Style Transfer of Arbitrary Style

    Tian Qi Chen, Mark Schmidt
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Learning (cs.LG)

    Artistic style transfer is an image synthesis problem where the content of an
    image is reproduced with the style of another. Recent works show that a
    visually appealing style transfer can be achieved by using the hidden
    activations of a pretrained convolutional neural network. However, existing
    methods either apply (i) an optimization procedure that works for any style
    image but is very expensive, or (ii) an efficient feedforward network that only
    allows a limited number of trained styles. In this work we propose a simpler
    optimization objective based on local matching that combines the content
    structure and style textures in a single layer of the pretrained network. We
    show that our objective has desirable properties such as a simpler optimization
    landscape, intuitive parameter tuning, and consistent frame-by-frame
    performance on video. Furthermore, we use 80,000 natural images and 80,000
    paintings to train an inverse network that approximates the result of the
    optimization. This results in a procedure for artistic style transfer that is
    efficient but also allows arbitrary content and style images.
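
    The local-matching objective can be approximated by a "style swap" on a
    single feature layer: every content patch is replaced by its best style
    patch under normalized cross-correlation. A rough numpy sketch, assuming
    (C, H, W) feature maps already extracted from the pretrained network:

```python
import numpy as np

def style_swap(content, style, patch=3):
    """content, style: (C, H, W) feature maps from the same network layer."""
    C, H, W = content.shape
    def patches(f):
        return np.stack([f[:, i:i + patch, j:j + patch].ravel()
                         for i in range(f.shape[1] - patch + 1)
                         for j in range(f.shape[2] - patch + 1)])
    cp, sp = patches(content), patches(style)
    sp_unit = sp / (np.linalg.norm(sp, axis=1, keepdims=True) + 1e-8)
    best = (cp @ sp_unit.T).argmax(axis=1)    # nearest style patch per location
    out = np.zeros_like(content)
    count = np.zeros((1, H, W))
    idx = 0
    for i in range(H - patch + 1):            # overlap-average the chosen patches
        for j in range(W - patch + 1):
            out[:, i:i + patch, j:j + patch] += sp[best[idx]].reshape(C, patch, patch)
            count[:, i:i + patch, j:j + patch] += 1
            idx += 1
    return out / count

c, s = np.random.randn(8, 12, 12), np.random.randn(8, 12, 12)
print(style_swap(c, s).shape)   # (8, 12, 12)
```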

    Saliency in VR: How do people explore virtual environments?

    Vincent Sitzmann, Ana Serrano, Amy Pavel, Maneesh Agrawala, Diego Gutierrez, Gordon Wetzstein
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Understanding how humans explore virtual environments is crucial for many
    applications, such as developing compression algorithms or designing effective
    cinematic virtual reality (VR) content, as well as to develop predictive
    computational models. We have recorded 780 head and gaze trajectories from 86
    users exploring omni-directional stereo panoramas using VR head-mounted
    displays. By analyzing the interplay between visual stimuli, head orientation,
    and gaze direction, we demonstrate patterns and biases of how people explore
    these panoramas and we present first steps toward predicting time-dependent
    saliency. To compare how visual attention and saliency in VR are different from
    conventional viewing conditions, we have also recorded users observing the same
    scenes in a desktop setup. Based on this data, we show how to adapt existing
    saliency predictors to VR, so that insights and tools developed for predicting
    saliency in desktop scenarios may directly transfer to these immersive
    applications.

    Compressive Image Recovery Using Recurrent Generative Model

    Akshat Dave, Anil Kumar Vadathya, Kaushik Mitra
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Generative models are considered the Swiss Army knives of data modelling. In
    this paper we leverage the recently proposed recurrent generative model, RIDE,
    for applications like image inpainting and compressive image reconstruction.
    Recurrent networks can model long range dependencies in images and hence are
    suitable to handle global multiplexing in reconstruction from compressive
    imaging. We perform MAP inference with RIDE as the prior, using
    backpropagation to the inputs and a projected gradient method. We propose
    an entropy-thresholding based approach to better preserve texture. Our
    approach shows comparable results on the image inpainting task, and
    superior results in compressive image reconstruction compared to the
    traditional methods D-AMP and TVAL3, which use a global prior of
    minimizing the TV norm.
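
    The inference scheme amounts to alternating a gradient ascent step on the
    prior's log-likelihood with a projection onto the measurement constraint
    {x : Ax = y}. A generic sketch, with a hand-written smoothness gradient
    standing in for backpropagation through RIDE:

```python
import numpy as np

def map_recover(A, y, grad_log_prior, steps=200, lr=1e-2):
    x = A.T @ y                                  # initial estimate
    AAinv = np.linalg.inv(A @ A.T)               # for projecting onto {x : Ax = y}
    for _ in range(steps):
        x = x + lr * grad_log_prior(x)           # ascend the prior's log-likelihood
        x = x - A.T @ (AAinv @ (A @ x - y))      # project back onto the measurements
    return x

# Toy run: a smoothness prior stands in for the recurrent model's gradient.
rng = np.random.default_rng(0)
A = rng.normal(size=(16, 64)) / 8.0              # compressive measurement matrix
x_true = np.sin(np.linspace(0, 3 * np.pi, 64))
y = A @ x_true
smooth_grad = lambda x: np.roll(x, 1) + np.roll(x, -1) - 2 * x
x_hat = map_recover(A, y, smooth_grad)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```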

    Automated Inference on Sociopsychological Impressions of Attractive Female Faces

    Xiaolin Wu, Xi Zhang, Chang Liu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This article is a sequel to our earlier paper [24]. Our main objective is to
    explore the potential of supervised machine learning in face-induced social
    computing and cognition, riding on the momentum of much heralded successes of
    face processing, analysis and recognition on the tasks of biometric-based
    identification. We present a case study of automated statistical inference on
    sociopsychological perceptions of female faces controlled for race,
    attractiveness, age and nationality. As in [24], our empirical evidence
    points to the possibility of teaching computer vision and machine learning
    algorithms, using example face images, to predict personality traits and
    behavioral predisposition.

    Spatial Pyramid Convolutional Neural Network for Social Event Detection in Static Image

    Reza Fuad Rachmadi, Keiichi Uchimura, Gou Koutaki
    Comments: in Proceeding of 11th International Student Conference on Advanced Science and Technology (ICAST) 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Social event detection in a static image is a very challenging problem, and
    it is very useful for internet-of-things applications including automatic
    photo organization, ads recommender systems, and image captioning. Several
    publications show that the variety of objects, scenes, and people in an
    image can make it very ambiguous for a system to decide which event occurs
    in the image. We propose a spatial pyramid configuration of a convolutional
    neural network (CNN) classifier for social event detection in a static
    image. By applying the spatial pyramid configuration to the CNN classifier,
    details in the image can be observed more accurately by the classifier. The
    USED dataset provided by Ahmad et al., which consists of two different
    image sets, EiMM and SED, is used to evaluate our proposed method. As a
    result, the average accuracy of our system outperforms the baseline method
    by 15% and 2%, respectively.

    Learning to Hash-tag Videos with Tag2Vec

    Aditya Singh, Saurabh Saini, Rajvi Shah, PJ Narayanan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)

    User-given tags or labels are valuable resources for semantic understanding
    of visual media such as images and videos. Recently, a new type of labeling
    mechanism known as hash-tags has become increasingly popular on social media
    sites. In this paper, we study the problem of generating relevant and useful
    hash-tags for short video clips. Traditional data-driven approaches for tag
    enrichment and recommendation use direct visual similarity for label transfer
    and propagation. We attempt to learn a direct low-cost mapping from videos to
    hash-tags using a two-step training process. We first employ a natural language
    processing (NLP) technique, skip-gram models with neural network training, to
    learn a low-dimensional vector representation of hash-tags (Tag2Vec) using a
    corpus of 10 million hash-tags. We then train an embedding function to map
    video features to the low-dimensional Tag2vec space. We learn this embedding
    for 29 categories of short video clips with hash-tags. A query video without
    any tag-information can then be directly mapped to the vector space of tags
    using the learned embedding and relevant tags can be found by performing a
    simple nearest-neighbor retrieval in the Tag2Vec space. We validate the
    relevance of the tags suggested by our system qualitatively and quantitatively
    with a user study.
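
    The retrieval step is plain nearest-neighbour search in the embedding
    space. A toy sketch with placeholder vectors (the real system would use
    the learned Tag2Vec embeddings and the trained video-to-Tag2Vec mapping):

```python
import numpy as np

def nearest_tags(video_vec, tag_vecs, tags, k=5):
    """Cosine nearest-neighbour lookup in the Tag2Vec space."""
    tag_unit = tag_vecs / np.linalg.norm(tag_vecs, axis=1, keepdims=True)
    sims = tag_unit @ (video_vec / np.linalg.norm(video_vec))
    return [tags[i] for i in np.argsort(-sims)[:k]]

rng = np.random.default_rng(1)
tags = [f"#tag{i}" for i in range(1000)]    # hypothetical hash-tag vocabulary
tag_vecs = rng.normal(size=(1000, 128))     # stand-in Tag2Vec embeddings
video_vec = rng.normal(size=128)            # output of the learned video embedding
print(nearest_tags(video_vec, tag_vecs, tags))
```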

    A Video-Based Method for Objectively Rating Ataxia

    Ronnachai Jaroensri, Amy Zhao, Guha Balakrishnan, Derek Lo, Jeremy Schmahmann, John Guttag, Fredo Durand
    Comments: 8 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    For many movement disorders, such as Parkinson’s and ataxia, disease
    progression is usually assessed visually by a clinician according to a
    numerical rating scale, or using questionnaires. These tests are subjective,
    time-consuming, and must be administered by a professional. We present an
    automated method for quantifying the severity of motion impairment in patients
    with ataxia, using only video recordings. We focus on videos of the
    finger-to-nose test, a common movement task used to assess ataxia progression
    during the course of routine clinical checkups.

    Our method uses pose estimation and optical flow techniques to track the
    motion of the patient’s hand in a video recording. We extract features that
    describe qualities of the motion such as speed and variation in performance.
    Using labels provided by an expert clinician, we build a supervised learning
    model that predicts severity according to the Brief Ataxia Rating Scale (BARS).
    Our model achieves a mean absolute error of 0.363 on a 0-4 scale and a
    prediction-label correlation of 0.835 in a leave-one-patient-out experiment.
    The accuracy of our system is comparable to the reported inter-rater
    correlation among clinicians assessing the finger-to-nose exam using a similar
    ataxia rating scale. This work demonstrates the feasibility of using videos to
    produce more objective and clinically useful measures of motor impairment.
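
    An illustrative sketch of the pipeline's last two stages, under assumed
    details: simple speed and variability features from a tracked hand
    trajectory, fed to a regressor against clinician-provided BARS scores
    (synthetic data throughout):

```python
import numpy as np
from sklearn.linear_model import Ridge

def motion_features(traj, fps=30.0):
    """traj: (T, 2) tracked hand positions per frame -> simple motion features."""
    velocity = np.diff(traj, axis=0) * fps
    speed = np.linalg.norm(velocity, axis=1)
    accel = np.diff(speed) * fps
    return np.array([speed.mean(), speed.std(), np.abs(accel).mean(), speed.max()])

rng = np.random.default_rng(0)
severities = rng.uniform(0, 4, 40)                    # stand-in BARS labels
trajs = [rng.normal(0, 1 + 0.5 * s, size=(120, 2)).cumsum(axis=0)
         for s in severities]                         # noisier motion <-> higher score
X = np.stack([motion_features(t) for t in trajs])
model = Ridge().fit(X, severities)
print("train MAE:", np.abs(model.predict(X) - severities).mean())
```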

    A Fast Keypoint Based Hybrid Method for Copy Move Forgery Detection

    Sunil Kumar, J. V. Desa, Shaktidev Mukherjee
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Copy move forgery detection in digital images has become a very popular
    research topic in the area of image forensics. Due to the availability of
    sophisticated image editing tools and ever increasing hardware capabilities, it
    has become an easy task to manipulate the digital images. Passive forgery
    detection techniques are more relevant as they can be applied without the prior
    information about the image in question. Block-based techniques are used to
    detect copy-move forgery, but suffer from large time complexity and
    sensitivity to affine operations like rotation and scaling. Keypoint-based
    approaches are used to detect forgery in large images, where significant
    post-processing operations like rotation and scaling are more likely. A
    hybrid approach is proposed using different methods for keypoint detection
    and description: Speeded Up Robust Features (SURF) are used to detect
    keypoints in the image, and Binary Robust Invariant Scalable Keypoints
    (BRISK) features are used to describe the features at these keypoints. The
    proposed method performs significantly better than the existing SURF-based
    forgery detection method in terms of detection speed, and is invariant to
    post-processing operations like rotation and scaling. It is also invariant
    to other commonly applied post-processing operations such as adding
    Gaussian noise and JPEG compression.
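
    A hedged OpenCV sketch of such a hybrid pipeline (the thresholds are
    assumptions, and SURF requires the opencv-contrib build): detect SURF
    keypoints, describe them with BRISK, match the image against itself with
    Hamming distance, and flag strong matches between spatially distant
    keypoints as copy-move candidates.

```python
import cv2
import numpy as np

img = cv2.imread("suspect.jpg", cv2.IMREAD_GRAYSCALE)      # hypothetical input

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)   # keypoint detection
keypoints = surf.detect(img, None)
brisk = cv2.BRISK_create()
keypoints, desc = brisk.compute(img, keypoints)            # binary description

matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(desc, desc, k=2)
suspicious = []
for pair in matches:
    if len(pair) < 2:
        continue
    m = pair[1]                       # pair[0] is the keypoint matched to itself
    p1 = np.array(keypoints[m.queryIdx].pt)
    p2 = np.array(keypoints[m.trainIdx].pt)
    # Assumed thresholds: similar descriptors, far apart in the image.
    if m.distance < 60 and np.linalg.norm(p1 - p2) > 40:
        suspicious.append((tuple(p1), tuple(p2)))
print(len(suspicious), "candidate copy-move correspondences")
```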

    Deep Convolutional Poses for Human Interaction Recognition in Monocular Videos

    Marcel Sheeny de Moraes, Sankha Mukherjee, Neil M Robertson
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Human interaction recognition is a challenging problem in computer vision and
    has been researched over the years due to its important applications. With the
    development of deep models for the human pose estimation problem, this work
    aims to verify the effectiveness of using the human pose in order to recognize
    the human interaction in monocular videos. The method is based on five
    steps: detect each person in the scene, track them, retrieve the human
    pose, extract features based on the pose, and finally recognize the
    interaction using a classifier. The Two-Person Interaction dataset was used
    to develop this methodology. Using a whole-sequence evaluation approach, it
    achieved an average accuracy of 87.56% across all interactions. Yun et al.
    achieved 91.10% on the same dataset; however, their methodology used a
    depth sensor to recognize the interactions. The methodology developed in
    this paper shows that an RGB camera can be as effective as depth cameras
    for recognizing the interaction between two persons, thanks to recent
    developments in deep models for estimating the human pose.

    Neural Networks with Manifold Learning for Diabetic Retinopathy Detection

    Arjun Raj Rajanna, Kamelia Aryafar, Rajeev Ramchandran, Christye Sisson, Ali Shokoufandeh, Raymond Ptucha
    Comments: Published in Proceedings of “IEEE Western NY Image & Signal Processing Workshop”
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Widespread outreach programs using remote retinal imaging have proven to
    decrease the risk from diabetic retinopathy, the leading cause of blindness in
    the US. However, this process still requires manual verification of image
    quality and grading of images for level of disease by a trained human grader,
    and will continue to be limited by the scarcity of such trained graders.
    Computer-aided diagnosis of retinal images has recently gained increasing
    attention in the machine learning community. In this paper, we introduce a set
    of neural networks for diabetic retinopathy classification of fundus retinal
    images. We evaluate the efficiency of the proposed classifiers in combination
    with preprocessing and augmentation steps on a sample dataset. Our experimental
    results show that neural networks in combination with preprocessing on the
    images can boost the classification accuracy on this dataset. Moreover, the
    proposed models are scalable and can be used in large scale datasets for
    diabetic retinopathy detection. The models introduced in this paper can be used
    to facilitate the diagnosis and speed up the detection process.

    Autoencoder-based holographic image restoration

    Tomoyoshi Shimobaba, Yutaka Endo, Ryuji Hirayama, Yuki Nagahama, Takayuki Takahashi, Takashi Nishitsuji, Takashi Kakue, Atsushi Shiraki, Naoki Takada, Nobuyuki Masuda, Tomoyoshi Ito
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)

    We propose a holographic image restoration method using an autoencoder, which
    is an artificial neural network. Because holographic reconstructed images are
    often contaminated by direct light, conjugate light, and speckle noise, the
    discrimination of reconstructed images may be difficult. In this paper, we
    demonstrate the restoration of reconstructed images from holograms that record
    page data in holographic memory and QR codes by using the proposed method.
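
    Abstractly, the restoration is a denoising autoencoder: train a network to
    map contaminated reconstructions back to clean page data. A generic Keras
    sketch under assumed shapes, noise model, and architecture (not the
    authors' network):

```python
import numpy as np
from tensorflow.keras import layers, models

# Synthetic stand-ins: binary "page data" targets and noise-contaminated inputs.
clean = (np.random.rand(2000, 32 * 32) > 0.5).astype("float32")
noisy = np.clip(clean + 0.4 * np.random.randn(2000, 32 * 32), 0.0, 1.0).astype("float32")

autoencoder = models.Sequential([
    layers.Input(shape=(32 * 32,)),
    layers.Dense(256, activation="relu"),         # encoder
    layers.Dense(32 * 32, activation="sigmoid"),  # decoder
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(noisy, clean, epochs=5, batch_size=64, verbose=0)

restored = autoencoder.predict(noisy[:4])         # denoised reconstructions
```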

    Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer

    Sergey Zagoruyko, Nikos Komodakis
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Attention plays a critical role in human visual experience. Furthermore, it
    has recently been demonstrated that attention can also play an important role
    in the context of applying artificial neural networks to a variety of tasks
    from fields such as computer vision and NLP. In this work we show that, by
    properly defining attention for convolutional neural networks, we can actually
    use this type of information in order to significantly improve the performance
    of a student CNN network by forcing it to mimic the attention maps of a
    powerful teacher network. To that end, we propose several novel methods of
    transferring attention, showing consistent improvement across a variety of
    datasets and convolutional neural network architectures.
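
    Activation-based attention maps and the transfer loss can be written in a
    few lines of PyTorch. The sketch below follows the construction described
    in the abstract (channel-wise aggregation, normalization, squared-error
    penalty between paired layers), with toy tensors in place of real networks:

```python
import torch
import torch.nn.functional as F

def attention_map(activations, p=2):
    """(B, C, H, W) activations -> (B, H*W) L2-normalized spatial attention."""
    amap = activations.abs().pow(p).mean(dim=1)     # aggregate over channels
    return F.normalize(amap.flatten(1), dim=1)

def attention_transfer_loss(student_acts, teacher_acts):
    """Sum of squared distances between attention maps at paired layers."""
    return sum((attention_map(s) - attention_map(t)).pow(2).mean()
               for s, t in zip(student_acts, teacher_acts))

student = [torch.randn(4, 16, 32, 32), torch.randn(4, 32, 16, 16)]
teacher = [torch.randn(4, 64, 32, 32), torch.randn(4, 128, 16, 16)]  # wider net
print(attention_transfer_loss(student, teacher))
```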

    3D fully convolutional networks for subcortical segmentation in MRI: A large-scale study

    J. Dolz, C. Desrosiers, I. Ben Ayed
    Comments: Submitted to the special issue of Neuroimage: “Brain Segmentation and Parcellation”
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This study investigates a 3D and fully convolutional neural network (CNN) for
    subcortical brain structure segmentation in MRI. 3D CNN architectures have been
    generally avoided due to their computational and memory requirements during
    inference. We address the problem via small kernels, allowing deeper
    architectures. We further model both local and global context by embedding
    intermediate-layer outputs in the final prediction, which encourages
    consistency between features extracted at different scales and embeds
    fine-grained information directly in the segmentation process. Our model is
    efficiently trained end-to-end on a graphics processing unit (GPU), in a single
    stage, exploiting the dense inference capabilities of fully convolutional
    networks.

    We performed comprehensive experiments over two publicly available data sets.
    First, we demonstrate state-of-the-art performance on the IBSR dataset.
    Then, we report a large-scale multi-site evaluation over 1112 unregistered
    subject data sets acquired from 17 different sites (ABIDE data set), with ages
    ranging from 7 to 64 years, showing that our method is robust to various
    acquisition protocols, demographics and clinical factors. Our method yielded
    segmentations that are highly consistent with a standard atlas-based approach,
    while running in a fraction of the time needed by atlas-based methods and
    avoiding registration/normalization steps. This makes it convenient for massive
    multi-site neuroanatomical imaging studies. To the best of our knowledge, our
    work is the first to study subcortical structure segmentation on such
    large-scale and heterogeneous data.

    Observation of dynamics inside an unlabeled live cell using bright-field photon microscopy: Evaluation of organelles' trajectories

    Renata Rychtarikova, Dalibor Stys
    Comments: 12 pages, 5 figures, supplementary data
    Subjects: Quantitative Methods (q-bio.QM); Computer Vision and Pattern Recognition (cs.CV); Cell Behavior (q-bio.CB); Subcellular Processes (q-bio.SC)

    This article presents an algorithm for evaluating the movements of
    organelles inside an unmodified live cell. We used a time-lapse image
    series obtained using wide-field bright-field photon transmission
    microscopy as the algorithm's input. The benefit of the algorithm is its
    application of the Rényi information entropy, namely a variable called the
    point information gain, which makes it possible to highlight the borders
    of the intracellular organelles and to localize the organelles’ centers of
    mass with one-pixel precision.
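
    An illustrative numpy sketch of a point information gain: the change in
    Rényi entropy of the image's intensity histogram when a single pixel is
    removed, computed per intensity value and broadcast back to a per-pixel
    map (α = 2 is an assumed choice):

```python
import numpy as np

def renyi_entropy(hist, alpha=2.0):
    p = hist / hist.sum()
    p = p[p > 0]
    return np.log2((p ** alpha).sum()) / (1.0 - alpha)

def point_information_gain(img, alpha=2.0):
    """Per-pixel change in Renyi entropy when that pixel is left out."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    h_full = renyi_entropy(hist, alpha)
    pig = np.zeros(256)
    for v in range(256):
        if hist[v] == 0:
            continue
        reduced = hist.copy()
        reduced[v] -= 1                    # histogram without one pixel of value v
        pig[v] = renyi_entropy(reduced, alpha) - h_full
    return pig[img]                        # lookup: same gain for equal intensities

img = (np.random.rand(64, 64) * 256).astype(np.uint8)
pig_map = point_information_gain(img)      # atypical pixels (borders) stand out
```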

    Theory and Tools for the Conversion of Analog to Spiking Convolutional Neural Networks

    Bodo Rueckauer, Iulia-Alexandra Lungu, Yuhuang Hu, Michael Pfeiffer
    Comments: 9 pages, 2 figures, presented at the workshop “Computing with Spikes” at NIPS 2016, Barcelona, Spain
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Deep convolutional neural networks (CNNs) have shown great potential for
    numerous real-world machine learning applications, but performing inference in
    large CNNs in real-time remains a challenge. We have previously demonstrated
    that traditional CNNs can be converted into deep spiking neural networks
    (SNNs), which exhibit similar accuracy while reducing both latency and
    computational load as a consequence of their data-driven, event-based style of
    computing. Here we provide a novel theory that explains why this conversion is
    successful, and derive from it several new tools to convert a larger and more
    powerful class of deep networks into SNNs. We identify the main sources of
    approximation errors in previous conversion methods, and propose simple
    mechanisms to fix these issues. Furthermore, we develop spiking implementations
    of common CNN operations such as max-pooling, softmax, and batch-normalization,
    which allow almost loss-less conversion of arbitrary CNN architectures into the
    spiking domain. Empirical evaluation of different network architectures on the
    MNIST and CIFAR10 benchmarks leads to the best SNN results reported to date.


    Artificial Intelligence

    Towards Adaptive Training of Agent-based Sparring Partners for Fighter Pilots

    Brett W. Israelsen, Nisar Ahmed, Kenneth Center, Roderick Green, Winston Bennett Jr
    Comments: submitted copy
    Journal-ref: SciTech 2017, paper 2545524
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)

    A key requirement for the current generation of artificial decision-makers is
    that they should adapt well to changes in unexpected situations. This paper
    addresses the situation in which an AI for aerial dog fighting, with tunable
    parameters that govern its behavior, must optimize behavior with respect to an
    objective function that is evaluated and learned through simulations. Bayesian
    optimization with a Gaussian Process surrogate is used as the method for
    investigating the objective function. One key benefit is that during
    optimization, the Gaussian Process learns a global estimate of the true
    objective function, with predicted outcomes and a statistical measure of
    confidence in areas that haven’t been investigated yet. Having a model of the
    objective function is important for being able to understand possible outcomes
    in the decision space; for example, this is crucial for training and providing
    feedback to human pilots. However, standard Bayesian optimization does not
    perform consistently or provide an accurate Gaussian Process surrogate function
    for highly volatile objective functions. We treat these problems by introducing
    a novel sampling technique called Hybrid Repeat/Multi-point Sampling. This
    technique gives the AI the ability to learn optimal behaviors in a highly uncertain
    environment. More importantly, it not only improves the reliability of the
    optimization, but also creates a better model of the entire objective surface.
    With this improved model the agent is equipped to more accurately/efficiently
    predict performance in unexplored scenarios.
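
    A hedged sketch of the loop described above: GP-surrogate Bayesian
    optimization in which each selected design point is re-simulated several
    times (the "repeat" half of Hybrid Repeat/Multi-point Sampling), with a
    noisy toy objective standing in for the dogfight simulation and the
    multi-point half reduced to random candidate sets:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

def noisy_objective(x):            # stand-in for one stochastic simulation run
    return np.sin(3 * x) + 0.5 * np.random.randn()

rng = np.random.default_rng(0)
gp = GaussianProcessRegressor(kernel=Matern() + WhiteKernel(), normalize_y=True)
X, y = [], []
for it in range(15):
    cand = rng.uniform(0, 2 * np.pi, 256).reshape(-1, 1)   # random candidate designs
    if X:
        gp.fit(np.array(X).reshape(-1, 1), np.array(y))
        mu, sd = gp.predict(cand, return_std=True)
        x_next = float(cand[np.argmax(mu + 1.5 * sd)])     # UCB acquisition
    else:
        x_next = float(cand[0])
    for _ in range(5):             # repeat sampling: re-simulate the same design
        X.append(x_next)
        y.append(noisy_objective(x_next))
print("best sampled design:", X[int(np.argmax(y))])
```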

    Algorithms for Graph-Constrained Coalition Formation in the Real World

    Filippo Bistaffa, Alessandro Farinelli, Jesús Cerquides, Juan A. Rodríguez-Aguilar, Sarvapali D. Ramchurn
    Comments: Accepted for publication, cite as “in press”
    Journal-ref: ACM Transactions on Intelligent Systems and Technology, 2017,
    Volume 8, Issue 4
    Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)

    Coalition formation typically involves the coming together of multiple,
    heterogeneous, agents to achieve both their individual and collective goals. In
    this paper, we focus on a special case of coalition formation known as
    Graph-Constrained Coalition Formation (GCCF) whereby a network connecting the
    agents constrains the formation of coalitions. We focus on this type of problem
    given that in many real-world applications, agents may be connected by a
    communication network or only trust certain peers in their social network. We
    propose a novel representation of this problem based on the concept of edge
    contraction, which allows us to model the search space induced by the GCCF
    problem as a rooted tree. Then, we propose an anytime solution algorithm
    (CFSS), which is particularly efficient when applied to a general class of
    characteristic functions called (m+a) functions. Moreover, we show how CFSS can
    be efficiently parallelised to solve GCCF using a non-redundant partition of
    the search space. We benchmark CFSS on both synthetic and realistic scenarios,
    using a real-world dataset consisting of the energy consumption of a large
    number of households in the UK. Our results show that, in the best case, the
    serial version of CFSS is 4 orders of magnitude faster than the state of the
    art, while the parallel version is 9.44 times faster than the serial version on
    a 12-core machine. Moreover, CFSS is the first approach to provide anytime
    approximate solutions with quality guarantees for very large systems of agents
    (i.e., with more than 2700 agents).
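
    The edge-contraction representation is easy to picture with networkx:
    contracting an edge merges two connected agents into a single node that
    stands for their coalition, and repeated contractions trace out the rooted
    search tree. A toy illustration:

```python
import networkx as nx

# Agents connected by trust/communication links.
G = nx.Graph([("a", "b"), ("b", "c"), ("c", "d")])

# Contract (a, b): the merged node now stands for the coalition {a, b},
# and the remaining edges define which further merges are feasible.
G2 = nx.contracted_edge(G, ("a", "b"), self_loops=False)
print(sorted(G2.nodes()))   # ['a', 'c', 'd'] -- 'a' absorbs 'b'
print(sorted(G2.edges()))   # [('a', 'c'), ('c', 'd')]
```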

    Application of Advanced Record Linkage Techniques for Complex Population Reconstruction

    Peter Christen
    Comments: 12 pages
    Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)

    Record linkage is the process of identifying records that refer to the same
    entities from several databases. This process is challenging because commonly
    no unique entity identifiers are available. Linkage therefore has to rely on
    partially identifying attributes, such as names and addresses of people. Recent
    years have seen the development of novel techniques for linking data from
    diverse application areas, where a major focus has been on linking complex data
    that contain records about different types of entities. Advanced approaches
    that exploit both the similarities between record attributes as well as the
    relationships between entities to identify clusters of matching records have
    been developed.

    In this application paper we study the novel problem where rather than
    different types of entities we have databases where the same entity can have
    different roles, and where these roles change over time. We specifically
    develop novel techniques for linking historical birth, death, marriage and
    census records with the aim to reconstruct the population covered by these
    records over a period of several decades. Our experimental evaluation on real
    Scottish data shows that even with advanced linkage techniques that consider
    group, relationship, and temporal aspects it is challenging to achieve high
    quality linkage from such complex data.
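
    For flavor, a minimal attribute-similarity linkage sketch using only the
    standard library; real systems of the kind discussed add blocking plus the
    group, relationship, and temporal evidence the paper evaluates. The names,
    weights, and threshold are illustrative assumptions.

```python
from difflib import SequenceMatcher

def sim(a, b):
    """Approximate string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def link(records_a, records_b, threshold=0.8):
    matches = []
    for i, ra in enumerate(records_a):
        for j, rb in enumerate(records_b):
            score = 0.6 * sim(ra["name"], rb["name"]) + \
                    0.4 * sim(ra["address"], rb["address"])
            if score >= threshold:
                matches.append((i, j, round(score, 3)))
    return matches

births = [{"name": "Isabella McLeod", "address": "12 High St, Leith"}]
census = [{"name": "Isabela MacLeod", "address": "12 High Street, Leith"},
          {"name": "John Smith", "address": "4 Canal Rd"}]
print(link(births, census))
```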

    Proceedings of the The First Workshop on Verification and Validation of Cyber-Physical Systems

    Mehdi Kargahi (University of Tehran), Ashutosh Trivedi (University of Colorado Boulder)
    Journal-ref: EPTCS 232, 2016
    Subjects: Systems and Control (cs.SY); Artificial Intelligence (cs.AI); Robotics (cs.RO)

    The first International Workshop on Verification and Validation of
    Cyber-Physical Systems (V2CPS-16) was held in conjunction with the 12th
    International Conference on integration of Formal Methods (iFM 2016) in
    Reykjavik, Iceland. The purpose of V2CPS-16 was to bring together researchers
    and experts of the fields of formal verification and cyber-physical systems
    (CPS) to cover the theme of this workshop, namely a wide spectrum of
    verification and validation methods including (but not limited to) control,
    simulation, formal methods, etc.

    A CPS is an integration of networked computational and physical processes
    with meaningful inter-effects; the former monitors, controls, and affects the
    latter, while the latter also impacts the former. CPSs have applications in a
    wide-range of systems spanning robotics, transportation, communication,
    infrastructure, energy, and manufacturing. Many safety-critical systems such as
    chemical processes, medical devices, aircraft flight control, and automotive
    systems, are indeed CPS. The advanced capabilities of CPS require complex
    software and synthesis algorithms, which are hard to verify. In fact, many
    problems in this area are undecidable. Thus, a major step is to find particular
    abstractions of such systems which might be algorithmically verifiable
    regarding specific properties of such systems, describing the partial/overall
    behaviors of CPSs.

    Hybrid Repeat/Multi-point Sampling for Highly Volatile Objective Functions

    Brett Israelsen, Nisar Ahmed
    Journal-ref: BayesOpt Workshop, NIPS 2016
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)

    A key drawback of the current generation of artificial decision-makers is
    that they do not adapt well to changes in unexpected situations. This paper
    addresses the situation in which an AI for aerial dog fighting, with tunable
    parameters that govern its behavior, will optimize behavior with respect to an
    objective function that must be evaluated and learned through simulations. Once
    this objective function has been modeled, the agent can then choose its desired
    behavior in different situations. Bayesian optimization with a Gaussian Process
    surrogate is used as the method for investigating the objective function. One
    key benefit is that during optimization the Gaussian Process learns a global
    estimate of the true objective function, with predicted outcomes and a
    statistical measure of confidence in areas that haven’t been investigated yet.
    However, standard Bayesian optimization does not perform consistently or
    provide an accurate Gaussian Process surrogate function for highly volatile
    objective functions. We treat these problems by introducing a novel sampling
    technique called Hybrid Repeat/Multi-point Sampling. This technique gives the
    AI the ability to learn optimal behaviors in a highly uncertain environment. More
    importantly, it not only improves the reliability of the optimization, but also
    creates a better model of the entire objective surface. With this improved
    model the agent is equipped to better adapt behaviors.

    Online Sequence-to-Sequence Reinforcement Learning for Open-Domain Conversational Agents

    Nabiha Asghar, Pascal Poupart, Jiang Xin, Hang Li
    Comments: 8 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    We propose an online, end-to-end, deep reinforcement learning technique to
    develop generative conversational agents for open-domain dialogue. We use a
    unique combination of offline two-phase supervised learning and online
    reinforcement learning with human users to train our agent. While most existing
    research proposes hand-crafted and developer-defined reward functions for
    reinforcement, we devise a novel reward mechanism based on a variant of Beam
    Search and one-character user-feedback at each step. Experiments show that our
    model, when trained on a small and shallow Seq2Seq network, successfully
    promotes the generation of meaningful, diverse and interesting responses, and
    can be used to train agents with customized personas and conversational styles.

    Context-aware Sentiment Word Identification: sentiword2vec

    Yushi Yao, Guangjian Li
    Comments: 15 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Traditional sentiment analysis often uses a sentiment dictionary to extract
    sentiment information from text and classify documents. However, emerging
    informal words and phrases in user-generated content call for context-aware
    analysis, since such words often carry special meanings in a particular
    context. Because of their strong performance in representing inter-word
    relations, we use sentiment word vectors to identify these special words.
    Building on the distributed language model word2vec, in this paper we
    present a novel method for representing the sentiment of a word under a
    particular context; specifically, we identify words with abnormal sentiment
    polarity in long answers. Results show that the improved model performs
    better at representing words with special meanings, while still doing well
    on idiomatic patterns. Finally, we discuss what the vectors represent in
    the sentiment domain, which may differ from general object-based settings.


    Information Retrieval

    Information Extraction with Character-level Neural Networks and Noisy Supervision

    Philipp Meerkamp (Bloomberg LP), Zhengyi Zhou (AT&T Labs Research)
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)

    We present an architecture for information extraction from text that augments
    an existing parser with a character-level neural network. To train the neural
    network, we compute a measure of consistency of extracted data with existing
    databases, and use it as a form of noisy supervision. Our architecture combines
    the ability of constraint-based information extraction systems to easily
    incorporate domain knowledge and constraints with the ability of deep neural
    networks to leverage large amounts of data to learn complex features. The
    system led to large improvements over a mature and highly tuned
    constraint-based information extraction system used at Bloomberg for financial
    language text. At the same time, the new system massively reduces the
    development effort, allowing rule-writers to write high-recall constraints
    while relying on the deep neural network to remove false positives and boost
    precision.


    Computation and Language

    Building Large Machine Reading-Comprehension Datasets using Paragraph Vectors

    Radu Soricut, Nan Ding
    Comments: 10 pages
    Subjects: Computation and Language (cs.CL)

    We present a dual contribution to the task of machine reading-comprehension:
    a technique for creating large-sized machine-comprehension (MC) datasets using
    paragraph-vector models; and a novel, hybrid neural-network architecture that
    combines the representation power of recurrent neural networks with the
    discriminative power of fully-connected multi-layered networks. We use the
    MC-dataset generation technique to build a dataset of around 2 million
    examples, for which we empirically determine the high-ceiling of human
    performance (around 91% accuracy), as well as the performance of a variety of
    computer models. Among all the models we have experimented with, our hybrid
    neural-network architecture achieves the highest performance (83.2% accuracy).
    The remaining gap to the human-performance ceiling provides enough room for
    future model improvements.

    Multi-Perspective Context Matching for Machine Comprehension

    Zhiguo Wang, Haitao Mi, Wael Hamza, Radu Florian
    Comments: 8
    Subjects: Computation and Language (cs.CL)

    Previous machine comprehension (MC) datasets are either too small to train
    end-to-end deep learning models, or not difficult enough to evaluate the
    ability of current MC techniques. The newly released SQuAD dataset alleviates
    these limitations, and gives us a chance to develop more realistic MC models.
    Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM)
    model, which is an end-to-end system that directly predicts the answer
    beginning and ending points in a passage. Our model first adjusts each
    word-embedding vector in the passage by multiplying a relevancy weight computed
    against the question. Then, we encode the question and weighted passage by
    using bi-directional LSTMs. For each point in the passage, our model matches
    the context of this point against the encoded question from multiple
    perspectives and produces a matching vector. Given those matched vectors, we
    employ another bi-directional LSTM to aggregate all the information and predict
    the beginning and ending points. Experimental results on the test set of SQuAD
    show that our model achieves a competitive result on the leaderboard.
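
    The matching operation at the core of MPCM compares two vectors through l
    weighted cosine similarities, one per "perspective" (a row of a trainable
    matrix W). A numpy sketch of that single operation:

```python
import numpy as np

def multi_perspective_match(v1, v2, W):
    """v1, v2: (d,) vectors to compare; W: (l, d) perspective weights.
    Returns an (l,) matching vector of weighted cosine similarities."""
    a, b = W * v1, W * v2    # re-weight each dimension, one row per perspective
    return (a * b).sum(axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-8)

rng = np.random.default_rng(0)
d, l = 100, 20
m = multi_perspective_match(rng.normal(size=d), rng.normal(size=d),
                            rng.normal(size=(l, d)))
print(m.shape)   # (20,): one similarity per perspective
```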

    Models of retrieval in sentence comprehension: A computational evaluation using Bayesian hierarchical modeling

    Bruno Nicenboim, Shravan Vasishth
    Subjects: Computation and Language (cs.CL); Applications (stat.AP); Machine Learning (stat.ML)

    Research on interference has provided evidence that the formation of
    dependencies between non-adjacent words relies on a cue-based retrieval
    mechanism. Two different models can account for one of the main predictions of
    interference, i.e., a slowdown at a retrieval site, when several items share a
    feature associated with a retrieval cue: Lewis and Vasishth’s (2005)
    activation-based model and McElree’s (2000) direct access model. Even though
    these two models have been used almost interchangeably, they are based on
    different assumptions and predict differences in the relationship between
    reading times and response accuracy. The activation-based model follows the
    assumptions of ACT-R, and its retrieval process behaves as a lognormal race
    between accumulators of evidence with a single variance. Under this model,
    accuracy of the retrieval is determined by the winner of the race and retrieval
    time by its rate of accumulation. In contrast, the direct access model assumes
    a model of memory where only the probability of retrieval varies between items;
    in this model, differences in latencies are a by-product of the possibility of
    repairing incorrect retrievals. We implemented both models in a Bayesian
    hierarchical framework in order to evaluate them and compare them. We show that
    some aspects of the data are better fit under the direct access model than
    under the activation-based model. We suggest that this finding does not rule
    out the possibility that retrieval may be behaving as a race model with
    assumptions that follow less closely the ones from the ACT-R framework. We show
    that by introducing a modification of the activation model, i.e., by
    assuming that the accumulation of evidence for retrieval of incorrect items
    is not only slower but noisier (i.e., different variances for the correct
    and incorrect items), the model can provide a fit as good as that of the
    direct access model.
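
    The activation-based account's key mechanism, a lognormal race with a
    shared variance, is easy to simulate; the sketch below (with assumed
    parameters) shows how accuracy and latency become coupled through which
    accumulator finishes first.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
mu_target, mu_distractor, sigma = 5.0, 5.3, 0.4   # assumed log-latency parameters

t_target = rng.lognormal(mu_target, sigma, n)          # correct-item accumulator
t_distractor = rng.lognormal(mu_distractor, sigma, n)  # competing-item accumulator

correct = t_target < t_distractor       # retrieval accuracy: who wins the race
latency = np.minimum(t_target, t_distractor)
print("accuracy:", correct.mean())
print("mean latency, correct vs. error:",
      latency[correct].mean(), latency[~correct].mean())
```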

    Information Extraction with Character-level Neural Networks and Noisy Supervision

    Philipp Meerkamp (Bloomberg LP), Zhengyi Zhou (AT&T Labs Research)
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)

    We present an architecture for information extraction from text that augments
    an existing parser with a character-level neural network. To train the neural
    network, we compute a measure of consistency of extracted data with existing
    databases, and use it as a form of noisy supervision. Our architecture combines
    the ability of constraint-based information extraction systems to easily
    incorporate domain knowledge and constraints with the ability of deep neural
    networks to leverage large amounts of data to learn complex features. The
    system led to large improvements over a mature and highly tuned
    constraint-based information extraction system used at Bloomberg for financial
    language text. At the same time, the new system massively reduces the
    development effort, allowing rule-writers to write high-recall constraints
    while relying on the deep neural network to remove false positives and boost
    precision.

    Vicinity-Driven Paragraph and Sentence Alignment for Comparable Corpora

    Gustavo Henrique Paetzold, Lucia Specia
    Subjects: Computation and Language (cs.CL)

    Parallel corpora have driven great progress in the field of Text
    Simplification. However, most sentence alignment algorithms either support
    only a limited range of alignment types, or simply ignore valuable clues
    present in comparable documents. We address this problem by introducing a
    new set of flexible vicinity-driven paragraph and sentence alignment
    algorithms that capture 1-N, N-1, N-N and long-distance null alignments
    without the need for hard-to-replicate supervised models.
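
    To make the vicinity idea concrete, here is a deliberately simplified
    sketch (the paper's algorithms also operate at the paragraph level and
    handle 1-N/N-1/N-N merges): each source sentence is matched only within a
    window around the previous alignment, and positions with no sufficiently
    similar target receive a null alignment.

```python
def vicinity_align(src, tgt, sim, window=3, threshold=0.3):
    alignments, last = [], 0
    for i, s in enumerate(src):
        lo, hi = max(0, last - window), min(len(tgt), last + window + 1)
        best_score, best_j = max((sim(s, tgt[j]), j) for j in range(lo, hi))
        if best_score >= threshold:
            alignments.append((i, best_j))
            last = best_j + 1
        else:
            alignments.append((i, None))      # null alignment
    return alignments

# Toy similarity: Jaccard overlap of word sets.
jaccard = lambda a, b: (len(set(a.split()) & set(b.split()))
                        / len(set(a.split()) | set(b.split())))
src = ["the cat sat", "it was happy", "end of story"]
tgt = ["the cat sat down", "end of the story"]
print(vicinity_align(src, tgt, jaccard))   # [(0, 0), (1, None), (2, 1)]
```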

    Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints

    Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson
    Journal-ref: ICASSP 2017
    Subjects: Computation and Language (cs.CL)

    Mismatched transcriptions have been proposed as a means to acquire
    probabilistic transcriptions from non-native speakers of a language. Prior
    work has demonstrated the value of these transcriptions by successfully
    adapting cross-lingual ASR systems for different target languages. In this
    work, we describe two techniques to refine these probabilistic
    transcriptions: a noisy-channel model of non-native phone misperception is
    trained using a recurrent neural network, and decoded using
    minimally-resourced language-dependent pronunciation constraints. Both
    innovations improve the quality of the transcript, and both reduce the
    phone error rate of a trained ASR, by 7% and 9% respectively.

    Evaluating Automatic Speech Recognition Systems in Comparison With Human Perception Results Using Distinctive Feature Measures

    Xiang Kong, Jeung-Yoon Choi, Stefanie Shattuck-Hufnagel
    Comments: ICASSP 2017
    Journal-ref: ICASSP 2017
    Subjects: Computation and Language (cs.CL)

    This paper describes methods for evaluating automatic speech recognition
    (ASR) systems in comparison with human perception results, using measures
    derived from linguistic distinctive features. Error patterns in terms of
    manner, place and voicing are presented, along with an examination of confusion
    matrices via a distinctive-feature-distance metric. These evaluation methods
    contrast with conventional performance criteria that focus on the phone or word
    level, and are intended to provide a more detailed profile of ASR system
    performance, as well as a means for direct comparison with human perception
    results at the sub-phonemic level.
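
    A toy sketch of a distinctive-feature-distance metric: phones are coded as
    binary feature vectors, and a confusion is scored by how many features it
    flips, so a voicing-only error costs less than a manner-plus-place error.
    The feature table is a small illustrative fragment, not the paper's
    inventory.

```python
FEATURES = {            # (voiced, nasal, continuant, labial, coronal)
    "p": (0, 0, 0, 1, 0),
    "b": (1, 0, 0, 1, 0),
    "m": (1, 1, 0, 1, 0),
    "t": (0, 0, 0, 0, 1),
    "n": (1, 1, 0, 0, 1),
    "s": (0, 0, 1, 0, 1),
}

def feature_distance(a, b):
    """Number of distinctive features that differ between two phones."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))

for ref, hyp in [("p", "b"), ("p", "t"), ("p", "n")]:
    print(f"/{ref}/ -> /{hyp}/: distance {feature_distance(ref, hyp)}")
```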

    ConceptNet 5.5: An Open Multilingual Graph of General Knowledge

    Robert Speer, Joshua Chin, Catherine Havasi
    Subjects: Computation and Language (cs.CL)

    Machine learning about language can be improved by supplying it with specific
    knowledge and sources of external information. We present here a new version of
    the linked open data resource ConceptNet that is particularly well suited to be
    used with modern NLP techniques such as word embeddings.

    ConceptNet is a knowledge graph that connects words and phrases of natural
    language with labeled edges. Its knowledge is collected from many sources that
    include expert-created resources, crowd-sourcing, and games with a purpose. It
    is designed to represent the general knowledge involved in understanding
    language, improving natural language applications by allowing the application
    to better understand the meanings behind the words people use.

    When ConceptNet is combined with word embeddings acquired from distributional
    semantics (such as word2vec), it provides applications with understanding that
    they would not acquire from distributional semantics alone, nor from narrower
    resources such as WordNet or DBPedia. We demonstrate this with state-of-the-art
    results on intrinsic evaluations of word relatedness that translate into
    improvements on applications of word vectors, including solving SAT-style
    analogies.
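
    One standard way to combine a knowledge graph with distributional vectors
    is retrofitting (Faruqui et al., 2015), which nudges each word vector
    toward its graph neighbours; the sketch below illustrates that general
    idea. ConceptNet 5.5's own combination method differs in detail.

```python
import numpy as np

def retrofit(vectors, edges, iters=10, alpha=1.0, beta=1.0):
    """Nudge each vector toward its graph neighbours (Faruqui et al., 2015).

    vectors: {word: np.ndarray}; edges: {word: [neighbour, ...]}.
    """
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iters):
        for word, nbrs in edges.items():
            nbrs = [n for n in nbrs if n in new]
            if not nbrs:
                continue
            nbr_sum = sum(new[n] for n in nbrs)
            new[word] = (alpha * vectors[word] + beta * nbr_sum) / (alpha + beta * len(nbrs))
    return new

rng = np.random.default_rng(0)
vectors = {w: rng.normal(size=50) for w in ("cup", "mug", "dog")}
edges = {"cup": ["mug"], "mug": ["cup"]}     # toy RelatedTo edges
out = retrofit(vectors, edges)
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos(vectors["cup"], vectors["mug"]), "->", cos(out["cup"], out["mug"]))
```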

    Tracking the World State with Recurrent Entity Networks

    Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun
    Subjects: Computation and Language (cs.CL)

    We introduce a new model, the Recurrent Entity Network (EntNet). It is
    equipped with a dynamic long-term memory which allows it to maintain and update
    a representation of the state of the world as it receives new data. For
    language understanding tasks, it can reason on-the-fly as it reads text, not
    just when it is required to answer a question or respond as is the case for a
    Memory Network (Sukhbaatar et al., 2015). Like a Neural Turing Machine or
    Differentiable Neural Computer (Graves et al., 2014; 2016) it maintains a fixed
    size memory and can learn to perform location and content-based read and write
    operations. However, unlike those models it has a simple parallel architecture
    in which several memory locations can be updated simultaneously. The EntNet
    sets a new state-of-the-art on the bAbI tasks, and is the first method to solve
    all the tasks in the 10k training examples setting. We also demonstrate that it
    can solve a reasoning task which requires a large number of supporting facts,
    which other methods are not able to solve, and can generalize past its training
    horizon. It can also be practically used on large scale datasets such as
    Children’s Book Test, where it obtains competitive performance, reading the
    story in a single pass.
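
    A numpy sketch of the per-slot memory update described in the paper: a
    gate compares the input both to the slot's content and to its key, a
    candidate state is computed, and the slot is renormalized, which is how
    older information decays. Initialization and the choice of φ here are
    assumptions.

```python
import numpy as np

def entnet_step(s, H, keys, U, V, W, phi=np.tanh):
    """One update of m memory slots given an encoded input s.

    s: (d,) input encoding; H, keys: (m, d) memory contents and slot keys;
    U, V, W: (d, d) shared weight matrices.
    """
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    for j in range(len(H)):
        gate = sigmoid(s @ H[j] + s @ keys[j])        # content + location match
        candidate = phi(U @ H[j] + V @ keys[j] + W @ s)
        H[j] = H[j] + gate * candidate
        H[j] = H[j] / (np.linalg.norm(H[j]) + 1e-8)   # forget via renormalization
    return H

rng = np.random.default_rng(0)
d, m = 32, 5
H, keys = rng.normal(size=(m, d)), rng.normal(size=(m, d))
U, V, W = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
H = entnet_step(rng.normal(size=d), H, keys, U, V, W)
```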

    Online Sequence-to-Sequence Reinforcement Learning for Open-Domain Conversational Agents

    Nabiha Asghar, Pascal Poupart, Jiang Xin, Hang Li
    Comments: 8 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    We propose an online, end-to-end, deep reinforcement learning technique to
    develop generative conversational agents for open-domain dialogue. We use a
    unique combination of offline two-phase supervised learning and online
    reinforcement learning with human users to train our agent. While most existing
research proposes hand-crafted, developer-defined reward functions for
    reinforcement, we devise a novel reward mechanism based on a variant of Beam
    Search and one-character user-feedback at each step. Experiments show that our
    model, when trained on a small and shallow Seq2Seq network, successfully
    promotes the generation of meaningful, diverse and interesting responses, and
    can be used to train agents with customized personas and conversational styles.

    Learning to Hash-tag Videos with Tag2Vec

    Aditya Singh, Saurabh Saini, Rajvi Shah, PJ Narayanan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)

    User-given tags or labels are valuable resources for semantic understanding
    of visual media such as images and videos. Recently, a new type of labeling
mechanism known as hash-tags has become increasingly popular on social media
    sites. In this paper, we study the problem of generating relevant and useful
    hash-tags for short video clips. Traditional data-driven approaches for tag
    enrichment and recommendation use direct visual similarity for label transfer
    and propagation. We attempt to learn a direct low-cost mapping from video to
hash-tags using a two-step training process. We first employ skip-gram models,
a natural language processing (NLP) technique trained with a neural network, to
learn a low-dimensional vector representation of hash-tags (Tag2Vec) using a
corpus of 10 million hash-tags. We then train an embedding function to map
video features to the low-dimensional Tag2Vec space. We learn this embedding
    for 29 categories of short video clips with hash-tags. A query video without
    any tag-information can then be directly mapped to the vector space of tags
    using the learned embedding and relevant tags can be found by performing a
    simple nearest-neighbor retrieval in the Tag2Vec space. We validate the
    relevance of the tags suggested by our system qualitatively and quantitatively
    with a user study.
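
    The retrieval step reduces to a cosine nearest-neighbour lookup in the
    Tag2Vec space; a minimal sketch with made-up tag vectors and a made-up
    linear video embedding:

    import numpy as np

    rng = np.random.default_rng(0)
    tags = ["#skate", "#surf", "#cook", "#dance"]
    tag_vecs = rng.standard_normal((len(tags), 50))    # stand-in Tag2Vec vectors
    W = rng.standard_normal((50, 2048))                # stand-in learned embedding

    def suggest_tags(video_feature, k=2):
        q = W @ video_feature                          # map video into tag space
        q = q / np.linalg.norm(q)
        T = tag_vecs / np.linalg.norm(tag_vecs, axis=1, keepdims=True)
        scores = T @ q                                 # cosine similarity
        return [tags[i] for i in np.argsort(-scores)[:k]]

    print(suggest_tags(rng.standard_normal(2048)))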

    Joint Bayesian Gaussian discriminant analysis for speaker verification

    Yiyan Wang, Haotian Xu, Zhijian Ou
    Comments: accepted by ICASSP2017
    Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)

    State-of-the-art i-vector based speaker verification relies on variants of
    Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis. We
are mainly motivated by the recent work on the Joint Bayesian (JB) method,
which was originally proposed for discriminant analysis in face verification. We
apply JB to speaker verification and make three contributions beyond the
original JB. 1) In contrast to the EM iterations with approximated statistics
in the original JB, EM iterations with exact statistics are employed and
give better performance. 2) We propose to do simultaneous diagonalization
    (SD) of the within-class and between-class covariance matrices to achieve
    efficient testing, which has broader application scope than the SVD-based
    efficient testing method in the original JB. 3) We scrutinize similarities and
    differences between various Gaussian PLDAs and JB, complementing the previous
    analysis of comparing JB only with Prince-Elder PLDA. Extensive experiments are
    conducted on NIST SRE10 core condition 5, empirically validating the
superiority of JB with a faster convergence rate and a 9-13% EER reduction
    compared with state-of-the-art PLDA.


    Distributed, Parallel, and Cluster Computing

    TF.Learn: TensorFlow's High-level Module for Distributed Machine Learning

    Yuan Tang
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    TF.Learn is a high-level Python module for distributed machine learning
    inside TensorFlow. It provides an easy-to-use Scikit-learn style interface to
simplify the process of creating, configuring, training, evaluating, and
experimenting with a machine learning model. TF.Learn integrates a wide range of
state-of-the-art machine learning algorithms built on top of TensorFlow’s
low-level APIs for small to large-scale supervised and unsupervised problems. This module
    focuses on bringing machine learning to non-specialists using a general-purpose
    high-level language as well as researchers who want to implement, benchmark,
    and compare their new methods in a structured environment. Emphasis is put on
    ease of use, performance, documentation, and API consistency.
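
    For flavour, a Scikit-learn-style workflow with this module looked roughly
    like the following in the TensorFlow releases of the time; module paths and
    argument names shifted between versions, so treat this as a sketch rather
    than a definitive API reference:

    import numpy as np
    import tensorflow as tf

    # Toy data: 4 features, 3 classes.
    x_train = np.random.rand(100, 4).astype(np.float32)
    y_train = np.random.randint(0, 3, size=100)

    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
    classifier = tf.contrib.learn.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[10, 20, 10],
        n_classes=3)

    classifier.fit(x=x_train, y=y_train, steps=200)      # sklearn-style fit
    print(classifier.evaluate(x=x_train, y=y_train)["accuracy"])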

    An Artificial Neural Networks based Temperature Prediction Framework for Network-on-Chip based Multicore Platform

    Sandeep Aswath Narayana
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE)

    Continuous improvement in silicon process technologies has made possible the
    integration of hundreds of cores on a single chip. However, power and heat have
    become dominant constraints in designing these massive multicore chips causing
    issues with reliability, timing variations and reduced lifetime of the chips.
    Dynamic Thermal Management (DTM) is a solution to avoid high temperatures on
    the die. Typical DTM schemes only address core level thermal issues. However,
    the Network-on-chip (NoC) paradigm, which has emerged as an enabling
    methodology for integrating hundreds to thousands of cores on the same die can
contribute significantly to the thermal issues. Moreover, typical DTM is
triggered reactively based on temperature measurements from on-chip thermal
sensors, requiring long reaction times, whereas a predictive DTM method
estimates future temperature in advance, eliminating the chance of temperature overshoot.
    Artificial Neural Networks (ANNs) have been used in various domains for
modeling and prediction with high accuracy due to their ability to learn and
    adapt. This thesis concentrates on designing an ANN prediction engine to
    predict the thermal profile of the cores and Network-on-Chip elements of the
    chip. This thermal profile of the chip is then used by the predictive DTM that
    combines both core level and network level DTM techniques. On-chip wireless
    interconnect which is recently envisioned to enable energy-efficient data
    exchange between cores in a multicore environment, will be used to provide a
    broadcast-capable medium to efficiently distribute thermal control messages to
    trigger and manage the DTM schemes.

    Avoiding communication in primal and dual block coordinate descent methods

    Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney
    Comments: 30 pages
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Primal and dual block coordinate descent methods are iterative methods for
    solving regularized and unregularized optimization problems. Distributed-memory
    parallel implementations of these methods have become popular in analyzing
    large machine learning datasets. However, existing implementations communicate
    at every iteration which, on modern data center and supercomputing
    architectures, often dominates the cost of floating-point computation. Recent
    results on communication-avoiding Krylov subspace methods suggest that large
    speedups are possible by re-organizing iterative algorithms to avoid
    communication. We show how applying similar algorithmic transformations can
    lead to primal and dual block coordinate descent methods that only communicate
    every s iterations–where s is a tuning parameter–instead of every iteration
    for the regularized least-squares problem. We derive communication-avoiding
    variants of the primal and dual block coordinate descent methods which reduce
    the number of synchronizations by a factor of s on distributed-memory parallel
    machines without altering the convergence rate. Our communication-avoiding
    algorithms attain modeled strong scaling speedups of 14x and 165x on a modern
    supercomputer using MPI and Apache Spark, respectively. Our algorithms attain
    modeled weak scaling speedups of 12x and 396x on the same machine using MPI and
    Apache Spark, respectively.
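
    To make the communication-avoiding pattern concrete, here is a
    single-machine numpy sketch for the ridge-regression case, in which the
    only step that would require communication in a distributed setting
    (forming Gram blocks and residual products) is batched once per group of s
    block updates; block selection and all parameters are illustrative:

    import numpy as np

    def ca_block_cd(A, b, lam, block_size, s, n_groups, seed=0):
        """Sketch: block coordinate descent for min ||Ax-b||^2 + lam*||x||^2
        that batches all Gram/residual products once per group of s updates."""
        m, n = A.shape
        x = np.zeros(n)
        r = b.astype(float).copy()       # residual b - A x (x starts at 0)
        blocks = [np.arange(i, min(i + block_size, n))
                  for i in range(0, n, block_size)]
        rng = np.random.default_rng(seed)
        for _ in range(n_groups):
            chosen = rng.choice(len(blocks), size=s, replace=False)
            AS = [A[:, blocks[j]] for j in chosen]
            # One batched "communication": every Gram block and residual
            # product needed for the next s updates is formed here.
            G = [[AS[i].T @ AS[k] for k in range(s)] for i in range(s)]
            g = [AS[i].T @ r for i in range(s)]
            deltas = []
            for i in range(s):
                J = blocks[chosen[i]]
                # Correct A_Ji^T r for the updates already made in this group.
                rhs = g[i] - sum(G[i][k] @ deltas[k] for k in range(i)) - lam * x[J]
                d = np.linalg.solve(G[i][i] + lam * np.eye(len(J)), rhs)
                x[J] += d
                deltas.append(d)
            r = b - A @ x                # refresh the true residual
        return x

    A = np.random.randn(200, 60)
    b = np.random.randn(200)
    x = ca_block_cd(A, b, lam=0.1, block_size=10, s=3, n_groups=50)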

    FaaS: Federation-as-a-Service

    Francesco Paolo Schiavo, Vladimiro Sassone, Luca Nicoletti, Andrea Margheri
    Comments: Technical Report Edited by Francesco Paolo Schiavo, Vladimiro Sassone, Luca Nicoletti and Andrea Margheri
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    This document is the main high-level architecture specification of the
    SUNFISH cloud federation solution. Its main objective is to introduce the
    concept of Federation-as-a-Service (FaaS) and the SUNFISH platform. FaaS is the
    new and innovative cloud federation service proposed by the SUNFISH project.
    The document defines the functionalities of FaaS, its governance and precise
    objectives. With respect to these objectives, the document proposes the
    high-level architecture of the SUNFISH platform: the software architecture that
    permits realising a FaaS federation. More specifically, the document describes
    all the components forming the platform, the offered functionalities and their
    high-level interactions underlying the main FaaS functionalities. The document
    concludes by outlining the main implementation strategies towards the actual
    implementation of the proposed cloud federation solution.


    Learning

    DizzyRNN: Reparameterizing Recurrent Neural Networks for Norm-Preserving Backpropagation

    Victor Dorobantu, Per Andre Stromhaug, Jess Renteria
    Subjects: Learning (cs.LG)

    The vanishing and exploding gradient problems are well-studied obstacles that
    make it difficult for recurrent neural networks to learn long-term time
    dependencies. We propose a reparameterization of standard recurrent neural
    networks to update linear transformations in a provably norm-preserving way
    through Givens rotations. Additionally, we use the absolute value function as
    an element-wise non-linearity to preserve the norm of backpropagated signals
    over the entire network. We show that this reparameterization reduces the
    number of parameters and maintains the same algorithmic complexity as a
    standard recurrent neural network, while outperforming standard recurrent
    neural networks with orthogonal initializations and Long Short-Term Memory
    networks on the copy problem.
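
    The norm-preserving claim is easy to sanity-check numerically: a product of
    Givens rotations is orthogonal, so it can neither shrink nor amplify a
    hidden state. A small illustrative check (not the authors' exact
    parameterization):

    import numpy as np

    def givens(n, i, j, theta):
        """Plane rotation by theta in coordinates (i, j)."""
        G = np.eye(n)
        c, s = np.cos(theta), np.sin(theta)
        G[i, i] = c; G[j, j] = c
        G[i, j] = -s; G[j, i] = s
        return G

    rng = np.random.default_rng(0)
    n = 6
    W = np.eye(n)
    # Compose rotations over all coordinate pairs, one angle parameter each.
    for i in range(n):
        for j in range(i + 1, n):
            W = givens(n, i, j, rng.uniform(0, 2 * np.pi)) @ W

    x = rng.standard_normal(n)
    print(np.linalg.norm(W @ x), np.linalg.norm(x))  # identical up to rounding
    # The elementwise abs() nonlinearity also preserves the vector norm.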

    Distributed Multi-task Relationship Learning

    Sulin Liu, Sinno Jialin Pan, Qirong Ho
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

In this paper, we propose a distributed multi-task learning framework that
simultaneously learns a predictive model for each task as well as the
relationships between tasks, alternating between the two in the parameter server paradigm. In
    our framework, we first offer a general dual form for a family of regularized
    multi-task relationship learning methods. Subsequently, we propose a
    communication-efficient primal-dual distributed optimization algorithm to solve
    the dual problem by carefully designing local subproblems to make the dual
    problem decomposable. Moreover, we provide a theoretical convergence analysis
    for the proposed algorithm, which is specific for distributed multi-task
    relationship learning. We conduct extensive experiments on both synthetic and
    real-world datasets to evaluate our proposed framework in terms of scalability,
    effectiveness, and convergence.

    Generative Adversarial Parallelization

    Daniel Jiwoong Im, He Ma, Chris Dongjoo Kim, Graham Taylor
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Generative Adversarial Networks have become one of the most studied
    frameworks for unsupervised learning due to their intuitive formulation. They
    have also been shown to be capable of generating convincing examples in limited
    domains, such as low-resolution images. However, they still prove difficult to
    train in practice and tend to ignore modes of the data generating distribution.
Quantitatively capturing effects such as mode coverage, and more generally the
quality of the generative model, still remains elusive. We propose Generative
    Adversarial Parallelization, a framework in which many GANs or their variants
    are trained simultaneously, exchanging their discriminators. This eliminates
    the tight coupling between a generator and discriminator, leading to improved
    convergence and improved coverage of modes. We also propose an improved variant
    of the recently proposed Generative Adversarial Metric and show how it can
    score individual GANs or their collections under the GAP model.

    An Empirical Analysis of Deep Network Loss Surfaces

    Daniel Jiwoong Im, Michael Tao, Kristin Branson
    Subjects: Learning (cs.LG)

The training of deep neural networks is a high-dimensional optimization
problem with respect to the loss function of a model. Unfortunately, these loss
functions are non-convex as well as high-dimensional, and hence difficult to characterize. In
    this paper, we empirically investigate the geometry of the loss functions for
    state-of-the-art networks with multiple stochastic optimization methods. We do
    this through several experiments that are visualized on polygons to understand
    how and when these stochastic optimization methods find minima.

    Sources identification using shifted non-negative matrix factorization combined with semi-supervised clustering

    Filip L. Iliev, Valentin G. Stanev, Velimir V. Vesselinov, Boian S. Alexandrov
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Non-negative matrix factorization (NMF) is a well-known unsupervised learning
    method that has been successfully used for blind source separation of
non-negative additive signals. The NMF method requires the number of the
original sources to be known a priori. Recently, we reported a method, called NMFk,
    which by coupling the original NMF multiplicative algorithm with a custom
    semi-supervised clustering allows us to estimate the number of the sources
    based on the robustness of the reconstructed solutions. Here, an extension of
NMFk is developed, called ShiftNMFk, which, by combining NMFk with the
previously formulated ShiftNMF algorithm, the Akaike Information Criterion (AIC), and a custom
procedure for estimating the source locations, is capable of identifying: (a)
    the number of the unknown sources, (b) the eventual delays in the signal
    propagation, (c) the locations of the sources, and (d) the speed of propagation
    of each of the signals in the medium. Our new method is a natural extension of
    NMFk that can be used for sources identification based only on observational
    data. We demonstrate how our novel method identifies the components of
synthetic data sets, discuss its limitations, and present a Julia-language
implementation of the ShiftNMFk algorithm.
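
    At the core of the NMFk family sits the classical multiplicative NMF
    update; a minimal numpy version of that building block (the base algorithm
    only, not the ShiftNMFk extension) is:

    import numpy as np

    def nmf(V, k, iters=500, eps=1e-9):
        """Lee-Seung multiplicative updates for V ~ W @ H, all entries >= 0."""
        rng = np.random.default_rng(0)
        m, n = V.shape
        W = rng.random((m, k)) + eps
        H = rng.random((k, n)) + eps
        for _ in range(iters):
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        return W, H

    # Two hidden non-negative sources mixed at four detectors.
    rng = np.random.default_rng(1)
    V = rng.random((4, 2)) @ rng.random((2, 100))
    W, H = nmf(V, k=2)
    print(np.linalg.norm(V - W @ H))   # reconstruction error should be small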

    Machine learning approach for identification of release sources in advection-diffusion systems

    Valentin G. Stanev, Filip L. Iliev, Velimir V. Vesselinov, Boian S. Alexandrov
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

Records of contaminant concentrations observed at monitoring detectors are
used to estimate properties of the contaminant sources, e.g., locations,
release strengths and model parameters representing contaminant migration
(e.g., velocity, dispersivity, etc.). These estimates are essential for a
reliable assessment of the contamination hazards and risks. If there is more
than one contaminant source (with different locations and strengths), the
observed records represent contaminant mixtures; typically, the number of
sources is unknown. The mixing ratios of the different contaminant sources at
the detectors are also unknown; this further reduces the reliability and
increases the complexity of the inverse-model analyses. To circumvent some of
these challenges, we have developed a novel hybrid source identification
method, called Green-NMFk, coupling machine learning and inverse-analysis
methods. It performs a decomposition of the observed mixtures based on the
Non-negative Matrix Factorization method for Blind Source Separation, coupled
with a custom semi-supervised clustering algorithm, and uses Green’s functions
of the advection-diffusion equation. Our method is capable of identifying the unknown
    number, locations, and properties of a set of contaminant sources from measured
    contaminant-source mixtures with unknown mixing ratios, without any additional
    information. It also estimates the contaminant transport properties, such as
velocity and dispersivity. Green-NMFk is not limited to contaminant transport
but can be applied directly to any problem controlled by a parabolic
partial-differential equation where mixtures of an unknown number of physical
sources are monitored at multiple locations. Green-NMFk can also be applied with different
    Green’s functions; for example, representing anomalous (non-Fickian) dispersion
    or wave propagation in dispersive media.

    Stacked Generative Adversarial Networks

    Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, Serge Belongie
    Comments: Under review
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    In this paper we aim to leverage the powerful bottom-up discriminative
    representations to guide a top-down generative model. We propose a novel
    generative model named Stacked Generative Adversarial Networks (SGAN), which is
    trained to invert the hierarchical representations of a discriminative
    bottom-up deep network. Our model consists of a top-down stack of GANs, each
    trained to generate “plausible” lower-level representations, conditioned on
    higher-level representations. A representation discriminator is introduced at
    each feature hierarchy to encourage the representation manifold of the
    generator to align with that of the bottom-up discriminative network, providing
    intermediate supervision. In addition, we introduce a conditional loss that
    encourages the use of conditional information from the layer above, and a novel
    entropy loss that maximizes a variational lower bound on the conditional
    entropy of generator outputs. To the best of our knowledge, the entropy loss is
    the first attempt to tackle the conditional model collapse problem that is
    common in conditional GANs. We first train each GAN of the stack independently,
    and then we train the stack end-to-end. Unlike the original GAN that uses a
    single noise vector to represent all the variations, our SGAN decomposes
    variations into multiple levels and gradually resolves uncertainties in the
    top-down generative process. Experiments demonstrate that SGAN is able to
    generate diverse and high-quality images, as well as being more interpretable
    than a vanilla GAN.

    End-to-End Deep Reinforcement Learning for Lane Keeping Assist

    Ahmad El Sallab, Mohammed Abdou, Etienne Perot, Senthil Yogamani
    Comments: Presented at the Machine Learning for Intelligent Transportation Systems Workshop, NIPS 2016
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Robotics (cs.RO)

    Reinforcement learning is considered to be a strong AI paradigm which can be
    used to teach machines through interaction with the environment and learning
    from their mistakes, but it has not yet been successfully used for automotive
    applications. There has recently been a revival of interest in the topic,
    however, driven by the ability of deep learning algorithms to learn good
    representations of the environment. Motivated by Google DeepMind’s successful
    demonstrations of learning for games from Breakout to Go, we will propose
    different methods for autonomous driving using deep reinforcement learning.
    This is of particular interest as it is difficult to pose autonomous driving as
    a supervised learning problem as it has a strong interaction with the
    environment including other vehicles, pedestrians and roadworks. As this is a
    relatively new area of research for autonomous driving, we will formulate two
    main categories of algorithms: 1) Discrete actions category, and 2) Continuous
actions category. For the discrete actions category, we will deal with the
Deep Q-Network algorithm (DQN), while for the continuous actions category, we
will deal with the Deep Deterministic Actor Critic algorithm (DDAC). In
addition, we will examine the performance of these two categories on TORCS
(The Open Racing Car Simulator), an open-source car racing simulator. Our
simulation results demonstrate learning of autonomous
    maneuvering in a scenario of complex road curvatures and simple interaction
    with other vehicles. Finally, we explain the effect of some restricted
    conditions, put on the car during the learning phase, on the convergence time
    for finishing its learning phase.
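
    For the discrete-actions branch, the heart of DQN is the
    temporal-difference regression target; a schematic numpy fragment (network,
    replay buffer, and simulator elided, names illustrative):

    import numpy as np

    GAMMA = 0.99

    def dqn_targets(rewards, next_q_values, dones):
        """y = r + gamma * max_a' Q_target(s', a'), zeroed at episode ends."""
        return rewards + GAMMA * next_q_values.max(axis=1) * (1.0 - dones)

    # Toy batch: 3 transitions, 2 discrete actions (e.g., steer left/right).
    r = np.array([0.1, -1.0, 0.5])
    q_next = np.array([[0.2, 0.4], [0.0, 0.0], [0.3, 0.1]])
    done = np.array([0.0, 1.0, 0.0])
    print(dqn_targets(r, q_next, done))   # regression targets for Q(s, a)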

    Fast Patch-based Style Transfer of Arbitrary Style

    Tian Qi Chen, Mark Schmidt
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Learning (cs.LG)

    Artistic style transfer is an image synthesis problem where the content of an
    image is reproduced with the style of another. Recent works show that a
    visually appealing style transfer can be achieved by using the hidden
    activations of a pretrained convolutional neural network. However, existing
    methods either apply (i) an optimization procedure that works for any style
    image but is very expensive, or (ii) an efficient feedforward network that only
    allows a limited number of trained styles. In this work we propose a simpler
    optimization objective based on local matching that combines the content
    structure and style textures in a single layer of the pretrained network. We
    show that our objective has desirable properties such as a simpler optimization
    landscape, intuitive parameter tuning, and consistent frame-by-frame
    performance on video. Furthermore, we use 80,000 natural images and 80,000
    paintings to train an inverse network that approximates the result of the
    optimization. This results in a procedure for artistic style transfer that is
    efficient but also allows arbitrary content and style images.

    Towards Adaptive Training of Agent-based Sparring Partners for Fighter Pilots

    Brett W. Israelsen, Nisar Ahmed, Kenneth Center, Roderick Green, Winston Bennett Jr
    Comments: submitted copy
    Journal-ref: SciTech 2017, paper 2545524
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)

    A key requirement for the current generation of artificial decision-makers is
    that they should adapt well to changes in unexpected situations. This paper
    addresses the situation in which an AI for aerial dog fighting, with tunable
    parameters that govern its behavior, must optimize behavior with respect to an
    objective function that is evaluated and learned through simulations. Bayesian
    optimization with a Gaussian Process surrogate is used as the method for
    investigating the objective function. One key benefit is that during
    optimization, the Gaussian Process learns a global estimate of the true
    objective function, with predicted outcomes and a statistical measure of
    confidence in areas that haven’t been investigated yet. Having a model of the
    objective function is important for being able to understand possible outcomes
    in the decision space; for example this is crucial for training and providing
    feedback to human pilots. However, standard Bayesian optimization does not
    perform consistently or provide an accurate Gaussian Process surrogate function
    for highly volatile objective functions. We treat these problems by introducing
    a novel sampling technique called Hybrid Repeat/Multi-point Sampling. This
    technique gives the AI ability to learn optimum behaviors in a highly uncertain
    environment. More importantly, it not only improves the reliability of the
    optimization, but also creates a better model of the entire objective surface.
    With this improved model the agent is equipped to more accurately/efficiently
    predict performance in unexplored scenarios.
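
    A rough sketch of such an optimization loop, using scikit-learn's GP
    regressor with a UCB acquisition; the hybrid repeat/multi-point step is
    paraphrased here as "score many candidates, then re-evaluate the noisy
    objective several times at the most promising one", so the details are
    assumptions rather than the authors' algorithm:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def noisy_objective(x):                  # stand-in for a dogfight simulation
        return -(x - 0.3) ** 2 + 0.1 * np.random.randn()

    X, y = [[0.0]], [noisy_objective(0.0)]
    gp = GaussianProcessRegressor(kernel=RBF(0.2), alpha=0.1 ** 2)

    for _ in range(20):
        gp.fit(np.array(X), np.array(y))
        cand = np.random.rand(32, 1)         # multi-point candidate sampling
        mu, sd = gp.predict(cand, return_std=True)
        best = cand[np.argmax(mu + 2.0 * sd)]  # UCB acquisition
        for _ in range(3):                   # repeat sampling to tame volatility
            X.append(best.tolist()); y.append(noisy_objective(best[0]))

    print(X[int(np.argmax(y))])              # near 0.3 in expectation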

    An EoS-meter of QCD transition from deep learning

    Long-Gang Pang, Kai Zhou, Nan Su, Hannah Petersen, Horst Stöcker, Xin-Nian Wang
    Subjects: High Energy Physics – Phenomenology (hep-ph); Learning (cs.LG); Nuclear Theory (nucl-th); Machine Learning (stat.ML)

    Supervised learning with a deep convolutional neural network is used to
    identify the QCD equation of state (EoS) employed in relativistic hydrodynamic
simulations of heavy-ion collisions. The final-state particle spectra
ρ(p_T, Φ) provide directly accessible information from experiments. High-level
correlations of ρ(p_T, Φ) learned by the neural network act as an “EoS-meter”,
effective in detecting the nature of the QCD transition. The EoS-meter is model
independent and insensitive to other simulation inputs, especially the initial
conditions. It thus provides a formidable direct connection between heavy-ion
collision observables and the bulk properties of QCD.

    Neuro-symbolic representation learning on biological knowledge graphs

    Mona Alshahrani, Mohammed Asif Khan, Omar Maddouri, Akira R Kinjo, Núria Queralt-Rosinach, Robert Hoehndorf
    Subjects: Quantitative Methods (q-bio.QM); Learning (cs.LG); Molecular Networks (q-bio.MN)

    Motivation: Biological data and knowledge bases increasingly rely on Semantic
    Web technologies and the use of knowledge graphs for data integration,
    retrieval and federated queries. In the past years, feature learning methods
    that are applicable to graph-structured data are becoming available, but have
    not yet widely been applied and evaluated on structured biological knowledge.
    Results: We develop a novel method for feature learning on biological knowledge
    graphs. Our method combines symbolic methods, in particular knowledge
    representation using symbolic logic and automated reasoning, with neural
    networks to generate embeddings of nodes that encode for related information
    within knowledge graphs. Through the use of symbolic logic, these embeddings
    contain both explicit and implicit information. We apply these embeddings to
    the prediction of edges in the knowledge graph representing problems of
    function prediction, finding candidate genes of diseases, protein-protein
    interactions, or drug target relations, and demonstrate performance that
    matches and sometimes outperforms traditional approaches based on manually
    crafted features. Our method can be applied to any biological knowledge graph,
    and will thereby open up the increasing amount of Semantic Web based knowledge
    bases in biology to use in machine learning and data analytics. Availability
    and Implementation:
    this https URL Contact:
    robert.hoehndorf@kaust.edu.sa

    TF.Learn: TensorFlow's High-level Module for Distributed Machine Learning

    Yuan Tang
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG)

    TF.Learn is a high-level Python module for distributed machine learning
    inside TensorFlow. It provides an easy-to-use Scikit-learn style interface to
simplify the process of creating, configuring, training, evaluating, and
experimenting with a machine learning model. TF.Learn integrates a wide range of
state-of-the-art machine learning algorithms built on top of TensorFlow’s
low-level APIs for small to large-scale supervised and unsupervised problems. This module
    focuses on bringing machine learning to non-specialists using a general-purpose
    high-level language as well as researchers who want to implement, benchmark,
    and compare their new methods in a structured environment. Emphasis is put on
    ease of use, performance, documentation, and API consistency.

    Information Extraction with Character-level Neural Networks and Noisy Supervision

    Philipp Meerkamp (Bloomberg LP), Zhengyi Zhou (AT&T Labs Research)
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Learning (cs.LG)

    We present an architecture for information extraction from text that augments
    an existing parser with a character-level neural network. To train the neural
    network, we compute a measure of consistency of extracted data with existing
    databases, and use it as a form of noisy supervision. Our architecture combines
the ability of constraint-based information extraction systems to easily
    incorporate domain knowledge and constraints with the ability of deep neural
    networks to leverage large amounts of data to learn complex features. The
    system led to large improvements over a mature and highly tuned
    constraint-based information extraction system used at Bloomberg for financial
    language text. At the same time, the new system massively reduces the
    development effort, allowing rule-writers to write high-recall constraints
    while relying on the deep neural network to remove false positives and boost
    precision.

    Parsimonious Online Learning with Kernels via Sparse Projections in Function Space

    Alec Koppel, Garrett Warnell, Ethan Stump, Alejandro Ribeiro
    Comments: Submitted to JMLR on 11/24/2016
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Despite their attractiveness, popular perception is that techniques for
    nonparametric function approximation do not scale to streaming data due to an
    intractable growth in the amount of storage they require. To solve this problem
    in a memory-affordable way, we propose an online technique based on functional
    stochastic gradient descent in tandem with supervised sparsification based on
    greedy function subspace projections. The method, called parsimonious online
learning with kernels (POLK), provides a controllable tradeoff between its
    solution accuracy and the amount of memory it requires. We derive conditions
    under which the generated function sequence converges almost surely to the
    optimal function, and we establish that the memory requirement remains finite.
    We evaluate POLK for kernel multi-class logistic regression and kernel
    hinge-loss classification on three canonical data sets: a synthetic Gaussian
    mixture model, the MNIST hand-written digits, and the Brodatz texture database.
    On all three tasks, we observe a favorable tradeoff of objective function
    evaluation, classification performance, and complexity of the nonparametric
regressor extracted by the proposed method.
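
    A stripped-down rendition of the idea, functional SGD over a growing kernel
    dictionary followed by pruning, is sketched below; note that POLK proper
    projects via kernel orthogonal matching pursuit, whereas this sketch simply
    drops small coefficients, so it only approximates the method:

    import numpy as np

    def rbf(x, z, gamma=10.0):
        return np.exp(-gamma * np.sum((x - z) ** 2))

    D, coef = [], []                    # kernel dictionary and weights

    def f(x):                           # current function estimate
        return sum(c * rbf(x, z) for c, z in zip(coef, D))

    eta, lam, tol = 0.5, 0.01, 1e-3
    rng = np.random.default_rng(0)
    for _ in range(300):                # streaming regression: y = sin(2*pi*x)
        x = rng.random()
        y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal()
        err = f(x) - y
        coef = [(1 - eta * lam) * c for c in coef]   # shrink old weights
        D.append(x); coef.append(-eta * err)         # functional SGD adds a point
        keep = [i for i, c in enumerate(coef) if abs(c) > tol]  # crude pruning
        D = [D[i] for i in keep]; coef = [coef[i] for i in keep]

    print(len(D), f(0.25))              # modest dictionary; f(0.25) near 1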

    Corporate Disruption in the Science of Machine Learning

    Sam Work
    Comments: MSc dissertation, qualitative analysis, machine learning researchers
    Subjects: Computers and Society (cs.CY); Learning (cs.LG)

    This MSc dissertation considers the effects of the current corporate interest
    on researchers in the field of machine learning. Situated within the field’s
    cyclical history of academic, public and corporate interest, this dissertation
    investigates how current researchers view recent developments and negotiate
    their own research practices within an environment of increased commercial
    interest and funding. The original research consists of in-depth interviews
    with 12 machine learning researchers working in both academia and industry.
    Building on theory from science, technology and society studies, this
    dissertation problematizes the traditional narratives of the neoliberalization
    of academic research by allowing the researchers themselves to discuss how
    their career choices, working environments and interactions with others in the
    field have been affected by the reinvigorated corporate interest of recent
    years.

    Joint Bayesian Gaussian discriminant analysis for speaker verification

    Yiyan Wang, Haotian Xu, Zhijian Ou
    Comments: accepted by ICASSP2017
    Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)

    State-of-the-art i-vector based speaker verification relies on variants of
    Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis. We
are mainly motivated by the recent work on the Joint Bayesian (JB) method,
which was originally proposed for discriminant analysis in face verification. We
apply JB to speaker verification and make three contributions beyond the
original JB. 1) In contrast to the EM iterations with approximated statistics
in the original JB, EM iterations with exact statistics are employed and
give better performance. 2) We propose to do simultaneous diagonalization
    (SD) of the within-class and between-class covariance matrices to achieve
    efficient testing, which has broader application scope than the SVD-based
    efficient testing method in the original JB. 3) We scrutinize similarities and
    differences between various Gaussian PLDAs and JB, complementing the previous
    analysis of comparing JB only with Prince-Elder PLDA. Extensive experiments are
    conducted on NIST SRE10 core condition 5, empirically validating the
superiority of JB with a faster convergence rate and a 9-13% EER reduction
    compared with state-of-the-art PLDA.

    Theory and Tools for the Conversion of Analog to Spiking Convolutional Neural Networks

    Bodo Rueckauer, Iulia-Alexandra Lungu, Yuhuang Hu, Michael Pfeiffer
    Comments: 9 pages, 2 figures, presented at the workshop “Computing with Spikes” at NIPS 2016, Barcelona, Spain
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Deep convolutional neural networks (CNNs) have shown great potential for
    numerous real-world machine learning applications, but performing inference in
    large CNNs in real-time remains a challenge. We have previously demonstrated
    that traditional CNNs can be converted into deep spiking neural networks
    (SNNs), which exhibit similar accuracy while reducing both latency and
    computational load as a consequence of their data-driven, event-based style of
    computing. Here we provide a novel theory that explains why this conversion is
    successful, and derive from it several new tools to convert a larger and more
    powerful class of deep networks into SNNs. We identify the main sources of
    approximation errors in previous conversion methods, and propose simple
    mechanisms to fix these issues. Furthermore, we develop spiking implementations
    of common CNN operations such as max-pooling, softmax, and batch-normalization,
which allow almost lossless conversion of arbitrary CNN architectures into the
    spiking domain. Empirical evaluation of different network architectures on the
    MNIST and CIFAR10 benchmarks leads to the best SNN results reported to date.
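
    The conversion rests on the observation that the firing rate of an
    integrate-and-fire neuron with a subtractive reset approximates a ReLU of
    its input; a tiny numpy check of that correspondence (threshold and horizon
    are illustrative):

    import numpy as np

    def spike_rate(drive, T=1000, theta=1.0):
        """Integrate-and-fire with reset-by-subtraction, constant input."""
        v, spikes = 0.0, 0
        for _ in range(T):
            v += drive                # integrate the (constant) input current
            if v >= theta:
                v -= theta            # subtractive reset keeps the remainder
                spikes += 1
        return spikes / T

    for a in [-0.2, 0.0, 0.3, 0.7]:
        print(a, spike_rate(a), max(a, 0.0))   # rate tracks ReLU(a) below theta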

    Hybrid Repeat/Multi-point Sampling for Highly Volatile Objective Functions

    Brett Israelsen, Nisar Ahmed
    Journal-ref: BayesOpt Workshop, NIPS 2016
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Robotics (cs.RO)

    A key drawback of the current generation of artificial decision-makers is
    that they do not adapt well to changes in unexpected situations. This paper
    addresses the situation in which an AI for aerial dog fighting, with tunable
    parameters that govern its behavior, will optimize behavior with respect to an
    objective function that must be evaluated and learned through simulations. Once
    this objective function has been modeled, the agent can then choose its desired
    behavior in different situations. Bayesian optimization with a Gaussian Process
    surrogate is used as the method for investigating the objective function. One
    key benefit is that during optimization the Gaussian Process learns a global
    estimate of the true objective function, with predicted outcomes and a
    statistical measure of confidence in areas that haven’t been investigated yet.
    However, standard Bayesian optimization does not perform consistently or
    provide an accurate Gaussian Process surrogate function for highly volatile
    objective functions. We treat these problems by introducing a novel sampling
    technique called Hybrid Repeat/Multi-point Sampling. This technique gives the
    AI ability to learn optimum behaviors in a highly uncertain environment. More
    importantly, it not only improves the reliability of the optimization, but also
    creates a better model of the entire objective surface. With this improved
    model the agent is equipped to better adapt behaviors.


    Information Theory

    On dually almost MRD codes

    Javier de la Cruz
    Subjects: Information Theory (cs.IT)

In this paper we define and study a family of codes which come close to being
MRD codes, so we call them AMRD codes (almost MRD). An AMRD code is a code with
    rank defect equal to 1. AMRD codes whose duals are AMRD are called dually AMRD.
    Dually AMRD codes are the closest to the MRD codes given that both they and
    their dual codes are almost optimal. Necessary and sufficient conditions for
the codes to be dually AMRD are given. Furthermore, we show that dually AMRD
    codes and codes of rank defect one and maximum 2-generalized weight coincide
    when the size of the matrix divides the dimension.

    CFD results calibration from sparse sensor observations with a case study for indoor thermal map

    Chaoyang Jiang, Yeng Chai Soh, Hua Li, Mustafa K. Masood, Zhe Wei, Xiaoli Zhou, Deqing Zhai
    Comments: 17 pages
    Subjects: Information Theory (cs.IT); Graphics (cs.GR)

Current CFD calibration work has mainly focused on calibrating the CFD model
itself; no known work has considered the calibration of the CFD results. In
    this paper, we take inspiration from the image editing problem to develop a
    methodology to calibrate CFD simulation results based on sparse sensor
    observations. We formulate the calibration of CFD results as an optimization
    problem. The cost function consists of two terms. One term guarantees a good
    local adjustment of the simulation results based on the sparse sensor
    observations. The other term transmits the adjustment from local regions around
    sensing locations to the global domain. The proposed method can enhance the CFD
    simulation results while preserving the overall original profile. An experiment
    in an air-conditioned room was implemented to verify the effectiveness of the
    proposed method. In the experiment, four sensor observations were used to
    calibrate a simulated thermal map with 167×365 data points. The experimental
    results show that the proposed method is effective and practical.
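
    The two-term formulation can be illustrated on a 1D toy grid: solve for an
    adjustment field delta that matches the sensor corrections at the sensed
    points (first term) and varies smoothly elsewhere (second term). A minimal
    least-squares sketch with invented weights and grid:

    import numpy as np

    n = 50                                  # grid points of the simulated field
    sim = np.linspace(20.0, 24.0, n)        # stand-in CFD temperature profile
    sensors = {10: 22.0, 40: 22.5}          # sparse observations (index -> value)

    # Build least squares for the adjustment field delta:
    #   data term:       delta[i] = obs[i] - sim[i] at sensor locations
    #   smoothness term: mu * (delta[i-1] - 2*delta[i] + delta[i+1]) = 0
    mu, rows, rhs = 1.0, [], []
    for i, obs in sensors.items():
        e = np.zeros(n); e[i] = 100.0       # heavy weight pins sensor corrections
        rows.append(e); rhs.append(100.0 * (obs - sim[i]))
    for i in range(1, n - 1):
        e = np.zeros(n); e[i - 1] = mu; e[i] = -2 * mu; e[i + 1] = mu
        rows.append(e); rhs.append(0.0)

    delta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    calibrated = sim + delta                # locally adjusted, globally smooth
    print(calibrated[10], calibrated[40])   # close to 22.0 and 22.5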

    Jointly Broadcasting Data and Power with Quality of Service Guarantees

    P K Deekshith, Trupthi Chougule, Shreya Turmari, Ramya Raju, Rakshitha Ram, Vinod Sharma
    Subjects: Information Theory (cs.IT)

In this work, we consider a scenario wherein an energy harvesting wireless
radio sends information to multiple receivers alongside powering them. In
addition to harvesting the incoming radio frequency (RF) energy, the receivers
also harvest energy from their environment (e.g., solar energy). This
communication framework is captured by a fading Gaussian Broadcast Channel
(GBC) with an energy harvesting transmitter and receivers. In order to ensure
some quality of service (QoS) in data reception among the receivers, we impose
a minimum-rate requirement on data transmission. For the setting in place, we
characterize the fundamental limits in jointly transmitting information and
power subject to a QoS guarantee, for three cardinal receiver structures,
namely ideal, time-switching and power-splitting. We show that a
time-switching receiver can switch between information reception mode and
energy harvesting mode without the transmitter’s knowledge of the switching
and without any extra rate loss. We also prove that, for the same amount of
power transferred, on average, a power-splitting receiver supports higher data
rates compared to a time-switching receiver.

    Information and Power Transfer by Energy Harvesting Transmitters over a Fading Multiple Access Channel with Minimum Rate Constraints

    P K Deekshith, Trupthi Chougule, Shreya Turmari, Ramya Raju, Rakshitha Ram, Vinod Sharma
    Subjects: Information Theory (cs.IT)

    We consider the problem of Simultaneous Wireless Information and Power
    Transfer (SWIPT) over a fading multiple access channel with additive Gaussian
    noise. The transmitters as well as the receiver harvest energy from ambient
    sources. We assume that the transmitters have two classes of data to send, viz.
    delay sensitive and delay tolerant data. Each transmitter sends the delay
    sensitive data at a certain minimum rate irrespective of the channel conditions
    (fading states). In addition, if the channel conditions are good, the delay
tolerant data is sent. Along with data, the transmitters also transfer power
to aid the receiver in meeting its energy requirements. In this setting, we
characterize the minimum-rate capacity region, which provides the fundamental
limit of transferring information and power simultaneously with minimum rate
guarantees. Owing to the limitations of current technology, these limits might
not be achievable in practice. Among the practical receiver structures
proposed for SWIPT in the literature, two popular architectures are the
time-switching and power-splitting receivers. For each of
    power splitting receivers although more complex, provide a larger capacity
    region.

    Cramer-Rao Lower Bound for DoA Estimation with RF Lens-Embedded Antenna Array

    Jae-Nam Shim, Hongseok Park, GeeYong Suk, Chan-Byoung Chae, Dong Ku Kim
    Subjects: Information Theory (cs.IT)

In this paper, we consider the Cramer-Rao lower bound (CRLB) for direction of
arrival (DoA) estimation with a lens-embedded antenna array and deterministic
parameters. Unlike the CRLB of a uniform linear array (ULA), the CRLB for DoA
of a lens-embedded antenna array is dominated not only by the angle but also
by the characteristics of the lens. The derivation is based on approximating
the amplitude of the received signal behind the lens by a Gaussian function.
Using beam propagation method simulations, we confirm that the parameters
needed to design a lens can be derived from the standard deviation of this
Gaussian, which characterizes the received signal. A well-designed lens
antenna shows better performance than a ULA in terms of DoA estimation. This
derivation is useful because the result can serve as a guideline for designing
lens parameters to satisfy a given purpose.

    Massive MIMO with Imperfect Channel Covariance Information

    Emil Björnson, Luca Sanguinetti, Merouane Debbah
    Comments: 5 pages, 3 figures, 1 table
    Journal-ref: presented at Asilomar Conference on Signals, Systems, and
    Computers, Pacific Grove, USA, Nov. 2016
    Subjects: Information Theory (cs.IT)

    This work investigates the impact of imperfect statistical information in the
    uplink of massive MIMO systems. In particular, we first show why covariance
    information is needed and then propose two schemes for covariance matrix
    estimation. A lower bound on the spectral efficiency (SE) of any combining
scheme is derived, under imperfect covariance knowledge, and a closed-form
expression is computed for maximum-ratio combining. We show that having
    covariance information is not critical, but that it is relatively easy to
    acquire it and to achieve SE close to the ideal case of having perfect
    statistical information.

    Green OFDMA Resource Allocation in Cache-Enabled CRAN

    Reuben George Stephen, Rui Zhang
    Comments: Presented in IEEE Online Conference on Green Communications (Online GreenComm), Nov. 2016 (Invited Paper)
    Subjects: Information Theory (cs.IT)

    Cloud radio access network (CRAN), in which remote radio heads (RRHs) are
    deployed to serve users in a target area, and connected to a central processor
    (CP) via limited-capacity links termed the fronthaul, is a promising candidate
    for the next-generation wireless communication systems. Due to the
    content-centric nature of future wireless communications, it is desirable to
    cache popular contents beforehand at the RRHs, to reduce the burden on the
    fronthaul and achieve energy saving through cooperative transmission. This
    motivates our study in this paper on the energy efficient transmission in an
    orthogonal frequency division multiple access (OFDMA)-based CRAN with multiple
RRHs and users, where the RRHs can prefetch popular contents. We consider a
joint optimization of the user-subcarrier (SC) assignment, RRH selection and
transmit power allocation over all the SCs to minimize the total transmit power of the RRHs,
    subject to the RRHs’ individual fronthaul capacity constraints and the users’
    minimum rate constraints, while taking into account the caching status at the
    RRHs. Although the problem is non-convex, we propose a Lagrange duality based
    solution, which can be efficiently computed with good accuracy. We compare the
    minimum transmit power required by the proposed algorithm with different
    caching strategies against the case without caching by simulations, which show
    the significant energy saving with caching.

    Signal Detection under Short-Interval Sampling of Continuous Waveforms for Optical Wireless Scattering Communication

    Difan Zou, Chen Gong, Zhengyuan Xu
    Subjects: Information Theory (cs.IT)

    In optical wireless scattering communication, received signal in each symbol
    interval is captured by a photomultiplier tube (PMT) and then sampled through
    very short but finite interval sampling. The resulting samples form a signal
    vector for symbol detection. The upper and lower bounds on transmission rate of
    such a processing system are studied. It is shown that the gap between two
    bounds approaches zero as the thermal noise and shot noise variances approach
    zero. The maximum a posteriori (MAP) signal detection is performed and a low
    computational complexity receiver is derived under piecewise polynomial
    approximation. Meanwhile, the threshold based signal detection is also studied,
    where two threshold selection rules are proposed based on the detection error
    probability and the Kullback-Leibler (KL) distance. For the latter, it is shown
    that the KL distance is not sensitive to the threshold selection for small shot
    and thermal noise variances, and thus the threshold can be selected among a
wide range without significant loss from the optimal KL distance. The
performance of the transmission rate bounds, the signal detection, and the
threshold selection approaches is evaluated through numerical results.

    Construction of Full-Diversity LDPC Lattices for Block-Fading Channels

    Hassan Khodaiemehr, Mohammad-Reza Sadeghi, Daniel Panario
    Comments: 44 pages, 6 figures. Part of this work has been presented at ISIT 2016, Spain
    Subjects: Information Theory (cs.IT)

    LDPC lattices were the first family of lattices which have an efficient
    decoding algorithm in high dimensions over an AWGN channel. Considering
    Construction D’ of lattices with one binary LDPC code as underlying code gives
    the well known Construction A LDPC lattices or 1-level LDPC lattices.
    Block-fading channel (BF) is a useful model for various wireless communication
    channels in both indoor and outdoor environments. Frequency-hopping schemes and
    orthogonal frequency division multiplexing (OFDM) can conveniently be modelled
    as block-fading channels. Applying lattices in this type of channel entails
    dividing a lattice point into multiple blocks such that fading is constant
    within a block but changes, independently, across blocks. The design of
    lattices for BF channels offers a challenging problem, which differs greatly
    from its counterparts like AWGN channels. Recently, the original binary
Construction A for lattices, due to Forney, has been generalized to a lattice
    construction from totally real and complex multiplication fields. This
    generalized Construction A of lattices provides signal space diversity
    intrinsically, which is the main requirement for the signal sets designed for
fading channels. In this paper we construct full-diversity LDPC lattices for
block-fading channels using Construction A over totally real number fields. We
propose a new iterative decoding method for this family of lattices which has
    complexity that grows linearly in the dimension of the lattice. In order to
    implement our decoding algorithm, we propose the definition of a parity check
    matrix and Tanner graph for full diversity Construction A lattices. We also
    prove that the constructed LDPC lattices together with the proposed decoding
    method admit diversity order n-1 over an n-block-fading channel.

    Optimization and Analysis of Probabilistic Caching in (N)-tier Heterogeneous Networks

    Kuikui Li, Chenchen Yang, Zhiyong Chen, Meixia Tao
    Comments: submitted to IEEE Trans. Wireless Communications
    Subjects: Information Theory (cs.IT)

    In this paper, we study the probabilistic caching for an N-tier wireless
    heterogeneous network (HetNet) using stochastic geometry. A general and
    tractable expression of the successful delivery probability (SDP) is first
    derived. We then optimize the caching probabilities for maximizing the SDP in
    the high signal-to-noise ratio (SNR) region. The problem is proved to be convex
    and solved efficiently. We next establish an interesting connection between
    N-tier HetNets and single-tier networks. Unlike the single-tier network where
    the optimal performance only depends on the cache size, the optimal performance
    of N-tier HetNets depends also on the BS densities. The performance upper bound
    is, however, determined by an equivalent single-tier network. We further show
    that with even caching probabilities regardless of content popularities, to
    achieve a target SDP, the BS density of a tier can be reduced by increasing the
    cache size of the tier when the cache size is larger than a threshold;
otherwise the BS density and BS cache size can be increased simultaneously. It
is also found analytically that the BS density of a tier is inversely
proportional to the BS cache size of the same tier and linear in the BS cache
sizes of other tiers.

    Precoding under Instantaneous Per-Antenna Peak Power Constraint

    Hela Jedda, Amine Mezghani, A. Lee Swindlehurst, Josef A. Nossek
    Comments: 5 pages, submitted to ICC 2017
    Subjects: Information Theory (cs.IT)

    We consider a multi-user (MU) multiple-input-single-output (MISO) downlink
    system with M single-antenna users and N transmit antennas with a nonlinear
    power amplifier (PA) at each antenna. Instead of emitting constant envelope
    (CE) signals from the antennas to have highly power efficient PAs, we relax the
    CE constraint and allow the transmit signals to have instantaneous power less
    than or equal to the available power at each PA. The PA power efficiency
    decreases but simulation results show that the same performance in terms of
    bit-error-ratio (BER) can be achieved with less transmitted power and less PA
    power consumption. We propose a linear and a nonlinear precoder design to
    mitigate the multi-user interference (MUI) under the constraint of a maximal
    instantaneous per-antenna peak power.

    Rate-Achieving Policy in Finite-Horizon Throughput Region for Multi-User Interference Channels

    Yirui Cong, Xiangyun Zhou, Rodney A. Kennedy
    Comments: This paper is accepted for publication in Globecom’16
    Subjects: Information Theory (cs.IT)

    This paper studies a wireless network consisting of multiple
    transmitter-receiver pairs sharing the same spectrum where interference is
    regarded as noise. Previously, the throughput region of such a network was
    characterized for either one time slot or an infinite time horizon. This work
    aims to close the gap by investigating the throughput region for transmissions
    over a finite time horizon. We derive an efficient algorithm to examine the
    achievability of any given rate in the finite-horizon throughput region and
    provide the rate-achieving policy. The computational efficiency of our
    algorithm comes from the use of A* search with a carefully chosen heuristic
    function and a tree pruning strategy. We also show that the celebrated
    max-weight algorithm which finds all achievable rates in the infinite-horizon
    throughput region fails to work for the finite-horizon throughput region.

    Design of 5G Full Dimension Massive MIMO Systems

    Qurrat-Ul-Ain Nadeem, Abla Kammoun, Mérouane Debbah, and Mohamed-Slim Alouini
    Subjects: Information Theory (cs.IT)

    Massive multiple-input-multiple-output (MIMO) transmission is a promising
    technology to improve the capacity and reliability of wireless systems.
    However, the number of antennas that can be equipped at a base station (BS) is
    limited by the BS form factor, posing a challenge to the deployment of massive
    linear arrays. To cope with this limitation, this work discusses Full Dimension
    MIMO (FD-MIMO), which is currently an active area of research and
    standardization in the 3rd Generation Partnership Project (3GPP) for evolution
    towards fifth generation (5G) cellular systems. FD-MIMO utilizes an active
    antenna system (AAS) with a 2D planar array structure, which provides the
    ability of adaptive electronic beamforming in the 3D space. This paper presents
    the design of the AAS and the ongoing efforts in the 3GPP to develop the
    corresponding 3D channel model. Compact structure of large-scale antenna arrays
    drastically increases the spatial correlation in FD-MIMO systems. In order to
    account for its effects, the generalized spatial correlation functions for
    channels constituted by individual antenna elements and overall antenna ports
    in the AAS are derived. Exploiting the quasi-static channel covariance matrices
    of the users, the problem of determining the optimal downtilt weight vector for
    antenna ports, which maximizes the minimum signal-to-interference ratio of a
    multi-user multiple-input-single-output system, is formulated as a fractional
    optimization problem. A quasi-optimal solution is obtained through the
    application of semi-definite relaxation and Dinkelbach’s method. Finally, the
    user-group specific elevation beamforming scenario is devised, which offers
    significant performance gains as confirmed through simulations. These results
    have direct application in the analysis of 5G FD-MIMO systems.

    On the Identification of SM and Alamouti Coded SC-FDMA Signals: A Statistical-Based Approach

    Yahia A. Eldemerdash, Octavia A. Dobre
    Subjects: Information Theory (cs.IT)

    Signal identification represents the task of a receiver to identify the
    signal type and its parameters, with applications to both military and
    commercial communications. In this paper, we investigate the identification of
    spatial multiplexing (SM) and Alamouti (AL) space-time block code (STBC) with
    single carrier frequency division multiple access (SC-FDMA) signals, when the
    receiver is equipped with a single antenna. We develop a discriminating feature
    based on a fourth-order statistic of the received signal, as well as a constant
    false alarm rate decision criterion which relies on the statistical properties
    of the feature estimate. Furthermore, we present the theoretical performance
    analysis of the proposed identification algorithm. The algorithm does not
    require channel or noise power estimation, modulation classification, or block
    synchronization. Simulation results show the validity of the proposed
    algorithm, as well as a very good agreement with the theoretical analysis.
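
    The paper defines its own discriminating feature; purely to illustrate what a
    fourth-order statistic with a threshold test looks like, the hypothetical
    sketch below estimates the cumulant C42 of the received samples and compares
    it against a threshold (the CFAR threshold derivation is omitted).

        import numpy as np

        def c42_estimate(y):
            # Sample estimate of the fourth-order cumulant of zero-mean y:
            # C42 = E|y|^4 - |E[y^2]|^2 - 2 (E|y|^2)^2.
            m2 = np.mean(np.abs(y) ** 2)
            m2c = np.mean(y ** 2)
            m4 = np.mean(np.abs(y) ** 4)
            return m4 - np.abs(m2c) ** 2 - 2 * m2 ** 2

        def classify(y, threshold):
            # Threshold test standing in for the paper's constant false alarm
            # rate criterion (the threshold would be set from the statistical
            # distribution of the feature estimate).
            return "AL" if np.abs(c42_estimate(y)) > threshold else "SM"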

    Spatial multi-LRU: Distributed Caching for Wireless Networks with Coverage Overlaps

    Anastasios Giovanidis, Apostolos Avranas
    Comments: 14 pages, double column, 5 figures, 15 sub-figures in total. arXiv admin note: substantial text overlap with arXiv:1602.07623
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT); Multimedia (cs.MM); Performance (cs.PF)

    This article introduces a novel family of decentralised caching policies,
    applicable to wireless networks with finite storage at the edge-nodes
    (stations). These policies, that are based on the Least-Recently-Used
    replacement principle, are here referred to as spatial multi-LRU. They update
    cache inventories in a way that provides content diversity to users who are
    covered by, and thus have access to, more than one station. Two variations are
    proposed, the multi-LRU-One and -All, which differ in the number of replicas
    inserted in the involved caches. We analyse their performance under two types
    of traffic demand, the Independent Reference Model (IRM) and a model that
    exhibits temporal locality. For IRM, we propose a Che-like approximation to
    predict the hit probability, which gives very accurate results. Numerical
    evaluations show that the performance of multi-LRU improves as the
    multi-coverage areas grow, and approaches that of centralised policies when
    multi-coverage is sufficient. For IRM traffic,
    multi-LRU-One is preferable to multi-LRU-All, whereas when the traffic exhibits
    temporal locality the -All variation can perform better. Both variations
    outperform the simple LRU. When popularity knowledge is not accurate, the new
    policies can perform better than centralised ones.
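
    A toy sketch of the hit/insert logic that separates the two variations
    (station geometry, coverage, and eviction details beyond plain LRU are
    abstracted away):

        from collections import OrderedDict

        class LRUCache:
            def __init__(self, capacity):
                self.capacity, self.store = capacity, OrderedDict()

            def lookup(self, item):
                if item in self.store:
                    self.store.move_to_end(item)  # refresh recency on a hit
                    return True
                return False

            def insert(self, item):
                self.store[item] = True
                self.store.move_to_end(item)
                if len(self.store) > self.capacity:
                    self.store.popitem(last=False)  # evict least recently used

        def request(covering_caches, item, variant="One"):
            # `covering_caches`: caches of all stations covering the user.
            for cache in covering_caches:
                if cache.lookup(item):
                    return True  # hit: served from the edge
            if variant == "One":
                covering_caches[0].insert(item)  # single replica
            else:  # "All": a replica in every covering cache
                for cache in covering_caches:
                    cache.insert(item)
            return False  # miss: fetched from the core network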

    A High-Capacity Separable Reversible Method for Hiding Multiple Messages in Encrypted Images

    M. Hassan Najafi, David J. Lilja
    Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)

    This work proposes a high-capacity scheme for separable reversible data
    hiding in encrypted images. At the sender side, the original uncompressed image
    is encrypted using an encryption key. One or several data hiders use the MSB of
    some image pixels to hide additional data. Given the encrypted image containing
    this additional data, with only one of those data hiding keys, the receiver can
    extract the corresponding embedded data, although the image content will remain
    inaccessible. With all of the embedding keys, the receiver can extract all of
    the embedded data. Finally, with the encryption key, the receiver can decrypt
    the received data and reconstruct the original image perfectly by exploiting
    the spatial correlation of natural images. Based on the proposed method, a
    receiver can recover the original image perfectly even when it does not have
    the data embedding key(s) and the embedding rate is high.
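
    A toy sketch of the MSB embedding mechanics only; encryption, key management,
    and the correlation-based image recovery are omitted, and `positions` stands
    in for pixel locations derived from a data hiding key:

        import numpy as np

        def embed_bits(encrypted_img, positions, bits):
            # Overwrite the most significant bit of selected pixels of an
            # (already encrypted) 8-bit grayscale image. Illustrative only.
            marked = encrypted_img.copy()
            for (r, c), b in zip(positions, bits):
                marked[r, c] = (marked[r, c] & 0x7F) | (b << 7)
            return marked

        def extract_bits(marked_img, positions):
            # Read back the embedded bits from the same positions.
            return [(int(marked_img[r, c]) >> 7) & 1 for (r, c) in positions]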

    Millimeter Wave V2V Communications: Distributed Association and Beam Alignment

    Cristina Perfecto, Javier Del Ser, Mehdi Bennis
    Comments: 14 pages, 6 figures
    Subjects: Networking and Internet Architecture (cs.NI); Computer Science and Game Theory (cs.GT); Information Theory (cs.IT)

    Recently, millimeter-wave bands have been postulated as a means to accommodate
    the foreseen extreme bandwidth demands in vehicular communications, which
    result from the dissemination of sensory data to nearby vehicles for enhanced
    environmental awareness and improved safety level. However, the literature is
    particularly scarce in regards to principled resource allocation schemes that
    deal with the challenging radio conditions posed by the high mobility of
    vehicular scenarios. In this work we propose a novel framework that blends
    together Matching Theory and Swarm Intelligence to dynamically and efficiently
    pair vehicles and optimize both transmission and reception beamwidths. This is
    done by jointly considering Channel (CSI) and Queue (QSI) State Information
    when establishing vehicle-to-vehicle (V2V) links. To validate the proposed
    framework simulation results are presented and discussed where the throughput
    performance as well as the latency/reliability trade-offs of the proposed
    approach are assessed and compared to several baseline approaches recently
    proposed in the literature. The results obtained in our study - with
    performance gains in terms of reliability and delay of up to 25% for
    ultra-dense vehicular scenarios and, on average, 50% more paired vehicles than
    some of the baselines -
    shed light on the operational limits and practical feasibility of mmWave bands
    as a viable radio access solution for future high-rate V2V communications.
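
    As a loose caricature of the association stage (not the paper's
    matching-theoretic scheme), a greedy utility-driven pairing could look as
    follows, assuming `utilities` already blends CSI and QSI into one score per
    candidate link:

        def greedy_pairing(utilities):
            # `utilities[(tx, rx)]` is a scalar score for a candidate V2V
            # link; repeatedly commit the best still-feasible pair.
            pairs, used_tx, used_rx = [], set(), set()
            for (tx, rx), u in sorted(utilities.items(), key=lambda kv: -kv[1]):
                if tx not in used_tx and rx not in used_rx:
                    pairs.append((tx, rx))
                    used_tx.add(tx)
                    used_rx.add(rx)
            return pairs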

    Parameter Estimation under Gaussian Model Uncertainties by Iterative Covariance Approximation

    Oliver Lang, Michael Lunglmayr, Mario Huemer
    Subjects: Statistics Theory (math.ST); Information Theory (cs.IT)

    We propose a novel iterative algorithm for estimating a deterministic but
    unknown parameter vector in the presence of Gaussian model uncertainties. This
    iterative algorithm is based on a system model where an overall noise term
    describes both the measurement noise and the noise resulting from the model
    uncertainties. This overall noise term is a function of the true parameter
    vector. The proposed iterative algorithm can be applied on structured as well
    as unstructured models and it outperforms prior art algorithms for a broad
    range of applications.
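
    A generic fixed-point loop in this spirit (not the authors' exact algorithm)
    alternates a weighted least-squares step with a covariance refresh, assuming a
    user-supplied model `cov_of(x)` for how the overall noise covariance depends
    on the parameter:

        import numpy as np

        def iterative_wls(H, y, cov_of, x0, n_iter=20):
            # Linear model y = H x + n(x), where the overall noise covariance
            # C(x) = cov_of(x) depends on the unknown parameter vector x.
            x = x0
            for _ in range(n_iter):
                W = np.linalg.inv(cov_of(x))                   # whitening weights
                x = np.linalg.solve(H.T @ W @ H, H.T @ W @ y)  # WLS update
            return x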

    Gradient Coding

    Rashish Tandon, Qi Lei, Alexandros G. Dimakis, Nikos Karampatziakis
    Comments: 13 pages, Presented at the Machine Learning Systems Workshop at NIPS 2016
    Subjects: Machine Learning (stat.ML); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Computation (stat.CO)

    We propose a novel coding theoretic framework for mitigating stragglers in
    distributed learning. We show how carefully replicating data blocks and coding
    across gradients can provide tolerance to failures and stragglers for
    synchronous Gradient Descent. We implement our scheme in MPI and compare it
    against baseline architectures in terms of running time and generalization
    error.
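
    In the spirit of this framework, the smallest instance (three workers, each
    holding two of three data blocks, tolerating one straggler) can be checked
    numerically; the coefficients below are one valid choice and not necessarily
    the paper's:

        import numpy as np

        g1, g2, g3 = (np.random.randn(4) for _ in range(3))  # per-block gradients

        # Each worker sends one coded combination of the gradients of the two
        # blocks it holds; g1 + g2 + g3 is recoverable from any two workers.
        sent = {1: 0.5 * g1 + g2,   # worker 1 holds blocks 1, 2
                2: g2 - g3,         # worker 2 holds blocks 2, 3
                3: 0.5 * g1 + g3}   # worker 3 holds blocks 1, 3

        decode = {(1, 2): (2, -1), (1, 3): (1, 1), (2, 3): (1, 2)}
        for (i, j), (a, b) in decode.items():
            assert np.allclose(a * sent[i] + b * sent[j], g1 + g2 + g3)
        print("full gradient recovered from any two of three workers")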



