IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Mon, 8 May 2017

    我爱机器学习(52ml.net)发表于 2017-05-08 00:00:00
    love 0

    Neural and Evolutionary Computing

    Discrete Modeling of Multi-Transmitter Neural Networks with Neuron Competition

    Nikolay Bazenkov, Dmitry Vorontsov, Varvara Dyakonova, Liudmila Zhilyakova, Oleg Kuznetsov, Dmitri Sakharov
    Comments: 13 pages, 4 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)

    We propose a novel discrete model of central pattern generators (CPG),
    neuronal ensembles generating rhythmic activity. The model emphasizes the role
    of nonsynaptic interactions and the diversity of electrical properties in
    nervous systems. Neurons in the model release different neurotransmitters into
    the shared extracellular space (ECS) so each neuron with the appropriate set of
    receptors can receive signals from other neurons. We consider neurons,
    differing in their electrical activity, represented as finite-state machines
    functioning in discrete time steps. Discrete modeling is aimed to provide a
    computationally tractable and compact explanation of rhythmic pattern
    generation in nervous systems. The important feature of the model is the
    introduced mechanism of neuronal competition which is shown to be responsible
    for the generation of proper rhythms. The model is illustrated with two
    examples: a half-center oscillator considered to be a basic mechanism of
    emerging rhythmic activity and the well-studied feeding network of a pond
    snail. Future research will focus on the neuromodulatory effects ubiquitous in
    CPG networks and the whole nervous systems.

    Exponential scaling of neural algorithms – a future beyond Moore's Law?

    James B. Aimone
    Comments: Draft version of a perspective article to be submitted
    Subjects: Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)

    Although the brain has long been considered a potential inspiration for
    future computing, Moore’s Law – the scaling property that has seen revolutions
    in technologies ranging from supercomputers to smart phones – has largely been
    driven by advances in materials science. As the ability to miniaturize
    transistors is coming to an end, there is increasing attention on new
    approaches to computation, including renewed enthusiasm around the potential of
    neural computation. Recent advances in neurotechnologies, many of which have
    been aided by computing’s rapid progression over recent decades, are now
    reigniting this opportunity to bring neural computation insights into broader
    computing applications. As we understand more about the brain, our ability to
    motivate new computing paradigms with continue to progress. These new
    approaches to computing, which we are already seeing in techniques such as deep
    learning, will themselves improve our ability to learn about the brain and
    accordingly can be projected to give rise to even further insights. Such a
    positive feedback has the potential to change the complexion of how computing
    sciences and neurosciences interact, and suggests that the next form of
    exponential scaling in computing may emerge from our progressive understanding
    of the brain.

    Analysis and Design of Convolutional Networks via Hierarchical Tensor Decompositions

    Nadav Cohen, Or Sharir, Ronen Tamari, David Yakira, Yoav Levine, Amnon Shashua
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    The driving force behind convolutional networks – the most successful deep
    learning architecture to date, is their expressive power. Despite its wide
    acceptance and vast empirical evidence, formal analyses supporting this belief
    are scarce. The primary notions for formally reasoning about expressiveness are
    efficiency and inductive bias. Efficiency refers to the ability of a network
    architecture to realize functions that require an alternative architecture to
    be much larger. Inductive bias refers to the prioritization of some functions
    over others given prior knowledge regarding a task at hand. In this paper we
    provide a high-level overview of a series of works written by the authors, that
    through an equivalence to hierarchical tensor decompositions, analyze the
    expressive efficiency and inductive bias of various architectural features in
    convolutional networks (depth, width, convolution strides and more). The
    results presented shed light on the demonstrated effectiveness of convolutional
    networks, and in addition, provide new tools for network design.


    Computer Vision and Pattern Recognition

    ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases

    Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald M. Summers
    Comments: CVPR 2017 spotlight
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)

    The chest X-ray is one of the most commonly accessible radiological
    examinations for screening and diagnosis of many lung diseases. A tremendous
    number of X-ray imaging studies accompanied by radiological reports are
    accumulated and stored in many modern hospitals’ Picture Archiving and
    Communication Systems (PACS). On the other side, it is still an open question
    how this type of hospital-size knowledge database containing invaluable imaging
    informatics (i.e., loosely labeled) can be used to facilitate the data-hungry
    deep learning paradigms in building truly large-scale high precision
    computer-aided diagnosis (CAD) systems.

    In this paper, we present a new chest X-ray database, namely “ChestX-ray8”,
    which comprises 108,948 frontal-view X-ray images of 32,717 unique patients
    with the text-mined eight disease image labels (where each image can have
    multi-labels), from the associated radiological reports using natural language
    processing. Importantly, we demonstrate that these commonly occurring thoracic
    diseases can be detected and even spatially-located via a unified
    weakly-supervised multi-label image classification and disease localization
    framework, which is validated using our proposed dataset. Although the initial
    quantitative results are promising as reported, deep convolutional neural
    network based “reading chest X-rays” (i.e., recognizing and locating the common
    disease patterns trained with only image-level labels) remains a strenuous task
    for fully-automated high precision CAD systems.

    S-OHEM: Stratified Online Hard Example Mining for Object Detection

    Minne Li, Zhaoning Zhang, Hao Yu, Xinyuan Chen, Dongsheng Li
    Comments: 9 pages, 4 figures, submitted to BMVC 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    One of the major challenges in object detection is to propose detectors with
    highly accurate localization of objects. The online sampling of high-loss
    region proposals (hard examples) has made training of region-based deep
    convolutional network detectors effective and efficient. In this paper, we
    present the Stratified Online Hard Example Mining (S-OHEM) algorithm for
    training higher efficiency and accuracy detectors. S-OHEM exploits OHEM with
    stratified sampling, a widely adopted sampling technique. OHEM uses the
    multitask loss with equal weight settings across all loss types (e.g.,
    classification and localization, rigid and non-rigid categories) and ignores
    the influence of different loss distribution throughout the training process,
    which we found essential to the training efficacy. By maintaining a sampling
    distribution according to this influence during hard example mining, we can
    enhance the performance of object detectors. Based on this, S-OHEM samples the
    training examples according to this distribution before feeding them to the
    backpropagation process. We show through systematic experiments that S-OHEM
    yields an average precision (AP) improvement of 0.5% on rigid categories of
    PASCAL VOC 2007 for both the IoU threshold of 0.6 and 0.7. For KITTI 2012, both
    results of the same metric are 1.6%. Regarding the mean average precision
    (mAP), a relative increase of 0.3% and 0.5% (1% and 0.5%) is observed for VOC07
    (KITTI12) using the same set of IoU threshold. Also, S-OHEM is easy to
    integrate with existing region-based detectors and is capable of acting with
    post-recognition level regressors.

    Unsupervised learning of object landmarks by factorized spatial embeddings

    James Thewlis, Hakan Bilen, Andrea Vedaldi
    Comments: 17 pages, 14 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Automatically learning the structure of object categories remains an
    important open problem in computer vision. We propose a novel unsupervised
    approach that can discover and learn to detect landmarks in object categories,
    thus characterizing their structure. Our approach is based on factorizing image
    deformations, as induced by a viewpoint change or an object articulation, by
    learning a deep neural network that detects landmarks compatible with such
    visual effects. We show that, by requiring the same neural network to be
    applicable to different object instances, our method naturally induces
    meaningful correspondences between different object instances in a category. We
    assess the method qualitatively on a variety of object types, natural an
    man-made. We also show that our unsupervised landmarks are highly predictive of
    manually-annotated landmarks in faces benchmark datasets, and can be used to
    regress those with a high degree of accuracy.

    Unified Embedding and Metric Learning for Zero-Exemplar Event Detection

    Noureldien Hussein, Efstratios Gavves, Arnold W.M. Smeulders
    Comments: IEEE CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Event detection in unconstrained videos is conceived as a content-based video
    retrieval with two modalities: textual and visual. Given a text describing a
    novel event, the goal is to rank related videos accordingly. This task is
    zero-exemplar, no video examples are given to the novel event.

    Related works train a bank of concept detectors on external data sources.
    These detectors predict confidence scores for test videos, which are ranked and
    retrieved accordingly. In contrast, we learn a joint space in which the visual
    and textual representations are embedded. The space casts a novel event as a
    probability of pre-defined events. Also, it learns to measure the distance
    between an event and its related videos.

    Our model is trained end-to-end on publicly available EventNet. When applied
    to TRECVID Multimedia Event Detection dataset, it outperforms the
    state-of-the-art by a considerable margin.

    Part-based Deep Hashing for Large-scale Person Re-identification

    Fuqing Zhu, Xiangwei Kong, Liang Zheng, Haiyan Fu, Qi Tian
    Comments: 12 pages, 4 figures. IEEE Transactions on Image Processing, 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Large-scale is a trend in person re-identification (re-id). It is important
    that real-time search be performed in a large gallery. While previous methods
    mostly focus on discriminative learning, this paper makes the attempt in
    integrating deep learning and hashing into one framework to evaluate the
    efficiency and accuracy for large-scale person re-id. We integrate spatial
    information for discriminative visual representation by partitioning the
    pedestrian image into horizontal parts. Specifically, Part-based Deep Hashing
    (PDH) is proposed, in which batches of triplet samples are employed as the
    input of the deep hashing architecture. Each triplet sample contains two
    pedestrian images (or parts) with the same identity and one pedestrian image
    (or part) of the different identity. A triplet loss function is employed with a
    constraint that the Hamming distance of pedestrian images (or parts) with the
    same identity is smaller than ones with the different identity. In the
    experiment, we show that the proposed Part-based Deep Hashing method yields
    very competitive re-id accuracy on the large-scale Market-1501 and
    Market-1501+500K datasets.

    Bridging between Computer and Robot Vision through Data Augmentation: a Case Study on Object Recognition

    Antonio D'Innocente, Fabio Maria Carlucci, Mirco Colosi, Barbara Caputo
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Despite the impressive progress brought by deep network in visual object
    recognition, robot vision is still far from being a solved problem. The most
    successful convolutional architectures are developed starting from ImageNet, a
    large scale collection of images of object categories downloaded from the Web.
    This kind of images is very different from the situated and embodied visual
    experience of robots deployed in unconstrained settings. To reduce the gap
    between these two visual experiences, this paper proposes a simple yet
    effective data augmentation layer that zooms on the object of interest and
    simulates the object detection outcome of a robot vision system. The layer,
    that can be used with any convolutional deep architecture, brings to an
    increase in object recognition performance of up to 7\%, in experiments
    performed over three different benchmark databases. Upon acceptance of the
    paper, our robot data augmentation layer will be made publicly available.

    TALL: Temporal Activity Localization via Language Query

    Jiyang Gao, Chen Sun, Zhenheng Yang, Ram Nevatia
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper focuses on temporal localization of actions in untrimmed videos.
    Existing methods typically train classifiers for a pre-defined list of actions
    and apply them in a sliding window fashion. However, activities in the wild
    consist of a wide combination of actors, actions and objects; it is difficult
    to design a proper activity list that meets users’ needs. We propose to
    localize activities by natural language queries. Temporal Activity Localization
    via Language (TALL) is challenging as it requires: (1) suitable design of text
    and video representations to allow cross-modal matching of actions and language
    queries; (2) ability to locate actions accurately given features from sliding
    windows of limited granularity. We propose a novel Cross-modal Temporal
    Regression Localizer (CTRL) to jointly model text query and video clips, output
    alignment scores and action boundary regression results for candidate clips.
    For evaluation, we adopt TaCoS dataset, and build a new dataset for this task
    on top of Charades by adding sentence temporal annotations, called
    Charades-STA. We also build complex sentence queries in Charades-STA for test.
    Experimental results show that CTRL outperforms previous methods significantly
    on both datasets.

    Characterizing and Improving Stability in Neural Style Transfer

    Agrim Gupta, Justin Johnson, Alexandre Alahi, Li Fei-Fei
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recent progress in style transfer on images has focused on improving the
    quality of stylized images and speed of methods. However, real-time methods are
    highly unstable resulting in visible flickering when applied to videos. In this
    work we characterize the instability of these methods by examining the solution
    set of the style transfer objective. We show that the trace of the Gram matrix
    representing style is inversely related to the stability of the method. Then,
    we present a recurrent convolutional network for real-time video style transfer
    which incorporates a temporal consistency loss and overcomes the instability of
    prior methods. Our networks can be applied at any resolution, do not re- quire
    optical flow at test time, and produce high quality, temporally consistent
    stylized videos in real-time.

    Motion Prediction Under Multimodality with Conditional Stochastic Networks

    Katerina Fragkiadaki, Jonathan Huang, Alex Alemi, Sudheendra Vijayanarasimhan, Susanna Ricco, Rahul Sukthankar
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Given a visual history, multiple future outcomes for a video scene are
    equally probable, in other words, the distribution of future outcomes has
    multiple modes. Multimodality is notoriously hard to handle by standard
    regressors or classifiers: the former regress to the mean and the latter
    discretize a continuous high dimensional output space. In this work, we present
    stochastic neural network architectures that handle such multimodality through
    stochasticity: future trajectories of objects, body joints or frames are
    represented as deep, non-linear transformations of random (as opposed to
    deterministic) variables. Such random variables are sampled from simple
    Gaussian distributions whose means and variances are parametrized by the output
    of convolutional encoders over the visual history. We introduce novel
    convolutional architectures for predicting future body joint trajectories that
    outperform fully connected alternatives cite{DBLP:journals/corr/WalkerDGH16}.
    We introduce stochastic spatial transformers through optical flow warping for
    predicting future frames, which outperform their deterministic equivalents
    cite{DBLP:journals/corr/PatrauceanHC15}. Training stochastic networks involves
    an intractable marginalization over stochastic variables. We compare various
    training schemes that handle such marginalization through a) straightforward
    sampling from the prior, b) conditional variational autoencoders
    cite{NIPS2015_5775,DBLP:journals/corr/WalkerDGH16}, and, c) a proposed
    K-best-sample loss that penalizes the best prediction under a fixed “prediction
    budget”. We show experimental results on object trajectory prediction, human
    body joint trajectory prediction and video prediction under varying future
    uncertainty, validating quantitatively and qualitatively our architectural
    choices and training schemes.

    Streaming Algorithm for Euler Characteristic Curves of Multidimensional Images

    Teresa Heiss, Hubert Wagner
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Algebraic Topology (math.AT)

    We present an efficient algorithm to compute Euler characteristic curves of
    gray scale images of arbitrary dimension. In various applications the Euler
    characteristic curve is used as a descriptor of an image.

    Our algorithm is the first streaming algorithm for Euler characteristic
    curves. The usage of streaming removes the necessity to store the entire image
    in RAM. Experiments show that our implementation handles terabyte scale images
    on commodity hardware. Due to lock-free parallelism, it scales well with the
    number of processor cores.

    Additionally, we put the concept of the Euler characteristic curve in the
    wider context of computational topology. In particular, we explain the
    connection with persistence diagrams.

    Detecting Adversarial Samples Using Density Ratio Estimates

    Lovedeep Gondara
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Machine learning models, especially based on deep learning are used in
    everyday applications ranging from self driving cars to medical diagnostics.
    However, it is easy to trick such models using adversarial samples,
    indistinguishable from real samples to human eye, such samples can bias models
    towards incorrect classification. Impact of adversarial samples is far-reaching
    and efficient detection of adversarial samples remains an open problem. In this
    paper we propose to use direct density ratio estimation to detect adversarial
    samples, we empirically show that adversarial samples have different underlying
    probability densities compared to real samples. Our proposed method works well
    with colored and grayscale images, and with different adversarial sample
    generation methods.

    Phase Congruency Parameter Optimization for Enhanced Detection of Image Features for both Natural and Medical Applications

    Seyed Mohammad Mahdi Alavi, Yunyan Zhang
    Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV)

    Following the presentation and proof of the hypothesis that image features
    are particularly perceived at points where the Fourier components are maximally
    in phase, the concept of phase congruency (PC) is introduced. Subsequently, a
    two-dimensional multi-scale phase congruency (2D-MSPC) is developed, which has
    been an important tool for detecting and evaluation of image features. However,
    the 2D-MSPC requires many parameters to be appropriately tuned for optimal
    image features detection. In this paper, we defined a criterion for parameter
    optimization of the 2D-MSPC, which is a function of its maximum and minimum
    moments. We formulated the problem in various optimal and suboptimal
    frameworks, and discussed the conditions and features of the suboptimal
    solutions. The effectiveness of the proposed method was verified through
    several examples, ranging from natural objects to medical images from patients
    with a neurological disease, multiple sclerosis.

    GRASS: Generative Recursive Autoencoders for Shape Structures

    Jun Li, Kai Xu, Siddhartha Chaudhuri, Ersin Yumer, Hao Zhang, Leonidas Guibas
    Comments: Corresponding author: Kai Xu (kevin.kai.xu@gmail.com)
    Journal-ref: ACM Transactions on Graphics (SIGGRAPH 2017) 36, 4, Article 52
    Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)

    We introduce a novel neural network architecture for encoding and synthesis
    of 3D shapes, particularly their structures. Our key insight is that 3D shapes
    are effectively characterized by their hierarchical organization of parts,
    which reflects fundamental intra-shape relationships such as adjacency and
    symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a
    flat, unlabeled, arbitrary part layout to a compact code. The code effectively
    captures hierarchical structures of man-made 3D objects of varying structural
    complexities despite being fixed-dimensional: an associated decoder maps a code
    back to a full hierarchy. The learned bidirectional mapping is further tuned
    using an adversarial setup to yield a generative model of plausible structures,
    from which novel structures can be sampled. Finally, our structure synthesis
    framework is augmented by a second trained module that produces fine-grained
    part geometry, conditioned on global and local structural context, leading to a
    full generative pipeline for 3D shapes. We demonstrate that without
    supervision, our network learns meaningful structural hierarchies adhering to
    perceptual grouping principles, produces compact codes which enable
    applications such as shape classification and partial matching, and supports
    shape synthesis and interpolation with significant variations in topology and
    geometry.


    Artificial Intelligence

    SLDR-DL: A Framework for SLD-Resolution with Deep Learning

    Cheng-Hao Cai
    Comments: 12 pages, 5 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Logic in Computer Science (cs.LO)

    This paper introduces an SLD-resolution technique based on deep learning.
    This technique enables neural networks to learn from old and successful
    resolution processes and to use learnt experiences to guide new resolution
    processes. An implementation of this technique is named SLDR-DL. It includes a
    Prolog library of deep feedforward neural networks and some essential functions
    of resolution. In the SLDR-DL framework, users can define logical rules in the
    form of definite clauses and teach neural networks to use the rules in
    reasoning processes.

    Distributed Online Learning of Event Definitions

    Nikos Katzouris, Alexander Artikis, Georgios Paliouras
    Subjects: Artificial Intelligence (cs.AI)

    Logic-based event recognition systems infer occurrences of events in time
    using a set of event definitions in the form of first-order rules. The Event
    Calculus is a temporal logic that has been used as a basis in event recognition
    applications, providing among others, direct connections to machine learning,
    via Inductive Logic Programming (ILP). OLED is a recently proposed ILP system
    that learns event definitions in the form of Event Calculus theories, in a
    single pass over a data stream. In this work we present a version of OLED that
    allows for distributed, online learning. We evaluate our approach on a
    benchmark activity recognition dataset and show that we can significantly
    reduce training times, exchanging minimal information between processing nodes.

    Data Readiness Levels

    Neil D. Lawrence
    Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Learning (cs.LG)

    Application of models to data is fraught. Data-generating collaborators often
    only have a very basic understanding of the complications of collating,
    processing and curating data. Challenges include: poor data collection
    practices, missing values, inconvenient storage mechanisms, intellectual
    property, security and privacy. All these aspects obstruct the sharing and
    interconnection of data, and the eventual interpretation of data through
    machine learning or other approaches. In project reporting, a major challenge
    is in encapsulating these problems and enabling goals to be built around the
    processing of data. Project overruns can occur due to failure to account for
    the amount of time required to curate and collate. But to understand these
    failures we need to have a common language for assessing the readiness of a
    particular data set. This position paper proposes the use of data readiness
    levels: it gives a rough outline of three stages of data preparedness and
    speculates on how formalisation of these levels into a common language for data
    readiness could facilitate project management.

    Group invariance principles for causal generative models

    Michel Besserve, Naji Shajarisales, Bernhard Schölkopf, Dominik Janzing
    Comments: 16 pages, 6 figures
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Statistics Theory (math.ST)

    The postulate of independence of cause and mechanism (ICM) has recently led
    to several new causal discovery algorithms. The interpretation of independence
    and the way it is utilized, however, varies across these methods. Our aim in
    this paper is to propose a group theoretic framework for ICM to unify and
    generalize these approaches. In our setting, the cause-mechanism relationship
    is assessed by comparing it against a null hypothesis through the application
    of random generic group transformations. We show that the group theoretic view
    provides a very general tool to study the structure of data generating
    mechanisms with direct applications to machine learning.

    A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations

    Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanaphongs, Enrico Bertini
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI)

    Human-in-the-loop data analysis applications necessitate greater transparency
    in machine learning models for experts to understand and trust their decisions.
    To this end, we propose a visual analytics workflow to help data scientists and
    domain experts explore, diagnose, and understand the decisions made by a binary
    classifier. The approach leverages “instance-level explanations”, measures of
    local feature relevance that explain single instances, and uses them to build a
    set of visual representations that guide the users in their investigation. The
    workflow is based on three main visual representations and steps: one based on
    aggregate statistics to see how data distributes across correct / incorrect
    decisions; one based on explanations to understand which features are used to
    make these decisions; and one based on raw data, to derive insights on
    potential root causes for the observed patterns. The work is validated through
    a long-term collaboration with a group of machine learning and healthcare
    professionals who used our method to make sense of machine learning models they
    developed. The case study from this collaboration demonstrates that the
    proposed method helps experts derive useful knowledge about the model and the
    phenomena it describes and also generate useful hypotheses on how a model can
    be improved.


    Information Retrieval

    Social Media Advertisement Outreach: Learning the Role of Aesthetics

    Avikalp Srivastava, Madhav Datt, Jaikrishna Chaparala, Shubham Mangla, Priyadarshi Patnaik
    Comments: Accepted to SIGIR 2017
    Subjects: Information Retrieval (cs.IR)

    Corporations spend millions of dollars on developing creative image-based
    promotional content to advertise to their user-base on platforms like Twitter.
    Our paper is an initial study, where we propose a novel method to evaluate and
    improve outreach of promotional images from corporations on Twitter, based
    purely on their describable aesthetic attributes. Existing works in aesthetic
    based image analysis exclusively focus on the attributes of digital
    photographs, and are not applicable to advertisements due to the influences of
    inherent content and context based biases on outreach.

    Our paper identifies broad categories of biases affecting such images,
    describes a method for normalization to eliminate effects of those biases and
    score images based on their outreach, and examines the effects of certain
    handcrafted describable aesthetic features on image outreach. Optimizing on the
    describable aesthetic features resulting from this research is a simple method
    for corporations to complement their existing marketing strategy to gain
    significant improvement in user engagement on social media for promotional
    images.

    A Probabilistic Model for Collaborative Filtering with Implicit and Explicit Feedback Data

    ThaiBinh Nguyen, Kenro Aihara, Atsuhiro Takasu
    Subjects: Information Retrieval (cs.IR)

    Collaborative filtering (CF) is one of the most efficient ways for
    recommender systems. Typically, CF-based algorithms analyze users’ preferences
    and items’ attributes using one of two types of feedback: extit{explicit
    feedback} (e.g., ratings given to item by users, like/dislike) or
    extit{implicit feedback} (e.g., clicks, views, purchases). Explicit feedback
    is reliable but is extremely sparse; whereas implicit feedback is abundant but
    is not reliable. To leverage the sparsity of explicit feedback, in this paper,
    we propose a model that efficiently combines explicit and implicit feedback in
    a unified model for rating prediction. This model is a combination of
    extit{matrix factorization} and extit{item embedding}, a similar concept
    with word-embedding in natural language processing. The experiments on three
    real-datasets ( extit{Movilens 1M}, extit{Movielens 20M}, and
    extit{Bookcrossing}) demonstrate that our method can efficiently predict
    ratings for items even if the ratings data is not available for them. The
    experimental results also show that our method outperforms competing methods on
    rating prediction task in general as well as for users and items which have few
    ratings.

    On Identifying Disaster-Related Tweets: Matching-based or Learning-based?

    Hien To, Sumeet Agrawal, Seon Ho Kim, Cyrus Shahabi
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG)

    Social media such as tweets are emerging as platforms contributing to
    situational awareness during disasters. Information shared on Twitter by both
    affected population (e.g., requesting assistance, warning) and those outside
    the impact zone (e.g., providing assistance) would help first responders,
    decision makers, and the public to understand the situation first-hand.
    Effective use of such information requires timely selection and analysis of
    tweets that are relevant to a particular disaster. Even though abundant tweets
    are promising as a data source, it is challenging to automatically identify
    relevant messages since tweet are short and unstructured, resulting to
    unsatisfactory classification performance of conventional learning-based
    approaches. Thus, we propose a simple yet effective algorithm to identify
    relevant messages based on matching keywords and hashtags, and provide a
    comparison between matching-based and learning-based approaches. To evaluate
    the two approaches, we put them into a framework specifically proposed for
    analyzing disaster-related tweets. Analysis results on eleven datasets with
    various disaster types show that our technique provides relevant tweets of
    higher quality and more interpretable results of sentiment analysis tasks when
    compared to learning approach.

    Analysis of Computational Science Papers from ICCS 2001-2016 using Topic Modeling and Graph Theory

    Tesfamariam M. Abuhay, Sergey V. Kovalchuk, Klavdiya O. Bochenina, George Kampis, Valeria V. Krzhizhanovskaya, Michael H. Lees
    Comments: Accepted by International Conference on Computational Science (ICCS) 2017 which will be held in Zurich, Switzerland from June 11-June 14
    Subjects: Digital Libraries (cs.DL); Computation and Language (cs.CL); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)

    This paper presents results of topic modeling and network models of topics
    using the International Conference on Computational Science corpus, which
    contains domain-specific (computational science) papers over sixteen years (a
    total of 5695 papers). We discuss topical structures of International
    Conference on Computational Science, how these topics evolve over time in
    response to the topicality of various problems, technologies and methods, and
    how all these topics relate to one another. This analysis illustrates
    multidisciplinary research and collaborations among scientific communities, by
    constructing static and dynamic networks from the topic modeling results and
    the keywords of authors. The results of this study give insights about the past
    and future trends of core discussion topics in computational science. We used
    the Non-negative Matrix Factorization topic modeling algorithm to discover
    topics and labeled and grouped results hierarchically.


    Computation and Language

    Building Morphological Chains for Agglutinative Languages

    Serkan Ozen, Burcu Can
    Comments: 10 pages, accepted and presented at the CICLing 2017 (18th International Conference on Intelligent Text Processing and Computational Linguistics)
    Subjects: Computation and Language (cs.CL)

    In this paper, we build morphological chains for agglutinative languages by
    using a log-linear model for the morphological segmentation task. The model is
    based on the unsupervised morphological segmentation system called
    MorphoChains. We extend MorphoChains log linear model by expanding the
    candidate space recursively to cover more split points for agglutinative
    languages such as Turkish, whereas in the original model candidates are
    generated by considering only binary segmentation of each word. The results
    show that we improve the state-of-art Turkish scores by 12% having a F-measure
    of 72% and we improve the English scores by 3% having a F-measure of 74%.
    Eventually, the system outperforms both MorphoChains and other well-known
    unsupervised morphological segmentation systems. The results indicate that
    candidate generation plays an important role in such an unsupervised log-linear
    model that is learned using contrastive estimation with negative samples.

    Deep Speaker: an End-to-End Neural Speaker Embedding System

    Chao Li, Xiaokong Ma, Bing Jiang, Xiangang Li, Xuewei Zhang, Xiao Liu, Ying Cao, Ajay Kannan, Zhenyao Zhu
    Subjects: Computation and Language (cs.CL)

    We present Deep Speaker, a neural speaker embedding system that maps
    utterances to a hypersphere where speaker similarity is measured by cosine
    similarity. The embeddings generated by Deep Speaker can be used for many
    tasks, including speaker identification, verification, and clustering. We
    experiment with ResCNN and GRU architectures to extract the acoustic features,
    then mean pool to produce utterance-level speaker embeddings, and train using
    triplet loss based on cosine similarity. Experiments on three distinct datasets
    suggest that Deep Speaker outperforms a DNN-based i-vector baseline. For
    example, Deep Speaker reduces the verification equal error rate by 50%
    (relatively) and improves the identification accuracy by 60% (relatively) on a
    text-independent dataset. We also present results that suggest adapting from a
    model trained with Mandarin can improve accuracy for English speaker
    recognition.

    Sequential Attention

    Sebastian Brarda, Philip Yeres, Samuel R. Bowman
    Comments: 4 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    In this paper we propose a neural network model with a novel Sequential
    Attention layer that extends soft attention by assigning weights to words in an
    input sequence in a way that takes into account not just how well that word
    matches a query, but how well surrounding words match. We evaluate this
    approach on the task of reading comprehension (Who did What and CNN datasets)
    and show that it dramatically improves a strong baseline like the Stanford
    Reader. The resulting model is competitive with the state of the art.

    Joint RNN Model for Argument Component Boundary Detection

    Minglan Li, Yang Gao, Hui Wen, Yang Du, Haijing Liu, Hao Wang
    Comments: 6 pages, 3 figures, submitted to IEEE SMC 2017
    Subjects: Computation and Language (cs.CL)

    Argument Component Boundary Detection (ACBD) is an important sub-task in
    argumentation mining; it aims at identifying the word sequences that constitute
    argument components, and is usually considered as the first sub-task in the
    argumentation mining pipeline. Existing ACBD methods heavily depend on
    task-specific knowledge, and require considerable human efforts on
    feature-engineering. To tackle these problems, in this work, we formulate ACBD
    as a sequence labeling problem and propose a variety of Recurrent Neural
    Network (RNN) based methods, which do not use domain specific or handcrafted
    features beyond the relative position of the sentence in the document. In
    particular, we propose a novel joint RNN model that can predict whether
    sentences are argumentative or not, and use the predicted results to more
    precisely detect the argument component boundaries. We evaluate our techniques
    on two corpora from two different genres; results suggest that our joint RNN
    model obtain the state-of-the-art performance on both datasets.

    Crowdsourcing Argumentation Structures in Chinese Hotel Reviews

    Mengxue Li, Shiqiang Geng, Yang Gao, Haijing Liu, Hao Wang
    Comments: 6 pages,3 figures,This article has been submitted to “The 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC2017)”
    Subjects: Computation and Language (cs.CL)

    Argumentation mining aims at automatically extracting the premises-claim
    discourse structures in natural language texts. There is a great demand for
    argumentation corpora for customer reviews. However, due to the controversial
    nature of the argumentation annotation task, there exist very few large-scale
    argumentation corpora for customer reviews. In this work, we novelly use the
    crowdsourcing technique to collect argumentation annotations in Chinese hotel
    reviews. As the first Chinese argumentation dataset, our corpus includes 4814
    argument component annotations and 411 argument relation annotations, and its
    annotations qualities are comparable to some widely used argumentation corpora
    in other languages.

    Cross-lingual Distillation for Text Classification

    Ruochen Xu, Yiming Yang
    Comments: Accepted at ACL 2017
    Subjects: Computation and Language (cs.CL)

    Cross-lingual text classification(CLTC) is the task of classifying documents
    written in different languages into the same taxonomy of categories. This paper
    presents a novel approach to CLTC that builds on model distillation, which
    adapts and extends a framework originally proposed for model compression. Using
    soft probabilistic predictions for the documents in a label-rich language as
    the (induced) supervisory labels in a parallel corpus of documents, we train
    classifiers successfully for new languages in which labeled training data are
    not available. An adversarial feature adaptation technique is also applied
    during the model training to reduce distribution mismatch. We conducted
    experiments on two benchmark CLTC datasets, treating English as the source
    language and German, French, Japan and Chinese as the unlabeled target
    languages. The proposed approach had the advantageous or comparable performance
    of the other state-of-art methods.

    Senti17 at SemEval-2017 Task 4: Ten Convolutional Neural Network Voters for Tweet Polarity Classification

    Hussam Hamdan
    Subjects: Computation and Language (cs.CL)

    This paper presents Senti17 system which uses ten convolutional neural
    networks (ConvNet) to assign a sentiment label to a tweet. The network consists
    of a convolutional layer followed by a fully-connected layer and a Softmax on
    top. Ten instances of this network are initialized with the same word
    embeddings as inputs but with different initializations for the network
    weights. We combine the results of all instances by selecting the sentiment
    label given by the majority of the ten voters. This system is ranked fourth in
    SemEval-2017 Task4 over 38 systems with 67.4%

    Machine Comprehension by Text-to-Text Neural Question Generation

    Xingdi Yuan, Tong Wang, Caglar Gulcehre, Alessandro Sordoni, Philip Bachman, Sandeep Subramanian, Saizheng Zhang, Adam Trischler
    Subjects: Computation and Language (cs.CL)

    We propose a recurrent neural model that generates natural-language questions
    from documents, conditioned on answers. We show how to train the model using a
    combination of supervised and reinforcement learning. After teacher forcing for
    standard maximum likelihood training, we fine-tune the model using policy
    gradient techniques to maximize several rewards that measure question quality.
    Most notably, one of these rewards is the performance of a question-answering
    system. We motivate question generation as a means to improve the performance
    of question answering systems. Our model is trained and evaluated on the recent
    question-answering dataset SQuAD.

    Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU

    Jacob Devlin
    Subjects: Computation and Language (cs.CL)

    Attentional sequence-to-sequence models have become the new standard for
    machine translation, but one challenge of such models is a significant increase
    in training and decoding cost compared to phrase-based systems. Here, we focus
    on efficient decoding, with a goal of achieving accuracy close the
    state-of-the-art in neural machine translation (NMT), while achieving CPU
    decoding speed/throughput close to that of a phrasal decoder.

    We approach this problem from two angles: First, we describe several
    techniques for speeding up an NMT beam search decoder, which obtain a 4.4x
    speedup over a very efficient baseline decoder without changing the decoder
    output. Second, we propose a simple but powerful network architecture which
    uses an RNN (GRU/LSTM) layer at bottom, followed by a series of stacked
    fully-connected layers applied at every timestep. This architecture achieves
    similar accuracy to a deep recurrent model, at a small fraction of the training
    and decoding cost. By combining these techniques, our best system achieves a
    very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014,
    while decoding at 100 words/sec on single-threaded CPU. We believe this is the
    best published accuracy/speed trade-off of an NMT system.

    ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases

    Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald M. Summers
    Comments: CVPR 2017 spotlight
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)

    The chest X-ray is one of the most commonly accessible radiological
    examinations for screening and diagnosis of many lung diseases. A tremendous
    number of X-ray imaging studies accompanied by radiological reports are
    accumulated and stored in many modern hospitals’ Picture Archiving and
    Communication Systems (PACS). On the other side, it is still an open question
    how this type of hospital-size knowledge database containing invaluable imaging
    informatics (i.e., loosely labeled) can be used to facilitate the data-hungry
    deep learning paradigms in building truly large-scale high precision
    computer-aided diagnosis (CAD) systems.

    In this paper, we present a new chest X-ray database, namely “ChestX-ray8”,
    which comprises 108,948 frontal-view X-ray images of 32,717 unique patients
    with the text-mined eight disease image labels (where each image can have
    multi-labels), from the associated radiological reports using natural language
    processing. Importantly, we demonstrate that these commonly occurring thoracic
    diseases can be detected and even spatially-located via a unified
    weakly-supervised multi-label image classification and disease localization
    framework, which is validated using our proposed dataset. Although the initial
    quantitative results are promising as reported, deep convolutional neural
    network based “reading chest X-rays” (i.e., recognizing and locating the common
    disease patterns trained with only image-level labels) remains a strenuous task
    for fully-automated high precision CAD systems.

    Analysis of Computational Science Papers from ICCS 2001-2016 using Topic Modeling and Graph Theory

    Tesfamariam M. Abuhay, Sergey V. Kovalchuk, Klavdiya O. Bochenina, George Kampis, Valeria V. Krzhizhanovskaya, Michael H. Lees
    Comments: Accepted by International Conference on Computational Science (ICCS) 2017 which will be held in Zurich, Switzerland from June 11-June 14
    Subjects: Digital Libraries (cs.DL); Computation and Language (cs.CL); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)

    This paper presents results of topic modeling and network models of topics
    using the International Conference on Computational Science corpus, which
    contains domain-specific (computational science) papers over sixteen years (a
    total of 5695 papers). We discuss topical structures of International
    Conference on Computational Science, how these topics evolve over time in
    response to the topicality of various problems, technologies and methods, and
    how all these topics relate to one another. This analysis illustrates
    multidisciplinary research and collaborations among scientific communities, by
    constructing static and dynamic networks from the topic modeling results and
    the keywords of authors. The results of this study give insights about the past
    and future trends of core discussion topics in computational science. We used
    the Non-negative Matrix Factorization topic modeling algorithm to discover
    topics and labeled and grouped results hierarchically.


    Distributed, Parallel, and Cluster Computing

    In-place Parallel Super Scalar Samplesort (IPSSSSo)

    Michael Axtmann, Sascha Witt, Daniel Ferizovic, Peter Sanders
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We present a sorting algorithm that works in-place, executes in parallel, is
    cache-optimal, avoids branch-mispredictions, and performs work O(n log n) for
    arbitrary inputs with high probability. We ran extensive experiments and show
    that our algorithm scales linearly in the number of cores on various
    multi-socket machines with 32 cores. On large inputs, we outperform our closest
    in-place competitor by a factor of 2.25 to 2.53 and our closest non-in-place
    competitor by a factor of 1.26 to 1.89. Even sequentially executed, we
    outperform our closest sequential competitor, BlockQuicksort, by a factor of
    1.19 to 1.23 for large inputs.

    A Note on Hardness of Diameter Approximation

    Karl Bringmann, Sebastian Krinninger
    Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)

    We revisit the hardness of approximating the diameter of a network. In the
    CONGEST model, ( ilde Omega (n) ) rounds are necessary to compute the
    diameter [Frischknecht et al. SODA’12]. Abboud et al. DISC 2016 extended this
    result to sparse graphs and, at a more fine-grained level, showed that, for any
    integer ( 1 leq ell leq operatorname{polylog} (n) ), distinguishing between
    networks of diameter ( 4 ell + 2 ) and ( 6 ell + 1 ) requires ( ilde Omega
    (n) ) rounds. We slightly tighten this result by showing that even
    distinguishing between diameter ( 2 ell + 1 ) and ( 3 ell + 1 ) requires (
    ilde Omega (n) ) rounds. The reduction of Abboud et al. is inspired by
    recent conditional lower bounds in the RAM model, where the orthogonal vectors
    problem plays a pivotal role. In our new lower bound, we make the connection to
    orthogonal vectors explicit, leading to a conceptually more streamlined
    exposition. This is suited for teaching both the lower bound in the CONGEST
    model and the conditional lower bound in the RAM model.


    Learning

    A Time-Vertex Signal Processing Framework

    Francesco Grassi, Andreas Loukas, Nathanaël Perraudin, Benjamin Ricaud
    Subjects: Learning (cs.LG)

    An emerging way to deal with high-dimensional non-euclidean data is to assume
    that the underlying structure can be captured by a graph. Recently, ideas have
    begun to emerge related to the analysis of time-varying graph signals. This
    work aims to elevate the notion of joint harmonic analysis to a full-fledged
    framework denoted as Time-Vertex Signal Processing, that links together the
    time-domain signal processing techniques with the new tools of graph signal
    processing. This entails three main contributions: (a) We provide a formal
    motivation for harmonic time-vertex analysis as an analysis tool for the state
    evolution of simple Partial Differential Equations on graphs. (b) We improve
    the accuracy of joint filtering operators by up-to two orders of magnitude. (c)
    Using our joint filters, we construct time-vertex dictionaries analyzing the
    different scales and the local time-frequency content of a signal. The utility
    of our tools is illustrated in numerous applications and datasets, such as
    dynamic mesh denoising and classification, still-video inpainting, and source
    localization in seismic events. Our results suggest that joint analysis of
    time-vertex signals can bring benefits to regression and learning.

    Analysis and Design of Convolutional Networks via Hierarchical Tensor Decompositions

    Nadav Cohen, Or Sharir, Ronen Tamari, David Yakira, Yoav Levine, Amnon Shashua
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    The driving force behind convolutional networks – the most successful deep
    learning architecture to date, is their expressive power. Despite its wide
    acceptance and vast empirical evidence, formal analyses supporting this belief
    are scarce. The primary notions for formally reasoning about expressiveness are
    efficiency and inductive bias. Efficiency refers to the ability of a network
    architecture to realize functions that require an alternative architecture to
    be much larger. Inductive bias refers to the prioritization of some functions
    over others given prior knowledge regarding a task at hand. In this paper we
    provide a high-level overview of a series of works written by the authors, that
    through an equivalence to hierarchical tensor decompositions, analyze the
    expressive efficiency and inductive bias of various architectural features in
    convolutional networks (depth, width, convolution strides and more). The
    results presented shed light on the demonstrated effectiveness of convolutional
    networks, and in addition, provide new tools for network design.

    Spherical Wards clustering and generalized Voronoi diagrams

    Marek Śmieja, Jacek Tabor
    Subjects: Learning (cs.LG)

    Gaussian mixture model is very useful in many practical problems.
    Nevertheless, it cannot be directly generalized to non Euclidean spaces. To
    overcome this problem we present a spherical Gaussian-based clustering approach
    for partitioning data sets with respect to arbitrary dissimilarity measure. The
    proposed method is a combination of spherical Cross-Entropy Clustering with a
    generalized Wards approach. The algorithm finds the optimal number of clusters
    by automatically removing groups which carry no information. Moreover, it is
    scale invariant and allows for forming of spherically-shaped clusters of
    arbitrary sizes. In order to graphically represent and interpret the results
    the notion of Voronoi diagram was generalized to non Euclidean spaces and
    applied for introduced clustering method.

    Detecting Adversarial Samples Using Density Ratio Estimates

    Lovedeep Gondara
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Machine learning models, especially based on deep learning are used in
    everyday applications ranging from self driving cars to medical diagnostics.
    However, it is easy to trick such models using adversarial samples,
    indistinguishable from real samples to human eye, such samples can bias models
    towards incorrect classification. Impact of adversarial samples is far-reaching
    and efficient detection of adversarial samples remains an open problem. In this
    paper we propose to use direct density ratio estimation to detect adversarial
    samples, we empirically show that adversarial samples have different underlying
    probability densities compared to real samples. Our proposed method works well
    with colored and grayscale images, and with different adversarial sample
    generation methods.

    Sequential Attention

    Sebastian Brarda, Philip Yeres, Samuel R. Bowman
    Comments: 4 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    In this paper we propose a neural network model with a novel Sequential
    Attention layer that extends soft attention by assigning weights to words in an
    input sequence in a way that takes into account not just how well that word
    matches a query, but how well surrounding words match. We evaluate this
    approach on the task of reading comprehension (Who did What and CNN datasets)
    and show that it dramatically improves a strong baseline like the Stanford
    Reader. The resulting model is competitive with the state of the art.

    Data Readiness Levels

    Neil D. Lawrence
    Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Learning (cs.LG)

    Application of models to data is fraught. Data-generating collaborators often
    only have a very basic understanding of the complications of collating,
    processing and curating data. Challenges include: poor data collection
    practices, missing values, inconvenient storage mechanisms, intellectual
    property, security and privacy. All these aspects obstruct the sharing and
    interconnection of data, and the eventual interpretation of data through
    machine learning or other approaches. In project reporting, a major challenge
    is in encapsulating these problems and enabling goals to be built around the
    processing of data. Project overruns can occur due to failure to account for
    the amount of time required to curate and collate. But to understand these
    failures we need to have a common language for assessing the readiness of a
    particular data set. This position paper proposes the use of data readiness
    levels: it gives a rough outline of three stages of data preparedness and
    speculates on how formalisation of these levels into a common language for data
    readiness could facilitate project management.

    Group invariance principles for causal generative models

    Michel Besserve, Naji Shajarisales, Bernhard Schölkopf, Dominik Janzing
    Comments: 16 pages, 6 figures
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Statistics Theory (math.ST)

    The postulate of independence of cause and mechanism (ICM) has recently led
    to several new causal discovery algorithms. The interpretation of independence
    and the way it is utilized, however, varies across these methods. Our aim in
    this paper is to propose a group theoretic framework for ICM to unify and
    generalize these approaches. In our setting, the cause-mechanism relationship
    is assessed by comparing it against a null hypothesis through the application
    of random generic group transformations. We show that the group theoretic view
    provides a very general tool to study the structure of data generating
    mechanisms with direct applications to machine learning.

    SLDR-DL: A Framework for SLD-Resolution with Deep Learning

    Cheng-Hao Cai
    Comments: 12 pages, 5 figures
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Logic in Computer Science (cs.LO)

    This paper introduces an SLD-resolution technique based on deep learning.
    This technique enables neural networks to learn from old and successful
    resolution processes and to use learnt experiences to guide new resolution
    processes. An implementation of this technique is named SLDR-DL. It includes a
    Prolog library of deep feedforward neural networks and some essential functions
    of resolution. In the SLDR-DL framework, users can define logical rules in the
    form of definite clauses and teach neural networks to use the rules in
    reasoning processes.

    Matrix Factorization with Side and Higher Order Information

    Vatsal Shah, Nikhil Rao, Weicong Ding
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    The problem of predicting unobserved entries of a partially observed matrix
    has found wide applicability in several areas, such as recommender systems,
    computational biology, and computer vision. Many scalable methods with rigorous
    theoretical guarantees have been developed for algorithms where the matrix is
    factored into low-rank components, and embeddings are learned for the row and
    column variables. While there has been recent research on incorporating
    explicit side information in the low-rank matrix factorization setting, often
    implicit information can be gleaned from the data, via higher order
    interactions among variables. In this paper, we design a method to make use of
    this implicit information, via random walks on graphs. We show that the problem
    we intend to solve can be cast as factoring a nonlinear transform of the
    (partially) observed matrix and develop an efficient coordinate descent based
    algorithm for the same. Experiments on several datasets show that the method we
    propose outperforms vanilla matrix factorization, and also those methods that
    use explicitly available side information.

    KATE: K-Competitive Autoencoder for Text

    Yu Chen, Mohammed J. Zaki
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Autoencoders have been successful in learning meaningful representations from
    image datasets. However, their performance on text datasets has not been widely
    studied. Traditional autoencoders tend to learn possibly trivial
    representations of text documents due to their confounding properties such as
    high-dimensionality, sparsity and power-law word distributions. In this paper,
    we propose a novel k-competitive autoencoder, called KATE, for text documents.
    Due to the competition between the neurons in the hidden layer, each neuron
    becomes specialized in recognizing specific data patterns, and overall the
    model can learn meaningful representations of textual data. A comprehensive set
    of experiments show that KATE can learn better representations than traditional
    autoencoders including denoising, contractive, variational, and k-sparse
    autoencoders. Our model also outperforms deep generative models, probabilistic
    topic models, and even word representation models (e.g., Word2Vec) in terms of
    several downstream tasks such as document classification, regression, and
    retrieval.

    On Identifying Disaster-Related Tweets: Matching-based or Learning-based?

    Hien To, Sumeet Agrawal, Seon Ho Kim, Cyrus Shahabi
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG)

    Social media such as tweets are emerging as platforms contributing to
    situational awareness during disasters. Information shared on Twitter by both
    affected population (e.g., requesting assistance, warning) and those outside
    the impact zone (e.g., providing assistance) would help first responders,
    decision makers, and the public to understand the situation first-hand.
    Effective use of such information requires timely selection and analysis of
    tweets that are relevant to a particular disaster. Even though abundant tweets
    are promising as a data source, it is challenging to automatically identify
    relevant messages since tweet are short and unstructured, resulting to
    unsatisfactory classification performance of conventional learning-based
    approaches. Thus, we propose a simple yet effective algorithm to identify
    relevant messages based on matching keywords and hashtags, and provide a
    comparison between matching-based and learning-based approaches. To evaluate
    the two approaches, we put them into a framework specifically proposed for
    analyzing disaster-related tweets. Analysis results on eleven datasets with
    various disaster types show that our technique provides relevant tweets of
    higher quality and more interpretable results of sentiment analysis tasks when
    compared to learning approach.


    Information Theory

    Fundamental Limits of Covert Communication over MIMO AWGN Channel

    Amr Abdelaziz, C. Emre Koksal
    Comments: Submitted to IEEE CNS 2017
    Subjects: Information Theory (cs.IT)

    Fundamental limits of covert communication have been studied in literature
    for different models of scalar channels. It was shown that, over (n)
    independent channel uses, (mathcal{O}(sqrt{n})) bits can transmitted reliably
    over a public channel while achieving an arbitrarily low probability of
    detection (LPD) by other stations. This result is well known as square-root law
    and even to achieve this diminishing rate of covert communication, some form of
    shared secret is needed between the transmitter and the receiver. In this
    paper, we establish the limits of LPD communication over the MIMO AWGN channel.
    We define the notion of (epsilon)-probability of detection ((epsilon)-PD) and
    provide a formulation to evaluate the maximum achievable rate under the
    (epsilon)-PD constraint. Then, we show that the optimal input distribution is
    zero mean Gaussian distribution. Assuming channel state information (CSI) about
    the main channel only at the transmitter, we derive the optimal input
    covariance matrix, hence, establishing the (epsilon)-PD capacity. We evaluate
    (epsilon)-PD rates in the limiting regimes for the number of channel uses
    (asymptotic block length) and the number of antennas (massive MIMO). We show
    that, while the SRL still holds for the MIMO AWGN, the number of bits that can
    be transmitted covertly scales exponentially with the number of transmitting
    antennas. We establish the condition on the number of transmitting antennas
    required to achieve a non-diminishing rate with LPD. The practical implication
    of our result is that, MIMO has the potential to provide a substantial increase
    in the file sizes that can be covertly communicated subject to a reasonably low
    delay.

    Consistent Sensor, Relay, and Link Selection in Wireless Sensor Networks

    Rocio Arroyo-Valles, Andrea Simonetto, Geert Leus
    Comments: 27 pages, 17 figures
    Subjects: Information Theory (cs.IT)

    In wireless sensor networks, where energy is scarce, it is inefficient to
    have all nodes active because they consume a non-negligible amount of battery.
    In this paper we consider the problem of jointly selecting sensors, relays and
    links in a wireless sensor network where the active sensors need to communicate
    their measurements to one or multiple access points. Information messages are
    routed stochastically in order to capture the inherent reliability of the
    broadcast links via multiple hops, where the nodes may be acting as sensors or
    as relays. We aim at finding optimal sparse solutions where both, the
    consistency between the selected subset of sensors, relays and links, and the
    graph connectivity in the selected subnetwork are guaranteed. Furthermore,
    active nodes should ensure a network performance in a parameter estimation
    scenario. Two problems are studied: sensor and link selection; and sensor,
    relay and link selection. To solve such problems, we present tractable
    optimization formulations and propose two algorithms that satisfy the previous
    network requirements. We also explore an extension scenario: only link
    selection. Simulation results show the performance of the algorithms and
    illustrate how they provide a sparse solution, which not only saves energy but
    also guarantees the network requirements.

    Probabilistically-Shaped Coded Modulation with Hard Decision Decoding for Coherent Optical Systems

    Alireza Sheikh, Alexandre Graell i Amat, Gianluigi Liva
    Subjects: Information Theory (cs.IT)

    We consider probabilistic shaping to maximize the achievable information rate
    of coded modulation (CM) with hard decision decoding. The proposed scheme using
    binary staircase codes outperforms its uniform CM counterpart by more than 1.3
    dB for 64-QAM and 5 bits/symbol.

    Distributed Task Encoding

    Annina Bracher, Amos Lapidoth, Christoph Pfister
    Comments: 5 pages; accepted at ISIT 2017
    Subjects: Information Theory (cs.IT)

    The rate region of the task-encoding problem for two correlated sources is
    characterized using a novel parametric family of dependence measures. The
    converse uses a new expression for the (
    ho)-th moment of the list size, which
    is derived using the relative (alpha)-entropy.

    Power Allocation and Cooperative Diversity in Two-Way Non-Regenerative Cognitive Radio Networks

    Saeed Vahidian, Maryam Najafi, Marzieh Najafi, Fawaz S. Al-Qahtani
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    In this paper, we investigate the performance of a dual-hop block fading
    cognitive radio network with underlay spectrum sharing over independent but not
    necessarily identically distributed (i.n.i.d.) Nakagami-(m) fading channels.
    The primary network consists of a source and a destination. Depending on
    whether the secondary network which consists of two source nodes have a single
    relay for cooperation or multiple relays thereby employs opportunistic relay
    selection for cooperation and whether the two source nodes suffer from the
    primary users’ (PU) interference, two cases are considered in this paper, which
    are referred to as Scenario (a) and Scenario (b), respectively. For the
    considered underlay spectrum sharing, the transmit power constraint of the
    proposed system is adjusted by interference limit on the primary network and
    the interference imposed by primary user (PU). The developed new analysis
    obtains new analytical results for the outage capacity (OC) and average symbol
    error probability (ASEP). In particular, for Scenario (a), tight lower bounds
    on the OC and ASEP of the secondary network are derived in closed-form. In
    addition, a closed from expression for the end-to-end OC of Scenario (a) is
    achieved. With regards to Scenario (b), a tight lower bound on the OC of the
    secondary network is derived in closed-form. All analytical results are
    corroborated using Monte Carlo simulation method.

    Resource Allocation for Secure Full-Duplex OFDMA Radio Systems

    Yan Sun, Derrick Wing Kwan Ng, Robert Schober
    Comments: Invited presentation at SPAWC 2017, Japan
    Subjects: Information Theory (cs.IT)

    In this paper, we study the resource allocation for an orthogonal frequency
    division multiple access (OFDMA) radio system employing a full-duplex base
    station for serving multiple half-duplex downlink and uplink users
    simultaneously. The resource allocation design objective is the maximization of
    the weighted system throughput while limiting the information leakage to
    guarantee secure simultaneous downlink and uplink transmission in the presence
    of potential eavesdroppers. The algorithm design leads to a mixed combinatorial
    non-convex optimization problem and obtaining the globally optimal solution
    entails a prohibitively high computational complexity. Therefore, an efficient
    successive convex approximation based suboptimal iterative algorithm is
    proposed. Our simulation results confirm that the proposed suboptimal algorithm
    achieves a significant performance gain compared to two baseline schemes.

    D2D User Selection For Simultaneous Spectrum Sharing And Energy Harvesting

    Mansi Peer, Vivek Ashok Bohara, Neha Jain
    Subjects: Information Theory (cs.IT)

    This paper presents a device-to-device (D2D) user selection protocol wherein
    multiple D2D pairs coexist with a cellular network. In the developed framework,
    certain D2D users harvest energy and share the spectrum of the cellular users
    by adopting a hybrid time switching and power splitting protocol. The D2D user
    which harvests the maximum energy and achieves the desired target rate for the
    cellular communication is selected to serve as a decode-and-forward (DF) relay
    for the cellular user. The proposed work analyzes the impact of increase in the
    number of D2D users on the performance of cellular user as well as derives an
    upper bound on the time duration of energy harvesting within which best
    possible rate for cellular user can be obtained. The performance of the
    proposed protocol has been quantified by obtaining the closed form expressions
    of outage probability.

    Blind Detection of Polar Codes

    Pascal Giard, Alexios Balatsoukas-Stimming, Andreas Burg
    Comments: 6 pages, 8 figures, submitted to IEEE Int. Workshop on Signal Process. Syst. (SiPS) 2017
    Subjects: Information Theory (cs.IT)

    Polar codes were recently chosen to protect the control channel information
    in the next-generation mobile communication standard (5G) defined by the 3GPP.
    As a result, receivers will have to implement blind decoding of polar coded
    frames in order to keep complexity, latency, and power consumption tractable.
    As a newly proposed class of block codes, to our knowledge, the blind detection
    of polar codes has not been addressed yet. In this work, we propose a
    low-complexity blind-detection algorithm for polar-encoded frames. We base this
    algorithm on a novel detection metric with update rules that leverage the a
    priori knowledge of the frozen-bit locations, exploiting the inherent
    structures that these locations impose on a polar-encoded block of data. We
    show that the proposed detection metric allows to clearly distinguish
    polar-encoded frames from other types of data by investigating the cumulative
    distribution functions of the detection metric, and the receiver operating
    characteristic. The results presented are tailored to the 5G standardization
    effort discussions, i.e., for a short low-rate polar code concatenated with a
    CRC.

    Optimizing the Finite Length Performance of Sparse Superposition Codes

    Adam Greig, Ramji Venkataramanan
    Comments: 21 pages, 14 figures
    Subjects: Information Theory (cs.IT)

    Sparse superposition codes are a recent class of codes introduced by Barron
    and Joseph for efficient communication over the AWGN channel. With an
    appropriate power allocation, these codes have been shown to be asymptotically
    capacity-achieving with computationally feasible decoding. However, a direct
    implementation of the capacity-achieving construction does not give good finite
    length error performance, and has unnecessarily high computational complexity.
    In this paper, we consider sparse superposition codes with approximate message
    passing (AMP) decoding, and describe a variety of techniques to improve their
    finite length performance and decrease computational complexity. These include
    using implicit Walsh-Hadamard design matrices, optimizing SPARC power
    allocations and codebook parameters, and estimating a critical decoding
    parameter online instead of pre-computation. Finally we compare the error
    performance of AMP-decoded sparse superposition codes with coded modulation
    using LDPC codes from the WiMAX standard.

    Optimal Power Control and Scheduling under Hard Deadline Constraints for Continuous Fading Channels

    Ahmed Ewaisha, Cihan Tepedelenlioglu
    Comments: Submitted to Asilomar 2017. arXiv admin note: text overlap with arXiv:1612.08326
    Subjects: Information Theory (cs.IT)

    We consider a joint scheduling-and-power-allocation problem of a downlink
    cellular system. The system consists of two groups of users: real-time (RT) and
    non-real-time (NRT) users. Given an average power constraint on the base
    station, the problem is to find an algorithm that satisfies the RT hard
    deadline constraint and NRT queue stability constraint. We propose a
    sum-rate-maximizing algorithm that satisfies these constraints. We also show,
    through simulations, that the proposed algorithm has an average complexity that
    is close-to-linear in the number of RT users. The power allocation policy in
    the proposed algorithm has a closed-form expression for the two groups of
    users. However, interestingly, the power policy of the RT users differ in
    structure from that of the NRT users. We also show the superiority of the
    proposed algorithms over existing approaches using extensive simulations.

    Efficiently decodable codes for the binary deletion channel

    Venkatesan Guruswami, Ray Li
    Comments: arXiv admin note: substantial text overlap with arXiv:1612.06335
    Subjects: Information Theory (cs.IT); Data Structures and Algorithms (cs.DS)

    In the random deletion channel, each bit is deleted independently with
    probability (p). For the random deletion channel, the existence of codes of
    rate ((1-p)/9), and thus bounded away from (0) for any (p < 1), has been known.
    We give an explicit construction with polynomial time encoding and deletion
    correction algorithms with rate (c_0 (1-p)) for an absolute constant (c_0 > 0).

    Structured sampling and fast reconstruction of smooth graph signals

    Gilles Puy, Patrick Pérez
    Subjects: Social and Information Networks (cs.SI); Information Theory (cs.IT)

    This work concerns sampling of smooth signals on arbitrary graphs. We first
    study a structured sampling strategy for such smooth graph signals that
    consists of a random selection of few pre-defined groups of nodes. The number
    of groups to sample to stably embed the set of (k)-bandlimited signals is
    driven by a quantity called the emph{group} graph cumulative coherence. For
    some optimised sampling distributions, we show that sampling (O(klog(k)))
    groups is always sufficient to stably embed the set of (k)-bandlimited signals
    but that this number can be smaller — down to (O(log(k))) — depending on the
    structure of the groups of nodes. Fast methods to approximate these sampling
    distributions are detailed. Second, we consider (k)-bandlimited signals that
    are nearly piecewise constant over pre-defined groups of nodes. We show that it
    is possible to speed up the reconstruction of such signals by reducing
    drastically the dimension of the vectors to reconstruct. When combined with the
    proposed structured sampling procedure, we prove that the method provides
    stable and accurate reconstruction of the original signal. Finally, we present
    numerical experiments that illustrate our theoretical results and, as an
    example, show how to combine these methods for interactive object segmentation
    in an image using superpixels.

    Optimal Approximation with Sparsely Connected Deep Neural Networks

    Helmut Bölcskei, Philipp Grohs, Gitta Kutyniok, Philipp Petersen
    Subjects: Learning (cs.LG); Information Theory (cs.IT); Functional Analysis (math.FA)

    We derive fundamental lower bounds on the connectivity and the memory
    requirements of deep neural networks guaranteeing uniform approximation rates
    for arbitrary function classes in (L^2(mathbb{R}^d)). In other words, we
    establish a connection between the complexity of a function class and the
    complexity of deep neural networks approximating functions from this class to
    within a prescribed accuracy.

    Additionally, we prove that our lower bounds are achievable for a broad
    family of function classes. Specifically, all function classes that are
    optimally approximated by a general class of representation systems—so-called
    emph{affine systems}—can be approximated by deep neural networks with
    minimal connectivity and memory requirements. Affine systems encompass a wealth
    of representation systems from applied harmonic analysis such as wavelets,
    ridgelets, curvelets, shearlets, (alpha)-shearlets, and more generally
    (alpha)-molecules. This result elucidates a remarkable universality property
    of neural networks and shows that they achieve the optimum approximation
    properties of all affine systems combined. As a specific example, we consider
    the class of (1/alpha)-cartoon-like functions, which is approximated optimally
    by (alpha)-shearlets.

    We also explain how our results can be extended to the case of functions on
    low-dimensional immersed manifolds.

    Finally, we present numerical experiments demonstrating that the standard
    stochastic gradient descent algorithm generates deep neural networks providing
    close-to-optimal approximation rates at minimal connectivity. Moreover, these
    results show that stochastic gradient descent actually learns approximations
    that are sparse in the representation systems optimally sparsifying the
    function class the network is trained on.




沪ICP备19023445号-2号
友情链接