
    arXiv Paper Daily: Mon, 20 Mar 2017

    Posted by 我爱机器学习 (52ml.net) on 2017-03-20 00:00:00

    Neural and Evolutionary Computing

    Implicit Gradient Neural Networks with a Positive-Definite Mass Matrix for Online Linear Equations Solving

    Ke Chen
    Comments: Submitted to Information Processing Letters
    Subjects: Neural and Evolutionary Computing (cs.NE); Systems and Control (cs.SY)

    Motivated by the advantages achieved by implicit analogue nets for solving
    online linear equations, a novel implicit neural model is designed based on
    conventional explicit gradient neural networks in this letter by introducing a
    positive-definite mass matrix. In addition to retaining the advantages of the
    implicit neural dynamics, the proposed implicit gradient neural networks can
    still achieve globally exponential convergence to the unique theoretical
    solution of linear equations and also global stability even under no-solution
    and multi-solution situations. Simulative results verify theoretical
    convergence analysis on the proposed neural dynamics.
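
    The abstract does not state the model's equations. As a rough sketch only, the
    snippet below assumes dynamics of the form M dx/dt = -gamma A^T (A x - b) with a
    diagonal positive-definite mass matrix M, integrated with forward Euler. The
    names `solve_online`, `mass_diag`, `gamma`, and `dt` are illustrative and do not
    come from the letter.

```python
# Sketch under an assumption: M * dx/dt = -gamma * A^T (A x - b),
# with M positive definite (here diagonal). Not the letter's exact model.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def solve_online(a, b, mass_diag, gamma=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of the assumed gradient dynamics."""
    x = [0.0] * len(a[0])
    a_t = transpose(a)
    for _ in range(steps):
        residual = [r - bi for r, bi in zip(matvec(a, x), b)]
        grad = matvec(a_t, residual)
        # A diagonal M makes applying M^{-1} a per-component division.
        x = [xi - dt * gamma * g / m for xi, g, m in zip(x, grad, mass_diag)]
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
x = solve_online(A, b, mass_diag=[1.0, 2.0])  # approaches the solution of A x = b
```

    For this small, well-conditioned system the iterate settles on the unique
    solution; the step size `dt` must stay small relative to the largest
    eigenvalue of M^{-1} A^T A for the Euler scheme to remain stable.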

    Reservoir Computing and Extreme Learning Machines using Pairs of Cellular Automata Rules

    Nathan McDonald
    Comments: accepted to International Joint Conference on Neural Networks (IJCNN 2017)
    Subjects: Neural and Evolutionary Computing (cs.NE)

    A framework for implementing reservoir computing (RC) and extreme learning
    machines (ELMs), two types of artificial neural networks, based on 1D
    elementary Cellular Automata (CA) is presented, in which two separate CA rules
    explicitly implement the minimum computational requirements of the reservoir
    layer: hyperdimensional projection and short-term memory. CAs are cell-based
    state machines, which evolve in time in accordance with local rules based on a
    cell's current state and those of its neighbors. Notably, simple single cell
    shift rules as the memory rule in a fixed edge CA afforded reasonable success
    in conjunction with a variety of projection rules, potentially significantly
    reducing the optimal solution search space. Optimal iteration counts for the CA
    rule pairs can be estimated for some tasks based upon the category of the
    projection rule. Initial results support future hardware realization, where CAs
    potentially afford orders of magnitude reduction in size, weight, and power
    (SWaP) requirements compared with floating point RC implementations.
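
    To make the two roles concrete, here is a minimal sketch of one update of a 1D
    elementary CA with fixed (zero) edges, using Wolfram rule numbering. Rule 90 and
    rule 240 are picked purely for illustration as a projection-like rule and a
    single cell shift rule; the abstract does not say which rule pairs the paper uses.

```python
def ca_step(cells, rule):
    """One update of a 1D elementary CA with fixed (zero) edge cells.

    `rule` is a Wolfram rule number (0-255); bit k of the rule gives the new
    state for the neighbourhood whose (left, centre, right) bits encode k.
    """
    padded = [0] + list(cells) + [0]
    out = []
    for i in range(1, len(padded) - 1):
        idx = 4 * padded[i - 1] + 2 * padded[i] + padded[i + 1]
        out.append((rule >> idx) & 1)
    return out

state = [0, 0, 1, 0, 0]
projected = ca_step(state, 90)   # rule 90: XOR of the two neighbours
shifted = ca_step(state, 240)    # rule 240: copy the left neighbour (shift right)
```

    With a fixed edge, the shift rule pushes bits off the end of the lattice after
    enough iterations, which is one way to picture a short-term memory of bounded
    length.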

    Pattern representation and recognition with accelerated analog neuromorphic systems

    Mihai A. Petrovici, Sebastian Schmitt, Johann Klähn, David Stöckel, Anna Schroeder, Guillaume Bellec, Johannes Bill, Oliver Breitwieser, Ilja Bytschok, Andreas Grübl, Maurice Güttler, Andreas Hartel, Stephan Hartmann, Dan Husmann, Kai Husmann, Sebastian Jeltsch, Vitali Karasenko, Mitja Kleider, Christoph Koke, Alexander Kononov, Christian Mauch, Paul Müller, Johannes Partzsch, Thomas Pfeil, Stefan Schiefer, Stefan Scholze, Anand Subramoney, Vasilis Thanasoulis, Bernhard Vogginger, Robert Legenstein, Wolfgang Maass, René Schüffny, Christian Mayr, Johannes Schemmel, Karlheinz Meier
    Comments: accepted at ISCAS 2017
    Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Despite being originally inspired by the central nervous system, artificial
    neural networks have diverged from their biological archetypes as they have
    been remodeled to fit particular tasks. In this paper, we review several
    possibilities to reverse map these architectures to biologically more realistic
    spiking networks with the aim of emulating them on fast, low-power neuromorphic
    hardware. Since many of these devices employ analog components, which cannot be
    perfectly controlled, finding ways to compensate for the resulting effects
    represents a key challenge. Here, we discuss three different strategies to
    address this problem: the addition of auxiliary network components for
    stabilizing activity, the utilization of inherently robust architectures and a
    training method for hardware-emulated networks that functions without perfect
    knowledge of the system’s dynamics and parameters. For all three scenarios, we
    corroborate our theoretical considerations with experimental results on
    accelerated analog neuromorphic platforms.


    Computer Vision and Pattern Recognition

    PSF field learning based on Optimal Transport Distances

    F. M. Ngolè Mboula, J.-L. Starck
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Instrumentation and Methods for Astrophysics (astro-ph.IM)

    Context: in astronomy, observing large fractions of the sky within a
    reasonable amount of time implies using large field-of-view (fov) optical
    instruments that typically have a spatially varying Point Spread Function
    (PSF). Depending on the scientific goals, galaxy images need to be corrected
    for the PSF whereas no direct measurement of the PSF is available. Aims: given
    a set of PSFs observed at random locations, we want to estimate the PSFs at
    galaxy locations for shape measurement correction. Contributions: we
    propose an interpolation framework based on Sliced Optimal Transport. A
    non-linear dimension reduction is first performed based on local pairwise
    approximated Wasserstein distances. A low dimensional representation of the
    unknown PSFs is then estimated, which in turn is used to derive representations
    of those PSFs in the Wasserstein metric. Finally, the interpolated PSFs are
    calculated as approximated Wasserstein barycenters. Results: the proposed
    method was tested on simulated monochromatic PSFs of the Euclid space mission
    telescope (to be launched in 2020). It achieves a remarkable accuracy in terms
    of pixel values and shape compared to standard methods such as Inverse
    Distance Weighting or Radial Basis Function based interpolation methods.
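
    For reference, the Inverse Distance Weighting baseline named above can be
    sketched in a few lines. This is the textbook scalar form, assuming one number
    per observed position (for example a single PSF pixel or shape parameter); how
    the paper applies it to whole PSF images is not described in the abstract, and
    the function name and `power` parameter are illustrative.

```python
import math

def idw(points, values, query, power=2.0):
    """Inverse Distance Weighting of scalar `values` observed at 2D `points`."""
    num = 0.0
    den = 0.0
    for (px, py), v in zip(points, values):
        d = math.hypot(query[0] - px, query[1] - py)
        if d == 0.0:
            return v  # query coincides with an observation
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

# Two observations; the midpoint gets the plain average.
estimate = idw([(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0], (1.0, 0.0))
```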

    Towards Diverse and Natural Image Descriptions via a Conditional GAN

    Bo Dai, Dahua Lin, Raquel Urtasun, Sanja Fidler
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Despite the substantial progress in recent years, the image captioning
    techniques are still far from being perfect. Sentences produced by existing
    methods, e.g. those based on RNNs, are often overly rigid and lacking in
    variability. This issue is related to a learning principle widely used in
    practice, that is, to maximize the likelihood of training samples. This
    principle encourages high resemblance to the “ground-truth” captions while
    suppressing other reasonable descriptions. Conventional evaluation metrics,
    e.g. BLEU and METEOR, also favor such restrictive methods. In this paper, we
    explore an alternative approach, with the aim to improve the naturalness and
    diversity — two essential properties of human expression. Specifically, we
    propose a new framework based on Conditional Generative Adversarial Networks
    (CGAN), which jointly learns a generator to produce descriptions conditioned on
    images and an evaluator to assess how well a description fits the visual
    content. It is noteworthy that training a sequence generator is nontrivial. We
    overcome the difficulty by Policy Gradient, a strategy stemming from
    Reinforcement Learning, which allows the generator to receive early feedback
    along the way. We tested our method on two large datasets, where it performed
    competitively against real people in our user study and outperformed other
    methods on various tasks.

    Color Orchestra: Ordering Color Palettes for Interpolation and Prediction

    Huy Q. Phan, Hongbo Fu, Antoni B. Chan
    Comments: IEEE Transactions on Visualization and Computer Graphics
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    A color theme or color palette can deeply influence the quality and the feeling
    of a photograph or a graphical design. Although color palettes may come from
    different sources such as online crowd-sourcing, photographs and graphical
    designs, in this paper, we consider color palettes extracted from fine art
    collections, which we believe to be an abundant source of stylistic and unique
    color themes. We aim to capture color styles embedded in these collections by
    means of statistical models and to build practical applications upon these
    models. As artists often use their personal color themes in their paintings,
    making these palettes appear frequently in the dataset, we employed density
    estimation to capture the characteristics of palette data. Via density
    estimation, we carried out various predictions and interpolations on palettes,
    which led to promising applications such as photo-style exploration, real-time
    color suggestion, and enriched photo recolorization. It was, however,
    challenging to apply density estimation to palette data as palettes often come
    as unordered sets of colors, which makes it difficult to use conventional
    metrics on them. To this end, we developed a divide-and-conquer sorting
    algorithm to rearrange the colors in the palettes in a coherent order, which
    allows meaningful interpolation between color palettes. To confirm the
    performance of our model, we also conducted quantitative experiments on
    datasets of digitized paintings collected from the Internet and received
    favorable results.

    Auxiliary Manifold Embedding for Fully Convolutional Networks

    Christoph Baur, Shadi Albarqouni, Nassir Navab
    Comments: 9 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep learning usually requires large amounts of labeled training data, but
    annotating data is costly and tedious. The framework of semi-supervised
    learning provides the means to use both labeled data and arbitrary amounts of
    unlabeled data for training. Recently, semi-supervised deep learning has been
    intensively studied for standard CNN architectures. However, Fully
    Convolutional Networks (FCNs) set the state-of-the-art for many image
    segmentation tasks. To the best of our knowledge, there is no existing
    semi-supervised learning method for such FCNs yet. We lift the concept of
    auxiliary manifold embedding for semi-supervised learning to FCNs with the help
    of Random Feature Embedding. In our experiments on the challenging task of MS
    Lesion Segmentation, we leverage the proposed framework for the purpose of
    domain adaptation and report substantial improvements over the baseline model.

    Comparison of Different Methods for Tissue Segmentation in Histopathological Whole-Slide Images

    Péter Bándi, Rob van de Loo, Milad Intezar, Daan Geijs, Francesco Ciompi, Bram van Ginneken, Jeroen van der Laak, Geert Litjens
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Tissue segmentation is an important pre-requisite for efficient and accurate
    diagnostics in digital pathology. However, it is well known that whole-slide
    scanners can fail in detecting all tissue regions, for example due to the
    tissue type, or due to weak staining because their tissue detection algorithms
    are not robust enough. In this paper, we introduce two different convolutional
    neural network architectures for whole slide image segmentation to accurately
    identify the tissue sections. We also compare the algorithms to a published
    traditional method. We collected 54 whole slide images with differing stains
    and tissue types from three laboratories to validate our algorithms. We show
    that while the two methods do not differ significantly they outperform their
    traditional counterpart (Jaccard index of 0.937 and 0.929 vs. 0.870, p < 0.01).
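
    The Jaccard index used for the comparison is the intersection over union of the
    predicted and reference tissue masks. A minimal sketch on flattened binary
    masks follows; the toy masks are made up for illustration and are not data from
    the paper.

```python
def jaccard(mask_a, mask_b):
    """Jaccard index (intersection over union) of two equal-length binary masks."""
    inter = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0

predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
score = jaccard(predicted, reference)  # 2 shared pixels out of 4 in the union
```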

    Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery

    Thomas Schlegl, Philipp Seeböck, Sebastian M. Waldstein, Ursula Schmidt-Erfurth, Georg Langs
    Comments: To be published in the proceedings of the international conference on Information Processing in Medical Imaging (IPMI), 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Obtaining models that capture imaging markers relevant for disease
    progression and treatment monitoring is challenging. Models are typically based
    on large amounts of data with annotated examples of known markers aiming at
    automating detection. High annotation effort and the limitation to a vocabulary
    of known markers limit the power of such approaches. Here, we perform
    unsupervised learning to identify anomalies in imaging data as candidates for
    markers. We propose AnoGAN, a deep convolutional generative adversarial network
    to learn a manifold of normal anatomical variability, accompanying a novel
    anomaly scoring scheme based on the mapping from image space to a latent space.
    Applied to new data, the model labels anomalies, and scores image patches
    indicating their fit into the learned distribution. Results on optical
    coherence tomography images of the retina demonstrate that the approach
    correctly identifies anomalous images, such as images containing retinal fluid
    or hyperreflective foci.

    Computer Aided Detection of Anemia-like Pallor

    Sohini Roychowdhury, Donny Sun, Matthew Bihis, Johnny Ren, Paul Hage, Humairat H. Rahman
    Comments: 4 pages, 2 figures, 2 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Paleness or pallor is a manifestation of blood loss or low hemoglobin
    concentrations in the human blood that can be caused by pathologies such as
    anemia. This work presents the first automated screening system that utilizes
    pallor site images, segments, and extracts color and intensity-based features
    for multi-class classification of patients with high pallor due to anemia-like
    pathologies, normal patients and patients with other abnormalities. This work
    analyzes the pallor sites of conjunctiva and tongue for anemia screening
    purposes. First, for the eye pallor site images, the sclera and conjunctiva
    regions are automatically segmented for regions of interest. Similarly, for the
    tongue pallor site images, the inner and outer tongue regions are segmented.
    Then, color-plane based feature extraction is performed followed by machine
    learning algorithms for feature reduction and image level classification for
    anemia. In this work, a suite of classification algorithms performs image-level
    classification for normal (class 0), pallor (class 1) and other abnormalities
    (class 2). The proposed method achieves 86% accuracy, 85% precision and 67%
    recall in eye pallor site images and 98.2% accuracy and precision with 100%
    recall in tongue pallor site images for classification of images with pallor.
    The proposed pallor screening system can be further fine-tuned to detect the
    severity of anemia-like pathologies using a controlled set of local images that
    can then be used for future benchmarking purposes.
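
    The three figures quoted for each pallor site follow from a one-vs-rest
    confusion matrix for the pallor class. The sketch below shows the standard
    definitions; the counts are hypothetical and chosen only to make the arithmetic
    easy to follow, not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from one-vs-rest confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical counts for the "pallor" class versus everything else.
accuracy, precision, recall = binary_metrics(tp=8, fp=2, fn=4, tn=6)
```

    A gap between precision and recall, like the one reported for the eye images,
    means missed pallor cases (false negatives) outnumber false alarms.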

    Learning Robust Visual-Semantic Embeddings

    Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    Many of the existing methods for learning joint embedding of images and text
    use only supervised information from paired images and their textual attributes.
    Taking advantage of the recent success of unsupervised learning in deep neural
    networks, we propose an end-to-end learning framework that is able to extract
    more robust multi-modal representations across domains. The proposed method
    combines representation learning models (i.e., auto-encoders) together with
    cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
    joint embeddings for semantic and visual features. A novel technique of
    unsupervised-data adaptation inference is introduced to construct more
    comprehensive embeddings for both labeled and unlabeled data. We evaluate our
    method on Animals with Attributes and Caltech-UCSD Birds 200-2011 dataset with
    a wide range of applications, including zero and few-shot image recognition and
    retrieval, from inductive to transductive settings. Empirically, we show that
    our framework improves over the current state of the art on many of the
    considered tasks.
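
    As an illustration of the cross-domain criterion mentioned above, here is a
    small sketch of the (biased) squared Maximum Mean Discrepancy between two sets
    of embeddings. The Gaussian kernel and its bandwidth `sigma` are assumptions;
    the abstract does not specify the kernel or estimator the authors use.

```python
import math

def rbf(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_k(p, q):
        return sum(rbf(a, b, sigma) for a in p for b in q) / (len(p) * len(q))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

visual = [(0.0, 0.0), (0.0, 1.0)]     # toy "visual" embeddings
semantic = [(5.0, 5.0), (5.0, 6.0)]   # toy "semantic" embeddings
gap = mmd2(visual, semantic)          # large when the two sets are far apart
```

    Minimizing such a term alongside the auto-encoder losses pulls the two
    embedding distributions toward each other.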

    Need for Speed: A Benchmark for Higher Frame Rate Object Tracking

    Hamed Kiani Galoogahi, Ashton Fagg, Chen Huang, Deva Ramanan, Simon Lucey
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose the first higher frame rate video dataset (called
    Need for Speed – NfS) and benchmark for visual object tracking. The dataset
    consists of 100 videos (380K frames) captured with now commonly available
    higher frame rate (240 FPS) cameras from real world scenarios. All frames are
    annotated with axis aligned bounding boxes and all sequences are manually
    labelled with nine visual attributes – such as occlusion, fast motion,
    background clutter, etc. Our benchmark provides an extensive evaluation of many
    recent and state-of-the-art trackers on higher frame rate sequences. We ranked
    each of these trackers according to their tracking accuracy and real-time
    performance. One of our surprising conclusions is that at higher frame rates,
    simple trackers such as correlation filters outperform complex methods based on
    deep networks. This suggests that for practical applications (such as in
    robotics or embedded vision), one needs to carefully trade off bandwidth
    constraints associated with higher frame rate acquisition, computational costs
    of real-time analysis, and the required application accuracy. Our dataset and
    benchmark allow for the first time (to our knowledge) systematic exploration
    of such issues, and will be made available to allow for further research in
    this space.

    DropRegion Training of Inception Font Network for High-Performance Chinese Font Recognition

    Shuangping Huang, Zhuoyao Zhong, Lianwen Jin, Shuye Zhang, Haobin Wang
    Comments: 15 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Chinese font recognition (CFR) has gained significant attention in recent
    years. However, due to the sparsity of labeled font samples and the structural
    complexity of Chinese characters, CFR is still a challenging task. In this
    paper, a DropRegion method is proposed to generate a large number of stochastic
    variant font samples whose local regions are selectively disrupted and an
    inception font network (IFN) with two additional convolutional neural network
    (CNN) structure elements, i.e., a cascaded cross-channel parametric pooling
    (CCCP) and global average pooling, is designed. Because the distribution of
    strokes in a font image is non-stationary, an elastic meshing technique that
    adaptively constructs a set of local regions with equalized information is
    developed. Thus, DropRegion is seamlessly embedded in the IFN, which enables
    end-to-end training; the proposed DropRegion-IFN can be used for high
    performance CFR. Experimental results have confirmed the effectiveness of our
    new approach for CFR.

    Understanding Traffic Density from Large-Scale Web Camera Data

    Shanghang Zhang, Guanhang Wu, Joao P. Costeira, Jose M. F. Moura
    Comments: Accepted by CVPR 2017. Preprint version was uploaded on this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we estimate traffic density from low quality videos captured
    by city web cameras (webcams). Webcam videos have low resolution, low frame
    rate, high occlusion and large perspective, making most existing methods lose
    their efficacy. To deeply understand traffic density, we explore both deep
    learning based and optimization based methods. To avoid individual vehicle
    detection and tracking, both methods map the image into a vehicle density map,
    one based on rank constrained regression and the other one based on fully
    convolutional networks (FCN). The regression based method learns different
    weights for different blocks in the image to increase the degrees of freedom of the
    weights and embed perspective information. The FCN based method jointly
    estimates vehicle density map and vehicle count with a residual learning
    framework to perform end-to-end dense prediction, allowing arbitrary image
    resolution, and adapting to different vehicle scales and perspectives. We
    analyze and compare both methods, and draw insights from the optimization based
    method to improve the deep model. Since existing datasets do not cover all the
    challenges in our work, we collected and labelled a large-scale traffic video
    dataset, containing 60 million frames from 212 webcams. Both methods are
    extensively evaluated and compared on different counting tasks and three
    datasets, with experimental results demonstrating their effectiveness and
    robustness. In particular, the FCN based method significantly reduces the mean
    absolute error from 10.99 to 5.31 on the public dataset TRANCOS compared with
    the state-of-the-art baseline.

    Towards Closing the Energy Gap Between HOG and CNN Features for Embedded Vision

    Amr Suleiman, Yu-Hsin Chen, Joel Emer, Vivienne Sze
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Computer vision enables a wide range of applications in robotics/drones,
    self-driving cars, smart Internet of Things, and portable/wearable electronics.
    For many of these applications, local embedded processing is preferred due to
    privacy and/or latency concerns. Accordingly, energy-efficient embedded vision
    hardware delivering real-time and robust performance is crucial. While deep
    learning is gaining popularity in several computer vision algorithms, a
    significant energy consumption difference exists compared to traditional
    hand-crafted approaches. In this paper, we provide an in-depth analysis of the
    computation, energy and accuracy trade-offs between learned features such as
    deep Convolutional Neural Networks (CNN) and hand-crafted features such as
    Histogram of Oriented Gradients (HOG). This analysis is supported by
    measurements from two chips that implement these algorithms. Our goal is to
    understand the source of the energy discrepancy between the two approaches and
    to provide insight about the potential areas where CNNs can be improved and
    eventually approach the energy-efficiency of HOG while maintaining their
    outstanding performance accuracy.

    Automatically identifying wild animals in camera trap images with deep learning

    Mohammed Sadegh Norouzzadeh, Anh Nguyen, Margaret Kosmala, Ali Swanson, Craig Packer, Jeff Clune
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Having accurate, detailed, and up-to-date information about wildlife location
    and behavior across broad geographic areas would revolutionize our ability to
    study, conserve, and manage species and ecosystems. Currently such data are
    mostly gathered manually at great expense, and thus are sparsely and
    infrequently collected. Here we investigate the ability to automatically,
    accurately, and inexpensively collect such data from motion sensor cameras.
    These camera traps enable pictures of wildlife to be collected inexpensively,
    unobtrusively, and at high-volume. However, identifying the animals, animal
    attributes, and behaviors in these pictures remains an expensive,
    time-consuming, manual task often performed by researchers, hired technicians,
    or crowdsourced teams of human volunteers. In this paper, we demonstrate that
    such data can be automatically extracted by deep neural networks (aka deep
    learning), which is a cutting-edge type of artificial intelligence. In
    particular, we use the existing human-labeled images from the Snapshot
    Serengeti dataset to train deep convolutional neural networks for identifying
    48 species in 3.2 million images taken from Tanzania’s Serengeti National Park.
    We train neural networks that automatically identify animals with over 92%
    accuracy. More importantly, we can choose to have our system classify only the
    images it is highly confident about, allowing valuable human time to be focused
    only on challenging images. In this case, our automatic animal identification
    system saves approximately 8.3 years (at 40 hours per week) of human labeling
    effort (i.e. over 17,000 hours) while operating on a 3.2-million-image dataset
    at the same 96% accuracy level of crowdsourced teams of human volunteers. Those
    efficiency gains immediately highlight the importance of using deep neural
    networks to automate data extraction from camera trap images.
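
    The two effort figures can be cross-checked with simple arithmetic. The sketch
    below assumes 52 working weeks per year, which the abstract does not state; under
    that assumption 8.3 years at 40 hours per week does come out above 17,000 hours.

```python
HOURS_PER_WEEK = 40   # stated in the abstract
WEEKS_PER_YEAR = 52   # assumption: no allowance for holidays

def years_to_hours(years):
    """Convert full-time working years to hours under the assumptions above."""
    return years * WEEKS_PER_YEAR * HOURS_PER_WEEK

hours = years_to_hours(8.3)  # roughly 17,264 hours
```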

    Low-rank and Sparse NMF for Joint Endmembers' Number Estimation and Blind Unmixing of Hyperspectral Images

    Paris V. Giampouras, Athanasios A. Rontogiannis, Konstantinos D. Koutroumbas
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Estimation of the number of endmembers existing in a scene constitutes a
    critical task in the hyperspectral unmixing process. The accuracy of this
    estimate plays a crucial role in subsequent unsupervised unmixing steps i.e.,
    the derivation of the spectral signatures of the endmembers (endmembers’
    extraction) and the estimation of the abundance fractions of the pixels. A
    common practice widely followed in the literature is to treat endmembers’ number
    estimation and unmixing independently, as two separate tasks, providing the
    outcome of the former as input to the latter. In this paper, we go beyond this
    computationally demanding strategy. More precisely, we set forth a multiple
    constrained optimization framework, which encapsulates endmembers’ number
    estimation and unsupervised unmixing in a single task. This is attained by
    suitably formulating the problem via a low-rank and sparse nonnegative matrix
    factorization rationale, where low-rankness is promoted with the use of a
    sophisticated \(\ell_2/\ell_1\) norm penalty term. An alternating proximal
    algorithm is then proposed for minimizing the emerging cost function. The
    results obtained by simulated and real data experiments verify the
    effectiveness of the proposed approach.


    Artificial Intelligence

    Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks

    Denis Deratani Mauá, Cassio P. de Campos
    Comments: 14 pages
    Subjects: Artificial Intelligence (cs.AI)

    We discuss the computational complexity of approximating maximum a posteriori
    inference in sum-product networks. We first show NP-hardness in three-level
    trees by a reduction from maximum independent set; this implies
    non-approximability within a sublinear factor. We show that this is a tight
    bound, as we can find an approximation within a linear factor in three-level
    networks. We then show that in four-level trees it is NP-hard to approximate
    the problem within a factor \(2^{f(n)}\) for any sublinear function \(f\) of the
    size of the input \(n\). Again, this bound is tight, as we prove that the
    usual max-product algorithm finds (in any network) approximations within factor
    \(2^{c n}\) for some constant \(c < 1\). Last, we present a simple algorithm, and
    show that it provably produces solutions at least as good as, and potentially
    much better than, the max-product algorithm.

    A Visual Web Tool to Perform What-If Analysis of Optimization Approaches

    Sascha Van Cauwelaert, Michele Lombardi, Pierre Schaus
    Subjects: Artificial Intelligence (cs.AI); Performance (cs.PF)

    In Operations Research, practical evaluation is essential to validate the
    efficacy of optimization approaches. This paper promotes the usage of
    performance profiles as a standard practice to visualize and analyze
    experimental results. It introduces a Web tool to construct and export
    performance profiles as SVG or HTML files. In addition, the application relies
    on a methodology to estimate the benefit of hypothetical solver improvements.
    Therefore, the tool allows one to employ what-if analysis to screen possible
    research directions, and identify those having the best potential. The approach
    is showcased on two Operations Research technologies: Constraint Programming and
    Mixed Integer Linear Programming.
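
    A performance profile, in its usual form, reports for each solver the fraction
    of instances it solves within a factor tau of the best solver on that instance.
    The sketch below follows that common definition; whether the Web tool uses
    exactly this ratio is an assumption, and the solver names and timings are
    made up.

```python
def performance_profile(times, tau):
    """Fraction of instances each solver solves within `tau` times the best time.

    `times` maps a solver name to its list of per-instance run times
    (same instance order for every solver).
    """
    solvers = list(times)
    n = len(times[solvers[0]])
    best = [min(times[s][i] for s in solvers) for i in range(n)]
    return {
        s: sum(1 for i in range(n) if times[s][i] <= tau * best[i]) / n
        for s in solvers
    }

# Hypothetical run times (seconds) on three instances.
times = {"cp": [1.0, 4.0, 2.0], "mip": [2.0, 1.0, 2.0]}
at_best = performance_profile(times, tau=1.0)
within_2x = performance_profile(times, tau=2.0)
```

    A what-if analysis in this setting could rescale one solver's times by a
    hypothetical speed-up and recompute the profile to see how the curves move.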

    Generalised Reichenbachian Common Cause Systems

    Claudio Mazzola
    Subjects: Other Statistics (stat.OT); Artificial Intelligence (cs.AI)

    The principle of the common cause claims that if an improbable coincidence
    has occurred, there must exist a common cause. This is generally taken to mean
    that positive correlations between non-causally related events should disappear
    when conditioning on the action of some underlying common cause. The extended
    interpretation of the principle, by contrast, urges that common causes should
    be called for in order to explain positive deviations between the estimated
    correlation of two events and the expected value of their correlation. The aim
    of this paper is to provide the extended reading of the principle with a
    general probabilistic model, capturing the simultaneous action of a system of
    multiple common causes. To this end, two distinct models are elaborated, and
    the necessary and sufficient conditions for their existence are determined.

    Modeling Relational Data with Graph Convolutional Networks

    Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Databases (cs.DB); Learning (cs.LG)

    Knowledge bases play a crucial role in many applications, for example
    question answering and information retrieval. Despite the great effort invested
    in creating and maintaining them, even the largest representatives (e.g., Yago,
    DBPedia or Wikidata) are highly incomplete. We introduce relational graph
    convolutional networks (R-GCNs) and apply them to two standard knowledge base
    completion tasks: link prediction (recovery of missing facts, i.e.
    subject-predicate-object triples) and entity classification (recovery of
    missing attributes of entities). R-GCNs are a generalization of graph
    convolutional networks, a recent class of neural networks operating on graphs,
    and are developed specifically to deal with highly multi-relational data,
    characteristic of realistic knowledge bases. Our methods achieve competitive
    results on standard benchmarks for both tasks.
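
    The abstract gives no formulas. As a hedged sketch of what one relational graph
    convolution step can look like, the code below gives each node a self-connection
    plus, for every relation type, the mean of its incoming neighbours' features
    transformed by a relation-specific weight matrix, followed by a ReLU. The
    normalization, activation, and all names are assumptions made for illustration.

```python
def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def rgcn_layer(h, edges, w_rel, w_self):
    """One relational message-passing step (illustrative, not the paper's code).

    h: {node: feature list}; edges: [(source, relation, target)];
    w_rel: {relation: weight matrix}; w_self: self-connection weight matrix.
    """
    out = {}
    for i in h:
        acc = matvec(w_self, h[i])
        for rel, w in w_rel.items():
            nbrs = [s for s, r, d in edges if r == rel and d == i]
            for j in nbrs:
                msg = matvec(w, h[j])
                # Mean over neighbours that reach node i via this relation.
                acc = [a + m / len(nbrs) for a, m in zip(acc, msg)]
        out[i] = [max(0.0, a) for a in acc]  # ReLU
    return out

identity = [[1.0, 0.0], [0.0, 1.0]]
h = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
edges = [("a", "likes", "b")]
updated = rgcn_layer(h, edges, {"likes": [[2.0, 0.0], [0.0, 2.0]]}, identity)
```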

    Particle Value Functions

    Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    The policy gradients of the expected return objective can react slowly to
    rare rewards. Yet, in some cases agents may wish to emphasize the low or high
    returns regardless of their probability. Borrowing from the economics and
    control literature, we review the risk-sensitive value function that arises
    from an exponential utility and illustrate its effects on an example. This
    risk-sensitive value function is not always applicable to reinforcement
    learning problems, so we introduce the particle value function defined by a
    particle filter over the distributions of an agent’s experience, which bounds
    the risk-sensitive one. We illustrate the benefit of the policy gradients of
    this objective in Cliffworld.
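
    As a rough illustration of the exponential-utility idea mentioned above, the sketch below assumes the standard form of the risk-sensitive value, (1/beta) * log E[exp(beta * R)], and estimates it from sampled returns. The function name, the sample returns, and the values of beta are hypothetical and are not taken from the paper; the particle value function itself is not implemented here.

```python
import math

def risk_sensitive_value(returns, beta):
    """Empirical (1/beta) * log(mean(exp(beta * R))) over sampled returns.

    beta = 0 is treated as the plain expected return (the limiting case).
    """
    if beta == 0.0:
        return sum(returns) / len(returns)
    # Log-mean-exp with the maximum subtracted for numerical stability.
    scaled = [beta * r for r in returns]
    m = max(scaled)
    log_mean = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return log_mean / beta

# Hypothetical returns: 99 episodes with no reward and one rare, large reward.
returns = [0.0] * 99 + [100.0]

expected = risk_sensitive_value(returns, 0.0)   # plain average
seeking = risk_sensitive_value(returns, 0.1)    # beta > 0 emphasizes high returns
averse = risk_sensitive_value(returns, -0.1)    # beta < 0 emphasizes low returns

print(expected, seeking, averse)
```

    With these made-up numbers, a positive beta pulls the value toward the rare high return and a negative beta pulls it toward the common zero return, which is the effect the abstract describes.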


    Information Retrieval

    Global Entity Ranking Across Multiple Languages

    Prantik Bhattacharyya, Nemanja Spasojevic
    Comments: 2 Pages, 1 Figure, 2 Tables, WWW 2017 Companion
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    We present work on building a global long-tailed ranking of entities across
    multiple languages using Wikipedia and Freebase knowledge bases. We identify
    multiple features and build a model to rank entities using a ground-truth
    dataset of more than 10 thousand labels. The final system ranks 27 million
    entities with 75% precision and 48% F1 score. We provide performance evaluation
    and empirical evidence of the quality of ranking across languages, and open the
    final ranked lists for future research.

    Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture

    Yuanliang Meng, Anna Rumshisky, Alexey Romanov
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    In this paper, we propose to use a set of simple LSTM-based models with a
    uniform architecture to recover different kinds of temporal relations from text.
    Using the shortest dependency path between entities as input, the same
    architecture is used to extract intra-sentence, cross-sentence, and document
    creation time relations. A “double-checking” technique reverses entity pairs in
    classification, boosting the recall of positive cases and reducing
    misclassifications between opposite classes. An efficient pruning algorithm
    resolves conflicts globally. Evaluated on QA-TempEval (SemEval2015 Task 5), our
    proposed technique outperforms state-of-the-art methods by a large margin.

    Search Engine Drives the Evolution of Social Networks

    Cai Fu, Chenchen Peng, Xiao-Yang Liu
    Comments: 9 pages, 11 figures
    Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR); Physics and Society (physics.soc-ph)

    The search engine is tightly coupled with social networks and is primarily
    designed to help users find information of interest. Specifically, the search
    engine assists information dissemination in social networks, i.e., it
    lets users reach content of interest through keyword search and
    promotes the transfer of content from the source users directly
    to potentially interested users. Accompanying such processes, the social network
    evolves as new links emerge between users with common interests. However, there
    is no clear understanding of such a “chicken-and-egg” problem, namely, new
    links encourage more social interactions, and vice versa. In this paper, we aim
    to quantitatively characterize the social network evolution phenomenon driven
    by a search engine. First, we propose a search network model for social network
    evolution. Second, we adopt two performance metrics, namely, degree
    distribution and network diameter. Theoretically, we prove that the degree
    distribution follows an intensified power-law, and the network diameter
    shrinks. Third, we quantitatively show that the search engine accelerates the
    rumor propagation in social networks. Finally, based on four real-world data
    sets (i.e., CDBLP, Facebook, Weibo Tweets, P2P), we verify our theoretical
    findings. Furthermore, we find that the search engine dramatically increases
    the speed of rumor propagation.
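
    The two metrics named above, degree distribution and network diameter, can be computed on a small graph as in the following minimal sketch. The toy graph, the added link, and the function names are hypothetical, the graph is assumed to be connected and undirected, and this is not the authors' search network model.

```python
from collections import Counter, deque

def degree_distribution(adj):
    """Map each degree to the number of nodes that have it."""
    return Counter(len(neighbors) for neighbors in adj.values())

def eccentricity(adj, source):
    """Longest shortest-path distance from source, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter(adj):
    """Largest eccentricity over all nodes (assumes a connected graph)."""
    return max(eccentricity(adj, s) for s in adj)

# Hypothetical network: a path 0-1-2-3-4.
before = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

# Hypothetical new link 0-4, e.g. two users who found each other via search.
after = {node: set(neighbors) for node, neighbors in before.items()}
after[0].add(4)
after[4].add(0)

print(diameter(before), dict(degree_distribution(before)))
print(diameter(after), dict(degree_distribution(after)))
```

    In this toy case a single extra link shortens the diameter, which is the kind of shrinkage the abstract refers to.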


    Computation and Language

    Construction of a Japanese Word Similarity Dataset

    Yuya Sakaizawa, Mamoru Komachi
    Comments: 5 pages
    Subjects: Computation and Language (cs.CL)

    An evaluation of distributed word representation is generally conducted using
    a word similarity task and/or a word analogy task. There are many datasets
    readily available for these tasks in English. However, evaluating distributed
    representation in languages that do not have such resources (e.g., Japanese) is
    difficult. Therefore, as a first step toward evaluating distributed
    representations in Japanese, we constructed a Japanese word similarity dataset.
    To the best of our knowledge, our dataset is the first resource that can be
    used to evaluate distributed representations in Japanese. Moreover, our dataset
    contains various parts of speech and includes rare words in addition to common
    words.

    Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling

    Wenpeng Li, BinBin Zhang, Lei Xie, Dong Yu
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    Deep learning models (DLMs) are state-of-the-art techniques in speech
    recognition. However, training good DLMs can be time consuming especially for
    production-size models and corpora. Although several parallel training
    algorithms have been proposed to improve training efficiency, there is no clear
    guidance on which one to choose for the task at hand due to lack of systematic
    and fair comparison among them. In this paper we aim to fill this gap by
    comparing four popular parallel training algorithms in speech recognition,
    namely asynchronous stochastic gradient descent (ASGD), blockwise model-update
    filtering (BMUF), bulk synchronous parallel (BSP) and elastic averaging
    stochastic gradient descent (EASGD), on 1000-hour LibriSpeech corpora using
    feed-forward deep neural networks (DNNs) and convolutional, long short-term
    memory, DNNs (CLDNNs). Based on our experiments, we recommend using BMUF as the
    top choice to train acoustic models since it is most stable, scales well with
    number of GPUs, can achieve reproducible results, and in many cases even
    outperforms single-GPU SGD. ASGD can be used as a substitute in some cases.

    Global Entity Ranking Across Multiple Languages

    Prantik Bhattacharyya, Nemanja Spasojevic
    Comments: 2 Pages, 1 Figure, 2 Tables, WWW 2017 Companion
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    We present work on building a global long-tailed ranking of entities across
    multiple languages using Wikipedia and Freebase knowledge bases. We identify
    multiple features and build a model to rank entities using a ground-truth
    dataset of more than 10 thousand labels. The final system ranks 27 million
    entities with 75% precision and 48% F1 score. We provide performance evaluation
    and empirical evidence of the quality of ranking across languages, and open the
    final ranked lists for future research.

    Learning Robust Visual-Semantic Embeddings

    Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    Many of the existing methods for learning joint embedding of images and text
    use only supervised information from paired images and their textual attributes.
    Taking advantage of the recent success of unsupervised learning in deep neural
    networks, we propose an end-to-end learning framework that is able to extract
    more robust multi-modal representations across domains. The proposed method
    combines representation learning models (i.e., auto-encoders) together with
    cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
    joint embeddings for semantic and visual features. A novel technique of
    unsupervised-data adaptation inference is introduced to construct more
    comprehensive embeddings for both labeled and unlabeled data. We evaluate our
    method on Animals with Attributes and Caltech-UCSD Birds 200-2011 dataset with
    a wide range of applications, including zero and few-shot image recognition and
    retrieval, from inductive to transductive settings. Empirically, we show that
    our framework improves over the current state of the art on many of the
    considered tasks.
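
    For readers unfamiliar with the Maximum Mean Discrepancy criterion mentioned above, here is a minimal sketch of the standard biased estimate of squared MMD between two samples. The RBF kernel, the gamma value, and the sample points are assumptions made for illustration; the abstract does not say which kernel or estimator the authors use.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Hypothetical 2-D feature samples from two domains.
domain_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
domain_b = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]

same = mmd_squared(domain_a, domain_a)      # identical samples
shifted = mmd_squared(domain_a, domain_b)   # shifted samples

print(same, shifted)
```

    A value near zero means the two samples look alike under the kernel, so minimizing such a quantity pushes the two domains toward a shared embedding.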

    Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture

    Yuanliang Meng, Anna Rumshisky, Alexey Romanov
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    In this paper, we propose to use a set of simple LSTM-based models with a
    uniform architecture to recover different kinds of temporal relations from text.
    Using the shortest dependency path between entities as input, the same
    architecture is used to extract intra-sentence, cross-sentence, and document
    creation time relations. A “double-checking” technique reverses entity pairs in
    classification, boosting the recall of positive cases and reducing
    misclassifications between opposite classes. An efficient pruning algorithm
    resolves conflicts globally. Evaluated on QA-TempEval (SemEval2015 Task 5), our
    proposed technique outperforms state-of-the-art methods by a large margin.


    Distributed, Parallel, and Cluster Computing

    Communication Primitives in Cognitive Radio Networks

    Seth Gilbert, Fabian Kuhn, Chaodong Zheng
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Cognitive radio networks are a new type of multi-channel wireless network in
    which different nodes can have access to different sets of channels. By
    providing multiple channels, they improve the efficiency and reliability of
    wireless communication. However, the heterogeneous nature of cognitive radio
    networks also brings new challenges to the design and analysis of distributed
    algorithms.

    In this paper, we focus on two fundamental problems in cognitive radio
    networks: neighbor discovery, and global broadcast. We consider a network
    containing \(n\) nodes, each of which has access to \(c\) channels. We assume the
    network has diameter \(D\), and each pair of neighbors have at least \(k \geq 1\),
    and at most \(k_{max} \leq c\), shared channels. We also assume each node has at
    most \(\Delta\) neighbors. For the neighbor discovery problem, we design a
    randomized algorithm CSeek which has time complexity
    \(\tilde{O}((c^2/k) + (k_{max}/k) \cdot \Delta)\). CSeek is flexible and robust,
    which allows us to use it as a generic “filter” to find “well-connected”
    neighbors with an even shorter running time. We then move on to the global
    broadcast problem, and propose CGCast, a randomized algorithm which takes
    \(\tilde{O}((c^2/k) + (k_{max}/k) \cdot \Delta + D \cdot \Delta)\) time. CGCast uses
    CSeek to achieve communication among neighbors, and uses edge coloring to
    establish an efficient schedule for fast message dissemination.

    Towards the end of the paper, we give lower bounds for solving the two
    problems. These lower bounds demonstrate that in many situations, CSeek and
    CGCast are near optimal.

    Block CUR : Decomposing Large Distributed Matrices

    Urvashi Oswal, Swayambhoo Jain, Kevin S. Xu, Brian Eriksson
    Subjects: Machine Learning (stat.ML); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    A common problem in large-scale data analysis is to approximate a matrix
    using a combination of specifically sampled rows and columns, known as CUR
    decomposition. Unfortunately, in many real-world environments, the ability to
    sample specific individual rows or columns of the matrix is limited by either
    system constraints or cost. In this paper, we consider matrix approximation by
    sampling predefined blocks of columns (or rows) from the matrix. This regime is
    commonly found when data is distributed across multiple nodes in a compute
    cluster, where such blocks correspond to columns (or rows) of the matrix stored
    on the same node, which can be retrieved with much less overhead than
    retrieving individual columns stored across different nodes. We propose a novel
    algorithm for sampling useful column blocks and provide guarantees for the
    quality of the approximation. We demonstrate the practical utility of this
    algorithm for computing the block CUR decomposition of large matrices in a
    distributed setting using Apache Spark. Using our proposed block CUR
    algorithms, we can achieve a significant speed-up compared to a regular CUR
    decomposition with the same quality of approximation.

    Computation Peer Offloading for Energy-Constrained Mobile Edge Computing in Small-Cell Networks

    Lixing Chen, Sheng Zhou, Jie Xu
    Subjects: Computer Science and Game Theory (cs.GT); Distributed, Parallel, and Cluster Computing (cs.DC)

    The (ultra-)dense deployment of small-cell base stations (SBSs) endowed with
    cloud-like computing functionalities paves the way for pervasive mobile edge
    computing (MEC), enabling ultra-low latency and location-awareness for a
    variety of emerging mobile applications and the Internet of Things. To handle
    spatially uneven computation workloads in the network, cooperation among SBSs
    via workload peer offloading is essential to avoid large computation latency at
    overloaded SBSs and provide high quality of service to end users. However,
    performing effective peer offloading faces many unique challenges in small cell
    networks due to limited energy resources committed by self-interested SBS
    owners, uncertainties in the system dynamics and co-provisioning of radio
    access and computing services. This paper develops a novel online SBS peer
    offloading framework, called OPEN, by leveraging the Lyapunov technique, in
    order to maximize the long-term system performance while keeping the energy
    consumption of SBSs below individual long-term constraints. OPEN works online
    without requiring information about future system dynamics, yet provides
    provably near-optimal performance compared to the oracle solution that has the
    complete future information. In addition, this paper formulates a novel peer
    offloading game among SBSs, analyzes its equilibrium and efficiency loss in
    terms of the price of anarchy in order to thoroughly understand SBSs’ strategic
    behaviors, thereby enabling decentralized and autonomous peer offloading
    decision making. Extensive simulations are carried out and show that peer
    offloading among SBSs dramatically improves the edge computing performance.


    Learning

    Deep Sets

    Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we study the problem of designing objective functions for
    machine learning problems defined on finite sets. In contrast to
    traditional objective functions defined for machine learning problems operating
    on finite dimensional vectors, the new objective functions we propose are
    operating on finite sets and are invariant to permutations. Such problems are
    widespread, ranging from estimation of population statistics
    \citep{poczos13aistats}, via anomaly detection in piezometer data of embankment
    dams \citep{Jung15Exploration}, to cosmology
    \citep{Ntampaka16Dynamical,Ravanbakhsh16ICML1}. Our main theorem characterizes
    the permutation invariant objective functions and provides a family of
    functions to which any permutation invariant objective function must belong.
    This family of functions has a special structure which enables us to design a
    deep network architecture that can operate on sets and which can be deployed on
    a variety of scenarios including both unsupervised and supervised learning
    tasks. We demonstrate the applicability of our method on population statistic
    estimation, point cloud classification, set expansion, and image tagging.
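
    The family of functions referred to above is usually written as rho(sum of phi(x) over the elements x of the set). The sketch below shows why any function of that shape is permutation invariant. The particular phi and rho here are hypothetical hand-written stand-ins; in the paper's architecture they would be learned networks.

```python
import math
from itertools import permutations

def phi(x):
    """Hypothetical per-element feature map (a learned network in practice)."""
    return (x, x * x)

def rho(pooled):
    """Hypothetical readout applied to the pooled features."""
    first, second = pooled
    return math.tanh(0.1 * first) + 0.01 * second

def set_function(elements):
    """rho(sum over elements of phi(x)); summation ignores element order."""
    pooled = [0.0, 0.0]
    for x in elements:
        features = phi(x)
        pooled[0] += features[0]
        pooled[1] += features[1]
    return rho(pooled)

values = [1.0, 2.0, 3.0]
outputs = [set_function(list(order)) for order in permutations(values)]

# Every ordering of the same elements gives the same output.
print(max(outputs) - min(outputs))
```

    Because the only interaction between elements is the sum, reordering the input cannot change the result.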

    Online Learning for Offloading and Autoscaling in Energy Harvesting Mobile Edge Computing

    Jie Xu, Lixing Chen, Shaolei Ren
    Comments: arXiv admin note: text overlap with arXiv:1701.01090 by other authors
    Subjects: Learning (cs.LG); Networking and Internet Architecture (cs.NI)

    Mobile edge computing (a.k.a. fog computing) has recently emerged to enable
    in-situ processing of delay-sensitive applications at the edge of mobile
    networks. Providing grid power supply in support of mobile edge computing,
    however, is costly and even infeasible (in certain rugged or under-developed
    areas), thus mandating on-site renewable energy as a major or even sole power
    supply in increasingly many scenarios. Nonetheless, the high intermittency and
    unpredictability of renewable energy make it very challenging to deliver a high
    quality of service to users in energy harvesting mobile edge computing systems.
    In this paper, we address the challenge of incorporating renewables into mobile
    edge computing and propose an efficient reinforcement learning-based resource
    management algorithm, which learns on-the-fly the optimal policy of dynamic
    workload offloading (to the centralized cloud) and edge server provisioning to
    minimize the long-term system cost (including both service delay and
    operational cost). Our online learning algorithm uses a decomposition of the
    (offline) value iteration and (online) reinforcement learning, thus achieving a
    significant improvement of learning rate and run-time performance when compared
    to standard reinforcement learning algorithms such as Q-learning. We prove the
    convergence of the proposed algorithm and analytically show that the learned
    policy has a simple monotone structure amenable to practical implementation.
    Our simulation results validate the efficacy of our algorithm, which
    significantly improves the edge computing performance compared to fixed or
    myopic optimization schemes and conventional reinforcement learning algorithms.

    Conditional Accelerated Lazy Stochastic Gradient Descent

    Guanghui Lan, Sebastian Pokutta, Yi Zhou, Daniel Zink
    Comments: 33 pages, 9 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this work we introduce a conditional accelerated lazy stochastic gradient
    descent algorithm with optimal number of calls to a stochastic first-order
    oracle and convergence rate \(O\left(\frac{1}{\varepsilon^2}\right)\), improving
    over the projection-free, Online Frank-Wolfe based stochastic gradient descent
    of Hazan and Kale [2012] with convergence rate
    \(O\left(\frac{1}{\varepsilon^4}\right)\).

    Particle Value Functions

    Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    The policy gradients of the expected return objective can react slowly to
    rare rewards. Yet, in some cases agents may wish to emphasize the low or high
    returns regardless of their probability. Borrowing from the economics and
    control literature, we review the risk-sensitive value function that arises
    from an exponential utility and illustrate its effects on an example. This
    risk-sensitive value function is not always applicable to reinforcement
    learning problems, so we introduce the particle value function defined by a
    particle filter over the distributions of an agent’s experience, which bounds
    the risk-sensitive one. We illustrate the benefit of the policy gradients of
    this objective in Cliffworld.

    Nonconvex One-bit Single-label Multi-label Learning

    Shuang Qiu, Tingjin Luo, Jieping Ye, Ming Lin
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We study an extreme scenario in multi-label learning where each training
    instance is endowed with a single one-bit label out of multiple labels. We
    formulate this problem as a non-trivial special case of one-bit rank-one matrix
    sensing and develop an efficient non-convex algorithm based on alternating
    power iteration. The proposed algorithm is able to recover the underlying
    low-rank matrix model with linear convergence. For a rank-\(k\) model with \(d_1\)
    features and \(d_2\) classes, the proposed algorithm achieves \(O(\epsilon)\)
    recovery error after retrieving \(O(k^{1.5} d_1 d_2/\epsilon)\) one-bit labels
    within \(O(kd)\) memory. Our bound is nearly optimal in the order of
    \(O(1/\epsilon)\). This significantly improves the state-of-the-art sampling
    complexity of one-bit multi-label learning. We perform experiments to verify
    our theory and evaluate the performance of the proposed algorithm.

    Modeling Relational Data with Graph Convolutional Networks

    Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Databases (cs.DB); Learning (cs.LG)

    Knowledge bases play a crucial role in many applications, for example
    question answering and information retrieval. Despite the great effort invested
    in creating and maintaining them, even the largest representatives (e.g., Yago,
    DBPedia or Wikidata) are highly incomplete. We introduce relational graph
    convolutional networks (R-GCNs) and apply them to two standard knowledge base
    completion tasks: link prediction (recovery of missing facts, i.e.
    subject-predicate-object triples) and entity classification (recovery of
    missing attributes of entities). R-GCNs are a generalization of graph
    convolutional networks, a recent class of neural networks operating on graphs,
    and are developed specifically to deal with highly multi-relational data,
    characteristic of realistic knowledge bases. Our methods achieve competitive
    results on standard benchmarks for both tasks.

    Machine learning approach for early detection of autism by combining questionnaire and home video screening

    Halim Abbas, Ford Garberson, Eric Glover, Dennis P Wall
    Subjects: Computers and Society (cs.CY); Learning (cs.LG)

    Existing screening tools for early detection of autism are expensive,
    cumbersome, time-intensive, and sometimes fall short in predictive value. In
    this work, we apply Machine Learning (ML) to gold standard clinical data
    obtained across thousands of children at risk for autism spectrum disorders to
    create a low-cost, quick, and easy-to-apply autism screening tool that performs
    as well as or better than most widely used standardized instruments. This new tool
    combines two screening methods into a single assessment, one based on short,
    structured parent-report questionnaires and the other on tagging key behaviors
    from short, semi-structured home videos of children. To overcome the scarcity,
    sparsity, and imbalance of training data, we apply creative feature selection,
    feature engineering, and novel feature encoding techniques. We allow for
    inconclusive determination where appropriate in order to boost screening
    accuracy when conclusive. We demonstrate a significant accuracy improvement
    over standard screening tools in a clinical study sample of 162 children.

    Block CUR : Decomposing Large Distributed Matrices

    Urvashi Oswal, Swayambhoo Jain, Kevin S. Xu, Brian Eriksson
    Subjects: Machine Learning (stat.ML); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    A common problem in large-scale data analysis is to approximate a matrix
    using a combination of specifically sampled rows and columns, known as CUR
    decomposition. Unfortunately, in many real-world environments, the ability to
    sample specific individual rows or columns of the matrix is limited by either
    system constraints or cost. In this paper, we consider matrix approximation by
    sampling predefined blocks of columns (or rows) from the matrix. This regime is
    commonly found when data is distributed across multiple nodes in a compute
    cluster, where such blocks correspond to columns (or rows) of the matrix stored
    on the same node, which can be retrieved with much less overhead than
    retrieving individual columns stored across different nodes. We propose a novel
    algorithm for sampling useful column blocks and provide guarantees for the
    quality of the approximation. We demonstrate the practical utility of this
    algorithm for computing the block CUR decomposition of large matrices in a
    distributed setting using Apache Spark. Using our proposed block CUR
    algorithms, we can achieve a significant speed-up compared to a regular CUR
    decomposition with the same quality of approximation.

    Comparison of Different Methods for Tissue Segmentation in Histopathological Whole-Slide Images

    Péter Bándi, Rob van de Loo, Milad Intezar, Daan Geijs, Francesco Ciompi, Bram van Ginneken, Jeroen van der Laak, Geert Litjens
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Tissue segmentation is an important pre-requisite for efficient and accurate
    diagnostics in digital pathology. However, it is well known that whole-slide
    scanners can fail in detecting all tissue regions, for example due to the
    tissue type, or due to weak staining because their tissue detection algorithms
    are not robust enough. In this paper, we introduce two different convolutional
    neural network architectures for whole slide image segmentation to accurately
    identify the tissue sections. We also compare the algorithms to a published
    traditional method. We collected 54 whole slide images with differing stains
    and tissue types from three laboratories to validate our algorithms. We show
    that, while the two network-based methods do not differ significantly, they outperform their
    traditional counterpart (Jaccard index of 0.937 and 0.929 vs. 0.870, p < 0.01).
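
    The Jaccard index quoted above is the size of the intersection of two regions divided by the size of their union. A minimal sketch on flattened binary masks follows; the masks and the function name are made up for illustration and are unrelated to the study's data.

```python
def jaccard_index(mask_a, mask_b):
    """|A intersect B| / |A union B| for two equally sized binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if union == 0 else intersection / union

# Hypothetical flattened tissue masks: 1 = tissue, 0 = background.
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
reference = [1, 1, 0, 0, 0, 1, 1, 0]

score = jaccard_index(predicted, reference)
print(score)  # 3 shared pixels out of 5 in the union
```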

    Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery

    Thomas Schlegl, Philipp Seeböck, Sebastian M. Waldstein, Ursula Schmidt-Erfurth, Georg Langs
    Comments: To be published in the proceedings of the international conference on Information Processing in Medical Imaging (IPMI), 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Obtaining models that capture imaging markers relevant for disease
    progression and treatment monitoring is challenging. Models are typically based
    on large amounts of data with annotated examples of known markers aiming at
    automating detection. High annotation effort and the limitation to a vocabulary
    of known markers limit the power of such approaches. Here, we perform
    unsupervised learning to identify anomalies in imaging data as candidates for
    markers. We propose AnoGAN, a deep convolutional generative adversarial network
    to learn a manifold of normal anatomical variability, accompanying a novel
    anomaly scoring scheme based on the mapping from image space to a latent space.
    Applied to new data, the model labels anomalies, and scores image patches
    indicating their fit into the learned distribution. Results on optical
    coherence tomography images of the retina demonstrate that the approach
    correctly identifies anomalous images, such as images containing retinal fluid
    or hyperreflective foci.

    Learning Robust Visual-Semantic Embeddings

    Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    Many of the existing methods for learning joint embedding of images and text
    use only supervised information from paired images and their textual attributes.
    Taking advantage of the recent success of unsupervised learning in deep neural
    networks, we propose an end-to-end learning framework that is able to extract
    more robust multi-modal representations across domains. The proposed method
    combines representation learning models (i.e., auto-encoders) together with
    cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
    joint embeddings for semantic and visual features. A novel technique of
    unsupervised-data adaptation inference is introduced to construct more
    comprehensive embeddings for both labeled and unlabeled data. We evaluate our
    method on Animals with Attributes and Caltech-UCSD Birds 200-2011 dataset with
    a wide range of applications, including zero and few-shot image recognition and
    retrieval, from inductive to transductive settings. Empirically, we show that
    our framework improves over the current state of the art on many of the
    considered tasks.

    Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling

    Wenpeng Li, BinBin Zhang, Lei Xie, Dong Yu
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    Deep learning models (DLMs) are state-of-the-art techniques in speech
    recognition. However, training good DLMs can be time consuming especially for
    production-size models and corpora. Although several parallel training
    algorithms have been proposed to improve training efficiency, there is no clear
    guidance on which one to choose for the task at hand due to lack of systematic
    and fair comparison among them. In this paper we aim to fill this gap by
    comparing four popular parallel training algorithms in speech recognition,
    namely asynchronous stochastic gradient descent (ASGD), blockwise model-update
    filtering (BMUF), bulk synchronous parallel (BSP) and elastic averaging
    stochastic gradient descent (EASGD), on 1000-hour LibriSpeech corpora using
    feed-forward deep neural networks (DNNs) and convolutional, long short-term
    memory, DNNs (CLDNNs). Based on our experiments, we recommend using BMUF as the
    top choice to train acoustic models since it is most stable, scales well with
    number of GPUs, can achieve reproducible results, and in many cases even
    outperforms single-GPU SGD. ASGD can be used as a substitute in some cases.

    Automatically identifying wild animals in camera trap images with deep learning

    Mohammed Sadegh Norouzzadeh, Anh Nguyen, Margaret Kosmala, Ali Swanson, Craig Packer, Jeff Clune
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Having accurate, detailed, and up-to-date information about wildlife location
    and behavior across broad geographic areas would revolutionize our ability to
    study, conserve, and manage species and ecosystems. Currently such data are
    mostly gathered manually at great expense, and thus are sparsely and
    infrequently collected. Here we investigate the ability to automatically,
    accurately, and inexpensively collect such data from motion sensor cameras.
    These camera traps enable pictures of wildlife to be collected inexpensively,
    unobtrusively, and at high-volume. However, identifying the animals, animal
    attributes, and behaviors in these pictures remains an expensive,
    time-consuming, manual task often performed by researchers, hired technicians,
    or crowdsourced teams of human volunteers. In this paper, we demonstrate that
    such data can be automatically extracted by deep neural networks (aka deep
    learning), which is a cutting-edge type of artificial intelligence. In
    particular, we use the existing human-labeled images from the Snapshot
    Serengeti dataset to train deep convolutional neural networks for identifying
    48 species in 3.2 million images taken from Tanzania’s Serengeti National Park.
    We train neural networks that automatically identify animals with over 92%
    accuracy. More importantly, we can choose to have our system classify only the
    images it is highly confident about, allowing valuable human time to be focused
    only on challenging images. In this case, our automatic animal identification
    system saves approximately 8.3 years (at 40 hours per week) of human labeling
    effort (i.e. over 17,000 hours) while operating on a 3.2-million-image dataset
    at the same 96% accuracy level of crowdsourced teams of human volunteers. Those
    efficiency gains immediately highlight the importance of using deep neural
    networks to automate data extraction from camera trap images.
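
    The confidence-based routing described above can be sketched as follows. This
    is a minimal illustration, not the authors' implementation: the function name
    route_by_confidence, the 0.95 threshold, and the sample predictions are all
    hypothetical.

```python
def route_by_confidence(predictions, threshold=0.95):
    """Split predictions into auto-labeled items and items needing human review.

    predictions: list of (image_id, label, confidence) tuples, where confidence
    is the classifier's top-class probability (assumed to lie in [0, 1]).
    """
    auto, manual = [], []
    for image_id, label, confidence in predictions:
        if confidence >= threshold:
            auto.append((image_id, label))   # trust the network's label
        else:
            manual.append(image_id)          # defer to a human labeler
    return auto, manual


# Hypothetical classifier outputs (illustrative values, not from the paper).
predictions = [
    ("img_001", "zebra", 0.99),
    ("img_002", "wildebeest", 0.62),
    ("img_003", "lion", 0.97),
    ("img_004", "gazelle", 0.80),
]
auto, manual = route_by_confidence(predictions, threshold=0.95)
print("auto-labeled:", auto)
print("needs review:", manual)
```

    Raising the threshold sends more images to human reviewers and should raise
    the accuracy of the automatically labeled set; lowering it does the opposite.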


    Information Theory

    On the Minimization of Convex Functionals of Probability Distributions Under Band Constraints

    Michael Fauss, Abdelhak M. Zoubir
    Comments: 11 pages, 5 figures, 1 table, to be submitted to the IEEE Transactions on Signal Processing
    Subjects: Information Theory (cs.IT)

    The problem of minimizing convex functionals of probability distributions is
    solved under the assumption that the density of every distribution is bounded
    from above and below. First, a system of necessary and sufficient first-order
    optimality conditions, which characterize global minima as solutions of a
    fixed-point equation, is derived. Based on these conditions, two algorithms are
    proposed that iteratively solve the fixed-point equation via a block coordinate
    descent strategy. While the first algorithm is conceptually simpler and more
    efficient, it is not guaranteed to converge for objective functions that are
    not strictly convex. This shortcoming is overcome in the second algorithm,
    which uses an additional outer proximal iteration and is proven to
    converge under very mild assumptions. Two examples are given to demonstrate the
    theoretical usefulness of the optimality conditions as well as the high
    efficiency and accuracy of the proposed numerical algorithms.

    Performance Analysis of Ultra-Dense Networks with Elevated Base Stations

    Italo Atzeni, Jesús Arnau, Marios Kountouris
    Comments: 6 pages, 4 figures. To be presented at SpaSWiN’17 (WiOpt workshops), May 2017
    Subjects: Information Theory (cs.IT)

    This paper analyzes the downlink performance of ultra-dense networks with
    elevated base stations (BSs). We consider a general dual-slope pathloss model
    with distance-dependent probability of line-of-sight (LOS) transmission between
    BSs and receivers. Specifically, we consider the scenario where each link may
    be obstructed by randomly placed buildings. Using tools from stochastic
    geometry, we show that both coverage probability and area spectral efficiency
    decay to zero as the BS density grows large. Interestingly, we show that the BS
    height alone has a detrimental effect on the system performance even when the
    standard single-slope pathloss model is adopted.
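
    One way to see why height alone can hurt is to look at how it caps the
    received power. The sketch below assumes a single-slope pathloss of the form
    (r^2 + h^2)^(-alpha/2), where r is the horizontal distance and h the BS
    height. That form, the exponent alpha = 4, and the sample distances are
    assumptions for illustration, not values taken from the paper.

```python
def pathloss(r, h, alpha=4.0):
    """Single-slope pathloss at horizontal distance r from a BS at height h.

    Assumes the link length is the 3D distance sqrt(r^2 + h^2).
    """
    return (r * r + h * h) ** (-alpha / 2.0)


# With h > 0 the gain can never exceed h**(-alpha), however close the user is.
for h in (0.0, 10.0, 20.0):
    gains = [pathloss(r, h) for r in (1.0, 5.0, 25.0)]
    print(f"h={h:5.1f}  gains={['%.2e' % g for g in gains]}")
```

    As the BS density grows, the serving BS gets closer horizontally, but with
    h > 0 the desired signal saturates while interferers keep accumulating.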

    Globally Optimal Beamforming Design for Downlink CoMP transmission with Limited Backhaul Capacity

    Kien-Giang Nguyen, Quang-Doanh Vu, Markku Juntti, Le-Nam Tran
    Comments: 5 pages, 2 figures; Accepted for publication, ICASSP 2017
    Subjects: Information Theory (cs.IT)

    This paper considers a multicell downlink channel in which multiple base
    stations (BSs) cooperatively serve users by jointly precoding shared data
    transported from a central processor over limited-capacity backhaul links. We
    jointly design the beamformers and BS-user link selection so as to maximize the
    sum rate subject to user-specific signal-to-interference-plus-noise ratio (SINR)
    requirements, per-BS backhaul capacity and per-BS power constraints. As
    existing solutions for the considered problem are suboptimal and their
    optimality gap remains unknown due to the lack of globally optimal solutions, we
    characterize this gap by proposing a globally optimal algorithm for the
    problem of interest. Specifically, the proposed method is customized from a
    generic framework of a branch and bound algorithm applied to discrete monotonic
    optimization. We show that the proposed algorithm converges after a finite
    number of iterations, and can serve as a benchmark for existing suboptimal
    solutions and those that will be developed for similar contexts in the future.
    In this regard, we numerically compare the proposed optimal solution to a
    current state-of-the-art method, which shows that this suboptimal method attains
    only 70% to 90% of the optimal performance.

    Energy Efficient Precoding C-RAN Downlink with Compression at Fronthaul

    Kien-Giang Nguyen, Quang-Doanh Vu, Markku Juntti, Le-Nam Tran
    Comments: 6 pages, 3 figures, accepted to IEEE ICC 2017 – Signal Processing for Communications Symposium
    Subjects: Information Theory (cs.IT)

    This paper considers a downlink transmission of cloud radio access network
    (C-RAN) in which precoded baseband signals at a common baseband unit are
    compressed before being forwarded to radio units (RUs) through limited
    fronthaul capacity links. We investigate the joint design of precoding,
    multivariate compression and RU-user selection which maximizes the energy
    efficiency of downlink C-RAN networks. The considered problem is inherently a
    rank-constrained mixed Boolean nonconvex program for which a globally optimal
    solution is difficult and computationally expensive to find. In order to derive
    practically appealing solutions, we invoke some useful relaxation and
    transformation techniques to arrive at a more tractable (but still nonconvex)
    continuous program. To solve the relaxation problem, we propose an iterative
    procedure based on DC algorithms which is provably convergent. Numerical
    results demonstrate the superiority of the proposed solution in terms of
    achievable energy efficiency compared to existing schemes.

    A Novel Robust Transceiver Design for MIMO Interference Channel

    Ali Dalir, Hassan Aghaeinia
    Subjects: Information Theory (cs.IT)

    This paper focuses on robust transceiver design for throughput enhancement on
    the interference channel (IC), under imperfect channel state information (CSI).
    An algorithm is proposed to improve the throughput of the
    multi-input multi-output (MIMO) IC. Each transmitter and receiver has
    M and N antennas, respectively, and the IC operates in a time division duplex mode.
    In the proposed algorithm, each transceiver adjusts its filter to minimize the
    estimated variance of signal-to-interference-plus-noise ratio (SINR) to hedge
    against the variability due to CSI error. Taylor expansion is exploited to
    approximate the effect of CSI imperfection on variance. Monte Carlo simulations
    are employed to investigate the improvement in sum rate performance of the proposed
    algorithm and the advantage of incorporating variation minimization into the
    transceiver design.

    A Tight Upper Bound on the Second-Order Coding Rate of Parallel Gaussian Channels with Feedback

    Silas L. Fong, Vincent Y. F. Tan
    Comments: 18 pages. arXiv admin note: text overlap with arXiv:1410.2390
    Subjects: Information Theory (cs.IT)

    This paper investigates the asymptotic expansion for the maximum coding rate
    of a parallel Gaussian channel with feedback under the following setting: A
    peak power constraint is imposed on every transmitted codeword, and the average
    error probability of decoding the transmitted message is non-vanishing as the
    blocklength increases. It is well known that the presence of feedback does not
    increase the first-order asymptotics of the channel, i.e., capacity, in the
    asymptotic expansion, and the closed-form expression of the capacity can be
    obtained by the well-known water-filling algorithm. The main contribution of
    this paper is a self-contained proof of an upper bound on the second-order
    asymptotics of the parallel Gaussian channel with feedback. The proof
    techniques involve developing an information spectrum bound followed by using
    Curtiss’ theorem to show that a sum of dependent random variables associated
    with the information spectrum bound converges in distribution to a sum of
    independent random variables, thus facilitating the use of the usual central
    limit theorem. Combined with existing achievability results, our result implies
    that the presence of feedback does not improve the second-order asymptotics.
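
    The water-filling allocation mentioned above can be sketched in a few lines.
    This is a generic textbook version for parallel Gaussian channels under a
    total power budget, not code from the paper. The noise variances and the
    power budget are hypothetical.

```python
import math


def water_filling(noise, total_power):
    """Return (powers, level) with powers[i] = max(0, level - noise[i]).

    noise: per-channel noise variances; total_power: sum-power budget.
    """
    order = sorted(noise)
    level = order[0]
    # Try the k quietest channels, largest k first; accept the first k whose
    # water level lies above the noise of every channel in use.
    for k in range(len(order), 0, -1):
        level = (total_power + sum(order[:k])) / k
        if level > order[k - 1]:
            break
    powers = [max(0.0, level - n) for n in noise]
    return powers, level


def capacity_bits(noise, powers):
    """Sum of 0.5 * log2(1 + P_i / N_i) over the parallel channels."""
    return sum(0.5 * math.log2(1.0 + p / n) for p, n in zip(powers, noise))


noise = [1.0, 2.0, 5.0]          # hypothetical noise variances
powers, level = water_filling(noise, total_power=3.0)
print("water level:", level)
print("powers:", powers)
print("capacity (bits per channel use):", capacity_bits(noise, powers))
```

    In this example the noisiest channel sits above the water level and receives
    no power, while the remaining budget is split so that power plus noise is
    equal on the two active channels.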

    A Contract-based Incentive Mechanism for Energy Harvesting-based Internet of Things

    Zhanwei Hou, He Chen, Yonghui Li, Zhu Han, Branka Vucetic
    Subjects: Computer Science and Game Theory (cs.GT); Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    By enabling wireless devices to be charged wirelessly and remotely, radio
    frequency energy harvesting (RFEH) has become a promising technology to power
    the unattended Internet of Things (IoT) low-power devices. To enable this, in
    future IoT networks, besides the conventional data access points (DAPs)
    responsible for collecting data from IoT devices, energy access points (EAPs)
    should be deployed to transfer radio frequency (RF) energy to IoT devices to
    maintain their sustainable operations. In practice, the DAPs and EAPs may be
    operated by different operators and a DAP should provide certain incentives to
    motivate the surrounding EAPs to charge its associated IoT device(s) to assist
    its data collection. Motivated by this, in this paper we develop a contract
    theory-based incentive mechanism for the energy trading in RFEH assisted IoT
    systems. The necessary and sufficient condition for the feasibility of the
    formulated contract is analyzed. The optimal contract is derived to maximize
    the DAP’s expected utility as well as the social welfare. Simulation results
    demonstrate the feasibility and effectiveness of the proposed incentive
    mechanism.

    Invertibility of graph translation and support of Laplacian Fiedler vectors

    Matthew Begué, Kasso A. Okoudjou
    Comments: 21 pages, 7 figures
    Subjects: Functional Analysis (math.FA); Information Theory (cs.IT)

    The graph Laplacian operator is widely studied in spectral graph theory
    largely due to its importance in modern data analysis. Recently, the Fourier
    transform and other time-frequency operators have been defined on graphs using
    Laplacian eigenvalues and eigenvectors. We extend these results and prove that
    the translation operator to the i-th node is invertible if and only if all
    eigenvectors are nonzero on the i-th node. Because of this dependency on the
    support of eigenvectors we study the characteristic set of Laplacian
    eigenvectors. We prove that the Fiedler vector of a planar graph cannot vanish
    on large neighborhoods and then explicitly construct a family of non-planar
    graphs that do exhibit this property.
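
    The invertibility criterion can be checked by hand on a tiny graph. The
    sketch below assumes a common definition of graph translation, under which
    translating to node i multiplies the k-th graph Fourier coefficient by
    sqrt(N) * phi_k(i), where phi_k is the k-th Laplacian eigenvector. The
    3-node path graph and the helper names are illustrative choices, not taken
    from the paper.

```python
import math

# Path graph on 3 nodes (0 - 1 - 2) with Laplacian L = D - A.
L = [[1, -1, 0],
     [-1, 2, -1],
     [0, -1, 1]]

# Orthonormal eigenvectors of L, written out by hand (eigenvalues 0, 1, 3).
eigvals = [0.0, 1.0, 3.0]
eigvecs = [
    [1 / math.sqrt(3)] * 3,
    [1 / math.sqrt(2), 0.0, -1 / math.sqrt(2)],              # Fiedler vector
    [1 / math.sqrt(6), -2 / math.sqrt(6), 1 / math.sqrt(6)],
]


def is_eigenpair(lam, v, tol=1e-12):
    """Check that L v = lam v holds entrywise up to tol."""
    Lv = [sum(L[r][c] * v[c] for c in range(3)) for r in range(3)]
    return all(abs(Lv[r] - lam * v[r]) < tol for r in range(3))


def translation_multipliers(i):
    """Spectral multipliers sqrt(N) * phi_k(i) of translation to node i."""
    n = len(eigvecs)
    return [math.sqrt(n) * v[i] for v in eigvecs]


def translation_is_invertible(i, tol=1e-12):
    """Translation to node i is invertible iff no multiplier is zero."""
    return all(abs(m) > tol for m in translation_multipliers(i))


for node in range(3):
    print(node, translation_is_invertible(node))
```

    Here the Fiedler vector vanishes on the middle node, so translation to that
    node discards one Fourier coefficient and cannot be inverted, while
    translation to either end node can.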

    Network Constrained Distributed Dual Coordinate Ascent for Machine Learning

    Myung Cho, Lifeng Lai, Weiyu Xu
    Comments: 5 pages, 6 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Learning (cs.LG)

    With the explosion of data size and limited storage space at a single location,
    data are often distributed at different locations. We thus face the challenge
    of performing large-scale machine learning from these distributed data through
    communication networks. In this paper, we study how the network communication
    constraints will impact the convergence speed of distributed machine learning
    optimization algorithms. In particular, we give the convergence rate analysis
    of the distributed dual coordinate ascent in a general tree structured network.
    Furthermore, by considering network communication delays, we optimize the
    network-constrained dual coordinate ascent algorithm to maximize its
    convergence speed. Our results show that under different network communication
    delays, to achieve maximum convergence speed, one needs to adopt
    delay-dependent numbers of local and global iterations for distributed dual
    coordinate ascent.



