IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Wed, 19 Apr 2017

    我爱机器学习(52ml.net)发表于 2017-04-19 00:00:00
    love 0

    Neural and Evolutionary Computing

    Diagonal RNNs in Symbolic Music Modeling

    Y. Cem Subakan, Paris Smaragdis
    Comments: Submitted to Waspaa 2017
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose a new Recurrent Neural Network (RNN) architecture.
    The novelty is simple: We use diagonal recurrent matrices instead of full. This
    results in better test likelihood and faster convergence compared to regular
    full RNNs in most of our experiments. We show the benefits of using diagonal
    recurrent matrices with popularly used LSTM and GRU architectures as well as
    with the vanilla RNN architecture, on four standard symbolic music datasets.

    A Study of Deep Learning Robustness Against Computation Failures

    Jean-Charles Vialatte, François Leduc-Primeau
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    For many types of integrated circuits, accepting larger failure rates in
    computations can be used to improve energy efficiency. We study the performance
    of faulty implementations of certain deep neural networks based on pessimistic
    and optimistic models of the effect of hardware faults. After identifying the
    impact of hyperparameters such as the number of layers on robustness, we study
    the ability of the network to compensate for computational failures through an
    increase of the network size. We show that some networks can achieve equivalent
    performance under faulty implementations, and quantify the required increase in
    computational complexity.

    LibOPT: An Open-Source Platform for Fast Prototyping Soft Optimization Techniques

    Joao Paulo Papa, Gustavo Henrique Rosa, Douglas Rodrigues, Xin-She Yang
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Optimization techniques play an important role in several scientific and
    real-world applications, thus becoming of great interest for the community. As
    a consequence, a number of open-source libraries are available in the
    literature, which ends up fostering the research and development of new
    techniques and applications. In this work, we present a new library for the
    implementation and fast prototyping of nature-inspired techniques called
    LibOPT. Currently, the library implements 15 techniques and 112 benchmarking
    functions, as well as it also supports 11 hypercomplex-based optimization
    approaches, which makes it one of the first of its kind. We showed how one can
    easily use and also implement new techniques in LibOPT under the C paradigm.
    Examples are provided with samples of source-code using benchmarking functions.

    The Emergence of Canalization and Evolvability in an Open-Ended, Interactive Evolutionary System

    Joost Huizinga, Kenneth O. Stanley, Jeff Clune
    Comments: SI can be found at: this http URL
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Natural evolution has produced a tremendous diversity of functional
    organisms. Many believe an essential component of this process was the
    evolution of evolvability, whereby evolution speeds up its ability to innovate
    by generating a more adaptive pool of offspring. One hypothesized mechanism for
    evolvability is developmental canalization, wherein certain dimensions of
    variation become more likely to be traversed and others are prevented from
    being explored (e.g. offspring tend to have similarly sized legs, and mutations
    affect the length of both legs, not each leg individually). While ubiquitous in
    nature, canalization almost never evolves in computational simulations of
    evolution. Not only does that deprive us of in silico models in which to study
    the evolution of evolvability, but it also raises the question of which
    conditions give rise to this form of evolvability. Answering this question
    would shed light on why such evolvability emerged naturally and could
    accelerate engineering efforts to harness evolution to solve important
    engineering challenges. In this paper we reveal a unique system in which
    canalization did emerge in computational evolution. We document that genomes
    entrench certain dimensions of variation that were frequently explored during
    their evolutionary history. The genetic representation of these organisms also
    evolved to be highly modular and hierarchical, and we show that these
    organizational properties correlate with increased fitness. Interestingly, the
    type of computational evolutionary experiment that produced this evolvability
    was very different from traditional digital evolution in that there was no
    objective, suggesting that open-ended, divergent evolutionary processes may be
    necessary for the evolution of evolvability.

    Learning Affine Feature Space Transformations in Symbolic Regression

    Jan Žegklitz, Petr Pošík
    Subjects: Neural and Evolutionary Computing (cs.NE)

    We propose a new type of leaf node for use in Symbolic Regression (SR) that
    performs linear combinations of feature variables (LCF). These nodes can be
    handled in three different modes — an unsynchronized mode, where all LCFs are
    free to change on their own, a synchronized mode, where LCFs are sorted into
    groups in which they are forced to be identical throughout the whole
    individual, and a globally synchronized mode, which is similar to the previous
    mode but the grouping is done across the whole population. We also present two
    methods of evolving the weights of the LCFs — a purely stochastic way via
    mutation and a gradient-based way based on the backpropagation algorithm known
    from neural networks — and also a combination of both. We experimentally
    evaluate all configurations of LCFs in Multi-Gene Genetic Programming (MGGP),
    which was chosen as baseline, on a number of benchmarks. According to the
    results, we identified two configurations which increase the performance of the
    algorithm.

    A hybrid CPU-GPU parallelization scheme of variable neighborhood search for inventory optimization problems

    Nikolaos Antoniadis, Angelo Sifaleras
    Comments: 8 pages, 1 figure
    Journal-ref: Electronic Notes in Discrete Mathematics, Volume 58, April 2017,
    Pages 47-54, ISSN 1571-0653
    Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC); Discrete Mathematics (cs.DM)

    In this paper, we study various parallelization schemes for the Variable
    Neighborhood Search (VNS) metaheuristic on a CPU-GPU system via OpenMP and
    OpenACC. A hybrid parallel VNS method is applied to recent benchmark problem
    instances for the multi-product dynamic lot sizing problem with product returns
    and recovery, which appears in reverse logistics and is known to be NP-hard. We
    report our findings regarding these parallelization approaches and present
    promising computational results.

    On Improving the Capacity of Solving Large-scale Wireless Network Design Problems by Genetic Algorithms

    Fabio D'Andreagiovanni
    Comments: This is the authors’ final version of the paper published in Di Chio C. et al. (Eds): EvoApplications 2011, LNCS 6625, pp. 11-20, 2011. The final publication is available at Springer via this http URL
    Journal-ref: Di Chio C. et al. (Eds), EvoApplications 2011, Springer LNCS vol.
    6625, pp. 11-20, 2011
    Subjects: Optimization and Control (math.OC); Neural and Evolutionary Computing (cs.NE)

    Over the last decade, wireless networks have experienced an impressive growth
    and now play a main role in many telecommunications systems. As a consequence,
    scarce radio resources, such as frequencies, became congested and the need for
    effective and efficient assignment methods arose. In this work, we present a
    Genetic Algorithm for solving large instances of the Power, Frequency and
    Modulation Assignment Problem, arising in the design of wireless networks. To
    our best knowledge, this is the first Genetic Algorithm that is proposed for
    such problem. Compared to previous works, our approach allows a wider
    exploration of the set of power solutions, while eliminating sources of
    numerical problems. The performance of the algorithm is assessed by tests over
    a set of large realistic instances of a Fixed WiMAX Network.

    Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents

    Miguel Aguilera, Manuel G. Bedia
    Subjects: Adaptation and Self-Organizing Systems (nlin.AO); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)

    This paper outlines a methodological approach to generate adaptive agents
    driving themselves near points of criticality. Using a synthetic approach we
    construct a conceptual model that, instead of specifying mechanistic
    requirements to generate criticality, exploits the maintenance of an
    organizational structure capable of reproducing critical behavior. Our approach
    captures the well-known principle of universality that classifies critical
    phenomena inside a few universality classes of systems without relying on
    specific mechanisms or topologies. In particular, we implement an artificial
    embodied agent controlled by a neural network maintaining a correlation
    structure randomly sampled form a lattice Ising model at a critical point. We
    evaluate the agent in two classical reinforcement learning scenarios: the
    Mountain Car benchmark and the Acrobot double pendulum, finding that in both
    cases the neural controller reaches a point of criticality, which coincides
    with a transition point between two regimes of the agent’s behaviour,
    maximizing the synergistic information between hidden neurons and sensorimotor
    patterns. Finally, we discuss the possible applications of this synthetic
    approach to the comprehension of deeper principles connected to the pervasive
    presence of criticality in biological and cognitive systems.


    Computer Vision and Pattern Recognition

    Light Field Blind Motion Deblurring

    Pratul P. Srinivasan, Ren Ng, Ravi Ramamoorthi
    Comments: To be presented at CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We study the problem of deblurring light fields of general 3D scenes captured
    under 3D camera motion and present both theoretical and practical
    contributions. By analyzing the motion-blurred light field in the primal and
    Fourier domains, we develop intuition into the effects of camera motion on the
    light field, show the advantages of capturing a 4D light field instead of a
    conventional 2D image for motion deblurring, and derive simple methods of
    motion deblurring in certain cases. We then present an algorithm to blindly
    deblur light fields of general scenes without any estimation of scene geometry,
    and demonstrate that we can recover both the sharp light field and the 3D
    camera motion path of real and synthetically-blurred light fields.

    Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

    Giorgio Roffo, Simone Melzi
    Comments: Preprint version – Lecture Notes in Computer Science – Springer 2017
    Journal-ref: New Frontiers in Mining Complex Patterns, Fifth International
    workshop, nfMCP2016. Lecture Notes in Computer Science – Springer
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    In an era where accumulating data is easy and storing it inexpensive, feature
    selection plays a central role in helping to reduce the high-dimensionality of
    huge amounts of otherwise meaningless data. In this paper, we propose a
    graph-based method for feature selection that ranks features by identifying the
    most important ones into arbitrary set of cues. Mapping the problem on an
    affinity graph-where features are the nodes-the solution is given by assessing
    the importance of nodes through some indicators of centrality, in particular,
    the Eigen-vector Centrality (EC). The gist of EC is to estimate the importance
    of a feature as a function of the importance of its neighbors. Ranking central
    nodes individuates candidate features, which turn out to be effective from a
    classification point of view, as proved by a thoroughly experimental section.
    Our approach has been tested on 7 diverse datasets from recent literature
    (e.g., biological data and object recognition, among others), and compared
    against filter, embedded and wrappers methods. The results are remarkable in
    terms of accuracy, stability and low execution time.

    Interactive Outlining of Pancreatic Cancer Liver Metastases in Ultrasound Images

    Jan Egger, Dieter Schmalstieg, Xiaojun Chen, Wolfram G. Zoller, Alexander Hann
    Comments: 15 pages, 16 figures, 2 tables, 58 references
    Journal-ref: Sci. Rep. 7, 892, 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    Ultrasound (US) is the most commonly used liver imaging modality worldwide.
    Due to its low cost, it is increasingly used in the follow-up of cancer
    patients with metastases localized in the liver. In this contribution, we
    present the results of an interactive segmentation approach for liver
    metastases in US acquisitions. A (semi-) automatic segmentation is still very
    challenging because of the low image quality and the low contrast between the
    metastasis and the surrounding liver tissue. Thus, the state of the art in
    clinical practice is still manual measurement and outlining of the metastases
    in the US images. We tackle the problem by providing an interactive
    segmentation approach providing real-time feedback of the segmentation results.
    The approach has been evaluated with typical US acquisitions from the clinical
    routine, and the datasets consisted of pancreatic cancer metastases. Even for
    difficult cases, satisfying segmentations results could be achieved because of
    the interactive real-time behavior of the approach. In total, 40 clinical
    images have been evaluated with our method by comparing the results against
    manual ground truth segmentations. This evaluation yielded to an average Dice
    Score of 85% and an average Hausdorff Distance of 13 pixels.

    A Comment on "Analysis of Video Image Sequences Using Point and Line Correspondences"

    Mieczysław A. Kłopotek
    Journal-ref: preliminary version of: M.A. K{l}opotek: A comment on “Analysis
    of video image sequences using point and line correspondences”. Pattern
    Recognition 28(1995)2, pp. 283-292
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we would like to deny the results of Wang et al. raising two
    fundamental claims:

    * A line does not contribute anything to recognition of motion parameters
    from two images

    * Four traceable points are not sufficient to recover motion parameters from
    two perspective

    To be constructive, however, we show that four traceable points are
    sufficient to recover motion parameters from two frames under orthogonal
    projection and that five points are sufficient to simplify the solution of the
    two-frame problem under orthogonal projection to solving a linear equation
    system.

    Image Fusion With Cosparse Analysis Operator

    Rui Gao, Sergiy A. Vorobyov, Hong Zhao
    Comments: 12 pages, 4 figures, 1 table, Submitted to IEEE Signal Processing Letters on December 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

    The paper addresses the image fusion problem, where multiple images captured
    with different focus distances are to be combined into a higher quality
    all-in-focus image. Most current approaches for image fusion strongly rely on
    the unrealistic noise-free assumption used during the image acquisition, and
    then yield limited robustness in fusion processing. In our approach, we
    formulate the multi-focus image fusion problem in terms of an analysis sparse
    model, and simultaneously perform the restoration and fusion of multi-focus
    images. Based on this model, we propose an analysis operator learning, and
    define a novel fusion function to generate an all-in-focus image. Experimental
    evaluations confirm the effectiveness of the proposed fusion approach both
    visually and quantitatively, and show that our approach outperforms
    state-of-the-art fusion methods.

    Robust Optical Flow Estimation in Rainy Scenes

    Ruoteng Li, Robby T. Tan, Loong-Fah Cheong
    Comments: 8 pages, ICCV
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a method to estimate optical flow under rainy scenes.
    Optical flow estimation in the rainy scenes is considered challenging due to
    background degradation introduced by rain streaks and rain accumulation effects
    in the scene. Rain accumulation effect refers to poor visibility of remote
    object due to the intense rainfall. Most existing optical flow methods are
    erroneous when applied to rain sequences because the conventional brightness
    constancy constraint (BCC) and gradient constancy constraint (GCC) generally
    break down in this situation. In this paper, our method considers the rain
    streaks and rain accumulation separately. Based on the fact that the RGB color
    channels receive raindrop radiance equally, we introduce a residue channel as a
    new data constraint to significantly reduce rain streaks. In the case of rain
    accumulation, our method proposes to separate the image into a piecewise-smooth
    background layer and a high-frequency detail layer and enforce BCC on the
    background layer only. Results on both synthetic dataset and real images show
    that our algorithm outperforms existing methods on different types of rain
    sequences. To our knowledge, this is the first optical flow method dealing with
    rain.

    Fast 2-D Complex Gabor Filter with Kernel Decomposition

    Suhyuk Um, Jaeyoon Kim, Dongbo Min (Senior Member, IEEE)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    2-D complex Gabor filtering has found numerous applications in the fields of
    computer vision and image processing. Especially, in some applications, it is
    often needed to compute 2-D complex Gabor filter bank consisting of the 2-D
    complex Gabor filtering outputs at multiple orientations and frequencies.
    Although several approaches for fast 2-D complex Gabor filtering have been
    proposed, they primarily focus on reducing the runtime of performing the 2-D
    complex Gabor filtering once at specific orientation and frequency. To obtain
    the 2-D complex Gabor filter bank output, existing methods are repeatedly
    applied with respect to multiple orientations and frequencies. In this paper,
    we propose a novel approach that efficiently computes the 2-D complex Gabor
    filter bank by reducing the computational redundancy that arises when
    performing the Gabor filtering at multiple orientations and frequencies. The
    proposed method first decomposes the Gabor basis kernels to allow a fast
    convolution with the Gaussian kernel in a separable manner. This enables
    reducing the runtime of the 2-D complex Gabor filter bank by reusing
    intermediate results of the 2-D complex Gabor filtering computed at a specific
    orientation. Furthermore, we extend this idea into 2-D localized sliding
    discrete Fourier transform (SDFT) using the Gaussian kernel in the DFT
    computation, which lends a spatial localization ability as in the 2-D complex
    Gabor filter. Experimental results demonstrate that our method runs faster than
    state-of-the-arts methods for fast 2-D complex Gabor filtering, while
    maintaining similar filtering quality.

    Deep Self-Taught Learning for Weakly Supervised Object Localization

    Zequn Jie, Yunchao Wei, Xiaojie Jin, Jiashi Feng, Wei Liu
    Comments: Accepted as spotlight paper by CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Most existing weakly supervised localization (WSL) approaches learn detectors
    by finding positive bounding boxes based on features learned with image-level
    supervision. However, those features do not contain spatial location related
    information and usually provide poor-quality positive samples for training a
    detector. To overcome this issue, we propose a deep self-taught learning
    approach, which makes the detector learn the object-level features reliable for
    acquiring tight positive samples and afterwards re-train itself based on them.
    Consequently, the detector progressively improves its detection ability and
    localizes more informative positive samples. To implement such self-taught
    learning, we propose a seed sample acquisition method via image-to-object
    transferring and dense subgraph discovery to find reliable positive samples for
    initializing the detector. An online supportive sample harvesting scheme is
    further proposed to dynamically select the most confident tight positive
    samples and train the detector in a mutual boosting way. To prevent the
    detector from being trapped in poor optima due to overfitting, we propose a new
    relative improvement of predicted CNN scores for guiding the self-taught
    learning process. Extensive experiments on PASCAL 2007 and 2012 show that our
    approach outperforms the state-of-the-arts, strongly validating its
    effectiveness.

    Video Object Segmentation using Supervoxel-Based Gerrymandering

    Brent A. Griffin, Jason J. Corso
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Pixels operate locally. Superpixels have some potential to collect
    information across many pixels; supervoxels have more potential by implicitly
    operating across time. In this paper, we explore this well established notion
    thoroughly analyzing how supervoxels can be used in place of and in conjunction
    with other means of aggregating information across space-time. Focusing on the
    problem of strictly unsupervised video object segmentation, we devise a method
    called supervoxel gerrymandering that links masks of foregroundness and
    backgroundness via local and non-local consensus measures. We pose and answer a
    series of critical questions about the ability of supervoxels to adequately
    sway local voting; the questions regard type and scale of supervoxels as well
    as local versus non-local consensus, and the questions are posed in a general
    way so as to impact the broader knowledge of the use of supervoxels in video
    understanding. We work with the DAVIS dataset and find that our analysis yields
    an unsupervised method that outperforms all other known unsupervised methods
    and even many supervised ones.

    A Gabor Filter Texture Analysis Approach for Histopathological Brain Tumor Subtype Discrimination

    Omar S. Al-Kadi
    Comments: 14 pages,4 figures, 2 tables
    Journal-ref: ISESCO Journal of Science and Technology, vol. 12, no. 22, 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Meningioma brain tumour discrimination is challenging as many histological
    patterns are mixed between the different subtypes. In clinical practice,
    dominant patterns are investigated for signs of specific meningioma pathology;
    however the simple observation could result in inter- and intra-observer
    variation due to the complexity of the histopathological patterns. Also
    employing a computerised feature extraction approach applied at a single
    resolution scale might not suffice in accurately delineating the mixture of
    histopathological patterns. In this work we propose a novel multiresolution
    feature extraction approach for characterising the textural properties of the
    different pathological patterns (i.e. mainly cell nuclei shape, orientation and
    spatial arrangement within the cytoplasm). The pattern textural properties are
    characterised at various scales and orientations for an improved separability
    between the different extracted features. The Gabor filter energy output of
    each magnitude response was combined with four other fixed-resolution texture
    signatures (2 model-based and 2 statistical-based) with and without cell nuclei
    segmentation. The highest classification accuracy of 95% was reported when
    combining the Gabor filters energy and the meningioma subimage fractal
    signature as a feature vector without performing any prior cell nuceli
    segmentation. This indicates that characterising the cell-nuclei
    self-similarity properties via Gabor filters can assists in achieving an
    improved meningioma subtype classification, which can assist in overcoming
    variations in reported diagnosis.

    Google's Cloud Vision API Is Not Robust To Noise

    Hossein Hosseini, Baicen Xiao, Radha Poovendran
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Google has recently introduced the Cloud Vision API for image analysis.
    According to the demonstration website, the API “quickly classifies images into
    thousands of categories, detects individual objects and faces within images,
    and finds and reads printed words contained within images.” It can be also used
    to “detect different types of inappropriate content from adult to violent
    content.”

    In this paper, we evaluate the robustness of the Google’s Cloud Vision API to
    input perturbation. In particular, we show that by adding sufficient noise to
    the image, the API generates completely different outputs for the noisy image,
    while a human observer would perceive its original content. We show that the
    attack is consistently successful, by performing extensive experiments on
    different image types, including natural images, images containing faces and
    images with texts. Our findings indicate the vulnerability of the API in
    adversarial environments. For example, an adversary can bypass an image
    filtering system by adding noise to an image with inappropriate content.

    Unsupervised Learning by Predicting Noise

    Piotr Bojanowski, Armand Joulin
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Convolutional neural networks provide visual features that perform remarkably
    well in many computer vision applications. However, training these networks
    requires significant amounts of supervision. This paper introduces a generic
    framework to train deep networks, end-to-end, with no supervision. We propose
    to fix a set of target representations, called Noise As Targets (NAT), and to
    constrain the deep features to align to them. This domain agnostic approach
    avoids the standard unsupervised learning issues of trivial solutions and
    collapsing of features. Thanks to a stochastic batch reassignment strategy and
    a separable square loss function, it scales to millions of images. The proposed
    approach produces representations that perform on par with state-of-the-art
    unsupervised methods on ImageNet and Pascal VOC.


    Artificial Intelligence

    Synergy of all-purpose static solver and temporal reasoning tools in dynamic integrated expert systems

    Galina Rybina, Alexey Mozgachev, Dmitry Demidov
    Comments: 8 pages, 3 figures
    Journal-ref: “Informatsionno-izmeritelnye i upravlyayushchie sistemy”
    (Information-measuring and Control Systems) no.8, vol.12, 2014. pp 27-33.
    ISSN 2070-0814
    Subjects: Artificial Intelligence (cs.AI)

    The paper discusses scientific and technological problems of dynamic
    integrated expert systems development. Extensions of problem-oriented
    methodology for dynamic integrated expert systems development are considered.
    Attention is paid to the temporal knowledge representation and processing.

    Understanding Negations in Information Processing: Learning from Replicating Human Behavior

    Nicolas Pröllochs, Stefan Feuerriegel, Dirk Neumann
    Comments: 39 pages
    Subjects: Artificial Intelligence (cs.AI); Applications (stat.AP); Machine Learning (stat.ML)

    Information systems experience an ever-growing volume of unstructured data,
    particularly in the form of textual materials. This represents a rich source of
    information from which one can create value for people, organizations and
    businesses. For instance, recommender systems can benefit from automatically
    understanding preferences based on user reviews or social media. However, it is
    difficult for computer programs to correctly infer meaning from narrative
    content. One major challenge is negations that invert the interpretation of
    words and sentences. As a remedy, this paper proposes a novel learning strategy
    to detect negations: we apply reinforcement learning to find a policy that
    replicates the human perception of negations based on an exogenous response,
    such as a user rating for reviews. Our method yields several benefits, as it
    eliminates the former need for expensive and subjective manual labeling in an
    intermediate stage. Moreover, the inferred policy can be used to derive
    statistical inferences and implications regarding how humans process and act on
    negations.

    Anomaly detection and motif discovery in symbolic representations of time series

    Fabio Guigou (ICube), Pierre Collet (ICube, UNISTRA), Pierre Parrend (ICube)
    Subjects: Artificial Intelligence (cs.AI)

    The advent of the Big Data hype and the consistent recollection of event logs
    and real-time data from sensors, monitoring software and machine configuration
    has generated a huge amount of time-varying data in about every sector of the
    industry. Rule-based processing of such data has ceased to be relevant in many
    scenarios where anomaly detection and pattern mining have to be entirely
    accomplished by the machine. Since the early 2000s, the de-facto standard for
    representing time series has been the Symbolic Aggregate approXimation (SAX).In
    this document, we present a few algorithms using this representation for
    anomaly detection and motif discovery, also known as pattern mining, in such
    data. We propose a benchmark of anomaly detection algorithms using data from
    Cloud monitoring software.

    Semantic Similarity from Natural Language and Ontology Analysis

    Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain
    Comments: preprint version of the book Semantic Similarity from Natural Language and Ontology Analysis (Synthesis Lectures on Human Language Technologies – Morgan & Claypool publishers)
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Artificial Intelligence federates numerous scientific fields in the aim of
    developing machines able to assist human operators performing complex
    treatments — most of which demand high cognitive skills (e.g. learning or
    decision processes). Central to this quest is to give machines the ability to
    estimate the likeness or similarity between things in the way human beings
    estimate the similarity between stimuli.

    In this context, this book focuses on semantic measures: approaches designed
    for comparing semantic entities such as units of language, e.g. words,
    sentences, or concepts and instances defined into knowledge bases. The aim of
    these measures is to assess the similarity or relatedness of such semantic
    entities by taking into account their semantics, i.e. their meaning —
    intuitively, the words tea and coffee, which both refer to stimulating
    beverage, will be estimated to be more semantically similar than the words
    toffee (confection) and coffee, despite that the last pair has a higher
    syntactic similarity. The two state-of-the-art approaches for estimating and
    quantifying semantic similarities/relatedness of semantic entities are
    presented in detail: the first one relies on corpora analysis and is based on
    Natural Language Processing techniques and semantic models while the second is
    based on more or less formal, computer-readable and workable forms of knowledge
    such as semantic networks, thesaurus or ontologies. (…) Beyond a simple
    inventory and categorization of existing measures, the aim of this monograph is
    to convey novices as well as researchers of these domains towards a better
    understanding of semantic similarity estimation and more generally semantic
    measures.

    Approximations from Anywhere and General Rough Sets

    A. Mani
    Comments: 20 Pages. Scheduled to appear in IJCRS’2017 LNCS Proceedings, Springer
    Subjects: Logic (math.LO); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM); Information Theory (cs.IT)

    Not all approximations arise from information systems. The problem of fitting
    approximations, subjected to some rules (and related data), to information
    systems in a rough scheme of things is known as the emph{inverse problem}. The
    inverse problem is more general than the duality (or abstract representation)
    problems and was introduced by the present author in her earlier papers. From
    the practical perspective, a few (as opposed to one) theoretical frameworks may
    be suitable for formulating the problem itself. emph{Granular operator spaces}
    have been recently introduced and investigated by the present author in her
    recent work in the context of antichain based and dialectical semantics for
    general rough sets. The nature of the inverse problem is examined from
    number-theoretic and combinatorial perspectives in a higher order variant of
    granular operator spaces and some necessary conditions are proved. The results
    and the novel approach would be useful in a number of unsupervised and semi
    supervised learning contexts and algorithms.

    The Causality/Repair Connection in Databases: Causality-Programs

    Leopoldo Bertossi
    Comments: 7-pages, conference submission as short paper
    Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)

    In this work, answer-set programs that specify repairs of databases are used
    as a basis for solving computational and reasoning problems about causes for
    query answers from databases.


    Information Retrieval

    Mining Worse and Better Opinions. Unsupervised and Agnostic Aggregation of Online Reviews

    Michela Fazzolari, Marinella Petrocchi, Alessandro Tommasi, Cesare Zavattari
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    In this paper, we propose a novel approach for aggregating online reviews,
    according to the opinions they express. Our methodology is unsupervised – due
    to the fact that it does not rely on pre-labeled reviews – and it is agnostic –
    since it does not make any assumption about the domain or the language of the
    review content. We measure the adherence of a review content to the domain
    terminology extracted from a review set. First, we demonstrate the
    informativeness of the adherence metric with respect to the score associated
    with a review. Then, we exploit the metric values to group reviews, according
    to the opinions they express. Our experimental campaign has been carried out on
    two large datasets collected from Booking and Amazon, respectively.

    FEUP at SemEval-2017 Task 5: Predicting Sentiment Polarity and Intensity with Financial Word Embeddings

    Pedro Saleiro, Eduarda Mendes Rodrigues, Carlos Soares, Eugénio Oliveira
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    This paper presents the approach developed at the Faculty of Engineering of
    University of Porto, to participate in SemEval 2017, Task 5: Fine-grained
    Sentiment Analysis on Financial Microblogs and News. The task consisted in
    predicting a real continuous variable from -1.0 to +1.0 representing the
    polarity and intensity of sentiment concerning companies/stocks mentioned in
    short texts. We modeled the task as a regression analysis problem and combined
    traditional techniques such as pre-processing short texts, bag-of-words
    representations and lexical-based features with enhanced financial specific
    bag-of-embeddings. We used an external collection of tweets and news headlines
    mentioning companies/stocks from S&P 500 to create financial word embeddings
    which are able to capture domain-specific syntactic and semantic similarities.
    The resulting approach obtained a cosine similarity score of 0.69 in sub-task
    5.1 – Microblogs and 0.68 in sub-task 5.2 – News Headlines.


    Computation and Language

    A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

    Adina Williams, Nikita Nangia, Samuel R. Bowman
    Comments: 10 pages, 2 figures, 5 tables, submitted to EMNLP 2017
    Subjects: Computation and Language (cs.CL)

    This paper introduces the Multi-Genre Natural Language Inference (MultiNLI)
    corpus, a dataset designed for use in the development and evaluation of machine
    learning models for sentence understanding. In addition to being one of the
    largest corpora available for the task of NLI, at 433k examples, this corpus
    improves upon available resources in its coverage: it offers data from ten
    distinct genres of written and spoken English–making it possible to evaluate
    systems on nearly the full complexity of the language–and it offers an
    explicit setting for the evaluation of cross-genre domain adaptation.

    An Empirical Analysis of NMT-Derived Interlingual Embeddings and their Use in Parallel Sentence Identification

    Cristina España-Bonet, Ádám Csaba Varga, Alberto Barrón-Cedeño, Josef van Genabith
    Comments: 15 pages, 3 figures
    Subjects: Computation and Language (cs.CL)

    End-to-end neural machine translation has overtaken statistical machine
    translation in terms of translation quality for some language pairs, specially
    those with a large amount of parallel data available. Beside this palpable
    improvement, neural networks embrace several new properties. A single system
    can be trained to translate between many languages at almost no additional cost
    other than training time. Furthermore, internal representations learned by the
    network serve as a new semantic representation of words -or sentences- which,
    unlike standard word embeddings, are learned in an essentially bilingual or
    even multilingual context. In view of these properties, the contribution of the
    present work is two-fold. First, we systematically study the context vectors,
    i.e. output of the encoder, and their prowess as an interlingua representation
    of a sentence. Their quality and effectiveness are assessed by similarity
    measures across translations, semantically related, and semantically unrelated
    sentence pairs. Second, and as extrinsic evaluation of the first point, we
    identify parallel sentences in comparable corpora, obtaining an F1=98.2% on
    data from a shared task when using only context vectors. F1 reaches 98.9% when
    complementary similarity measures are used.

    Representing Sentences as Low-Rank Subspaces

    Jiaqi Mu, Suma Bhat, Pramod Viswanath
    Subjects: Computation and Language (cs.CL)

    Sentences are important semantic units of natural language. A generic,
    distributional representation of sentences that can capture the latent
    semantics is beneficial to multiple downstream applications. We observe a
    simple geometry of sentences — the word representations of a given sentence
    (on average 10.23 words in all SemEval datasets with a standard deviation 4.84)
    roughly lie in a low-rank subspace (roughly, rank 4). Motivated by this
    observation, we represent a sentence by the low-rank subspace spanned by its
    word vectors. Such an unsupervised representation is empirically validated via
    semantic textual similarity tasks on 19 different datasets, where it
    outperforms the sophisticated neural network models, including skip-thought
    vectors, by 15% on average.

    Baselines and test data for cross-lingual inference

    Željko Agić, Natalie Schluter
    Comments: Submitted for review at EMNLP 2017
    Subjects: Computation and Language (cs.CL)

    Research in natural language inference is currently exclusive to English.
    Here, we propose to advance toward multilingual evaluation. To that end, we
    provide test data for four major languages. We experiment with a set of
    baselines based on cross-lingual embeddings and machine translation. While our
    best system scores an average accuracy of just over 75%, we focus largely on
    enabling further research in multilingual inference.

    Sentiment analysis based on rhetorical structure theory: Learning deep neural networks from discourse trees

    Mathias Kraus, Stefan Feuerriegel
    Subjects: Computation and Language (cs.CL)

    Prominent applications of sentiment analysis are countless, including areas
    such as marketing, customer service and communication. The conventional
    bag-of-words approach for measuring sentiment merely counts term frequencies;
    however, it neglects the position of the terms within the discourse. As a
    remedy, we thus develop a discourse-aware method that builds upon the discourse
    structure of documents. For this purpose, we utilize rhetorical structure
    theory to label (sub-)clauses according to their hierarchical relationships and
    then assign polarity scores to individual leaves. To learn from the resulting
    rhetoric structure, we propose a tensor-based, tree-structured deep neural
    network (named RST-LSTM) in order to process the complete discourse tree. The
    underlying attention mechanism infers the salient passages of narrative
    materials. In addition, we suggest two algorithms for data augmentation (node
    reordering and artificial leaf insertion) that increase our training set and
    reduce overfitting. Our benchmarks demonstrate the superior performance of our
    approach. Ultimately, this work advances the status quo in natural language
    processing by developing algorithms that incorporate semantic information.

    SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine

    Matthew Dunn, Levent Sagun, Mike Higgins, Ugur Guney, Volkan Cirik, Kyunghyun Cho
    Subjects: Computation and Language (cs.CL)

    We publicly release a new large-scale dataset, called SearchQA, for machine
    comprehension, or question-answering. Unlike recently released datasets, such
    as DeepMind CNN/DailyMail and SQuAD, the proposed SearchQA was constructed to
    reflect a full pipeline of general question-answering. That is, we start not
    from an existing article and generate a question-answer pair, but start from an
    existing question-answer pair, crawled from J! Archive, and augment it with
    text snippets retrieved by Google. Following this approach, we built SearchQA,
    which consists of more than 140k question-answer pairs with each pair having
    49.6 snippets on average. Each question-answer-context tuple of the SearchQA
    comes with additional meta-data such as the snippet’s URL, which we believe
    will be valuable resources for future research. We conduct human evaluation as
    well as test two baseline methods, one simple word selection and the other deep
    learning based, on the SearchQA. We show that there is a meaningful gap between
    the human and machine performances. This suggests that the proposed dataset
    could well serve as a benchmark for question-answering.

    Automatic Disambiguation of French Discourse Connectives

    Majid Laali, Leila Kosseim
    Journal-ref: International Journal of Computational Linguistics and
    Applications, vol. 7, no. 1, 2016, pp. 11-30
    Subjects: Computation and Language (cs.CL)

    Discourse connectives (e.g. however, because) are terms that can explicitly
    convey a discourse relation within a text. While discourse connectives have
    been shown to be an effective clue to automatically identify discourse
    relations, they are not always used to convey such relations, thus they should
    first be disambiguated between discourse-usage non-discourse-usage. In this
    paper, we investigate the applicability of features proposed for the
    disambiguation of English discourse connectives for French. Our results with
    the French Discourse Treebank (FDTB) show that syntactic and lexical features
    developed for English texts are as effective for French and allow the
    disambiguation of French discourse connectives with an accuracy of 94.2%.

    FEUP at SemEval-2017 Task 5: Predicting Sentiment Polarity and Intensity with Financial Word Embeddings

    Pedro Saleiro, Eduarda Mendes Rodrigues, Carlos Soares, Eugénio Oliveira
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    This paper presents the approach developed at the Faculty of Engineering of
    University of Porto, to participate in SemEval 2017, Task 5: Fine-grained
    Sentiment Analysis on Financial Microblogs and News. The task consisted in
    predicting a real continuous variable from -1.0 to +1.0 representing the
    polarity and intensity of sentiment concerning companies/stocks mentioned in
    short texts. We modeled the task as a regression analysis problem and combined
    traditional techniques such as pre-processing short texts, bag-of-words
    representations and lexical-based features with enhanced financial specific
    bag-of-embeddings. We used an external collection of tweets and news headlines
    mentioning companies/stocks from S&P 500 to create financial word embeddings
    which are able to capture domain-specific syntactic and semantic similarities.
    The resulting approach obtained a cosine similarity score of 0.69 in sub-task
    5.1 – Microblogs and 0.68 in sub-task 5.2 – News Headlines.

    Mining Worse and Better Opinions. Unsupervised and Agnostic Aggregation of Online Reviews

    Michela Fazzolari, Marinella Petrocchi, Alessandro Tommasi, Cesare Zavattari
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    In this paper, we propose a novel approach for aggregating online reviews,
    according to the opinions they express. Our methodology is unsupervised – due
    to the fact that it does not rely on pre-labeled reviews – and it is agnostic –
    since it does not make any assumption about the domain or the language of the
    review content. We measure the adherence of a review content to the domain
    terminology extracted from a review set. First, we demonstrate the
    informativeness of the adherence metric with respect to the score associated
    with a review. Then, we exploit the metric values to group reviews, according
    to the opinions they express. Our experimental campaign has been carried out on
    two large datasets collected from Booking and Amazon, respectively.

    Semantic Similarity from Natural Language and Ontology Analysis

    Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain
    Comments: preprint version of the book Semantic Similarity from Natural Language and Ontology Analysis (Synthesis Lectures on Human Language Technologies – Morgan & Claypool publishers)
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Artificial Intelligence federates numerous scientific fields in the aim of
    developing machines able to assist human operators performing complex
    treatments — most of which demand high cognitive skills (e.g. learning or
    decision processes). Central to this quest is to give machines the ability to
    estimate the likeness or similarity between things in the way human beings
    estimate the similarity between stimuli.

    In this context, this book focuses on semantic measures: approaches designed
    for comparing semantic entities such as units of language, e.g. words,
    sentences, or concepts and instances defined into knowledge bases. The aim of
    these measures is to assess the similarity or relatedness of such semantic
    entities by taking into account their semantics, i.e. their meaning —
    intuitively, the words tea and coffee, which both refer to stimulating
    beverage, will be estimated to be more semantically similar than the words
    toffee (confection) and coffee, despite that the last pair has a higher
    syntactic similarity. The two state-of-the-art approaches for estimating and
    quantifying semantic similarities/relatedness of semantic entities are
    presented in detail: the first one relies on corpora analysis and is based on
    Natural Language Processing techniques and semantic models while the second is
    based on more or less formal, computer-readable and workable forms of knowledge
    such as semantic networks, thesaurus or ontologies. (…) Beyond a simple
    inventory and categorization of existing measures, the aim of this monograph is
    to convey novices as well as researchers of these domains towards a better
    understanding of semantic similarity estimation and more generally semantic
    measures.

    Does Neural Machine Translation Benefit from Larger Context?

    Sebastien Jean, Stanislas Lauly, Orhan Firat, Kyunghyun Cho
    Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)

    We propose a neural machine translation architecture that models the
    surrounding text in addition to the source sentence. These models lead to
    better performance, both in terms of general translation quality and pronoun
    prediction, when trained on small corpora, although this improvement largely
    disappears when trained with a larger corpus. We also discover that
    attention-based neural machine translation is well suited for pronoun
    prediction and compares favorably with other approaches that were specifically
    designed for this task.

    Exploring Sparsity in Recurrent Neural Networks

    Sharan Narang, Gregory Diamos, Shubho Sengupta, Erich Elsen
    Comments: Published as a conference paper at ICLR 2017
    Subjects: Learning (cs.LG); Computation and Language (cs.CL)

    Recurrent Neural Networks (RNN) are widely used to solve a variety of
    problems and as the quantity of data and the amount of available compute have
    increased, so have model sizes. The number of parameters in recent
    state-of-the-art networks makes them hard to deploy, especially on mobile
    phones and embedded devices. The challenge is due to both the size of the model
    and the time it takes to evaluate it. In order to deploy these RNNs
    efficiently, we propose a technique to reduce the parameters of a network by
    pruning weights during the initial training of the network. At the end of
    training, the parameters of the network are sparse while accuracy is still
    close to the original dense neural network. The network size is reduced by 8x
    and the time required to train the model remains constant. Additionally, we can
    prune a larger dense network to achieve better than baseline performance while
    still reducing the total number of parameters significantly. Pruning RNNs
    reduces the size of the model and can also help achieve significant inference
    time speed-up using sparse matrix multiply. Benchmarks show that using our
    technique model size can be reduced by 90% and speed-up is around 2x to 7x.


    Distributed, Parallel, and Cluster Computing

    Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption

    Suejb Memeti, Lu Li, Sabri Pllana, Joanna Kolodziej, Christoph Kessler
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Programming Languages (cs.PL); Software Engineering (cs.SE)

    Many modern parallel computing systems are heterogeneous at their node level.
    Such nodes may comprise general purpose CPUs and accelerators (such as, GPU, or
    Intel Xeon Phi) that provide high performance with suitable energy-consumption
    characteristics. However, exploiting the available performance of heterogeneous
    architectures may be challenging. There are various parallel programming
    frameworks (such as, OpenMP, OpenCL, OpenACC, CUDA) and selecting the one that
    is suitable for a target context is not straightforward.

    In this paper, we study empirically the characteristics of OpenMP, OpenACC,
    OpenCL, and CUDA with respect to programming productivity, performance, and
    energy. To evaluate the programming productivity we use our homegrown tool
    CodeStat, which enables us to determine the percentage of code lines that was
    required to parallelize the code using a specific framework. We use our tool
    x-MeterPU to evaluate the energy consumption and the performance. Experiments
    are conducted using the industry-standard SPEC benchmark suite and the Rodinia
    benchmark suite for accelerated computing on heterogeneous systems that combine
    Intel Xeon E5 Processors with a GPU accelerator or an Intel Xeon Phi
    co-processor.

    Scalable Global Grid catalogue for LHC Run3 and beyond

    M Martinez Pedreira, C Grigoras for the ALICE Collaboration
    Comments: Proceedings of the 22nd International Conference on Computing in High Energy and Nuclear Physics, CHEP 2016, 10-14 October 2016, San Francisco. Submitted to Journal of Physics: Conference Series (JPCS)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Databases (cs.DB)

    The AliEn (ALICE Environment) file catalogue is a global unique namespace
    providing mapping between a UNIX-like logical name structure and the
    corresponding physical files distributed over 80 storage elements worldwide.
    Powerful search tools and hierarchical metadata information are integral parts
    of the system and are used by the Grid jobs as well as local users to store and
    access all files on the Grid storage elements. The catalogue has been in
    production since 2005 and over the past 11 years has grown to more than 2
    billion logical file names. The backend is a set of distributed relational
    databases, ensuring smooth growth and fast access. Due to the anticipated fast
    future growth, we are looking for ways to enhance the performance and
    scalability by simplifying the catalogue schema while keeping the functionality
    intact. We investigated different backend solutions, such as distributed key
    value stores, as replacement for the relational database. This contribution
    covers the architectural changes in the system, together with the technology
    evaluation, benchmark results and conclusions.

    Making data center computations fast, but not so furious

    Daniel Porto, João Loff, Rui Duarte, Luis Ceze, Rodrigo Rodrigues
    Comments: The 7th Workshop on Multi-core and Rack Scale Systems – MARS’17
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We propose an aggressive computational sprinting variant for data center
    environments. While most of previous work on computational sprinting focuses on
    maximizing the sprinting process while ensuring non-faulty conditions, we take
    advantage of the existing replication in data centers to push the system beyond
    its safety limits. In this paper we outline this vision, we survey existing
    techniques for achieving it, and we present some design ideas for future work
    in this area.

    A hybrid CPU-GPU parallelization scheme of variable neighborhood search for inventory optimization problems

    Nikolaos Antoniadis, Angelo Sifaleras
    Comments: 8 pages, 1 figure
    Journal-ref: Electronic Notes in Discrete Mathematics, Volume 58, April 2017,
    Pages 47-54, ISSN 1571-0653
    Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC); Discrete Mathematics (cs.DM)

    In this paper, we study various parallelization schemes for the Variable
    Neighborhood Search (VNS) metaheuristic on a CPU-GPU system via OpenMP and
    OpenACC. A hybrid parallel VNS method is applied to recent benchmark problem
    instances for the multi-product dynamic lot sizing problem with product returns
    and recovery, which appears in reverse logistics and is known to be NP-hard. We
    report our findings regarding these parallelization approaches and present
    promising computational results.

    Conceptual Frameworks for Building Online Citizen Science Projects

    Poonam Yadav, John Darlington
    Journal-ref: Human Computation (2016) 3:1:213-223 ISSN: 2330-8001, DOI:
    10.15346/hc.v3i1.12
    Subjects: Computers and Society (cs.CY); Distributed, Parallel, and Cluster Computing (cs.DC); Software Engineering (cs.SE); Social and Information Networks (cs.SI)

    In recent years, citizen science has grown in popularity due to a number of
    reasons, including the emphasis on informal learning and creativity potential
    associated with these initiatives. Citizen science projects address research
    questions from various domains, ranging from Ecology to Astronomy. Due to the
    advancement of communication technologies, which makes outreach and engagement
    of wider communities easier, scientists are keen to turn their own research
    into citizen science projects. However, the development, deployment and
    management of these projects remains challenging. One of the most important
    challenges is building the project itself. There is no single tool or
    framework, which guides the step-by-step development of the project, since
    every project has specific characteristics, such as geographical constraints or
    volunteers’ mode of participation. Therefore, in this article, we present a
    series of conceptual frameworks for categorisation, decision and deployment,
    which guide a citizen science project creator in every step of creating a new
    project starting from the research question to project deployment. The
    frameworks are designed with consideration to the properties of already
    existing citizen science projects and could be easily extended to include other
    dimensions, which are not currently perceived.

    A Secure Key Agreement Protocol for Dynamic Group

    Muhammad Bilal, Shin-Gak Kang
    Comments: This article is accepted for the publication in Cluster Computing-The Journal of Networks, Software Tools and Applications. Print ISSN 1386-7857, Online ISSN 1573-7543
    Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

    To accomplish secure group communication, it is essential to share a unique
    cryptographic key among group members. The underlying challenges to group key
    agreement are scalability, efficiency, and security. In a dynamic group
    environment, the rekeying process is more frequent; therefore, it is more
    crucial to design an efficient group key agreement protocol. Moreover, with the
    emergence of various group-based services, it is becoming common for several
    multicast groups to coexist in the same network. These multicast groups may
    have several shared users; a join or leave request by a single user can trigger
    regeneration of multiple group keys. Under the given circumstances the rekeying
    process becomes a challenging task. In this work, we propose a novel
    methodology for group key agreement which exploits the state vectors of group
    members. The state vector is a set of randomly generated nonce instances which
    determine the logical link between group members and which empowers the group
    member to generate multiple cryptographic keys independently. Using local
    knowledge of a secret nonce, each member can generate and share a large number
    of secure keys, indicating that SGRS inherently provides a considerable amount
    of secure subgroup multicast communication using subgroup multicasting keys
    derived from local state vectors. The resulting protocol is secure and
    efficient in terms of both communication and computation.


    Learning

    Hot or not? Forecasting cellular network hot spots using sector performance indicators

    Joan Serrà, Ilias Leontiadis, Alexandros Karatzoglou, Konstantina Papagiannaki
    Comments: Accepted for publication at ICDE 2017 – Industrial Track
    Subjects: Learning (cs.LG); Networking and Internet Architecture (cs.NI); Systems and Control (cs.SY)

    To manage and maintain large-scale cellular networks, operators need to know
    which sectors underperform at any given time. For this purpose, they use the
    so-called hot spot score, which is the result of a combination of multiple
    network measurements and reflects the instantaneous overall performance of
    individual sectors. While operators have a good understanding of the current
    performance of a network and its overall trend, forecasting the performance of
    each sector over time is a challenging task, as it is affected by both regular
    and non-regular events, triggered by human behavior and hardware failures. In
    this paper, we study the spatio-temporal patterns of the hot spot score and
    uncover its regularities. Based on our observations, we then explore the
    possibility to use recent measurements’ history to predict future hot spots. To
    this end, we consider tree-based machine learning models, and study their
    performance as a function of time, amount of past data, and prediction horizon.
    Our results indicate that, compared to the best baseline, tree-based models can
    deliver up to 14% better forecasts for regular hot spots and 153% better
    forecasts for non-regular hot spots. The latter brings strong evidence that,
    for moderate horizons, forecasts can be made even for sectors exhibiting
    isolated, non-regular behavior. Overall, our work provides insight into the
    dynamics of cellular sectors and their predictability. It also paves the way
    for more proactive network operations with greater forecasting horizons.

    HPSLPred: An Ensemble Multi-label Classifier for Human Protein Subcellular Location Prediction with Imbalanced Source

    Shixiang Wan, Quan Zou
    Subjects: Learning (cs.LG)

    Predicting the subcellular localization of proteins is an important and
    challenging problem. Traditional experimental approaches are often expensive
    and time-consuming. Consequently, a growing number of research efforts employ a
    series of machine learning approaches to predict the subcellular location of
    proteins. There are two main challenges among the state-of-the-art prediction
    methods. First, most of the existing techniques are designed to deal with
    multi-class rather than multi-label classification, which ignores connections
    between multiple labels. In reality, multiple locations of particular proteins
    implies that there are vital and unique biological significances that deserve
    special focus and cannot be ignored. Second, techniques for handling imbalanced
    data in multi-label classification problems are necessary, but never employed.
    For solving these two issues, we have developed an ensemble multi-label
    classifier called HPSLPred, which can be applied for multi-label classification
    with an imbalanced protein source. For convenience, a user-friendly webserver
    has been established at this http URL

    Stein Variational Autoencoder

    Yunchen Pu, Zhe Gan, Ricardo Henao, Chunyuan Li, Shaobo Han, Lawrence Carin
    Subjects: Learning (cs.LG)

    A new method for learning variational autoencoders is developed, based on an
    application of Stein’s operator. The framework represents the encoder as a deep
    nonlinear function through which samples from a simple distribution are fed.
    One need not make parametric assumptions about the form of the encoder
    distribution, and performance is further enhanced by integrating the proposed
    encoder with importance sampling. Example results are demonstrated across
    multiple unsupervised and semi-supervised problems, including semi-supervised
    analysis of the ImageNet data, demonstrating the scalability of the model to
    large datasets.

    O(^2)TD: (Near)-Optimal Off-Policy TD Learning

    Bo Liu, Daoming Lyu, Wen Dong, Saad Biaz
    Comments: 10 pages, 7 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Temporal difference learning and Residual Gradient methods are the most
    widely used temporal difference based learning algorithms; however, it has been
    shown that none of their objective functions is optimal w.r.t approximating the
    true value function (V). Two novel algorithms are proposed to approximate the
    true value function (V). This paper makes the following contributions: (1) A
    batch algorithm that can help find the approximate optimal off-policy
    prediction of the true value function (V). (2) A linear computational cost (per
    step) near-optimal algorithm that can learn from a collection of off-policy
    samples. (3) A new perspective of the emphatic temporal difference learning
    which bridges the gap between off-policy optimality and off-policy stability.

    Exploring Sparsity in Recurrent Neural Networks

    Sharan Narang, Gregory Diamos, Shubho Sengupta, Erich Elsen
    Comments: Published as a conference paper at ICLR 2017
    Subjects: Learning (cs.LG); Computation and Language (cs.CL)

    Recurrent Neural Networks (RNN) are widely used to solve a variety of
    problems and as the quantity of data and the amount of available compute have
    increased, so have model sizes. The number of parameters in recent
    state-of-the-art networks makes them hard to deploy, especially on mobile
    phones and embedded devices. The challenge is due to both the size of the model
    and the time it takes to evaluate it. In order to deploy these RNNs
    efficiently, we propose a technique to reduce the parameters of a network by
    pruning weights during the initial training of the network. At the end of
    training, the parameters of the network are sparse while accuracy is still
    close to the original dense neural network. The network size is reduced by 8x
    and the time required to train the model remains constant. Additionally, we can
    prune a larger dense network to achieve better than baseline performance while
    still reducing the total number of parameters significantly. Pruning RNNs
    reduces the size of the model and can also help achieve significant inference
    time speed-up using sparse matrix multiply. Benchmarks show that using our
    technique model size can be reduced by 90% and speed-up is around 2x to 7x.

    Diagonal RNNs in Symbolic Music Modeling

    Y. Cem Subakan, Paris Smaragdis
    Comments: Submitted to Waspaa 2017
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose a new Recurrent Neural Network (RNN) architecture.
    The novelty is simple: We use diagonal recurrent matrices instead of full. This
    results in better test likelihood and faster convergence compared to regular
    full RNNs in most of our experiments. We show the benefits of using diagonal
    recurrent matrices with popularly used LSTM and GRU architectures as well as
    with the vanilla RNN architecture, on four standard symbolic music datasets.

    Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

    Giorgio Roffo, Simone Melzi
    Comments: Preprint version – Lecture Notes in Computer Science – Springer 2017
    Journal-ref: New Frontiers in Mining Complex Patterns, Fifth International
    workshop, nfMCP2016. Lecture Notes in Computer Science – Springer
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    In an era where accumulating data is easy and storing it inexpensive, feature
    selection plays a central role in helping to reduce the high-dimensionality of
    huge amounts of otherwise meaningless data. In this paper, we propose a
    graph-based method for feature selection that ranks features by identifying the
    most important ones into arbitrary set of cues. Mapping the problem on an
    affinity graph-where features are the nodes-the solution is given by assessing
    the importance of nodes through some indicators of centrality, in particular,
    the Eigen-vector Centrality (EC). The gist of EC is to estimate the importance
    of a feature as a function of the importance of its neighbors. Ranking central
    nodes individuates candidate features, which turn out to be effective from a
    classification point of view, as proved by a thoroughly experimental section.
    Our approach has been tested on 7 diverse datasets from recent literature
    (e.g., biological data and object recognition, among others), and compared
    against filter, embedded and wrappers methods. The results are remarkable in
    terms of accuracy, stability and low execution time.

    A Study of Deep Learning Robustness Against Computation Failures

    Jean-Charles Vialatte, François Leduc-Primeau
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    For many types of integrated circuits, accepting larger failure rates in
    computations can be used to improve energy efficiency. We study the performance
    of faulty implementations of certain deep neural networks based on pessimistic
    and optimistic models of the effect of hardware faults. After identifying the
    impact of hyperparameters such as the number of layers on robustness, we study
    the ability of the network to compensate for computational failures through an
    increase of the network size. We show that some networks can achieve equivalent
    performance under faulty implementations, and quantify the required increase in
    computational complexity.

    Unsupervised Learning by Predicting Noise

    Piotr Bojanowski, Armand Joulin
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Convolutional neural networks provide visual features that perform remarkably
    well in many computer vision applications. However, training these networks
    requires significant amounts of supervision. This paper introduces a generic
    framework to train deep networks, end-to-end, with no supervision. We propose
    to fix a set of target representations, called Noise As Targets (NAT), and to
    constrain the deep features to align to them. This domain agnostic approach
    avoids the standard unsupervised learning issues of trivial solutions and
    collapsing of features. Thanks to a stochastic batch reassignment strategy and
    a separable square loss function, it scales to millions of images. The proposed
    approach produces representations that perform on par with state-of-the-art
    unsupervised methods on ImageNet and Pascal VOC.

    Large-Scale Online Semantic Indexing of Biomedical Articles via an Ensemble of Multi-Label Classification Models

    Yannis Papanikolaou, Grigorios Tsoumakas, Manos Laliotis, Nikos Markantonatos, Ioannis Vlahavas
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Background: In this paper we present the approaches and methods employed in
    order to deal with a large scale multi-label semantic indexing task of
    biomedical papers. This work was mainly implemented within the context of the
    BioASQ challenge of 2014. Methods: The main contribution of this work is a
    multi-label ensemble method that incorporates a McNemar statistical
    significance test in order to validate the combination of the constituent
    machine learning algorithms. Some secondary contributions include a study on
    the temporal aspects of the BioASQ corpus (observations apply also to the
    BioASQ’s super-set, the PubMed articles collection) and the proper adaptation
    of the algorithms used to deal with this challenging classification task.
    Results: The ensemble method we developed is compared to other approaches in
    experimental scenarios with subsets of the BioASQ corpus giving positive
    results. During the BioASQ 2014 challenge we obtained the first place during
    the first batch and the third in the two following batches. Our success in the
    BioASQ challenge proved that a fully automated machine-learning approach, which
    does not implement any heuristics and rule-based approaches, can be highly
    competitive and outperform other approaches in similar challenging contexts.

    Know Your Master: Driver Profiling-based Anti-theft Method

    Byung Il Kwak, JiYoung Woo, Huy Kang Kim
    Comments: 8 pages, 11 figures, Accepted for PST 2016 : 14th International Conference on Privacy, Security and Trust
    Subjects: Cryptography and Security (cs.CR); Learning (cs.LG)

    Although many anti-theft technologies are implemented, auto-theft is still
    increasing. Also, security vulnerabilities of cars can be used for auto-theft
    by neutralizing anti-theft system. This keyless auto-theft attack will be
    increased as cars adopt computerized electronic devices more. To detect
    auto-theft efficiently, we propose the driver verification method that analyzes
    driving patterns using measurements from the sensor in the vehicle. In our
    model, we add mechanical features of automotive parts that are excluded in
    previous works, but can be differentiated by drivers’ driving behaviors. We
    design the model that uses significant features through feature selection to
    reduce the time cost of feature processing and improve the detection
    performance. Further, we enrich the feature set by deriving statistical
    features such as mean, median, and standard deviation. This minimizes the
    effect of fluctuation of feature values per driver and finally generates the
    reliable model. We also analyze the effect of the size of sliding window on
    performance to detect the time point when the detection becomes reliable and to
    inform owners the theft event as soon as possible. We apply our model with real
    driving and show the contribution of our work to the literature of driver
    identification.

    Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction

    Kun Gai, Xiaoqiang Zhu, Han Li, Kai Liu, Zhe Wang
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    CTR prediction in real-world business is a difficult machine learning problem
    with large scale nonlinear sparse data. In this paper, we introduce an
    industrial strength solution with model named Large Scale Piece-wise Linear
    Model (LS-PLM). We formulate the learning problem with (L_1) and (L_{2,1})
    regularizers, leading to a non-convex and non-smooth optimization problem.
    Then, we propose a novel algorithm to solve it efficiently, based on
    directional derivatives and quasi-Newton method. In addition, we design a
    distributed system which can run on hundreds of machines parallel and provides
    us with the industrial scalability. LS-PLM model can capture nonlinear patterns
    from massive sparse data, saving us from heavy feature engineering jobs. Since
    2012, LS-PLM has become the main CTR prediction model in Alibaba’s online
    display advertising system, serving hundreds of millions users every day.

    Does Neural Machine Translation Benefit from Larger Context?

    Sebastien Jean, Stanislas Lauly, Orhan Firat, Kyunghyun Cho
    Subjects: Machine Learning (stat.ML); Computation and Language (cs.CL); Learning (cs.LG)

    We propose a neural machine translation architecture that models the
    surrounding text in addition to the source sentence. These models lead to
    better performance, both in terms of general translation quality and pronoun
    prediction, when trained on small corpora, although this improvement largely
    disappears when trained with a larger corpus. We also discover that
    attention-based neural machine translation is well suited for pronoun
    prediction and compares favorably with other approaches that were specifically
    designed for this task.

    Does robustness imply tractability? A lower bound for planted clique in the semi-random model

    Jacob Steinhardt
    Subjects: Computational Complexity (cs.CC); Information Theory (cs.IT); Learning (cs.LG); Statistics Theory (math.ST)

    We consider a robust analog of the planted clique problem. In this analog, a
    set (S) of vertices is chosen and all edges in (S) are included; then, edges
    between (S) and the rest of the graph are included with probability
    (frac{1}{2}), while edges not touching (S) are allowed to vary arbitrarily.
    For this semi-random model, we show that the information-theoretic threshold
    for recovery is ( ilde{Theta}(sqrt{n})), in sharp contrast to the classical
    information-theoretic threshold of (Theta(log(n))). This matches the
    conjectured computational threshold for the classical planted clique problem,
    and thus raises the intriguing possibility that, once we require robustness,
    there is no computational-statistical gap for planted clique.

    Google's Cloud Vision API Is Not Robust To Noise

    Hossein Hosseini, Baicen Xiao, Radha Poovendran
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Google has recently introduced the Cloud Vision API for image analysis.
    According to the demonstration website, the API “quickly classifies images into
    thousands of categories, detects individual objects and faces within images,
    and finds and reads printed words contained within images.” It can be also used
    to “detect different types of inappropriate content from adult to violent
    content.”

    In this paper, we evaluate the robustness of the Google’s Cloud Vision API to
    input perturbation. In particular, we show that by adding sufficient noise to
    the image, the API generates completely different outputs for the noisy image,
    while a human observer would perceive its original content. We show that the
    attack is consistently successful, by performing extensive experiments on
    different image types, including natural images, images containing faces and
    images with texts. Our findings indicate the vulnerability of the API in
    adversarial environments. For example, an adversary can bypass an image
    filtering system by adding noise to an image with inappropriate content.


    Information Theory

    Reverse Engineering of Communications Networks: Evolution and Challenges

    Mehdi Teimouri, Hamidreza Kakaei Motlagh
    Comments: 18 pages, 9 figures
    Subjects: Information Theory (cs.IT)

    Reverse engineering of a communications network is the process of identifying
    the communications protocol used in the network. This problem arises in various
    situations such as eavesdropping, intelligent jamming, cognitive radio, and
    adaptive coding and modulation (ACM). According to the Open Systems
    Interconnection (OSI) reference model, the first step in reverse engineering of
    communications networks is recognition of physical layer which consists of
    recognition of digital modulations and identification of physical layer
    transmission techniques. The next step is recognition of data link layer
    (consisting of frame synchronization, recognition of channel codes,
    reconstruction of interleavers, reconstruction of scramblers, etc.) and also
    recognition of network and transport layers. The final step in reverse
    engineering of communications networks is recognition of upper layers which
    essentially can be seen as identification of source encoders. The objective of
    this paper is to provide a comprehensive overview on the current methods for
    reverse engineering of communications networks. Furthermore, challenges and
    open research issues in this field are introduced.

    Wave-like Decoding of Tail-biting Spatially Coupled LDPC Codes Through Iterative Demapping

    Sebastian Cammerer, Laurent Schmalen, Vahid Aref, Stephan ten Brink
    Comments: presentat at the International Symposium on Turbo Codes & Iterative Information Processing (ISTC), Brest, Sept. 2016
    Subjects: Information Theory (cs.IT)

    For finite coupling lengths, terminated spatially coupled low-density
    parity-check (SC-LDPC) codes show a non-negligible rate-loss. In this paper, we
    investigate if this rate loss can be mitigated by tail-biting SC-LDPC codes in
    conjunction with iterative demapping of higher order modulation formats.
    Therefore, we examine the BP threshold of different coupled and uncoupled
    ensembles. A comparison between the decoding thresholds approximated by EXIT
    charts and the density evolution results of the coupled and uncoupled ensemble
    is given. We investigate the effect and potential of different labelings for
    such a set-up using per-bit EXIT curves, and exemplify the method for a 16-QAM
    system, e.g., using set partitioning labelings. A hybrid mapping is proposed,
    where different sub-blocks use different labelings in order to further optimize
    the decoding thresholds of tail-biting codes, while the computational
    complexity overhead through iterative demapping remains small.

    Power Efficient Hybrid Beamforming for Massive MIMO Public Channel

    Cheng Zhang, Yongming Huang, Yindi Jing, Luxi Yang
    Comments: submitted to journal
    Subjects: Information Theory (cs.IT)

    For hybrid massive MIMO public channel with any sector size in either
    microwave or millimeter wave (mmwave) band, this paper studies the hybrid
    beamforming optimized to minimize the transmit power while guaranteeing the
    quality of service (QoS) for randomly deployed users. First the ideal
    beampattern is derived via Parseval Identity, based on which a beamforming
    design problem is formulated to minimize the gap with the idea beampattern. The
    problem is transformable to a multiconvex one and an iterative optimization
    algorithm is used to obtain the full-digital beamformer. In addition, with the
    help of same beampattern theorem, the power amplifier (PA) efficiency of the
    beamformer is improved with unchanged beampattern. Finally, the hybrid
    beamforming design is obtained that achieves the full-digital beamformer
    solution. Simulations verifies the advantages of the proposed scheme over
    existing ones.

    Waveform Design for Wireless Power Transfer with Limited Feedback

    Yang Huang, Bruno Clerckx
    Comments: Submitted for journal publication
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Waveform design is a key technique to jointly exploit a beamforming gain, the
    channel frequency-selectivity and the rectifier nonlinearity, so as to enhance
    the end-to-end power transfer efficiency of Wireless Power Transfer (WPT).
    Those waveforms have been designed assuming perfect channel state information
    at the transmitter. This paper proposes two waveform strategies relying on
    limited feedback for multi-antenna multi-sine WPT over frequency-selective
    channels. In the waveform selection strategy, the Energy Transmitter (ET)
    transmits over multiple timeslots with every time a different waveform precoder
    within a codebook, and the Energy Receiver (ER) reports the index of the
    precoder in the codebook that leads to the largest harvested energy. In the
    waveform refinement strategy, the ET sequentially transmits two waveforms in
    each stage, and the ER reports one feedback bit indicating an increase/decrease
    in the harvested energy during this stage. Based on multiple one-bit feedback,
    the ET successively refines waveform precoders in a tree-structured codebook
    over multiple stages. By employing the framework of the generalized Lloyd’s
    algorithm, novel algorithms are proposed for both strategies to optimize the
    codebooks in both space and frequency domains. The proposed limited
    feedback-based waveform strategies are shown to outperform a set of baselines,
    achieving higher harvested energy.

    How to exploit prior information in low-complexity models

    Sajad Daei, Farzan Haddadi
    Subjects: Information Theory (cs.IT)

    Compressed Sensing refers to extracting a low-dimensional structured signal
    of interest from its incomplete random linear observations. A line of recent
    work has studied that, with the extra prior information about the signal, one
    can recover the signal with much fewer observations. For this purpose, the
    general approach is to solve weighted convex function minimization problem. In
    such settings, the convex function is chosen to promote the low-dimensional
    structure and the optimal weights are so chosen to reduce the number of
    measurements required for the optimization problem. In this paper, we consider
    a generalized non-uniform model in which the structured signal falls into some
    partitions, with entries of each partition having a definite probability to be
    an element of the structure support. Given these probabilities and regarding
    the recent developments in conic integral geometry, we provide a method to
    choose the unique optimal weights for any general low-dimensional signal model.
    This class of low-dimensional signal model includes many popular examples such
    as (ell_1) analysis (entry-wise sparsity in an arbitrary redundant
    dictionary), (ell_{1,2}) norm (block sparsity) and total variation semi-norm
    (for piece-wise constant signals). We show through precise analysis and
    simulations that the weighted convex optimization problem significantly
    improves the regular convex optimization problem as we choose the unique
    optimal weights.

    Improving the Performance of OTDOA based Positioning in NB-IoT System

    Sha Hu, Axel Berg, Xuhong Li, Fredrik Rusek
    Comments: Submitted to VTC-Fall, 2017, 7 pages, 11 figures
    Subjects: Information Theory (cs.IT)

    In this paper, we consider positioning with
    observed-time-difference-of-arrival (OTDOA) for a device deployed in
    long-term-evolution (LTE) based narrow-band Internet-of-things (NB-IoT)
    systems. We propose an iterative expectation-maximization based successive
    interference cancellation (EM-SIC) algorithm to jointly consider estimations of
    residual frequency-offset (FO), fading-channel taps and time-of-arrival (ToA)
    of the first arrival-path for each of the detected cells. In order to design a
    low complexity ToA detector and also due to the limits of low-cost analog
    circuits, we assume an NB-IoT device working at a low-sampling rate such as
    1.92 MHz or lower. The proposed EM-SIC algorithm comprises two stages to detect
    ToA, based on which OTDOA can be calculated. In a first stage, after running
    the EM-SIC block a predefined number of iterations, a coarse ToA is estimated
    for each of the detected cells. Then in a second stage, to improve the ToA
    resolution, a low-pass filter is utilized to interpolate the correlations of
    time-domain PRS signal evaluated at a low sampling-rate to a high sampling-rate
    such as 30.72 MHz. To keep low-complexity, only the correlations inside a small
    search window centered at the coarse ToA estimates are upsampled. Then, the
    refined ToAs are estimated based on upsampled correlations. If at least three
    cells are detected, with OTDOA and the locations of detected cell sites, the
    position of the NB-IoT device can be estimated. We show through numerical
    simulations that, the proposed EM-SIC based ToA detector is robust against
    impairments introduced by inter-cell interference, fading-channel and residual
    FO. Thus significant signal-to-noise (SNR) gains are obtained over traditional
    ToA detectors that do not consider these impairments when positioning a device.

    Joint Domain Based Massive Access for Small Packets Traffic of Uplink Wireless Channel

    Qiang Song, Ronggui Xie, Huarui Yin, Guo Wei
    Comments: 6 pages, 5 figures.submitted to globecom 2017
    Subjects: Information Theory (cs.IT)

    The fifth generation (5G) communication scenarios such as the cellular
    network and the emerging machine type communications will produce massive small
    packets. To support massive connectivity and avoid signaling overhead caused by
    the transmission of those small packets, this paper proposes a novel method to
    improve the transmission efficiency for massive connections of wireless uplink
    channel. The proposed method combines compressive sensing (CS) with power
    domain NOMA jointly, especially neither the scheduling nor the centralized
    power allocation is necessary in the method. Both the analysis and simulation
    show that the method can support up to two or three times overloading.

    On Low Complexity Detection for QAM Isomorphic Constellations

    Farbod Kayhan
    Comments: Submitted to IEEE GLOBECOM 2017
    Subjects: Information Theory (cs.IT)

    Despite of the known gap from the Shannon’s capacity, several standards are
    still employing QAM or star shape constellations, mainly due to the existing
    low complexity detectors. In this paper, we investigate the low complexity
    detection for a family of QAM isomorphic constellations. These constellations
    are known to perform very close to the peak-power limited capacity,
    outperforming the DVB-S2X standard constellations. The proposed strategy is to
    first remap the received signals to the QAM constellation using the existing
    isomorphism and then break the log likelihood ratio computations to two one
    dimensional PAM constellations. Gains larger than 0.6 dB with respect to QAM
    can be obtained over the peak power limited channels without any increase in
    detection complexity. Our scheme also provides a systematic way to design
    constellations with low complexity one dimensional detectors. Several open
    problems are discussed at the end of the paper.

    Coverage and Rate of Downlink Sequence Transmissions with Reliability Guarantees

    Jihong Park, Petar Popovski
    Comments: 4 pages, 4 figures, submitted to IEEE Wireless Communications Letters
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Real-time distributed control is a promising application of 5G in which
    communication links should satisfy certain reliability guarantees. In this
    letter, we derive closed-form maximum average rate when a device (e.g.
    industrial machine) downloads a sequence of n operational commands through
    cellular connection, while guaranteeing a certain coverage for all n messages.
    The result is based on novel closed-form n- successive coverage bounds. The
    proposed lower bound provides a simple approximation that is increasingly
    accurate in the high reliability region. For moderate coverage, a linear
    combination of the proposed bounds enables deriving a tractable approximation.
    Both techniques can be utilized for calculating the maximum average rate with
    and without using adaptive modulation and coding (AMC).

    On PGZ decoding of alternant codes

    R. Farré, N. Sayols, S. Xambó-Descamps
    Comments: 16 pages
    Subjects: Information Theory (cs.IT)

    In this note we first review the classical Petterson-Gorenstein-Zierler
    decoding algorithm for the class of alternant codes (which includes
    Reed-Solomon, Bose-Chaudhuri-Hocquenghem and classical Goppa codes), then we
    present an improvement of the method to find the number of errors and the
    errorlocator polynomial, and finally we illustrate the procedure with several
    examples. In two appendices we sketch the main features of the system we have
    used for the computations.

    Secret Key Generation from Correlated Sources and Secure Link

    Daming Cao, Wei Kang
    Comments: 5 pages, 1 figure
    Subjects: Information Theory (cs.IT)

    In this paper, we study the problem of secret key generation from both
    correlated sources and a secure channel. We obtain the optimal secret key rate
    in this problem and show that the optimal scheme is to conduct secret key
    generation and key distribution jointly, where every bit in the secret channel
    will yield more than one bit of secret key rate. This joint scheme is better
    than the separation-based scheme, where the secure channel is used for key
    distribution, and as a result, every bit in the secure channel can only provide
    one bit of secret key rate.

    ECG Signal Compression and Optimization in Remote Monitoring Networks

    Pengda Wong
    Comments: 12 pages, 12 figures
    Subjects: Information Theory (cs.IT)

    We proposed a practical ECG compression system which is beneficial for
    tele-monitoring cardiovascular diseases. There are two steps in the compression
    framework. First, we partition ECG signal into segments according to R- to
    R-wave periods. The partition aims at achieving more stable statistical
    features between segments of ECG signal which is beneficial for saving bit
    rates. After the partition, we optimize the bit rate in the sense of minimizing
    ECG reconstruction error under a constraint of consumed bits. From the
    experiment results, the proposed compression scheme is able to reduce the
    computation for updating codebook, and save channel capacity resources for
    transmitting ECG signals.

    Mutual Information, Relative Entropy and Estimation Error in Semi-martingale Channels

    Jiantao Jiao, Kartik Venkat, Tsachy Weissman
    Subjects: Information Theory (cs.IT)

    Fundamental relations between information and estimation have been
    established in the literature for the continuous-time Gaussian and Poisson
    channels, in a long line of work starting from the classical representation
    theorems by Duncan and Kabanov respectively. In this work, we demonstrate that
    such relations hold for a much larger family of continuous-time channels. We
    introduce the family of semi-martingale channels where the channel output is a
    semi-martingale stochastic process, and the channel input modulates the
    characteristics of the semi-martingale. For these channels, which includes as a
    special case the continuous time Gaussian and Poisson models, we establish new
    representations relating the mutual information between the channel input and
    output to an optimal causal filtering loss, thereby unifying and considerably
    extending results from the Gaussian and Poisson settings. Extensions to the
    setting of mismatched estimation are also presented where the relative entropy
    between the laws governing the output of the channel under two different input
    distributions is equal to the cumulative difference between the estimation loss
    incurred by using the mismatched and optimal causal filters respectively. The
    main tool underlying these results is the Doob–Meyer decomposition of a class
    of likelihood ratio sub-martingales. The results in this work can be viewed as
    the continuous-time analogues of recent generalizations for relations between
    information and estimation for discrete-time L’evy channels.

    Capacity of Cellular Wireless Network

    Rahul Vaze, Srikanth Iyer
    Comments: A shorter version to appear in WiOpt 2017
    Subjects: Information Theory (cs.IT)

    Earlier definitions of capacity for wireless networks, e.g., transport or
    transmission capacity, for which exact theoretical results are known, are well
    suited for ad hoc networks but are not directly applicable for cellular
    wireless networks, where large-scale basestation (BS) coordination is not
    possible, and retransmissions/ARQ under the SINR model is a universal feature.

    In this paper, cellular wireless networks, where both BS locations and mobile
    user (MU) locations are distributed as independent Poisson point processes are
    considered, and each MU connects to its nearest BS. With ARQ, under the SINR
    model, the effective downlink rate of packet transmission is the reciprocal of
    the expected delay (number of retransmissions needed till success), which we
    use as our network capacity definition after scaling it with the BS density.

    Exact characterization of this natural capacity metric for cellular wireless
    networks is derived. The capacity is shown to first increase polynomially with
    the BS density in the low BS density regime and then scale inverse
    exponentially with the increasing BS density. Two distinct upper bounds are
    derived that are relevant for the low and the high BS density regimes. A single
    power control strategy is shown to achieve the upper bounds in both the
    regimes. This result is fundamentally different from the well known capacity
    results for ad hoc networks, such as transport and transmission capacity that
    scale as the square root of the (high) BS density. Our results show that the
    strong temporal correlations of SINRs with PPP distributed BS locations is
    limiting, and the realizable capacity in cellular wireless networks in high-BS
    density regime is much smaller than previously thought. A byproduct of our
    analysis shows that the capacity of the ALOHA strategy with retransmissions is
    zero.

    Satellite Based Positioning Signal Acquisition at Higher Order Cycle Frequency

    Pengda Wong
    Subjects: Information Theory (cs.IT)

    The acquisition of the signal from the satellite based positioning systems,
    such as GPS, Galileo, and Compass, encounters challenges in the urban streets,
    indoor. For improving the acquisition performance, the data accumulation is
    usually performed to improve the signal-to-noise ratio which is defined on the
    second order statistics. Different from the conventional approaches, the
    acquisition based on higher order cyclostatistics is proposed. Using the
    cyclostatistics, the estimation of the initial phase and Doppler shift of the
    signal is presented respectively. Afterwards, a joint estimator is introduced.
    The analysis in this paper is performed on GPS signal. Indeed, the proposed
    estimation method can be straightforwardly extended to acquire the signal from
    the other satellite positioning systems. The simulation and experiment results
    demonstrate that the proposed signal acquisition scheme achieves the detection
    probability of 0.9 at the CNR 28dBHz.

    "Short-Dot": Computing Large Linear Transforms Distributedly Using Coded Short Dot Products

    Sanghamitra Dutta, Viveck Cadambe, Pulkit Grover
    Comments: Presented at NIPS 2016, Barcelona, Spain
    Subjects: Information Theory (cs.IT)

    Faced with saturation of Moore’s law and increasing dimension of data, system
    designers have increasingly resorted to parallel and distributed computing.
    However, distributed computing is often bottle necked by a small fraction of
    slow processors called “stragglers” that reduce the speed of computation
    because the fusion node has to wait for all processors to finish. To combat the
    effect of stragglers, recent literature introduces redundancy in computations
    across processors, e.g.,~using repetition-based strategies or erasure codes.
    The fusion node can exploit this redundancy by completing the computation using
    outputs from only a subset of the processors, ignoring the stragglers. In this
    paper, we propose a novel technique — that we call “Short-Dot” — to introduce
    redundant computations in a coding theory inspired fashion, for computing
    linear transforms of long vectors. Instead of computing long dot products as
    required in the original linear transform, we construct a larger number of
    redundant and short dot products that can be computed faster and more
    efficiently at individual processors. In reference to comparable schemes that
    introduce redundancy to tackle stragglers, Short-Dot reduces the cost of
    computation, storage and communication since shorter portions are stored and
    computed at each processor, and also shorter portions of the input is
    communicated to each processor. We demonstrate through probabilistic analysis
    as well as experiments that Short-Dot offers significant speed-up compared to
    existing techniques. We also derive trade-offs between the length of the
    dot-products and the resilience to stragglers (number of processors to wait
    for), for any such strategy and compare it to that achieved by our strategy.

    Performance Analysis of Slotted Secondary Transmission with Adaptive Modulation under Interweave Cognitive Radio Implementation

    Wen-Jing Wang, Hong-Chuan Yang
    Subjects: Information Theory (cs.IT)

    In cognitive radio communication, unlicensed secondary user (SU) can access
    under-utilized spectrum of the licensed primary user (PU) opportunistically for
    emerging wireless applications. With interweave implementation, SU has to
    perform spectrum sensing on the target frequency band and waits for
    transmission if PU occupies the channel. This waiting time results in extra
    delay for secondary transmission. In this paper, the delay and throughput
    performance of secondary packet transmission is evaluated with slotted
    transmission protocol. We propose a discrete-time Markov model to characterize
    secondary slotted transmission process. Close-form solution of collision
    probability is obtained. We then carry out the queuing delay and throughput
    analysis based on a two-dimensional-finite-state Markov chain for small-size
    packet transmission. For large-size packets, the distribution function of
    extended delivery time for secondary packet transmission is also derived.
    Selected numerical results are presented to illustrate the mathematical
    formulas and to validate our research results.

    Approximations from Anywhere and General Rough Sets

    A. Mani
    Comments: 20 Pages. Scheduled to appear in IJCRS’2017 LNCS Proceedings, Springer
    Subjects: Logic (math.LO); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM); Information Theory (cs.IT)

    Not all approximations arise from information systems. The problem of fitting
    approximations, subjected to some rules (and related data), to information
    systems in a rough scheme of things is known as the emph{inverse problem}. The
    inverse problem is more general than the duality (or abstract representation)
    problems and was introduced by the present author in her earlier papers. From
    the practical perspective, a few (as opposed to one) theoretical frameworks may
    be suitable for formulating the problem itself. emph{Granular operator spaces}
    have been recently introduced and investigated by the present author in her
    recent work in the context of antichain based and dialectical semantics for
    general rough sets. The nature of the inverse problem is examined from
    number-theoretic and combinatorial perspectives in a higher order variant of
    granular operator spaces and some necessary conditions are proved. The results
    and the novel approach would be useful in a number of unsupervised and semi
    supervised learning contexts and algorithms.

    Image Fusion With Cosparse Analysis Operator

    Rui Gao, Sergiy A. Vorobyov, Hong Zhao
    Comments: 12 pages, 4 figures, 1 table, Submitted to IEEE Signal Processing Letters on December 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

    The paper addresses the image fusion problem, where multiple images captured
    with different focus distances are to be combined into a higher quality
    all-in-focus image. Most current approaches for image fusion strongly rely on
    the unrealistic noise-free assumption used during the image acquisition, and
    then yield limited robustness in fusion processing. In our approach, we
    formulate the multi-focus image fusion problem in terms of an analysis sparse
    model, and simultaneously perform the restoration and fusion of multi-focus
    images. Based on this model, we propose an analysis operator learning, and
    define a novel fusion function to generate an all-in-focus image. Experimental
    evaluations confirm the effectiveness of the proposed fusion approach both
    visually and quantitatively, and show that our approach outperforms
    state-of-the-art fusion methods.

    Competitive Resource Allocation in HetNets: the Impact of Small-cell Spectrum Constraints and Investment Costs

    Cheng Chen, Randall A. Berry, Michael L. Honig, Vijay G. Subramanian
    Comments: 14 pages, 8 figures. submitted to IEEE Transactions on Cognitive Communications and Networking
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    Heterogeneous wireless networks with small-cell deployments in licensed and
    unlicensed spectrum bands are a promising approach for expanding wireless
    connectivity and service. As a result, wireless service providers (SPs) are
    adding small-cells to augment their existing macro-cell deployments. This added
    flexibility complicates network management, in particular, service pricing and
    spectrum allocations across macro- and small-cells. Further, these decisions
    depend on the degree of competition among SPs. Restrictions on shared spectrum
    access imposed by regulators, such as low power constraints that lead to
    small-cell deployments, along with the investment cost needed to add small
    cells to an existing network, also impact strategic decisions and market
    efficiency. If the revenue generated by small-cells does not cover the
    investment cost, then there will be no deployment even if it increases social
    welfare. We study the implications of such spectrum constraints and investment
    costs on resource allocation and pricing decisions by competitive SPs, along
    with the associated social welfare. Our results show that while the optimal
    resource allocation taking constraints and investment into account can be
    uniquely determined, adding those features with strategic SPs can have a
    substantial effect on the equilibrium market structure.

    Performance Impact of Base Station Antenna Heights in Dense Cellular Networks

    Ming Ding, David Lopez-Perez
    Comments: IEEE journal submission. Submission on Apr. 26, 2016. First-round decision (Major Revision) on Sep. 12, 2016. Major revision submitted on Nov. 7, 2016. Second-round in progress
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    In this paper, we present a new and significant theoretical discovery. If the
    absolute height difference between base station (BS) antenna and user equipment
    (UE) antenna is larger than zero, then the network performance in terms of both
    the coverage probability and the area spectral efficiency (ASE) will
    continuously decrease toward zero as the BS density increases for ultra-dense
    (UD) small cell networks (SCNs). Such findings are completely different from
    the conclusions in existing works, both quantitatively and qualitatively. In
    particular, this performance behavior has a tremendous impact on the deployment
    of UD SCNs in the 5th-generation (5G) era. Network operators may invest large
    amounts of money in deploying more network infrastructure to only obtain an
    even less network capacity. Our study results reveal that one way to address
    this issue is to lower the SCN BS antenna height to the UE antenna height.
    However, this requires a revolutionized approach of BS architecture and
    deployment, which is explored in this paper too.

    Does robustness imply tractability? A lower bound for planted clique in the semi-random model

    Jacob Steinhardt
    Subjects: Computational Complexity (cs.CC); Information Theory (cs.IT); Learning (cs.LG); Statistics Theory (math.ST)

    We consider a robust analog of the planted clique problem. In this analog, a
    set (S) of vertices is chosen and all edges in (S) are included; then, edges
    between (S) and the rest of the graph are included with probability
    (frac{1}{2}), while edges not touching (S) are allowed to vary arbitrarily.
    For this semi-random model, we show that the information-theoretic threshold
    for recovery is ( ilde{Theta}(sqrt{n})), in sharp contrast to the classical
    information-theoretic threshold of (Theta(log(n))). This matches the
    conjectured computational threshold for the classical planted clique problem,
    and thus raises the intriguing possibility that, once we require robustness,
    there is no computational-statistical gap for planted clique.




沪ICP备19023445号-2号
友情链接