IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Fri, 6 Jan 2017

    我爱机器学习(52ml.net)发表于 2017-01-06 00:00:00
    love 0

    Neural and Evolutionary Computing

    Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks

    Marwin H.S. Segler, Thierry Kogej, Christian Tyrchan, Mark P. Waller
    Comments: 17 pages, 17 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG); Chemical Physics (physics.chem-ph); Machine Learning (stat.ML)

    In de novo drug design, computational strategies are used to generate novel
    molecules with good affinity to the desired biological target. In this work, we
    show that recurrent neural networks can be trained as generative models for
    molecular structures, similar to statistical language models in natural
    language processing. We demonstrate that the properties of the generated
    molecules correlate very well with the properties of the molecules used to
    train the model. In order to enrich libraries with molecules active towards a
    given biological target, we propose to fine-tune the model with small sets of
    molecules, which are known to be active against that target.

    Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test
    molecules that medicinal chemists designed, whereas against Plasmodium
    falciparum (Malaria) it reproduced 28% of 1240 test molecules. When coupled
    with a scoring function, our model can perform the complete de novo drug design
    cycle to generate large sets of novel molecules for drug discovery.

    Subpopulation Diversity Based Selecting Migration Moment in Distributed Evolutionary Algorithms

    Chengjun Li, Jia Wu
    Comments: 17 pages, 9 figures
    Subjects: Neural and Evolutionary Computing (cs.NE)

    In distributed evolutionary algorithms, migration interval is used to decide
    migration moments. Nevertheless, migration moments predetermined by intervals
    cannot match the dynamic situation of evolution. In this paper, a scheme of
    setting the success rate of migration based on subpopulation diversity at each
    interval is proposed. With the scheme, migration still occurs at intervals, but
    the probability of immigrants entering the target subpopulation will be
    determined by the diversity of this subpopulation according to a proposed
    formula. An analysis shows that the time consumption of our scheme is
    acceptable. In our experiments, the basement of parallelism is an evolutionary
    algorithm for the traveling salesman problem. Under different value
    combinations of parameters for the formula, outcomes for eight benchmark
    instances of the distributed evolutionary algorithm with the proposed scheme
    are compared with those of a traditional one, respectively. Results show that
    the distributed evolutionary algorithm based on our scheme has a significant
    advantage on solutions especially for high difficulty instances. Moreover, it
    can be seen that the algorithm with the scheme has the most outstanding
    performance under three value combinations of above-mentioned parameters for
    the formula.

    A Review of Neural Network Based Machine Learning Approaches for Rotor Angle Stability Control

    Reza Yousefian, Sukumar Kamalasadan
    Subjects: Systems and Control (cs.SY); Neural and Evolutionary Computing (cs.NE)

    This paper reviews the current status and challenges of Neural Networks (NNs)
    based machine learning approaches for modern power grid stability control
    including their design and implementation methodologies. NNs are widely
    accepted as Artificial Intelligence (AI) approaches offering an alternative way
    to control complex and ill-defined problems. In this paper various application
    of NNs for power system rotor angle stabilization and control problem is
    discussed. The main focus of this paper is on the use of Reinforcement Learning
    (RL) and Supervised Learning (SL) algorithms in power system wide-area control
    (WAC). Generally, these algorithms due to their capability in modeling
    nonlinearities and uncertainties are used for transient classification,
    neuro-control, wide-area monitoring and control, renewable energy management
    and control, and so on. The works of researchers in the field of conventional
    and renewable energy systems are reported and categorized. Paper concludes by
    presenting, comparing and evaluating various learning techniques and
    infrastructure configurations based on efficiency.


    Computer Vision and Pattern Recognition

    Learning from Synthetic Humans

    Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael Black, Ivan Laptev, Cordelia Schmid
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Estimating human pose, shape, and motion from images and videos are
    fundamental challenges with many applications. Recent advances in 2D human pose
    estimation use large amounts of manually-labeled training data for learning
    convolutional neural networks (CNNs). Such data is time consuming to acquire
    and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion
    is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL
    tasks): a new large-scale dataset with synthetically-generated but realistic
    images of people rendered from 3D sequences of human motion capture data. We
    generate more than 6 million frames together with ground truth pose, depth
    maps, and segmentation masks. We show that CNNs trained on our synthetic
    dataset allow for accurate human depth estimation and human part segmentation
    in real RGB images. Our results and the new dataset open up new possibilities
    for advancing person analysis using cheap and large-scale synthetic data.

    Autoencoder Regularized Network For Driving Style Representation Learning

    Weishan Dong, Ting Yuan, Kai Yang, Changsheng Li, Shilei Zhang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we study learning generalized driving style representations
    from automobile GPS trip data. We propose a novel Autoencoder Regularized deep
    neural Network (ARNet) and a trip encoding framework trip2vec to learn drivers’
    driving styles directly from GPS records, by combining supervised and
    unsupervised feature learning in a unified architecture. Experiments on a
    challenging driver number estimation problem and the driver identification
    problem show that ARNet can learn a good generalized driving style
    representation: It significantly outperforms existing methods and alternative
    architectures by reaching the least estimation error on average (0.68, less
    than one driver) and the highest identification accuracy (by at least 3%
    improvement) compared with traditional supervised learning methods.

    The Dem@Care Experiments and Datasets: a Technical Report

    Anastasios Karakostas, Alexia Briassouli, Konstantinos Avgerinakis, Ioannis Kompatsiaris, Magda Tsolaki
    Comments: 4pages 2figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The objective of Dem@Care is the development of a complete system providing
    personal health services to people with dementia, as well as medical
    professionals and caregivers, by using a multitude of sensors, for
    context-aware, multi-parametric monitoring of lifestyle, ambient environment,
    and health parameters. Multi-sensor data analysis, combined with intelligent
    decision making mechanisms, will allow an accurate representation of the
    person’s current status and will provide the appropriate feedback, both to the
    person and the associated caregivers, enhancing the standard clinical workflow.
    Within the project framework, several data collection activities have taken
    place to assist technical development and evaluation tasks. In all these
    activities, particular attention has been paid to adhere to ethical guidelines
    and preserve the participants’ privacy. This technical report describes shorty
    the (a) the main objectives of the project, (b) the main ethical principles and
    (c) the datasets that have been already created.

    Overlapping Cover Local Regression Machines

    Mohamed Elhoseiny, Ahmed Elgammal
    Comments: Long Article with more experiments and analysis of conference paper “Overlapping Domain Cover for Scalable and Accurate Regression Kernel Machines”, presented orally 2015 at the British Machine Vision Conference 2015 (BMVC)
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    We present the Overlapping Domain Cover (ODC) notion for kernel machines, as
    a set of overlapping subsets of the data that covers the entire training set
    and optimized to be spatially cohesive as possible. We show how this notion
    benefit the speed of local kernel machines for regression in terms of both
    speed while achieving while minimizing the prediction error. We propose an
    efficient ODC framework, which is applicable to various regression models and
    in particular reduces the complexity of Twin Gaussian Processes (TGP)
    regression from cubic to quadratic. Our notion is also applicable to several
    kernel methods (e.g., Gaussian Process Regression(GPR) and IWTGP regression, as
    shown in our experiments). We also theoretically justified the idea behind our
    method to improve local prediction by the overlapping cover. We validated and
    analyzed our method on three benchmark human pose estimation datasets and
    interesting findings are discussed.


    Artificial Intelligence

    NeuroRule: A Connectionist Approach to Data Mining

    Hongjun Lu, Rudy Setiono, Huan Liu
    Comments: VLDB1995
    Subjects: Artificial Intelligence (cs.AI)

    Classification, which involves finding rules that partition a given data set
    into disjoint groups, is one class of data mining problems. Approaches proposed
    so far for mining classification rules for large databases are mainly decision
    tree based symbolic learning methods. The connectionist approach based on
    neural networks has been thought not well suited for data mining. One of the
    major reasons cited is that knowledge generated by neural networks is not
    explicitly represented in the form of rules suitable for verification or
    interpretation by humans. This paper examines this issue. With our newly
    developed algorithms, rules which are similar to, or more concise than those
    generated by the symbolic methods can be extracted from the neural networks.
    The data mining process using neural networks with the emphasis on rule
    extraction is described. Experimental results and comparison with previously
    published works are presented.

    Toward negotiable reinforcement learning: shifting priorities in Pareto optimal sequential decision-making

    Andrew Critch
    Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Learning (cs.LG)

    Existing multi-objective reinforcement learning (MORL) algorithms do not
    account for objectives that arise from players with differing beliefs.
    Concretely, consider two players with different beliefs and utility functions
    who may cooperate to build a machine that takes actions on their behalf. A
    representation is needed for how much the machine’s policy will prioritize each
    player’s interests over time. Assuming the players have reached common
    knowledge of their situation, this paper derives a recursion that any Pareto
    optimal policy must satisfy. Two qualitative observations can be made from the
    recursion: the machine must (1) use each player’s own beliefs in evaluating how
    well an action will serve that player’s utility function, and (2) shift the
    relative priority it assigns to each player’s expected utilities over time, by
    a factor proportional to how well that player’s beliefs predict the machine’s
    inputs. Observation (2) represents a substantial divergence from na”{i}ve
    linear utility aggregation (as in Harsanyi’s utilitarian theorem, and existing
    MORL algorithms), which is shown here to be inadequate for Pareto optimal
    sequential decision-making on behalf of players with different beliefs.

    Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks

    Marwin H.S. Segler, Thierry Kogej, Christian Tyrchan, Mark P. Waller
    Comments: 17 pages, 17 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG); Chemical Physics (physics.chem-ph); Machine Learning (stat.ML)

    In de novo drug design, computational strategies are used to generate novel
    molecules with good affinity to the desired biological target. In this work, we
    show that recurrent neural networks can be trained as generative models for
    molecular structures, similar to statistical language models in natural
    language processing. We demonstrate that the properties of the generated
    molecules correlate very well with the properties of the molecules used to
    train the model. In order to enrich libraries with molecules active towards a
    given biological target, we propose to fine-tune the model with small sets of
    molecules, which are known to be active against that target.

    Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test
    molecules that medicinal chemists designed, whereas against Plasmodium
    falciparum (Malaria) it reproduced 28% of 1240 test molecules. When coupled
    with a scoring function, our model can perform the complete de novo drug design
    cycle to generate large sets of novel molecules for drug discovery.


    Information Retrieval

    Exploration of Proximity Heuristics in Length Normalization

    Pranav Agrawal
    Comments: 7 pages
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Ranking functions used in information retrieval are primarily used in the
    search engines and they are often adopted for various language processing
    applications. However, features used in the construction of ranking functions
    should be analyzed before applying it on a data set. This paper gives
    guidelines on construction of generalized ranking functions with
    application-dependent features. The paper prescribes a specific case of a
    generalized function for recommendation system using feature engineering
    guidelines on the given data set. The behavior of both generalized and specific
    functions are studied and implemented on the unstructured textual data. The
    proximity feature based ranking function has outperformed by 52% from regular
    BM25.

    Outlier Detection for Text Data : An Extended Version

    Ramakrishnan Kannan, Hyenkyun Woo, Charu C. Aggarwal, Haesun Park
    Comments: Accepted at 2017 SIAM Data Mining Conference
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

    The problem of outlier detection is extremely challenging in many domains
    such as text, in which the attribute values are typically non-negative, and
    most values are zero. In such cases, it often becomes difficult to separate the
    outliers from the natural variations in the patterns in the underlying data. In
    this paper, we present a matrix factorization method, which is naturally able
    to distinguish the anomalies with the use of low rank approximations of the
    underlying data. Our iterative algorithm TONMF is based on block coordinate
    descent (BCD) framework. We define blocks over the term-document matrix such
    that the function becomes solvable. Given most recently updated values of other
    matrix blocks, we always update one block at a time to its optimal. Our
    approach has significant advantages over traditional methods for text outlier
    detection. Finally, we present experimental results illustrating the
    effectiveness of our method over competing methods.

    Temporal Effects on Hashtag Reuse in Twitter: A Cognitive-Inspired Hashtag Recommendation Approach

    Dominik Kowald, Subhash Pujari, Elisabeth Lex
    Comments: Accepted at WWW 2017
    Subjects: Information Retrieval (cs.IR)

    Hashtags have become a powerful tool in social platforms such as Twitter to
    categorize and search for content, and to spread short messages across members
    of the social network. In this paper, we study temporal hashtag usage practices
    in Twitter with the aim of designing a cognitive-inspired hashtag
    recommendation algorithm we call BLLi,s. Our main idea is to incorporate the
    effect of time on (i) individual hashtag reuse (i.e., reusing own hashtags),
    and (ii) social hashtag reuse (i.e., reusing hashtags, which has been
    previously used by a followee) into a predictive model. For this, we turn to
    the Base-Level Learning (BLL) equation from the cognitive architecture ACT-R,
    which accounts for the time-dependent decay of item exposure in human memory.
    We validate BLLi,s using two crawled Twitter datasets in two evaluation
    scenarios: firstly, only temporal usage patterns of past hashtag assignments
    are utilized and secondly, these patterns are combined with a content-based
    analysis of the current tweet. In both scenarios, we find not only that
    temporal effects play an important role for both individual and social hashtag
    reuse but also that BLLi,s provides significantly better prediction accuracy
    and ranking results than current state-of-the-art hashtag recommendation
    methods.

    A Probabilistic View of Neighborhood-based Recommendation Methods

    Jun Wang, Qiang Tang
    Comments: accepted by: ICDM 2016 – IEEE International Conference on Data Mining series (ICDM) workshop CLOUDMINE, 7 pages
    Subjects: Information Retrieval (cs.IR)

    Probabilistic graphic model is an elegant framework to compactly present
    complex real-world observations by modeling uncertainty and logical flow
    (conditionally independent factors). In this paper, we present a probabilistic
    framework of neighborhood-based recommendation methods (PNBM) in which
    similarity is regarded as an unobserved factor. Thus, PNBM leads the estimation
    of user preference to maximizing a posterior over similarity. We further
    introduce a novel multi-layer similarity descriptor which models and learns the
    joint influence of various features under PNBM, and name the new framework
    MPNBM. Empirical results on real-world datasets show that MPNBM allows very
    accurate estimation of user preferences.

    Adaptive Questionnaires for Direct Identification of Optimal Product Design

    Max Yi Ren, Clayton Scott
    Comments: submitted to Journal of Mechanical Design
    Subjects: Machine Learning (stat.ML); Information Retrieval (cs.IR)

    We consider the problem of identifying the most profitable product design
    from a finite set of candidates under unknown consumer preference. A standard
    approach to this problem follows a two-step strategy: First, estimate the
    preference of the consumer population, represented as a point in part-worth
    space, using an adaptive discrete-choice questionnaire. Second, integrate the
    estimated part-worth vector with engineering feasibility and cost models to
    determine the optimal design. In this work, we (1) demonstrate that accurate
    preference estimation is neither necessary nor sufficient for identifying the
    optimal design, (2) introduce a novel adaptive questionnaire that leverages
    knowledge about engineering feasibility and manufacturing costs to directly
    determine the optimal design, and (3) interpret product design in terms of a
    nonlinear segmentation of part-worth space, and use this interpretation to
    illuminate the intrinsic difficulty of optimal design in the presence of noisy
    questionnaire responses. We establish the superiority of the proposed approach
    using a well-documented optimal product design task. This study demonstrates
    how the identification of optimal product design can be accelerated by
    integrating marketing and manufacturing knowledge into the adaptive
    questionnaire.


    Computation and Language

    Textual Entailment with Structured Attentions and Composition

    Kai Zhao, Liang Huang, Mingbo Ma
    Subjects: Computation and Language (cs.CL)

    Deep learning techniques are increasingly popular in the textual entailment
    task, overcoming the fragility of traditional discrete models with hard
    alignments and logics. In particular, the recently proposed attention models
    (Rockt”aschel et al., 2015; Wang and Jiang, 2015) achieves state-of-the-art
    accuracy by computing soft word alignments between the premise and hypothesis
    sentences. However, there remains a major limitation: this line of work
    completely ignores syntax and recursion, which is helpful in many traditional
    efforts. We show that it is beneficial to extend the attention model to tree
    nodes between premise and hypothesis. More importantly, this subtree-level
    attention reveals information about entailment relation. We study the recursive
    composition of this subtree-level entailment relation, which can be viewed as a
    soft version of the Natural Logic framework (MacCartney and Manning, 2009).
    Experiments show that our structured attention and entailment composition model
    can correctly identify and infer entailment relations from the bottom up, and
    bring significant improvements in accuracy.

    Exploration of Proximity Heuristics in Length Normalization

    Pranav Agrawal
    Comments: 7 pages
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Ranking functions used in information retrieval are primarily used in the
    search engines and they are often adopted for various language processing
    applications. However, features used in the construction of ranking functions
    should be analyzed before applying it on a data set. This paper gives
    guidelines on construction of generalized ranking functions with
    application-dependent features. The paper prescribes a specific case of a
    generalized function for recommendation system using feature engineering
    guidelines on the given data set. The behavior of both generalized and specific
    functions are studied and implemented on the unstructured textual data. The
    proximity feature based ranking function has outperformed by 52% from regular
    BM25.


    Distributed, Parallel, and Cluster Computing

    GPU Multisplit

    Saman Ashkiani, Andrew Davidson, Ulrich Meyer, John D. Owens
    Comments: 46 pages, invited paper to ACM Transactions on Parallel Computing (TOPC): “Special Issue: invited papers from PPoPP 2016”. This is an extended version of PPoPP’16 paper “GPU Multisplit”
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Multisplit is a broadly useful parallel primitive that permutes its input
    data into contiguous buckets or bins, where the function that categorizes an
    element into a bucket is provided by the programmer. Due to the lack of an
    efficient multisplit on GPUs, programmers often choose to implement multisplit
    with a sort. One way is to first generate an auxiliary array of bucket IDs and
    then sort input data based on it. In case smaller indexed buckets possess
    smaller valued keys, another way for multisplit is to directly sort input data.
    Both methods are inefficient and require more work than necessary: the former
    requires more expensive data movements while the latter spends unnecessary
    effort in sorting elements within each bucket. In this work, we provide a
    parallel model and multiple implementations for the multisplit problem. Our
    principal focus is multisplit for a small (up to 256) number of buckets. We use
    warp-synchronous programming models and emphasize warp-wide communications to
    avoid branch divergence and reduce memory usage. We also hierarchically reorder
    input elements to achieve better coalescing of global memory accesses. On a
    GeForce GTX 1080 GPU, we can reach a peak throughput of 18.93 Gkeys/s (or 11.68
    Gpairs/s) for a key-only (or key-value) multisplit. Finally, we demonstrate how
    multisplit can be used as a building block for radix sort. In our
    multisplit-based sort implementation, we achieve comparable performance to the
    fastest GPU sort routines, sorting 32-bit keys (and key-value pairs) with a
    throughput of 3.0 G keys/s (and 2.1 Gpair/s).

    Gunrock: GPU Graph Analytics

    Yangzihao Wang, Yuechao Pan, Andrew Davidson, Yuduo Wu, Carl Yang, Leyuan Wang, Muhammad Osama, Chenshan Yuan, Weitang Liu, Andy T. Riffel, John D. Owens
    Comments: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP’16 paper “Gunrock: A High-Performance Graph Processing Library on the GPU”
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    For large-scale graph analytics on the GPU, the irregularity of data access
    and control flow, and the complexity of programming GPUs, have presented two
    significant challenges to developing a programmable high-performance graph
    library. “Gunrock”, our graph-processing system designed specifically for the
    GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
    operations on a vertex or edge frontier. Gunrock achieves a balance between
    performance and expressiveness by coupling high performance GPU computing
    primitives and optimization strategies with a high-level programming model that
    allows programmers to quickly develop new graph primitives with small code size
    and minimal GPU programming knowledge. We characterize the performance of
    various optimization strategies and evaluate Gunrock’s overall performance on
    different GPU architectures on a wide range of graph primitives that span from
    traversal-based algorithms and ranking algorithms, to triangle counting and
    bipartite-graph-based algorithms. The results show that on a single GPU,
    Gunrock has on average at least an order of magnitude speedup over Boost and
    PowerGraph, comparable performance to the fastest GPU hardwired primitives and
    CPU shared-memory graph libraries such as Ligra and Galois, and better
    performance than any other GPU high-level graph library.


    Learning

    Overlapping Cover Local Regression Machines

    Mohamed Elhoseiny, Ahmed Elgammal
    Comments: Long Article with more experiments and analysis of conference paper “Overlapping Domain Cover for Scalable and Accurate Regression Kernel Machines”, presented orally 2015 at the British Machine Vision Conference 2015 (BMVC)
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    We present the Overlapping Domain Cover (ODC) notion for kernel machines, as
    a set of overlapping subsets of the data that covers the entire training set
    and optimized to be spatially cohesive as possible. We show how this notion
    benefit the speed of local kernel machines for regression in terms of both
    speed while achieving while minimizing the prediction error. We propose an
    efficient ODC framework, which is applicable to various regression models and
    in particular reduces the complexity of Twin Gaussian Processes (TGP)
    regression from cubic to quadratic. Our notion is also applicable to several
    kernel methods (e.g., Gaussian Process Regression(GPR) and IWTGP regression, as
    shown in our experiments). We also theoretically justified the idea behind our
    method to improve local prediction by the overlapping cover. We validated and
    analyzed our method on three benchmark human pose estimation datasets and
    interesting findings are discussed.

    Signed Laplacian for spectral clustering revisited

    Andrew V. Knyazev
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)

    Classical spectral clustering is based on a spectral decomposition of a graph
    Laplacian, obtained from a graph adjacency matrix representing positive graph
    edge weights describing similarities of graph vertices. In signed graphs, the
    graph edge weights can be negative to describe disparities of graph vertices,
    for example, negative correlations in the data. Negative weights lead to
    possible negative spectrum of the standard graph Laplacian, which is cured by
    defining a signed Laplacian. We revisit comparing the standard and signed
    Laplacians and argue that the former is more natural than the latter, also
    showing that the negative spectrum is actually beneficial, for spectral
    clustering of signed graphs.

    Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks

    Marwin H.S. Segler, Thierry Kogej, Christian Tyrchan, Mark P. Waller
    Comments: 17 pages, 17 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG); Chemical Physics (physics.chem-ph); Machine Learning (stat.ML)

    In de novo drug design, computational strategies are used to generate novel
    molecules with good affinity to the desired biological target. In this work, we
    show that recurrent neural networks can be trained as generative models for
    molecular structures, similar to statistical language models in natural
    language processing. We demonstrate that the properties of the generated
    molecules correlate very well with the properties of the molecules used to
    train the model. In order to enrich libraries with molecules active towards a
    given biological target, we propose to fine-tune the model with small sets of
    molecules, which are known to be active against that target.

    Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test
    molecules that medicinal chemists designed, whereas against Plasmodium
    falciparum (Malaria) it reproduced 28% of 1240 test molecules. When coupled
    with a scoring function, our model can perform the complete de novo drug design
    cycle to generate large sets of novel molecules for drug discovery.

    Outlier Detection for Text Data : An Extended Version

    Ramakrishnan Kannan, Hyenkyun Woo, Charu C. Aggarwal, Haesun Park
    Comments: Accepted at 2017 SIAM Data Mining Conference
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

    The problem of outlier detection is extremely challenging in many domains
    such as text, in which the attribute values are typically non-negative, and
    most values are zero. In such cases, it often becomes difficult to separate the
    outliers from the natural variations in the patterns in the underlying data. In
    this paper, we present a matrix factorization method, which is naturally able
    to distinguish the anomalies with the use of low rank approximations of the
    underlying data. Our iterative algorithm TONMF is based on block coordinate
    descent (BCD) framework. We define blocks over the term-document matrix such
    that the function becomes solvable. Given most recently updated values of other
    matrix blocks, we always update one block at a time to its optimal. Our
    approach has significant advantages over traditional methods for text outlier
    detection. Finally, we present experimental results illustrating the
    effectiveness of our method over competing methods.

    Toward negotiable reinforcement learning: shifting priorities in Pareto optimal sequential decision-making

    Andrew Critch
    Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Learning (cs.LG)

    Existing multi-objective reinforcement learning (MORL) algorithms do not
    account for objectives that arise from players with differing beliefs.
    Concretely, consider two players with different beliefs and utility functions
    who may cooperate to build a machine that takes actions on their behalf. A
    representation is needed for how much the machine’s policy will prioritize each
    player’s interests over time. Assuming the players have reached common
    knowledge of their situation, this paper derives a recursion that any Pareto
    optimal policy must satisfy. Two qualitative observations can be made from the
    recursion: the machine must (1) use each player’s own beliefs in evaluating how
    well an action will serve that player’s utility function, and (2) shift the
    relative priority it assigns to each player’s expected utilities over time, by
    a factor proportional to how well that player’s beliefs predict the machine’s
    inputs. Observation (2) represents a substantial divergence from na”{i}ve
    linear utility aggregation (as in Harsanyi’s utilitarian theorem, and existing
    MORL algorithms), which is shown here to be inadequate for Pareto optimal
    sequential decision-making on behalf of players with different beliefs.

    OpenML: An R Package to Connect to the Networked Machine Learning Platform OpenML

    Giuseppe Casalicchio, Jakob Bossek, Michel Lang, Dominik Kirchhoff, Pascal Kerschke, Benjamin Hofner, Heidi Seibold, Joaquin Vanschoren, Bernd Bischl
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    OpenML is an online machine learning platform where researchers can easily
    share data, machine learning tasks and experiments as well as organize them
    online to work and collaborate more efficiently. In this paper, we present an R
    package to interface with the OpenML platform and illustrate its usage in
    combination with the machine learning R package mlr. We show how the OpenML
    package allows R users to easily search, download and upload data sets and
    machine learning tasks. Furthermore, we also show how to upload results of
    experiments, share them with others and download results from other users.
    Beyond ensuring reproducibility of results, the OpenML platform automates much
    of the drudge work, speeds up research, facilitates collaboration and increases
    the users’ visibility online.


    Information Theory

    Higher Order Context Transformations

    Michal Vašinek, Jan Platoš
    Subjects: Information Theory (cs.IT)

    The context transformation and generalized context transformation methods, we
    introduced recently, were able to reduce zero order entropy by exchanging
    digrams, and as a consequence, they were removing mutual information between
    consecutive symbols of the input message. These transformations were intended
    to be used as a preprocessor for zero-order entropy coding algorithms like
    Arithmetic or Huffman coding, since we know, that especially Arithmetic coding
    can achieve a compression rate almost of the size of Shannon’s entropy.

    This paper introduces a novel algorithm based on the concept of generalized
    context transformation, that allows transformation of words longer than simple
    digrams. The higher order contexts are exploited using recursive form of a
    generalized context transformation. It is shown that the zero order entropy of
    transformed data drops significantly, but on the other hand, the overhead given
    by a description of individual transformations increases and it has become a
    limiting factor in a successful transformation of smaller files.

    Twisted Reed–Solomon Codes

    Peter Beelen, Sven Puchinger, Johan Rosenkilde né Nielsen
    Comments: 5 pages, submitted to IEEE International Symposium on Information Theory 2017
    Subjects: Information Theory (cs.IT)

    We present a new general construction of MDS codes over a finite field
    (mathbb{F}_q). We describe two explicit subclasses which contain new MDS codes
    of length at least (q/2) for all values of (q ge 11). Moreover, we show that
    most of the new codes are not equivalent to a Reed–Solomon code.

    Energy-Efficient Transceiver Design for Hybrid Sub-Array Architecture MIMO Systems

    Shiwen He, Chenhao Qi, Yongpen Wu, Yongming Huang
    Comments: 8
    Subjects: Information Theory (cs.IT)

    Millimeter-wave (mmWave) communication operated in frequency bands between 30
    and 300 GHz has attracted extensive attention due to the potential ability of
    offering orders of magnitude greater bandwidths combined with further gains via
    beamforming and spatial multiplexing from multi-element antenna arrays. MmWave
    system may exploit the hybrid analog and digital precoding to achieve
    simultaneously the diversity, array and multiplexing gain with a lower cost of
    implementation. Motivated by this, in this paper, we investigate the design of
    hybrid precoder and combiner with sub-connected architecture, where each radio
    frequency chain is connected to only a subset of base station (BS) antennas
    from the perspective of energy efficient transmission. The problem of interest
    is a non-convex and NP-hard problem that is difficult to solve directly. In
    order to address it, we resort to design a two-layer optimization method to
    solve the problem of interest by exploiting jointly the interference alignment
    and fractional programming. First, the analog precoder and combiner are
    optimized via the alternating-direction optimization method (ADOM) where the
    phase shifter can be easily adjusted with an analytical structure. Then, we
    optimize the digital precoder and combiner based on an effective multiple-input
    multiple-output (MIMO) channel coefficient. The convergence of the proposed
    algorithms is proved using the monotonic boundary theorem and fractional
    programming theory. Extensive simulation results are given to validate the
    effectiveness of the presented method and to evaluate the energy efficiency
    performance under various system configurations.

    Several classes of optimal ternary cyclic codes

    Lanqiang Li, Li Liu, Shixin Zhu
    Subjects: Information Theory (cs.IT)

    Cyclic codes have efficient encoding and decoding algorithms over finite
    fields, so that they have practical applications in communication systems,
    consumer electronics and data storage systems. The objective of this paper is
    to give eight new classes of optimal ternary cyclic codes with parameters
    ([3^m-1,3^m-1-2m,4]), according to a result on the non-existence of solutions
    to a certain equation over (F_{3^m}). It is worth noticing that some recent
    conclusions on such optimal ternary cyclic codes are some special cases of our
    work. More importantly, three of the nine open problems proposed by Ding and
    Helleseth in [8] are solved completely. In addition, another one among the nine
    open problems is also promoted.

    Downlink Coverage Analysis for a Finite 3D Wireless Network of Unmanned Aerial Vehicles

    Vishnu Vardhan Chetlur, Harpreet S. Dhillon
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    In this paper, we consider a finite network of unmanned aerial vehicles
    (UAVs) serving a given region. Modeling this network as a uniform binomial
    point process (BPP), we derive the downlink coverage probability of a reference
    receiver located at an arbitrary position on the ground assuming Nakagami-(m)
    fading for all wireless links. The reference receiver is assumed to connect to
    its closest transmitting node as is usually the case in cellular systems. After
    deriving the distribution of distances from the reference receiver to the
    serving and interfering nodes, we derive an exact expression for downlink
    coverage probability in terms of the derivative of Laplace transform of
    interference power distribution. In the downlink of this system, it is not
    unusual to encounter scenarios in which the line-of-sight (LOS) component is
    significantly stronger than the reflected multipath components. To emulate such
    scenarios, we also derive the coverage probability in the absence of fading
    from the results of Nakagami-(m) fading by taking the limit (m o infty).
    Using asymptotic expansion of incomplete gamma function, we concretely show
    that this limit reduces to a redundant condition. Consequently, we derive an
    accurate coverage probability approximation for this case using dominant
    interferer-based approach in which the effect of dominant interferer is exactly
    captured and the residual interference from other interferers is carefully
    approximated. We then derive the bounds of the approximate coverage probability
    using Berry-Esseen theorem. Our analyses reveal several useful trends in
    coverage probability as a function of height of the transmitting nodes and the
    location of reference receiver on the ground.

    Random Subsets of Structured Deterministic Frames have MANOVA Spectra

    Marina Haikin, Ram Zamir, Matan Gavish
    Subjects: Information Theory (cs.IT)

    We draw a random subset of (k) rows from a frame with (n) rows (vectors) and
    (m) columns (dimensions), where (k) and (m) are proportional to (n). For a
    variety of important deterministic equiangular tight frames (ETFs) and tight
    non-ETF frames, we consider the distribution of singular values of the
    (k)-subset matrix. We observe that for large (n) they can be precisely
    described by a known probability distribution — Wachter’s MANOVA spectral
    distribution, a phenomenon that was previously known only for two types of
    random frames. In terms of convergence to this limit, the (k)-subset matrix
    from all these frames is shown to behave exactly like the classical MANOVA
    (Jacobi) random matrix ensemble. Thus empirically the MANOVA ensemble offers a
    universal description of spectra of randomly selected (k)-subframes, even those
    taken from deterministic frames. The same universality phenomena is shown to
    hold for notable random frames as well. This description enables exact
    calculations of properties of solutions for systems of linear equations based
    on a random choice of (k) frame vectors out of (n) possible vectors, and has a
    variety of implications for erasure coding, compressed sensing, and sparse
    recovery. Our results are empirical, but they are exhaustive, precise and fully
    reproducible.

    Low Complexity Receiver for Uplink SCMA System via Expectation Propagation

    Xiangming Meng, Yiqun Wu, Yan Chen, Meng Cheng
    Comments: 5 pages, 5 figures, accepted by IEEE WCNC 2017
    Subjects: Information Theory (cs.IT)

    Sparse code multiple access (SCMA) scheme is considered to be one promising
    non-orthogonal multiple access technology for the future fifth generation (5G)
    communications. Due to the sparse nature, message passing algorithm (MPA) has
    been used as the receiver to achieve close to maximum likelihood (ML) detection
    performance with much lower complexity. However, the complexity order of MPA is
    still exponential with the size of codebook and the degree of signal
    superposition on a given resource element. In this paper, we propose a novel
    low complexity iterative receiver based on expectation propagation algorithm
    (EPA), which reduces the complexity order from exponential to linear.
    Simulation results demonstrate that the proposed EPA receiver achieves nearly
    the same block error rate (BLER) performance as the conventional message
    passing algorithm (MPA) receiver with orders less complexity.

    Efficient Multi-Dimensional Mapping Using QAM Constellations for BICM-ID

    Hassan M. Navazi, Md. Jahangir Hossain
    Subjects: Information Theory (cs.IT)

    Bit-interleaved coded modulation with iterative decoding (BICM-ID) offers
    very good error performance over additive white Gaussian noise (AWGN) and
    fading channels if it uses a wisely designed signal mapping. Further, error
    performance of BICM-ID can significantly be improved by employing
    multi-dimensional (MD) modulation. However, suitable MD mappings are obtained
    by computer search techniques except for MD modulations that use smaller
    constellation e.g., binary phase shift keying (BPSK), quadrature phase shift
    keying (QPSK) and (8)-ary phase shift keying ((8)-PSK) as basic modulation. The
    alphabet size of MD modulations increases exponentially as the order of the
    basic modulation increases and computer search becomes intractable. In this
    paper, we propose a systematic mapping method for MD modulations. The
    innovativeness of our proposed method is that it generates MD mappings using
    (16)- and (64)-quadrature amplitude modulation (QAM) very efficiently. The
    presented numerical results show that the proposed method improves bit error
    rate (BER) of BICM-ID.

    Effect of Cell-Selection on the Effective Fading Distribution in a Downlink (K)-tier HetNet

    Mustafa A. Kishk, Harpreet S. Dhillon
    Subjects: Information Theory (cs.IT); Probability (math.PR)

    This paper characterizes the statistics of effective fading gain in
    multi-tier cellular networks with strongest base station (BS) cell association
    policy. First, we derive the probability of association with the (n)-th nearest
    BS in the (k)-th tier. Next, we use this result to derive the probability
    density function (PDF) of the channel fading gain (effective fading)
    experienced by the user when associating with the strongest BS. Interestingly,
    our results show that the effective channel gain distribution solely depends
    upon the original channel fading and the path-loss exponent. Moreover, we show
    that in the case of Nakagami-(m) fading channels (Gamma distribution), the
    distribution of the effective fading is also Gamma but with a gain of
    (frac{alpha}{2}) in the shape parameter, where (alpha) is the path-loss
    exponent.

    The World's First Real-Time Testbed for Massive MIMO: Design, Implementation, and Validation

    Steffen Malkowsky, Joao Vieira, Liang Liu, Paul Harris, Karl Nieman, Nikhil Kundargi, Ian Wong, Fredrik Tufvesson, Viktor Öwall, Ove Edfors
    Comments: 14 pages, submitted to IEEE Access
    Subjects: Information Theory (cs.IT)

    This paper sets up a framework for designing a massive multiple-input
    multiple-output (MIMO) testbed by investigating hardware and system-level
    requirements such as processing complexity, duplexing mode and frame structure.
    Taking these into account, a generic system and processing partitioning is
    proposed which allows flexible scaling and processing distribution onto a
    multitude of physically separated devices. Based on the given hardware
    constraints such as maximum number of links and maximum throughput for
    peer-to-peer interconnections combined with processing capabilities, the
    framework allows to evaluate available off-the-shelf hardware components. To
    verify our design approach, we present the Lund University Massive MIMO
    (LuMaMi) testbed which constitutes the first reconfigurable real-time hardware
    platform for prototyping massive MIMO. Utilizing up to 100 base station
    antennas and more than 50 field-programmable gate arrays (FPGAs), up to 12 user
    equipments are served on the same time/frequency resource using an LTE-like
    OFDM TDD-based transmission scheme. Field trials with this system show that
    massive MIMO can spatially separate a multitude of users in a static indoor and
    static/dynamic outdoor environment.

    Adaptive Real-Time Software Defined MIMO Visible Light Communications using Spatial Multiplexing and Spatial Diversity

    Peng Deng, Mohsen Kavehrad
    Comments: 6 pages, 7 figures, Accept by WiSEE_2016. arXiv admin note: substantial text overlap with arXiv:1612.05402
    Subjects: Information Theory (cs.IT)

    In this paper, we experimentally demonstrate a real-time software defined
    multiple input multiple output (MIMO) visible light communication (VLC) system
    employing link adaptation of spatial multiplexing and spatial diversity.
    Real-time MIMO signal processing is implemented by using the Field Programmable
    Gate Array (FPGA) based Universal Software Radio Peripheral (USRP) devices.
    Software defined implantation of MIMO VLC can assist in enabling an adaptive
    and reconfigurable communication system without hardware changes. We measured
    the error vector magnitude (EVM), bit error rate (BER) and spectral efficiency
    performance for single carrier M-QAM MIMO VLC using spatial diversity and
    spatial multiplexing. Results show that spatial diversity MIMO VLC improves
    error performance at the cost of spectral efficiency that spatial multiplexing
    should enhance. We propose the adaptive MIMO solution that both modulation
    schema and MIMO schema are dynamically adapted to the changing channel
    conditions for enhancing the error performance and spectral efficiency. The
    average error-free spectral efficiency of adaptive 2×2 MIMO VLC achieved 12
    b/s/Hz over 2 meters indoor dynamic transmission.

    Perfect Sequences and Arrays over the Unit Quaternions

    Sam Blake
    Subjects: Information Theory (cs.IT)

    We introduce several new constructions for perfect periodic autocorrelation
    sequences and arrays over the unit quaternions. This paper uses both
    mathematical proofs and com- puter experiments to prove the (bounded) array
    constructions have perfect periodic auto- correlation. Furthermore, the first
    sequence construction generates odd-perfect sequences of unbounded lengths,
    with good ZCZ.

    Compressive Sensing Based Detection With Multimodal-Dependent Data

    Thakshila Wimalajeewa, Pramod K. Varshney
    Subjects: Applications (stat.AP); Information Theory (cs.IT)

    Detection with high dimensional multimodal data is a challenging problem when
    there are complex inter- and intra- modal dependencies. While several
    approaches have been proposed for dependent data fusion (e.g., based on copula
    theory), their advantages come at a high price in terms of computational
    complexity. To overcome computational difficulties, in this paper, we treat the
    detection problem in a compressed domain where compression is achieved via low
    dimensional random projections as considered in compressive sensing (CS). While
    CS has recently been exploited to solve detection problems under various
    assumptions on the signals of interest, its potential for dependent data fusion
    has not been explored adequately. We propose two approaches for detection when
    the uncompressed data is dependent. First, Gaussian approximation is employed
    to perform likelihood ratio (LR) based detection in the compressed domain. We
    show that, under certain conditions, LR based detection with a small number of
    compressed measurements leads to enhanced performance compared to detection
    with (uncompressed) high dimensional data using widely considered suboptimal
    approaches in the literature. Second, we explore the potential of CS to capture
    second order statistics of the uncompressed dependent data to construct a
    decision statistic in the compressed domain. While the second approach is quite
    computationally complex compared to the first approach, its use is promising
    over the first approach when multimodal data is dependent and highly
    correlated. Further, in both approaches, complete signal reconstruction is not
    necessary to make a reliable detection decision.

    A Hybrid Energy Sharing Framework for Green Cellular Networks

    Muhammad Junaid Farooq, Hakim Ghazzai, Abdullah Kadri, Hesham ElSawy, Mohamed-Slim Alouini
    Comments: 16 pages 8 figures in IEEE Transactions on Communications 2017
    Subjects: Networking and Internet Architecture (cs.NI); Computational Geometry (cs.CG); Information Theory (cs.IT)

    Cellular operators are increasingly turning towards renewable energy (RE) as
    an alternative to using traditional electricity in order to reduce operational
    expenditure and carbon footprint. Due to the randomness in both RE generation
    and mobile traffic at each base station (BS), a surplus or shortfall of energy
    may occur at any given time. To increase energy self-reliance and minimize the
    network’s energy cost, the operator needs to efficiently exploit the RE
    generated across all BSs. In this paper, a hybrid energy sharing framework for
    cellular network is proposed, where a combination of physical power lines and
    energy trading with other BSs using smart grid is used. Algorithms for physical
    power lines deployment between BSs, based on average and complete statistics of
    the net RE available, are developed. Afterwards, an energy management framework
    is formulated to optimally determine the quantities of electricity and RE to be
    procured and exchanged among BSs, respectively, while considering battery
    capacities and real-time energy pricing. Three cases are investigated where RE
    generation is unknown, perfectly known, and partially known ahead of time.
    Results investigate the time varying energy management of BSs and demonstrate
    considerable reduction in average energy cost thanks to the hybrid energy
    sharing scheme.

    A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers

    Yong Sheng Soh, Venkat Chandrasekaran
    Comments: 48 pages, 9 figures
    Subjects: Optimization and Control (math.OC); Information Theory (cs.IT); Machine Learning (stat.ML)

    Regularization techniques are widely employed in optimization-based
    approaches for solving ill-posed inverse problems in data analysis and
    scientific computing. These methods are based on augmenting the objective with
    a penalty function, which is specified based on prior domain-specific expertise
    to induce a desired structure in the solution. We consider the problem of
    learning suitable regularization functions from data in settings in which
    precise domain knowledge is not directly available. Previous work under the
    title of `dictionary learning’ or `sparse coding’ may be viewed as learning a
    regularization function that can be computed via linear programming. We
    describe generalizations of these methods to learn regularizers that can be
    computed and optimized via semidefinite programming. Our framework for learning
    such semidefinite regularizers is based on obtaining structured factorizations
    of data matrices, and our algorithmic approach for computing these
    factorizations combines recent techniques for rank minimization problems along
    with an operator analog of Sinkhorn scaling. Under suitable conditions on the
    input data, our algorithm provides a locally linearly convergent method for
    identifying the correct regularizer that promotes the type of structure
    contained in the data. Our analysis is based on the stability properties of
    Operator Sinkhorn scaling and their relation to geometric aspects of
    determinantal varieties (in particular tangent spaces with respect to these
    varieties). The regularizers obtained using our framework can be employed
    effectively in semidefinite programming relaxations for solving inverse
    problems.




沪ICP备19023445号-2号
友情链接