
    arXiv Paper Daily: Thu, 5 Jan 2017

    Published by 我爱机器学习 (52ml.net) on 2017-01-05 00:00:00

    Neural and Evolutionary Computing

    PlatEMO: A MATLAB Platform for Evolutionary Multi-Objective Optimization

    Ye Tian, Ran Cheng, Xingyi Zhang, Yaochu Jin
    Comments: 20 pages, 12 figures, 4 tables
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Over the last three decades, a large number of evolutionary algorithms have
    been developed for solving multi-objective optimization problems. However, an
    up-to-date and comprehensive software platform on which researchers can
    properly benchmark existing algorithms, and practitioners can apply selected
    algorithms to their real-world problems, has been lacking. The demand for
    such a common tool becomes even more urgent given that the source code of
    many proposed algorithms has not been made publicly available. To address
    these issues, this paper presents PlatEMO, a MATLAB platform for evolutionary
    multi-objective optimization, which includes more than 50 multi-objective
    evolutionary algorithms and more than 100 multi-objective test problems,
    along with several widely used performance indicators. With its user-friendly
    graphical user interface, PlatEMO enables users to easily compare several
    evolutionary algorithms at once and to collect statistical results in Excel
    or LaTeX files. More importantly, PlatEMO is completely open source, so users
    can develop new algorithms on top of it. This paper introduces the main
    features of PlatEMO and illustrates how to use it for performing comparative
    experiments, embedding new algorithms, creating new test problems, and
    developing performance indicators. The source code of PlatEMO is available
    at: this http URL

    Demystifying Neural Style Transfer

    Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Neural style transfer has recently produced very exciting results that have
    attracted attention in both academia and industry. Despite these impressive
    results, the principle of neural style transfer, and especially why Gram
    matrices can represent style, remains unclear. In this paper, we propose a
    novel interpretation of neural style transfer by treating it as a domain
    adaptation problem. Specifically, we show theoretically that matching the
    Gram matrices of feature maps is equivalent to minimizing the Maximum Mean
    Discrepancy (MMD) with a second-order polynomial kernel. Thus, we argue that
    the essence of neural style transfer is to match the feature distributions
    between the style images and the generated images. To further support our
    standpoint, we experiment with several other distribution alignment methods
    and achieve appealing results. We believe this novel interpretation connects
    these two important research fields and could inspire future research.
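
    To make the claimed equivalence concrete, here is a minimal NumPy sketch
    (ours, not the authors' code) checking that the squared Frobenius distance
    between Gram matrices equals the unnormalized squared MMD under the kernel
    k(x, y) = (x^T y)^2:

        import numpy as np

        def gram_loss(F_s, F_g):
            # Squared Frobenius distance between the Gram matrices of two
            # feature maps, each of shape (channels, positions).
            G_s, G_g = F_s @ F_s.T, F_g @ F_g.T
            return np.sum((G_s - G_g) ** 2)

        def mmd2_poly(F_s, F_g):
            # Unnormalized squared MMD between the per-position feature
            # vectors (columns) of F_s and F_g with k(x, y) = (x.T y)^2.
            k = lambda A, B: ((A.T @ B) ** 2).sum()
            return k(F_s, F_s) + k(F_g, F_g) - 2 * k(F_s, F_g)

        rng = np.random.default_rng(0)
        F_s, F_g = rng.normal(size=(8, 32)), rng.normal(size=(8, 32))
        assert np.isclose(gram_loss(F_s, F_g), mmd2_poly(F_s, F_g))

    The normalization factors of the standard MMD estimator are dropped here,
    which is why the identity holds exactly rather than up to a constant.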


    Computer Vision and Pattern Recognition

    SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

    Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O'Connor, Jordi Torres, Elisa Sayrol, Xavier Giro-i-Nieto
    Comments: Our results can be reproduced with the source code and trained models available at this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce SalGAN, a deep convolutional neural network for visual saliency
    prediction trained with adversarial examples. The first stage of the network
    consists of a generator model whose weights are learned by back-propagation
    computed from a binary cross entropy (BCE) loss over downsampled versions of
    the saliency maps. The resulting prediction is processed by a discriminator
    network trained to solve a binary classification task between the saliency maps
    generated by the generative stage and the ground truth ones. Our experiments
    show how adversarial training allows reaching state-of-the-art performance
    across different metrics when combined with a widely-used loss function like
    BCE.
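
    As a rough illustration of the two-stage training described above, here is a
    toy PyTorch sketch; the tiny networks and the loss weight alpha below are
    our assumptions (the real SalGAN uses a VGG-based encoder-decoder):

        import torch
        import torch.nn as nn

        # Stand-ins: G maps an image to a saliency map, D scores an
        # (image, map) pair as real (ground truth) or fake (generated).
        G = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid())
        D = nn.Sequential(nn.Conv2d(4, 8, 3, stride=2, padding=1), nn.ReLU(),
                          nn.Flatten(), nn.Linear(8 * 32 * 32, 1), nn.Sigmoid())
        bce = nn.BCELoss()
        opt_g = torch.optim.Adam(G.parameters(), lr=3e-4)
        opt_d = torch.optim.Adam(D.parameters(), lr=3e-4)
        alpha = 0.05                      # weight of the content (BCE) term

        img = torch.rand(4, 3, 64, 64)    # dummy batch of images
        gt = torch.rand(4, 1, 64, 64)     # dummy ground-truth saliency maps

        # Discriminator step: real pairs vs. pairs with predicted maps.
        pred = G(img).detach()
        d_loss = bce(D(torch.cat([img, gt], 1)), torch.ones(4, 1)) + \
                 bce(D(torch.cat([img, pred], 1)), torch.zeros(4, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # Generator step: fool D while staying close to the ground truth.
        pred = G(img)
        g_loss = alpha * bce(pred, gt) + \
                 bce(D(torch.cat([img, pred], 1)), torch.ones(4, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()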

    Transforming Sensor Data to the Image Domain for Deep Learning – an Application to Footstep Detection

    Monit Shah Singh, Vinaychandran Pondenkandath, Bo Zhou, Paul Lukowicz, Marcus Liwicki
    Comments: 8 pages, 8 figures, under review
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Convolutional Neural Networks (CNNs) have become the state of the art in
    various computer vision tasks, but their use is still premature for most
    sensor data, especially in pervasive and wearable computing. A major reason
    for this is the limited amount of annotated training data. In this paper, we
    propose leveraging the discriminative power of pre-trained deep CNNs on
    2-dimensional sensor data by transforming the sensor modality to the visual
    domain. Using three proposed strategies, the 2D sensor output is converted
    into pressure-distribution images. We then utilize a pre-trained CNN for
    transfer learning on the converted imagery data. We evaluate our method on a
    gait dataset of floor-surface pressure mappings, obtaining a classification
    accuracy of 87.66%, which outperforms conventional machine learning methods
    by over 10%.
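
    A minimal sketch of one such conversion-plus-transfer strategy (our
    illustration; the paper's three conversion strategies are more specific than
    this simple min-max normalization):

        import numpy as np
        import torch
        import torchvision

        # Normalize a 2D pressure frame, replicate it to 3 channels, and
        # resize it so a pre-trained ImageNet CNN can serve as a fixed
        # feature extractor for a downstream classifier.
        pressure = np.random.rand(32, 32).astype(np.float32)  # dummy frame
        norm = (pressure - pressure.min()) / \
               (pressure.max() - pressure.min() + 1e-8)
        img = torch.from_numpy(norm)[None, None]              # 1 x 1 x H x W
        img = torch.nn.functional.interpolate(
            img, size=(224, 224), mode='bilinear', align_corners=False)
        img = img.repeat(1, 3, 1, 1)                          # fake RGB

        cnn = torchvision.models.resnet18(weights="IMAGENET1K_V1")
        cnn.fc = torch.nn.Identity()   # drop the classifier, keep features
        cnn.eval()
        with torch.no_grad():
            features = cnn(img)        # 1 x 512 descriptor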

    Demystifying Neural Style Transfer

    Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Neural style transfer has recently produced very exciting results that have
    attracted attention in both academia and industry. Despite these impressive
    results, the principle of neural style transfer, and especially why Gram
    matrices can represent style, remains unclear. In this paper, we propose a
    novel interpretation of neural style transfer by treating it as a domain
    adaptation problem. Specifically, we show theoretically that matching the
    Gram matrices of feature maps is equivalent to minimizing the Maximum Mean
    Discrepancy (MMD) with a second-order polynomial kernel. Thus, we argue that
    the essence of neural style transfer is to match the feature distributions
    between the style images and the generated images. To further support our
    standpoint, we experiment with several other distribution alignment methods
    and achieve appealing results. We believe this novel interpretation connects
    these two important research fields and could inspire future research.

    Path-following based Point Matching using Similarity Transformation

    Wei Lian
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    To address the problem of 3D point matching where the poses of two point sets
    are unknown, we adapt a recently proposed path following based method to use
    similarity transformation instead of the original affine transformation. The
    reduced number of transformation parameters leads to more constrained and
    desirable matching results. Experimental results demonstrate better robustness
    of the proposed method over state-of-the-art methods.

    An Evaluation Framework and Database for MoCap-Based Gait Recognition Methods

    Michal Balazia, Petr Sojka
    Comments: 13 pages. Full paper published at the 1st IAPR Workshop on Reproducible Research in Pattern Recognition, Cancun, Mexico, December 2016. Postprint. arXiv admin note: text overlap with arXiv:1609.04392, arXiv:1609.06936
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    As a contribution to reproducible research, this paper presents a framework
    and a database to improve the development, evaluation and comparison of
    methods for gait recognition from motion capture (MoCap) data. The
    evaluation framework provides implementation details and source code for
    state-of-the-art human-interpretable geometric features, as well as for our
    own approaches in which gait features are learned by a modification of
    Fisher’s Linear Discriminant Analysis with the Maximum Margin Criterion, and
    by a combination of Principal Component Analysis and Linear Discriminant
    Analysis. It includes a description and source code of a mechanism for
    evaluating four class-separability coefficients of the feature space and
    four rank-based classifier performance metrics. The framework also contains
    a tool for learning a custom classifier and for classifying a custom query
    against a custom gallery. We provide an experimental database along with
    source code for its extraction from the general CMU MoCap database.
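
    For instance, the PCA-plus-LDA feature-learning pipeline mentioned above can
    be sketched with scikit-learn as follows (dummy data; the dimensions are
    illustrative, not the framework's settings):

        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.decomposition import PCA
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.neighbors import KNeighborsClassifier

        # Stand-in for MoCap gait descriptors: 200 samples of 60-D
        # joint-trajectory features from 10 walkers (labels 0..9).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 60))
        y = rng.integers(0, 10, size=200)

        # PCA removes noise/null-space dimensions, LDA then learns a
        # low-dimensional space maximizing class separability, and a 1-NN
        # classifier acts as a simple rank-1 recognizer.
        model = make_pipeline(PCA(n_components=30),
                              LinearDiscriminantAnalysis(n_components=9),
                              KNeighborsClassifier(n_neighbors=1))
        model.fit(X, y)
        print("rank-1 accuracy on the gallery:", model.score(X, y))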

    A Concave Optimization Algorithm for Matching Partially Overlapping Point Sets

    Wei Lian, Lei Zhang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Point matching refers to the process of finding a spatial transformation and
    correspondences between two sets of points. In this paper, we focus on the
    case where there is only partial overlap between the two point sets.
    Following the approach of the robust point matching method, we model point
    matching as a mixed linear assignment-least-squares problem and show that,
    after eliminating the transformation variable, the resulting minimization
    problem with respect to point correspondences is a concave optimization
    problem. Furthermore, this problem has the property that its objective
    function can be converted into a form with few nonlinear terms via a linear
    transformation. Based on these properties, we employ the branch-and-bound
    (BnB) algorithm to optimize the resulting problem, where the dimension of
    the search space is small. To further improve the efficiency of the BnB
    algorithm, whose bottleneck is the computation of the lower bound, we
    propose a new lower-bounding scheme that has a k-cardinality linear
    assignment formulation and can be solved efficiently. Experimental results
    show that the proposed algorithm outperforms state-of-the-art methods in
    terms of robustness to disturbances and point matching accuracy.

    Automated Blood Vessel Segmentation of Fundus Images Based on Region Features and Hierarchical Growth Algorithm

    Zhun Fan, Jiewei Lu, Wenji Li
    Comments: 10 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Automated blood vessel segmentation plays an important role in the diagnosis
    and treatment of various cardiovascular and ophthalmologic diseases. In this
    paper, a novel unsupervised segmentation algorithm is proposed to extract
    blood vessels from fundus images. First, an enhanced vessel image is
    generated by morphological reconstruction and then divided into three parts:
    preliminary vessel regions, background regions and undetermined regions.
    Secondly, a list of region features of blood vessels is defined and used to
    remove the noise regions from the preliminary vessel regions and to extract
    the skeletons of blood vessels. An intermediate vessel image is obtained by
    combining the denoised preliminary vessel regions with the vessel skeletons.
    After that, a novel hierarchical growth algorithm, namely HGA, is proposed
    to label each pixel in the undetermined regions as vessel or not in an
    incremental way, taking advantage of color and spatial information from the
    intermediate vessel image and the background regions. Finally,
    postprocessing is performed to remove the non-vessel regions. The proposed
    algorithm has low computational complexity and outperforms many other
    state-of-the-art supervised and unsupervised methods in terms of accuracy.
    It achieves a vessel segmentation accuracy of 96.0%, 95.8% and 95.1% in an
    average time of 10.72s, 15.74s and 50.71s on images from the three publicly
    available retinal image datasets DRIVE, STARE and CHASE DB1, respectively.
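
    A hypothetical scikit-image sketch of the first stage (vessel enhancement by
    morphological reconstruction and the three-way split); the thresholds and
    structuring element are illustrative assumptions, not the paper's values:

        import numpy as np
        from skimage.morphology import reconstruction, erosion, disk

        green = np.random.rand(128, 128)  # stand-in for a fundus green channel
        inv = 1.0 - green                 # vessels appear bright when inverted

        # Opening-by-reconstruction top-hat: suppress peaks with an erosion,
        # rebuild the background, and keep the thin bright residue (vessels).
        seed = erosion(inv, disk(3))
        enhanced = inv - reconstruction(seed, inv, method='dilation')

        t_low, t_high = 0.05, 0.15        # assumed thresholds
        vessel = enhanced >= t_high       # preliminary vessel regions
        background = enhanced <= t_low    # background regions
        undetermined = ~vessel & ~background  # left for hierarchical growth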

    Learning a Mixture of Deep Networks for Single Image Super-Resolution

    Ding Liu, Zhaowen Wang, Nasser Nasrabadi, Thomas Huang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Single image super-resolution (SR) is an ill-posed problem that aims to
    recover high-resolution (HR) images from their low-resolution (LR)
    observations. The crux of this problem lies in learning the complex mapping
    between low-resolution patches and the corresponding high-resolution
    patches. Prior art has used either a mixture of simple regression models or
    a single non-linear neural network for this purpose. This paper proposes
    learning a mixture of SR inference modules in a unified framework to tackle
    this problem. Specifically, a number of SR inference modules specialized in
    different local image patterns are first independently applied to the LR
    image to obtain various HR estimates, and the resulting HR estimates are
    adaptively aggregated to form the final HR image. By selecting neural
    networks as the SR inference modules, the whole procedure can be
    incorporated into a unified network and optimized jointly. Extensive
    experiments are conducted to investigate the relation between restoration
    performance and different network architectures. Compared with other current
    image SR approaches, our proposed method consistently achieves
    state-of-the-art restoration results on a wide range of images while
    allowing more flexible design choices. The source code is available at:
    this http URL
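
    An illustrative PyTorch sketch of the adaptive-aggregation idea (tiny
    stand-in modules and a per-pixel softmax gate; not the paper's
    architecture):

        import torch
        import torch.nn as nn

        class MixtureSR(nn.Module):
            def __init__(self, n_modules=4, scale=2):
                super().__init__()
                self.up = nn.Upsample(scale_factor=scale, mode='bilinear',
                                      align_corners=False)
                # Each tiny module plays the role of one SR inference module.
                self.experts = nn.ModuleList([
                    nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(16, 3, 3, padding=1))
                    for _ in range(n_modules)])
                # The gate predicts per-pixel aggregation weights.
                self.gate = nn.Conv2d(3, n_modules, 3, padding=1)

            def forward(self, lr):
                x = self.up(lr)                      # upsampled base image
                hr = torch.stack([m(x) for m in self.experts], dim=1)
                w = torch.softmax(self.gate(x), dim=1).unsqueeze(2)
                return (w * hr).sum(dim=1)           # adaptive aggregation

        sr = MixtureSR()(torch.rand(1, 3, 24, 24))   # -> 1 x 3 x 48 x 48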

    Semi-Supervised Endmember Identification In Nonlinear Spectral Mixtures Via Semantic Representation

    Yuki Itoh, Siwei Feng, Marco F. Duarte, Mario Parente
    Comments: 15 pages, 11 figures, 4 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes a new hyperspectral unmixing method for nonlinearly
    mixed hyperspectral data that uses a semantic representation in a
    semi-supervised fashion, assuming the availability of a spectral reference
    library. Existing semi-supervised unmixing algorithms select members from an
    endmember library that are present at each of the pixels; most such methods
    assume a linear mixing model and therefore fail in the presence of nonlinear
    mixing among the observed spectra. To address this issue, we develop an
    endmember selection method using a recently proposed semantic spectral
    representation obtained via a non-homogeneous hidden Markov chain (NHMC)
    model for a wavelet transform of the spectra. The semantic representation
    can encode spectrally discriminative features for any observed spectrum and,
    therefore, our proposed method can perform endmember selection without any
    assumption on the mixing model. Experimental results show that, in the
    presence of sufficiently nonlinear mixing, our proposed method outperforms
    dictionary-based sparse unmixing approaches based on linear models.

    Constrained Deep Weak Supervision for Histopathology Image Segmentation

    Zhipeng Jia, Xingyi Huang, Eric I-Chao Chang, Yan Xu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we develop a new weakly-supervised learning algorithm that
    learns to segment cancerous regions in histopathology images. Our work is
    set in a multiple instance learning (MIL) framework with a new formulation,
    deep weak supervision (DWS); we also propose an effective way to introduce
    constraints into our neural networks to assist the learning process. The
    contributions of our algorithm are threefold: (1) We build an end-to-end
    learning system that segments cancerous regions with fully convolutional
    networks (FCN) in which image-to-image weakly-supervised learning is
    performed. (2) We develop a deep weak supervision formulation to exploit
    multi-scale learning under weak supervision within fully convolutional
    networks. (3) Constraints on positive instances are introduced in our
    approach to effectively exploit additional weakly-supervised information
    that is easy to obtain and provides a significant boost to the learning
    process. The proposed algorithm, abbreviated as DWS-MIL, is easy to
    implement and can be trained efficiently. Our system demonstrates
    state-of-the-art results on large-scale histopathology image datasets and
    can be applied to various applications in medical imaging beyond
    histopathology images, such as MRI, CT, and ultrasound images.

    Dense Associative Memory is Robust to Adversarial Inputs

    Dmitry Krotov, John J Hopfield
    Subjects: Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)

    Deep neural networks (DNNs) trained in a supervised way suffer from two
    known problems. First, the minima of the objective function used in learning
    correspond to data points (also known as rubbish examples or fooling images)
    that lack semantic similarity with the training data. Second, a clean input
    can be changed by a small, and often imperceptible to human vision,
    perturbation, so that the resulting deformed input is misclassified by the
    network. These findings emphasize the differences between the ways DNNs and
    humans classify patterns, and raise the question of designing learning
    algorithms that mimic human perception more accurately than existing
    methods.

    Our paper examines these questions within the framework of Dense Associative
    Memory (DAM) models. These models are defined by an energy function with
    higher-order (higher than quadratic) interactions between the neurons. We
    show that in the limit when the power of the interaction vertex in the
    energy function is sufficiently large, these models have the following three
    properties. First, the minima of the objective function are free from
    rubbish images, so that each minimum is a semantically meaningful pattern.
    Second, artificial patterns poised precisely at the decision boundary look
    ambiguous to human subjects and share aspects of both classes separated by
    that decision boundary. Third, adversarial images constructed by models with
    a small power of the interaction vertex, which are equivalent to DNNs with
    rectified linear units (ReLUs), fail to transfer to and fool the models with
    higher-order interactions. This opens up the possibility of using
    higher-order models for detecting and stopping malicious adversarial
    attacks. The presented results suggest that DAMs with higher-order energy
    functions are closer to human visual perception than DNNs with ReLUs.
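
    For reference, the energy function of such a model can be written as follows
    (a standard form of the DAM energy in our notation; the abstract itself does
    not spell it out):

        % N binary units \sigma_i, K stored patterns \xi^\mu, interaction
        % power n; F_n(x) = x^n recovers the classical Hopfield energy at
        % n = 2, and a rectified variant F_n(x) = \max(x, 0)^n is also used.
        E(\sigma) = -\sum_{\mu=1}^{K} F_n\!\left( \sum_{i=1}^{N}
            \xi^{\mu}_{i}\, \sigma_{i} \right)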


    Artificial Intelligence

    Stochastic Planning and Lifted Inference

    Roni Khardon, Scott Sanner
    Subjects: Artificial Intelligence (cs.AI)

    Lifted probabilistic inference (Poole, 2003) and symbolic dynamic programming
    for lifted stochastic planning (Boutilier et al., 2001) were introduced around
    the same time as algorithmic efforts to use abstraction in stochastic systems.
    Over the years, these ideas evolved into two distinct lines of research, each
    supported by a rich literature. Lifted probabilistic inference focused on
    efficient arithmetic operations on template-based graphical models under a
    finite domain assumption while symbolic dynamic programming focused on
    supporting sequential decision-making in rich quantified logical action models
    and on open domain reasoning. Given their common motivation but different focal
    points, both lines of research have yielded highly complementary innovations.
    In this chapter, we aim to help close the gap between these two research areas
    by providing an overview of lifted stochastic planning from the perspective of
    probabilistic inference, showing strong connections to other chapters in this
    book. This also allows us to define Generalized Lifted Inference as a paradigm
    that unifies these areas and elucidates open problems for future research that
    can benefit both lifted inference and stochastic planning.

    A K-fold Method for Baseline Estimation in Policy Gradient Algorithms

    Nithyanand Kota, Abhishek Mishra, Sunil Srinivasa, Xi (Peter) Chen, Pieter Abbeel
    Subjects: Artificial Intelligence (cs.AI)

    The high-variance issue in unbiased policy-gradient methods such as VPG and
    REINFORCE is typically mitigated by adding a baseline. However, the baseline
    fitting itself suffers from either underfitting or overfitting. In this
    paper, we develop a K-fold method for baseline estimation in policy gradient
    algorithms. The parameter K is a baseline-estimation hyperparameter that can
    adjust the bias-variance trade-off in the baseline estimates. We demonstrate
    the usefulness of our approach via two state-of-the-art policy gradient
    algorithms on three MuJoCo locomotive control tasks.
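
    A sketch of our reading of the K-fold idea (hypothetical helper; the ridge
    regressor and state features are stand-ins for whatever value-function model
    is used):

        import numpy as np
        from sklearn.linear_model import Ridge

        def kfold_baselines(states, returns, K=5):
            # Each sample's baseline prediction comes from a regressor
            # trained on the other K-1 folds, so the baseline is never fit
            # to the very returns it will be subtracted from.
            n = len(states)
            folds = np.array_split(np.random.permutation(n), K)
            baselines = np.empty(n)
            for fold in folds:
                train = np.setdiff1d(np.arange(n), fold)
                model = Ridge(alpha=1.0).fit(states[train], returns[train])
                baselines[fold] = model.predict(states[fold])
            return baselines

        states = np.random.randn(1000, 8)   # dummy state features
        returns = np.random.randn(1000)     # dummy empirical returns
        advantages = returns - kfold_baselines(states, returns)
        # The advantages then weight the log-probability gradient in
        # VPG/REINFORCE.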

    Fuzzy finite element model updating using metaheuristic optimization algorithms

    I. Boulkaibet, T. Marwala, M.I. Friswell, H. Haddad Khodaparast, S. Adhikari
    Comments: This article was accepted by the 2017 International Modal Analysis Conference
    Subjects: Artificial Intelligence (cs.AI)

    In this paper, a non-probabilistic method based on fuzzy logic is used to
    update finite element models (FEMs). Model updating techniques use the measured
    data to improve the accuracy of numerical models of structures. However, the
    measured data are contaminated with experimental noise and the models are
    inaccurate due to randomness in the parameters. This kind of aleatory
    uncertainty is irreducible, and may decrease the accuracy of the finite element
    model updating process. However, uncertainty quantification methods can be used
    to identify the uncertainty in the updating parameters. In this paper, the
    uncertainties associated with the modal parameters are defined as fuzzy
    membership functions, while the model updating procedure is defined as an
    optimization problem at each α-cut level. To determine the membership
    functions of the updated parameters, an objective function is defined and
    minimized using two metaheuristic optimization algorithms: ant colony
    optimization (ACO) and particle swarm optimization (PSO). A structural example
    is used to investigate the accuracy of the fuzzy model updating strategy using
    the PSO and ACO algorithms. Furthermore, the results obtained by the fuzzy
    finite element model updating are compared with the Bayesian model updating
    results.

    On the Usability of Probably Approximately Correct Implication Bases

    Daniel Borchmann, Tom Hanika, Sergei Obiedkov
    Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We revisit the notion of probably approximately correct implication bases
    from the literature and present a first formulation in the language of
    formal concept analysis, with the goal of investigating whether such bases
    represent a suitable substitute for exact implication bases in practical use
    cases. To this end, we quantitatively examine the behavior of probably
    approximately correct implication bases on artificial and real-world data
    sets and compare their precision and recall with respect to their
    corresponding exact implication bases. Using a small example, we also
    provide qualitative insight that implications from probably approximately
    correct bases can still represent meaningful knowledge from a given data
    set.


    Information Retrieval

    World Literature According to Wikipedia: Introduction to a DBpedia-Based Framework

    Christoph Hube, Frank Fischer, Robert Jäschke, Gerhard Lauer, Mads Rosendahl Thomsen
    Comments: 33 pages, 6 figures, 6 tables
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Among the manifold takes on world literature, our goal is to contribute to
    the discussion from a digital point of view by analyzing the representation
    of world literature in Wikipedia, with its millions of articles in hundreds
    of languages. As a preliminary step, we introduce and compare three
    different approaches to identifying writers on Wikipedia using data from
    DBpedia, a community project with the goal of extracting and providing
    structured information from Wikipedia. Equipped with our basic set of
    writers, we analyze how they are represented throughout the 15 biggest
    Wikipedia language versions. We combine intrinsic measures (mostly examining
    the connectedness of articles) with extrinsic ones (analyzing how often
    articles are frequented by readers) and develop methods to evaluate our
    results. The better part of our findings conveys a rather conservative,
    old-fashioned version of world literature, but one derived from reproducible
    facts, revealing an implicit literary canon based on the editing and reading
    behavior of millions of people. While some known issues remain to be solved,
    the introduced methods will help us build an observatory of world literature
    to further investigate its representativeness and biases.
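
    As an illustration of the kind of DBpedia query such an approach rests on (a
    hypothetical example using the SPARQLWrapper package, not one of the paper's
    three approaches verbatim):

        from SPARQLWrapper import SPARQLWrapper, JSON

        # Simplest possible writer extraction: every resource typed
        # dbo:Writer, with its English label.
        sparql = SPARQLWrapper("https://dbpedia.org/sparql")
        sparql.setQuery("""
            SELECT ?writer ?name WHERE {
                ?writer a dbo:Writer ;
                        rdfs:label ?name .
                FILTER (lang(?name) = "en")
            } LIMIT 10
        """)
        sparql.setReturnFormat(JSON)
        for row in sparql.query().convert()["results"]["bindings"]:
            print(row["writer"]["value"], "-", row["name"]["value"])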


    Computation and Language

    Joint Semantic Synthesis and Morphological Analysis of the Derived Word

    Ryan Cotterell, Hinrich Schütze
    Comments: Accepted at TACL
    Subjects: Computation and Language (cs.CL)

    Much like sentences are composed of words, words themselves are composed of
    smaller units. For example, the English word questionably can be analyzed as
    question+able+ly. However, this structural decomposition of the word does not
    directly give us a semantic representation of the word’s meaning. Since
    morphology obeys the principle of compositionality, the semantics of the word
    can be systematically derived from the meaning of its parts. In this work, we
    propose a novel probabilistic model of word formation that captures both the
    analysis of a word w into its constituent segments and the synthesis of the
    meaning of w from the meanings of those segments. Our model jointly learns to
    segment words into morphemes and compose distributional semantic vectors of
    those morphemes. We experiment with the model on English CELEX data and German
    DerivBase (Zeller et al., 2013) data. We show that jointly modeling semantics
    increases both segmentation accuracy and morpheme F1 by between 3% and 5%.
    Additionally, we investigate different models of vector composition, showing
    that recurrent neural networks yield an improvement over simple additive
    models. Finally, we study the degree to which the representations correspond to
    a linguist’s notion of morphological productivity.
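
    As a toy illustration of the simple additive composition baseline mentioned
    above (the vectors are random stand-ins, not trained morpheme embeddings):

        import numpy as np

        # Approximate the meaning of a derived word by summing the vectors
        # of its morphemes: questionably = question + able + ly.
        rng = np.random.default_rng(1)
        vec = {m: rng.normal(size=50) for m in ("question", "able", "ly")}
        questionably = vec["question"] + vec["able"] + vec["ly"]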

    Neural Probabilistic Model for Non-projective MST Parsing

    Xuezhe Ma, Eduard Hovy
    Comments: arXiv admin note: text overlap with arXiv:1603.01354
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose a probabilistic parsing model that defines a
    proper conditional probability distribution over non-projective dependency
    trees for a given sentence, using neural representations as inputs. The
    neural network architecture is based on bi-directional LSTM-CNNs, which
    automatically benefit from both word- and character-level representations by
    using a combination of bidirectional LSTMs and CNNs. On top of the neural
    network, we introduce a probabilistic structured layer, defining a
    conditional log-linear model over non-projective trees. We evaluate our
    model on 17 different datasets, across 14 different languages. By exploiting
    Kirchhoff’s Matrix-Tree Theorem (Tutte, 1984), the partition functions and
    marginals can be computed efficiently, leading to a straightforward
    end-to-end model training procedure via back-propagation. Our parser
    achieves state-of-the-art parsing performance on 9 datasets.
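
    A NumPy sketch of the Matrix-Tree computation the training relies on (our
    illustration): given exponentiated edge scores w[h, m] for head h and
    modifier m, with node 0 acting as the root, the partition function over
    non-projective trees is the determinant of a Laplacian minor:

        import numpy as np

        def partition_function(w):
            # Laplacian: off-diagonal L[h, m] = -w[h, m]; diagonal L[m, m]
            # holds m's total incoming score. Deleting the root's row and
            # column and taking the determinant sums over all arborescences
            # rooted at node 0 (Kirchhoff / Tutte).
            n = w.shape[0]
            L = -w.copy()
            np.fill_diagonal(L, 0.0)
            L[np.arange(n), np.arange(n)] = -L.sum(axis=0)
            return np.linalg.det(L[1:, 1:])

        scores = np.random.rand(4, 4)   # dummy scores: root + 3 words
        Z = partition_function(scores)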

    Unsupervised neural and Bayesian models for zero-resource speech processing

    Herman Kamper
    Comments: PhD thesis, University of Edinburgh, 107 pages, submitted and accepted 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    In settings where only unlabelled speech data is available, zero-resource
    speech technology needs to be developed without transcriptions, pronunciation
    dictionaries, or language modelling text. There are two central problems in
    zero-resource speech processing: (i) finding frame-level feature
    representations which make it easier to discriminate between linguistic units
    (phones or words), and (ii) segmenting and clustering unlabelled speech into
    meaningful units. In this thesis, we argue that a combination of top-down and
    bottom-up modelling is advantageous in tackling these two problems.

    To address the problem of frame-level representation learning, we present the
    correspondence autoencoder (cAE), a neural network trained with weak top-down
    supervision from an unsupervised term discovery system. By combining this
    top-down supervision with unsupervised bottom-up initialization, the cAE yields
    much more discriminative features than previous approaches. We then present our
    unsupervised segmental Bayesian model that segments and clusters unlabelled
    speech into hypothesized words. By imposing a consistent top-down segmentation
    while also using bottom-up knowledge from detected syllable boundaries, our
    system outperforms several others on multi-speaker conversational English and
    Xitsonga speech data. Finally, we show that the clusters discovered by the
    segmental Bayesian model can be made less speaker- and gender-specific by using
    features from the cAE instead of traditional acoustic features.

    In summary, the different models and systems presented in this thesis show
    that both top-down and bottom-up modelling can improve representation learning,
    segmentation and clustering of unlabelled speech data.
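
    A minimal PyTorch sketch of the cAE training signal (layer sizes are
    illustrative; in the real system the frame pairs come from DTW-aligned word
    pairs found by an unsupervised term discovery system):

        import torch
        import torch.nn as nn

        # The cAE does not reconstruct its input: it maps a speech frame to
        # the corresponding frame of another instance of the same term.
        cae = nn.Sequential(nn.Linear(39, 100), nn.Tanh(),
                            nn.Linear(100, 39))
        opt = torch.optim.Adam(cae.parameters(), lr=1e-3)

        x = torch.randn(256, 39)   # MFCC frames from one word instance
        y = torch.randn(256, 39)   # aligned frames from a matched instance
        loss = nn.functional.mse_loss(cae(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
        # After training, hidden-layer activations serve as features.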

    Fuzzy Based Implicit Sentiment Analysis on Quantitative Sentences

    Amir Hossein Yazdavar, Monireh Ebrahimi, Naomie Salim
    Comments: Text mining, Natural language processing, Sentiment analysis, Fuzzy set theory
    Subjects: Computation and Language (cs.CL)

    With the rapid growth of social media on the web, emotional polarity
    computation has become a flourishing frontier in the text mining community.
    However, it is challenging to understand the latest trends and summarize the
    state of, or general opinions about, products due to the great diversity and
    size of social media data, which creates the need for automated, real-time
    opinion extraction and mining. Moreover, the bulk of current research has
    been devoted to studying subjective sentences that contain opinion keywords,
    while limited work has addressed objective statements that imply sentiment.
    In this paper, a fuzzy-based knowledge engineering model is developed for
    sentiment classification of a special group of such sentences, namely those
    expressing a change or deviation from a desired range or value. Drug reviews
    are a rich source of such statements; therefore, in this research, some
    experiments were carried out on patients’ reviews of several different
    cholesterol-lowering drugs to determine their sentiment polarity. The main
    conclusion of this study is that, in order to increase the accuracy of
    existing drug opinion mining systems, objective sentences that imply opinion
    should be taken into account. Our experimental results demonstrate that the
    proposed model achieves an F1 value of over 72 percent.

    World Literature According to Wikipedia: Introduction to a DBpedia-Based Framework

    Christoph Hube, Frank Fischer, Robert Jäschke, Gerhard Lauer, Mads Rosendahl Thomsen
    Comments: 33 pages, 6 figures, 6 tables
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    Among the manifold takes on world literature, our goal is to contribute to
    the discussion from a digital point of view by analyzing the representation
    of world literature in Wikipedia, with its millions of articles in hundreds
    of languages. As a preliminary step, we introduce and compare three
    different approaches to identifying writers on Wikipedia using data from
    DBpedia, a community project with the goal of extracting and providing
    structured information from Wikipedia. Equipped with our basic set of
    writers, we analyze how they are represented throughout the 15 biggest
    Wikipedia language versions. We combine intrinsic measures (mostly examining
    the connectedness of articles) with extrinsic ones (analyzing how often
    articles are frequented by readers) and develop methods to evaluate our
    results. The better part of our findings conveys a rather conservative,
    old-fashioned version of world literature, but one derived from reproducible
    facts, revealing an implicit literary canon based on the editing and reading
    behavior of millions of people. While some known issues remain to be solved,
    the introduced methods will help us build an observatory of world literature
    to further investigate its representativeness and biases.


    Distributed, Parallel, and Cluster Computing

    Rollback and Forking Detection for Trusted Execution Environments using Lightweight Collective Memory

    Marcus Brandenburger, Christian Cachin, Matthias Lorenz, Rüdiger Kapitza
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Novel hardware-aided trusted execution environments, as provided by Intel’s
    Software Guard Extensions (SGX), make it possible to execute applications in
    a secure context that enforces the confidentiality and integrity of the
    application state even when the host system is misbehaving. While this paves
    the way towards secure and trustworthy cloud computing, essential system
    support to protect persistent application state against rollback and forking
    attacks is missing. In this paper we present LCM, a lightweight protocol
    that establishes a collective memory among all clients of a remote
    application to detect integrity and consistency violations. LCM enables the
    detection of rollback attacks against the remote application, enforces the
    consistency notion of fork-linearizability, and notifies clients about
    operation stability. The protocol exploits the trusted execution
    environment, complements it with simple client-side operations, and
    maintains only small, constant storage at the clients. This simplifies the
    solution compared to previous approaches, in which the clients had to verify
    all operations initiated by other clients. We have implemented LCM and
    demonstrated its advantages with a key-value store application. The
    evaluation shows that it introduces low network and computation overhead; in
    particular, an LCM-protected key-value store achieves 0.8x-1.1x the
    throughput of an SGX-secured key-value store.

    Is Parallel Programming Hard, And, If So, What Can You Do About It? (v2017.01.02a)

    Paul E. McKenney
    Comments: 477 pages
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The purpose of this book is to help you program shared-memory parallel
    machines without risking your sanity. We hope that this book’s design
    principles will help you avoid at least some parallel-programming pitfalls.
    That said, you should think of this book as a foundation on which to build,
    rather than as a completed cathedral. Your mission, if you choose to accept
    it, is to help make further progress in the exciting field of parallel
    programming, progress that will in time render this book obsolete. Parallel
    programming is not as hard as some say, and we hope that this book makes
    your parallel-programming projects easier and more fun.

    In short, where parallel programming once focused on science, research, and
    grand-challenge projects, it is quickly becoming an engineering discipline. We
    therefore examine specific parallel-programming tasks and describe how to
    approach them. In some surprisingly common cases, they can even be automated.

    This book is written in the hope that presenting the engineering discipline
    underlying successful parallel-programming projects will free a new generation
    of parallel hackers from the need to slowly and painstakingly reinvent old
    wheels, enabling them to instead focus their energy and creativity on new
    frontiers. We sincerely hope that parallel programming brings you at least as
    much fun, excitement, and challenge as it has brought to us!

    Distributed Co-Simulation of Maritime Systems and Operations

    Severin Sadjina, Lars T. Kyllingstad, Martin Rindarøy, Stian Skjong, Vilmar Æsøy, Dariusz Eirik Fathi, Vahid Hassani, Trond Johnsen, Jørgen Bremnes Nielsen, Eilif Pedersen
    Comments: 17 pages, 9 figures
    Subjects: Computational Engineering, Finance, and Science (cs.CE); Distributed, Parallel, and Cluster Computing (cs.DC); Systems and Control (cs.SY)

    Here, we present the concept of an open virtual prototyping framework for
    maritime systems and operations that enables its users to develop re-usable
    component or subsystem models, and combine them in full-system simulations for
    prototyping, verification, training, and performance studies. This framework
    consists of a set of guidelines for model coupling, high-level and low-level
    coupling interfaces to guarantee interoperability, a full-system simulation
    software, and example models and demonstrators. We discuss the requirements for
    such a framework, address the challenges and the possibilities in fulfilling
    them, and aim to give a list of best practices for modular and efficient
    virtual prototyping and full-system simulation. The context of our work is
    within maritime systems and operations, but the issues and solutions we present
    here are general enough to be of interest to a much broader audience, both
    industrial and scientific.


    Learning

    Estimating Quality in User-Guided Multi-Objective Bandits Optimization

    Audrey Durand, Christian Gagné
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Many real-world applications are characterized by a number of conflicting
    performance measures. As optimizing in a multi-objective setting leads to a set
    of non-dominated solutions, a preference function is required for selecting the
    solution with the appropriate trade-off between the objectives. This preference
    function is often unknown, especially when it comes from an expert human user.
    However, if we could provide the expert user with a proper estimation for each
    action, she would be able to pick her best choice. The question is: how good do
    these estimations have to be in order for her choice to remain the same as if
    she had access to the exact values? In this paper, we introduce the concept of
    preference radius to characterize the robustness of the preference function and
    provide guidelines for controlling the quality of estimations in the
    multi-objective setting. More specifically, we provide a general formulation of
    multi-objective optimization under the bandits setting and the pure exploration
    setting with user feedback for articulating the preferences. We show how the
    preference radius relates to the optimal gap and how it can be used to analyze
    algorithms in the bandits and pure exploration settings. We finally present
    experiments in the bandits setting, where we evaluate the impact of noise and
    delayed expert user feedback, and in the pure exploration setting, where we
    compare multi-objective Thompson sampling with uniform sampling.

    Joint Optimization Projection Matrix and Sparsifying Dictionary via Stochastic Method

    Tao Hong
    Comments: 4 figures, 2 tables
    Subjects: Learning (cs.LG)

    An efficient algorithm is proposed in this letter to optimize the Projection
    Matrix and Sparsifying Dictionary (PMSD) simultaneously on a large training
    dataset through a stochastic method. A closed-form solution is derived for
    optimizing the projection matrix with a fixed sparsifying dictionary, and
    the stochastic method is used to optimize the sparsifying dictionary with a
    fixed optimized projection matrix on a large training dataset. Benefiting
    from training on a large dataset, the proposed method yields much better
    performance in terms of signal recovery accuracy than existing ones. The
    simulation results on natural images demonstrate its effectiveness and
    efficiency.

    Dense Associative Memory is Robust to Adversarial Inputs

    Dmitry Krotov, John J Hopfield
    Subjects: Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)

    Deep neural networks (DNNs) trained in a supervised way suffer from two
    known problems. First, the minima of the objective function used in learning
    correspond to data points (also known as rubbish examples or fooling images)
    that lack semantic similarity with the training data. Second, a clean input
    can be changed by a small, and often imperceptible to human vision,
    perturbation, so that the resulting deformed input is misclassified by the
    network. These findings emphasize the differences between the ways DNNs and
    humans classify patterns, and raise the question of designing learning
    algorithms that mimic human perception more accurately than existing
    methods.

    Our paper examines these questions within the framework of Dense Associative
    Memory (DAM) models. These models are defined by an energy function with
    higher-order (higher than quadratic) interactions between the neurons. We
    show that in the limit when the power of the interaction vertex in the
    energy function is sufficiently large, these models have the following three
    properties. First, the minima of the objective function are free from
    rubbish images, so that each minimum is a semantically meaningful pattern.
    Second, artificial patterns poised precisely at the decision boundary look
    ambiguous to human subjects and share aspects of both classes separated by
    that decision boundary. Third, adversarial images constructed by models with
    a small power of the interaction vertex, which are equivalent to DNNs with
    rectified linear units (ReLUs), fail to transfer to and fool the models with
    higher-order interactions. This opens up the possibility of using
    higher-order models for detecting and stopping malicious adversarial
    attacks. The presented results suggest that DAMs with higher-order energy
    functions are closer to human visual perception than DNNs with ReLUs.

    Collapsing of dimensionality

    Marco Gori, Marco Maggini, Alessandro Rossi
    Subjects: Learning (cs.LG)

    We analyze a new approach to machine learning arising from a modification of
    classical regularization networks by casting the process in the time
    dimension, leading to a sort of collapse of dimensionality in the problem of
    learning the model parameters. This approach allows the definition of an
    online learning algorithm that progressively accumulates the knowledge
    provided in the input trajectory. The regularization principle leads to a
    solution based on a dynamical system that is paired with a procedure to
    develop a graph structure storing the input regularities acquired from the
    temporal evolution. We report an extensive experimental exploration of the
    behavior of the parameter of the proposed model and an evaluation on an
    artificial dataset.

    Demystifying Neural Style Transfer

    Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Neural style transfer has recently produced very exciting results that have
    attracted attention in both academia and industry. Despite these impressive
    results, the principle of neural style transfer, and especially why Gram
    matrices can represent style, remains unclear. In this paper, we propose a
    novel interpretation of neural style transfer by treating it as a domain
    adaptation problem. Specifically, we show theoretically that matching the
    Gram matrices of feature maps is equivalent to minimizing the Maximum Mean
    Discrepancy (MMD) with a second-order polynomial kernel. Thus, we argue that
    the essence of neural style transfer is to match the feature distributions
    between the style images and the generated images. To further support our
    standpoint, we experiment with several other distribution alignment methods
    and achieve appealing results. We believe this novel interpretation connects
    these two important research fields and could inspire future research.

    An Interval-Based Bayesian Generative Model for Human Complex Activity Recognition

    Li Liu, Yongzhong Yang, Lakshmi Narasimhan Govindarajan, Shu Wang, Bin Hu, Li Cheng, David S. Rosenblum
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Complex activity recognition is challenging due to the inherent uncertainty
    and diversity of performing a complex activity. Normally, each instance of a
    complex activity has its own configuration of atomic actions and their temporal
    dependencies. We propose in this paper an atomic action-based Bayesian model
    that constructs Allen’s interval relation networks to characterize complex
    activities with structural varieties in a probabilistic generative way: By
    introducing latent variables from the Chinese restaurant process, our approach
    is able to capture all possible styles of a particular complex activity as a
    unique set of distributions over atomic actions and relations. We also show
    that local temporal dependencies can be retained and are globally consistent in
    the resulting interval network. Moreover, network structure can be learned from
    empirical data. A new dataset of complex hand activities has been constructed
    and made publicly available, which is much larger in size than any existing
    datasets. Empirical evaluations on benchmark datasets as well as our in-house
    dataset demonstrate the competitiveness of our approach.
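
    For readers unfamiliar with it, here is a small sketch of the Chinese
    restaurant process prior that lets the number of activity styles grow with
    the data (the concentration parameter alpha is illustrative):

        import numpy as np

        def crp(n_items, alpha=1.0, seed=0):
            # Item i joins an existing "table" (style) with probability
            # proportional to its size, or opens a new one with probability
            # proportional to alpha.
            rng = np.random.default_rng(seed)
            tables, assignment = [], []
            for _ in range(n_items):
                probs = np.array(tables + [alpha], dtype=float)
                probs /= probs.sum()
                k = rng.choice(len(probs), p=probs)
                if k == len(tables):
                    tables.append(1)      # a new style is created
                else:
                    tables[k] += 1
                assignment.append(k)
            return assignment

        print(crp(20))   # e.g. [0, 0, 1, 0, 2, ...]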

    On the Usability of Probably Approximately Correct Implication Bases

    Daniel Borchmann, Tom Hanika, Sergei Obiedkov
    Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We revisit the notion of probably approximately correct implication bases
    from the literature and present a first formulation in the language of
    formal concept analysis, with the goal of investigating whether such bases
    represent a suitable substitute for exact implication bases in practical use
    cases. To this end, we quantitatively examine the behavior of probably
    approximately correct implication bases on artificial and real-world data
    sets and compare their precision and recall with respect to their
    corresponding exact implication bases. Using a small example, we also
    provide qualitative insight that implications from probably approximately
    correct bases can still represent meaningful knowledge from a given data
    set.

    Neural Probabilistic Model for Non-projective MST Parsing

    Xuezhe Ma, Eduard Hovy
    Comments: arXiv admin note: text overlap with arXiv:1603.01354
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose a probabilistic parsing model that defines a
    proper conditional probability distribution over non-projective dependency
    trees for a given sentence, using neural representations as inputs. The
    neural network architecture is based on bi-directional LSTM-CNNs, which
    automatically benefit from both word- and character-level representations by
    using a combination of bidirectional LSTMs and CNNs. On top of the neural
    network, we introduce a probabilistic structured layer, defining a
    conditional log-linear model over non-projective trees. We evaluate our
    model on 17 different datasets, across 14 different languages. By exploiting
    Kirchhoff’s Matrix-Tree Theorem (Tutte, 1984), the partition functions and
    marginals can be computed efficiently, leading to a straightforward
    end-to-end model training procedure via back-propagation. Our parser
    achieves state-of-the-art parsing performance on 9 datasets.

    Unsupervised neural and Bayesian models for zero-resource speech processing

    Herman Kamper
    Comments: PhD thesis, University of Edinburgh, 107 pages, submitted and accepted 2016
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    In settings where only unlabelled speech data is available, zero-resource
    speech technology needs to be developed without transcriptions, pronunciation
    dictionaries, or language modelling text. There are two central problems in
    zero-resource speech processing: (i) finding frame-level feature
    representations which make it easier to discriminate between linguistic units
    (phones or words), and (ii) segmenting and clustering unlabelled speech into
    meaningful units. In this thesis, we argue that a combination of top-down and
    bottom-up modelling is advantageous in tackling these two problems.

    To address the problem of frame-level representation learning, we present the
    correspondence autoencoder (cAE), a neural network trained with weak top-down
    supervision from an unsupervised term discovery system. By combining this
    top-down supervision with unsupervised bottom-up initialization, the cAE yields
    much more discriminative features than previous approaches. We then present our
    unsupervised segmental Bayesian model that segments and clusters unlabelled
    speech into hypothesized words. By imposing a consistent top-down segmentation
    while also using bottom-up knowledge from detected syllable boundaries, our
    system outperforms several others on multi-speaker conversational English and
    Xitsonga speech data. Finally, we show that the clusters discovered by the
    segmental Bayesian model can be made less speaker- and gender-specific by using
    features from the cAE instead of traditional acoustic features.

    In summary, the different models and systems presented in this thesis show
    that both top-down and bottom-up modelling can improve representation learning,
    segmentation and clustering of unlabelled speech data.

    Clustering Signed Networks with the Geometric Mean of Laplacians

    Pedro Mercado, Francesco Tudisco, Matthias Hein
    Comments: 14 pages, 5 figures. Accepted in Neural Information Processing Systems (NIPS), 2016
    Journal-ref: Advances in Neural Information Processing Systems 29,
    pp.4421–4429, 2016
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Numerical Analysis (math.NA)

    Signed networks allow one to model positive and negative relationships. We
    analyze existing extensions of spectral clustering to signed networks. It
    turns out that existing approaches do not recover the ground-truth
    clustering in several situations where either the positive or the negative
    network structure contains no noise. Our analysis shows that these problems
    arise because existing approaches take some form of arithmetic mean of the
    Laplacians of the positive and negative parts. As a solution, we propose
    using the geometric mean of the Laplacians of the positive and negative
    parts, and show that it outperforms the existing approaches. While the
    geometric mean of matrices is computationally expensive, we show that the
    eigenvectors of the geometric mean can be computed efficiently, leading to a
    numerical scheme for sparse matrices that is of independent interest.
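
    A small SciPy sketch of the matrix geometric mean in question (dummy SPD
    matrices; the paper applies it to regularized Laplacians of the positive and
    negative parts):

        import numpy as np
        from scipy.linalg import sqrtm, inv

        def geometric_mean(A, B):
            # A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
            A_half = sqrtm(A)
            A_half_inv = inv(A_half)
            M = sqrtm(A_half_inv @ B @ A_half_inv)
            return np.real(A_half @ M @ A_half)

        rng = np.random.default_rng(0)
        P, Q = rng.normal(size=(5, 5)), rng.normal(size=(5, 5))
        A = P @ P.T + 5 * np.eye(5)    # two dummy SPD matrices
        B = Q @ Q.T + 5 * np.eye(5)
        G = geometric_mean(A, B)
        # Spectral clustering then uses eigenvectors of G in place of
        # those of the arithmetic mean (A + B) / 2.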

    Identification of Cancer Patient Subgroups via Smoothed Shortest Path Graph Kernel

    Ali Burak Ünal, Öznur Taştan
    Comments: NIPS Workshop on Machine Learning in Computational Biology, Barcelona, Spain, December 10, 2016
    Subjects: Computational Engineering, Finance, and Science (cs.CE); Learning (cs.LG)

    Characterizing patient somatic mutations through next-generation sequencing
    technologies opens up possibilities for refining cancer subtypes. However,
    catalogues of mutations reveal that only a small fraction of genes are
    altered frequently in patients, while different genomic alterations may
    perturb the same pathways. We propose a novel clustering procedure that
    quantifies the similarities of patients based on their mutational profiles
    over pathways via a novel graph kernel. We represent each KEGG pathway as an
    undirected graph, and for each patient the vertex labels are assigned based
    on her altered genes. The smoothed shortest path graph kernel (smSPK)
    evaluates each pair of patients by comparing their vertex-labeled pathway
    graphs. Our clustering procedure involves two steps: first, the smSPK kernel
    matrix derived for each pathway is input to the kernel k-means algorithm and
    each pathway is evaluated individually; in the next step, only those
    pathways that are successful are combined into a single kernel and input to
    kernel k-means to stratify patients. Evaluating the procedure on simulated
    data showed that smSPK clusters patients with up to 88% accuracy. Finally,
    to identify ovarian cancer patient subgroups, we apply our methodology to
    The Cancer Genome Atlas ovarian data, which involves 481 patients. The
    identified subgroups are evaluated through survival analysis: grouping
    patients into four clusters yields patient groups whose survival times
    differ significantly (p-value ≤ 0.005).


    Information Theory

    Minimax Rényi Redundancy

    Semih Yagli, Yücel Altuğ, Sergio Verdú
    Comments: 34 Pages, Submitted to IEEE Transactions on Information Theory
    Subjects: Information Theory (cs.IT)

    The redundancy for universal lossless compression in Campbell’s setting is
    characterized as a minimax Rényi divergence, which is shown to be equal to
    the maximal α-mutual information via a generalized redundancy-capacity
    theorem. Special attention is placed on the analysis of the asymptotics of
    minimax Rényi divergence, which is determined up to a term vanishing in
    blocklength.
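
    For reference, the Rényi divergence of order α between distributions P and Q
    on a finite alphabet is (standard definition, our notation):

        D_{\alpha}(P \| Q) = \frac{1}{\alpha - 1}
            \log \sum_{x} P(x)^{\alpha} Q(x)^{1 - \alpha},
        \qquad \alpha \in (0, 1) \cup (1, \infty)
        % D_{\alpha}(P \| Q) \to D(P \| Q), the relative entropy,
        % as \alpha \to 1.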

    Mobile Edge Computing: Survey and Research Outlook

    Yuyi Mao, Changsheng You, Jun Zhang, Kaibin Huang, Khaled B. Letaief
    Comments: 30 pages, double column, submitted to IEEE Commun. Surveys Tuts
    Subjects: Information Theory (cs.IT)

    Driven by the visions of Internet of Things and 5G communications, recent
    years have seen a paradigm shift in mobile computing, from the centralized
    Mobile Cloud Computing towards Mobile Edge Computing (MEC). The main feature of
    MEC is to push mobile computing, network control and storage to the network
    edges (e.g., base stations and access points) so as to enable
    computation-intensive and latency-critical applications at the resource-limited
    mobile devices. MEC promises dramatic reductions in latency and mobile energy
    consumption, tackling the key challenges for materializing the 5G vision. The
    promised gains of MEC have motivated extensive efforts in both academia and
    industry on developing the technology. A main thrust of MEC research is to
    seamlessly merge the two disciplines of wireless communications and mobile
    computing, resulting in a wide range of new designs, ranging from techniques for
    computation offloading to network architectures. This paper provides a
    comprehensive survey of the state-of-the-art MEC research with a focus on joint
    radio-and-computational resource management. We also present a research outlook
    consisting of a set of promising directions for MEC research, including MEC
    system deployment, cache-enabled MEC, mobility management for MEC, green MEC,
    as well as privacy-aware MEC. Advancements in these directions will facilitate
    the transformation of MEC from theory to practice. Finally, we introduce recent
    standardization efforts on MEC as well as some typical MEC application
    scenarios.

    New descriptions of the weighted Reed-Muller codes and the homogeneous Reed-Muller codes

    Harinaivo Andriatahiny, Vololona Harinoro Rakotomalala
    Subjects: Information Theory (cs.IT)

    We give a description of the weighted Reed-Muller codes over a prime field in
    a modular algebra. A description of the homogeneous Reed-Muller codes in the
    same ambient space is presented for the binary case. A decoding procedure using
    the Landrock-Manz method is developed.

    A Generalization of Quasi-twisted Codes: Multi-twisted codes

    Nuh Aydin, Ajdin Halilovic
    Subjects: Information Theory (cs.IT)

    Cyclic codes and their various generalizations, such as quasi-twisted (QT)
    codes, have a special place in algebraic coding theory. Among other things,
    many of the best-known or optimal codes have been obtained from these classes.
    In this work we introduce a new generalization of QT codes that we call
    multi-twisted (MT) codes and study some of their basic properties. Presenting
    several methods of constructing codes in this class and obtaining bounds on the
    minimum distances, we show that there exist codes with good parameters in this
    class that cannot be obtained as QT or constacyclic codes. This suggests that
    considering this larger class in computer searches is promising for
    constructing codes with better parameters than the currently best-known linear
    codes. Working with this new class of codes motivated us to consider a problem
    about binomials over finite fields and to discover a result that is interesting
    in its own right.
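
    Schematically, and hedging on the paper's exact formulation, a
    multi-twisted code with block lengths \(m_1, \dots, m_\ell\) and twist
    constants \(\lambda_1, \dots, \lambda_\ell \in \mathbb{F}_q^{*}\) can be
    viewed as an \(\mathbb{F}_q[x]\)-submodule

        C \subseteq \frac{\mathbb{F}_q[x]}{\langle x^{m_1} - \lambda_1 \rangle}
            \times \cdots \times
            \frac{\mathbb{F}_q[x]}{\langle x^{m_\ell} - \lambda_\ell \rangle},

    so that \(\ell = 1\) recovers constacyclic codes, while equal block lengths
    and a common twist constant recover the quasi-twisted case.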

    Non-linear Cyclic Codes that Attain the Gilbert-Varshamov Bound

    Ishay Haviv, Michael Langberg, Moshe Schwartz, Eitan Yaakobi
    Subjects: Information Theory (cs.IT); Discrete Mathematics (cs.DM); Combinatorics (math.CO)

    We prove that there exist non-linear binary cyclic codes that attain the
    Gilbert-Varshamov bound.

    Divergence and Sufficiency for Convex Optimization

    Peter Harremoës
    Comments: 30 pages, 3 figures
    Subjects: Information Theory (cs.IT); Statistical Mechanics (cond-mat.stat-mech); Optimization and Control (math.OC)

    Logarithmic score and information divergence appear in information theory,
    statistics, statistical mechanics, and portfolio theory. We demonstrate
    that all these topics involve some kind of optimization that leads directly to
    the use of Bregman divergences. If the Bregman divergence also fulfills a
    sufficiency condition it must be proportional to information divergence. We
    will demonstrate that sufficiency is equivalent to the apparently weaker notion
    of locality and it is also equivalent to the apparently stronger notion of
    monotonicity. These sufficiency conditions have quite different relevance in
    the different areas of application, and often they are not fulfilled. Therefore
    sufficiency conditions can be used to explain when results from one area can be
    transferred directly to another and when one will experience differences.
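
    Concretely, the Bregman divergence generated by a differentiable convex
    function \(F\) is

        D_F(x, y) = F(x) - F(y) - \langle \nabla F(y),\, x - y \rangle,

    and taking \(F(p) = \sum_i p_i \log p_i\) (negative Shannon entropy) on the
    probability simplex yields the information divergence
    \(D(p \| q) = \sum_i p_i \log (p_i / q_i)\), the form singled out by the
    sufficiency condition.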

    Single Letter Expression of Capacity for a Class of Channels with Memory

    Christos K. Kourtellaris, Charalambos D. Charalambous, Ioannis Tzortzis
    Comments: submitted to IEEE Transactions on Information Theory, Paper no. IT-16-0909
    Subjects: Information Theory (cs.IT)

    We study finite-alphabet channels with Unit Memory on the previous Channel
    Outputs, called UMCO channels. We identify necessary and sufficient
    conditions to test whether the capacity-achieving channel input
    distributions with feedback are time-invariant, and whether feedback
    capacity is characterized by single-letter expressions, similar to those of
    memoryless channels. The method is based on showing that a certain dynamic
    programming equation, which, in general, is a nested optimization problem
    over the sequence of channel input distributions, reduces to a non-nested
    optimization problem. Moreover, for UMCO channels, we give a simple
    expression for the ML error exponent, and we identify sufficient conditions
    to test whether feedback does not increase capacity. We derive similar
    results when transmission cost constraints are imposed. We apply the results
    to a special class of UMCO channels, the Binary State Symmetric Channel
    (BSSC), with and without transmission cost constraints, to show that the
    optimization problem of feedback capacity is non-nested, the
    capacity-achieving channel input distribution and the corresponding channel
    output transition probability distribution are time-invariant, and feedback
    capacity is characterized by a single-letter formula, precisely as in
    Shannon's single-letter characterization of the capacity of memoryless
    channels. We then derive closed-form expressions for the capacity-achieving
    channel input distribution and the feedback capacity, and use them to
    evaluate an error exponent for ML decoding.
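
    Schematically, and not as the paper's exact statement, a single-letter
    feedback-capacity characterization for a unit-memory channel has the flavor

        C^{FB} = \sup_{P(x \mid y')} I(X ; Y \mid Y'),

    where \(Y'\) denotes the previous channel output and the conditional mutual
    information is evaluated at the stationary joint distribution induced by
    the time-invariant input distribution \(P(x \mid y')\); the contribution
    here is identifying when the nested dynamic-programming recursion collapses
    to such a non-nested optimization.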

    Secrecy Outage Analysis for Downlink Transmissions in the Presence of Randomly Located Eavesdroppers

    Gaojie Chen, Justin P. Coon, Marco Di Renzo
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)

    We analyze the secrecy outage probability in the downlink for wireless
    networks with spatially (Poisson) distributed eavesdroppers (EDs) under the
    assumption that the base station employs transmit antenna selection (TAS) to
    enhance secrecy performance. We compare the cases where the receiving user
    equipment (UE) operates in half-duplex (HD) mode and full-duplex (FD) mode. In
    the latter case, the UE simultaneously receives the intended downlink message
    and transmits a jamming signal to strengthen secrecy. We investigate two models
    of (semi)passive eavesdropping: (1) EDs act independently and (2) EDs collude
    to intercept the transmitted message. For both of these models, we obtain
    expressions for the secrecy outage probability in the downlink for HD and FD UE
    operation. The expressions for HD systems have very accurate approximate or
    exact forms in terms of elementary and/or special functions for all path loss
    exponents. Those related to the FD systems have exact integral forms for
    general path loss exponents, while exact closed forms are given for specific
    exponents. A closed-form approximation is also derived for the FD case with
    colluding EDs. The resulting analysis shows that the reduction in the secrecy
    outage probability is logarithmic in the number of antennas used for TAS and
    identifies conditions under which HD operation should be used instead of FD
    jamming at the UE. These performance trends and exact relations between system
    parameters can be used to develop adaptive power allocation and duplex
    operation methods in practice. Examples of such techniques are alluded to
    herein.
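
    To make the setting concrete, here is a minimal Monte Carlo sketch of the
    half-duplex, non-colluding case under assumptions that are not taken from
    the paper: Rayleigh fading, a finite disk in place of the infinite Poisson
    field, unit transmit power, and noise-limited (interference-free)
    reception; all parameter values are hypothetical.

        import numpy as np

        rng = np.random.default_rng(1)

        n_antennas = 4       # antennas available for TAS
        lam_ed     = 1e-3    # eavesdropper PPP intensity (per m^2)
        radius     = 500.0   # simulation disk radius (m)
        d_ue       = 50.0    # BS-to-UE distance (m)
        alpha      = 4.0     # path loss exponent
        noise      = 1e-8    # receiver noise power
        rate_s     = 1.0     # target secrecy rate (bit/s/Hz)
        n_trials   = 20000

        outages = 0
        for _ in range(n_trials):
            # TAS: transmit on the antenna with the largest UE channel gain.
            h_ue = rng.exponential(size=n_antennas)      # Rayleigh -> exp gains
            snr_ue = h_ue.max() * d_ue ** (-alpha) / noise

            # Drop a Poisson number of eavesdroppers uniformly in the disk.
            n_ed = rng.poisson(lam_ed * np.pi * radius ** 2)
            if n_ed == 0:
                snr_ed = 0.0
            else:
                r = radius * np.sqrt(rng.uniform(size=n_ed))  # uniform in disk
                g = rng.exponential(size=n_ed)                # ED fading gains
                snr_ed = (g * r ** (-alpha) / noise).max()    # strongest ED

            c_s = np.log2(1 + snr_ue) - np.log2(1 + snr_ed)   # secrecy capacity
            outages += c_s < rate_s

        print("estimated secrecy outage probability:", outages / n_trials)

    Sweeping n_antennas in such a simulation exhibits the logarithmic reduction
    of the secrecy outage probability with the number of TAS antennas noted in
    the abstract.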

    STARIMA-based Traffic Prediction with Time-varying Lags

    Peibo Duan, Guoqiang Mao, Shangbo Wang, Changsheng Zhang, Bin Zhang
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    The correlation between traffic observed at two measurement points or
    traffic stations may be time-varying: the time-varying speed causes
    variations in the time required to travel between the two points. Based on
    this observation, we develop in this paper a modified Space-Time
    Autoregressive Integrated Moving Average (STARIMA) model with time-varying
    lags for short-term traffic flow prediction. In particular, the temporal
    lags in the modified STARIMA model change with the time-varying speed at
    different times of the day or, equivalently, with the (time-varying) time
    required to travel between the two measurement points. First, a technique is
    developed to evaluate the temporal lag in the STARIMA model, where the
    temporal lag is formulated as a function of the spatial lag (spatial
    distance) and the average speed. Second, an unsupervised classification
    algorithm based on the ISODATA algorithm is designed to classify different
    time periods of the day according to the
    variation of the speed. The classification helps to determine the appropriate
    time lag to use in the STARIMA model. Finally, a STARIMA-based model with
    time-varying lags is developed for short-term traffic prediction. Experimental
    results using real traffic data show that the developed STARIMA-based model
    with time-varying lags has superior accuracy compared with its counterpart
    developed using the traditional cross-correlation function and without
    employing time-varying lags.
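
    The lag-evaluation step admits a very small sketch: per the abstract's
    formulation, the temporal lag is the travel time implied by the spatial
    distance and the average speed, quantized to the sampling interval. The
    sampling period and the speed values below are hypothetical.

        def time_varying_lag(distance_m, avg_speed_mps, sample_period_s=300):
            # Temporal lag (in sampling intervals) between two stations,
            # formulated as spatial distance / average speed.
            travel_time_s = distance_m / avg_speed_mps
            return max(1, round(travel_time_s / sample_period_s))

        # Two stations 5 km apart, traffic sampled every 5 minutes: the lag
        # grows as the average speed drops (free-flow vs. congested periods).
        for speed in (25.0, 12.5, 5.0):   # average speeds in m/s
            print(speed, "->", time_varying_lag(5000.0, speed))

    The ISODATA-based classification then decides which speed regime, and
    hence which lag, applies to each period of the day.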

    Estimation of block sparsity in compressive sensing

    Zhiyong Zhou, Jun Yu
    Subjects: Applications (stat.AP); Information Theory (cs.IT)

    In this paper, we consider a soft measure of block sparsity,
    \(k_\alpha(\mathbf{x}) = \left( \lVert \mathbf{x} \rVert_{2,\alpha} /
    \lVert \mathbf{x} \rVert_{2,1} \right)^{\frac{\alpha}{1-\alpha}},\;
    \alpha \in [0, \infty)\), and propose a procedure to estimate it by using
    multivariate isotropic symmetric \(\alpha\)-stable random projections
    without sparsity or block sparsity
    assumptions. The limiting distribution of the estimator is given. Some
    simulations are conducted to illustrate our theoretical results.
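
    As a sanity check of the measure itself, \(k_\alpha\) can be computed
    directly from \(\mathbf{x}\) as below (the paper's estimator instead works
    from \(\alpha\)-stable random projections without access to \(\mathbf{x}\),
    which this sketch does not reproduce; the cases \(\alpha = 0, 1\) are
    defined by limits and skipped here).

        import numpy as np

        def k_alpha(x, block_size, alpha):
            # Soft block-sparsity measure from the abstract:
            # (||x||_{2,alpha} / ||x||_{2,1})^(alpha/(1-alpha)), where
            # ||x||_{2,a} = (sum_i ||x_i||_2^a)^(1/a) over the blocks x_i.
            assert alpha > 0 and alpha != 1.0
            l2 = np.linalg.norm(x.reshape(-1, block_size), axis=1)
            num = (l2 ** alpha).sum() ** (1.0 / alpha)
            return (num / l2.sum()) ** (alpha / (1.0 - alpha))

        # 20 blocks of size 5 with exactly 3 equal-energy nonzero blocks:
        # k_alpha recovers the true block sparsity, 3.
        x = np.zeros(100)
        x[:15] = 1.0
        print(k_alpha(x, block_size=5, alpha=2.0))   # -> 3.0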

    Constrained Low-rank Matrix Estimation: Phase Transitions, Approximate Message Passing and Applications

    Thibault Lesieur, Florent Krzakala, Lenka Zdeborová
    Comments: 64 pages, 12 figures
    Subjects: Statistics Theory (math.ST); Statistical Mechanics (cond-mat.stat-mech); Information Theory (cs.IT)

    This article is an extended version of previous work of the authors [40, 41]
    on low-rank matrix estimation in the presence of constraints on the factors
    into which the matrix is factorized. Low-rank matrix factorization is one of
    the basic methods used in data analysis for unsupervised learning of relevant
    features and other types of dimensionality reduction. We present a framework to
    study the constrained low-rank matrix estimation for a general prior on the
    factors, and a general output channel through which the matrix is observed. We
    draw a parallel with the study of vector-spin glass models, presenting a
    unifying way to study a number of problems considered previously in separate
    statistical physics works. We present a number of applications of the
    problem in data analysis. We derive in detail a general form of the low-rank
    approximate message passing (Low-RAMP) algorithm, which is known in
    statistical physics as the TAP equations. We thus unify the derivation of
    the TAP equations for models as different as the Sherrington-Kirkpatrick
    model, the restricted Boltzmann machine, the Hopfield model, and vector (XY,
    Heisenberg, and other) spin glasses. The state evolution of the Low-RAMP
    algorithm is also derived and is equivalent to the replica symmetric
    solution for a large class of vector-spin glass models. In the section
    devoted to results, we study in detail phase diagrams and phase transitions
    for Bayes-optimal inference in low-rank matrix estimation. We present a
    typology of phase transitions and their relation to the performance of
    algorithms such as Low-RAMP or commonly used spectral methods.
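
    As a point of comparison, the "commonly used spectral methods" referenced
    above amount to extracting leading eigenvectors of the observed matrix. A
    minimal sketch on a rank-one spiked Wigner model follows (a generic
    illustration with hypothetical parameters, not the Low-RAMP algorithm
    itself).

        import numpy as np

        rng = np.random.default_rng(0)
        n, snr = 2000, 2.0

        # Spiked Wigner model: Y = (snr/n) * x x^T + symmetric Gaussian noise.
        x = rng.choice([-1.0, 1.0], size=n)        # Rademacher spike
        W = rng.normal(size=(n, n)) / np.sqrt(n)
        Y = (snr / n) * np.outer(x, x) + (W + W.T) / np.sqrt(2)

        # Spectral estimate: leading eigenvector via power iteration
        # (the spike eigenvalue dominates in magnitude for snr > 1).
        v = rng.normal(size=n)
        for _ in range(200):
            v = Y @ v
            v /= np.linalg.norm(v)

        overlap = abs(v @ x) / np.sqrt(n)          # |<v, x>| / (||v|| ||x||)
        print("overlap with the planted spike:", overlap)

    Under this normalization the spectral estimate correlates with the planted
    spike only above a critical signal-to-noise ratio (here snr = 1), one
    instance of the phase transitions catalogued in the paper.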



