
    arXiv Paper Daily: Fri, 27 Jan 2017

    Published by 我爱机器学习 (52ml.net) on 2017-01-27 00:00:00

    Computer Vision and Pattern Recognition

    Pose Invariant Embedding for Deep Person Re-identification

    Liang Zheng, Yujia Huang, Huchuan Lu, Yi Yang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Pedestrian misalignment, which mainly arises from detector errors and pose
    variations, is a critical problem for a robust person re-identification (re-ID)
    system. With poor alignment, background noise significantly compromises
    the feature learning and matching process. To address this problem, this paper
    introduces the pose invariant embedding (PIE) as a pedestrian descriptor.
    First, in order to align pedestrians to a standard pose, the PoseBox structure
    is introduced, which is generated through pose estimation followed by affine
    transformations. Second, to reduce the impact of pose estimation errors and
    information loss during PoseBox construction, we design a PoseBox fusion (PBF)
    CNN architecture that takes the original image, the PoseBox, and the pose
    estimation confidence as input. The proposed PIE descriptor is thus defined as
    the fully connected layer of the PBF network for the retrieval task.
    Experiments are conducted on the Market-1501, CUHK03, and VIPeR datasets. We
    show that PoseBox alone yields decent re-ID accuracy and that when integrated
    in the PBF network, the learned PIE descriptor produces competitive performance
    compared with the state-of-the-art approaches.
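
    As a rough illustration of the two-stream fusion idea described above, here
    is a minimal PyTorch-style sketch. The backbones, layer sizes, and the way
    the confidence scalar is appended are all illustrative assumptions, not the
    authors' exact PBF architecture:

        # Illustrative sketch of a PoseBox-fusion-style network (not the exact
        # PBF architecture): two convolutional streams process the original
        # image and the PoseBox, the pose-estimation confidence is appended to
        # the pooled features, and a fully connected layer yields the embedding.
        import torch
        import torch.nn as nn

        class PBFSketch(nn.Module):
            def __init__(self, embed_dim=256):
                super().__init__()
                def stream():
                    return nn.Sequential(
                        nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
                self.image_stream = stream()
                self.posebox_stream = stream()
                # +1 input for the scalar pose-estimation confidence
                self.fc = nn.Linear(64 + 64 + 1, embed_dim)

            def forward(self, image, posebox, confidence):
                feats = torch.cat([self.image_stream(image),
                                   self.posebox_stream(posebox),
                                   confidence.view(-1, 1)], dim=1)
                return self.fc(feats)  # the embedding used for retrieval

        net = PBFSketch()
        emb = net(torch.randn(2, 3, 128, 64), torch.randn(2, 3, 128, 64),
                  torch.rand(2))
        print(emb.shape)  # torch.Size([2, 256])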

    Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro

    Zhedong Zheng, Liang Zheng, Yi Yang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, our main contribution is a simple semi-supervised pipeline
    that uses only the original training set, without extra data collection.
    Two questions make this challenging: 1) how to obtain more training data
    from the training set alone, and 2) how to use the newly generated data.
    In this work, generative
    adversarial networks (GANs) are used to generate unlabeled samples. We propose
    the label smoothing regularization for outliers (LSRO) scheme. It assigns a
    uniform label distribution to the unlabeled images, which regularizes the
    supervised model and improves a ResNet baseline.

    We verify the proposed method on a practical task: person re-identification
    (re-ID). This task aims to retrieve the query person from other cameras. We
    adopt DCGAN for sample generation, and a baseline convolutional neural network
    (CNN) for embedding learning. In our experiment, we show that adding the
    GAN-generated data effectively improves the discriminative ability of the
    learned feature embedding. We evaluate the re-ID performance on two large-scale
    datasets: Market1501 and CUHK03. We obtain +4.37% and +1.6% improvement in
    rank-1 precision over the CNN baseline on Market1501 and CUHK03, respectively.
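
    A minimal sketch of the LSRO idea in PyTorch, assuming the details below:
    real images keep the usual one-hot cross-entropy, while GAN-generated
    images are assigned a uniform label distribution over all classes (751 is
    the Market-1501 training identity count; everything else is illustrative):

        import torch
        import torch.nn.functional as F

        def lsro_loss(logits, labels, is_generated):
            """logits: (N, K); labels: (N,); is_generated: (N,) bool mask."""
            log_probs = F.log_softmax(logits, dim=1)
            # Real samples: standard cross-entropy with the identity label.
            real_loss = F.nll_loss(log_probs[~is_generated], labels[~is_generated])
            # Generated samples: cross-entropy against the uniform target 1/K,
            # i.e. the mean of -log p_k over all K classes.
            gen_loss = -log_probs[is_generated].mean(dim=1).mean()
            return real_loss + gen_loss

        logits = torch.randn(8, 751, requires_grad=True)  # e.g. 751 identities
        labels = torch.randint(0, 751, (8,))
        mask = torch.tensor([False]*6 + [True]*2)         # last two: GAN samples
        print(lsro_loss(logits, labels, mask))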

    Super-resolution Using Constrained Deep Texture Synthesis

    Libin Sun, James Hays
    Comments: 13 pages, 11 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Hallucinating high frequency image details in single image super-resolution
    is a challenging task. Traditional super-resolution methods tend to produce
    oversmoothed output images due to the ambiguity in mapping between low and high
    resolution patches. We build on recent success in deep learning based texture
    synthesis and show that this rich feature space can facilitate successful
    transfer and synthesis of high frequency image details to improve the visual
    quality of super-resolution results on a wide variety of natural textures and
    images.

    Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

    Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, Taras Holotyak
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    This paper addresses the problem of Approximate Nearest Neighbor (ANN) search
    in pattern recognition where feature vectors in a database are encoded as
    compact codes in order to speed up the similarity search in large-scale
    databases. Considering the ANN problem from an information-theoretic
    perspective, we interpret it as an encoding which maps the original feature
    vectors to a less-entropic sparse representation while requiring them to be as
    informative as possible. We then define the coding gain for ANN search using
    information-theoretic measures. We next show that the classical approach to
    this problem which consists of binarization of the projected vectors is
    sub-optimal. Instead, we show that a recently proposed ternary encoding
    achieves higher coding gains.
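
    To make the binary-versus-ternary contrast concrete, here is a small
    numpy sketch; the random projection and the threshold choice are
    illustrative assumptions, not the paper's encoder:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.standard_normal((1000, 64))     # feature vectors
        W = rng.standard_normal((64, 64))       # random projection
        proj = X @ W

        # Dense binary code: keep only the sign of each projection.
        binary = np.sign(proj)
        # Sparse ternary code: keep the sign only where the magnitude
        # exceeds a threshold, and set everything else to zero.
        lam = np.quantile(np.abs(proj), 0.7)    # keep top 30% of magnitudes
        ternary = np.where(np.abs(proj) > lam, np.sign(proj), 0)

        print("nonzero fraction, binary :", np.mean(binary != 0))
        print("nonzero fraction, ternary:", np.mean(ternary != 0))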

    Learning Word-Like Units from Joint Audio-Visual Analysis

    David Harwath, James R. Glass
    Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

    Given a collection of images and spoken audio captions, we present a method
    for discovering word-like acoustic units in the continuous speech signal and
    grounding them to semantically relevant image regions. For example, our model
    is able to detect spoken instances of the word ‘lighthouse’ within an utterance
    and associate them with image regions containing lighthouses. We do not use any
    form of conventional automatic speech recognition, nor do we use any text
    transcriptions or conventional linguistic annotations. Our model effectively
    implements a form of spoken language acquisition, in which the computer learns
    not only to recognize word categories by sound, but also to enrich the words it
    learns with semantics by grounding them in images.


    Artificial Intelligence

    Ethical Considerations in Artificial Intelligence Courses

    Emanuelle Burton, Judy Goldsmith, Sven Koenig, Benjamin Kuipers, Nicholas Mattei, Toby Walsh
    Comments: 29 pages including all case studies and links to video media on YouTube
    Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); General Literature (cs.GL)

    The recent surge in interest in ethics in artificial intelligence may leave
    many educators wondering how to address moral, ethical, and philosophical
    issues in their AI courses. As instructors we want to develop curriculum that
    not only prepares students to be artificial intelligence practitioners, but
    also to understand the moral, ethical, and philosophical impacts that
    artificial intelligence will have on society. In this article we provide
    practical case studies and links to resources for use by AI educators. We also
    provide concrete suggestions on how to integrate AI ethics into a general
    artificial intelligence course and how to teach a stand-alone artificial
    intelligence ethics course.

    Dynamic time warping distance for message propagation classification in Twitter

    Siwar Jendoubi, Arnaud Martin, Ludovic Liétard, Boutheina Ben Yaghlane, Hend Ben Hadji
    Comments: 10 pages, 1 figure ECSQARU 2015, Proceedings of the 13th European Conferences on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, 2015
    Subjects: Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

    Social messages classification is a research domain that has attracted the
    attention of many researchers in these last years. Indeed, the social message
    is different from ordinary text because it has some special characteristics
    like its shortness. Developing new approaches for processing social
    messages has therefore become essential to make their classification more
    efficient. In this paper, we are mainly interested in the classification
    of social messages based on how they spread on online social networks
    (OSN). We propose a new distance metric based on the Dynamic Time Warping
    distance and use it with the probabilistic and the evidential k-Nearest
    Neighbors (k-NN) classifiers to classify propagation networks (PrNets) of
    messages. The
    propagation network is a directed acyclic graph (DAG) that is used to record
    propagation traces of the message, the traversed links and their types. We
    tested the proposed metric with the chosen k-NN classifiers on real-world
    propagation traces collected from the Twitter social network, and we
    obtained good classification accuracies.
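
    For reference, the scalar-sequence version of the Dynamic Time Warping
    distance, the building block the paper adapts (their network-level metric
    adds more structure than this classic dynamic program):

        import numpy as np

        def dtw(a, b):
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = abs(a[i - 1] - b[j - 1])
                    # extend the cheapest of the three admissible alignments
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        print(dtw([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # 0.0: the repeated point aligns at no cost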

    Identifying Consistent Statements about Numerical Data with Dispersion-Corrected Subgroup Discovery

    Mario Boley, Bryan R. Goldsmith, Luca M. Ghiringhelli, Jilles Vreeken
    Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)

    Existing algorithms for subgroup discovery with numerical targets do not
    optimize the error or target variable dispersion of the groups they find. This
    often leads to unreliable or inconsistent statements about the data, rendering
    practical applications, especially in scientific domains, futile. Therefore, we
    here extend the optimistic estimator framework for optimal subgroup discovery
    to a new class of objective functions: we show how tight estimators can be
    computed efficiently for all functions that are determined by subgroup size
    (non-decreasing dependence), the subgroup median value, and a dispersion
    measure around the median (non-increasing dependence). In the important special
    case when dispersion is measured using the average absolute deviation from the
    median, this novel approach yields a linear time algorithm. Empirical
    evaluation on a wide range of datasets shows that, when used within
    branch-and-bound search, this approach is highly efficient and indeed discovers
    subgroups with much smaller errors.
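
    A toy sketch of a dispersion-corrected objective of the kind described
    above: it grows with subgroup size and median effect, and shrinks with the
    average absolute deviation (AAD) from the median. The exact functional
    form in the paper may differ; this only shows the flavor of the trade-off:

        import numpy as np

        def dispersion_corrected_impact(subgroup, population):
            subgroup, population = np.asarray(subgroup), np.asarray(population)
            size_term = len(subgroup) / len(population)           # non-decreasing in size
            effect = np.median(subgroup) - np.median(population)  # median shift
            aad = np.mean(np.abs(subgroup - np.median(subgroup))) # dispersion (AAD)
            return size_term * effect / (1.0 + aad)               # non-increasing in AAD

        pop = np.concatenate([np.random.default_rng(1).normal(0, 1, 900),
                              np.random.default_rng(2).normal(3, 0.2, 100)])
        tight = pop[pop > 2.0]   # a consistent, low-dispersion subgroup
        print(dispersion_corrected_impact(tight, pop))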

    Logic Programming Petri Nets

    Giovanni Sileno
    Comments: draft version
    Subjects: Artificial Intelligence (cs.AI)

    With the purpose of modeling, specifying and reasoning in an integrated
    fashion with procedural and declarative aspects (both commonly present in cases
    or scenarios), the paper introduces Logic Programming Petri Nets (LPPN), an
    extension to the Petri Net notation providing an interface to logic programming
    constructs. Two semantics are presented. First, a hybrid operational semantics
    that separates the process component, treated with Petri nets, from the
    constraint/terminological component, treated with Answer Set Programming (ASP).
    Second, a denotational semantics that maps the notation fully to ASP via
    the Event Calculus. These two alternative specifications enable a preliminary evaluation
    in terms of reasoning efficiency.


    Information Retrieval

    Learning to Effectively Select Topics For Information Retrieval Test Collections

    Mucahid Kutlu, Tamer Elsayed, Matthew Lease
    Subjects: Information Retrieval (cs.IR)

    Employing test collections is a common way to evaluate the effectiveness of
    information retrieval systems. However, due to the high cost of constructing
    test collections, many researchers have proposed new methods to reduce this
    cost. Guiver, Mizzaro, and Robertson [19] show that some topics are better than
    others in terms of evaluation. Inspired by their work, we focus on finding a
    good subset of topics from the given topic pool. We develop a learning-to-rank
    based topic selection method. In our experiments with TREC Robust 2003 and
    Robust 2004 test collections, we show that we are able to better select topics
    with our approach vs. prior work. We also compare deep and narrow vs. wide and
    shallow judging in terms of evaluation reliability and reusability. When we
    select topics randomly, we find that shallow judging is preferable,
    confirming previous work. However, if we select topics intelligently, we are
    able to increase reliability and reusability of test collections by reducing
    topics while using more judgments per topic.

    Match-Tensor: a Deep Relevance Model for Search

    Aaron Jaech, Hetunandan Kamisetty, Eric Ringger, Charlie Clarke
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    The application of Deep Neural Networks for ranking in search engines may
    obviate the need for the extensive feature engineering common to current
    learning-to-rank methods. However, we show that combining simple relevance
    matching features like BM25 with existing Deep Neural Net models often
    substantially improves the accuracy of these models, indicating that they do
    not capture essential local relevance matching signals. We describe a novel
    deep Recurrent Neural Net-based model that we call Match-Tensor. The
    architecture of the Match-Tensor model simultaneously accounts for both local
    relevance matching and global topicality signals allowing for a rich interplay
    between them when computing the relevance of a document to a query. On a large
    held-out test set consisting of social media documents, we demonstrate not only
    that Match-Tensor outperforms BM25 and other classes of DNNs but also that it
    largely subsumes signals present in these models.

    Private Information Retrieval from MDS Coded Data with Colluding Servers: Settling a Conjecture by Freij-Hollanti et al

    Hua Sun, Syed A. Jafar
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Information Retrieval (cs.IR)

    A ((K, N, T, K_c)) instance of the MDS-TPIR problem is comprised of (K)
    messages and (N) distributed servers. Each message is separately encoded
    through a ((K_c, N)) MDS storage code. A user wishes to retrieve one message,
    as efficiently as possible, while revealing no information about the desired
    message index to any colluding set of up to (T) servers. The fundamental limit
    on the efficiency of retrieval, i.e., the capacity of MDS-TPIR is known only at
    the extremes where either (T) or (K_c) belongs to ({1,N}). The focus of this
    work is a recent conjecture by Freij-Hollanti, Gnilke, Hollanti and Karpuk
    which offers a general capacity expression for MDS-TPIR. We prove that the
    conjecture is false by presenting as a counterexample a PIR scheme for the
    setting ((K, N, T, K_c) = (2,4,2,2)), which achieves the rate (3/5), exceeding
    the conjectured capacity, (4/7). Insights from the counterexample lead us to
    capacity characterizations for various instances of MDS-TPIR including all
    cases with ((K, N, T, K_c) = (2,N,T,N-1)), where (N) and (T) can be arbitrary.

    Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

    Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, Taras Holotyak
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    This paper addresses the problem of Approximate Nearest Neighbor (ANN) search
    in pattern recognition where feature vectors in a database are encoded as
    compact codes in order to speed up the similarity search in large-scale
    databases. Considering the ANN problem from an information-theoretic
    perspective, we interpret it as an encoding which maps the original feature
    vectors to a less-entropic sparse representation while requiring them to be as
    informative as possible. We then define the coding gain for ANN search using
    information-theoretic measures. We next show that the classical approach to
    this problem which consists of binarization of the projected vectors is
    sub-optimal. Instead, we show that a recently proposed ternary encoding
    achieves higher coding gains.


    Computation and Language

    Learning Word-Like Units from Joint Audio-Visual Analysis

    David Harwath, James R. Glass
    Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

    Given a collection of images and spoken audio captions, we present a method
    for discovering word-like acoustic units in the continuous speech signal and
    grounding them to semantically relevant image regions. For example, our model
    is able to detect spoken instances of the word ‘lighthouse’ within an utterance
    and associate them with image regions containing lighthouses. We do not use any
    form of conventional automatic speech recognition, nor do we use any text
    transcriptions or conventional linguistic annotations. Our model effectively
    implements a form of spoken language acquisition, in which the computer learns
    not only to recognize word categories by sound, but also to enrich the words it
    learns with semantics by grounding them in images.

    Match-Tensor: a Deep Relevance Model for Search

    Aaron Jaech, Hetunandan Kamisetty, Eric Ringger, Charlie Clarke
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    The application of Deep Neural Networks for ranking in search engines may
    obviate the need for the extensive feature engineering common to current
    learning-to-rank methods. However, we show that combining simple relevance
    matching features like BM25 with existing Deep Neural Net models often
    substantially improves the accuracy of these models, indicating that they do
    not capture essential local relevance matching signals. We describe a novel
    deep Recurrent Neural Net-based model that we call Match-Tensor. The
    architecture of the Match-Tensor model simultaneously accounts for both local
    relevance matching and global topicality signals allowing for a rich interplay
    between them when computing the relevance of a document to a query. On a large
    held-out test set consisting of social media documents, we demonstrate not only
    that Match-Tensor outperforms BM25 and other classes of DNNs but also that it
    largely subsumes signals present in these models.


    Distributed, Parallel, and Cluster Computing

    On the Design of Distributed Programming Models

    Christopher S. Meiklejohn
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Programming large-scale distributed applications well requires new
    abstractions and models. We demonstrate that these models are possible.

    Following from both the FLP result and CAP theorem, we show that concurrent
    programming models are necessary, but not sufficient, in the construction of
    large-scale distributed systems because of the problem of failure and network
    partitions: languages need to be able to capture and encode the tradeoffs
    between consistency and availability.

    We demonstrate two programming models, Lasp and Spry, each of which makes a
    different trade-off with respect to availability. Lasp, preserving
    availability and sacrificing consistency, supports the design of
    convergent, correct-by-construction applications. Spry, sacrificing both
    availability and
    consistency on a per-call basis, allows declarative specification of
    application-level service level agreements.

    ACIA, not ACID: Conditions, Properties and Challenges

    Yuqing Zhu, Jianxun Liu, Mengying Guo, Wenlong Ma, Guolei Yi, Yungang Bao
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Although ACID has long been the golden rule for transaction support,
    durability is no longer a basic requirement for data storage. Rather, high availability is
    becoming the first-class property required by online applications. We show that
    high availability of data is almost surely a stronger property than durability.
    We thus propose ACIA (Atomic, Consistency, Isolation, Availability) as the new
    standard for transaction support. Essentially, the shift from ACID to ACIA is
    due to the change of assumed conditions for data management. Four major
    condition changes exist. With ACIA transactions, more diverse requirements can
    be flexibly supported for applications through the specification of consistency
    levels, isolation levels and fault tolerance levels. Clarifying the ACIA
    properties enables the exploitation of techniques used for ACID transactions,
    as well as bringing about new challenges for research.

    Scalable Architecture for Anomaly Detection and Visualization in Power Generating Assets

    Paras Jain, Chirag Tailor, Sam Ford, Liexiao Ding, Michael Phillips, Fang Liu, Nagi Gebraeel, Duen Horng Chau
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Power-generating assets (e.g., jet engines, gas turbines) are often
    instrumented with tens to hundreds of sensors for monitoring physical and
    performance degradation. Anomaly detection algorithms highlight deviations from
    predetermined benchmarks with the goal of detecting incipient faults. We are
    developing an integrated system to address three key challenges within
    analyzing sensor data from power-generating assets: (1) difficulty in ingesting
    and analyzing data from large numbers of machines; (2) prevalence of false
    alarms generated by anomaly detection algorithms resulting in unnecessary
    downtime and maintenance; and (3) lack of an integrated visualization that
    helps users understand and explore the flagged anomalies and relevant sensor
    context in the energy domain. We present preliminary results and our key
    findings in addressing these challenges. Our system’s scalable event ingestion
    framework, based on OpenTSDB, ingests nearly 400,000 sensor data samples per
    second using a 30-machine cluster. To reduce false alarm rates, we leverage
    the False Discovery Rate (FDR) algorithm which significantly reduces the number
    of false alarms. Our visualization tool presents the anomalies and associated
    content flagged by the FDR algorithm to inform users and practitioners in their
    decision making process. We believe our integrated platform will help reduce
    maintenance costs significantly while increasing asset lifespan. We are working
    to extend our system on multiple fronts, such as scaling to more data and more
    compute nodes (70 in total).
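
    The abstract credits an FDR-based filter for cutting false alarms. As a
    hedged illustration (the system's exact algorithm and thresholds may
    differ), here is the classic Benjamini-Hochberg procedure for controlling
    the false discovery rate over a batch of anomaly p-values:

        import numpy as np

        def benjamini_hochberg(pvalues, alpha=0.05):
            p = np.asarray(pvalues)
            order = np.argsort(p)
            m = len(p)
            # largest k with p_(k) <= alpha * k / m; reject hypotheses 1..k
            thresholds = alpha * (np.arange(1, m + 1) / m)
            below = p[order] <= thresholds
            k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
            rejected = np.zeros(m, dtype=bool)
            rejected[order[:k]] = True
            return rejected  # True = flag as a real anomaly

        pvals = [0.001, 0.008, 0.039, 0.041, 0.27, 0.6, 0.9]
        print(benjamini_hochberg(pvals, alpha=0.05))  # only the first two survive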

    Two-Party Function Computation on the Reconciled Data

    Ivo Kubjas, Vitaly Skachek
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC)

    Assume a distributed system with two users, where each user possesses a
    collection of binary strings. We introduce a new problem termed function computation on
    the reconciled data, which generalizes a set reconciliation problem in the
    literature. It is shown that any deterministic protocol that computes a sum and
    a product of reconciled sets of nonnegative integers has to communicate at
    least (2^n + n - 1) and (2^n + n - 3) bits in the worst-case scenario,
    respectively, where (n) is the length of the binary string representations of
    the numbers. Connections to other problems in computer science, such as set
    disjointness and finding the intersection, are established, yielding a variety
    of additional bounds on the communication complexity. A protocol based on
    the use of a family of hash functions is presented, and its
    characteristics are analyzed.


    Learning

    Riemannian-geometry-based modeling and clustering of network-wide non-stationary time series: The brain-network case

    Konstantinos Slavakis, Shiva Salsabilian, David S. Wack, Sarah F. Muldoon, Henry E. Baidoo-Williams, Jean M. Vettel, Matthew Cieslak, Scott T. Grafton
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    This paper advocates Riemannian multi-manifold modeling in the context of
    network-wide non-stationary time-series analysis. Time-series data, collected
    sequentially over time and across a network, yield features which are viewed as
    points in or close to a union of multiple submanifolds of a Riemannian
    manifold, and distinguishing disparate time series amounts to clustering
    multiple Riemannian submanifolds. To support the claim that exploiting the
    latent Riemannian geometry behind many statistical features of time series is
    beneficial to learning from network data, this paper focuses on brain networks
    and puts forth two feature-generation schemes for network-wide dynamic time
    series. The first is motivated by Granger-causality arguments and uses an
    auto-regressive moving average model to map low-rank linear vector subspaces,
    spanned by column vectors of appropriately defined observability matrices, to
    points on the Grassmann manifold. The second utilizes (non-linear)
    dependencies among network nodes by introducing kernel-based partial
    correlations to generate points in the manifold of positive-definite matrices.
    Capitalizing on recently developed research on clustering Riemannian
    submanifolds, an algorithm is provided for distinguishing time series based on
    their geometrical properties, revealed within Riemannian feature spaces.
    Extensive numerical tests demonstrate that the proposed framework outperforms
    classical and state-of-the-art techniques in clustering brain-network
    states/structures hidden beneath synthetic fMRI time series and brain-activity
    signals generated from real brain-network structural connectivity matrices.
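
    A toy version of the second feature-generation idea, under simplifying
    assumptions (plain correlation instead of kernel partial correlation;
    window length, shrinkage, and the clustering step are illustrative): map
    each time-series window to a symmetric positive-definite (SPD) matrix and
    compare points on the SPD manifold via the log-Euclidean distance.

        import numpy as np
        from scipy.linalg import logm

        def spd_feature(window, eps=1e-3):
            # correlation matrix, shrunk toward the identity to stay SPD
            c = np.corrcoef(window.T) + eps * np.eye(window.shape[1])
            return logm(c).real                  # map to the tangent space

        def log_euclidean_dist(s1, s2):
            return np.linalg.norm(s1 - s2, ord="fro")

        rng = np.random.default_rng(0)
        series = rng.standard_normal((300, 5))   # 300 samples, 5 "nodes"
        windows = [series[i:i + 50] for i in range(0, 250, 50)]
        feats = [spd_feature(w) for w in windows]
        print(log_euclidean_dist(feats[0], feats[1]))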

    Strongly Adaptive Regret Implies Optimally Dynamic Regret

    Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
    Subjects: Learning (cs.LG)

    To cope with changing environments, recent literature in online learning has
    introduced the concepts of adaptive regret and dynamic regret independently. In
    this paper, we illustrate an intrinsic connection between these two concepts by
    showing that the dynamic regret can be expressed in terms of the adaptive
    regret and the functional variation. This observation implies that strongly
    adaptive algorithms can be directly leveraged to minimize the dynamic regret.
    As a result, we present a series of strongly adaptive algorithms whose dynamic
    regrets are minimax optimal for convex functions, exponentially concave
    functions, and strongly convex functions, respectively. To the best of our
    knowledge, this is the first time that such a dynamic regret bound has been
    established for exponentially concave functions. Moreover, all of those
    adaptive algorithms do not need any prior knowledge of the functional
    variation, which is a significant advantage over previous specialized methods
    for minimizing dynamic regret.

    FPGA Architecture for Deep Learning and its application to Planetary Robotics

    Pranay Gankidi, Jekan Thangavelautham
    Comments: 8 pages, 10 figures in Proceedings of the IEEE Aerospace Conference 2017
    Subjects: Learning (cs.LG); Instrumentation and Methods for Astrophysics (astro-ph.IM); Robotics (cs.RO)

    Autonomous control systems onboard planetary rovers and spacecraft benefit
    from having cognitive capabilities like learning so that they can adapt to
    unexpected situations in situ. Q-learning is a form of reinforcement
    learning that has been effective in solving certain classes of learning problems.
    However, embedded systems onboard planetary rovers and spacecraft rarely
    implement learning algorithms due to the constraints faced in the field, like
    processing power, chip size, convergence rate and costs due to the need for
    radiation hardening. These challenges present a compelling need for a portable,
    low-power, area efficient hardware accelerator to make learning algorithms
    practical onboard space hardware. This paper presents an FPGA
    implementation of Q-learning with Artificial Neural Networks (ANN). This
    method matches the massive parallelism inherent in neural networks with
    the fine-grain parallelism of FPGA hardware, thereby dramatically reducing
    processing time.
    Mars Science Laboratory currently uses Xilinx-Space-grade Virtex FPGA devices
    for image processing, pyrotechnic operation control and obstacle avoidance. We
    simulate and program our architecture on a Xilinx Virtex 7 FPGA. The
    architectural implementations of a single-neuron Q-learning accelerator
    and a more complex Multilayer Perceptron (MLP) Q-learning accelerator have
    been demonstrated. The results show up to a 43-fold speed-up on Virtex 7
    FPGAs compared with a conventional Intel i5 2.3 GHz CPU. Finally, we
    simulate the proposed
    architecture using the Symphony simulator and compiler from Xilinx, and
    evaluate the performance and power consumption.
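
    For orientation, the update the accelerator implements is the standard
    Q-learning temporal-difference rule with a neural approximator. A minimal
    software reference with a single linear "neuron" per action; the toy
    environment, reward, and hyperparameters are illustrative assumptions:

        import numpy as np

        rng = np.random.default_rng(0)
        n_features, n_actions = 4, 2
        W = np.zeros((n_actions, n_features))  # one linear Q-"neuron" per action
        alpha, gamma, eps = 0.1, 0.9, 0.1      # learning rate, discount, exploration

        def q_values(state):
            return W @ state                   # Q(s, a) for every action a

        state = rng.standard_normal(n_features)
        for _ in range(500):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(q_values(state)))
            next_state = rng.standard_normal(n_features)  # stand-in environment
            reward = 1.0 if (a == 1 and state[0] > 0) else 0.0  # toy reward
            # temporal-difference target and gradient-style weight update
            td_target = reward + gamma * np.max(q_values(next_state))
            W[a] += alpha * (td_target - q_values(state)[a]) * state
            state = next_state
        print(W)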

    Exploiting Convolutional Neural Network for Risk Prediction with Medical Feature Embedding

    Zhengping Che, Yu Cheng, Zhaonan Sun, Yan Liu
    Comments: NIPS 2016 Workshop on Machine Learning for Health (ML4HC)
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    The widespread availability of electronic health records (EHRs) promises to
    usher in the era of personalized medicine. However, the problem of extracting
    useful clinical representations from longitudinal EHR data remains challenging.
    In this paper, we explore deep neural network models with learned medical
    feature embedding to deal with the problems of high dimensionality and
    temporality. Specifically, we use a multi-layer convolutional neural
    network (CNN) to parameterize the model, which is thus able to capture the
    complex non-linear longitudinal evolution of EHRs. Our model can
    effectively capture local/short temporal dependencies in EHRs, which is
    beneficial for risk prediction. To account for high dimensionality, we use
    embedded medical features in the CNN model, which preserve the natural
    medical concepts. Our initial experiments
    produce promising results and demonstrate the effectiveness of both the medical
    feature embedding and the proposed convolutional neural network in risk
    prediction on cohorts of congestive heart failure and diabetes patients
    compared with several strong baselines.

    Linear convergence of SDCA in statistical estimation

    Chao Qu, Huan Xu
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    In this paper, we consider Stochastic Dual Coordinate Ascent (SDCA)
    without the strongly convex or convex assumption, which covers many useful
    models such as Lasso, group Lasso, logistic regression with (ell_1)
    regularization, corrected Lasso, and linear regression with the SCAD
    regularizer. We prove that under a mild condition called restricted strong
    convexity, which is satisfied by the above examples, the convergence rate
    is still linear up to the statistical precision of the model, which is
    much sharper than previous work with sub-linear results.

    A theoretical framework for evaluating forward feature selection methods based on mutual information

    Francisco Macedo, M. Rosário Oliveira, António Pacheco, Rui Valadas
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Feature selection problems arise in a variety of applications, such as
    microarray analysis, clinical prediction, text categorization, image
    classification and face recognition, multi-label learning, and classification
    of internet traffic. Among the various classes of methods, forward feature
    selection methods based on mutual information have become very popular and are
    widely used in practice. However, comparative evaluations of these methods have
    been limited by being based on specific datasets and classifiers. In this
    paper, we develop a theoretical framework that allows evaluating the methods
    based on their theoretical properties. Our framework is grounded on the
    properties of the target objective function that the methods try to
    approximate, and on a novel categorization of features, according to their
    contribution to the explanation of the class; we derive upper and lower bounds
    for the target objective function and relate these bounds with the feature
    types. Then, we characterize the types of approximations taken by the methods,
    and analyze how these approximations cope with the good properties of the
    target objective function. Additionally, we develop a distributional setting
    designed to illustrate the various deficiencies of the methods, and provide
    several examples of wrong feature selections. Based on our work, we identify
    clearly the methods that should be avoided, and the methods that currently have
    the best performance.
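
    A bare-bones forward selection loop of the kind being analyzed: at each
    step, greedily add the (discrete) feature with the highest estimated
    mutual information with the class. Real methods in this family (e.g. mRMR,
    CMIM) also penalize redundancy among selected features; this sketch omits
    that term:

        import numpy as np

        def mutual_information(x, y):
            joint = np.histogram2d(x, y, bins=(len(set(x)), len(set(y))))[0]
            pxy = joint / joint.sum()
            px = pxy.sum(axis=1, keepdims=True)
            py = pxy.sum(axis=0, keepdims=True)
            nz = pxy > 0
            return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

        def forward_select(X, y, k):
            selected = []
            for _ in range(k):
                remaining = [j for j in range(X.shape[1]) if j not in selected]
                selected.append(max(remaining,
                                    key=lambda j: mutual_information(X[:, j], y)))
            return selected

        rng = np.random.default_rng(0)
        y = rng.integers(0, 2, 500)
        X = np.c_[y ^ (rng.random(500) < 0.1),    # informative feature (noisy copy)
                  rng.integers(0, 2, (500, 3))]   # pure noise features
        print(forward_select(X, y, 2))            # feature 0 should come first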

    Fast and Accurate Time Series Classification with WEASEL

    Patrick Schäfer, Ulf Leser
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Machine Learning (stat.ML)

    Time series (TS) occur in many scientific and commercial applications,
    ranging from earth surveillance to industry automation to smart grids. An
    important type of TS analysis is classification, which can, for instance,
    improve energy load forecasting in smart grids by detecting the types of
    electronic devices based on their energy consumption profiles recorded by
    automatic sensors. Such sensor-driven applications are very often characterized
    by (a) very long TS and (b) very large TS datasets needing classification.
    However, current methods for time series classification (TSC) cannot cope with
    such data volumes at acceptable accuracy; they are either scalable but offer
    only inferior classification quality, or they achieve state-of-the-art
    classification quality but cannot scale to large data volumes.

    In this paper, we present WEASEL (Word ExtrAction for time SEries
    cLassification), a novel TSC method which is both scalable and accurate. Like
    other state-of-the-art TSC methods, WEASEL transforms time series into feature
    vectors, using a sliding-window approach, which are then analyzed through a
    machine learning classifier. The novelty of WEASEL lies in its specific method
    for deriving features, resulting in a much smaller yet much more discriminative
    feature set. On the popular UCR benchmark of 85 TS datasets, WEASEL is more
    accurate than the best current non-ensemble algorithms at orders-of-magnitude
    lower classification and training times, and it is almost as accurate as
    ensemble classifiers, whose computational complexity makes them inapplicable
    even for mid-size datasets. The outstanding robustness of WEASEL is also
    confirmed by experiments on two real smart grid datasets, where it
    out-of-the-box achieves almost the same accuracy as highly tuned,
    domain-specific methods.
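
    WEASEL's actual feature derivation (supervised symbolic Fourier words with
    statistical filtering) is considerably more elaborate; the sketch below
    only shows the shared skeleton it builds on: slide a window, discretize
    each window into a short word, and feed the resulting bag-of-words to a
    classifier. The discretization here is a made-up stand-in:

        import numpy as np
        from collections import Counter

        def bag_of_words(ts, window=16, letters="abcd"):
            words = []
            for i in range(len(ts) - window + 1):
                seg = ts[i:i + window]
                # discretize the window mean and slope into one symbol each
                mean_bin = min(int((seg.mean() + 2) / 4 * len(letters)),
                               len(letters) - 1)
                slope_bin = 0 if seg[-1] >= seg[0] else 1
                words.append(letters[mean_bin] + str(slope_bin))
            return Counter(words)

        rng = np.random.default_rng(0)
        ts = np.sin(np.linspace(0, 6 * np.pi, 128)) + 0.1 * rng.standard_normal(128)
        print(bag_of_words(ts).most_common(3))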

    A Model-based Projection Technique for Segmenting Customers

    Srikanth Jagabathula, Lakshminarayanan Subramanian, Ashwin Venkataraman
    Comments: 51 pages, 3 figures, 4 tables
    Subjects: Methodology (stat.ME); Learning (cs.LG); Applications (stat.AP); Machine Learning (stat.ML)

    We consider the problem of segmenting a large population of customers into
    non-overlapping groups with similar preferences, using diverse preference
    observations such as purchases, ratings, clicks, etc. over subsets of items. We
    focus on the setting where the universe of items is large (ranging from
    thousands to millions) and unstructured (lacking well-defined attributes) and
    each customer provides observations for only a few items. These data
    characteristics limit the applicability of existing techniques in marketing and
    machine learning. To overcome these limitations, we propose a model-based
    projection technique, which transforms the diverse set of observations into a
    more comparable scale and deals with missing data by projecting the transformed
    data onto a low-dimensional space. We then cluster the projected data to obtain
    the customer segments. Theoretically, we derive precise necessary and
    sufficient conditions that guarantee asymptotic recovery of the true customer
    segments. Empirically, we demonstrate the speed and performance of our method
    in two real-world case studies: (a) 84% improvement in the accuracy of new
    movie recommendations on the MovieLens data set and (b) 6% improvement in
    the performance of a similar-item recommendation algorithm on an offline
    dataset at eBay. We show that our method outperforms standard latent-class and
    demographic-based techniques.
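
    At its core, the pipeline above is: transform, project to low rank, then
    cluster. A toy numpy/scikit-learn stand-in (the paper's model-based
    transformation is more principled than the raw fill-with-zeros used here):

        import numpy as np
        from numpy.linalg import svd
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        ratings = rng.integers(1, 6, (200, 50)).astype(float)  # dense "true" ratings
        observed = rng.random((200, 50)) < 0.1  # each customer rates only a few items
        X = np.where(observed, ratings, 0.0)    # naive fill for missing entries

        U, s, Vt = svd(X, full_matrices=False)
        proj = U[:, :3] * s[:3]                 # customers in a 3-dim space
        labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(proj)
        print(np.bincount(labels, minlength=4)) # segment sizes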


    Information Theory

    Private Information Retrieval from MDS Coded Data with Colluding Servers: Settling a Conjecture by Freij-Hollanti et al

    Hua Sun, Syed A. Jafar
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Information Retrieval (cs.IR)

    A ((K, N, T, K_c)) instance of the MDS-TPIR problem is comprised of (K)
    messages and (N) distributed servers. Each message is separately encoded
    through a ((K_c, N)) MDS storage code. A user wishes to retrieve one message,
    as efficiently as possible, while revealing no information about the desired
    message index to any colluding set of up to (T) servers. The fundamental limit
    on the efficiency of retrieval, i.e., the capacity of MDS-TPIR is known only at
    the extremes where either (T) or (K_c) belongs to ({1,N}). The focus of this
    work is a recent conjecture by Freij-Hollanti, Gnilke, Hollanti and Karpuk
    which offers a general capacity expression for MDS-TPIR. We prove that the
    conjecture is false by presenting as a counterexample a PIR scheme for the
    setting ((K, N, T, K_c) = (2,4,2,2)), which achieves the rate (3/5), exceeding
    the conjectured capacity, (4/7). Insights from the counterexample lead us to
    capacity characterizations for various instances of MDS-TPIR including all
    cases with ((K, N, T, K_c) = (2,N,T,N-1)), where (N) and (T) can be arbitrary.

    On extractable shared information

    Johannes Rauh, Pradeep Kr. Banerjee, Eckehard Olbrich, Jürgen Jost, Nils Bertschinger
    Comments: 5 pages
    Subjects: Information Theory (cs.IT)

    We consider the problem of quantifying the information shared by a pair of
    random variables (X_{1},X_{2}) about another variable (S). We propose a new
    measure of shared information, called extractable shared information that is
    left monotonic; that is, the information shared about (S) is bounded from below
    by the information shared about (f(S)) for any function (f). We show that our
    measure leads to a new nonnegative decomposition of the mutual information
    (I(S;X_1X_2)) into shared, complementary and unique components. We study
    properties of this decomposition and show that a left monotonic shared
    information is not compatible with a Blackwell interpretation of unique
    information. We also discuss whether it is possible to have a decomposition in
    which both shared and unique information are left monotonic.

    A Variational Characterization of Rényi Divergences

    Venkat Anantharam
    Comments: EECS Department, University of California, Berkeley CA 94720
    Subjects: Information Theory (cs.IT); Probability (math.PR); Statistics Theory (math.ST)

    Atar, Chowdhary and Dupuis have recently exhibited a variational formula
    for exponential integrals of bounded measurable functions in terms of
    Rényi divergences. We develop a variational characterization of the Rényi
    divergences between two probability distributions on a measurable space in
    terms of relative entropies. When combined with the elementary variational
    formula for exponential integrals of bounded measurable functions in terms
    of relative entropy, this yields the variational formula of Atar,
    Chowdhary and Dupuis as a corollary. We also develop an analogous
    variational characterization of the Rényi divergence rates between two
    stationary finite state Markov chains in terms of relative entropy rates.
    When combined with Varadhan’s variational characterization of the spectral
    radius of square matrices with nonnegative entries in terms of relative
    entropy, this yields an analog of the variational formula of Atar,
    Chowdhary and Dupuis in the framework of finite state Markov chains.
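
    For orientation, recall the (discrete-case) definition of the Rényi
    divergence together with one elementary variational identity relating it
    to relative entropies; the paper's characterization is in this spirit,
    though its exact statement may differ:

        \[
          D_{\alpha}(P\|Q) = \frac{1}{\alpha-1}
          \log \sum_{x} P(x)^{\alpha}\, Q(x)^{1-\alpha},
          \qquad \alpha \in (0,1)\cup(1,\infty),
        \]
        and, for \(0<\alpha<1\),
        \[
          (1-\alpha)\, D_{\alpha}(P\|Q)
          = \min_{R}\bigl\{\alpha\, D(R\|P) + (1-\alpha)\, D(R\|Q)\bigr\},
        \]
        with the minimum attained at \(R(x) \propto P(x)^{\alpha} Q(x)^{1-\alpha}\).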

    On Deep Learning-Based Channel Decoding

    Tobias Gruber, Sebastian Cammerer, Jakob Hoydis, Stephan ten Brink
    Comments: accepted for CISS 2017
    Subjects: Information Theory (cs.IT)

    We revisit the idea of using deep neural networks for one-shot decoding of
    random and structured codes, such as polar codes. Although it is possible to
    achieve maximum a posteriori (MAP) bit error rate (BER) performance for both
    code families and for short codeword lengths, we observe that (i) structured
    codes are easier to learn and (ii) the neural network is able to generalize to
    codewords that it has never seen during training for structured, but not for
    random codes. These results provide some evidence that neural networks can
    learn a form of decoding algorithm, rather than only a simple classifier. We
    introduce the metric normalized validation error (NVE) in order to further
    investigate the potential and limitations of deep learning-based decoding with
    respect to performance and complexity.

    Information-geometrical characterization of statistical models which are statistically equivalent to probability simplexes

    Hiroshi Nagaoka
    Comments: Submitted to IEEE ISIT 2017
    Subjects: Information Theory (cs.IT); Statistics Theory (math.ST)

    We give a characterization of statistical models on finite sets which are
    statistically equivalent to probability simplexes in terms of (alpha)-families
    including exponential families and mixture families. The subject has a close
    relation to some fundamental aspects of information geometry such as
    (alpha)-connections and autoparallelity.

    Fast Erasure Coding based on Polynomial Ring Transforms

    Jonathan Detchart, Jérôme Lacan
    Subjects: Information Theory (cs.IT)

    The complexity of software implementations of MDS erasure codes mainly
    depends on the efficiency of the implementation of finite field operations. In
    this paper, we propose a method to reduce the complexity of the finite field
    multiplication by using fast transforms between a field and a ring to perform
    the multiplication in a ring. We show that moving to a ring reduces the
    complexity of the operations. Then, we show that this construction allows the
    use of simple scheduling to reduce the number of operations.

    Alpha Fair Coded Caching

    Apostolos Destounis, Mari Kobayashi, Georgios Paschos, Asma Ghorbel
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    The performance of existing coded caching schemes is sensitive to the
    worst channel quality, a problem which is exacerbated when communicating
    over fading channels. In this paper we address this limitation in the
    following manner: in the short term, we allow transmissions to subsets of
    users with good channel quality, avoiding users with fades, while in the
    long term we ensure fairness across the different users. Our online scheme
    combines (i) joint scheduling and power control for the broadcast channel
    with fading, and (ii) congestion control for ensuring the optimal
    long-term average performance. We restrict the caching operations to the
    decentralized scheme of [maddah2013decentralized], and subject to this
    restriction we prove that our scheme has near-optimal overall performance
    with respect to the convex alpha-fairness coded caching optimization. By
    tuning the coefficient alpha, the
    operator can differentiate user performance with respect to video delivery
    rates achievable by coded caching.

    We demonstrate via simulations our scheme’s superiority over legacy coded
    caching and unicast opportunistic scheduling, which are identified as special
    cases of our general framework.

    Analogy and duality between random channel coding and lossy source coding

    Sergey Tridenski, Ram Zamir
    Comments: This paper is self-contained, and serves also as an addendum to our paper “Exponential source/channel duality”
    Subjects: Information Theory (cs.IT)

    Here we write in a unified fashion (using “R(P, Q, D)”) the random coding
    exponents in channel coding and lossy source coding. We derive their explicit
    forms and show, that, for a given random codebook distribution Q, the channel
    decoding error exponent can be viewed as an encoding success exponent in lossy
    source coding, and the channel correct-decoding exponent can be viewed as an
    encoding failure exponent in lossy source coding. We then extend the channel
    exponents to arbitrary D, which corresponds for D > 0 to erasure decoding and
    for D < 0 to list decoding. For comparison, we also derive the exact random
    coding exponent for Forney’s optimum tradeoff decoder.

    Exponential Source/Channel Duality

    Sergey Tridenski, Ram Zamir
    Subjects: Information Theory (cs.IT)

    We propose a source/channel duality in the exponential regime, where
    success/failure in source coding parallels error/correctness in channel coding,
    and a distortion constraint becomes a log-likelihood ratio (LLR) threshold. We
    establish this duality by first deriving exact exponents for lossy coding of a
    memoryless source P, at distortion D, for a general i.i.d. codebook
    distribution Q, for both encoding success (R < R(P,Q,D)) and failure (R >
    R(P,Q,D)). We then turn to maximum likelihood (ML) decoding over a memoryless
    channel P with an i.i.d. input Q, and show that if we substitute P=QP, Q=Q, and
    D=0 under the LLR distortion measure, then the exact exponents for
    decoding-error (R < I(Q, P)) and strict correct-decoding (R > I(Q, P)) follow
    as special cases of the exponents for source encoding success/failure,
    respectively. Moreover, by letting the threshold D take general values, the
    exact random-coding exponents for erasure (D > 0) and list decoding (D < 0)
    under the simplified Forney decoder are obtained. Finally, we derive the exact
    random-coding exponent for Forney’s optimum tradeoff erasure/list decoder, and
    show that at the erasure regime it coincides with Forney’s lower bound and with
    the simplified decoder exponent.

    Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

    Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, Taras Holotyak
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    This paper addresses the problem of Approximate Nearest Neighbor (ANN) search
    in pattern recognition where feature vectors in a database are encoded as
    compact codes in order to speed up the similarity search in large-scale
    databases. Considering the ANN problem from an information-theoretic
    perspective, we interpret it as an encoding which maps the original feature
    vectors to a less-entropic sparse representation while requiring them to be as
    informative as possible. We then define the coding gain for ANN search using
    information-theoretic measures. We next show that the classical approach to
    this problem which consists of binarization of the projected vectors is
    sub-optimal. Instead, we show that a recently proposed ternary encoding
    achieves higher coding gains.

    Private Information Retrieval Schemes for Coded Data with Arbitrary Collusion Patterns

    Razane Tajeddine, Oliver W. Gnilke, David Karpuk, Ragnar Freij-Hollanti, Camilla Hollanti, Salim El Rouayheb
    Subjects: Information Theory (cs.IT)

    In Private Information Retrieval (PIR), one wants to download a file from a
    database without revealing to the database which file is being downloaded. Much
    attention has been paid to the case of the database being encoded across
    several servers, subsets of which can collude to attempt to deduce the
    requested file. With the goal of studying the achievable PIR rates in realistic
    scenarios, we generalize results for coded data from the case of all subsets of
    servers of size (t) colluding, to arbitrary subsets of the servers. We
    investigate the effectiveness of previous strategies in this new scenario, and
    present new results in the case where the servers are partitioned into disjoint
    colluding groups.

    Non-Uniformly Coupled LDPC Codes: Better Thresholds, Smaller Rate-loss, and Less Complexity

    Laurent Schmalen, Vahid Aref, Fanny Jardel
    Comments: submitted to IEEE International Symposium on Information Theory (ISIT) 2017
    Subjects: Information Theory (cs.IT)

    We consider spatially coupled low-density parity-check codes with finite
    smoothing parameters. A finite smoothing parameter is important for designing
    practical codes that are decoded using low-complexity windowed decoders. By
    optimizing the amount of coupling between spatial positions, we show that we
    can construct codes with excellent thresholds and small rate loss, even with
    the lowest possible smoothing parameter and large variable node degrees, which
    are required for low error floors. We also establish that the decoding
    convergence speed is faster with non-uniformly coupled codes, which we verify
    by density evolution of windowed decoding with a finite number of iterations.
    We also show that by only slightly increasing the smoothing parameter,
    practical codes with potentially low error floors and thresholds close to
    capacity can be constructed. Finally, we give some indications on protograph
    designs.

    Minimum-Distance Based Construction of Multi-Kernel Polar Codes

    Valerio Bioglio, Frederic Gabry, Ingmar Land, Jean-Claude Belfiore
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT)

    In this paper, we propose a construction for multi-kernel polar codes based on
    the maximization of the minimum distance. Compared to the original construction
    based on density evolution, our new design shows particular advantages for
    short code lengths, where the polarization effect has less impact on the
    performance than the distances of the code. We introduce and compute the
    minimum-distance profile and provide a simple greedy algorithm for the code
    design. Compared to state-of-the art punctured or shortened Arikan polar codes,
    multi-kernel polar codes with our new design show significantly improved
    error-rate performance.

    Lattice coding for Rician fading channels from Hadamard rotations

    Alex Karrila, Niko R. Väisänen, David Karpuk, Camilla Hollanti
    Subjects: Information Theory (cs.IT)

    In this paper, we study lattice coding for Rician fading wireless channels.
    This is motivated in particular by preliminary studies suggesting the Rician
    fading model for millimeter-wavelength wireless communications. We restrict to
    lattice codes arising from rotations of (mathbb{Z}^n), and to a single-input
    single-output (SISO) channel. We observe that several lattice design criteria
    suggest the optimality of Hadamard rotations. For instance, we prove that
    Hadamard rotations maximize the diamond-packing density among all rotated
    (mathbb{Z}^n) lattices. Finally, we provide simulations to show that Hadamard
    rotations outperform optimal algebraic rotations and cross-packing lattices in
    the Rician channel.
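
    A quick way to see a Hadamard rotation in action: normalize a Sylvester
    Hadamard matrix into an orthogonal matrix and apply it to points of
    (mathbb{Z}^n). (Assessing diamond-packing density, as in the paper, needs
    more machinery; this only shows the rotation itself.)

        import numpy as np

        def hadamard(n):
            """Sylvester construction; n must be a power of two."""
            H = np.array([[1.0]])
            while H.shape[0] < n:
                H = np.block([[H, H], [H, -H]])
            return H

        n = 4
        R = hadamard(n) / np.sqrt(n)             # orthogonal: R @ R.T = I
        assert np.allclose(R @ R.T, np.eye(n))
        z = np.array([1, 0, -2, 3], dtype=float) # a point of Z^4
        print(R @ z)                             # its image in the rotated lattice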

    Coarse-graining and the Blackwell order

    Johannes Rauh, Pradeep Kr. Banerjee, Eckehard Olbrich, Jürgen Jost, Nils Bertschinger, David Wolpert
    Comments: 5 pages, 1 figure
    Subjects: Information Theory (cs.IT)

    Suppose we have a pair of information channels, (kappa_{1},kappa_{2}) with
    a common input. The Blackwell order is a partial order over channels that
    compares (kappa_{1}) and (kappa_{2}) by the maximal expected utility an agent
    can obtain when decisions are based on the outputs of (kappa_{1}) and
    (kappa_{2}). Equivalently, (kappa_{1}) is said to be Blackwell-inferior to
    (kappa_{2}) if and only if (kappa_{1}) can be constructed by garbling the
    output of (kappa_{2}). A related partial order stipulates that (kappa_{2}) is
    more capable than (kappa_{1}) if the mutual information between the input and
    output is larger for (kappa_{2}) than for (kappa_{1}) for any distribution
    over inputs. If one channel is Blackwell-inferior to another then it must be
    less capable. However, examples are known where (kappa_{1}) is less capable
    than (kappa_{2}) even though it is not Blackwell-inferior. We give a new
    example of this phenomenon in which (kappa_{1}) is constructed by
    coarse-graining the inputs of (kappa_{2}). Such a coarse-graining is a special
    kind of “pre-garbling” of the channel inputs. This example directly establishes
    that the expected value of the shared utility function for the coarse-grained
    channel is larger than it is for the non-coarse-grained channel. This is
    surprising, as one might think that coarse-graining can only destroy
    information and lead to inferior channels.
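
    Garbling, concretely: representing channels as row-stochastic matrices,
    (kappa_{1}) is Blackwell-inferior to (kappa_{2}) exactly when (kappa_{1})
    equals (kappa_{2}) followed by some row-stochastic post-processing G. A
    minimal numeric instance (the matrices are arbitrary illustrative values):

        import numpy as np

        kappa2 = np.array([[0.9, 0.1],   # a fairly clean binary channel
                           [0.2, 0.8]])
        G = np.array([[0.7, 0.3],        # post-processing noise (garbling)
                      [0.3, 0.7]])
        kappa1 = kappa2 @ G              # hence Blackwell-inferior to kappa2
        print(kappa1)
        assert np.allclose(kappa1.sum(axis=1), 1.0)  # still a valid channel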

    Backscatter Communications for Internet-of-Things: Theory and Applications

    Wanchun Liu, Kaibin Huang, Xiangyun Zhou, Salman Durrani
    Comments: submitted for possible journal publications
    Subjects: Information Theory (cs.IT)

    The Internet-of-Things (IoT) is an emerging concept of network
    connectivity at any time and anywhere for billions of everyday objects,
    which has recently attracted tremendous attention from both industry and
    academia. The rapid
    growth of IoT has been driven by recent advancements in consumer electronics,
    wireless network densification, 5G communication technologies [e.g., millimeter
    wave and massive multiple-input and multiple-output (MIMO)], and
    cloud-computing enabled big-data analytics. One of the remaining key challenges
    for IoT is the limited network lifetime due to massive IoT devices being
    powered by batteries with finite capacities. Low-power, low-complexity
    backscatter communication (BackCom) has emerged as a promising technology
    for tackling this challenge. In this article, we present an overview of this
    active area by discussing basic principles, system and network architectures
    and relevant techniques. Last, we describe the IoT applications for BackCom and
    how the technology can solve the energy challenge for IoT.

    Explicit Constructions and Bounds for Batch Codes with Restricted Size of Reconstruction Sets

    Eldho K. Thomas, Vitaly Skachek
    Subjects: Information Theory (cs.IT)

    Linear batch codes and codes for private information retrieval (PIR) with a
    query size (t) and a restricted size (r) of the reconstruction sets are
    studied. New bounds on the parameters of such codes are derived for small
    values of (t) or of (r) by providing corresponding constructions. By building
    on the ideas of Cadambe and Mazumdar, a new bound in a recursive form is
    derived for batch codes and PIR codes.

    Approximate Capacity of a Class of Partially Connected Interference Channels

    Muryong Kim, Yitao Chen, Sriram Vishwanath
    Comments: A short version submitted to ISIT 2017
    Subjects: Information Theory (cs.IT)

    We derive inner and outer bounds on the capacity region for a class of
    three-user partially connected interference channels. We focus on the impact of
    topology, interference alignment, and interplay between interference and noise.
    The representative channels we consider are the ones that have clear
    interference alignment gain. For these channels, Z-channel type outer bounds
    are tight to within a constant gap from capacity. We present near-optimal
    achievable schemes based on rate-splitting and lattice alignment.

    Sample Complexity of the Boolean Multireference Alignment Problem

    Emmanuel Abbe, Joao Pereira, Amit Singer
    Comments: 5 pages, submitted to ISIT
    Subjects: Information Theory (cs.IT)

    The Boolean multireference alignment problem consists in recovering a Boolean
    signal from multiple shifted and noisy observations. In this paper we obtain an
    expression for the error exponent of the maximum a posteriori (MAP) decoder. This
    expression is used to characterize the number of measurements needed for signal
    recovery in the low SNR regime, in terms of higher order autocorrelations of
    the signal. The characterization is explicit for various signal dimensions,
    such as prime and even dimensions.
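
    To make the observation model concrete, the following sketch (parameters
    illustrative) draws noisy cyclic shifts of a Boolean signal and estimates its
    second-order cyclic autocorrelation, which is shift-invariant and can therefore
    be recovered by plain averaging plus a debiasing factor. The paper's
    sample-complexity analysis concerns when such higher-order autocorrelations
    determine the signal; second order alone is shown here only for brevity.

        import numpy as np

        rng = np.random.default_rng(1)

        # A Boolean signal observed through random cyclic shifts + BSC noise.
        L, M, p = 7, 20000, 0.25              # length, #observations, flip prob.
        x = rng.integers(0, 2, size=L)
        obs = np.array([np.roll(x, s) for s in rng.integers(0, L, size=M)])
        obs ^= rng.random((M, L)) < p         # independent bit flips

        # Work in +/-1 form, where bit flips act multiplicatively with mean
        # 1 - 2p, so lag-k (k > 0) correlations shrink by exactly (1 - 2p)^2.
        def acf(v):                           # cyclic autocorrelation, lags 1..L-1
            return np.array([np.mean(v * np.roll(v, k)) for k in range(1, L)])

        xs = 1 - 2 * x.astype(float)
        obs_s = 1 - 2 * obs.astype(float)
        est = np.mean([acf(o) for o in obs_s], axis=0) / (1 - 2 * p) ** 2

        print("true acf:", np.round(acf(xs), 3))
        print("est. acf:", np.round(est, 3))  # shifts average out: acf is invariant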

    Design of Improved Quasi-Cyclic Protograph-Based Raptor-Like LDPC Codes for Short Block-Lengths

    Sudarsan V. S. Ranganathan, Dariush Divsalar, Richard D. Wesel
    Comments: Longer version of a submission to 2017 International Symposium on Information Theory
    Subjects: Information Theory (cs.IT)

    Protograph-based Raptor-like low-density parity-check codes (PBRL codes) are
    a recently proposed family of easily encodable and decodable rate-compatible
    LDPC (RC-LDPC) codes. These codes have an excellent iterative decoding
    threshold and performance across all design rates. PBRL codes designed thus
    far, for both long and short block-lengths, have been based on optimizing the
    iterative decoding threshold of the protograph of the RC code family at various
    design rates.

    In this work, we propose a design method to obtain better quasi-cyclic (QC)
    RC-LDPC codes with PBRL structure for short block-lengths (of a few hundred
    bits). We achieve this by maximizing an upper bound on the minimum distance of
    any QC-LDPC code that can be obtained from the protograph of a PBRL ensemble.
    The obtained codes outperform the original PBRL codes at short block-lengths by
    significantly improving the error floor behavior at all design rates.
    Furthermore, we identify a reduction in complexity of the design procedure,
    facilitated by the general structure of a PBRL ensemble.
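
    The abstract does not spell out which bound is maximized; one standard upper
    bound of this type is the permanent-based bound of Smarandache and Vontobel for
    QC codes lifted from a 0/1 protograph base matrix, evaluated below for a small
    example (that the paper uses exactly this form is an assumption).

        import itertools
        import numpy as np

        def perm(M):
            """Permanent by Laplace expansion (fine for tiny matrices)."""
            M = np.asarray(M)
            if M.shape[0] == 0:
                return 1
            return sum(M[0, j] * perm(np.delete(M[1:], j, axis=1))
                       for j in range(M.shape[1]) if M[0, j])

        def qc_dmin_upper_bound(B):
            """Permanent-based upper bound on the minimum distance of any
            QC-LDPC code lifted from a 0/1 m x n base matrix B: minimize, over
            (m+1)-column subsets S, the sum of permanents of B restricted to
            S minus one column (Smarandache-Vontobel style)."""
            B = np.asarray(B)
            m, n = B.shape
            best = None
            for S in itertools.combinations(range(n), m + 1):
                val = sum(perm(B[:, [j for j in S if j != i]]) for i in S)
                if best is None or val < best:
                    best = val
            return best

        # Small illustrative base matrix (not from the paper).
        B = np.array([[1, 1, 1, 1, 0],
                      [1, 1, 0, 1, 1],
                      [0, 1, 1, 1, 1]])
        print("d_min upper bound:", qc_dmin_upper_bound(B))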

    The Role of Transmitter Cooperation in Linear Interference Networks with Block Erasures

    Yasemin Karacora, Tolunay Seyfi, Aly El Gamal
    Comments: 5 pages, submitted to International Symposium on Information Theory (ISIT 2017)
    Subjects: Information Theory (cs.IT)

    In this work, we explore the potential and optimal use of transmitter
    cooperation in large wireless networks with deep fading conditions. We consider
    a linear interference network with K transmitter-receiver pairs, where each
    transmitter can be connected to two neighboring receivers. Long-term
    fluctuations (shadow fading) in the wireless channel can lead to any link being
    erased with probability p. Each receiver is interested in one unique message
    that can be available at two transmitters. The considered rate criterion is the
    per user degrees of freedom (puDoF) as K goes to infinity. Prior to this work,
    the optimal assignment of messages to transmitters was identified in the two
    limits p -> 0 and p -> 1. We identify new schemes that achieve average puDoF
    values higher than the state of the art over a significant part of the range
    0 < p < 1. The key insight behind our results is that the role of cooperation
    shifts from increasing the probability of delivering a message to its intended
    destination at high values of p, to interference cancellation at low values of
    p. Our schemes are based on an algorithm that achieves the optimal puDoF value
    when restricted to a given message assignment and to interference-avoidance
    zero-forcing schemes.

    Joint Uplink-Downlink Cell Associations for Interference Networks with Local Connectivity

    Manik Singhal, Aly El Gamal
    Comments: 5 pages, submitted to International Symposium on Information Theory (ISIT 2017)
    Subjects: Information Theory (cs.IT)

    We study information theoretic models of interference networks that consist
    of K Base Station (BS) – Mobile Terminal (MT) pairs. Each BS is connected to
    the MT carrying the same index as well as L following MTs. We fix the value of
    L and study large networks as K goes to infinity. We assume that each MT can be
    associated with Nc BSs, and these associations are determined by a cloud-based
    controller that has a global view of the network. An MT has to be associated
    with a BS, in order for the BS to transmit its message in the downlink, or
    decode its message in the uplink. In previous work, the cell associations that
    maximize the average uplink-downlink per user degrees of freedom (puDoF) were
    identified for the case when L=1. Further, when only the downlink is
    considered, the problem was settled for all values of L when we are restricted
    to use only zero-forcing interference cancellation schemes. In this work, we
    first propose puDoF inner bounds for arbitrary values of L when only the uplink
    is considered, and characterize the uplink puDoF value when Nc > L-1. We then
    introduce new achievable average uplink-downlink puDoF values, and conjecture
    that the new scheme is optimal for all values of L when we restrict our
    attention to zero-forcing schemes.

    Floor Scale Modulo Lifting for QC-LDPC codes

    Nikita Polyanskii, Vasiliy Usatyuk, Ilya Vorobyev
    Comments: 5 pages, 2 columns
    Subjects: Information Theory (cs.IT)

    In this paper we present a novel approach for constructing a QC-LDPC code of
    smaller length by lifting a given QC-LDPC code. The proposed method can be
    considered a generalization of floor lifting. We also prove several
    probabilistic statements concerning the theoretical improvement the method
    brings with respect to the number of small cycles. With some offline
    calculation of the scale parameter, it is possible to construct a sequence of
    QC-LDPC codes with different circulant sizes generated from a single exponent
    matrix using only floor and scale operations. The only parameter we store in
    memory is a constant needed for scaling.
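
    As a rough illustration: standard floor lifting maps an exponent e for
    circulant size Z to floor(e * Z'/Z) for a new size Z', and the scale step
    multiplies exponents by the stored constant s modulo Z first. The exact
    composition e' = floor(((s * e) mod Z) * Z'/Z) used below is an assumption,
    not a formula from the paper; a quick 4-cycle count shows why the choice of s
    matters.

        import numpy as np

        def floor_scale_lift(E, Z, Z_new, s):
            """Map an exponent matrix E (circulant size Z, entry -1 = zero
            block) to circulant size Z_new: scale by s modulo Z, then
            floor-lift. The composition of scale and floor is an assumption."""
            E = np.asarray(E)
            return np.where(E < 0, -1,
                            ((np.maximum(E, 0) * s) % Z) * Z_new // Z)

        def count_4cycles(E, Z):
            """Count length-4 cycles: rows i1<i2, cols j1<j2 with
            E[i1,j1]-E[i1,j2]+E[i2,j2]-E[i2,j1] = 0 (mod Z), all blocks set."""
            E = np.asarray(E)
            m, n = E.shape
            cnt = 0
            for i1 in range(m):
                for i2 in range(i1 + 1, m):
                    for j1 in range(n):
                        for j2 in range(j1 + 1, n):
                            es = E[[i1, i1, i2, i2], [j1, j2, j2, j1]]
                            if (es >= 0).all() and \
                                    (es[0] - es[1] + es[2] - es[3]) % Z == 0:
                                cnt += 1
            return cnt

        E = np.array([[0, 1, 3], [2, 5, 0]])  # toy exponent matrix, Z = 8
        for s in range(1, 8, 2):              # odd scales are invertible mod 8
            E4 = floor_scale_lift(E, 8, 4, s)
            print(f"s={s}: 4-cycles at Z=4: {count_4cycles(E4, 4)}")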

    Non-colocated Time-Reversal MUSIC: High-SNR Distribution of Null Spectrum

    D. Ciuonzo, P. Salvo Rossi
    Comments: accepted in IEEE Signal Processing Letters
    Subjects: Information Theory (cs.IT)

    We derive the asymptotic distribution of the null spectrum of the well-known
    Multiple Signal Classification (MUSIC) in its computational Time-Reversal (TR)
    form. The result pertains to a single-frequency non-colocated multistatic
    scenario and several TR-MUSIC variants are here investigated. The analysis
    builds upon the 1st-order perturbation of the singular value decomposition and
    allows a simple characterization of null-spectrum moments (up to the 2nd
    order). This enables a comparison of the variants in terms of spectrum
    stability. Finally, a
    numerical analysis is provided to confirm the theoretical findings.
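
    For readers unfamiliar with the computational TR form, the sketch below
    (geometry, Green's-function model, and noise level all illustrative) builds a
    multistatic matrix under the Born approximation, takes its SVD, and evaluates
    the null spectrum as the squared projection of a normalized steering vector
    onto the noise subspace; the letter characterizes the distribution of exactly
    this quantity under high-SNR perturbations.

        import numpy as np

        rng = np.random.default_rng(2)

        # Geometry: N transceivers on a line, point scatterers in the 2-D plane.
        N, wavelen = 16, 1.0
        k0 = 2 * np.pi / wavelen
        array = np.stack([np.linspace(-4, 4, N), np.zeros(N)], axis=1)
        scatterers = np.array([[0.5, 3.0], [-1.5, 4.0]])

        def steering(p):
            """Single-frequency 2-D Green's-function-like vector (illustrative)."""
            d = np.linalg.norm(array - p, axis=1)
            return np.exp(1j * k0 * d) / np.sqrt(d)

        # Multistatic matrix under the Born approximation + measurement noise.
        K = sum(np.outer(steering(p), steering(p)) for p in scatterers)
        K += 0.05 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))

        # Noise subspace from the SVD; null spectrum = projection onto it.
        U, sv, Vh = np.linalg.svd(K)
        Un = U[:, len(scatterers):]           # assumes #scatterers known

        def null_spectrum(p):
            g = steering(p)
            g = g / np.linalg.norm(g)
            return np.linalg.norm(Un.conj().T @ g) ** 2  # ~0 at scatterers

        for p in scatterers:
            print(f"at scatterer {p}: {null_spectrum(p):.4f}")
        print(f"away from them : {null_spectrum(np.array([2.0, 2.0])):.4f}")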

    Locality and Availability of Array Codes Constructed from Subspaces

    Natalia Silberstein, Tuvi Etzion, Moshe Schwartz
    Subjects: Information Theory (cs.IT)

    Ever-increasing amounts of data are created and processed in internet-scale
    companies such as Google, Facebook, and Amazon. The efficient storage of such
    copious amounts of data has thus become a fundamental and acute problem in
    modern computing. No single machine can possibly satisfy such immense storage
    demands. Therefore, distributed storage systems (DSS), which rely on tens of
    thousands of storage nodes, are the only viable solution. Such systems are
    broadly used in all modern internet-scale systems. However, the design of a DSS
    poses a number of crucial challenges, markedly different from single-user
    storage systems. Such systems must be able to reconstruct the data efficiently,
    to overcome server failures, to correct errors, etc. A great deal of research
    has been devoted to these challenges in recent years, and the effort keeps
    growing in parallel with the amount of stored data.

    The main goal of this paper is to consider codes which have two of the most
    important features of distributed storage systems, namely, locality and
    availability. Our codes are array codes which are based on subspaces of a
    linear space over a finite field. We present several constructions of such
    codes which are (q)-analogs of some known block codes. Some of these codes
    possess independent intellectual merit. We examine the locality and
    availability of the constructed codes. In particular, we distinguish between
    two types of locality and availability: node and symbol. To our knowledge,
    this is the first time such a distinction has been made in the literature.

    Two-Party Function Computation on the Reconciled Data

    Ivo Kubjas, Vitaly Skachek
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC)

    Consider a distributed system with two users, where each user possesses a
    collection of binary strings. We introduce a new problem termed function
    computation on the reconciled data, which generalizes the set reconciliation
    problem in the literature. It is shown that any deterministic protocol that
    computes a sum and a product of reconciled sets of nonnegative integers has to
    communicate at least (2^n + n - 1) and (2^n + n - 3) bits in the worst-case
    scenario, respectively, where (n) is the length of the binary string
    representations of the numbers. Connections to other problems in computer
    science, such as set disjointness and finding the intersection, are
    established, yielding a variety of additional bounds on the communication
    complexity. A protocol based on the use of a family of hash functions is
    presented, and its characteristics are analyzed.

    Joint Power Allocation and Beamforming for Energy-Efficient Two-Way Multi-Relay Communications

    Zhichao Sheng, Hoang D. Tuan, Trung Q. Duong, H. Vincent Poor
    Comments: 26 pages, 9 figures
    Subjects: Information Theory (cs.IT)

    This paper considers the joint design of user power allocation and relay
    beamforming in relaying communications, in which multiple pairs of
    single-antenna users exchange information with each other via multiple-antenna
    relays in two time slots. All users transmit their signals to the relays in the
    first time slot while the relays broadcast the beamformed signals to all users
    in the second time slot. The aim is to maximize the system’s energy efficiency
    (EE) subject to quality-of-service (QoS) constraints in terms of exchange
    throughput requirements. The QoS constraints are nonconvex with many nonlinear
    cross-terms, so finding a feasible point is already computationally
    challenging. The sum throughput appears in the numerator while the total
    consumption power appears in the denominator of the EE objective function. The
    former is a nonconcave function and the latter is a nonconvex function, making
    fractional programming useless for EE optimization. Nevertheless, efficient
    low-complexity iterations for obtaining optimized solutions are developed. The
    performance of the multiple-user, multiple-relay networks under various
    scenarios is evaluated to demonstrate the merit of the proposed development.

    Relay-Assisted Mixed FSO/RF Systems over Málaga-(mathcal{M}) and (κ)-(μ) Shadowed Fading Channels

    Nesrine Cherif, Imène Trigui, Sofiène Affes
    Subjects: Information Theory (cs.IT)

    This letter presents a unified analytical framework for relay-assisted mixed
    FSO/RF transmission. In addition to accounting for different FSO detection
    techniques, the mathematical model offers a twofold unification of mixed FSO/RF
    systems by considering mixed M’alaga-(mathcal{M})/(kappa)-(mu) shadowed
    fading, which includes as special cases nearly all linear turbulence/fading
    models adopted in the open literature.

    Group Testing using left-and-right-regular sparse-graph codes

    Avinash Vem, Nagaraj T. Janakiraman, Krishna R. Narayanan
    Comments: Part of this work is submitted to IEEE International Symposium on Information Theory 2017
    Subjects: Information Theory (cs.IT)

    We consider the problem of non-adaptive group testing of (N) items out of
    which (K) or fewer items are known to be defective. We propose a testing
    scheme based on left-and-right-regular sparse-graph codes and a simple
    iterative decoder. We show that for any arbitrarily small (epsilon>0) our
    scheme requires only (m=c_epsilon Klog frac{c_1N}{K}) tests to recover a
    ((1-epsilon)) fraction of the defective items with high probability (w.h.p.),
    i.e., with probability approaching (1) asymptotically in (N) and (K), where
    the constants (c_epsilon) and (ell) are functions of the desired error floor
    (epsilon), and (c_1=frac{ell}{c_epsilon}) (observed to be approximately equal
    to 1 for various values of (epsilon)). More importantly, the iterative
    decoding algorithm has a sub-linear computational complexity of
    (mathcal{O}(Klog frac{N}{K})), which is known to be optimal. Also, for (m=c_2
    Klog Klog frac{N}{K}) tests our scheme recovers the whole set of defective
    items w.h.p. These results are valid for both noiseless and noisy versions of
    the problem as long as the number of defective items scales sub-linearly with
    the total number of items, i.e., (K=o(N)). The simulation results validate the
    theoretical findings by showing a substantial improvement in the number of
    tests required compared to a testing scheme based on left-regular sparse
    graphs.
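
    The following sketch conveys the flavor of such schemes with a left-regular
    random design and a simple peeling-style decoder; the parameters and the
    design are illustrative stand-ins, not the paper's left-and-right-regular
    construction.

        import numpy as np

        rng = np.random.default_rng(3)

        # Non-adaptive OR-based group testing: N items, K defectives, m tests.
        N, K, m, ell = 2000, 10, 250, 6       # ell = tests per item (left degree)
        defective = set(rng.choice(N, size=K, replace=False))

        # Left-regular random design: each item joins ell tests at random.
        tests_items = [[] for _ in range(m)]
        for item in range(N):
            for t in rng.choice(m, size=ell, replace=False):
                tests_items[t].append(item)
        results = np.array([any(i in defective for i in items)
                            for items in tests_items])

        # Iterative decoding: (1) items in any negative test are non-defective;
        # (2) a positive test with no known defective and exactly one unresolved
        # item pins that item as defective; repeat until no progress.
        status = {}                           # item -> True/False once resolved
        for t in np.flatnonzero(~results):
            for i in tests_items[t]:
                status[i] = False
        changed = True
        while changed:
            changed = False
            for t in np.flatnonzero(results):
                unresolved = [i for i in tests_items[t] if i not in status]
                if len(unresolved) == 1 and \
                        not any(status.get(i) for i in tests_items[t]):
                    status[unresolved[0]] = True
                    changed = True

        found = {i for i, d in status.items() if d}
        print(f"recovered {len(found & defective)}/{K} defectives, "
              f"false positives: {len(found - defective)}")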

    An Explicit, Coupled-Layer Construction of a High-Rate MSR Code with Low Sub-Packetization Level, Small Field Size and (d< (n-1))

    Birenjith Sasidharan, Myna Vajha, P. Vijay Kumar
    Comments: submitted to ISIT 2017. arXiv admin note: text overlap with arXiv:1607.07335
    Subjects: Information Theory (cs.IT)

    This paper presents an explicit construction for an
    (((n=2qt, k=2q(t-1), d=n-(q+1)), (alpha = q(2q)^{t-1}, beta =
    frac{alpha}{q}))) regenerating code over a field (mathbb{F}_Q) operating at
    the Minimum Storage Regeneration (MSR) point. The MSR code can be constructed
    to have rate (k/n) as close to (1) as desired, sub-packetization level (alpha
    leq r^{frac{n}{r}}) for (r=(n-k)), and field size (Q) no larger than (n);
    moreover, all code symbols can be repaired with the same minimum data
    download.
    This is the first-known construction of such an MSR code for (d<(n-1)).

    On The Compound MIMO Wiretap Channel with Mean Feedback

    Amr Abdelaziz, C. Emre Koksal, Hesham El Gamal, Ashraf D. Elbayoumy
    Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)

    A compound MIMO wiretap channel with double-sided uncertainty is considered
    under a channel mean information model, in which channel variations are
    centered around a mean value that is fed back to the transmitter. We show that
    the worst-case main channel is anti-parallel to the channel mean information,
    resulting in an overall unit-rank channel. Further,
    the worst eavesdropper channel is shown to be isotropic around its mean
    information. Accordingly, beamforming is shown to be the optimal signaling
    strategy. We show that the saddle-point property holds under the mean
    information model, and thus the compound secrecy capacity equals the
    worst-case capacity over the uncertainty class. We show that null-steering
    (NS) beamforming is the optimal beamforming direction, that is, transmission
    in the direction orthogonal to the eavesdropper's mean channel that maintains
    the maximum possible gain along the main channel's mean direction. An
    equivalence relation with MIMO
    wiretap channel with Rician fading is established. Simulation results show that
    NS beamforming outperforms both maximum ratio transmission (MRT) and zero
    forcing (ZF) beamforming approaches over the entire SNR range.
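
    A small numerical sketch of the NS direction (single-stream vectors stand in
    for the rank-one channel means; all numbers illustrative): project the main
    channel's mean onto the orthogonal complement of the eavesdropper's mean and
    compare the resulting gains with maximum ratio transmission.

        import numpy as np

        rng = np.random.default_rng(4)

        # Mean (feedback) directions of the main and eavesdropper channels,
        # reduced to vectors for a single-stream beamforming illustration.
        Nt = 8
        h_m = rng.standard_normal(Nt) + 1j * rng.standard_normal(Nt)  # main mean
        h_e = rng.standard_normal(Nt) + 1j * rng.standard_normal(Nt)  # eve mean

        def unit(v):
            return v / np.linalg.norm(v)

        # Null-steering (NS): beamform orthogonally to the eavesdropper's mean
        # while retaining the largest possible gain along the main mean.
        u_e = unit(h_e)
        w_ns = unit(h_m - (u_e.conj() @ h_m) * u_e)  # project out eve's direction
        w_mrt = unit(h_m)                            # maximum ratio transmission

        for name, w in [("NS ", w_ns), ("MRT", w_mrt)]:
            print(f"{name}: |h_m^H w|^2 = {abs(h_m.conj() @ w)**2:7.3f}, "
                  f"|h_e^H w|^2 = {abs(h_e.conj() @ w)**2:7.3f}")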



