
    arXiv Paper Daily: Tue, 20 Dec 2016

    Published by 我爱机器学习 (52ml.net) on 2016-12-20 00:00:00

    Neural and Evolutionary Computing

    A recurrent neural network without chaos

    Thomas Laurent, James von Brecht
    Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL); Learning (cs.LG)

    We introduce an exceptionally simple gated recurrent neural network (RNN)
    that achieves performance comparable to well-known gated architectures, such as
    LSTMs and GRUs, on the word-level language modeling task. We prove that our
    model has simple, predictable, and non-chaotic dynamics. This stands in stark
    contrast to more standard gated architectures, whose underlying dynamical
    systems exhibit chaotic behavior.
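
    As a rough illustration only (not the authors' exact equations), a minimal
    gated cell of this flavor updates the state with two input- and
    state-dependent gates applied to squashed copies of the previous state and
    the projected input; all names and shapes below are hypothetical:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gated_cell_step(x_t, h_prev, params):
        # Two gates modulate the contracted previous state and the input.
        # The tanh squashing keeps the recurrent map non-expansive, which
        # is the kind of property that rules out chaotic dynamics.
        W_theta, U_theta, W_eta, U_eta, W = params
        theta = sigmoid(W_theta @ x_t + U_theta @ h_prev)  # forget-style gate
        eta = sigmoid(W_eta @ x_t + U_eta @ h_prev)        # input gate
        return theta * np.tanh(h_prev) + eta * np.tanh(W @ x_t)

    rng = np.random.default_rng(0)
    d_in, d_h = 4, 8
    params = (rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h)),
              rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h)),
              rng.normal(size=(d_h, d_in)))
    h = np.zeros(d_h)
    for x_t in rng.normal(size=(5, d_in)):  # a length-5 input sequence
        h = gated_cell_step(x_t, h, params)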

    Transfer Learning based Dynamic Multiobjective Optimization Algorithms

    Min Jiang, Zhongqiang Huang, Liming Qiu, Wenzhen Huang, Gary G. Yen
    Subjects: Neural and Evolutionary Computing (cs.NE)

    One of the major distinguishing features of dynamic multiobjective
    optimization problems (DMOPs) is that the optimization objectives change over
    time, so tracking the varying Pareto-optimal front becomes a challenge. One
    promising solution is to reuse past “experiences” to construct a prediction
    model via statistical machine learning approaches. However, most existing
    methods ignore the non-independent and identically distributed nature of the
    data used to construct the prediction model. In this paper, we propose an
    algorithmic framework, called Tr-DMOEA, which integrates transfer learning
    and population-based evolutionary algorithms for solving DMOPs. This approach
    uses transfer learning as a tool to reuse past experience and speed up the
    evolutionary process, and at the same time, any population-based
    multiobjective algorithm can benefit from this integration without extensive
    modification. To verify this, we incorporate the proposed approach into the
    development of three well-known algorithms, nondominated sorting genetic
    algorithm II (NSGA-II), multiobjective particle swarm optimization (MOPSO),
    and the regularity model-based multiobjective estimation of distribution
    algorithm (RM-MEDA), and then employ twelve benchmark functions to test these
    algorithms and compare them with chosen state-of-the-art designs. The
    experimental results confirm the effectiveness of the proposed method in
    exploiting machine learning technology.

    Learning Features by Watching Objects Move

    Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    This paper presents a novel yet intuitive approach to unsupervised feature
    learning. Inspired by the human visual system, we explore whether low-level
    motion-based grouping cues can be used to learn an effective visual
    representation. Specifically, we use unsupervised motion-based segmentation on
    videos to obtain segments, which we use as ‘pseudo ground truth’ to train a
    convolutional network to segment objects from a single frame. Given the
    extensive evidence that motion plays a key role in the development of the human
    visual system, we hope that this straightforward approach to unsupervised
    learning will be more effective than cleverly designed ‘pretext’ tasks studied
    in the literature. Indeed, our extensive experiments show that this is the
    case. When used for transfer learning on object detection, our representation
    significantly outperforms previous unsupervised approaches across multiple
    settings, especially when training data for the target task is scarce.
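
    To make the training recipe concrete, here is a minimal PyTorch sketch of
    the pseudo-ground-truth step: a toy ConvNet (a stand-in for the paper's much
    larger network) is trained to predict motion-derived masks from single
    frames; the tensors and the tiny architecture are hypothetical.

    import torch
    import torch.nn as nn

    # Tiny stand-in network that emits a per-pixel object/background logit.
    net = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 3, padding=1),
    )
    opt = torch.optim.SGD(net.parameters(), lr=0.01)
    loss_fn = nn.BCEWithLogitsLoss()

    frames = torch.randn(8, 3, 64, 64)  # single frames (hypothetical data)
    # Binary masks produced by unsupervised motion segmentation of the videos.
    pseudo_masks = (torch.rand(8, 1, 64, 64) > 0.5).float()

    logits = net(frames)
    loss = loss_fn(logits, pseudo_masks)
    opt.zero_grad()
    loss.backward()
    opt.step()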

    EgoTransfer: Transferring Motion Across Egocentric and Exocentric Domains using Deep Neural Networks

    Shervin Ardeshir, Krishna Regmi, Ali Borji
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Mirror neurons have been observed in the primary motor cortex of primate
    species, in particular in humans and monkeys. A mirror neuron fires when a
    person performs a certain action, and also when they observe the same action
    being performed by another person. A crucial step towards building fully
    autonomous intelligent systems with human-like learning abilities is the
    capability of modeling the mirror neuron. On one hand, the abundance of
    egocentric cameras in the past few years has offered the opportunity to study
    many vision problems from the first-person perspective. A great deal of
    interesting research has been done during the past few years exploring
    various computer vision tasks from the perspective of the self. On the other
    hand, videos recorded by traditional static cameras capture humans performing
    different actions from an exocentric third-person perspective. In this work,
    we take the first step towards relating motion information across these two
    perspectives. We train models that predict motion in an egocentric view by
    observing it from an exocentric view, and vice versa. This allows models to
    predict what an egocentric motion would look like from the outside. To do so,
    we train linear and nonlinear models and evaluate their performance in terms
    of retrieving the egocentric (exocentric) motion features while having access
    to an exocentric (egocentric) motion feature. Our experimental results
    demonstrate that motion information can be successfully transferred across
    the two views.

    Quantum Reinforcement Learning

    Daniel Crawford, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, Pooya Ronagh
    Comments: 16 pages
    Subjects: Quantum Physics (quant-ph); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)

    We investigate whether quantum annealers with select chip layouts can
    outperform classical computers in reinforcement learning tasks. We associate a
    transverse field Ising spin Hamiltonian with a layout of qubits similar to that
    of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to
    numerically simulate quantum sampling from this system. We design a
    reinforcement learning algorithm in which the visible nodes representing the
    states and actions of an optimal policy are the first and last layers of the
    deep network. In the absence of a transverse field, our simulations show that
    DBMs train more effectively than restricted Boltzmann machines (RBMs) with
    the same number of weights. Since sampling from the Boltzmann distribution of
    a DBM is not classically feasible, this is evidence of the supremacy of a
    non-Turing sampling oracle. We then develop a framework for training the
    network as a
    quantum Boltzmann machine (QBM) in the presence of a significant transverse
    field for reinforcement learning. The latter method further improves the
    reinforcement learning method using DBMs.
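
    One way to read the setup is through classical free-energy-based
    reinforcement learning, where the negative free energy of a Boltzmann
    machine approximates the Q-function. A minimal classical sketch for a
    restricted Boltzmann machine (the shapes and the state/action encoding are
    hypothetical, and this is not the paper's quantum procedure):

    import numpy as np

    def rbm_free_energy(v, a, b, W):
        # F(v) = -a.v - sum_j log(1 + exp(b_j + W_j . v)) for a binary RBM;
        # in free-energy-based RL, Q(s, a) is approximated by -F(v) where v
        # encodes the (state, action) pair.
        return -a @ v - np.sum(np.logaddexp(0.0, b + W @ v))

    rng = np.random.default_rng(0)
    n_visible, n_hidden = 12, 8          # hypothetical sizes
    a = rng.normal(size=n_visible)
    b = rng.normal(size=n_hidden)
    W = rng.normal(size=(n_hidden, n_visible))
    v = rng.integers(0, 2, size=n_visible).astype(float)
    q_value = -rbm_free_energy(v, a, b, W)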


    Computer Vision and Pattern Recognition

    Asynchronous Temporal Fields for Action Recognition

    Gunnar A. Sigurdsson, Santosh Divvala, Ali Farhadi, Abhinav Gupta
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Actions are more than just movements and trajectories: we cook to eat and we
    hold a cup to drink from it. A thorough understanding of videos requires going
    beyond appearance modeling and necessitates reasoning about the sequence of
    activities, as well as the higher-level constructs such as intentions. But how
    do we model and reason about these? We propose a fully-connected temporal CRF
    model for reasoning over various aspects of activities that includes objects,
    actions, and intentions, where the potentials are predicted by a deep network.
    End-to-end training of such structured models is a challenging endeavor: For
    inference and learning we need to construct mini-batches consisting of whole
    videos, leading to mini-batches with only a few videos. This causes high
    correlation between data points, leading to a breakdown of the backprop
    algorithm. To address this challenge, we present an asynchronous variational
    inference method that allows efficient end-to-end training. Our method achieves
    a classification mAP of 21.9% on the Charades benchmark, outperforming the
    state-of-the-art (17.2% mAP), and offers equal gains on the task of temporal
    localization.

    Learning Features by Watching Objects Move

    Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    This paper presents a novel yet intuitive approach to unsupervised feature
    learning. Inspired by the human visual system, we explore whether low-level
    motion-based grouping cues can be used to learn an effective visual
    representation. Specifically, we use unsupervised motion-based segmentation on
    videos to obtain segments, which we use as ‘pseudo ground truth’ to train a
    convolutional network to segment objects from a single frame. Given the
    extensive evidence that motion plays a key role in the development of the human
    visual system, we hope that this straightforward approach to unsupervised
    learning will be more effective than cleverly designed ‘pretext’ tasks studied
    in the literature. Indeed, our extensive experiments show that this is the
    case. When used for transfer learning on object detection, our representation
    significantly outperforms previous unsupervised approaches across multiple
    settings, especially when training data for the target task is scarce.

    Dense Supervision for Visual Comparisons via Synthetic Images

    Aron Yu, Kristen Grauman
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Distinguishing subtle differences in attributes is valuable, yet learning to
    make visual comparisons remains non-trivial. Not only is the number of possible
    comparisons quadratic in the number of training images, but also access to
    images adequately spanning the space of fine-grained visual differences is
    limited. We propose to overcome this sparsity-of-supervision problem via
    synthetically generated images. Building on a state-of-the-art image generation
    engine, we sample pairs of training images exhibiting slight modifications of
    individual attributes. Augmenting real training image pairs with these
    examples, we then train attribute ranking models to predict the relative
    strength of an attribute in novel pairs of real images. Our results on datasets
    of faces and fashion images show the great promise of bootstrapping imperfect
    image generators to counteract sample sparsity for learning to rank.
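
    A common way to train such attribute rankers is a pairwise margin ranking
    loss over ordered image pairs; a minimal sketch (the feature dimension and
    linear ranker are hypothetical, not the authors' exact model):

    import torch
    import torch.nn as nn

    ranker = nn.Linear(128, 1)       # maps an image feature to attribute strength
    loss_fn = nn.MarginRankingLoss(margin=1.0)

    # Hypothetical features for pairs where the first image shows the attribute
    # more strongly than the second, as synthetic pairs would be labeled.
    feat_strong = torch.randn(32, 128)
    feat_weak = torch.randn(32, 128)
    target = torch.ones(32)          # +1 means "first should rank higher"

    s1 = ranker(feat_strong).squeeze(1)
    s2 = ranker(feat_weak).squeeze(1)
    loss = loss_fn(s1, s2, target)   # hinge on the score difference
    loss.backward()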

    Image Retrieval with Deep Local Features and Attention-based Keypoints

    Hyeonwoo Noh, Andre Araujo, Jack Sim, Bohyung Han
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce a local feature descriptor for large-scale image retrieval
    applications, called DELF (DEep Local Feature). The new feature is based on
    convolutional neural networks, which are trained without object- and
    patch-level annotations on a landmark image dataset. To enhance DELF’s image
    retrieval performance, we also propose an attention mechanism for keypoint
    selection, which shares most network layers with the descriptor. This new
    framework can be used in image retrieval as a drop-in replacement for other
    keypoint detectors and descriptors, enabling more accurate feature matching and
    geometric verification. Our technique is particularly useful for the
    large-scale setting, where a system must operate with high precision. In this
    case, our system produces reliable confidence scores to reject false positives
    effectively—in particular, our system is robust against queries that have no
    correct match in the database. We present an evaluation methodology for this
    challenging retrieval setting, using standard and large-scale datasets. We show
    that recently proposed methods do not perform well in this setup; DELF
    outperforms several recent global and local descriptors by substantial margins.

    Photo-Quality Evaluation based on Computational Aesthetics: Review of Feature Extraction Techniques

    Dimitris Spathis
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Researchers try to model the aesthetic quality of photographs with low- and
    high-level features, drawing inspiration from art theory, psychology and
    marketing. We attempt to describe every feature extraction measure employed in
    the above process. The contribution of this literature review is the taxonomy
    of each feature by its implementation complexity, considering real-world
    applications and integration in mobile apps and digital cameras. Also, we
    discuss the machine learning results along with some unexplored research areas
    as future work.

    An extended Perona-Malik model based on probabilistic models

    Lars M. Mescheder, Dirk A. Lorenz
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA); Machine Learning (stat.ML)

    The Perona-Malik model has been very successful at restoring images from
    noisy input. In this paper, we reinterpret the Perona-Malik model in the
    language of Gaussian scale mixtures and derive some extensions of the model.
    Specifically, we show that the expectation-maximization (EM) algorithm applied
    to Gaussian scale mixtures leads to the lagged-diffusivity algorithm for
    computing stationary points of the Perona-Malik diffusion equations. Moreover,
    we show how mean field approximations to these Gaussian scale mixtures lead to
    a modification of the lagged-diffusivity algorithm that better captures the
    uncertainties in the restoration. Since this modification can be hard to
    compute in practice we propose relaxations to the mean field objective to make
    the algorithm computationally feasible. Our numerical experiments show that
    this modified lagged-diffusivity algorithm often performs better at restoring
    textured areas and fuzzy edges than the unmodified algorithm. As a second
    application of the Gaussian scale mixture framework, we show how an efficient
    sampling procedure can be obtained for the probabilistic model, making the
    computation of the conditional mean and other expectations algorithmically
    feasible. Again, the resulting algorithm has a strong resemblance to the
    lagged-diffusivity algorithm. Finally, we show that a probabilistic version of
    the Mumford-Shah segmentation model can be obtained in the same framework with
    a discrete edge prior.
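
    For reference, the classic Perona-Malik diffusion that the paper
    reinterprets fits in a few lines of NumPy; this is the standard explicit
    scheme with the rational edge-stopping function, not the authors' modified
    algorithm (the parameter values are illustrative):

    import numpy as np

    def perona_malik(img, n_iter=20, kappa=0.1, lam=0.2):
        # g(s) = 1 / (1 + (s/kappa)^2) suppresses diffusion across edges;
        # lagged diffusivity would freeze g at the current iterate when
        # solving for the next one.
        u = img.astype(float).copy()
        g = lambda d: 1.0 / (1.0 + (d / kappa) ** 2)
        for _ in range(n_iter):
            # One-sided differences to the four neighbors.
            dn = np.roll(u, -1, axis=0) - u
            ds = np.roll(u, 1, axis=0) - u
            de = np.roll(u, -1, axis=1) - u
            dw = np.roll(u, 1, axis=1) - u
            u += lam * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
        return u

    denoised = perona_malik(np.random.default_rng(0).normal(size=(64, 64)))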

    Crowd collectiveness measure via graph-based node clique learning

    Weiya Ren
    Comments: 23 pages. 10 figures. 5 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)

    Collective motions of crowd systems have attracted a great deal of
    attention in recent years. In this paper, we measure the collectiveness of a
    crowd system with the proposed node clique learning method. The proposed
    method is graph based and investigates the influence from one node to other
    nodes. A node is represented by a set of nodes, called a clique, which is
    obtained by spreading information from this node to other nodes in the
    graph. Only nodes that receive sufficient information are selected into the
    clique of this node. The motion coherence between two nodes is defined by
    comparing node cliques. The collectiveness of a node and the collectiveness
    of the crowd system are defined by node coherence. The self-driven particle
    (SDP) model and a crowd motion database are used to test the ability of the
    proposed method to measure collectiveness.

    Few-Shot Object Recognition from Machine-Labeled Web Images

    Zhongwen Xu, Linchao Zhu, Yi Yang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    With the tremendous advances of Convolutional Neural Networks (ConvNets) on
    object recognition, we can now easily obtain reliable machine-labeled
    annotations from the predictions of off-the-shelf ConvNets. In this work, we
    present an abstraction-memory-based framework for few-shot learning, building
    upon machine-labeled image annotations. Our method takes large-scale
    machine-annotated datasets (e.g., OpenImages) as an external memory bank. In
    the external memory bank, information is stored in memory slots in key-value
    form, where an image feature is regarded as the key and a label embedding
    serves as the value. When queried by few-shot examples, our model selects
    visually similar data from the external memory bank and writes the useful
    information obtained from related external data into another memory bank,
    i.e., the abstraction memory. Long Short-Term Memory (LSTM) controllers and
    attention mechanisms are utilized to guarantee that the data written to the
    abstraction memory is correlated with the query example. The abstraction
    memory concentrates information from the external memory bank, making
    few-shot recognition effective. In the experiments, we first confirm that our
    model can learn to conduct few-shot object recognition on clean human-labeled
    data from the ImageNet dataset. Then, we demonstrate that with our model,
    machine-labeled image annotations are a very effective and abundant resource
    for performing object recognition on novel categories. Experimental results
    show that our proposed model with machine-labeled annotations achieves strong
    performance, with only a 1% gap from the one with human-labeled annotations.
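
    The key-value read described above is essentially soft attention over
    memory slots; a minimal NumPy sketch (slot counts, dimensions, and data are
    hypothetical):

    import numpy as np

    def memory_read(query, keys, values, temperature=1.0):
        # Cosine-similarity attention over keys, returning a convex
        # combination of the stored label embeddings (values).
        q = query / np.linalg.norm(query)
        k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
        scores = k @ q / temperature
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ values

    rng = np.random.default_rng(0)
    keys = rng.normal(size=(1000, 256))    # stored image features
    values = rng.normal(size=(1000, 64))   # paired label embeddings
    query = rng.normal(size=256)           # a few-shot example's feature
    label_embedding = memory_read(query, keys, values)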

    Active and Continuous Exploration with Deep Neural Networks and Expected Model Output Changes

    Christoph Käding, Erik Rodner, Alexander Freytag, Joachim Denzler
    Comments: accepted contribution at NIPS 2016 Workshop on Continual Learning and Deep Networks
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The demands on visual recognition systems do not end with the complexity
    offered by current large-scale image datasets, such as ImageNet. In
    consequence, we need curious and continuously learning algorithms that actively
    acquire knowledge about semantic concepts which are present in available
    unlabeled data. As a step towards this goal, we show how to perform continuous
    active learning and exploration, where an algorithm actively selects relevant
    batches of unlabeled examples for annotation. These examples could either
    belong to already known or to yet undiscovered classes. Our algorithm is based
    on a new generalization of the Expected Model Output Change principle for deep
    architectures and is especially tailored to deep neural networks. Furthermore,
    we show easy-to-implement approximations that yield efficient techniques for
    active selection. Empirical experiments show that our method outperforms
    currently used heuristics.

    Cross-Modal Manifold Learning for Cross-modal Retrieval

    Sailesh Conjeti, Anees Kazi, Nassir Navab, Amin Katouzian
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a new scalable algorithm for cross-modal similarity
    preserving retrieval in a learnt manifold space. Unlike existing approaches
    that compromise between preserving global and local geometries, the proposed
    technique respects both simultaneously during manifold alignment. The global
    topologies are maintained by recovering underlying mapping functions in the
    joint manifold space by deploying partially corresponding instances. The
    inter- and intra-modality affinity matrices are then computed to reinforce the
    original data skeleton using a perturbed minimum spanning tree (pMST) and to
    maximize the affinity among similar cross-modal instances, respectively. The
    performance of the proposed algorithm is evaluated on two multimodal image
    datasets (coronary atherosclerosis histology and brain MRI) for two
    applications: classification, and regression. Our exhaustive validations and
    results demonstrate the superiority of our technique over comparative methods
    and its feasibility for improving computer-assisted diagnosis systems, where
    disease-specific complementary information shall be aggregated and interpreted
    across modalities to form the final decision.

    X-ray In-Depth Decomposition: Can Deep Learning Reveal The Latent Structures?

    Shadi Albarqouni, Javad Fotouhi, Nassir Navab
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    X-ray radiography is the most readily available imaging modality and has a
    broad range of applications that spans from diagnosis to intra-operative
    guidance in cardiac, orthopedics, and trauma procedures. Proper interpretation
    of the hidden and obscured anatomy in X-ray images remains a challenge and
    often requires high radiation dose and imaging from several perspectives.

    In this work, we aim at decomposing the conventional X-ray image into d X-ray
    components of independent, non-overlapped, clipped sub-volumes using a deep
    learning approach. Despite the challenging aspects of modelling such a highly
    ill-posed problem, exciting and encouraging results are obtained, paving the
    way for further contributions in this direction.

    On Random Weights for Texture Generation in One Layer Neural Networks

    Mihir Mongia, Kundan Kumar, Akram Erraqabi, Yoshua Bengio
    Comments: ICASSP 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Recent work in the literature has shown experimentally that one can use the
    lower layers of a trained convolutional neural network (CNN) to model natural
    textures. More interestingly, it has also been experimentally shown that only
    one layer with random filters can also model textures although with less
    variability. In this paper we ask why one-layer CNNs with random filters are
    so effective at generating textures. We theoretically show that one-layer
    convolutional architectures (without a non-linearity), paired with an energy
    function used in previous literature, can in fact preserve and modulate
    frequency coefficients in a manner such that random weights and pretrained
    weights will generate the same type of images. Based on the results of this
    analysis we question whether similar properties hold in the case where one
    uses one convolution layer with a non-linearity. We show that in the case of
    the ReLU non-linearity there are situations where only one input will give
    the minimum possible energy, whereas in the case of no non-linearity, there
    are always infinitely many solutions that will give the minimum possible
    energy. Thus we can show that in certain situations adding a ReLU
    non-linearity generates less variable images.
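
    The energy function referred to is the Gram-matrix texture energy from
    prior texture synthesis work, E(x) = ||G(x) - G(texture)||^2 with
    G_ij = <f_i(x), f_j(x)>; a sketch of computing it with random filters
    (filter count and sizes are hypothetical):

    import numpy as np
    from scipy.signal import convolve2d

    def gram_matrix(image, filters):
        # Stack each filter response as a row, then take inner products.
        responses = [convolve2d(image, f, mode='valid').ravel()
                     for f in filters]
        F = np.stack(responses)
        return F @ F.T

    rng = np.random.default_rng(0)
    texture = rng.normal(size=(64, 64))
    random_filters = [rng.normal(size=(5, 5)) for _ in range(8)]  # untrained
    G_target = gram_matrix(texture, random_filters)
    # Synthesis would minimize ||gram_matrix(x, random_filters) - G_target||^2
    # over the image x.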

    Dual Deep Network for Visual Tracking

    Zhizhen Chi, Hongyang Li, Huchuan Lu, Ming-Hsuan Yang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Visual tracking addresses the problem of identifying and localizing an
    unknown target in a video given the target specified by a bounding box in the
    first frame. In this paper, we propose a dual network to better utilize
    features among layers for visual tracking. It is observed that features in
    higher layers encode semantic context while their counterparts in lower layers
    are sensitive to discriminative appearance. Thus we exploit the hierarchical
    features in different layers of a deep model and design a dual structure to
    obtain better feature representation from various streams, which is rarely
    investigated in previous work. To highlight geometric contours of the target,
    we integrate the hierarchical feature maps with an edge detector as the coarse
    prior maps to further embed local details around the target. To leverage the
    robustness of our dual network, we train it with random patches measuring the
    similarities between the network activation and target appearance, which serves
    as a regularization that forces the dual network to focus on the target object. The
    proposed dual network is updated online in a unique manner based on the
    observation that the target being tracked in consecutive frames should share
    more similar feature representations than those in the surrounding background.
    It is also found that for a target object, the prior maps can help further
    enhance performance by passing message into the output maps of the dual
    network. Therefore, an independent component analysis with reference algorithm
    (ICA-R) is employed to extract target context using prior maps as guidance.
    Online tracking is conducted by maximizing the posterior estimate on the final
    maps with stochastic and periodic update. Quantitative and qualitative
    evaluations on two large-scale benchmark data sets show that the proposed
    algorithm performs favourably against state-of-the-art methods.

    Parsing Images of Overlapping Organisms with Deep Singling-Out Networks

    Victor Yurchenko, Victor Lempitsky
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This work is motivated by the mostly unsolved task of parsing biological
    images with multiple overlapping articulated model organisms (such as worms or
    larvae). We present a general approach that separates the two main challenges
    associated with such data, individual object shape estimation and object groups
    disentangling. At the core of the approach is a deep feed-forward singling-out
    network (SON) that is trained to map each local patch to a vectorial descriptor
    that is sensitive to the characteristics (e.g. shape) of a central object,
    while being invariant to the variability of all other surrounding elements.
    Given a SON, a local image patch can be matched to a gallery of isolated
    elements using their SON-descriptors, thus producing a hypothesis about the
    shape of the central element in that patch. The image-level optimization based
    on integer programming can then pick a subset of the hypotheses to explain
    (parse) the whole image and disentangle groups of organisms.

    While sharing many similarities with existing “analysis-by-synthesis”
    approaches, our method avoids the need for stochastic search in the
    high-dimensional configuration space and numerous rendering operations at
    test-time. We show that our approach can parse microscopy images of three
    popular model organisms (C. elegans roundworms, Drosophila larvae, and
    E. coli bacteria) even under significant crowding and overlap between
    organisms. We speculate that the overall approach is applicable to a wider
    class of image parsing problems concerned with crowded articulated objects, for
    which rendering training images is possible.

    Adversarial Deep Structural Networks for Mammographic Mass Segmentation

    Wentao Zhu, Xiaohui Xie
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Mass segmentation is an important task in mammogram analysis, providing
    effective morphological features and regions of interest (ROI) for mass
    detection and classification. Inspired by the success of using deep
    convolutional features for natural image analysis and conditional random fields
    (CRF) for structural learning, we propose an end-to-end network for
    mammographic mass segmentation. The network employs a fully convolutional
    network (FCN) to model the potential function, followed by a CRF to perform
    structural learning. Because the mass distribution varies greatly with pixel
    position, the FCN is combined with a position prior for the task. Due to the
    small size of mammogram datasets, we use adversarial training to control
    over-fitting. Four models with different convolutional kernels are further
    fused to improve the segmentation results. Experimental results on two public
    datasets, INbreast and DDSM-BCRP, show that our end-to-end network combined
    with adversarial training achieves state-of-the-art results.

    Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

    Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Mammogram classification is directly related to computer-aided diagnosis of
    breast cancer. Traditional methods require great effort to annotate the
    training data through costly manual labeling, as well as specialized
    computational models to detect these annotations at test time. Inspired by
    the success of using deep convolutional features for natural image analysis
    and multi-instance learning for labeling a set of instances/patches, we
    propose end-to-end trained deep multi-instance networks for mass
    classification based on the whole mammogram, without the aforementioned
    costly need to annotate the training data. We
    explore three different schemes to construct deep multi-instance networks for
    whole mammogram classification. Experimental results on the INbreast dataset
    demonstrate the robustness of the proposed deep networks compared to previous
    work using segmentation and detection annotations during training.
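
    A minimal sketch of one multi-instance scheme consistent with this setup:
    treat a mammogram as a bag of patches, max-pool patch scores into a bag
    score, and train on the image-level label only. The scorer and shapes below
    are hypothetical stand-ins, not the paper's exact networks.

    import torch
    import torch.nn as nn

    patch_scorer = nn.Sequential(nn.Linear(256, 1))  # toy per-patch scorer

    patches = torch.randn(1, 50, 256)    # one mammogram, 50 patch features
    bag_label = torch.tensor([1.0])      # image-level label: mass present

    patch_logits = patch_scorer(patches).squeeze(-1)   # (1, 50)
    bag_logit = patch_logits.max(dim=1).values         # max pool over instances
    loss = nn.functional.binary_cross_entropy_with_logits(bag_logit, bag_label)
    loss.backward()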

    Learning a No-Reference Quality Metric for Single-Image Super-Resolution

    Chao Ma, Chih-Yuan Yang, Xiaokang Yang, Ming-Hsuan Yang
    Comments: Accepted by Computer Vision and Image Understanding
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Numerous single-image super-resolution algorithms have been proposed in the
    literature, but few studies address the problem of performance evaluation based
    on visual perception. While most super-resolution images are evaluated by
    full-reference metrics, their effectiveness is not clear and the required
    ground-truth images are not always available in practice. To address these
    problems, we conduct human subject studies using a large set of
    super-resolution images and propose a no-reference metric learned from visual
    perceptual scores. Specifically, we design three types of low-level statistical
    features in both spatial and frequency domains to quantify super-resolved
    artifacts, and learn a two-stage regression model to predict the quality scores
    of super-resolution images without referring to ground-truth images. Extensive
    experimental results show that the proposed metric is effective and efficient
    to assess the quality of super-resolution images based on human perception.

    Deep Learning on Lie Groups for Skeleton-based Action Recognition

    Zhiwu Huang, Chengde Wan, Thomas Probst, Luc Van Gool
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In recent years, skeleton-based action recognition has become a popular 3D
    classification problem. State-of-the-art methods typically first represent each
    motion sequence as a high-dimensional trajectory on a Lie group with an
    additional dynamic time warping, and then shallowly learn favorable Lie group
    features. In this paper we incorporate the Lie group structure into a deep
    network architecture to learn more appropriate Lie group features for 3D action
    recognition. Within the network structure, we design rotation mapping layers to
    transform the input Lie group features into desirable ones, which are aligned
    better in the temporal domain. To reduce the high feature dimensionality, the
    architecture is equipped with rotation pooling layers for the elements on the
    Lie group. Furthermore, we propose a logarithm mapping layer to map the
    resulting manifold data into a tangent space that facilitates the application
    of regular output layers for the final classification. Evaluations of the
    proposed network on standard 3D human action recognition datasets clearly
    demonstrate its superiority over existing shallow Lie group feature learning
    methods as well as most conventional deep learning methods.
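
    The logarithm mapping layer relies on the standard SO(3) log map; a NumPy
    sketch of the underlying formula (valid away from rotation angles near pi):

    import numpy as np

    def so3_log(R, eps=1e-8):
        # log(R) = theta / (2 sin theta) * (R - R^T), a skew-symmetric matrix
        # in the tangent space, where ordinary Euclidean layers can operate.
        cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
        theta = np.arccos(cos_theta)
        if theta < eps:
            return 0.5 * (R - R.T)  # first-order approximation near identity
        return theta / (2.0 * np.sin(theta)) * (R - R.T)

    # Example: rotation by 0.3 rad about the z-axis.
    c, s = np.cos(0.3), np.sin(0.3)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    log_R = so3_log(R)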

    3D Shape Induction from 2D Views of Multiple Objects

    Matheus Gadelha, Subhransu Maji, Rui Wang
    Comments: Submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we investigate the problem of inducing a distribution over
    three-dimensional structures given two-dimensional views of multiple objects
    taken from unknown viewpoints. Our approach, called “projective generative
    adversarial networks” (PrGANs), trains a deep generative model of 3D shapes
    whose projections match the distributions of the input 2D views. The addition
    of a projection module allows us to infer the underlying 3D shape distribution
    without using any 3D data, viewpoint information, or annotations during the learning
    phase. We show that our approach produces 3D shapes of comparable quality to
    GANs trained on 3D data for a number of shape categories including chairs,
    airplanes, and cars. Experiments also show that the disentangled representation
    of 2D shapes into geometry and viewpoint leads to a good generative model of 2D
    shapes. The key advantage is that our model allows us to infer 3D shape and
    viewpoint, and to generate novel views, from an input image in a completely
    unsupervised manner.
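
    A non-differentiable toy version of the projection module conveys the idea:
    collapse a voxel occupancy grid into a 2D silhouette along the viewing axis
    (PrGANs uses a differentiable variant inside the generator; the grid below
    is hypothetical):

    import numpy as np

    def project_voxels(volume, axis=2):
        # Max along the viewing axis: a cell is "visible" in the silhouette
        # if any voxel along the ray is occupied.
        return volume.max(axis=axis)

    vol = np.zeros((32, 32, 32))
    vol[8:24, 10:22, 12:20] = 1.0   # a solid box
    view = project_voxels(vol)      # (32, 32) silhouette of the box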

    EgoTransfer: Transferring Motion Across Egocentric and Exocentric Domains using Deep Neural Networks

    Shervin Ardeshir, Krishna Regmi, Ali Borji
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Mirror neurons have been observed in the primary motor cortex of primate
    species, in particular in humans and monkeys. A mirror neuron fires when a
    person performs a certain action, and also when they observe the same action
    being performed by another person. A crucial step towards building fully
    autonomous intelligent systems with human-like learning abilities is the
    capability of modeling the mirror neuron. On one hand, the abundance of
    egocentric cameras in the past few years has offered the opportunity to study
    many vision problems from the first-person perspective. A great deal of
    interesting research has been done during the past few years exploring
    various computer vision tasks from the perspective of the self. On the other
    hand, videos recorded by traditional static cameras capture humans performing
    different actions from an exocentric third-person perspective. In this work,
    we take the first step towards relating motion information across these two
    perspectives. We train models that predict motion in an egocentric view by
    observing it from an exocentric view, and vice versa. This allows models to
    predict what an egocentric motion would look like from the outside. To do so,
    we train linear and nonlinear models and evaluate their performance in terms
    of retrieving the egocentric (exocentric) motion features while having access
    to an exocentric (egocentric) motion feature. Our experimental results
    demonstrate that motion information can be successfully transferred across
    the two views.

    Learning to predict where to look in interactive environments using deep recurrent q-learning

    Sajad Mousavi, Ali Borji, Nasser Mozayani
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Bottom-Up (BU) saliency models do not perform well in complex interactive
    environments where humans are actively engaged in tasks (e.g., sandwich making
    and playing the video games). In this paper, we leverage Reinforcement Learning
    (RL) to highlight task-relevant locations of input frames. We propose a soft
    attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL
    agent how to play a game and where to look by focusing on the most pertinent
    parts of its visual input. Our evaluations on several Atari 2600 games show
    that the soft attention based model predicts fixation locations
    significantly better than bottom-up models such as the Itti-Koch saliency and
    Graph-Based Visual Saliency (GBVS) models.

    Microscopic Muscle Image Enhancement

    Xiangfei Kong, Lin Yang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a robust image enhancement algorithm dedicated for muscle fiber
    specimen images captured by optical microscopes. Blur and out-of-focus
    problems are prevalent in muscle images during the image acquisition stage.
    Traditional image deconvolution methods do not work here, since they assume
    the blur kernels are known, and they also produce ring artifacts. We provide
    a compact framework involving a novel spatially non-uniform blind deblurring
    approach, specialized to muscle images, which automatically detects and
    alleviates degraded regions. Ring artifact problems are addressed, and a
    kernel propagation strategy is proposed to speed up the algorithm and deal
    with the high non-uniformity of the blur
    kernels on muscle images. Experiments show that the proposed framework performs
    well on muscle images taken with modern advanced optical microscopes. Our
    framework is free of laborious parameter settings and is computationally
    efficient.

    A Fusion Method Based on Decision Reliability Ratio for Finger Vein Verification

    Liao Ni, Yi Zhang, He Zheng, Shilei Liu, Houjun Huang, Wenxin Li
    Comments: 5 pages, 6 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Finger vein verification has developed considerably since it was first
    proposed, but no single algorithm is perfect. It has been shown that
    algorithms with the same overall accuracy may misclassify different patterns,
    and this complementarity can be exploited by fusing individual algorithms for
    more precise results. We observe that an algorithm has different confidence
    in different decisions, which is seldom considered in fusion methods. Our
    work is the first to define a decision reliability ratio to quantify this
    confidence, and we then propose the Maximum Decision Reliability Ratio (MDRR)
    fusion method incorporating weighted voting. An experiment conducted on a
    dataset of 1000 fingers with 5 images per finger demonstrates the
    effectiveness of the method. The classifier obtained by the MDRR method
    achieves an accuracy of 99.42%, while the maximum accuracy of the original
    individual classifiers is 97.77%. The experimental results also show that
    MDRR outperforms traditional fusion methods such as Voting, Weighted Voting,
    Sum and Weighted Sum.
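
    As a hedged illustration only: the sketch below fuses matchers by taking
    the decision of the classifier whose score lies farthest from its own
    threshold. This distance ratio is a hypothetical stand-in for the paper's
    decision reliability ratio, whose exact definition is given in the paper.

    import numpy as np

    def mdrr_fuse(scores, thresholds):
        # Reliability proxy: relative distance of each matcher's score from
        # its decision threshold; the most reliable matcher decides.
        scores = np.asarray(scores, dtype=float)
        thresholds = np.asarray(thresholds, dtype=float)
        reliability = np.abs(scores - thresholds) / thresholds
        best = np.argmax(reliability)
        return int(scores[best] >= thresholds[best])  # 1 genuine, 0 impostor

    # Three individual matchers with different thresholds (all hypothetical).
    decision = mdrr_fuse(scores=[0.81, 0.40, 0.55], thresholds=[0.6, 0.5, 0.5])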

    Handwritten Signature Verification Using Hand-Worn Devices

    Ben Nassi, Alona Levy, Yuval Elovici, Erez Shmueli
    Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)

    Online signature verification technologies, such as those available in banks
    and post offices, rely on dedicated digital devices such as tablets or smart
    pens to capture, analyze and verify signatures. In this paper, we suggest a
    novel method for online signature verification that relies on the increasingly
    available hand-worn devices, such as smartwatches or fitness trackers, instead
    of dedicated ad-hoc devices. Our method uses a set of known genuine and forged
    signatures, recorded using the motion sensors of a hand-worn device, to train a
    machine learning classifier. Then, given the recording of an unknown signature
    and a claimed identity, the classifier can determine whether the signature is
    genuine or forged. To validate our method, we applied it to 1980
    recordings of genuine and forged signatures that we collected from 66 subjects
    in our institution. Using our method, we were able to successfully distinguish
    between genuine and forged signatures with a high degree of accuracy (0.98 AUC
    and 0.05 EER).

    Efficient Global Spatial-Angular Sparse Coding for Diffusion MRI with Separable Dictionaries

    Evan Schwab, René Vidal, Nicolas Charon
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)

    Diffusion MRI (dMRI) can reconstruct neuronal fibers in the brain, in vivo,
    by measuring water diffusion along angular gradient directions in q-space. High
    angular resolution diffusion imaging (HARDI) can produce better estimates of
    fiber orientation than the popularly used diffusion tensor imaging, but the
    high number of samples needed to estimate diffusivity requires lengthy patient
    scan times. To accelerate dMRI, compressed sensing (CS) has been utilized by
    exploiting a sparse representation of the data, discovered through sparse
    coding. The sparser the representation, the fewer samples are needed to
    reconstruct a high resolution signal with limited information loss, and so a
    focus of much dMRI research has been finding the sparsest possible dictionary
    representation. All prior methods, however, rely on an angular model of
    q-space signals in each voxel, which fundamentally limits the global sparsity
    level since at least one dictionary atom is needed for each voxel. In
    contrast, we
    formulate a global spatial-angular representation of dMRI that will allow us to
    sparsely model an entire dMRI brain signal below the limit of one atom per
    voxel using joint spatial-angular sparse coding. But a main challenge is
    optimizing over large-scale dMRI data. In this work, we present extensions to a
    number of sparse coding algorithms that are better suited for large-scale
    problems by exploiting the separable Kronecker structure of our global
    spatial-angular dictionary. We compare the complexity and speed of our methods
    with prior Kronecker sparse coding algorithms and show promising sparsity
    results on phantom and real HARDI brain data for various dictionary choices.
    Our method efficiently achieves significantly sparser HARDI representations
    than the state-of-the-art, which has the potential to achieve new levels of
    HARDI acceleration within a unified (k,q)-CS framework.
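
    The computational payoff of a separable dictionary comes from the standard
    Kronecker identity, which avoids ever forming the full matrix; a quick
    NumPy demonstration (the dictionary sizes are hypothetical):

    import numpy as np

    # (A kron B) vec(X) = vec(B X A^T), with column-major vec. Applying the
    # big Kronecker operator reduces to two small matrix products.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(6, 4))   # e.g. a spatial dictionary
    B = rng.normal(size=(5, 3))   # e.g. an angular dictionary
    X = rng.normal(size=(3, 4))   # coefficient matrix

    slow = np.kron(A, B) @ X.flatten(order='F')   # forms the 30 x 12 matrix
    fast = (B @ X @ A.T).flatten(order='F')       # never forms it
    assert np.allclose(slow, fast)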


    Artificial Intelligence

    Context and Interference Effects in the Combinations of Natural Concepts

    Diederik Aerts, Jonito Aerts Arguëlles, Lester Beltran, Lyneth Beltran, Massimiliano Sassoli de Bianchi, Sandro Sozzo, Tomas Veloz
    Comments: 12 pages, no figures
    Subjects: Artificial Intelligence (cs.AI); Quantum Physics (quant-ph)

    The mathematical formalism of quantum theory exhibits significant
    effectiveness when applied to cognitive phenomena that have resisted
    traditional (set theoretical) modeling. Relying on a decade of research on the
    operational foundations of micro-physical and conceptual entities, we present a
    theoretical framework for the representation of concepts and their conjunctions
    and disjunctions that uses the quantum formalism. This framework provides a
    unified solution to the ‘conceptual combinations problem’ of cognitive
    psychology, explaining the observed deviations from classical (Boolean, fuzzy
    set and Kolmogorovian) structures in terms of genuine quantum effects. In
    particular, natural concepts ‘interfere’ when they combine to form more complex
    conceptual entities, and they also exhibit a ‘quantum-type context-dependence’,
    which is responsible for the ‘over- and under-extension’ effects that are
    systematically observed in experiments on membership judgments.

    A Hidden Absorbing Semi-Markov Model for Informatively Censored Temporal Data: Learning and Inference

    Ahmed M. Alaa, Mihaela van der Schaar
    Subjects: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Modeling continuous-time physiological processes that manifest a patient’s
    evolving clinical states is a key step in approaching many problems in
    healthcare. In this paper, we develop the Hidden Absorbing Semi-Markov Model
    (HASMM): a versatile probabilistic model that is capable of capturing the
    modern electronic health record (EHR) data. Unlike existing models, an HASMM
    accommodates irregularly sampled, temporally correlated, and informatively
    censored physiological data, and can describe non-stationary clinical state
    transitions. Learning an HASMM from the EHR data is achieved via a novel
    forward-filtering backward-sampling Monte-Carlo EM algorithm that exploits the
    knowledge of the end-point clinical outcomes (informative censoring) in the EHR
    data, and implements the E-step by sequentially sampling the patients’ clinical
    states in the reverse-time direction while conditioning on the future states.
    Real-time inferences are drawn via a forward-filtering algorithm that operates
    on a virtually constructed discrete-time embedded Markov chain that mirrors the
    patient’s continuous-time state trajectory. We demonstrate the diagnostic and
    prognostic utility of the HASMM in a critical care prognosis setting using a
    real-world dataset for patients admitted to the Ronald Reagan UCLA Medical
    Center.

    Sample-efficient Deep Reinforcement Learning for Dialog Control

    Kavosh Asadi, Jason D. Williams
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Representing a dialog policy as a recurrent neural network (RNN) is
    attractive because it handles partial observability, infers a latent
    representation of state, and can be optimized with supervised learning (SL) or
    reinforcement learning (RL). For RL, a policy gradient approach is natural, but
    is sample inefficient. In this paper, we present 3 methods for reducing the
    number of dialogs required to optimize an RNN-based dialog policy with RL. The
    key idea is to maintain a second RNN which predicts the value of the current
    policy, and to apply experience replay to both networks. On two tasks, these
    methods reduce the number of dialogs/episodes required by about a third, vs.
    standard policy gradient methods.

    Web-based Semantic Similarity for Emotion Recognition in Web Objects

    Valentina Franzoni, Giulio Biondi, Alfredo Milani, Yuanxi Li
    Comments: Authors preprint, including revision differences with respect to the main publication ‘Web-based Similarity for Emotion Recognition in Web Objects’ published in the UCC ’16 workshop in IEEE UCC, December 06 – 09, 2016, Shanghai, China. DOI: this http URL
    Journal-ref: In Proc 9th International Conference on Utility and Cloud
    Computing (UCC 2016). ACM, New York, NY, USA, 327-332
    Subjects: Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)

    In this project we propose a new approach for emotion recognition using
    web-based similarity (e.g. confidence, PMI and PMING). We aim to extract basic
    emotions from short sentences with emotional content (e.g. news titles, tweets,
    captions), performing a web-based quantitative evaluation of semantic proximity
    between each word of the analyzed sentence and each emotion of a psychological
    model (e.g. Plutchik, Ekman, Lovheim). The phases of the extraction include:
    text preprocessing (tokenization, stop words, filtering), search engine
    automated query, HTML parsing of results (i.e. scraping), estimation of
    semantic proximity, ranking of emotions according to proximity measures. The
    main idea is that, since semantic similarity can be generalized under the
    assumption that similar concepts co-occur in documents indexed in search
    engines, emotions can also be generalized in the same way, through tags or
    terms that express them in a particular language. Training results are
    compared to human evaluation, and additional comparative tests are then
    performed, both for the global ranking correlation (e.g. Kendall, Spearman,
    Pearson) and for the evaluation of the emotion linked to each single word.
    Different from sentiment analysis, our approach works at a
    deeper level of abstraction, aiming at recognizing specific emotions and not
    only the positive/negative sentiment, in order to predict emotions as semantic
    data.
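
    For instance, the PMI between a word and an emotion tag can be estimated
    directly from search-engine hit counts; a sketch with made-up counts:

    import math

    def pmi(hits_w, hits_e, hits_we, n_docs):
        # PMI(w, e) = log( p(w, e) / (p(w) p(e)) ), with probabilities
        # estimated as hit counts over a (hypothetical) total corpus size.
        p_w = hits_w / n_docs
        p_e = hits_e / n_docs
        p_we = hits_we / n_docs
        return math.log(p_we / (p_w * p_e))

    # Word "smile" vs. emotion tag "joy" (all counts are made-up examples).
    score = pmi(hits_w=5_000_000, hits_e=2_000_000, hits_we=400_000,
                n_docs=1_000_000_000)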

    Optimal Target Assignment and Path Finding for Teams of Agents

    Hang Ma, Sven Koenig
    Comments: In AAMAS 2016
    Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA); Robotics (cs.RO)

    We study the TAPF (combined target-assignment and path-finding) problem for
    teams of agents in known terrain, which generalizes both the anonymous and
    non-anonymous multi-agent path-finding problems. Each of the teams is given the
    same number of targets as there are agents in the team. Each agent has to move
    to exactly one target given to its team such that all targets are visited. The
    TAPF problem is to first assign agents to targets and then plan collision-free
    paths for the agents to their targets in a way such that the makespan is
    minimized. We present the CBM (Conflict-Based Min-Cost-Flow) algorithm, a
    hierarchical algorithm that solves TAPF instances optimally by combining ideas
    from anonymous and non-anonymous multi-agent path-finding algorithms. On the
    low level, CBM uses a min-cost max-flow algorithm on a time-expanded network to
    assign all agents in a single team to targets and plan their paths. On the high
    level, CBM uses conflict-based search to resolve collisions among agents in
    different teams. Theoretically, we prove that CBM is correct, complete and
    optimal. Experimentally, we show the scalability of CBM to TAPF instances with
    dozens of teams and hundreds of agents and adapt it to a simulated warehouse
    system.
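
    As a simplified illustration of the low level only (CBM itself runs
    min-cost max-flow on a time-expanded network and minimizes makespan rather
    than total cost), a plain minimum-cost assignment of one team's agents to
    its targets can be written with SciPy; the positions are hypothetical:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    agents = np.array([[0, 0], [4, 0], [0, 4]])    # grid positions of one team
    targets = np.array([[1, 1], [5, 1], [1, 5]])
    # Straight-line distances as a stand-in for path costs.
    cost = np.linalg.norm(agents[:, None, :] - targets[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)       # agent i -> target cols[i]
    makespan_proxy = cost[rows, cols].max()        # proxy for the makespan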

    Learning Features by Watching Objects Move

    Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    This paper presents a novel yet intuitive approach to unsupervised feature
    learning. Inspired by the human visual system, we explore whether low-level
    motion-based grouping cues can be used to learn an effective visual
    representation. Specifically, we use unsupervised motion-based segmentation on
    videos to obtain segments, which we use as ‘pseudo ground truth’ to train a
    convolutional network to segment objects from a single frame. Given the
    extensive evidence that motion plays a key role in the development of the human
    visual system, we hope that this straightforward approach to unsupervised
    learning will be more effective than cleverly designed ‘pretext’ tasks studied
    in the literature. Indeed, our extensive experiments show that this is the
    case. When used for transfer learning on object detection, our representation
    significantly outperforms previous unsupervised approaches across multiple
    settings, especially when training data for the target task is scarce.

    Learning Human-Understandable Strategies

    Sam Ganzfried, Farzana Yusuf
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Learning (cs.LG); Multiagent Systems (cs.MA)

    Algorithms for equilibrium computation generally make no attempt to ensure
    that the computed strategies are understandable by humans. For instance, the
    strategies for the strongest poker agents are represented as massive binary
    files. In many situations, we would like to compute strategies that can
    actually be implemented by humans, who may have computational limitations and
    may only be able to remember a small number of features or components of the
    strategies that have been computed. We study poker games where private
    information distributions can be arbitrary. We create a large training set of
    game instances and solutions, by randomly selecting the private information
    probabilities, and present algorithms that learn from the training instances in
    order to perform well in games with unseen information distributions. One
    approach first clusters the training points into a small number of clusters and
    then creates a small decision tree based on the cluster centers. This approach
    produces low test error and could be easily implemented by humans since it only
    requires memorizing a small number of “if-then” rules.
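
    A minimal sketch of that pipeline with scikit-learn, on synthetic stand-in
    data (the real training set comes from solved game instances):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(500, 3))          # private-information probabilities
    y = (X.sum(axis=1) > 1.5).astype(int)   # stand-in "bet / check" labels

    kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X)
    centers = kmeans.cluster_centers_
    center_labels = []
    for c in range(8):                      # majority label per cluster
        center_labels.append(np.bincount(y[kmeans.labels_ == c]).argmax())

    tree = DecisionTreeClassifier(max_depth=3).fit(centers, center_labels)
    # A depth-3 tree is a handful of "if-then" rules a human can memorize.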

    Improving Tweet Representations using Temporal and User Context

    Ganesh J, Manish Gupta, Vasudeva Varma
    Comments: To be presented at European Conference on Information Retrieval (ECIR) 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    In this work we propose a novel representation learning model which computes
    semantic representations for tweets accurately. Our model systematically
    exploits the chronologically adjacent tweets (‘context’) from users’ Twitter
    timelines for this task. Further, we make our model user-aware so that it can
    do well in modeling the target tweet by exploiting the rich knowledge about the
    user such as the way the user writes the post and also summarizing the topics
    on which the user writes. We empirically demonstrate that the proposed models
    outperform the state-of-the-art models in predicting the user profile
    attributes like spouse, education and job by 19.66%, 2.27% and 2.22%
    respectively.

    Reducing Redundant Computations with Flexible Attention

    Raphael Shu, Hideki Nakayama
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Recently, the attention mechanism has played a key role in achieving high
    performance in Neural Machine Translation models. It applies a score function
    to the encoder states to obtain alignment weights. However, as this
    computation is done for all positions in each decoding step, the attention
    mechanism greatly increases the computational complexity. In this paper we
    propose a novel attention model which can reduce redundant attentional
    computations in a flexible manner. The proposed mechanism tracks the center
    of attention in each decoding step and computes position-based penalties. At
    test time, the computations of the score function for heavily penalized
    positions are skipped. In our experiments, we found that the computations in
    the attention model can be reduced by 54% on average with almost no loss of
    accuracy.
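
    A rough sketch of the idea: track a center of attention and evaluate the
    score function only inside a window around it, which amounts to skipping
    positions with an effectively infinite penalty (window size, dot-product
    score function, and data are all hypothetical):

    import numpy as np

    def windowed_attention(query, enc_states, center, width=8):
        lo = max(0, center - width)
        hi = min(len(enc_states), center + width + 1)
        scores = enc_states[lo:hi] @ query         # score only inside window
        w = np.exp(scores - scores.max())
        w /= w.sum()
        context = w @ enc_states[lo:hi]
        new_center = lo + int(np.argmax(w))        # track the attention center
        return context, new_center

    rng = np.random.default_rng(0)
    enc_states = rng.normal(size=(100, 32))        # hypothetical encoder states
    query = rng.normal(size=32)
    context, center = windowed_attention(query, enc_states, center=50)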

    Self-Correcting Models for Model-Based Reinforcement Learning

    Erik Talvitie
    Comments: To appear in Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017. This version incorporates the appendix into document (rather than as supplementary material)
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    When an agent cannot represent a perfectly accurate model of its
    environment’s dynamics, model-based reinforcement learning (MBRL) can fail
    catastrophically. Planning involves composing the predictions of the model;
    when flawed predictions are composed, even minor errors can compound and render
    the model useless for planning. Hallucinated Replay (Talvitie 2014) trains the
    model to “correct” itself when it produces errors, substantially improving MBRL
    with flawed models. This paper theoretically analyzes this approach,
    illuminates settings in which it is likely to be effective or ineffective, and
    presents a novel error bound, showing that a model’s ability to self-correct is
    more tightly related to MBRL performance than one-step prediction error. These
    results inspire an MBRL algorithm for deterministic MDPs with performance
    guarantees that are robust to model class limitations.

    An Integrated Optimization + Learning Approach to Optimal Dynamic Pricing for the Retailer with Multi-type Customers in Smart Grids

    Fanlin Meng, Xiao-Jun Zeng, Yan Zhang, Chris J. Dent, Dunwei Gong
    Comments: 9 pages, 5 figures
    Subjects: Systems and Control (cs.SY); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Optimization and Control (math.OC)

    In this paper, we consider a realistic scenario in the context of smart grids
    where an electricity retailer serves three different types of customers, i.e.,
    customers with home energy management system embedded in their smart meters
    (C-HEMS), customers with only smart meters (C-SM), and customers without smart
    meters (C-NONE). The main objective of this paper is to support the retailer to
    make optimal day-ahead dynamic pricing decisions subject to realistic market
    constraints in such a mixed customer pool. To this end, we first propose an
    optimal energy management model for each C-HEMS, two appliance-level customer
    behaviour learning models for each C-SM, and an aggregated demand forecasting
    model for the C-NONE population as a whole. With these demand models capturing
    the energy consumption patterns of the different customer types, we then adopt
    a genetic algorithm (GA) based solution framework to obtain the optimal dynamic prices
    for the retailer. In addition to smart meters and HEMS, the penetration of
    Photovoltaic (PV) generation and energy storage in the households is further
    considered in the pricing optimization model. Simulation results indicate that
    the proposed pricing optimization algorithms are effective.

    A Comment on Argumentation

    Karl Schlechta (LIF)
    Subjects: Logic (math.LO); Artificial Intelligence (cs.AI)

    We use the theory of defaults and their meaning from [GS16] to develop (the
    outline of) a new theory of argumentation.

    Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation

    Mirko Polato, Fabio Aiolli
    Comments: Under revision for Neurocomputing (Elsevier Journal)
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The increasing availability of implicit feedback datasets has raised the
    interest in developing effective collaborative filtering techniques able to
    deal asymmetrically with unambiguous positive feedback and ambiguous negative
    feedback. In this paper, we propose a principled kernel-based collaborative
    filtering method for top-N item recommendation with implicit feedback. We
    present an efficient implementation using the linear kernel, and we show how to
    generalize it to kernels of the dot product family preserving the efficiency.
    We also investigate the elements that influence the sparsity of a standard
    cosine kernel. This analysis shows that the sparsity of the kernel strongly
    depends on the properties of the dataset, in particular on the long-tail
    distribution. We compare our method with state-of-the-art algorithms,
    achieving good results in terms of both efficiency and effectiveness.

    Quantum Reinforcement Learning

    Daniel Crawford, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, Pooya Ronagh
    Comments: 16 pages
    Subjects: Quantum Physics (quant-ph); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)

    We investigate whether quantum annealers with select chip layouts can
    outperform classical computers in reinforcement learning tasks. We associate a
    transverse field Ising spin Hamiltonian with a layout of qubits similar to that
    of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to
    numerically simulate quantum sampling from this system. We design a
    reinforcement learning algorithm in which the set of visible nodes representing
    the states and actions of an optimal policy are the first and last layers of
    the deep network. In the absence of a transverse field, our simulations show
    that DBMs train more effectively than restricted Boltzmann machines (RBMs)
    with the same number of weights. Since sampling from Boltzmann distributions
    of a DBM is not classically feasible, this is evidence of the supremacy of a
    non-Turing sampling oracle. We then develop a framework for training the network as a
    quantum Boltzmann machine (QBM) in the presence of a significant transverse
    field for reinforcement learning. The latter method further improves the
    reinforcement learning method using DBMs.

    A User Simulator for Task-Completion Dialogues

    Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen
    Comments: 14 pages
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Despite widespread interest in reinforcement learning for task-oriented
    dialogue systems, several obstacles can frustrate research and development
    progress. First, reinforcement learners typically require interaction with the
    environment, so conventional dialogue corpora cannot be used directly. Second,
    each task presents specific challenges, requiring a separate corpus of
    task-specific annotated data. Third, collecting and annotating human-machine or
    human-human conversations for task-oriented dialogues requires extensive domain
    knowledge. Because building an appropriate dataset can be both financially
    costly and time-consuming, one popular approach is to build a user simulator
    based upon a corpus of example dialogues. Then, one can train reinforcement
    learning agents in an online fashion as they interact with the simulator.
    Dialogue agents trained on these simulators can serve as an effective starting
    point. Once agents master the simulator, they may be deployed in a real
    environment to interact with humans, and continue to be trained online. To ease
    empirical algorithmic comparisons in dialogues, this paper introduces a new,
    publicly available simulation framework, where our simulator, designed for the
    movie-booking domain, leverages both rules and collected data. The simulator
    supports two tasks: movie ticket booking and movie seeking. Finally, we
    demonstrate several agents and detail the procedure to add and test your own
    agent in the proposed framework.
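
    A minimal sketch of the kind of rule-based user simulator described above.
    The slot names, dialogue acts and reward values are illustrative
    assumptions, not the released framework's API.

        class RuleUserSimulator:
            """Agenda-style simulated user for a slot-filling dialogue."""
            def __init__(self, goal):
                self.goal = dict(goal)          # e.g. {"movie": "Inception", "date": "tomorrow"}
                self.pending = []

            def reset(self):
                self.pending = list(self.goal)  # slots still to be communicated
                slot = self.pending.pop(0)
                return ("inform", slot, self.goal[slot])

            def step(self, agent_action):
                """React to the agent's (act, slot) pair; return (user_action, reward, done)."""
                act, slot = agent_action
                if act == "request" and slot in self.pending:
                    self.pending.remove(slot)
                    return ("inform", slot, self.goal[slot]), 0.0, False
                if act == "book":               # episode ends; success iff all slots were given
                    reward = 1.0 if not self.pending else -1.0
                    return ("close", None, None), reward, True
                if self.pending:                # otherwise volunteer the next constraint
                    slot = self.pending.pop(0)
                    return ("inform", slot, self.goal[slot]), 0.0, False
                return ("confirm", None, None), 0.0, False

        # an RL agent interacts through reset()/step(), e.g.:
        # sim = RuleUserSimulator({"movie": "Inception", "date": "tomorrow"})
        # user_action = sim.reset(); obs, reward, done = sim.step(("request", "date"))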


    Information Retrieval

    Data-Driven Relevance Judgments for Ranking Evaluation

    Nuno Moniz, Luís Torgo, João Vinagre
    Comments: 12 pages, 4 figures
    Subjects: Information Retrieval (cs.IR)

    Ranking evaluation metrics are a fundamental element of design and
    improvement efforts in information retrieval. We observe that most popular
    metrics disregard information portrayed in the scores used to derive rankings,
    when available. This may pose a numerical scaling problem, causing an under- or
    over-estimation of the evaluation depending on the degree of divergence between
    the scores of ranked items. The purpose of this work is to propose a principled
    way of quantifying multi-graded relevance judgments of items and enable a more
    accurate penalization of ordering errors in rankings. We propose a data-driven
    generation of relevance functions based on the degree of the divergence amongst
    a set of items’ scores and its application in the evaluation metric Normalized
    Discounted Cumulative Gain (nDCG). We use synthetic data to demonstrate the
    value of our proposal, and a combination of data on news items from Google
    News and their respective popularity on Twitter to show its performance in
    comparison to the standard nDCG. Results show that our proposal is capable of
    providing a more fine-grained evaluation of rankings when compared to the
    standard nDCG, and that the latter frequently under- or over-estimates its
    evaluation scores in light of the divergence of items’ scores.
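
    An illustrative sketch of nDCG with score-derived graded relevance. The
    min-max grading function below is one possible choice, an assumption rather
    than the paper's exact relevance function.

        import numpy as np

        def scores_to_relevance(scores, grades=4):
            # Map raw item scores to integer grades 0..grades, so near-tied
            # scores receive near-equal relevance.
            s = np.asarray(scores, dtype=float)
            scaled = (s - s.min()) / (s.max() - s.min() + 1e-12)
            return np.round(scaled * grades)

        def ndcg(ranked_relevance, k=None):
            rel = np.asarray(ranked_relevance, dtype=float)[:k]
            discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
            dcg = float(((2 ** rel - 1) * discounts).sum())
            ideal = np.sort(rel)[::-1]
            idcg = float(((2 ** ideal - 1) * discounts).sum())
            return dcg / idcg if idcg > 0 else 0.0

        # true item scores, and a system ranking that swaps the near-tied pair
        scores = {"a": 0.91, "b": 0.90, "c": 0.15}
        rel = dict(zip(scores, scores_to_relevance(list(scores.values()))))
        print(ndcg([rel[i] for i in ["b", "a", "c"]]))  # swap of a/b barely penalized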

    Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation

    Mirko Polato, Fabio Aiolli
    Comments: Under revision for Neurocomputing (Elsevier Journal)
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The increasing availability of implicit feedback datasets has raised the
    interest in developing effective collaborative filtering techniques able to
    deal asymmetrically with unambiguous positive feedback and ambiguous negative
    feedback. In this paper, we propose a principled kernel-based collaborative
    filtering method for top-N item recommendation with implicit feedback. We
    present an efficient implementation using the linear kernel, and we show how to
    generalize it to kernels of the dot product family preserving the efficiency.
    We also investigate the elements that influence the sparsity of a standard
    cosine kernel. This analysis shows that the sparsity of the kernel strongly
    depends on the properties of the dataset, in particular on the long-tail
    distribution. We compare our method with state-of-the-art algorithms,
    achieving good results in terms of both efficiency and effectiveness.

    iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling

    Gerhard Gossen, Elena Demidova, Thomas Risse
    Comments: Published in the Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries 2015
    Journal-ref: Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital
    Libraries (pp. 75–84) (2015)
    Subjects: Digital Libraries (cs.DL); Information Retrieval (cs.IR)

    Researchers in the Digital Humanities and journalists need to monitor,
    collect and analyze fresh online content regarding current events such as the
    Ebola outbreak or the Ukraine crisis on demand. However, existing focused
    crawling approaches only consider topical aspects while ignoring temporal
    aspects and therefore cannot achieve thematically coherent and fresh Web
    collections. Social Media in particular provide a rich source of fresh
    content, which is not used by state-of-the-art focused crawlers. In this paper we
    address the issues of enabling the collection of fresh and relevant Web and
    Social Web content for a topic of interest through seamless integration of Web
    and Social Media in a novel integrated focused crawler. The crawler collects
    Web and Social Media content in a single system and exploits the stream of
    fresh Social Media content for guiding the crawler.

    A Scalable Document-based Architecture for Text Analysis

    Ciprian-Octavian Truică, Jérôme Darmont (ERIC), Julien Velcin (ERIC)
    Journal-ref: 12th International Conference on Advanced Data Mining and
    Applications (ADMA 2016), Dec 2016, Gold Coast, Australia. Springer, 10086,
    pp.481-494, 2016, Lecture Notes in Artificial Intelligence
    Subjects: Databases (cs.DB); Information Retrieval (cs.IR)

    Analyzing textual data is a very challenging task because of the huge volume
    of data generated daily. Fundamental issues in text analysis include the lack
    of structure in document datasets, the need for various preprocessing steps
    (e.g., stem or lemma extraction, part-of-speech tagging, named entity
    recognition…), and performance and scaling issues. Existing text analysis
    architectures partly solve these issues, but they provide restrictive data
    schemas, address only one aspect of text preprocessing, and focus on a single
    task when dealing with performance optimization. As a result, no definitive
    solution is currently available. Thus, we propose in this paper a new generic
    text analysis architecture, where document structure is flexible, many
    preprocessing techniques are integrated and textual datasets are indexed for
    efficient access. We implement our conceptual architecture using both a
    relational and a document-oriented database. Our experiments demonstrate the
    feasibility of our approach and the superiority of the document-oriented
    logical and physical implementation.

    The iCrawl Wizard — Supporting Interactive Focused Crawl Specification

    Gerhard Gossen, Elena Demidova, Thomas Risse
    Comments: Published in the Proceedings of the European Conference on Information Retrieval (ECIR) 2015
    Subjects: Digital Libraries (cs.DL); Information Retrieval (cs.IR)

    Collections of Web documents about specific topics are needed for many areas
    of current research. Focused crawling enables the creation of such collections
    on demand. Current focused crawlers require the user to manually specify
    starting points for the crawl (seed URLs). These are also used to describe the
    expected topic of the collection. The choice of seed URLs influences the
    quality of the resulting collection and requires a lot of expertise. In this
    demonstration we present the iCrawl Wizard, a tool that assists users in
    defining focused crawls efficiently and semi-automatically. Our tool uses major
    search engines and Social Media APIs as well as information extraction
    techniques to find seed URLs and a semantic description of the crawl intent.
    Using the iCrawl Wizard even non-expert users can create semantic
    specifications for focused crawlers interactively and efficiently.


    Computation and Language

    Domain specialization: a post-training domain adaptation for Neural Machine Translation

    Christophe Servan, Josep Crego, Jean Senellart
    Comments: Submitted to EACL 2017 short paper
    Subjects: Computation and Language (cs.CL)

    Domain adaptation is a key feature in Machine Translation. It generally
    encompasses terminology, domain and style adaptation, especially for human
    post-editing workflows in Computer Assisted Translation (CAT). With Neural
    Machine Translation (NMT), we introduce a new notion of domain adaptation that
    we call “specialization”, which shows promising results in both learning speed
    and adaptation accuracy. In this paper, we propose to explore this approach
    from several perspectives.

    Domain Control for Neural Machine Translation

    Catherine Kobus, Josep Crego, Jean Senellart
    Comments: Submitted to EACL 2017 short paper
    Subjects: Computation and Language (cs.CL)

    Machine translation systems are very sensitive to the domains they were
    trained on. Several domain adaptation techniques have been deeply studied. We
    propose a new technique for neural machine translation (NMT) that we call
    domain control which is performed at runtime using a unique neural network
    covering multiple domains. The presented approach shows quality improvements
    over dedicated-domain models when translating on any of the covered domains,
    and even on out-of-domain data. In addition, model parameters do not need to
    be re-estimated for each domain, making the approach practical for real use cases.
    Evaluation is carried out on English-to-French translation for two different
    testing scenarios. We first consider the case where an end-user performs
    translations on a known domain. Secondly, we consider the scenario where the
    domain is not known and predicted at the sentence level before translating.
    Results show consistent accuracy improvements for both conditions.
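
    One simple way to realize runtime domain control, shown here as an
    illustrative sketch rather than the paper's exact mechanism, is to mark
    each source sentence with a reserved domain tag; the tag vocabulary below
    is an assumption.

        DOMAINS = {"medical", "legal", "it", "news"}

        def tag_source(tokens, domain):
            # Mark a tokenized source sentence with its domain before
            # training or decoding, so a single model covers all domains.
            assert domain in DOMAINS, f"unknown domain: {domain}"
            return ["<" + domain + ">"] + tokens

        print(tag_source("the patient shows symptoms".split(), "medical"))
        # When the domain is unknown, a sentence-level classifier can first
        # predict the tag, as in the second evaluation scenario above.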

    Neural Machine Translation from Simplified Translations

    Josep Crego, Jean Senellart
    Comments: Submitted to EACL 2017 Short paper
    Subjects: Computation and Language (cs.CL)

    Text simplification aims at reducing the lexical, grammatical and structural
    complexity of a text while keeping the same meaning. In the context of machine
    translation, we introduce the idea of simplified translations in order to boost
    the learning ability of deep neural translation models. We conduct preliminary
    experiments showing that translation complexity is actually reduced in a
    translation of a source bi-text compared to the target reference of the bi-text
    while using a neural machine translation (NMT) system learned on the exact same
    bi-text. Based on knowledge distillation idea, we then train an NMT system
    using the simplified bi-text, and show that it outperforms the initial system
    that was built over the reference data set. Performance is further boosted when
    both reference and automatic translations are used to learn the network. We
    perform an elementary analysis of the translated corpus and report accuracy
    results of the proposed approach on English-to-French and English-to-German
    translation tasks.

    Boosting Neural Machine Translation

    Dakun Zhang, Jungi Kim, Josep Crego, Jean Senellart
    Comments: Submitted to EACL 2017 short paper
    Subjects: Computation and Language (cs.CL)

    Training efficiency is one of the main problems for Neural Machine
    Translation (NMT). Deep networks, very large data and many training iterations
    are necessary to achieve state-of-the-art performance, which results in very
    high computation cost and slows down research and industrialization. In this
    paper, we first investigate the training instability introduced by
    randomization in NMT training, and then propose an efficient training method
    based on data boosting and bootstrapping that requires no modification to the
    neural network. Experiments show that this method converges much faster than
    a baseline system and achieves a stable improvement of up to 2.36 BLEU points
    at 80% of the training cost.

    Improving Tweet Representations using Temporal and User Context

    Ganesh J, Manish Gupta, Vasudeva Varma
    Comments: To be presented at European Conference on Information Retrieval (ECIR) 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    In this work we propose a novel representation learning model that computes
    accurate semantic representations for tweets. Our model systematically
    exploits the chronologically adjacent tweets (‘context’) from users’ Twitter
    timelines for this task. Further, we make our model user-aware so that it can
    better model the target tweet by exploiting rich knowledge about the user,
    such as the way the user writes posts and the topics the user writes about.
    We empirically demonstrate that the proposed models outperform
    state-of-the-art models in predicting the user profile attributes of spouse,
    education and job by 19.66%, 2.27% and 2.22%, respectively.

    Reducing Redundant Computations with Flexible Attention

    Raphael Shu, Hideki Nakayama
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Recently, the attention mechanism has played a key role in achieving high
    performance in Neural Machine Translation models. It applies a score function
    to the encoder states to obtain alignment weights. However, as this
    computation is performed over all positions in each decoding step, the
    attention mechanism greatly increases the computational complexity. In this
    paper we propose a novel attention model which can reduce redundant
    attentional computations in a flexible manner. The proposed mechanism tracks
    the center of attention in each decoding step and computes position-based
    penalties. At test time, the score function computations for heavily
    penalized positions are skipped. In our experiments, we found that the
    computations in the attention model can be reduced by 54% on average with
    almost no loss of accuracy.

    Neural Multi-Source Morphological Reinflection

    Katharina Kann, Ryan Cotterell, Hinrich Schütze
    Comments: EACL 2017
    Subjects: Computation and Language (cs.CL)

    We explore the task of multi-source morphological reinflection, which
    generalizes the standard, single-source version. The input consists of (i) a
    target tag and (ii) multiple pairs of source form and source tag for a lemma.
    The motivation is that it is beneficial to have access to more than one source
    form since different source forms can provide complementary information, e.g.,
    different stems. We further present a novel extension to the encoder-decoder
    recurrent neural architecture, consisting of multiple encoders, to better solve
    the task. We show that our new architecture outperforms single-source
    reinflection models and publish our dataset for multi-source morphological
    reinflection to facilitate future research.

    A recurrent neural network without chaos

    Thomas Laurent, James von Brecht
    Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL); Learning (cs.LG)

    We introduce an exceptionally simple gated recurrent neural network (RNN)
    that achieves performance comparable to well-known gated architectures, such as
    LSTMs and GRUs, on the word-level language modeling task. We prove that our
    model has simple, predictable and non-chaotic dynamics. This stands in stark
    contrast to more standard gated architectures, whose underlying dynamical
    systems exhibit chaotic behavior.

    A User Simulator for Task-Completion Dialogues

    Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen
    Comments: 14 pages
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Despite widespread interest in reinforcement learning for task-oriented
    dialogue systems, several obstacles can frustrate research and development
    progress. First, reinforcement learners typically require interaction with the
    environment, so conventional dialogue corpora cannot be used directly. Second,
    each task presents specific challenges, requiring a separate corpus of
    task-specific annotated data. Third, collecting and annotating human-machine or
    human-human conversations for task-oriented dialogues requires extensive domain
    knowledge. Because building an appropriate dataset can be both financially
    costly and time-consuming, one popular approach is to build a user simulator
    based upon a corpus of example dialogues. Then, one can train reinforcement
    learning agents in an online fashion as they interact with the simulator.
    Dialogue agents trained on these simulators can serve as an effective starting
    point. Once agents master the simulator, they may be deployed in a real
    environment to interact with humans, and continue to be trained online. To ease
    empirical algorithmic comparisons in dialogues, this paper introduces a new,
    publicly available simulation framework, where our simulator, designed for the
    movie-booking domain, leverages both rules and collected data. The simulator
    supports two tasks: movie ticket booking and movie seeking. Finally, we
    demonstrate several agents and detail the procedure to add and test your own
    agent in the proposed framework.


    Distributed, Parallel, and Cluster Computing

    Hybrid Transactional Replication: State-Machine and Deferred-Update Replication Combined

    Tadeusz Kobus, Maciej Kokociński, Paweł T. Wojciechowski
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We propose Hybrid Transactional Replication (HTR), a novel replication scheme
    for highly dependable services. It combines two schemes: a transaction is
    executed either optimistically by only one service replica in the deferred
    update mode (DU), or deterministically by all replicas in the state machine
    mode (SM); the choice is made by an oracle. The DU mode allows for parallelism
    and thus takes advantage of multicore hardware. In contrast to DU, the SM mode
    guarantees abort-free execution, so it is suitable for irrevocable operations
    and transactions generating high contention. For expressiveness, transactions
    can be discarded or retried on demand. We formally prove that the higher
    flexibility of the scheme does not come at the cost of weaker guarantees for
    clients: HTR satisfies strong consistency guarantees akin to those provided by
    other popular transactional replication schemes such as Deferred Update
    Replication. We developed HTR-enabled Paxos STM, an object-based distributed
    transactional memory system, and evaluated it thoroughly under various
    workloads. We show the benefits of using a novel oracle, which relies on
    machine learning techniques for automatic adaptation to changing conditions. In
    our tests, the ML-based oracle provides up to 50% improvement in throughput
    when compared to the system running with DU-only or SM-only oracles. Our
    approach is inspired by a well known algorithm used in the context of the
    multi-armed bandit problem.
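
    As an illustration of the bandit-style oracle idea, here is a simplified
    epsilon-greedy stand-in (not the paper's actual machine-learning oracle,
    which uses richer workload features), rewarded with observed throughput.

        import random

        class ModeOracle:
            # Two-armed epsilon-greedy chooser between DU and SM modes.
            def __init__(self, epsilon=0.1):
                self.epsilon = epsilon
                self.stats = {"DU": [0.0, 0], "SM": [0.0, 0]}  # total reward, count

            def choose(self):
                if random.random() < self.epsilon:
                    return random.choice(["DU", "SM"])         # explore
                return max(self.stats,
                           key=lambda m: self.stats[m][0] / (self.stats[m][1] or 1))

            def update(self, mode, throughput):
                self.stats[mode][0] += throughput
                self.stats[mode][1] += 1

        # oracle = ModeOracle(); mode = oracle.choose()
        # ... execute the transaction in that mode, measure throughput ...
        # oracle.update(mode, measured_throughput)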

    Performance Optimisation of Smoothed Particle Hydrodynamics Algorithms for Multi/Many-Core Architectures

    Fabio Baruffa, Luigi Iapichino, Nicolay J. Hammer, Vasileios Karakasis
    Comments: 18 pages, 5 figures, submitted
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Instrumentation and Methods for Astrophysics (astro-ph.IM); Computational Physics (physics.comp-ph)

    We describe a strategy for code modernisation of Gadget, a widely used
    community code for computational astrophysics. The focus of this work is on
    node-level performance optimisation, targeting current multi/many-core Intel
    architectures. We identify and isolate a sample code kernel, which is
    representative of a typical Smoothed Particle Hydrodynamics (SPH) algorithm.
    The code modifications include threading parallelism optimisation, change of
    the data layout into Structure of Arrays (SoA), auto-vectorisation and
    algorithmic improvements in the particle sorting. We measure lower execution
    time and improved threading scalability both on Intel Xeon (2.6× on Ivy
    Bridge) and Xeon Phi (13.7× on Knights Corner) systems. First tests on
    second generation Xeon Phi (Knights Landing) demonstrate the portability of the
    devised optimisation solutions to upcoming architectures.
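
    The AoS-to-SoA layout change mentioned above can be illustrated as follows,
    in Python/NumPy for brevity (the actual Gadget kernel is C); the field
    names and kernel are illustrative.

        import numpy as np

        n = 100_000
        # Array of Structures (AoS): particle fields interleaved in memory
        aos = np.zeros(n, dtype=[("x", "f8"), ("y", "f8"), ("z", "f8"), ("h", "f8")])
        # Structure of Arrays (SoA): one contiguous array per field
        soa = {f: np.zeros(n) for f in ("x", "y", "z", "h")}

        # A distance-like SPH kernel reads whole contiguous lanes under SoA,
        # which is unit-stride and SIMD-friendly:
        r2_soa = soa["x"] ** 2 + soa["y"] ** 2 + soa["z"] ** 2
        # Under AoS the same arithmetic implies strided access:
        r2_aos = aos["x"] ** 2 + aos["y"] ** 2 + aos["z"] ** 2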

    Distributed Data Processing Frameworks for Big Graph Data

    Afsin Akdogan, Hien To
    Comments: Survey paper that covers data processing frameworks for big graph data
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We now create so much data (2.5 quintillion bytes every day) that 90% of
    the data in the world today has been created in the last two years alone [1].
    This data comes from sensors used to gather traffic or climate information,
    posts to social media sites, photos, videos, emails, purchase transaction
    records, call logs of cellular networks, etc. This data is big data. In this
    report, we first briefly discuss which programming models are used for big
    data processing, and then focus on graph data, surveying the programming
    models/frameworks used to solve graph problems at very large scale. In section
    2, we introduce programming models which are not specifically designed to
    handle graph data, but which we include in this survey because we believe they
    are important frameworks and/or there have been studies customizing them for
    more efficient graph processing. In section 3, we discuss techniques that
    yield up to a 1340x speedup for certain graph problems when applied to Hadoop.
    In section 4, we discuss the vertex-based programming model, which is designed
    specifically to process large graphs, and the frameworks adopting it. In
    section 5, we implement two fundamental graph algorithms (PageRank and
    Weighted Bipartite Matching) and run them on a single node as the baseline
    approach, to see how fast they are for large datasets and whether it is worth
    partitioning them.
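
    A single-node PageRank baseline of the kind described might look like the
    following sketch (the toy graph and damping factor are illustrative).

        def pagerank(out_links, damping=0.85, iters=50):
            """Power-iteration PageRank over an adjacency list.
            Dangling-node mass is dropped for brevity."""
            nodes = list(out_links)
            rank = {u: 1.0 / len(nodes) for u in nodes}
            for _ in range(iters):
                new = {u: (1.0 - damping) / len(nodes) for u in nodes}
                for u, targets in out_links.items():
                    for v in targets:
                        new[v] += damping * rank[u] / len(targets)
                rank = new
            return rank

        print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))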

    Computability of Perpetual Exploration in Highly Dynamic Rings

    Marjorie Bournat (Regal), Swan Dubois (Regal), Franck Petit (Regal)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We consider systems made of autonomous mobile robots evolving in highly
    dynamic discrete environments, i.e., graphs where edges may appear and
    disappear unpredictably, without any recurrence, stability, or periodicity
    assumption.
    Robots are uniform (they execute the same algorithm), they are anonymous (they
    are devoid of any observable ID), they have no means allowing them to
    communicate together, they share no common sense of direction, and they have no
    global knowledge related to the size of the environment. However, each of them
    is endowed with persistent memory and is able to detect whether it stands alone
    at its current location. A highly dynamic environment is modeled by a graph
    such that its topology keeps continuously changing over time. In this paper, we
    consider only dynamic graphs in which nodes are anonymous, each of them is
    infinitely often reachable from any other one, and such that its underlying
    graph (i.e., the static graph made of the same set of nodes and that includes
    all edges that are present at least once over time) forms a ring of arbitrary
    size. In this context, we consider the fundamental problem of perpetual
    exploration: each node is required to be infinitely often visited by a robot.
    This paper analyzes the computability of this problem in (fully) synchronous
    settings, i.e., we study the deterministic solvability of the problem with
    respect to the number of robots. We provide three algorithms and two
    impossibility results that characterize, for any ring size, the necessary and
    sufficient number of robots to perform perpetual exploration of highly dynamic
    rings.

    A Relative Study of Task Scheduling Algorithms in Cloud Computing Environment

    Syed Arshad Ali, Mansaf Alam
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Cloud Computing is a paradigm of both parallel processing and distributed
    computing. It offers computing facilities as a utility service in a
    pay-per-use manner. Virtualization, self-service provisioning, elasticity and
    pay-per-use are the key features of Cloud Computing. It provides different
    types of resources over the Internet to perform user-submitted tasks. In a
    cloud environment, where a huge number of tasks are executed simultaneously,
    effective task scheduling is required to obtain good performance from the
    cloud system. Various cloud-based task scheduling algorithms are available
    that schedule users’ tasks onto resources for execution. Owing to the novelty
    of Cloud Computing, traditional scheduling algorithms cannot satisfy the
    needs of the cloud, so researchers are trying to modify traditional
    algorithms so that they fulfill cloud requirements such as rapid elasticity,
    resource pooling and on-demand self-service. In this paper the current state
    of task scheduling algorithms is discussed and compared on the basis of
    various scheduling parameters such as execution time, throughput, makespan,
    resource utilization, quality of service, energy consumption, response time
    and cost.

    Partitioning, Indexing and Querying Spatial Data on Cloud

    Afsin Akdogan
    Comments: PhD Dissertation – University of Southern California
    Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)

    The number of mobile devices (e.g., smartphones, wearable technologies) is
    rapidly growing. In line with this trend, a massive amount of spatial data is
    being collected since these devices allow users to geo-tag user-generated
    content. Clearly, a scalable computing infrastructure is needed to manage such
    large datasets. Meanwhile, Cloud Computing service providers (e.g., Amazon,
    Google, and Microsoft) allow users to lease computing resources. However, most
    of the existing spatial indexing techniques are designed for the centralized
    paradigm, which is limited to the capabilities of a single server. To address
    the scalability shortcomings of existing approaches, we provide a study
    focused on generating a distributed spatial index structure that not only
    scales out to multiple servers but also scales up, since it fully exploits
    the multi-core CPUs available on each server. We use the Voronoi diagram as
    the partitioning and indexing technique, and also use it to process spatial
    queries effectively. More specifically, since the data objects continuously
    move and issue position updates to the index structure, we collect the latest
    positions of objects and periodically generate a read-only index to eliminate
    costly distributed updates. Our approach scales near-linearly in index
    construction and query processing, and can efficiently construct an index for
    millions of objects within a few seconds. In addition to scalability and
    efficiency, we also aim to maximize server utilization, so that the same
    workload can be supported with fewer servers. Server utilization is a crucial
    consideration when using Cloud Computing, because users are charged based on
    the total amount of time they reserve each server, with no consideration of
    utilization.

    PAM: Parallel Augmented Maps

    Yihan Sun, Daniel Ferizovic, Guy E. Blelloch
    Subjects: Data Structures and Algorithms (cs.DS); Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)

    In this paper we introduce an interface for supporting ordered maps that are
    augmented to support quick “sums” of values over ranges of the keys. We have
    implemented this interface as part of a C++ library called PAM (Parallel and
    Persistent Augmented Map library). This library supports a wide variety of
    functions on maps ranging from basic insertion and deletion to more interesting
    functions such as union, intersection, difference, filtering, extracting
    ranges, splitting, and range-sums. The functions in the library are parallel,
    persistent (meaning that functions do not affect their inputs), and
    work-efficient. The underlying data structure is the augmented balanced binary
    search tree, which is a binary search tree in which each node is augmented with
    a value keeping the “sum” of its subtree with respect to some user supplied
    function. With this augmentation the library can be directly applied to many
    settings such as 2D range trees, interval trees, word index searching, and
    segment trees. The interface greatly simplifies the implementation of such data
    structures while it achieves efficiency that is significantly better than
    previous libraries.

    We tested our library and its corresponding applications. Experiments show
    that our implementation of set functions can get up to 50+ speedup on 72 cores.
    As for our range tree implementation, the sequential running time is more
    efficient than existing libraries such as CGAL, and can get up to 42+ speedup
    on 72 cores.
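
    The core augmented-map idea can be sketched as follows: an unbalanced,
    sequential toy with additive sums (PAM's actual contribution is the
    balanced, parallel, persistent version, omitted here).

        class Node:
            def __init__(self, key, value):
                self.key, self.value = key, value
                self.left = self.right = None
                self.aug = value                     # cached "sum" of this subtree

        def _aug(node):
            return node.aug if node else 0

        def insert(node, key, value):
            if node is None:
                return Node(key, value)
            if key < node.key:
                node.left = insert(node.left, key, value)
            else:
                node.right = insert(node.right, key, value)
            node.aug = node.value + _aug(node.left) + _aug(node.right)
            return node

        def prefix_sum(node, x):
            # Sum of values with key <= x in O(height), reusing cached sums.
            if node is None:
                return 0
            if node.key <= x:
                return node.value + _aug(node.left) + prefix_sum(node.right, x)
            return prefix_sum(node.left, x)

        def range_sum(node, lo, hi):
            return prefix_sum(node, hi) - prefix_sum(node, lo - 1)  # integer keys

        t = None
        for k, v in [(1, 10), (5, 20), (9, 30)]:
            t = insert(t, k, v)
        print(range_sum(t, 2, 9))   # 50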


    Learning

    Simple Black-Box Adversarial Perturbations for Deep Networks

    Nina Narodytska, Shiva Prasad Kasiviswanathan
    Comments: 19 Pages
    Subjects: Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)

    Deep neural networks are powerful and popular learning models that achieve
    state-of-the-art pattern recognition performance on many computer vision,
    speech, and language processing tasks. However, these networks have also been
    shown to be susceptible to carefully crafted adversarial perturbations which force
    misclassification of the inputs. Adversarial examples enable adversaries to
    subvert the expected system behavior leading to undesired consequences and
    could pose a security risk when these systems are deployed in the real world.

    In this work, we focus on deep convolutional neural networks and demonstrate
    that adversaries can easily craft adversarial examples even without any
    internal knowledge of the target network. Our attacks treat the network as an
    oracle (black-box) and only assume that the output of the network can be
    observed on the probed inputs. Our first attack is based on a simple idea of
    adding perturbation to a randomly selected single pixel or a small set of them.
    We then improve the effectiveness of this attack by carefully constructing a
    small set of pixels to perturb by using the idea of greedy local-search. Our
    proposed attacks also naturally extend to a stronger notion of
    misclassification. Our extensive experimental results illustrate that even
    these elementary attacks can reveal a deep neural network’s vulnerabilities.
    The simplicity and effectiveness of our proposed schemes mean that they could
    serve as a litmus test for designing robust networks.
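
    The simplest of the described attacks, perturbing randomly chosen single
    pixels, can be sketched as follows; the model interface and trial budget
    are illustrative assumptions.

        import numpy as np

        def random_pixel_attack(model, image, true_label, trials=200, eps=1.0):
            # `model` is a black box returning class probabilities; only its
            # outputs on probed inputs are observed.
            h, w, c = image.shape
            for _ in range(trials):
                adv = image.copy()
                i, j, k = np.random.randint(h), np.random.randint(w), np.random.randint(c)
                adv[i, j, k] = np.clip(adv[i, j, k] + np.random.choice([-eps, eps]), 0.0, 1.0)
                if model(adv).argmax() != true_label:
                    return adv                       # successful adversarial example
            return None                              # no misclassification within budget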

    Corralling a Band of Bandit Algorithms

    Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert E. Schapire
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We study the problem of combining multiple bandit algorithms (that is, online
    learning algorithms with partial feedback) with the goal of creating a master
    algorithm that performs almost as well as the best base algorithm if it were to
    be run on its own. The main challenge is that when run with a master, base
    algorithms unavoidably receive much less feedback, and it is thus critical
    that the master not starve a base algorithm that might perform
    uncompetitively initially but would eventually outperform others if given
    enough feedback. We address this difficulty by devising a version of Online
    Mirror Descent with a special mirror map together with a sophisticated
    learning rate scheme. We show that this approach manages to achieve a more
    delicate balance between exploiting and exploring base algorithms than
    previous works, yielding superior regret bounds.

    Our results are applicable to many settings, such as multi-armed bandits,
    contextual bandits, and convex bandits. As examples, we present two main
    applications. The first is to create an algorithm that enjoys worst-case
    robustness while at the same time performing much better when the environment
    is relatively easy. The second is to create an algorithm that works
    simultaneously under different assumptions of the environment, such as
    different priors or different loss structures.

    Training Ternary Neural Networks with Exact Proximal Operator

    Penghang Yin, Shuai Zhang, Jack Xin, Yingyong Qi
    Subjects: Learning (cs.LG)

    In this paper, we propose a stochastic proximal gradient method to train
    ternary weight neural networks (TNNs). The proposed method features weight
    ternarization via an exact formula for the proximal operator. Our experiments
    show that the trained TNNs are able to preserve state-of-the-art performance
    on the MNIST and CIFAR10 benchmark datasets.
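
    The abstract does not spell out the formula, but one exact proximal
    operator for ternarization, the Euclidean projection onto {-a, 0, +a}^n
    with jointly optimal scale a, has the classic closed form sketched here.
    This is an assumption for illustration, not necessarily the authors' exact
    operator.

        import numpy as np

        def ternarize(w):
            # Keep the j largest magnitudes where (sum of top-j |w_i|)^2 / j
            # is maximal; kept weights become +/- alpha, the rest become 0.
            mags = np.sort(np.abs(w))[::-1]
            csum = np.cumsum(mags)
            j = int(np.argmax(csum ** 2 / np.arange(1, len(w) + 1))) + 1
            alpha = csum[j - 1] / j                  # optimal common scale
            t = np.where(np.abs(w) >= mags[j - 1], np.sign(w), 0.0)
            return alpha * t, alpha

        print(ternarize(np.array([0.9, -0.05, 0.4, -0.8, 0.01])))
        # -> ([0.7, 0.0, 0.7, -0.7, 0.0], 0.7)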

    Self-Correcting Models for Model-Based Reinforcement Learning

    Erik Talvitie
    Comments: To appear in Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017. This version incorporates the appendix into document (rather than as supplementary material)
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    When an agent cannot represent a perfectly accurate model of its
    environment’s dynamics, model-based reinforcement learning (MBRL) can fail
    catastrophically. Planning involves composing the predictions of the model;
    when flawed predictions are composed, even minor errors can compound and render
    the model useless for planning. Hallucinated Replay (Talvitie 2014) trains the
    model to “correct” itself when it produces errors, substantially improving MBRL
    with flawed models. This paper theoretically analyzes this approach,
    illuminates settings in which it is likely to be effective or ineffective, and
    presents a novel error bound, showing that a model’s ability to self-correct is
    more tightly related to MBRL performance than one-step prediction error. These
    results inspire an MBRL algorithm for deterministic MDPs with performance
    guarantees that are robust to model class limitations.

    Inexact Proximal Gradient Methods for Non-convex and Non-smooth Optimization

    Bin Gu, Zhouyuan Huo, Heng Huang
    Subjects: Learning (cs.LG)

    Non-convex and non-smooth optimization plays an important role in machine
    learning. The proximal gradient method is one of the most important methods
    for solving such problems, where a proximal operator needs to be solved
    exactly at each step. However, in many problems the proximal operator does
    not have an analytic solution, or an exact solution is expensive to obtain.
    In this paper, we propose inexact proximal gradient methods (a basic inexact
    proximal gradient method (IPG) as well as a Nesterov-accelerated inexact
    proximal gradient method (AIPG)) for non-convex and non-smooth optimization,
    which tolerate an error in the calculation of the proximal operator.
    Theoretical analysis shows that IPG and AIPG have the same convergence rates
    as in the error-free case, provided that the errors decrease at appropriate
    rates.
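
    A minimal sketch of the IPG iteration; the shrinking error schedule and the
    lasso-style example (convex, for simplicity) are illustrative assumptions.

        import numpy as np

        def ipg(x0, grad_f, approx_prox_g, step, iters=200):
            x = x0
            for k in range(1, iters + 1):
                eps_k = 1.0 / k ** 2              # tolerance shrinking at a suitable rate
                y = x - step * grad_f(x)          # forward (gradient) step on the smooth part
                x = approx_prox_g(y, step, eps_k) # backward step solved only to accuracy eps_k
            return x

        # example: f(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1
        rng = np.random.default_rng(0)
        A, b, lam = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.1
        grad_f = lambda x: A.T @ (A @ x - b)

        def approx_prox_g(y, t, eps):
            exact = np.sign(y) * np.maximum(np.abs(y) - t * lam, 0.0)  # soft-thresholding
            return exact + rng.uniform(-eps, eps, size=y.shape)        # model of inexactness

        x = ipg(np.zeros(5), grad_f, approx_prox_g, step=0.01)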

    Building Diversified Multiple Trees for Classification in High Dimensional Noise Data

    Jiuyong Li, Lin Liu, Jixue Liu, Ryan Green
    Comments: 12 pages
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    It is common that a trained classification model is applied to operating
    data that deviates from the training data because of noise. This paper
    demonstrates that an ensemble classifier, Diversified Multiple Trees (DMT),
    is more robust at classifying noisy data than other widely used ensemble
    methods. DMT is tested on three real-world biological data sets from
    different laboratories, in comparison with four benchmark ensemble
    classifiers. Experimental results show that DMT is significantly more
    accurate than the other benchmark ensemble classifiers on noisy test data.
    We also discuss a limitation of DMT and its possible variations.

    A new recurrent neural network based predictive model for Faecal Calprotectin analysis: A retrospective study

    Zeeshan Khawar Malik, Zain U. Hussain, Ziad Kobti, Charlie W. Lees, Newton Howard, Amir Hussain
    Subjects: Learning (cs.LG)

    Faecal Calprotectin (FC) is a surrogate marker for intestinal inflammation,
    termed Inflammatory Bowel Disease (IBD), but not for cancer. In this
    retrospective study of 804 patients, an enhanced benchmark predictive model for
    analyzing FC is developed, based on a novel state-of-the-art Echo State Network
    (ESN), an advanced dynamic recurrent neural network which implements a
    biologically plausible architecture, and a supervised learning mechanism. The
    proposed machine learning driven predictive model is benchmarked against a
    conventional logistic regression model, demonstrating statistically significant
    performance improvements.

    Machine Learning, Linear and Bayesian Models for Logistic Regression in Failure Detection Problems

    B. Pavlyshenko
    Subjects: Learning (cs.LG)

    In this work, we study the use of logistic regression for manufacturing
    failure detection. As a data set for the analysis, we used data from the
    Kaggle competition Bosch Production Line Performance. We considered machine
    learning, linear and Bayesian models. For the machine learning approach, we
    analyzed an XGBoost tree-based classifier to obtain a high classification
    score. Using the generalized linear model for logistic regression makes it
    possible to analyze the influence of the factors under study. The Bayesian
    approach to logistic regression gives the statistical distribution of the
    model parameters, which can be useful in probabilistic analysis, e.g. risk
    assessment.
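
    An illustrative sketch of the three modelling routes on synthetic data;
    scikit-learn's GradientBoostingClassifier stands in for XGBoost, and the
    Bosch data itself is not reproduced here.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.ensemble import GradientBoostingClassifier  # stand-in for XGBoost

        rng = np.random.default_rng(0)
        X = rng.standard_normal((500, 8))
        y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.standard_normal(500) > 0).astype(int)

        glm = LogisticRegression().fit(X, y)          # linear model: factor influence
        print("coefficients:", glm.coef_.round(2))

        gbt = GradientBoostingClassifier().fit(X, y)  # ML approach: higher raw score
        print("train accuracy:", gbt.score(X, y))

        # A Bayesian treatment would place priors on the coefficients and
        # report posterior distributions (e.g. with PyMC), useful for risk
        # assessment.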

    A User Simulator for Task-Completion Dialogues

    Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen
    Comments: 14 pages
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Despite widespread interest in reinforcement learning for task-oriented
    dialogue systems, several obstacles can frustrate research and development
    progress. First, reinforcement learners typically require interaction with the
    environment, so conventional dialogue corpora cannot be used directly. Second,
    each task presents specific challenges, requiring a separate corpus of
    task-specific annotated data. Third, collecting and annotating human-machine or
    human-human conversations for task-oriented dialogues requires extensive domain
    knowledge. Because building an appropriate dataset can be both financially
    costly and time-consuming, one popular approach is to build a user simulator
    based upon a corpus of example dialogues. Then, one can train reinforcement
    learning agents in an online fashion as they interact with the simulator.
    Dialogue agents trained on these simulators can serve as an effective starting
    point. Once agents master the simulator, they may be deployed in a real
    environment to interact with humans, and continue to be trained online. To ease
    empirical algorithmic comparisons in dialogues, this paper introduces a new,
    publicly available simulation framework, where our simulator, designed for the
    movie-booking domain, leverages both rules and collected data. The simulator
    supports two tasks: movie ticket booking and movie seeking. Finally, we
    demonstrate several agents and detail the procedure to add and test your own
    agent in the proposed framework.

    Learning Features by Watching Objects Move

    Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    This paper presents a novel yet intuitive approach to unsupervised feature
    learning. Inspired by the human visual system, we explore whether low-level
    motion-based grouping cues can be used to learn an effective visual
    representation. Specifically, we use unsupervised motion-based segmentation on
    videos to obtain segments, which we use as ‘pseudo ground truth’ to train a
    convolutional network to segment objects from a single frame. Given the
    extensive evidence that motion plays a key role in the development of the human
    visual system, we hope that this straightforward approach to unsupervised
    learning will be more effective than cleverly designed ‘pretext’ tasks studied
    in the literature. Indeed, our extensive experiments show that this is the
    case. When used for transfer learning on object detection, our representation
    significantly outperforms previous unsupervised approaches across multiple
    settings, especially when training data for the target task is scarce.

    Learning Human-Understandable Strategies

    Sam Ganzfried, Farzana Yusuf
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Learning (cs.LG); Multiagent Systems (cs.MA)

    Algorithms for equilibrium computation generally make no attempt to ensure
    that the computed strategies are understandable by humans. For instance, the
    strategies for the strongest poker agents are represented as massive binary
    files. In many situations, we would like to compute strategies that can
    actually be implemented by humans, who may have computational limitations and
    may only be able to remember a small number of features or components of the
    strategies that have been computed. We study poker games where private
    information distributions can be arbitrary. We create a large training set of
    game instances and solutions, by randomly selecting the private information
    probabilities, and present algorithms that learn from the training instances in
    order to perform well in games with unseen information distributions. One
    approach first clusters the training points into a small number of clusters and
    then creates a small decision tree based on the cluster centers. This approach
    produces low test error and could be easily implemented by humans since it only
    requires memorizing a small number of “if-then” rules.
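
    The cluster-then-tree pipeline can be sketched as below, with synthetic
    data standing in for the poker game instances and their solutions.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.tree import DecisionTreeClassifier, export_text

        rng = np.random.default_rng(0)
        X = rng.random((1000, 2))                   # private-information probabilities
        y = (X[:, 0] > X[:, 1]).astype(int)         # stand-in for the computed solutions

        km = KMeans(n_clusters=8, n_init=10).fit(X)
        centers = km.cluster_centers_
        # label each center with the label of its nearest training instance
        labels = y[[int(np.argmin(((X - c) ** 2).sum(axis=1))) for c in centers]]

        tree = DecisionTreeClassifier(max_depth=2).fit(centers, labels)
        print(export_text(tree))                    # the small human-readable rule set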

    VAST : The Virtual Acoustic Space Traveler Dataset

    Clément Gaultier (PANAMA), Saurabh Kataria (PANAMA, IIT Kanpur), Antoine Deleforge (PANAMA)
    Comments: International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Feb 2017, Grenoble, France. International Conference on Latent Variable Analysis and Signal Separation
    Subjects: Sound (cs.SD); Learning (cs.LG)

    This paper introduces a new paradigm for sound source localization referred
    to as virtual acoustic space traveling (VAST) and presents a first dataset
    designed for this purpose. Existing sound source localization methods are
    either based on an approximate physical model (physics-driven) or on a
    specific-purpose calibration set (data-driven). With VAST, the idea is to learn
    a mapping from audio features to desired audio properties using a massive
    dataset of simulated room impulse responses. This virtual dataset is designed
    to be maximally representative of the potential audio scenes that the
    considered system may be evolving in, while remaining reasonably compact. We
    show that virtually-learned mappings on this dataset generalize to real data,
    overcoming some intrinsic limitations of traditional binaural sound
    localization methods based on time differences of arrival.

    A recurrent neural network without chaos

    Thomas Laurent, James von Brecht
    Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL); Learning (cs.LG)

    We introduce an exceptionally simple gated recurrent neural network (RNN)
    that achieves performance comparable to well-known gated architectures, such as
    LSTMs and GRUs, on the word-level language modeling task. We prove that our
    model has simple, predictable and non-chaotic dynamics. This stands in stark
    contrast to more standard gated architectures, whose underlying dynamical
    systems exhibit chaotic behavior.

    On Random Weights for Texture Generation in One Layer Neural Networks

    Mihir Mongia, Kundan Kumar, Akram Erraqabi, Yoshua Bengio
    Comments: ICASSP 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Recent work in the literature has shown experimentally that one can use the
    lower layers of a trained convolutional neural network (CNN) to model natural
    textures. More interestingly, it has also been shown experimentally that just
    one layer with random filters can model textures, although with less
    variability. In this paper we ask why one-layer CNNs with random filters are
    so effective at generating textures. We show theoretically that one-layer
    convolutional architectures (without a non-linearity), paired with the energy
    function used in previous literature, can in fact preserve and modulate
    frequency coefficients in such a manner that random weights and pretrained
    weights generate the same type of images. Based on the results of this
    analysis, we ask whether similar properties hold in the case where one uses
    one convolution layer with a non-linearity. We show that in the case of a
    ReLU non-linearity there are situations where only one input gives the
    minimum possible energy, whereas in the case of no non-linearity there are
    always infinitely many solutions that give the minimum possible energy. Thus
    we can show that in certain situations adding a ReLU non-linearity generates
    less variable images.

    Sample-efficient Deep Reinforcement Learning for Dialog Control

    Kavosh Asadi, Jason D. Williams
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Representing a dialog policy as a recurrent neural network (RNN) is
    attractive because it handles partial observability, infers a latent
    representation of state, and can be optimized with supervised learning (SL) or
    reinforcement learning (RL). For RL, a policy gradient approach is natural, but
    is sample inefficient. In this paper, we present 3 methods for reducing the
    number of dialogs required to optimize an RNN-based dialog policy with RL. The
    key idea is to maintain a second RNN which predicts the value of the current
    policy, and to apply experience replay to both networks. On two tasks, these
    methods reduce the number of dialogs/episodes required by about a third, vs.
    standard policy gradient methods.

    Adversarial Deep Structural Networks for Mammographic Mass Segmentation

    Wentao Zhu, Xiaohui Xie
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Mass segmentation is an important task in mammogram analysis, providing
    effective morphological features and regions of interest (ROI) for mass
    detection and classification. Inspired by the success of using deep
    convolutional features for natural image analysis and conditional random fields
    (CRF) for structural learning, we propose an end-to-end network for
    mammographic mass segmentation. The network employs a fully convolutional
    network (FCN) to model the potential function, followed by a CRF to perform
    structural learning. Because the mass distribution varies greatly with pixel
    position, the FCN is combined with a position prior for the task. Due to the
    small size of mammogram datasets, we use adversarial training to control
    over-fitting. Four models with different convolutional kernels are further
    fused to improve the segmentation results. Experimental results on two public
    datasets, INbreast and DDSM-BCRP, show that our end-to-end network combined
    with adversarial training achieves state-of-the-art results.

    Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

    Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Mammogram classification is directly related to computer-aided diagnosis of
    breast cancer. Traditional methods require great effort to annotate the
    training data through costly manual labeling, and specialized computational
    models to detect these annotations during testing. Inspired by the success of using deep
    convolutional features for natural image analysis and multi-instance learning
    for labeling a set of instances/patches, we propose end-to-end trained deep
    multi-instance networks for mass classification based on whole mammogram
    without the aforementioned costly need to annotate the training data. We
    explore three different schemes to construct deep multi-instance networks for
    whole mammogram classification. Experimental results on the INbreast dataset
    demonstrate the robustness of proposed deep networks compared to previous work
    using segmentation and detection annotations in the training.

    EgoTransfer: Transferring Motion Across Egocentric and Exocentric Domains using Deep Neural Networks

    Shervin Ardeshir, Krishna Regmi, Ali Borji
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Mirror neurons have been observed in the primary motor cortex of primate
    species, in particular in humans and monkeys. A mirror neuron fires when a
    person performs a certain action, and also when he observes the same action
    being performed by another person. A crucial step towards building fully
    autonomous intelligent systems with human-like learning abilities is the
    capability to model the mirror neuron. On one hand, the abundance of
    egocentric cameras in the past few years has offered the opportunity to study a
    lot of vision problems from the first-person perspective. A great deal of
    interesting research has been done during the past few years, trying to explore
    various computer vision tasks from the perspective of the self. On the other
    hand, videos recorded by traditional static cameras capture humans performing
    different actions from an exocentric third-person perspective. In this work, we
    take the first step towards relating motion information across these two
    perspectives. We train models that predict motion in an egocentric view, by
    observing it from an exocentric view, and vice versa. This allows models to
    predict what an egocentric motion would look like from the outside. To do so, we
    train linear and nonlinear models and evaluate their performance in terms of
    retrieving the egocentric (exocentric) motion features, while having access to
    an exocentric (egocentric) motion feature. Our experimental results demonstrate
    that motion information can be successfully transferred across the two views.
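
    A minimal sketch of the linear-model baseline, assuming paired motion
    descriptors from the two views; the toy data, dimensions, and ridge penalty
    below are stand-ins for the paper's actual features.

        import numpy as np

        # Toy stand-ins for paired motion descriptors from the two views
        rng = np.random.default_rng(0)
        n, d = 200, 64
        X_ego = rng.standard_normal((n, d))
        W_true = rng.standard_normal((d, d))
        X_exo = X_ego @ W_true + 0.1 * rng.standard_normal((n, d))

        # Ridge-regularized linear map ego -> exo, the simplest of the
        # linear/nonlinear models mentioned in the abstract
        lam = 1.0
        W = np.linalg.solve(X_ego.T @ X_ego + lam * np.eye(d), X_ego.T @ X_exo)

        # Retrieval test: predict the exocentric feature from the egocentric
        # one and check the rank of the true match among all candidates
        pred = X_ego @ W
        dists = np.linalg.norm(pred[:, None, :] - X_exo[None, :, :], axis=-1)
        ranks = (dists.argsort(axis=1) == np.arange(n)[:, None]).argmax(axis=1)
        print("mean rank of true match (0 = perfect):", ranks.mean())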

    Learning to predict where to look in interactive environments using deep recurrent q-learning

    Sajad Mousavi, Ali Borji, Nasser Mozayani
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Bottom-Up (BU) saliency models do not perform well in complex interactive
    environments where humans are actively engaged in tasks (e.g., sandwich making
    and playing video games). In this paper, we leverage Reinforcement Learning
    (RL) to highlight task-relevant locations of input frames. We propose a soft
    attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL
    agent how to play a game and where to look by focusing on the most pertinent
    parts of its visual input. Our evaluations on several Atari 2600 games show
    that the soft attention based model could predict fixation locations
    significantly better than bottom-up models such as the Itti-Koch saliency and
    Graph-Based Visual Saliency (GBVS) models.

    Towards Wide Learning: Experiments in Healthcare

    Snehasis Banerjee, Tanushyam Chattopadhyay, Swagata Biswas, Rohan Banerjee, Anirban Dutta Choudhury, Arpan Pal
    Comments: 4 pages, Machine Learning for Health Workshop, NIPS 2016
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    In this paper, a Wide Learning architecture is proposed that attempts to
    automate the feature engineering portion of the machine learning (ML) pipeline.
    Feature engineering is widely considered the most time-consuming and
    expertise-demanding portion of any ML task. The proposed feature recommendation
    approach is tested on 3 healthcare datasets: a) PhysioNet Challenge 2016
    dataset of phonocardiogram (PCG) signals, b) MIMIC II blood pressure
    classification dataset of photoplethysmogram (PPG) signals and c) an emotion
    classification dataset of PPG signals. While the proposed method beats the
    state-of-the-art techniques on the 2nd and 3rd datasets, it reaches 94.38% of the
    accuracy level of the winner of the PhysioNet Challenge 2016. In all cases, the
    effort to reach a satisfactory performance was drastically less (a few days)
    than manual feature engineering.
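
    The abstract does not describe the recommendation machinery, so the following
    is only a crude proxy for the idea: generate simple candidate transforms of
    each raw column and rank them by a relevance score (here, absolute Pearson
    correlation with the target). Everything in this sketch is an assumption.

        import numpy as np

        def recommend_features(X, y, top_k=5):
            """Score candidate transforms of each column by absolute Pearson
            correlation with the target (a crude stand-in for the paper's
            feature-recommendation step)."""
            transforms = {
                "identity": lambda c: c,
                "square":   lambda c: c ** 2,
                "abs":      lambda c: np.abs(c),
                "log1p|.|": lambda c: np.log1p(np.abs(c)),
            }
            scored = []
            for j in range(X.shape[1]):
                for name, fn in transforms.items():
                    f = fn(X[:, j])
                    if np.std(f) == 0:
                        continue
                    r = abs(np.corrcoef(f, y)[0, 1])
                    scored.append((r, "col%d:%s" % (j, name)))
            return sorted(scored, reverse=True)[:top_k]

        rng = np.random.default_rng(1)
        X = rng.standard_normal((300, 4))
        y = X[:, 2] ** 2 + 0.1 * rng.standard_normal(300)
        print(recommend_features(X, y))   # "col2:square" should rank first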

    Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation

    Mirko Polato, Fabio Aiolli
    Comments: Under revision for Neurocomputing (Elsevier Journal)
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The increasing availability of implicit feedback datasets has raised the
    interest in developing effective collaborative filtering techniques able to
    deal asymmetrically with unambiguous positive feedback and ambiguous negative
    feedback. In this paper, we propose a principled kernel-based collaborative
    filtering method for top-N item recommendation with implicit feedback. We
    present an efficient implementation using the linear kernel, and we show how to
    generalize it to kernels of the dot product family preserving the efficiency.
    We also investigate the factors that influence the sparsity of a standard
    cosine kernel. This analysis shows that the sparsity of the kernel strongly
    depends on the properties of the dataset, in particular on its long-tail
    distribution. We compare our method with state-of-the-art algorithms, achieving
    good results both in terms of efficiency and effectiveness.
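
    A minimal sketch of the linear-kernel case on implicit feedback, assuming a
    sparse user-item matrix of observed positives; the scoring rule below is a
    generic kernel-weighted sum, not necessarily the paper's exact method.

        import numpy as np
        from scipy import sparse

        # Implicit-feedback matrix R (users x items): 1 = observed positive
        rng = np.random.default_rng(2)
        R = sparse.random(100, 50, density=0.05, format="csr", random_state=2)
        R.data[:] = 1.0

        # Linear kernel between items, computed sparsely; its sparsity comes
        # directly from item co-occurrence (the abstract's efficiency argument)
        K = (R.T @ R).toarray()                       # (items x items)
        norms = np.sqrt(np.diag(K)) + 1e-12
        K_cos = K / norms[:, None] / norms[None, :]   # cosine normalization

        # Top-N scores for one user: kernel-weighted sum over the user's
        # positives, masking items already consumed
        u = int(np.argmax(R.getnnz(axis=1)))          # a user with positives
        seen = R[u].toarray().ravel().astype(bool)
        scores = K_cos[seen].sum(axis=0)
        scores[seen] = -np.inf
        print("top-5 items for user", u, ":", np.argsort(scores)[::-1][:5])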

    Mutual information for fitting deep nonlinear models

    Jacob S. Hunter (1), Nathan O. Hodas (1) ((1) Pacific Northwest National Laboratory)
    Subjects: Optimization and Control (math.OC); Learning (cs.LG); Machine Learning (stat.ML)

    Deep nonlinear models pose a challenge for fitting parameters due to lack of
    knowledge of the hidden layer and the potentially non-affine relation of the
    initial and observed layers. In the present work we investigate the use of
    information theoretic measures such as mutual information and Kullback-Leibler
    (KL) divergence as objective functions for fitting such models without
    knowledge of the hidden layer. We investigate one model as a proof of concept
    and one application to cognitive performance. We further investigate the use of
    optimizers with these methods. Mutual information is largely successful as an
    objective, depending on the parameters. KL divergence is found to be similarly
    successful, given some knowledge of the statistics of the hidden layer.

    Quantum Reinforcement Learning

    Daniel Crawford, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, Pooya Ronagh
    Comments: 16 pages
    Subjects: Quantum Physics (quant-ph); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)

    We investigate whether quantum annealers with select chip layouts can
    outperform classical computers in reinforcement learning tasks. We associate a
    transverse field Ising spin Hamiltonian with a layout of qubits similar to that
    of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to
    numerically simulate quantum sampling from this system. We design a
    reinforcement learning algorithm in which the set of visible nodes representing
    the states and actions of an optimal policy are the first and last layers of
    the deep network. In the absence of a transverse field, our simulations show that
    DBMs train more effectively than restricted Boltzmann machines (RBM) with the
    same number of weights. Since sampling from Boltzmann distributions of a DBM is
    not classically feasible, this is evidence of the supremacy of a non-Turing
    sampling oracle. We then develop a framework for training the network as a
    quantum Boltzmann machine (QBM) in the presence of a significant transverse
    field for reinforcement learning. The latter method further improves the
    reinforcement learning method using DBMs.


    Information Theory

    Compressive Sensing for Millimeter Wave Antenna Array Diagnosis

    Mohammed E. Eltayeb, Tareq Y. Al-Naffouri, Robert W. Heath Jr
    Subjects: Information Theory (cs.IT)

    The radiation pattern of an antenna array depends on the excitation weights
    and the geometry of the array. Due to wind and atmospheric conditions, outdoor
    millimeter wave antenna elements are subject to full or partial blockages from
    a plethora of particles like dirt, salt, ice, and water droplets. Handheld
    devices are also subject to blockages from random finger placement and/or
    fingerprints. These blockages cause absorption and scattering of the signal
    incident on the array, and change the array geometry. This distorts the
    far-field radiation pattern of the array leading to an increase in the sidelobe
    level and a decrease in gain. This paper studies the effects of blockages on the
    far-field radiation pattern of linear arrays and proposes two array diagnosis
    techniques for millimeter wave antenna arrays. The proposed techniques jointly
    estimate the locations of the blocked antennas and the induced attenuation and
    phase shifts. Numerical results show that the proposed techniques provide
    satisfactory results in terms of fault detection with a reduced number of
    measurements (diagnosis time) provided that the number of blockages is small
    compared to the array size.
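
    Since blockages affect only a few elements, diagnosis can be posed as sparse
    recovery of the perturbation to the nominal excitation from a few far-field
    samples. Below is a toy sketch using a generic ISTA solver; the measurement
    model, regularization, and thresholds are all illustrative assumptions, not
    the paper's proposed techniques.

        import numpy as np

        rng = np.random.default_rng(3)
        N, M, n_blocked = 64, 20, 3        # antennas, measurements, blockages

        # Far-field samples of a uniform linear array at random angles
        theta = rng.uniform(-np.pi / 2, np.pi / 2, M)
        A = np.exp(1j * np.pi * np.outer(np.sin(theta), np.arange(N)))

        w_nom = np.ones(N, dtype=complex)             # healthy excitation
        delta = np.zeros(N, dtype=complex)            # blockage-induced change
        idx = rng.choice(N, n_blocked, replace=False)
        delta[idx] = -(0.5 + 0.5 * rng.random(n_blocked)) \
                     * np.exp(1j * rng.uniform(0, 2 * np.pi, n_blocked))

        y = A @ (w_nom + delta)                       # measured pattern samples
        r = y - A @ w_nom                             # residual: r = A @ delta

        # ISTA for min ||r - A d||^2 + lam ||d||_1 (complex soft threshold)
        L = np.linalg.norm(A, 2) ** 2
        d, lam = np.zeros(N, dtype=complex), 0.5
        for _ in range(500):
            g = d + (A.conj().T @ (r - A @ d)) / L
            d = np.exp(1j * np.angle(g)) * np.maximum(np.abs(g) - lam / L, 0)

        print("true blocked:", sorted(idx),
              "estimated:", sorted(np.flatnonzero(np.abs(d) > 0.1)))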

    Learning the Differential Correlation Matrix of a Smooth Function From Point Samples

    Armin Eftekhari, Ping Li, Michael B. Wakin, Rachel A. Ward
    Subjects: Information Theory (cs.IT)

    Consider an open set \(\mathbb{D}\subseteq\mathbb{R}^n\), equipped with a
    probability measure \(\mu\). An important characteristic of a smooth function
    \(f:\mathbb{D}\rightarrow\mathbb{R}\) is its differential correlation matrix
    \(\Sigma_{\mu}:=\int \nabla f(x)(\nabla f(x))^*\,\mu(dx)\in\mathbb{R}^{n\times n}\),
    where \(\nabla f(x)\in\mathbb{R}^n\) is the gradient of \(f(\cdot)\) at
    \(x\in\mathbb{D}\). For instance, the span of the leading \(r\) eigenvectors of
    \(\Sigma_{\mu}\) forms an active subspace of \(f(\cdot)\), thereby extending the
    concept of principal component analysis to the problem of ridge approximation.
    In this work, we propose a simple algorithm for estimating \(\Sigma_{\mu}\) from
    point values of \(f(\cdot)\) without imposing any structural assumptions on
    \(f(\cdot)\). Theoretical guarantees for this algorithm are provided with the aid
    of the same technical tools that have proved valuable in the context of
    covariance matrix estimation from partial measurements.
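
    To make the object concrete, here is a Monte Carlo estimate of
    \(\Sigma_{\mu}\) via central differences; this only illustrates the matrix and
    its active subspace, not the paper's algorithm, which works from unstructured
    point values and comes with guarantees.

        import numpy as np

        def estimate_sigma(f, pts, h=1e-4):
            """Monte Carlo estimate of Sigma = E_mu[grad f(x) grad f(x)^T]
            via central differences (an illustration, not the paper's method)."""
            n = pts.shape[1]
            sigma = np.zeros((n, n))
            for x in pts:
                g = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h)
                              for e in np.eye(n)])
                sigma += np.outer(g, g)
            return sigma / len(pts)

        # Ridge function f(x) = cos(a.x): Sigma is rank one, spanned by a,
        # so the active subspace is one-dimensional
        a = np.array([1.0, 2.0, 0.0])
        f = lambda x: np.cos(a @ x)
        pts = np.random.default_rng(4).standard_normal((2000, 3))
        S = estimate_sigma(f, pts)
        print("eigenvalues (one dominant):", np.round(np.linalg.eigvalsh(S), 3))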

    Codes correcting deletions in oblivious and random models

    Venkatesan Guruswami, Ray Li
    Subjects: Information Theory (cs.IT); Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS)

    We prove the existence of binary codes of positive rate that can correct an
    arbitrary pattern of a \(p\) fraction of deletions, for any \(p < 1\), when the bit
    positions to delete are picked obliviously of the codeword. Formally, we prove
    that for every \(p < 1\), there exists \(\mathcal{R} > 0\) and a code family with a
    randomized encoder \(\mathsf{Enc}: \{0,1\}^{\mathcal{R} n} \to \{0,1\}^n\) and
    (deterministic) decoder \(\mathsf{Dec}: \{0,1\}^{(1-p)n} \to
    \{0,1\}^{\mathcal{R} n}\) such that for all deletion patterns \(\tau\) with \(pn\)
    deletions and all messages \(m \in \{0,1\}^{\mathcal{R} n}\), \(\textbf{Pr}[
    \mathsf{Dec}(\tau(\mathsf{Enc}(m))) \neq m ] \le o(1)\), where the probability
    is over the randomness of the encoder (which is private to the encoder and not
    known to the decoder). The oblivious model is a significant generalization of
    the random deletion channel where each bit is deleted independently with
    probability \(p\). For the random deletion channel, existence of codes of rate
    \((1-p)/9\), and thus bounded away from \(0\) for any \(p < 1\), was already known.
    We give an explicit construction with polynomial time encoding and deletion
    correction algorithms with rate \(c_0 (1-p)\) for an absolute constant \(c_0 > 0\).

    Distributed Optimization of Hierarchical Small Cell Networks: A GNEP Framework

    Jiaheng Wang, Wei Guan, Yongming Huang, Robert Schober, Xiaohu You
    Comments: Accepted by IEEE Journal on Selected Areas in Communications
    Subjects: Information Theory (cs.IT)

    Deployment of small cell base stations (SBSs) overlaying the coverage area of
    a macrocell BS (MBS) results in a two-tier hierarchical small cell network.
    Cross-tier and inter-tier interference not only jeopardize primary macrocell
    communication but also limit the spectral efficiency of small cell
    communication. This paper focuses on distributed interference management for
    downlink small cell networks. We address the optimization of transmit
    strategies from both the game theoretical and the network utility maximization
    (NUM) perspectives and show that they can be unified in a generalized Nash
    equilibrium problem (GNEP) framework. Specifically, the small cell network
    design is first formulated as a GNEP, where the SBSs and MBS compete for the
    spectral resources by maximizing their own rates while satisfying global
    quality of service (QoS) constraints. We analyze the GNEP via variational
    inequality theory and propose distributed algorithms, which only require the
    broadcasting of some pricing information, to achieve a generalized Nash
    equilibrium (GNE). Then, we also consider a nonconvex NUM problem that aims to
    maximize the sum rate of all BSs subject to global QoS constraints. We
    establish the connection between the NUM problem and a penalized GNEP and show
    that its stationary solution can be obtained via a fixed point iteration of the
    GNE. We propose GNEP-based distributed algorithms that achieve a stationary
    solution of the NUM problem at the expense of additional signaling overhead and
    complexity. The convergence of the proposed algorithms is proved and guaranteed
    for properly chosen algorithm parameters. The proposed GNEP framework can scale
    from a QoS constrained game to a NUM design for small cell networks by trading
    off signaling overhead and complexity.

    Accurate Reconstruction of Finite Rate of Innovation Signals on the Sphere

    Yahya Sattar, Zubair Khalid, Rodney A. Kennedy
    Comments: 5 Pages, 1 Figure, Submitted for review in IEEE Signal Processing Letters
    Subjects: Information Theory (cs.IT)

    We develop a method for the accurate reconstruction of non-bandlimited finite
    rate of innovation signals on the sphere. For signals consisting of a finite
    number of Dirac functions on the sphere, we develop an annihilating filter
    based method for the accurate recovery of parameters of the Dirac functions
    using a finite number of observations of the bandlimited signal. In comparison
    to existing techniques, the proposed method enables more accurate
    reconstruction primarily due to better conditioning of systems involved in the
    recovery of parameters. For the recovery of \(K\) Diracs on the sphere, the
    proposed method requires samples of the signal bandlimited in the spherical
    harmonic (SH) domain at SH degree equal to or greater than \(K + \sqrt{K +
    \frac{1}{4}} - \frac{1}{2}\). In comparison to the existing state-of-the-art
    technique, the required bandlimit, and consequently the number of samples, of
    the proposed method is the same or less. We also conduct numerical experiments
    to demonstrate that the proposed technique is more accurate than the existing
    methods by a factor of \(10^{7}\) or more for \(2 \le K \le 20\).
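
    For readers unfamiliar with annihilating filters, here is the classical 1-D
    analogue: recovering \(K\) Dirac locations on the circle from \(2K+1\) Fourier
    coefficients. The paper does this for Diracs on the sphere with SH
    coefficients, and its contribution lies in the better-conditioned systems;
    this toy is only the underlying principle.

        import numpy as np

        rng = np.random.default_rng(9)
        K = 3
        t = np.sort(rng.random(K))               # Dirac locations in [0,1)
        a = 1 + rng.random(K)                    # amplitudes
        m = np.arange(-K, K + 1)
        X = (a * np.exp(-2j * np.pi * np.outer(m, t))).sum(axis=1)  # Fourier coeffs

        # Each row enforces sum_l h[l] X[m-l] = 0 at one valid output index
        # (X[j] stores the coefficient of frequency m = j - K)
        T = np.array([X[K + i - np.arange(K + 1)] for i in range(K)])
        _, _, Vh = np.linalg.svd(T)
        h = Vh[-1].conj()                        # annihilating filter (nullspace)
        u = np.roots(h)                          # roots encode the locations
        t_hat = np.sort(np.mod(-np.angle(u) / (2 * np.pi), 1))
        print("true locations:", np.round(t, 4))
        print("recovered:     ", np.round(t_hat, 4))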

    Compact Full Duplex MIMO Radios in D2D Underlaid Cellular Networks: From System Design to Prototype Results

    MinKeun Chung, Min Soo Sim, Dong Ku Kim, Chan-Byoung Chae
    Comments: Submitted to IEEE Journal on Selected Areas in Communications
    Subjects: Information Theory (cs.IT)

    This paper considers the implementation and application possibilities of a
    compact full duplex multiple-input multiple-output (MIMO) architecture for
    users in a network where direct communication exists between users, e.g.,
    device-to-device (D2D) and cellular links coexist in the same spectrum. For the
    architecture of the compact full duplex radio, we combine a
    dual-polarization-based analog self-interference canceler with high
    cross-polarization discrimination (XPD) and a Long Term Evolution (LTE)-based
    per-subcarrier digital self-interference canceler. While we consider the compactness and power
    efficiency of an analog solution, we focus on the digital canceler design with
    robustness to a frequency-selective channel and high compatibility with a
    conventional LTE system. For an over-the-air wireless experiment through a full
    duplex link prototype with a two-user pair, we implement a full duplex MIMO
    physical layer (PHY), supporting a 20 MHz bandwidth, on an FPGA-based
    software-defined radio platform. Further, we propose a novel timing
    synchronization strategy to construct a more viable full duplex MIMO link. By
    having the full duplex link prototype fully operating in real-time, we present
    the first characterization of the proposed compact full duplex MIMO performance
    depending on the transmit power of the full duplex node, and the link quality
    between nodes. One of the crucial insights of this work is that a full duplex
    user is capable of acquiring a throughput gain if it has self-interference
    cancellation capability with guaranteed performance.

    Multi-Kernel Construction of Polar Codes

    Frederic Gabry, Valerio Bioglio, Ingmar Land, Jean-Claude Belfiore
    Comments: submitted to Workshop on Channel Coding for 5G and Future Networks – IEEE ICC 2017
    Subjects: Information Theory (cs.IT)

    We propose a generalized construction for binary polar codes based on mixing
    multiple kernels of different sizes in order to construct polar codes of block
    lengths that are not restricted to powers of a single integer. This results in a multi-kernel
    polar code with very good performance while the encoding complexity remains low
    and the decoding follows the same general structure as for the original Arikan
    polar codes. The construction provides numerous practical advantages as more
    code lengths can be achieved without puncturing or shortening. We observe
    numerically that the error-rate performance of our construction outperforms
    state-of-the-art constructions using puncturing methods.
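
    The construction reduces to taking Kronecker products of kernels of different
    sizes. A minimal sketch follows; the 3x3 kernel below is one valid invertible
    binary kernel chosen for illustration, and may differ from the kernels the
    paper uses.

        import numpy as np

        # Binary polarization kernels of sizes 2 and 3
        T2 = np.array([[1, 0],
                       [1, 1]], dtype=int)
        T3 = np.array([[1, 0, 0],
                       [1, 1, 0],
                       [0, 1, 1]], dtype=int)

        def multi_kernel_generator(kernels):
            """Transform of a multi-kernel polar code: the Kronecker product
            of the per-stage kernels, giving block length = product of kernel
            sizes (e.g. 2 * 3 = 6, unreachable with the 2x2 kernel alone)."""
            G = np.array([[1]], dtype=int)
            for T in kernels:
                G = np.kron(G, T) % 2
            return G

        G6 = multi_kernel_generator([T2, T3])   # 6 x 6 transform
        u = np.array([0, 0, 1, 0, 1, 1])        # frozen + information bits
        x = u @ G6 % 2                          # encoded block
        print(G6.shape, x)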

    Distributed Detection of a Non-cooperative Target via Generalized Locally-optimum Approaches

    D. Ciuonzo, P. Salvo Rossi
    Comments: Accepted in Elsevier Information Fusion, Special Issue on Event-Based Distributed Information Fusion Over Sensor Networks (invited paper)
    Subjects: Information Theory (cs.IT)

    In this paper we tackle distributed detection of a non-cooperative target
    with a Wireless Sensor Network (WSN). When the target is present, sensors
    observe an unknown random signal with amplitude attenuation depending on the
    distance between the sensor and the target (unknown) positions, embedded in
    white Gaussian noise. The Fusion Center (FC) receives sensors decisions through
    error-prone Binary Symmetric Channels (BSCs) and is in charge of performing a
    (potentially) more-accurate global decision. The resulting problem is a
    one-sided testing problem with nuisance parameters present only under the
    target-present hypothesis. We first focus on fusion rules based on Generalized
    Likelihood Ratio Test (GLRT), Bayesian and hybrid approaches. Then, aimed at
    reducing the computational complexity, we develop fusion rules based on
    generalizations of the well-known Locally-Optimum Detection (LOD) framework.
    Finally, all the proposed rules are compared in terms of performance and
    complexity.
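
    As background for the fusion rules discussed above, here is the simple
    channel-aware log-likelihood-ratio fusion baseline with known local detection
    probabilities; the paper's GLRT and locally-optimum rules replace the unknown
    target parameters, and the numbers below are toy values.

        import numpy as np

        rng = np.random.default_rng(5)
        K, eps = 20, 0.1                 # sensors, BSC crossover probability
        p_fa, p_d = 0.05, 0.6            # local false-alarm / detection probs

        # Probability the FC *receives* a "1" from a sensor, per hypothesis
        q0 = p_fa * (1 - eps) + (1 - p_fa) * eps
        q1 = p_d * (1 - eps) + (1 - p_d) * eps

        def fuse(received_bits):
            """Channel-aware LLR fusion of local decisions (known-p_d baseline)."""
            u = np.asarray(received_bits)
            return np.sum(u * np.log(q1 / q0)
                          + (1 - u) * np.log((1 - q1) / (1 - q0)))

        # Monte Carlo check of the fused statistic under both hypotheses
        h0 = rng.random((5000, K)) < q0
        h1 = rng.random((5000, K)) < q1
        t0 = np.array([fuse(r) for r in h0])
        t1 = np.array([fuse(r) for r in h1])
        thr = np.quantile(t0, 0.99)      # ~1% global false-alarm rate
        print("detection prob at 1% FA:", (t1 > thr).mean())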

    Design and Simulation of a Power Efficient Waveguide Rotman Lens

    Marjan Kashani, Ali Khaleghi
    Subjects: Information Theory (cs.IT)

    The Rotman lens is a true-time-delay beamforming network with a parallel-plate
    structure. Owing to its frequency independence and wide-band operation, it has
    found wide application as a feeding element for linear array antennas. However,
    the lens suffers from high intrinsic power loss, and the problem worsens in
    microstrip structures due to their inevitable dielectric loss. On the other hand,
    waveguides experience much lower loss and also have higher power capacity.
    Thus, waveguides can be advantageously utilized in Rotman lens structure. Here,
    a waveguide Rotman lens for the frequency range of 8 to 12 GHz is designed and
    simulated in CST Microwave Studio. Phase error and power efficiency of the
    proposed structure are satisfactory.

    Multiple Access for Visible Light Communications: Research Challenges and Future Trends

    Sarah S. Bawazir, Paschalis C. Sofotasios, Sami Muhaidat, Yousof Al-Hammadi, George K. Karagiannidis
    Subjects: Information Theory (cs.IT)

    The ever-increasing demand of mobile Internet and multimedia services poses
    unique and significant challenges for current and future generation wireless
    networks. These challenges mainly relate to the support of massive
    ubiquitous connectivity, low latency and highly efficient utilization of
    spectrum resources. Therefore, it is vital to consider them extensively prior
    to the design and deployment of future wireless networks. To this end, the present
    article provides a comprehensive overview of a particularly promising and
    effective wireless technology, namely, visible light communication (VLC). In
    this context, we initially provide a thorough overview of frequency domain
    multiple access techniques for single and multi-carrier systems, which is then
    followed by an in depth discussion on the technical considerations of optical
    code division multiple access techniques and their adoption in indoor VLC
    applications. Furthermore, we address space division multiple access and,
    finally, we revisit and analyze the distinct characteristics of a new
    technology, namely, non-orthogonal multiple access, which has been proposed
    recently as a particularly effective solution.

    Secure Communications in the Presence of a Buffer-Aided Wireless-Powered Relay With Self-Energy Recycling

    Ahmed El Shafie, Naofal Al-Dhahir
    Comments: Published in IEEE Wireless Communications Letters in October 2015, this http URL
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    We consider a relay-assisted wireless network, where the energy-harvesting
    buffer-aided relay node is powered by radio-frequency signals from a source
    node wishing to communicate with its destination. We propose two secure
    cooperative protocols for a network composed of a source node equipped with a
    data buffer communicating with its destination in the presence of a
    buffer-aided relay node and an eavesdropper. Our proposed protocols are
    designed based on the channel state information and the buffer state
    information at the source and relay nodes. The burstiness of data at the source
    node queue (buffer) and the energy recycling process at the relay are taken
    into account in our analysis. In addition, we take the decoding and signal
    processing circuit power consumption constraints into consideration. For the
    limiting case of an infinite-size battery at the relay, we derive a sufficient
    condition for the energy queue to saturate. Our numerical results demonstrate
    the throughput gains of our proposed protocols.

    A Coordinate-Descent Framework to Design Low PSL/ISL Sequences

    M. Alaee, A. Aubry, A. De Maio, M. M. Naghsh, M. Modarres-Hashemi
    Subjects: Information Theory (cs.IT)

    This paper is focused on the design of phase sequences with good (aperiodic)
    autocorrelation properties in terms of Peak Sidelobe Level (PSL) and Integrated
    Sidelobe Level (ISL). The problem is formulated as a bi-objective Pareto
    optimization forcing either a continuous or a discrete phase constraint at the
    design stage. An iterative procedure based on the coordinate descent method is
    introduced to deal with the resulting optimization problems which are
    non-convex and NP-hard in general. Each iteration of the devised method
    requires the solution of a non-convex min-max problem. It is handled either
    through a novel bisection or an FFT-based method for the continuous and the
    discrete phase constraint, respectively. Additionally, a heuristic approach to
    initialize the procedures employing the \(\ell_p\)-norm minimization technique is
    proposed. Simulation results illustrate that the proposed methodologies can
    outperform some counterparts providing sequences with good autocorrelation
    features especially in the discrete phase/binary case.
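
    A bare-bones version of the coordinate-descent idea is sketched below:
    minimize ISL one phase at a time over a discrete alphabet. Note the paper
    solves each coordinate update exactly (bisection or FFT-based) and also
    handles PSL; this sketch simply grid-searches each phase as an assumption.

        import numpy as np

        def isl(s):
            """Integrated sidelobe level of the aperiodic autocorrelation."""
            N = len(s)
            r = np.fft.ifft(np.abs(np.fft.fft(s, 2 * N)) ** 2)
            return np.sum(np.abs(r[1:N]) ** 2)

        def coordinate_descent_isl(N=64, n_phases=64, sweeps=20, seed=6):
            rng = np.random.default_rng(seed)
            alphabet = np.exp(2j * np.pi * np.arange(n_phases) / n_phases)
            s = alphabet[rng.integers(n_phases, size=N)].astype(complex)
            for _ in range(sweeps):
                for n in range(N):
                    # Try every phase value at position n, keep the best
                    trials = np.tile(s, (n_phases, 1))
                    trials[:, n] = alphabet
                    best = min(range(n_phases), key=lambda k: isl(trials[k]))
                    s[n] = alphabet[best]
            return s

        s = coordinate_descent_isl()
        print("final ISL:", isl(s).round(2), "vs ~N^2/2 for random:", 64**2 / 2)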

    Dependence Measures Bounding the Exploration Bias for General Measurements

    Jiantao Jiao, Yanjun Han, Tsachy Weissman
    Subjects: Information Theory (cs.IT)

    We propose a framework to analyze and quantify the bias in adaptive data
    analysis. It generalizes that proposed by Russo and Zou’15, applying to all
    measurements whose moment generating function exists, and to all measurements
    with a finite \(p\)-norm. We introduce a new class of dependence measures which
    retain key properties of mutual information while more effectively quantifying
    the exploration bias for heavy tailed distributions. We provide examples of
    cases where our bounds are nearly tight in situations where the original
    framework of Russo and Zou’15 does not apply.
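
    The phenomenon being bounded is easy to exhibit numerically: after adaptively
    selecting the best-looking of m measurements, the reported value is biased
    upward even though every true mean is zero. The classical
    \(\sqrt{2 \log m / n}\) benchmark below is the standard Gaussian maximal
    inequality, quoted only for scale.

        import numpy as np

        rng = np.random.default_rng(7)
        m, n, trials = 50, 100, 2000

        bias = []
        for _ in range(trials):
            data = rng.standard_normal((m, n))     # m measurements, true mean 0
            means = data.mean(axis=1)
            bias.append(means.max())               # adaptively chosen "discovery"
        print("true mean: 0.0, selected empirical mean: %.3f" % np.mean(bias))
        print("sqrt(2 log m / n) scale: %.3f" % np.sqrt(2 * np.log(m) / n))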

    Improper Signaling in Two-Path Relay Channels

    Mohamed Gaafar, Osama Amin, Rafael F. Schaefer, Mohamed-Slim Alouini
    Comments: Submitted to 2017 ICC Workshops
    Subjects: Information Theory (cs.IT)

    Inter-relay interference (IRI) challenges the operation of two-path relaying
    systems. Furthermore, the unavailability of the channel state information (CSI)
    at the source and the limited detection capabilities at the relays prevent,
    respectively, eliminating the interference and adopting joint detection at the
    relay nodes. Improper signaling is a powerful signaling scheme that reduces the
    interference impact at the receiver side and improves the achievable rate
    performance. Therefore, improper signaling is adopted at both relays, which
    have access to the global CSI of the proposed system. Then, the improper signal
    characteristics are designed to maximize the total end-to-end achievable rate
    at the relays. To this end, both the power and the circularity coefficient, a
    measure of the impropriety degree of the signal, are optimized at the relays.
    Although the optimization problem is not convex, optimal power allocation for
    both relays for a fixed circularity coefficient is obtained. Moreover, the
    circularity coefficient is tuned to maximize the rate for a given power
    allocation. Finally, a joint solution of the optimization problem is proposed
    using a coordinate descent method based on alternate optimization. The
    simulation results show that employing improper signaling improves the
    achievable rate at medium and high IRI.

    On the Non-existence of certain classes of perfect p-ary sequences and perfect almost p-ary sequences

    Chang Lv
    Comments: 16 pages, git20160928a/43dd4da
    Subjects: Information Theory (cs.IT); Number Theory (math.NT)

    We obtain new non-existence results for perfect p-ary sequences with period n
    (called type \([p, n]\)). The first case is a class with type \([p \equiv 5
    \pmod 8, p^a q n']\). The second case contains five types \([p \equiv 3 \pmod
    4, p^a q^l n']\) for certain \((p, q)\) and \(l\). Moreover, we also have
    similar non-existence results for perfect almost p-ary sequences.

    Random linear systems with sparse solutions — finite dimensions

    Mihailo Stojnic
    Subjects: Optimization and Control (math.OC); Information Theory (cs.IT); Probability (math.PR)

    In our companion work \cite{Stojnicl1RegPosasymldp} we revisited random
    under-determined linear systems with sparse solutions. The main emphasis was on
    the performance analysis of the \(\ell_1\) heuristic in the so-called asymptotic
    regime, i.e. in the regime where the systems' dimensions are large. Through an
    earlier sequence of work
    \cite{DonohoPol,DonohoUnsigned,StojnicCSetam09,StojnicUpper10}, it is now well
    known that in such a regime the \(\ell_1\) exhibits the so-called phase
    transition (PT) phenomenon. \cite{Stojnicl1RegPosasymldp} then went much
    further and established the so-called large deviations principle (LDP)
    type of behavior that characterizes not only the breaking points of the
    \(\ell_1\)'s success but also the behavior in the entire so-called
    transition zone around these points. Both of these concepts, the PTs and
    the LDPs, are in fact defined so that one can use them to characterize the
    asymptotic behavior. In this paper we complement the results of
    \cite{Stojnicl1RegPosasymldp} by providing an exact detailed analysis in the
    non-asymptotic regime. Of course, not only are the non-asymptotic results
    complementing those from \cite{Stojnicl1RegPosasymldp}, they actually are the
    ones that ultimately fully characterize the \(\ell_1\)'s behavior in the most
    general sense. We introduce several novel high-dimensional geometry type of
    strategies that enable us to eventually determine the \(\ell_1\)'s behavior.

    Energy optimization for distributions on the sphere and improvement to the Welch bounds

    Yan Shuo Tan
    Subjects: Probability (math.PR); Information Theory (cs.IT)

    For any Borel probability measure on \(\mathbb{R}^n\), we may define a family
    of eccentricity tensors. This new notion, together with a tensorization trick,
    allows us to prove an energy minimization property for rotationally invariant
    probability measures. We use this theory to give a new proof of the Welch
    bounds, and to improve upon them for collections of real vectors. In addition,
    we are able to give elementary proofs for two theorems characterizing
    probability measures optimizing one-parameter families of energy integrals on
    the sphere. We are also able to explain why a phase transition occurs for
    optimizers of these two families.
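
    For context, the first Welch bound the paper improves upon is easy to state
    and check numerically: for \(M\) unit-norm vectors in dimension \(n\) with
    \(M \ge n\), the maximum squared inner product is at least \((M-n)/(n(M-1))\).
    The random-vector comparison below is only illustrative.

        import numpy as np

        def welch_bound(M, n):
            """First Welch bound on max |<vi,vj>|^2 among M unit vectors in R^n."""
            return (M - n) / (n * (M - 1))

        rng = np.random.default_rng(8)
        M, n = 30, 8
        V = rng.standard_normal((M, n))
        V /= np.linalg.norm(V, axis=1, keepdims=True)   # random unit vectors
        G = np.abs(V @ V.T) ** 2
        np.fill_diagonal(G, 0)
        print("max |<vi,vj>|^2 (random): %.4f" % G.max())
        print("Welch lower bound:        %.4f" % welch_bound(M, n))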

    An Artificial-Noise-Aided Hybrid TS/PS Scheme for OFDM-Based SWIPT Systems

    Ahmed El Shafie, Kamel Tourki, Naofal Al-Dhahir
    Comments: Accepted for publication in IEEE Communications Letters Dec 2016
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    We propose a new artificial-noise aided hybrid time-switching/power-splitting
    scheme for orthogonal frequency-division multiplexing (OFDM) systems to
    securely transmit data and transfer energy to a legitimate receiving node. In
    our proposed scheme, the cyclic prefix has two more benefits in addition to the
    cancellation of the inter-symbol interference between the OFDM blocks. Firstly,
    it enables the legitimate transmitter to send artificial-noise (AN) vectors in
    a way such that the interference can be canceled at the legitimate receiver
    prior to information decoding. Secondly, its power is used to energize the
    legitimate receiver. We optimize the cyclic prefix length, the time-switching
    and power-splitting parameters, and the power allocation ratio between the data
    and AN signals at the legitimate transmitter to maximize the average secrecy
    rate subject to a constraint on the average energy transfer rate at the
    legitimate receiver. Our numerical results demonstrate that our proposed scheme
    can achieve up to 23% average secrecy rate gain relative to a pure
    power-splitting scheme.

    On the absolute irreducibility of hyperplane sections of generalized Fermat varieties in (Bbb{P}^3) and the conjecture on exceptional APN functions: the Kasami-Welch degree case

    Moises Delgado, Heeralal Janwa
    Subjects: Algebraic Geometry (math.AG); Cryptography and Security (cs.CR); Information Theory (cs.IT); Combinatorics (math.CO); Number Theory (math.NT)

    Let \(f\) be a function on a finite field \(F\). The decomposition of the
    generalized Fermat variety \(X\) defined by the multivariate polynomial of degree
    \(n\), \(\phi(x,y,z)=f(x)+f(y)+f(z)\) in \(\mathbb{P}^3(\overline{\mathbb{F}}_2)\),
    plays a crucial role in the study of almost perfect non-linear (APN) functions
    and exceptional APN functions. Their structure depends fundamentally on the
    Fermat varieties corresponding to the monomial functions of exceptional degrees
    \(n=2^k+1\) and \(n=2^{2k}-2^k+1\) (Gold and Kasami-Welch numbers, respectively).
    Very important results for these have been obtained by Janwa, McGuire and
    Wilson in [12,13]. In this paper we study \(X\) related to the Kasami-Welch
    degree monomials and its decomposition into absolutely irreducible components.
    We show that, in this decomposition, the components intersect transversally at
    a singular point.

    This structural fact implies that the corresponding generalized Fermat
    hypersurfaces, related to Kasami-Welch degree polynomial families, are
    absolutely irreducible. In particular, we prove that if
    \(f(x)=x^{2^{2k}-2^k+1}+h(x)\), where \(\deg(h)\equiv 3 \pmod 4\), then the
    corresponding APN multivariate hypersurface is absolutely irreducible, and
    hence \(f(x)\) is not an exceptional APN function. We also prove a conditional
    result in the case when \(\deg(h)\equiv 5 \pmod 8\). Since for odd degree
    \(f(x)\) the conjecture needs to be resolved only for the Gold degree and the
    Kasami-Welch degree cases, our results contribute substantially to the proof of
    the conjecture on exceptional APN functions, in the hardest case: the
    Kasami-Welch degree.

    Secure Multiparty Interactive Communication with Unknown Noise Rate

    Abhinav Aggarwal, Varsha Dani, Nico Doettling, Thomas P. Hayes, Jared Saia
    Comments: 15 pages, 2 figures
    Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)

    A group of n users want to run an asynchronous protocol, whose length L is
    unknown a priori. An oblivious bit-flipping adversary, who does not have access
    to the private random bits of the users, enters the scene with unbounded budget
    T. This paper presents an algorithm that compiles the noise-free protocol \(\pi\)
    into a robust protocol \(\pi'\) that can deal with this adversary and achieve
    successful communication with probability 1 - O(1/n), sending O((L + T) log n(L
    + T)) bits in expectation. It does so without any a priori knowledge of L or T.
    Our result is, hence, within logarithmic factors of the optimal.

    Asymmetric Hat Game with three players and three colors

    Theo van Uem
    Comments: 11 pages. arXiv admin note: text overlap with arXiv:1612.00276
    Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM); Information Theory (cs.IT)

    Winning probabilities of The Hat Game (Ebert’s Hat Problem) with three
    players and three colors are only known in the symmetric case: all
    probabilities of the colors are equal. This paper solves the asymmetric case:
    probabilities may be different. We find winning probabilities and optimal
    strategies in all cases.

    Security of Rechargeable Energy-Harvesting Transmitters in Wireless Networks

    Ahmed El Shafie, Dusit Niyato, Naofal Al-Dhahir
    Comments: IEEE Wireless Communications Letters 2016, this http URL
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    In this letter, we investigate the security of a single-antenna rechargeable
    source node in the presence of a multi-antenna rechargeable cooperative jammer
    and a potential single-antenna eavesdropper. The batteries at the legitimate
    transmitting nodes (i.e. the source node and the jamming node) are assumed to
    be limited in capacity and are modeled as queueing systems. We investigate the
    impact of the energy arrival rates at the batteries on the achievable secrecy
    rates. In our energy-constrained network, we propose an efficient scheme to
    enhance the system’s security by optimizing the transmission times of the
    source node. The jammer uses a subset of its antennas (and transmit
    radio-frequency chains) to create a beamformer which maximizes the system’s
    secrecy rate while completely canceling the artificial noise at the legitimate
    destination. Our numerical results demonstrate the significant average secrecy
    rate gain of our proposed scheme.

    Physical-Layer Security of a Buffer-Aided Full-Duplex Relaying System

    Ahmed El Shafie, Ahmed Sultan, Naofal Al-Dhahir
    Comments: Published in IEEE Comm Letters this http URL
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    This letter proposes a novel hybrid half-/full-duplex relaying scheme to
    enhance the relay channel security. A source node (Alice) communicates with her
    destination node (Bob) in the presence of a buffer-aided full-duplex relay node
    (Rooney) and a potential eavesdropper (Eve). Rooney adopts two different
    relaying strategies, namely randomize-and-forward and decode-and-forward,
    to improve the security of the legitimate system. In the
    first relaying strategy, Rooney uses a codebook different from that used at
    Alice. In the second relaying strategy, Rooney and Alice use the same
    codebooks. In addition, Rooney switches between half-duplex and full-duplex
    modes to further enhance the security of the legitimate system. The numerical
    results demonstrate that our proposed scheme achieves a significant average
    secrecy end-to-end throughput improvement relative to the conventional
    bufferless full-duplex relaying scheme.



