
    arXiv Paper Daily: Mon, 22 May 2017

    Published by 我爱机器学习 (52ml.net) on 2017-05-22 00:00:00

    Neural and Evolutionary Computing

    Parameter Adaptation and Criticality in Particle Swarm Optimization

    Carlos Garcia Cordero
    Comments: 70 pages, MSc. Thesis in Artificial Intelligence, University of Edinburgh
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Generality is one of the main advantages of heuristic algorithms; as such,
    multiple parameters are exposed to the user with the objective of allowing them
    to shape the algorithms to their specific needs. Parameter selection,
    therefore, becomes an intrinsic problem of every heuristic algorithm. Selecting
    good parameter values relies not only on knowledge related to the problem at
    hand, but also on knowledge of the algorithms themselves. This research explores the use of
    self-organized criticality to reduce user interaction in the process of
    selecting suitable parameters for particle swarm optimization (PSO) heuristics.
    A particle swarm variant (named Adaptive PSO) with self-organized criticality
    is developed and benchmarked against the standard PSO. Criticality is observed
    in the dynamic behaviour of this swarm and excellent results are observed in
    the long run. In contrast with the standard PSO, the Adaptive PSO does not
    stagnate at any point in time, balancing the concepts of exploration and
    exploitation better. A software platform for experimenting with particle
    swarms, called PSO Laboratory, is also developed. This software is used to test
    the standard PSO as well as all other PSO variants developed in the process of
    creating the Adaptive PSO. As the software is intended to aid future and
    related research, special attention has been put into the development of a
    user-friendly graphical user interface. Particle swarms are executed in real time,
    allowing users to experiment by changing parameters on-the-fly.
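
    For reference, the canonical PSO update that such parameter-adaptation schemes act upon can be sketched as below; the inertia weight w and acceleration coefficients c1, c2 are the kind of parameters the Adaptive PSO adjusts, while the self-organized-criticality mechanism itself is not shown and the names here are illustrative.

    ```python
    import numpy as np

    def pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
        """One canonical PSO update: inertia + cognitive + social terms."""
        rng = rng or np.random.default_rng()
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        return pos + vel, vel
    ```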

    A Survey of Neuromorphic Computing and Neural Networks in Hardware

    Catherine D. Schuman, Thomas E. Potok, Robert M. Patton, J. Douglas Birdwell, Mark E. Dean, Garrett S. Rose, James S. Plank
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Neuromorphic computing has come to refer to a variety of brain-inspired
    computers, devices, and models that contrast with the pervasive von Neumann
    computer architecture. This biologically inspired approach has created highly connected
    synthetic neurons and synapses that can be used to model neuroscience theories
    as well as solve challenging machine learning problems. The promise of the
    technology is to create a brain-like ability to learn and adapt, but the
    technical challenges are significant, starting with an accurate neuroscience
    model of how the brain works, to finding materials and engineering
    breakthroughs to build devices to support these models, to creating a
    programming framework so the systems can learn, to creating applications with
    brain-like capabilities. In this work, we provide a comprehensive survey of the
    research and motivations for neuromorphic computing over its history. We begin
    with a 35-year review of the motivations and drivers of neuromorphic computing,
    then look at the major research areas of the field, which we define as
    neuro-inspired models, algorithms and learning approaches, hardware and
    devices, supporting systems, and finally applications. We conclude with a broad
    discussion on the major research topics that need to be addressed in the coming
    years to see the promise of neuromorphic computing fulfilled. The goals of this
    work are to provide an exhaustive review of the research conducted in
    neuromorphic computing since the inception of the term, and to motivate further
    work by illuminating gaps in the field where new research is needed.

    Proposal for a Leaky Integrate Fire Spiking Neuron Using Voltage Driven Domain Wall Motion

    Akhilesh Jaiswal, Amogh Agrawal, Kaushik Roy
    Subjects: Emerging Technologies (cs.ET); Neural and Evolutionary Computing (cs.NE)

    Conventional von Neumann computing models have achieved remarkable feats
    over the past few decades. However, they fail to deliver the required efficiency
    for certain basic tasks, such as image and speech recognition, when compared to
    biological systems. As such, taking cues from biological systems, novel
    computing paradigms are being explored for efficient hardware implementations
    of recognition/classification tasks. The basic building blocks of such
    neuromorphic systems are neurons and synapses. Towards that end, we propose a
    leaky-integrate-fire (LIF) neuron using domain wall motion induced by
    magnetoelectric effect. Due to a strong elastic pinning between the
    ferromagnetic domain wall (FM-DW) and the underlying ferro-electric domain wall
    (FE-DW), the FM-DW gets dragged by the FE-DW on application of a voltage pulse.
    The fact that FE materials are insulators allows for pure voltage-driven FM-DW
    motion, which in turn can be used to mimic both the leaky and the integrate
    behavior of a spiking neuron. Further, the voltage driven nature of the
    proposed neuron allows energy-efficient operation. A detailed device to circuit
    level simulation framework based on micromagnetic simulations has been
    developed to analyze the feasibility of the proposed device.
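
    As a behavioral reference only (the paper realizes this with voltage-driven domain-wall motion in hardware, not software), the leaky-integrate-fire dynamics being mimicked can be sketched in discrete time as follows; the leak factor and threshold values are illustrative.

    ```python
    def lif_neuron(input_current, leak=0.95, threshold=1.0):
        """Discrete-time leaky-integrate-fire: leak the membrane potential,
        integrate the input, and emit a spike (with reset) on crossing the threshold."""
        v, spikes = 0.0, []
        for i in input_current:
            v = leak * v + i      # leak and integrate
            if v >= threshold:    # fire and reset
                spikes.append(1)
                v = 0.0
            else:
                spikes.append(0)
        return spikes
    ```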

    Learning Convolutional Text Representations for Visual Question Answering

    Zhengyang Wang, Shuiwang Ji
    Subjects: Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Visual question answering is a recently proposed artificial intelligence task
    that requires a deep understanding of both images and texts. In deep learning,
    images are typically modeled through convolutional neural networks, and texts
    are typically modeled through recurrent neural networks. While the requirement
    for modeling images is similar to traditional computer vision tasks, such as
    object recognition and image classification, visual question answering raises a
    different need for textual representation as compared to other natural language
    processing tasks. In this work, we perform a detailed analysis on natural
    language questions in visual question answering. Based on the analysis, we
    propose to rely on convolutional neural networks for learning textual
    representations. By exploring the various properties of convolutional neural
    networks specialized for text data, such as width and depth, we present our
    “CNN Inception + Gate” model. We show that our model improves question
    representations and thus the overall accuracy of visual question answering
    models. We also show that the text representation requirement in visual
    question answering is more complicated and comprehensive than that in
    conventional natural language processing tasks, making it a better task to
    evaluate textual representation methods. Shallow models like fastText, which
    can obtain results comparable to deep learning models in tasks like text
    classification, are not suitable for visual question answering.

    Spatial Variational Auto-Encoding via Matrix-Variate Normal Distributions

    Zhengyang Wang, Hao Yuan, Shuiwang Ji
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    The key idea of variational auto-encoders (VAEs) resembles that of
    traditional auto-encoder models in which spatial information is supposed to be
    explicitly encoded in the latent space. However, the latent variables in VAEs
    are vectors, which are commonly interpreted as multiple feature maps of size
    1×1. Such representations can only convey spatial information implicitly when
    coupled with powerful decoders. In this work, we propose spatial VAEs that use
    latent variables as feature maps of larger size to explicitly capture spatial
    information. This is achieved by allowing the latent variables to be sampled
    from matrix-variate normal (MVN) distributions whose parameters are computed
    from the encoder network. To increase dependencies among locations on latent
    feature maps and reduce the number of parameters, we further propose spatial
    VAEs via low-rank MVN distributions. Experimental results show that the
    proposed spatial VAEs outperform original VAEs in capturing rich structural and
    spatial information.
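
    A minimal sketch of the sampling step this relies on: a matrix-variate normal draw can be produced from a standard normal matrix using the Cholesky factors of the row and column covariances. How the encoder parameterizes M, U, and V is not specified here; this only illustrates the MVN reparameterization.

    ```python
    import numpy as np

    def sample_mvn(M, U, V, rng=None):
        """Sample X ~ MN(M, U, V): X = M + A Z B^T with A A^T = U (row covariance),
        B B^T = V (column covariance), and Z a matrix of i.i.d. standard normals."""
        rng = rng or np.random.default_rng()
        A = np.linalg.cholesky(U)
        B = np.linalg.cholesky(V)
        Z = rng.standard_normal(M.shape)
        return M + A @ Z @ B.T
    ```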

    Pixel Deconvolutional Networks

    Hongyang Gao, Hao Yuan, Zhengyang Wang, Shuiwang Ji
    Comments: 10 pages
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Deconvolutional layers have been widely used in a variety of deep models for
    up-sampling, including encoder-decoder networks for semantic segmentation and
    deep generative models for unsupervised learning. One of the key limitations of
    deconvolutional operations is that they result in the so-called checkerboard
    problem. This is caused by the fact that no direct relationship exists among
    adjacent pixels on the output feature map. To address this problem, we propose
    the pixel deconvolutional layer (PixelDCL) to establish direct relationships
    among adjacent pixels on the up-sampled feature map. Our method is based on a
    fresh interpretation of the regular deconvolution operation. The resulting
    PixelDCL can be used to replace any deconvolutional layer in a plug-and-play
    manner without compromising the fully trainable capabilities of original
    models. The proposed PixelDCL may result in a slight decrease in efficiency,
    but this can be overcome by an implementation trick. Experimental results on
    semantic segmentation demonstrate that PixelDCL considers spatial features
    such as edges and shapes and yields more accurate segmentation outputs than
    deconvolutional layers. When used in image generation tasks, our PixelDCL can
    largely overcome the checkerboard problem suffered by regular deconvolution
    operations.

    Building effective deep neural network architectures one feature at a time

    Martin Mundt, Tobias Weis, Kishore Konda, Visvanathan Ramesh
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Successful training of convolutional neural networks is often associated with
    sufficiently deep architectures composed of a large number of features, while
    relying on a variety of regularization and pruning techniques to converge to
    less redundant states. We introduce an easy-to-compute metric, based on feature
    time evolution, to evaluate feature importance during training and demonstrate
    its potency in determining a network's effective capacity. Consequently, we
    propose a novel algorithm to evolve fixed-depth architectures, starting from
    just a single feature per layer, that attains the representational capacity
    needed for a specific task by greedily adding one feature at a time. We revisit
    popular CNN architectures and demonstrate how the evolved architectures not
    only converge to similar topologies that benefit from fewer parameters or
    improved accuracy, but furthermore exhibit systematic correspondence in
    representational complexity with the specified task. In contrast to
    conventional design patterns, which typically increase the number of features
    monotonically with depth, we observe that CNNs perform better when the number
    of learnable parameters peaks in intermediate layers and falls off towards
    earlier and later layers.


    Computer Vision and Pattern Recognition

    Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics

    Alex Kendall, Yarin Gal, Roberto Cipolla
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Numerous deep learning applications benefit from multi-task learning with
    multiple regression and classification objectives. In this paper we make the
    observation that the performance of such systems is strongly dependent on the
    relative weighting between each task’s loss. Tuning these weights by hand is a
    difficult and expensive process, making multi-task learning prohibitive in
    practice. We propose a principled approach to multi-task deep learning which
    weighs multiple loss functions by considering the homoscedastic uncertainty of
    each task. This allows us to simultaneously learn various quantities with
    different units or scales in both classification and regression settings. We
    demonstrate our model learning per-pixel depth regression, semantic and
    instance segmentation from a monocular input image. Perhaps surprisingly, we
    show our model can learn multi-task weightings and outperform separate models
    trained individually on each task.
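
    A hedged sketch of the kind of uncertainty-based weighting described: each task loss is scaled by a learned homoscedastic (task-level) uncertainty term and regularized so that the uncertainty cannot grow without bound. The exact constants differ between regression and classification likelihoods in the paper; here s_i stands for a learned log-variance.

    ```python
    import numpy as np

    def uncertainty_weighted_loss(task_losses, log_vars):
        """Combine per-task losses L_i with learned log-variances s_i:
        total = sum_i exp(-s_i) * L_i + s_i, a common simplification of the
        1/(2*sigma_i^2) * L_i + log(sigma_i) weighting used for regression tasks."""
        return sum(np.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

    # e.g. depth loss 0.8 and segmentation loss 1.6 with log-variances 0.0 and 0.5
    print(uncertainty_weighted_loss([0.8, 1.6], [0.0, 0.5]))
    ```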

    Snapshot Difference Imaging using Time-of-Flight Sensors

    Clara Callenberg, Felix Heide, Gordon Wetzstein, Matthias Hullin
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    Computational photography encompasses a diversity of imaging techniques, but
    one of the core operations performed by many of them is to compute image
    differences. An intuitive approach to computing such differences is to capture
    several images sequentially and then process them jointly. Usually, this
    approach leads to artifacts when recording dynamic scenes. In this paper, we
    introduce a snapshot difference imaging approach that is directly implemented
    in the sensor hardware of emerging time-of-flight cameras. With a variety of
    examples, we demonstrate that the proposed snapshot difference imaging
    technique is useful for direct-global illumination separation, for direct
    imaging of spatial and temporal image gradients, for direct depth edge imaging,
    and more.

    CNN-Based Joint Clustering and Representation Learning with Feature Drift Compensation for Large-Scale Image Data

    Chih-Chung Hsu, Chia-Wen Lin
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Given a large unlabeled set of images, how to efficiently and effectively
    group them into clusters based on the extracted visual representations remains
    a challenging problem. To address this problem, we propose a convolutional
    neural network (CNN) to jointly solve clustering and representation learning in
    an iterative manner. In the proposed method, given an input image set, we first
    randomly pick k samples and extract their features as initial cluster centroids
    using the proposed CNN with an initial model pre-trained from the ImageNet
    dataset. Mini-batch k-means is then performed to assign cluster labels to
    individual input samples for a mini-batch of images randomly sampled from the
    input image set, until all images are processed. Subsequently, the parameters
    of the proposed CNN and the centroids of the image clusters are updated jointly
    and iteratively based on stochastic gradient descent. We also propose a feature
    drift compensation scheme to mitigate the drift error caused by feature
    mismatch in representation learning. Experimental results demonstrate that the
    proposed method outperforms state-of-the-art clustering schemes in terms of
    accuracy and storage complexity on large-scale image sets
    containing millions of images.
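
    A minimal sketch of the mini-batch k-means centroid update that alternates with the CNN parameter updates in such a scheme; the CNN feature extraction and the drift compensation step are not shown, and the per-cluster 1/count step size is the usual mini-batch k-means schedule rather than anything specific to this paper.

    ```python
    import numpy as np

    def minibatch_kmeans_update(centroids, counts, batch_features):
        """Assign each feature vector to its nearest centroid, then move that
        centroid towards the sample with a per-cluster step size of 1/count."""
        for x in batch_features:
            k = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))
            counts[k] += 1
            eta = 1.0 / counts[k]
            centroids[k] = (1.0 - eta) * centroids[k] + eta * x
        return centroids, counts
    ```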

    Bitwise Operations of Cellular Automaton on Gray-scale Images

    Karttikeya Mangalam, K S Venkatesh
    Comments: 5 pages. The code is available at: this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Cellular Automata (CA) theory is a discrete model that represents the state
    of each of its cells from a finite set of possible values which evolve in time
    according to a pre-defined set of transition rules. CA have been applied to a
    number of image processing tasks such as Convex Hull Detection, Image Denoising
    etc. but mostly under the limitation of restricting the input to binary images.
    In general, a gray-scale image may be converted to a number of different binary
    images which are finally recombined after CA operations on each of them
    individually. We have developed a multinomial-regression-based weighted
    summation method to recombine binary images for better performance of CA-based
    image processing algorithms. The recombination algorithm is tested for the
    specific case of denoising Salt and Pepper Noise to test against standard
    benchmark algorithms such as the Median Filter for various images and noise
    levels. The results indicate several interesting invariances in the application
    of the CA, such as the particular noise realization and the choice of
    sub-sampling of pixels to determine recombination weights. Additionally, it
    appears that simpler algorithms for weight optimization which seek local minima
    work as effectively as those that seek global minima such as Simulated
    Annealing.
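
    To make the recombination idea concrete, a gray-scale image can be split into binary bit planes, processed independently (e.g., by a CA denoising rule), and recombined by a weighted summation; the sketch below uses given weights, whereas the paper fits them by multinomial regression.

    ```python
    import numpy as np

    def to_bit_planes(gray):
        """Split an 8-bit grayscale image (uint8 array) into 8 binary images."""
        return [((gray >> b) & 1).astype(np.uint8) for b in range(8)]

    def recombine(planes, weights):
        """Weighted summation of (possibly CA-processed) binary planes.
        weights = [1, 2, 4, ..., 128] gives the ordinary binary reconstruction;
        learned weights can compensate for errors introduced on each plane."""
        return sum(w * p.astype(np.float64) for w, p in zip(weights, planes))
    ```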

    What do We Learn by Semantic Scene Understanding for Remote Sensing imagery in CNN framework?

    Haifeng Li, Jian Peng, Chao Tao, Jie Chen, Min Deng
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Recently, deep convolutional neural network (DCNN) achieved increasingly
    remarkable success and rapidly developed in the field of natural image
    recognition. Compared with the natural image, the scale of remote sensing image
    is larger and the scene and the object it represents are more macroscopic. This
    study inquires whether remote sensing scene and natural scene recognitions
    differ and raises the following questions: What are the key factors in remote
    sensing scene recognition? Is the DCNN recognition mechanism centered on object
    recognition still applicable to the scenarios of remote sensing scene
    understanding? We performed several experiments to explore the influence of the
    DCNN structure and the scale of remote sensing scene understanding from the
    perspective of scene complexity. Our experiments show that understanding a
    complex scene depends on a deep network and multi-scale perception.
    Using a visualization method, we qualitatively and quantitatively analyze the
    recognition mechanism in a complex remote sensing scene and demonstrate the
    importance of multi-objective joint semantic support.

    MRI-PET Registration with Automated Algorithm in Pre-clinical Studies

    Nathanael L. Baisa, Stephanie Bricq, Alain Lalande
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET)
    automatic 3-D registration is implemented and validated for small animal image
    volumes so that the high-resolution anatomical MRI information can be fused
    with the low-spatial-resolution functional PET information for lesion
    localization, which is currently in high demand in the study of cancerous
    tumors (oncology) and the corresponding development of pharmaceutical drugs.
    Though many registration algorithms are developed and applied on human brain
    volumes, these methods may not be as efficient on small animal datasets due to
    lack of intensity information and often the high anisotropy in voxel
    dimensions. Therefore, a fully automatic registration algorithm which can
    register not only presumably rigid small-animal volumes such as the brain but
    also deformable regions such as the kidney, heart, and chest is developed using a
    combination of global affine and local B-spline transformation models in which
    mutual information is used as a similarity criterion. The global affine
    registration uses a multi-resolution pyramid on image volumes of 3 levels
    whereas in local B-spline registration, a multi-resolution scheme is applied on
    the B-spline grid of 2 levels on the finest resolution of the image volumes in
    which only the transform itself is affected rather than the image volumes.
    Since mutual information lacks sufficient spatial information, PCA is used to
    inject it by estimating initial translation and rotation parameters. The method
    is computationally efficient since it is implemented using C++ and the ITK
    library, and it is qualitatively and quantitatively shown that this
    PCA-initialized global registration followed by local registration is in close
    agreement with expert manual registration and outperforms the variant without
    PCA initialization when tested on small-animal brain and kidney volumes.

    What are the Receptive, Effective Receptive, and Projective Fields of Neurons in Convolutional Neural Networks?

    Hung Le, Ali Borji
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this work, we explain in detail how receptive fields, effective receptive
    fields, and projective fields of neurons in different layers, convolution or
    pooling, of a Convolutional Neural Network (CNN) are calculated. While our
    focus here is on CNNs, the same operations, but in the reverse order, can be
    used to calculate these quantities for deconvolutional neural networks. These
    are important concepts, not only for better understanding and analyzing
    convolutional and deconvolutional networks, but also for optimizing their
    performance in real-world applications.
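
    For the (theoretical) receptive field, the standard layer-by-layer recursion along one spatial dimension looks like the sketch below; effective receptive fields and projective fields, which the paper also covers, require additional bookkeeping and are not shown.

    ```python
    def receptive_field(layers):
        """Receptive field size of the final layer of a CNN.
        layers: list of (kernel_size, stride) tuples for each conv/pooling layer."""
        r, j = 1, 1                 # receptive field size and cumulative stride ("jump")
        for k, s in layers:
            r = r + (k - 1) * j     # each layer widens the field by (k - 1) * jump
            j = j * s               # strides compound multiplicatively
        return r

    # e.g. three 3x3 convolutions (stride 1) followed by 2x2 pooling (stride 2) -> 8
    print(receptive_field([(3, 1), (3, 1), (3, 1), (2, 2)]))
    ```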

    Segmentation of 3D High-frequency Ultrasound Images of Human Lymph Nodes Using Graph Cut with Energy Functional Adapted to Local Intensity Distribution

    Jen-wei Kuo, Jonathan Mamou, Yao Wang, Emi Saegusa-Beecroft, Junji Machi, Ernest J. Feleppa
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Previous studies by our group have shown that three-dimensional
    high-frequency quantitative ultrasound methods have the potential to
    differentiate metastatic lymph nodes from cancer-free lymph nodes dissected
    from human cancer patients. To successfully perform these methods inside the
    lymph node parenchyma, an automatic segmentation method is highly desired to
    exclude the surrounding thin layer of fat from quantitative ultrasound
    processing and accurately correct for ultrasound attenuation. In high-frequency
    ultrasound images of lymph nodes, the intensity distribution of lymph node
    parenchyma and fat varies spatially because of acoustic attenuation and
    focusing effects. Thus, the intensity contrast between two object regions
    (e.g., lymph node parenchyma and fat) is also spatially varying. In our
    previous work, nested graph cut demonstrated its ability to simultaneously
    segment lymph node parenchyma, fat, and the outer phosphate-buffered saline
    bath even when some boundaries are lost because of acoustic attenuation and
    focusing effects. This paper describes a novel approach called graph cut with
    locally adaptive energy to further deal with spatially varying distributions of
    lymph node parenchyma and fat caused by inhomogeneous acoustic attenuation. The
    proposed method achieved Dice similarity coefficients of 0.937 ± 0.035 when
    compared to expert manual segmentation on a representative dataset consisting
    of 115 three-dimensional lymph node images obtained from colorectal cancer
    patients.

    The Kinetics Human Action Video Dataset

    Will Kay, Joao Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, Andrew Zisserman
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We describe the DeepMind Kinetics human action video dataset. The dataset
    contains 400 human action classes, with at least 400 video clips for each
    action. Each clip lasts around 10s and is taken from a different YouTube video.
    The actions are human focussed and cover a broad range of classes including
    human-object interactions such as playing instruments, as well as human-human
    interactions such as shaking hands. We describe the statistics of the dataset,
    how it was collected, and give some baseline performance figures for neural
    network architectures trained and tested for human action classification on
    this dataset. We also carry out a preliminary analysis of whether imbalance in
    the dataset leads to bias in the classifiers.

    Hyperspectral Band Selection Using Unsupervised Non-Linear Deep Auto Encoder to Train External Classifiers

    Muhammad Ahmad, Stanislav Protasov, Adil Mehmood Khan
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In order to make hyperspectral image classification computationally
    tractable, it is often necessary to select the most informative bands instead
    of processing the whole data, without losing the geometrical representation of
    the original data. To cope with this issue, an improved unsupervised non-linear
    deep auto-encoder (UDAE) based band selection method is proposed. The proposed
    UDAE is able to select the most informative bands in such a way that it
    preserves the key information in a lower-dimensional space, where the hidden
    representation is a non-linear transformation that maps the original space to
    a space of lower dimensions. This work emphasizes analyzing what type of
    information is needed to preserve the hierarchical UDAE representation while
    selecting a subset of the original space. Our experiments on a publicly
    available hyperspectral dataset demonstrate the effectiveness of the UDAE
    method, which compares favorably with other state-of-the-art methods.

    Local Shape Spectrum Analysis for 3D Facial Expression Recognition

    Dmytro Derkach, Federico M. Sukno
    Comments: 12th IEEE International Conference on Face and Gesture Recognition, Washington, DC, USA, 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We investigate the problem of facial expression recognition using 3D data.
    Building from one of the most successful frameworks for facial analysis using
    exclusively 3D geometry, we extend the analysis from a curve-based
    representation into a spectral representation, which allows a complete
    description of the underlying surface that can be further tuned to the desired
    level of detail. Spectral representations are based on the decomposition of the
    geometry in its spatial frequency components, much like a Fourier transform,
    which are related to intrinsic characteristics of the surface. In this work, we
    propose the use of Graph Laplacian Features (GLF), which results from the
    projection of local surface patches into a common basis obtained from the Graph
    Laplacian eigenspace. We test the proposed approach in the BU-3DFE database in
    terms of expressions and Action Units recognition. Our results confirm that the
    proposed GLF produces consistently higher recognition rates than the
    curves-based approach, thanks to a more complete description of the surface,
    while requiring a lower computational complexity. We also show that the GLF
    outperforms the most popular alternative approach for spectral representation,
    Shape-DNA, which is based on the Laplace-Beltrami operator and cannot provide
    a stable basis guaranteeing that the extracted signatures for the different
    patches are directly comparable.
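
    The general mechanism the GLF descriptor builds on can be sketched as projecting a per-vertex signal from a local patch onto the eigenbasis of a graph Laplacian; mesh construction, patch extraction, and normalization details are assumptions here and will differ from the paper's implementation.

    ```python
    import numpy as np

    def laplacian_basis(W, k):
        """First k eigenvectors of the unnormalized graph Laplacian L = D - W,
        where W is a symmetric adjacency/weight matrix of the patch graph."""
        L = np.diag(W.sum(axis=1)) - W
        _, eigvecs = np.linalg.eigh(L)      # eigenvalues in ascending order
        return eigvecs[:, :k]

    def graph_laplacian_features(signal, basis):
        """Spectral coefficients of a per-vertex signal in that common basis."""
        return basis.T @ signal
    ```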

    Affine-Gradient Based Local Binary Pattern Descriptor for Texture Classification

    You Hao, Shirui Li, Hanlin Mo, Hua Li
    Comments: 11 pages, 4 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a novel Affine-Gradient based Local Binary Pattern (AGLBP)
    descriptor for texture classification. It is very hard to describe complicated
    texture using single type information, such as Local Binary Pattern (LBP),
    which just utilizes the sign information of the difference between the pixel
    and its local neighbors. Our descriptor has three characteristics: 1) In order
    to make full use of the information contained in the texture, the
    Affine-Gradient, which differs from the Euclidean gradient and is invariant to
    affine transformations, is incorporated into AGLBP. 2) An improved method is
    proposed for rotation invariance, which depends on a reference direction
    calculated with respect to the local neighbors. 3) A feature selection method,
    considering both the statistical frequency and the intra-class variance of the
    training dataset, is also applied to reduce the dimensionality of the
    descriptors. Experiments on three standard texture datasets, Outex12, Outex10
    and KTH-TIPS2, are conducted to evaluate the performance of AGLBP. The results
    show that our proposed descriptor achieves better performance than some
    state-of-the-art rotation-invariant texture descriptors in texture
    classification.

    Fiber Orientation Estimation Guided by a Deep Network

    Chuyang Ye, Jerry L. Prince
    Comments: A shorter version is accepted by MICCAI 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Diffusion magnetic resonance imaging (dMRI) is currently the only tool for
    noninvasively imaging the brain’s white matter tracts. The fiber orientation
    (FO) is a key feature computed from dMRI for fiber tract reconstruction.
    Because the number of FOs in a voxel is usually small, dictionary-based sparse
    reconstruction has been used to estimate FOs with a relatively small number of
    diffusion gradients. However, accurate FO estimation in regions with complex FO
    configurations in the presence of noise can still be challenging. In this work
    we explore the use of a deep network for FO estimation in a dictionary-based
    framework and propose an algorithm named Fiber Orientation Reconstruction
    guided by a Deep Network (FORDN). FORDN consists of two steps. First, we use a
    smaller dictionary encoding coarse basis FOs to represent the diffusion
    signals. To estimate the mixture fractions of the dictionary atoms (and thus
    coarse FOs), a deep network is designed specifically for solving the sparse
    reconstruction problem. Here, the smaller dictionary is used to reduce the
    computational cost of training. Second, the coarse FOs inform the final FO
    estimation, where a larger dictionary encoding dense basis FOs is used and a
    weighted l1-norm regularized least squares problem is solved to encourage FOs
    that are consistent with the network output. FORDN was evaluated and compared
    with state-of-the-art algorithms that estimate FOs using sparse reconstruction
    on simulated and real dMRI data, and the results demonstrate the benefit of
    using a deep network for FO estimation.

    ADMM-Net: A Deep Learning Approach for Compressive Sensing MRI

    Yan Yang, Jian Sun, Huibin Li, Zongben Xu
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Compressive sensing (CS) is an effective approach for fast Magnetic Resonance
    Imaging (MRI). It aims at reconstructing MR images from a small number of
    under-sampled data in k-space, and accelerating the data acquisition in MRI. To
    improve the current MRI system in reconstruction accuracy and speed, in this
    paper, we propose two novel deep architectures, dubbed ADMM-Nets in basic and
    generalized versions. ADMM-Nets are defined over data flow graphs, which are
    derived from the iterative procedures in Alternating Direction Method of
    Multipliers (ADMM) algorithm for optimizing a general CS-based MRI model. They
    take the sampled k-space data as inputs and output reconstructed MR images.
    Moreover, we extend our network to cope with complex-valued MR images. In the
    training phase, all parameters of the nets, e.g., transforms, shrinkage
    functions, etc., are discriminatively trained end-to-end. In the testing phase,
    they have computational overhead similar to the ADMM algorithm but use optimized
    parameters learned from the data for the CS-based reconstruction task. We
    investigate different configurations in network structures and conduct
    extensive experiments on MR image reconstruction under different sampling
    rates. Due to the combination of the advantages in model-based approach and
    deep learning approach, the ADMM-Nets achieve state-of-the-art reconstruction
    accuracies with fast computational speed.

    Prediction of Sea Surface Temperature using Long Short-Term Memory

    Qin Zhang, Hui Wang, Junyu Dong, Guoqiang Zhong, Xin Sun
    Comments: 5 pages, 5 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This letter adopts long short-term memory (LSTM) to predict sea surface
    temperature (SST). To our knowledge, this is the first attempt to use a
    recurrent neural network to solve the SST prediction problem and to make
    one-week and one-month daily predictions. We formulate SST prediction as a
    time series regression problem. LSTM is a special kind of recurrent neural
    network that introduces a gate mechanism into the vanilla RNN to prevent the
    vanishing or exploding gradient problem. It has a strong ability to model the
    temporal relationships in time series data and can handle the long-term
    dependency problem well. The proposed network architecture is composed of two
    kinds of layers: an LSTM layer and a fully-connected dense layer. The LSTM
    layer is utilized to model the time series relationship, and the
    fully-connected layer is utilized to map the output of the LSTM layer to a
    final prediction. We explore the optimal setting of this architecture through
    experiments and report the prediction accuracy for the coastal seas of China
    to confirm the effectiveness of the proposed method. In addition, we also show
    its online updating characteristics.
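
    A minimal sketch of the described two-layer architecture (an LSTM layer followed by a fully-connected layer) on a synthetic series; the window length, layer sizes, synthetic data, and training settings are placeholders, not the paper's actual configuration.

    ```python
    import numpy as np
    import tensorflow as tf

    # toy daily "SST" series and sliding windows of length 30 predicting the next day
    window = 30
    series = np.sin(np.linspace(0.0, 20.0, 1000)).astype("float32")
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])[..., None]
    y = series[window:]

    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(window, 1)),  # models the temporal relationship
        tf.keras.layers.Dense(1),                           # maps LSTM output to the prediction
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=5, verbose=0)
    ```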

    Online Signature Verification using Recurrent Neural Network and Length-normalized Path Signature

    Songxuan Lai, Lianwen Jin, Weixin Yang
    Comments: 6 pages, 5 figures, 5 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Inspired by the great success of recurrent neural networks (RNNs) in
    sequential modeling, we introduce a novel RNN system to improve the performance
    of online signature verification. The training objective is to directly
    minimize intra-class variations and to push the distances between skilled
    forgeries and genuine samples above a given threshold. By back-propagating the
    training signals, our RNN produces discriminative features with the desired
    metrics. Additionally, we propose a novel descriptor, called the
    length-normalized path signature (LNPS), and apply it to online signature
    verification. LNPS has interesting properties, such as scale invariance and
    rotation invariance after linear combination, and shows promising results in
    online signature verification. Experiments on the publicly available SVC-2004
    dataset yielded state-of-the-art performance of 2.37% equal error rate (EER).

    Modeling Geometrical Mysteries of Cafe Wall illusions

    Nasim Nematzadeh, David M.W. Powers
    Comments: Submitted to PLOS ONE
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper explores the tilt illusion effect in the Cafe Wall pattern using a
    classical Gaussian Receptive Field model. In this illusion, the mortar lines
    are misperceived as diverging or converging rather than horizontal. We examine
    the capability of a simple bioplausible filtering model to recognize different
    degrees of tilt effect in the Cafe Wall illusion based on different
    characteristics of the pattern. Our study employed a Difference of Gaussians
    model of retinal-to-cortical ON-center and/or OFF-center receptive fields. A
    wide range of stimulus parameters, for example mortar thickness, brightness,
    tile contrast, and phase of the tile displacement, has been studied.
    Our model constructs a multiple scale edge map representation that reveals tilt
    cues and clues involved in the illusory perception of the Cafe Wall pattern. We
    will show here that our model not only can detect the tilt in this pattern, but
    also can predict the strength of the illusion and quantify the degree of tilt.
    The results of our simulations are consistent with previous psychophysical
    findings across the full range of Cafe Wall variations tested. Our results also
    suggest that the Difference of Gaussians mechanism is at the heart of the
    effects explained by, and the mechanisms proposed by, the Irradiation,
    Brightness Induction, and Band-pass filtering models.
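
    The core filtering step of such a model, a Difference of Gaussians approximating an ON-center receptive field, can be sketched as below; the surround ratio, scales, and thresholding rule are illustrative assumptions rather than the paper's tuned values.

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def dog_response(image, sigma_center, surround_ratio=2.0):
        """Difference-of-Gaussians response: a narrow centre Gaussian minus a wider
        surround Gaussian, approximating a retinal ON-center receptive field."""
        center = gaussian_filter(image.astype(float), sigma_center)
        surround = gaussian_filter(image.astype(float), sigma_center * surround_ratio)
        return center - surround

    def multiscale_edge_maps(image, sigmas):
        """Binary edge maps across several scales, forming a simple multi-scale edge map."""
        return [dog_response(image, s) > 0 for s in sigmas]
    ```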

    Deep-LK for Efficient Adaptive Object Tracking

    Chaoyang Wang, Hamed Kiani Galoogahi, Chen-Hsuan Lin, Chen Huang, Simon Lucey
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we present a new approach for efficient regression based object
    tracking which we refer to as Deep- LK. Our approach is closely related to the
    Generic Object Tracking Using Regression Networks (GOTURN) framework of Held et
    al. We make the following contributions. First, we demonstrate that there is a
    theoretical relationship between siamese regression networks like GOTURN and
    the classical Inverse-Compositional Lucas & Kanade (IC-LK) algorithm. Further,
    we demonstrate that, unlike GOTURN, IC-LK adapts its regressor to the appearance
    of the currently tracked frame. We argue that GOTURN's poor performance on
    unseen objects and/or viewpoints can be attributed to this missing property.
    Second, we propose a novel framework for object tracking – which we refer to as
    Deep-LK – that is inspired by the IC-LK framework. Finally, we show impressive
    results demonstrating that Deep-LK substantially outperforms GOTURN.
    Additionally, we demonstrate tracking performance comparable to current
    state-of-the-art deep trackers whilst being an order of magnitude more
    computationally efficient (i.e., around 100 FPS).

    Exploring the structure of a real-time, arbitrary neural artistic stylization network

    Golnaz Ghiasi, Honglak Lee, Manjunath Kudlur, Vincent Dumoulin
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we present a method which combines the flexibility of the
    neural algorithm of artistic style with the speed of fast style transfer
    networks to allow real-time stylization using any content/style image pair. We
    build upon recent work leveraging conditional instance normalization for
    multi-style transfer networks by learning to predict the conditional instance
    normalization parameters directly from a style image. The model is successfully
    trained on a corpus of roughly 80,000 paintings and is able to generalize to
    paintings previously unobserved. We demonstrate that the learned embedding
    space is smooth, contains a rich structure, and organizes semantic
    information associated with paintings in an entirely unsupervised manner.
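
    The conditional instance normalization operation referred to above normalizes each channel of a feature map and then applies a per-channel scale and shift; in this work those scale/shift vectors are predicted from the style image by a separate network, whereas the sketch below simply takes them as inputs.

    ```python
    import numpy as np

    def conditional_instance_norm(x, gamma, beta, eps=1e-5):
        """x: feature map of shape (C, H, W); gamma, beta: per-channel vectors of
        length C (here assumed to come from a style-prediction network)."""
        mu = x.mean(axis=(1, 2), keepdims=True)
        var = x.var(axis=(1, 2), keepdims=True)
        x_hat = (x - mu) / np.sqrt(var + eps)
        return gamma[:, None, None] * x_hat + beta[:, None, None]
    ```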

    Building effective deep neural network architectures one feature at a time

    Martin Mundt, Tobias Weis, Kishore Konda, Visvanathan Ramesh
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Successful training of convolutional neural networks is often associated with
    sufficiently deep architectures composed of a large number of features, while
    relying on a variety of regularization and pruning techniques to converge to
    less redundant states. We introduce an easy-to-compute metric, based on feature
    time evolution, to evaluate feature importance during training and demonstrate
    its potency in determining a network's effective capacity. Consequently, we
    propose a novel algorithm to evolve fixed-depth architectures, starting from
    just a single feature per layer, that attains the representational capacity
    needed for a specific task by greedily adding one feature at a time. We revisit
    popular CNN architectures and demonstrate how the evolved architectures not
    only converge to similar topologies that benefit from fewer parameters or
    improved accuracy, but furthermore exhibit systematic correspondence in
    representational complexity with the specified task. In contrast to
    conventional design patterns, which typically increase the number of features
    monotonically with depth, we observe that CNNs perform better when the number
    of learnable parameters peaks in intermediate layers and falls off towards
    earlier and later layers.

    A General Model for Robust Tensor Factorization with Unknown Noise

    Xi'ai Chen, Zhi Han, Yao Wang, Qian Zhao, Deyu Meng, Lin Lin, Yandong Tang
    Comments: 13 pages, 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Because of the limitations of matrix factorization, such as losing spatial
    structure information, the concept of low-rank tensor factorization (LRTF) has
    been applied for the recovery of a low dimensional subspace from high
    dimensional visual data. The low-rank tensor recovery is generally achieved by
    minimizing the loss function between the observed data and the factorization
    representation. The loss function is designed in various forms under different
    noise distribution assumptions, like $L_1$ norm for Laplacian distribution and
    $L_2$ norm for Gaussian distribution. However, they often fail to tackle the
    real data which are corrupted by the noise with unknown distribution. In this
    paper, we propose a generalized weighted low-rank tensor factorization method
    (GWLRTF) integrated with the idea of noise modelling. This procedure treats the
    target data as high-order tensor directly and models the noise by a Mixture of
    Gaussians, which is called MoG GWLRTF. The parameters in the model are
    estimated under the EM framework and through a new developed algorithm of
    weighted low-rank tensor factorization. We provide two versions of the
    algorithm with different tensor factorization operations, i.e., CP
    factorization and Tucker factorization. Extensive experiments indicate the
    respective advantages of these two versions in different applications and also
    demonstrate the effectiveness of MoG GWLRTF compared with other competing
    methods.

    Spatial Variational Auto-Encoding via Matrix-Variate Normal Distributions

    Zhengyang Wang, Hao Yuan, Shuiwang Ji
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    The key idea of variational auto-encoders (VAEs) resembles that of
    traditional auto-encoder models in which spatial information is supposed to be
    explicitly encoded in the latent space. However, the latent variables in VAEs
    are vectors, which are commonly interpreted as multiple feature maps of size
    1×1. Such representations can only convey spatial information implicitly when
    coupled with powerful decoders. In this work, we propose spatial VAEs that use
    latent variables as feature maps of larger size to explicitly capture spatial
    information. This is achieved by allowing the latent variables to be sampled
    from matrix-variate normal (MVN) distributions whose parameters are computed
    from the encoder network. To increase dependencies among locations on latent
    feature maps and reduce the number of parameters, we further propose spatial
    VAEs via low-rank MVN distributions. Experimental results show that the
    proposed spatial VAEs outperform original VAEs in capturing rich structural and
    spatial information.

    Pixel Deconvolutional Networks

    Hongyang Gao, Hao Yuan, Zhengyang Wang, Shuiwang Ji
    Comments: 10 pages
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Deconvolutional layers have been widely used in a variety of deep models for
    up-sampling, including encoder-decoder networks for semantic segmentation and
    deep generative models for unsupervised learning. One of the key limitations of
    deconvolutional operations is that they result in the so-called checkerboard
    problem. This is caused by the fact that no direct relationship exists among
    adjacent pixels on the output feature map. To address this problem, we propose
    the pixel deconvolutional layer (PixelDCL) to establish direct relationships
    among adjacent pixels on the up-sampled feature map. Our method is based on a
    fresh interpretation of the regular deconvolution operation. The resulting
    PixelDCL can be used to replace any deconvolutional layer in a plug-and-play
    manner without compromising the fully trainable capabilities of original
    models. The proposed PixelDCL may result in a slight decrease in efficiency,
    but this can be overcome by an implementation trick. Experimental results on
    semantic segmentation demonstrate that PixelDCL considers spatial features
    such as edges and shapes and yields more accurate segmentation outputs than
    deconvolutional layers. When used in image generation tasks, our PixelDCL can
    largely overcome the checkerboard problem suffered by regular deconvolution
    operations.


    Artificial Intelligence

    The Bag Semantics of Ontology-Based Data Access

    Charalampos Nikolaou, Egor V. Kostylev, George Konstantinidis, Mark Kaminski, Bernardo Cuenca Grau, Ian Horrocks
    Subjects: Artificial Intelligence (cs.AI)

    Ontology-based data access (OBDA) is a popular approach for integrating and
    querying multiple data sources by means of a shared ontology. The ontology is
    linked to the sources using mappings, which assign views over the data to
    ontology predicates. Motivated by the need for OBDA systems supporting
    database-style aggregate queries, we propose a bag semantics for OBDA, where
    duplicate tuples in the views defined by the mappings are retained, as is the
    case in standard databases. We show that bag semantics makes conjunctive query
    answering in OBDA coNP-hard in data complexity. To regain tractability, we
    consider a rather general class of queries and show its rewritability to a
    generalisation of the relational calculus to bags.

    Induction of Interpretable Possibilistic Logic Theories from Relational Data

    Ondrej Kuzelka, Jesse Davis, Steven Schockaert
    Comments: Longer version of a paper appearing in IJCAI 2017
    Subjects: Artificial Intelligence (cs.AI)

    The field of Statistical Relational Learning (SRL) is concerned with learning
    probabilistic models from relational data. Learned SRL models are typically
    represented using some kind of weighted logical formulas, which make them
    considerably more interpretable than those obtained by e.g. neural networks. In
    practice, however, these models are often still difficult to interpret
    correctly, as they can contain many formulas that interact in non-trivial ways
    and weights do not always have an intuitive meaning. To address this, we
    propose a new SRL method which uses possibilistic logic to encode relational
    models. Learned models are then essentially stratified classical theories,
    which explicitly encode what can be derived with a given level of certainty.
    Compared to Markov Logic Networks (MLNs), our method is faster and produces
    considerably more interpretable models.

    Foundations of Declarative Data Analysis Using Limit Datalog Programs

    Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev, Boris Motik, Ian Horrocks
    Comments: 23 pages. To appear in IJCAI-17
    Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)

    Motivated by applications in declarative data analysis, we study
    $\mathit{Datalog}_{\mathbb{Z}}$—an extension of positive Datalog with
    arithmetic functions over integers. This language is known to be undecidable,
    so we propose two fragments. In $\mathit{limit}~\mathit{Datalog}_{\mathbb{Z}}$
    predicates are axiomatised to keep minimal/maximal numeric values, allowing us
    to show that fact entailment is coNExpTime-complete in combined, and
    coNP-complete in data complexity. Moreover, an additional $\mathit{stability}$
    requirement causes the complexity to drop to ExpTime and PTime, respectively.
    Finally, we show that stable $\mathit{Datalog}_{\mathbb{Z}}$ can express many
    useful data analysis tasks, and so our results provide a sound foundation for
    the development of advanced information systems.

    The Conference Paper Assignment Problem: Using Order Weighted Averages to Assign Indivisible Goods

    Jing Wu Lian, Nicholas Mattei, Renee Noble, Toby Walsh
    Comments: 3 figures
    Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)

    Motivated by the common academic problem of allocating papers to referees for
    conference reviewing, we propose a novel mechanism for solving the assignment
    problem when we have a two-sided matching problem with preferences from one
    side (the agents/reviewers) over the other side (the objects/papers) and both
    sides have capacity constraints. The assignment problem is a fundamental
    problem in both computer science and economics with application in many areas
    including task and resource allocation. We draw inspiration from multi-criteria
    decision making and voting and use order weighted averages (OWAs) to propose a
    novel and flexible class of algorithms for the assignment problem. We show an
    algorithm for finding a $\Sigma$-OWA assignment in polynomial time, in contrast
    to the NP-hardness of finding an egalitarian assignment. Inspired by this
    setting we observe an interesting connection between our model and the classic
    proportional multi-winner election problem in social choice.
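
    For readers unfamiliar with order weighted averages, the operator itself is simple: the weights are applied to the values after sorting, so the weight vector encodes the fairness criterion (all weight on the smallest value recovers the egalitarian objective, uniform weights recover the utilitarian one). A minimal sketch:

    ```python
    import numpy as np

    def owa(values, weights):
        """Ordered weighted average: apply the weights to the values sorted in
        decreasing order. The weight vector, not the values, carries the semantics."""
        v = np.sort(np.asarray(values, dtype=float))[::-1]
        return float(np.dot(np.asarray(weights, dtype=float), v))

    print(owa([3, 1, 2], [0.5, 0.3, 0.2]))  # 0.5*3 + 0.3*2 + 0.2*1 = 2.3
    ```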

    A Comparison of Reinforcement Learning Techniques for Fuzzy Cloud Auto-Scaling

    Hamid Arabnejad, Claus Pahl, Pooyan Jamshidi, Giovani Estrada
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)

    A goal of cloud service management is to design a self-adaptable auto-scaler
    that reacts to workload fluctuations and changes the resources assigned. The key
    problem is how and when to add/remove resources in order to meet agreed
    service-level agreements. Reducing application cost and guaranteeing
    service-level agreements (SLAs) are two critical factors of dynamic controller
    design. In this paper, we compare two dynamic learning strategies based on a
    fuzzy logic system, which learns and modifies fuzzy scaling rules at runtime. A
    self-adaptive fuzzy logic controller is combined with two reinforcement
    learning (RL) approaches: (i) Fuzzy SARSA learning (FSL) and (ii) Fuzzy
    Q-learning (FQL). As an off-policy approach, Q-learning learns independently of
    the policy currently followed, whereas SARSA, as an on-policy approach, always
    incorporates the agent’s actual behavior and leads to faster learning. Both
    approaches are implemented and compared in terms of their advantages and
    disadvantages, here on the OpenStack cloud platform. We demonstrate that both
    auto-scaling approaches can handle various load traffic situations, sudden and
    periodic, and deliver resources on demand while reducing operating costs and
    preventing SLA violations. The experimental results demonstrate that FSL and
    FQL have acceptable performance in terms of the adjusted number of virtual
    machines targeted to optimize SLA compliance and response time.
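
    The on-policy/off-policy distinction mentioned above comes down to the bootstrap term in the tabular update rules, sketched below; the fuzzy state representation and runtime rule modification of FSL/FQL are not shown.

    ```python
    def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
        """Off-policy: bootstrap from the greedy (max-value) action in the next state."""
        Q[s][a] += alpha * (r + gamma * max(Q[s_next].values()) - Q[s][a])

    def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
        """On-policy: bootstrap from the action the agent actually takes next."""
        Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])
    ```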

    Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach

    Emmanouil A. Platanios, Hoifung Poon, Tom M. Mitchell, Eric Horvitz
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We propose an efficient method to estimate the accuracy of classifiers using
    only unlabeled data. We consider a setting with multiple classification
    problems where the target classes may be tied together through logical
    constraints. For example, a set of classes may be mutually exclusive, meaning
    that a data instance can belong to at most one of them. The proposed method is
    based on the intuition that: (i) when classifiers agree, they are more likely
    to be correct, and (ii) when the classifiers make a prediction that violates
    the constraints, at least one classifier must be making an error. Experiments
    on four real-world data sets produce accuracy estimates within a few percent of
    the true accuracy, using solely unlabeled data. Our models also outperform
    existing state-of-the-art solutions in both estimating accuracies, and
    combining multiple classifier outputs. The results emphasize the utility of
    logical constraints in estimating accuracy, thus validating our intuition.

    Atari games and Intel processors

    Robert Adamski, Tomasz Grel, Maciej Klimek, Henryk Michalewski
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The asynchronous nature of state-of-the-art reinforcement learning
    algorithms, such as the Asynchronous Advantage Actor-Critic algorithm, makes
    them exceptionally suitable for CPU computations. However, given the fact that
    deep reinforcement learning often deals with interpreting visual information, a
    large part of the training and inference time is spent performing convolutions. In
    this work we present our results on learning strategies in Atari games using a
    Convolutional Neural Network, the Math Kernel Library and TensorFlow 0.11rc0
    machine learning framework. We also analyze effects of asynchronous
    computations on the convergence of reinforcement learning algorithms.

    Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning

    Nat Dilokthanakul, Christos Kaplanis, Nick Pawlowski, Murray Shanahan
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    The problem of sparse rewards is one of the hardest challenges in
    contemporary reinforcement learning. Hierarchical reinforcement learning (HRL)
    tackles this problem by using a set of temporally-extended actions, or options,
    each of which has its own subgoal. These subgoals are normally handcrafted for
    specific tasks. Here, though, we introduce a generic class of subgoals with
    broad applicability in the visual domain. Underlying our approach (in common
    with work using “auxiliary tasks”) is the hypothesis that the ability to
    control aspects of the environment is an inherently useful skill to have. We
    incorporate such subgoals in an end-to-end hierarchical reinforcement learning
    system and test two variants of our algorithm on a number of games from the
    Atari suite. We highlight the advantage of our approach in one of the hardest
    games — Montezuma’s revenge — for which the ability to handle sparse rewards
    is key. Our agent learns several times faster than the current state-of-the-art
    HRL agent in this game, reaching a similar level of performance.

    Monaural Audio Speaker Separation with Source Contrastive Estimation

    Cory Stephenson, Patrick Callier, Abhinav Ganesh, Karl Ni
    Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We propose an algorithm to separate simultaneously speaking persons from each
    other, the “cocktail party problem”, using a single microphone. Our approach
    involves a deep recurrent neural network regression to a vector space that is
    descriptive of independent speakers. Such a vector space can embed empirically
    determined speaker characteristics and is optimized by distinguishing between
    speaker masks. We call this technique source-contrastive estimation. The
    methodology is inspired by negative sampling, which has seen success in natural
    language processing, where an embedding is learned by correlating and
    de-correlating a given input vector with output weights. Although the matrix
    determined by the output weights is dependent on a set of known speakers, we
    only use the input vectors during inference. Doing so will ensure that source
    separation is explicitly speaker-independent. Our approach is similar to recent
    deep neural network clustering and permutation-invariant training research; we
    use weighted spectral features and masks to augment individual speaker
    frequencies while filtering out other speakers. We avoid, however, the severe
    computational burden of other approaches with our technique. Furthermore, by
    training a vector space rather than combinations of different speakers or
    differences thereof, we avoid the so-called permutation problem during
    training. Our algorithm offers an intuitive, computationally efficient response
    to the cocktail party problem, and most importantly boasts better empirical
    performance than other current techniques.
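
    A hedged sketch of the negative-sampling-style objective suggested by the
    description: a frame embedding is correlated with its own speaker's output
    vector and de-correlated with the others. The dot-product form, shapes and
    toy data are assumptions for illustration only, not the paper's exact loss.

        # Illustrative source-contrastive style loss (not the authors' objective).
        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def source_contrastive_loss(frame_emb, speaker_vecs, true_speaker):
            # frame_emb: (d,) embedding of one spectral frame
            # speaker_vecs: (num_speakers, d) output weight vectors, one per speaker
            pos = np.log(sigmoid(frame_emb @ speaker_vecs[true_speaker]))
            negs = [s for s in range(len(speaker_vecs)) if s != true_speaker]
            neg = sum(np.log(sigmoid(-frame_emb @ speaker_vecs[s])) for s in negs)
            return -(pos + neg)

        rng = np.random.default_rng(0)
        emb = rng.normal(size=16)
        spk = rng.normal(size=(5, 16))
        print(source_contrastive_loss(emb, spk, true_speaker=2))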


    Information Retrieval

    Classification revisited: a web of knowledge

    Aida Slavic
    Comments: Aida Slavic (2011) Classification revisited: a web of knowledge. In: Innovations in information retrieval: perspectives for theory and practice. Eds. Allen Foster and Pauline Rafferty. London: Facet, pp. 23-48
    Subjects: Information Retrieval (cs.IR)

    The vision of the Semantic Web (SW) is gradually unfolding and taking shape
    through a web of linked data, a part of which is built by capturing semantics
    stored in existing knowledge organization systems (KOS), subject metadata and
    resource metadata. The content of vast bibliographic collections is
    currently categorized by widely used bibliographic classifications, and we
    may soon see them being mined for information and linked in a meaningful way
    across the Web. Bibliographic classifications are designed for knowledge
    mediation, offering both a rich terminology and different ways in which
    concepts can be categorized and related to each other in the universe of
    knowledge. From 1990 to 2010 they were used in various resource discovery
    services on the Web, and they continue to be used to support information
    integration in a number of
    international digital library projects. In this chapter we will revisit some of
    the ways in which universal classifications, as language independent concept
    schemes, can assist humans and computers in structuring and presenting
    information and formulating queries. Most importantly, we highlight issues
    important to understanding bibliographic classifications, both in terms of
    their unused potential and technical limitations.

    Speeding up Memory-based Collaborative Filtering with Landmarks

    Gustavo R. Lima, Carlos E. Mello, Geraldo Zimbrao
    Subjects: Information Retrieval (cs.IR)

    Recommender systems play an important role in many scenarios where users are
    overwhelmed with too many choices to make. In this context, Collaborative
    Filtering (CF) arises by providing a simple and widely used approach for
    personalized recommendation. Memory-based CF algorithms mostly rely on
    similarities between pairs of users or items, which are subsequently
    employed in classifiers like k-Nearest Neighbor (kNN) to generalize for
    unknown ratings. A major issue with this approach is building the similarity
    matrix. Depending on the dimensionality of the rating matrix, the similarity
    computations may become computationally intractable. To overcome this issue,
    we propose to represent users by their distances to preselected users,
    namely landmarks. This procedure drastically reduces the computational cost
    associated with the similarity matrix. We evaluated our proposal on two
    distinct databases, and the results showed that our method consistently and
    considerably outperformed eight CF algorithms (including both memory-based
    and model-based) in terms of computational performance.
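
    A minimal sketch of the landmark idea, assuming cosine similarity on raw
    rating vectors and randomly chosen landmarks (both are assumptions for
    illustration): each user is represented by its similarities to a handful of
    landmark users, so neighbours are found in an n_users x n_landmarks space
    instead of n_users x n_users.

        # Illustrative landmark-based user representation for memory-based CF.
        import numpy as np

        rng = np.random.default_rng(1)
        ratings = rng.integers(0, 6, size=(1000, 50)).astype(float)  # users x items

        def cosine(a, b):
            return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

        n_landmarks = 20
        landmark_ids = rng.choice(ratings.shape[0], size=n_landmarks, replace=False)

        # each user becomes a short vector of similarities to the landmarks
        profiles = np.array([[cosine(u, ratings[l]) for l in landmark_ids]
                             for u in ratings])

        def k_nearest(user_id, k=5):
            # neighbours are now found in the compact landmark space
            d = np.linalg.norm(profiles - profiles[user_id], axis=1)
            return np.argsort(d)[1:k + 1]

        print(k_nearest(0))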

    Faceted classification: management and use

    Aida Slavic
    Journal-ref: Axiomathes, 18 (2), pp. 257-71 (2008)
    Subjects: Information Retrieval (cs.IR)

    The paper discusses issues related to the use of faceted classifications in
    an online environment. The author argues that knowledge organization systems
    can be fully utilized in information retrieval only if they are exposed and
    made available for machine processing. The experience with classification
    automation to date may be used to speed up and ease the conversion of existing
    faceted schemes or the creation of management tools for new systems. The author
    suggests that it is possible to agree on a set of functional requirements for
    supporting faceted classifications online that are equally relevant for the
    maintenance of classifications, the creation of classification indexing tools,
    or the management of classifications in an authority file. It is suggested that
    a set of requirements for analytico-synthetic classifications may be put
    forward to improve standards for the use and exchange of knowledge organization
    systems.

    End-to-End Cross-Modality Retrieval with CCA Projections and Pairwise Ranking Loss

    Matthias Dorfer, Jan Schlüter, Andreu Vall, Filip Korzeniowski, Gerhard Widmer
    Comments: 15 pages
    Subjects: Information Retrieval (cs.IR)

    Cross-modality retrieval encompasses retrieval tasks where the fetched items
    are of a different type than the search query, e.g., retrieving pictures
    relevant to a given text query. The state-of-the-art approach to cross-modality
    retrieval relies on learning a joint embedding space of the two modalities,
    where items from either modality are retrieved using nearest-neighbor search.
    In this work, we introduce a neural network layer based on Canonical
    Correlation Analysis (CCA) that learns better embedding spaces by analytically
    computing projections that maximize correlation. In contrast to previous
    approaches, the CCA layer allows us to combine existing objectives for
    embedding space learning, such as pairwise ranking losses, with the optimal
    projections of CCA. We show the effectiveness of our approach for
    cross-modality retrieval on three datasets, surpassing both Deep CCA and a
    multi-view network using freely learned projections optimized by a pairwise
    ranking loss, especially when little training data is available.
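
    For background, a minimal numpy sketch of the analytic CCA computation such
    a layer can build on (this is textbook CCA on two mini-batch views, not the
    authors' layer or training objective): whiten each view and take the SVD of
    the whitened cross-covariance.

        import numpy as np

        def inv_sqrt(c, eps=1e-6):
            # inverse matrix square root of a symmetric positive definite matrix
            w, v = np.linalg.eigh(c + eps * np.eye(c.shape[0]))
            return v @ np.diag(1.0 / np.sqrt(w)) @ v.T

        def cca_projections(x, y):
            x = x - x.mean(0); y = y - y.mean(0)
            n = x.shape[0]
            cxx, cyy, cxy = x.T @ x / n, y.T @ y / n, x.T @ y / n
            wx, wy = inv_sqrt(cxx), inv_sqrt(cyy)
            u, s, vt = np.linalg.svd(wx @ cxy @ wy)
            return wx @ u, wy @ vt.T, s   # projection matrices, canonical correlations

        rng = np.random.default_rng(0)
        x = rng.normal(size=(256, 8))
        y = x @ rng.normal(size=(8, 6)) + 0.1 * rng.normal(size=(256, 6))
        a, b, corr = cca_projections(x, y)
        print(np.round(corr, 3))          # close to 1 along the shared directions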


    Computation and Language

    A Lightweight Regression Method to Infer Psycholinguistic Properties for Brazilian Portuguese

    Leandro B. dos Santos, Magali S. Duran, Nathan S. Hartmann, Arnaldo Candido Jr., Gustavo H. Paetzold, Sandra M. Aluisio
    Comments: Paper accepted for TSD2017
    Subjects: Computation and Language (cs.CL)

    Psycholinguistic properties of words have been used in various approaches to
    Natural Language Processing tasks, such as text simplification and readability
    assessment. Most of these properties are subjective, requiring costly and
    time-consuming surveys to gather. Recent approaches use the limited
    datasets of psycholinguistic properties to extend them automatically to large
    lexicons. However, some of the resources used by such approaches are not
    available to most languages. This study presents a method to infer
    psycholinguistic properties for Brazilian Portuguese (BP) using regressors
    built with a light set of features usually available for less resourced
    languages: word length, frequency lists, lexical databases composed of school
    dictionaries and word embedding models. The correlations between the properties
    inferred are close to those obtained by related works. The resulting resource
    contains 26,874 words in BP annotated with concreteness, age of acquisition,
    imageability and subjective frequency.

    Learning Convolutional Text Representations for Visual Question Answering

    Zhengyang Wang, Shuiwang Ji
    Subjects: Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Visual question answering is a recently proposed artificial intelligence task
    that requires a deep understanding of both images and texts. In deep learning,
    images are typically modeled through convolutional neural networks, and texts
    are typically modeled through recurrent neural networks. While the requirement
    for modeling images is similar to traditional computer vision tasks, such as
    object recognition and image classification, visual question answering raises a
    different need for textual representation as compared to other natural language
    processing tasks. In this work, we perform a detailed analysis on natural
    language questions in visual question answering. Based on the analysis, we
    propose to rely on convolutional neural networks for learning textual
    representations. By exploring the various properties of convolutional neural
    networks specialized for text data, such as width and depth, we present our
    “CNN Inception + Gate” model. We show that our model improves question
    representations and thus the overall accuracy of visual question answering
    models. We also show that the text representation requirement in visual
    question answering is more complicated and comprehensive than that in
    conventional natural language processing tasks, making it a better task to
    evaluate textual representation methods. Shallow models like fastText, which
    can obtain comparable results with deep learning models in tasks like text
    classification, are not suitable in visual question answering.
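
    As a toy illustration of a "convolution + gate" text encoder (the paper's
    CNN Inception + Gate model is more elaborate; every shape and weight below
    is an assumption), a gated 1-D convolution can be run directly over word
    embeddings:

        import numpy as np

        rng = np.random.default_rng(0)
        seq_len, emb_dim, out_dim, width = 12, 32, 64, 3
        embeddings = rng.normal(size=(seq_len, emb_dim))         # one embedded question
        w_feat = rng.normal(size=(width, emb_dim, out_dim)) * 0.1
        w_gate = rng.normal(size=(width, emb_dim, out_dim)) * 0.1

        def gated_conv1d(x, wf, wg):
            k = wf.shape[0]
            pad = np.vstack([np.zeros((k - 1, x.shape[1])), x])  # left-pad to keep length
            out = []
            for t in range(x.shape[0]):
                window = pad[t:t + k]                            # (width, emb_dim)
                feat = np.einsum('we,weo->o', window, wf)
                gate = 1.0 / (1.0 + np.exp(-np.einsum('we,weo->o', window, wg)))
                out.append(feat * gate)                          # the gate decides what passes
            return np.array(out)

        h = gated_conv1d(embeddings, w_feat, w_gate)
        question_vec = h.max(axis=0)                             # pool over time
        print(question_vec.shape)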


    Distributed, Parallel, and Cluster Computing

    A Comparison of Reinforcement Learning Techniques for Fuzzy Cloud Auto-Scaling

    Hamid Arabnejad, Claus Pahl, Pooyan Jamshidi, Giovani Estrada
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)

    A goal of cloud service management is to design a self-adaptable auto-scaler
    that reacts to workload fluctuations by changing the amount of assigned
    resources. The key problem is how and when to add/remove resources in order
    to meet agreed service-level agreements. Reducing application cost and
    guaranteeing service-level agreements (SLAs) are two critical factors of
    dynamic controller design. In this paper, we compare two dynamic learning
    strategies based on a fuzzy logic system, which learns and modifies fuzzy
    scaling rules at runtime. A self-adaptive fuzzy logic controller is combined
    with two reinforcement learning (RL) approaches: (i) Fuzzy SARSA learning
    (FSL) and (ii) Fuzzy Q-learning (FQL). As an off-policy approach, Q-learning
    learns independently of the policy currently followed, whereas SARSA, as an
    on-policy approach, always incorporates the agent's actual behavior and
    leads to faster learning. Both approaches are implemented and compared in
    terms of their advantages and disadvantages, here on the OpenStack cloud
    platform. We demonstrate that both auto-scaling approaches can handle
    various load traffic situations, sudden and periodic, delivering resources
    on demand while reducing operating costs and preventing SLA violations. The
    experimental results demonstrate that FSL and FQL achieve acceptable
    performance in terms of the adjusted number of virtual machines targeted
    to optimize SLA compliance and response time.
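
    For reference, the two tabular update rules being compared are shown below
    (the paper wraps them in a fuzzy rule base and an auto-scaling action space,
    which is not reproduced here):

        import numpy as np

        n_states, n_actions, alpha, gamma = 5, 3, 0.1, 0.9
        Q = np.zeros((n_states, n_actions))

        def q_learning_update(s, a, r, s_next):
            # off-policy: bootstrap with the greedy action in the next state
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

        def sarsa_update(s, a, r, s_next, a_next):
            # on-policy: bootstrap with the action the agent actually takes next
            Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])

        q_learning_update(0, 1, 1.0, 2)
        sarsa_update(0, 1, 1.0, 2, 0)
        print(Q[0])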

    Atari games and Intel processors

    Robert Adamski, Tomasz Grel, Maciej Klimek, Henryk Michalewski
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The asynchronous nature of the state-of-the-art reinforcement learning
    algorithms, such as the Asynchronous Advantage Actor-Critic algorithm, makes
    them exceptionally suitable for CPU computations. However, given that deep
    reinforcement learning often deals with interpreting visual information, a
    large part of the training and inference time is spent performing
    convolutions. In
    this work we present our results on learning strategies in Atari games using a
    Convolutional Neural Network, the Math Kernel Library and TensorFlow 0.11rc0
    machine learning framework. We also analyze effects of asynchronous
    computations on the convergence of reinforcement learning algorithms.

    CacheShuffle: An Oblivious Shuffle Algorithm Using Caches

    Sarvar Patel, Giuseppe Persiano, Kevin Yeo
    Comments: 15 pages, 4 figures
    Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

    In this work, we present CacheShuffle, an algorithm for obliviously shuffling
    data on an untrusted server. Our methods improve the previously best known
    results by reducing the number of block accesses to (4 + epsilon)N which is
    close to the lower bound of 2N. Experimentation shows that for practical sizes
    of N, there is a 4x improvement over the previous best result. Also our result
    only requires 2 roundtrips compared to 5 roundtrips needed by the previous
    result. The novelty in our algorithm involves introducing a persistent client
    cache. Additionally, we show a recursive version of our algorithm which allows
    oblivious shuffling with smaller client memory.


    Learning

    Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach

    Emmanouil A. Platanios, Hoifung Poon, Tom M. Mitchell, Eric Horvitz
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We propose an efficient method to estimate the accuracy of classifiers using
    only unlabeled data. We consider a setting with multiple classification
    problems where the target classes may be tied together through logical
    constraints. For example, a set of classes may be mutually exclusive, meaning
    that a data instance can belong to at most one of them. The proposed method is
    based on the intuition that: (i) when classifiers agree, they are more likely
    to be correct, and (ii) when the classifiers make a prediction that violates
    the constraints, at least one classifier must be making an error. Experiments
    on four real-world data sets produce accuracy estimates within a few percent of
    the true accuracy, using solely unlabeled data. Our models also outperform
    existing state-of-the-art solutions in both estimating accuracies, and
    combining multiple classifier outputs. The results emphasize the utility of
    logical constraints in estimating accuracy, thus validating our intuition.
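
    A toy illustration of intuition (ii) only (the paper's probabilistic-logic
    model is far richer): on unlabeled data, any instance whose predicted labels
    violate a mutual-exclusion constraint proves that at least one classifier
    erred, so the violation rate carries information about error rates without
    any labels. The two simulated classifiers below are placeholders.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 10000
        # two binary classifiers for two mutually exclusive classes ("cat", "dog"):
        # an instance may belong to at most one of them.
        pred_cat = rng.random(n) < 0.30
        pred_dog = rng.random(n) < 0.35

        violations = pred_cat & pred_dog   # both exclusive classes claimed at once
        print("constraint-violation rate (evidence of error, no labels needed):",
              violations.mean())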

    EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD

    Mehmet A. Donmez, Maxim Raginsky, Andrew C. Singer
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We present a generic framework for trading off fidelity and cost in computing
    stochastic gradients when the costs of acquiring stochastic gradients of
    different quality are not known a priori. We consider a mini-batch oracle that
    distributes a limited query budget over a number of stochastic gradients and
    aggregates them to estimate the true gradient. Since the optimal mini-batch
    size depends on the unknown cost-fidelity function, we propose an algorithm,
    EE-Grad, that sequentially explores the performance of mini-batch oracles
    and exploits the accumulated knowledge to estimate the one achieving the best
    performance in terms of cost-efficiency. We provide performance guarantees for
    EE-Grad with respect to the optimal mini-batch oracle, and illustrate these
    results in the case of strongly convex objectives. We also provide a simple
    numerical example that corroborates our theoretical findings.
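
    A toy sketch of the explore/exploit idea only (this is not the EE-Grad
    algorithm or its guarantees): several mini-batch oracles return noisy
    gradient estimates at different, initially unknown costs; after a short
    exploration phase the learner exploits the oracle with the best observed
    cost-efficiency. The oracle parameters and the scoring rule are assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        true_grad = 2.0
        oracles = [                      # (noise std, cost per query), unknown to the learner
            {"std": 1.0, "cost": 1.0},
            {"std": 0.3, "cost": 5.0},
            {"std": 0.1, "cost": 30.0},
        ]

        def query(i):
            o = oracles[i]
            return true_grad + o["std"] * rng.normal(), o["cost"]

        stats = [{"sq_err": 0.0, "cost": 0.0, "n": 0} for _ in oracles]
        for i in range(len(oracles)):    # exploration: a few probes per oracle
            for _ in range(20):
                g, c = query(i)
                stats[i]["sq_err"] += (g - true_grad) ** 2  # observable only in this toy
                stats[i]["cost"] += c
                stats[i]["n"] += 1

        # exploitation: lowest estimated error-times-cost per query wins
        def score(s):
            return (s["sq_err"] / s["n"]) * (s["cost"] / s["n"])

        print("chosen oracle:", min(range(len(oracles)), key=lambda i: score(stats[i])))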

    Linear regression without correspondence

    Daniel Hsu, Kevin Shi, Xiaorui Sun
    Subjects: Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)

    This article considers algorithmic and statistical aspects of linear
    regression when the correspondence between the covariates and the responses is
    unknown. First, a fully polynomial-time approximation scheme is given for the
    natural least squares optimization problem in any constant dimension. Next, in
    an average-case and noise-free setting where the responses exactly correspond
    to a linear function of i.i.d. draws from a standard multivariate normal
    distribution, an efficient algorithm based on lattice basis reduction is shown
    to exactly recover the unknown linear function in arbitrary dimension. Finally,
    lower bounds on the signal-to-noise ratio are established for approximate
    recovery of the unknown linear function by any estimator.
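
    A tiny brute-force illustration of the objective (feasible only for very
    small n, whereas the paper gives polynomial-time schemes): jointly minimize
    over a permutation and the weights the residual between the permuted
    responses and the linear fit.

        import numpy as np
        from itertools import permutations

        rng = np.random.default_rng(0)
        n, d = 6, 2
        X = rng.normal(size=(n, d))
        w_true = np.array([1.5, -2.0])
        y_shuffled = (X @ w_true)[rng.permutation(n)]   # correspondence is lost

        best = None
        for pi in permutations(range(n)):
            target = y_shuffled[list(pi)]
            w, *_ = np.linalg.lstsq(X, target, rcond=None)
            loss = np.sum((X @ w - target) ** 2)
            if best is None or loss < best[0]:
                best = (loss, w)

        # should recover w_true in this noise-free toy example
        print("recovered weights:", np.round(best[1], 3))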

    Posterior sampling for reinforcement learning: worst-case regret bounds

    Shipra Agrawal, Randy Jia
    Subjects: Learning (cs.LG)

    We present an algorithm based on posterior sampling (aka Thompson sampling)
    that achieves near-optimal worst-case regret bounds when the underlying Markov
    Decision Process (MDP) is communicating with a finite, though unknown,
    diameter. Our main result is a high-probability regret upper bound of
    (\tilde{O}(D\sqrt{SAT})) for any communicating MDP with (S) states, (A)
    actions and diameter (D), when (T \ge S^5 A). Here, regret compares the
    total reward achieved by the algorithm to the total expected reward of an
    optimal infinite-horizon undiscounted average reward policy, in time horizon
    (T). This result improves over the best previously known upper bound of
    (\tilde{O}(DS\sqrt{AT})) achieved by any algorithm in this setting, and
    matches the dependence on (S) in the established lower bound of
    (\Omega(\sqrt{DSAT})) for this problem. Our techniques involve proving some
    novel results about the anti-concentration of the Dirichlet distribution,
    which may be of independent
    interest.
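
    A schematic sketch of posterior (Thompson) sampling for a small tabular MDP,
    assuming known rewards and a fixed re-sampling period for simplicity (the
    paper's algorithm uses a more careful epoch schedule, and that is what the
    regret analysis applies to):

        import numpy as np

        rng = np.random.default_rng(0)
        S, A, gamma = 4, 2, 0.9
        true_P = rng.dirichlet(np.ones(S), size=(S, A))   # unknown to the agent
        R = rng.random((S, A))                            # assumed known here
        counts = np.ones((S, A, S))                       # Dirichlet(1,...,1) prior

        def greedy_policy(P, iters=200):
            V = np.zeros(S)
            for _ in range(iters):
                Q = R + gamma * P @ V                     # (S, A)
                V = Q.max(axis=1)
            return Q.argmax(axis=1)

        s = 0
        for t in range(5000):
            if t % 100 == 0:                              # sample a model from the posterior
                P_hat = np.array([[rng.dirichlet(counts[i, a]) for a in range(A)]
                                  for i in range(S)])
                policy = greedy_policy(P_hat)
            a = policy[s]
            s_next = rng.choice(S, p=true_P[s, a])
            counts[s, a, s_next] += 1                     # posterior update
            s = s_next

        print(counts.sum(axis=2).astype(int))             # visits per (state, action)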

    Spectral-graph Based Classifications: Linear Regression for Classification and Normalized Radial Basis Function Network

    Zhenfang Hu, Gang Pan, Zhaohui Wu
    Subjects: Learning (cs.LG)

    Spectral graph theory has been widely applied in unsupervised and
    semi-supervised learning. In this paper, we find for the first time, to our
    knowledge, that it also plays a concrete role in supervised classification. It
    turns out that two classifiers are inherently related to the theory: linear
    regression for classification (LRC) and normalized radial basis function
    network (nRBFN), corresponding to linear and nonlinear kernel respectively. The
    spectral graph theory provides us with a new insight into a fundamental aspect
    of classification: the tradeoff between fitting error and overfitting risk.
    With the theory, ideal working conditions for LRC and nRBFN are presented,
    which ensure not only zero fitting error but also low overfitting risk. For
    quantitative analysis, two concepts, the fitting error and the spectral risk
    (indicating overfitting), have been defined. Their bounds for nRBFN and LRC are
    derived. A special result shows that the spectral risk of nRBFN is lower
    bounded by the number of classes and upper bounded by the size of radial basis.
    When the conditions are not met exactly, the classifiers will pursue the
    minimum fitting error, running into the risk of overfitting. It turns out that
    (\ell_2)-norm regularization can be applied to control overfitting. Its
    effect is explored in the spectral context. It is found that the two terms
    in the (\ell_2)-regularized objective correspond one-to-one to the fitting
    error and the spectral risk, revealing a tradeoff between the two
    quantities.
    Concerning practical performance, we devise a basis selection strategy to
    address the main problem hindering the applications of (n)RBFN. With the
    strategy, nRBFN is easy to implement yet flexible. Experiments on 14 benchmark
    data sets show the performance of nRBFN is comparable to that of SVM, whereas
    the parameter tuning of nRBFN is much easier, leading to reduction of model
    selection time.

    Unbiased estimates for linear regression via volume sampling

    Michal Derezinski, Manfred K. Warmuth
    Subjects: Learning (cs.LG)

    Given a full-rank matrix (X) with more columns than rows, consider the task
    of estimating the pseudo inverse (X^+) based on the pseudo inverse of a
    sampled subset of columns (of size at least the number of rows). We show
    that this is possible if the subset of columns is chosen with probability
    proportional to the squared volume spanned by the rows of the chosen
    submatrix (i.e., volume sampling). The resulting estimator is unbiased and,
    surprisingly, the covariance of the estimator also has a closed form: it
    equals a specific factor times (X^+X^{+\top}).

    The pseudo inverse plays an important part in solving the linear least
    squares problem, where we try to predict a label for each column of (X). We
    assume labels are expensive and we are only given the labels for the small
    subset of columns we sample from (X). Using our methods we show that the
    weight vector of the solution for the subproblem is an unbiased estimator of
    the optimal solution for the whole problem based on all column labels.

    We believe that these new formulas establish a fundamental connection between
    linear least squares and volume sampling. We use our methods to obtain an
    algorithm for volume sampling that is faster than state-of-the-art and for
    obtaining bounds for the total loss of the estimated least-squares solution on
    all labeled columns.
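
    The unbiasedness claim can be checked numerically by exact enumeration on a
    tiny matrix (illustration only; the paper's contribution includes sampling
    efficiently without enumeration): sample column subsets of size equal to the
    number of rows with probability proportional to the squared volume, zero-pad
    the pseudo inverse of each submatrix, and average.

        import numpy as np
        from itertools import combinations

        rng = np.random.default_rng(0)
        d, n = 2, 5
        X = rng.normal(size=(d, n))                  # full rank, more columns than rows

        subsets = list(combinations(range(n), d))
        weights = np.array([np.linalg.det(X[:, list(S)]) ** 2 for S in subsets])
        probs = weights / weights.sum()              # volume sampling distribution

        expectation = np.zeros((n, d))
        for S, p in zip(subsets, probs):
            est = np.zeros((n, d))
            est[list(S), :] = np.linalg.pinv(X[:, list(S)])  # pinv of sampled columns
            expectation += p * est

        print(np.allclose(expectation, np.linalg.pinv(X)))   # True: unbiased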

    Practical Algorithms for Best-K Identification in Multi-Armed Bandits

    Haotian Jiang, Jian Li, Mingda Qiao
    Subjects: Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)

    In the Best-(K) identification problem (Best-(K)-Arm), we are given (N)
    stochastic bandit arms with unknown reward distributions. Our goal is to
    identify the (K) arms with the largest means with high confidence, by drawing
    samples from the arms adaptively. This problem is motivated by various
    practical applications and has attracted considerable attention in the past
    decade. In this paper, we propose new practical algorithms for the Best-(K)-Arm
    problem, which have nearly optimal sample complexity bounds (matching the lower
    bound up to logarithmic factors) and outperform the state-of-the-art algorithms
    for the Best-(K)-Arm problem (even for (K=1)) in practice.

    Learning Convolutional Text Representations for Visual Question Answering

    Zhengyang Wang, Shuiwang Ji
    Subjects: Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Visual question answering is a recently proposed artificial intelligence task
    that requires a deep understanding of both images and texts. In deep learning,
    images are typically modeled through convolutional neural networks, and texts
    are typically modeled through recurrent neural networks. While the requirement
    for modeling images is similar to traditional computer vision tasks, such as
    object recognition and image classification, visual question answering raises a
    different need for textual representation as compared to other natural language
    processing tasks. In this work, we perform a detailed analysis on natural
    language questions in visual question answering. Based on the analysis, we
    propose to rely on convolutional neural networks for learning textual
    representations. By exploring the various properties of convolutional neural
    networks specialized for text data, such as width and depth, we present our
    “CNN Inception + Gate” model. We show that our model improves question
    representations and thus the overall accuracy of visual question answering
    models. We also show that the text representation requirement in visual
    question answering is more complicated and comprehensive than that in
    conventional natural language processing tasks, making it a better task to
    evaluate textual representation methods. Shallow models like fastText, which
    can obtain comparable results with deep learning models in tasks like text
    classification, are not suitable in visual question answering.

    Spatial Variational Auto-Encoding via Matrix-Variate Normal Distributions

    Zhengyang Wang, Hao Yuan, Shuiwang Ji
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    The key idea of variational auto-encoders (VAEs) resembles that of
    traditional auto-encoder models in which spatial information is supposed to be
    explicitly encoded in the latent space. However, the latent variables in VAEs
    are vectors, which are commonly interpreted as multiple feature maps of size
    1×1. Such representations can only convey spatial information implicitly when
    coupled with powerful decoders. In this work, we propose spatial VAEs that use
    latent variables as feature maps of larger size to explicitly capture spatial
    information. This is achieved by allowing the latent variables to be sampled
    from matrix-variate normal (MVN) distributions whose parameters are computed
    from the encoder network. To increase dependencies among locations on latent
    feature maps and reduce the number of parameters, we further propose spatial
    VAEs via low-rank MVN distributions. Experimental results show that the
    proposed spatial VAEs outperform original VAEs in capturing rich structural and
    spatial information.
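
    For background, a sample from a matrix-variate normal MN(M, U, V) with row
    covariance U and column covariance V can be drawn with the standard
    construction X = M + A Z B^T, where U = A A^T, V = B B^T and Z has i.i.d.
    standard normal entries (illustrative only; the paper parameterizes these
    quantities with the encoder network and also proposes low-rank versions):

        import numpy as np

        rng = np.random.default_rng(0)
        h = w = 8                                     # latent feature-map size
        M = np.zeros((h, w))                          # mean feature map
        U = np.eye(h) + 0.5 * np.ones((h, h)) / h     # row covariance (positive definite)
        V = 0.3 * np.eye(w)                           # column covariance

        A = np.linalg.cholesky(U)
        B = np.linalg.cholesky(V)
        Z = rng.normal(size=(h, w))

        sample = M + A @ Z @ B.T                      # one latent feature map
        print(sample.shape)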

    Pixel Deconvolutional Networks

    Hongyang Gao, Hao Yuan, Zhengyang Wang, Shuiwang Ji
    Comments: 10 pages
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Deconvolutional layers have been widely used in a variety of deep models for
    up-sampling, including encoder-decoder networks for semantic segmentation and
    deep generative models for unsupervised learning. One of the key limitations of
    deconvolutional operations is that they result in the so-called checkerboard
    problem. This is caused by the fact that no direct relationship exists among
    adjacent pixels on the output feature map. To address this problem, we propose
    the pixel deconvolutional layer (PixelDCL) to establish direct relationships
    among adjacent pixels on the up-sampled feature map. Our method is based on a
    fresh interpretation of the regular deconvolution operation. The resulting
    PixelDCL can be used to replace any deconvolutional layer in a plug-and-play
    manner without compromising the fully trainable capabilities of original
    models. The proposed PixelDCL may result in a slight decrease in efficiency, but
    this can be overcome by an implementation trick. Experimental results on
    semantic segmentation demonstrate that PixelDCL can consider spatial features
    such as edges and shapes and yields more accurate segmentation outputs than
    deconvolutional layers. When used in image generation tasks, our PixelDCL can
    largely overcome the checkerboard problem suffered by regular deconvolution
    operations.

    Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning

    Nat Dilokthanakul, Christos Kaplanis, Nick Pawlowski, Murray Shanahan
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    The problem of sparse rewards is one of the hardest challenges in
    contemporary reinforcement learning. Hierarchical reinforcement learning (HRL)
    tackles this problem by using a set of temporally-extended actions, or options,
    each of which has its own subgoal. These subgoals are normally handcrafted for
    specific tasks. Here, though, we introduce a generic class of subgoals with
    broad applicability in the visual domain. Underlying our approach (in common
    with work using “auxiliary tasks”) is the hypothesis that the ability to
    control aspects of the environment is an inherently useful skill to have. We
    incorporate such subgoals in an end-to-end hierarchical reinforcement learning
    system and test two variants of our algorithm on a number of games from the
    Atari suite. We highlight the advantage of our approach in one of the hardest
    games — Montezuma’s revenge — for which the ability to handle sparse rewards
    is key. Our agent learns several times faster than the current state-of-the-art
    HRL agent in this game, reaching a similar level of performance.

    Fast Singular Value Shrinkage with Chebyshev Polynomial Approximation Based on Signal Sparsity

    Masaki Onuki, Shunsuke Ono, Keiichiro Shirai, Yuichi Tanaka
    Comments: This is a journal paper
    Subjects: Numerical Analysis (cs.NA); Learning (cs.LG)

    We propose an approximation method for thresholding of singular values using
    Chebyshev polynomial approximation (CPA). Many signal processing problems
    require iterative application of singular value decomposition (SVD) for
    minimizing the rank of a given data matrix with other cost functions and/or
    constraints, which is called matrix rank minimization. In matrix rank
    minimization, singular values of a matrix are shrunk by hard-thresholding,
    soft-thresholding, or weighted soft-thresholding. However, the computational
    cost of SVD is generally too expensive to handle high dimensional signals such
    as images; hence, in this case, matrix rank minimization requires enormous
    computation time. In this paper, we leverage CPA to (approximately) manipulate
    singular values without computing singular values and vectors. The thresholding
    of singular values is expressed by a multiplication of certain matrices, which
    is derived from a characteristic of CPA. The multiplication is also efficiently
    computed using the sparsity of signals. As a result, the computational cost is
    significantly reduced. Experimental results suggest the effectiveness of our
    method through several image processing applications based on matrix rank
    minimization with nuclear norm relaxation in terms of computation time and
    approximation precision.

    Deep adversarial neural decoding

    Yağmur Güçlütürk, Umut Güçlü, Katja Seeliger, Sander Bosch, Rob van Lier, Marcel van Gerven
    Subjects: Neurons and Cognition (q-bio.NC); Learning (cs.LG); Machine Learning (stat.ML)

    Here, we present a novel approach to solve the problem of reconstructing
    perceived stimuli from brain responses by combining probabilistic inference
    with deep learning. Our approach first inverts the linear transformation from
    latent features to brain responses with maximum a posteriori estimation and
    then inverts the nonlinear transformation from perceived stimuli to latent
    features with adversarial training of convolutional neural networks. We test
    our approach with a functional magnetic resonance imaging experiment and show
    that it can generate state-of-the-art reconstructions of perceived faces from
    brain activations.

    Gradient Estimators for Implicit Models

    Yingzhen Li, Richard E. Turner
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Implicit models, which allow for the generation of samples but not for
    point-wise evaluation of probabilities, are omnipresent in real world problems
    tackled by machine learning and a hot topic of current research. Some examples
    include data simulators that are widely used in engineering and scientific
    research, generative adversarial networks (GANs) for image synthesis, and
    hot-off-the-press approximate inference techniques relying on implicit
    distributions. The majority of existing approaches to learning implicit models
    rely on approximating the intractable distribution or optimisation objective
    for gradient-based optimisation, which is liable to produce inaccurate updates
    and thus poor models. This paper alleviates the need for such approximations by
    proposing the Stein gradient estimator, which directly estimates the score
    function of the implicitly defined distribution. The efficacy of the proposed
    estimator is empirically demonstrated by examples that include meta-learning
    for approximate inference, and entropy regularised GANs that provide improved
    sample diversities.

    Machine learning for classification and quantification of monoclonal antibody preparations for cancer therapy

    Laetitia Le, Camille Marini, Alexandre Gramfort, David Nguyen, Mehdi Cherti, Sana Tfaili, Ali Tfayli, Arlette Baillet-Guffroy, Patrice Prognon, Pierre Chaminade, Eric Caudron, Balázs Kégl
    Subjects: Quantitative Methods (q-bio.QM); Learning (cs.LG)

    Monoclonal antibodies constitute one of the most important strategies to
    treat patients suffering from cancers such as hematological malignancies and
    solid tumors. In order to guarantee the quality of preparations made at the
    hospital, quality control has to be developed. The aim of this study was to
    explore a noninvasive, nondestructive, and rapid analytical method to ensure
    the quality of the final preparation without causing any delay in the
    process. We analyzed four mAbs (Infliximab, Bevacizumab, Ramucirumab and
    Rituximab) diluted at therapeutic concentration in 0.9% sodium chloride
    using Raman spectroscopy. To reduce the prediction errors obtained with
    traditional chemometric data analysis, we explored a data-driven approach
    using statistical machine learning methods in which preprocessing and
    predictive models are jointly optimized. We prepared a data analytics
    workflow and submitted the problem to a collaborative data challenge
    platform called Rapid Analytics and Model Prototyping (RAMP). This allowed
    us to use solutions from about 300 data scientists during five days of
    collaborative work. The prediction of the four mAb samples was considerably
    improved, with a misclassification rate of 0.8% and a mean error rate of 4%.

    What do We Learn by Semantic Scene Understanding for Remote Sensing imagery in CNN framework?

    Haifeng Li, Jian Peng, Chao Tao, Jie Chen, Min Deng
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Recently, deep convolutional neural networks (DCNNs) have achieved
    remarkable success and developed rapidly in the field of natural image
    recognition. Compared with natural images, remote sensing images are larger
    in scale, and the scenes and objects they represent are more macroscopic.
    This
    study inquires whether remote sensing scene and natural scene recognitions
    differ and raises the following questions: What are the key factors in remote
    sensing scene recognition? Is the DCNN recognition mechanism centered on object
    recognition still applicable to the scenarios of remote sensing scene
    understanding? We performed several experiments to explore the influence of the
    DCNN structure and the scale of remote sensing scene understanding from the
    perspective of scene complexity. Our experiment shows that understanding a
    complex scene depends on an in-depth network and multiple-scale perception.
    Using a visualization method, we qualitatively and quantitatively analyze the
    recognition mechanism in a complex remote sensing scene and demonstrate the
    importance of multi-objective joint semantic support.

    Masked Autoregressive Flow for Density Estimation

    George Papamakarios, Theo Pavlakou, Iain Murray
    Comments: 17 pages
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Autoregressive models are among the best performing neural density
    estimators. We describe an approach for increasing the flexibility of an
    autoregressive model, based on modelling the random numbers that the model uses
    internally when generating data. By constructing a stack of autoregressive
    models, each modelling the random numbers of the next model in the stack, we
    obtain a type of normalizing flow suitable for density estimation, which we
    call Masked Autoregressive Flow. This type of flow is closely related to
    Inverse Autoregressive Flow and is a generalization of Real NVP. Masked
    Autoregressive Flow achieves state-of-the-art performance in a range of
    general-purpose density estimation tasks.
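
    A sketch of one such autoregressive layer with a toy, hand-fixed conditioner
    (in the paper the conditioner is a masked neural network and several layers
    are stacked): each coordinate is shifted and scaled by statistics of the
    preceding coordinates, and the log-determinant needed for the density is the
    negative sum of the log-scales.

        import numpy as np

        rng = np.random.default_rng(0)
        d = 3
        x = rng.normal(size=d)

        def conditioner(x_prev):
            # toy autoregressive statistics: mu_i and log-scale alpha_i
            # depend only on the preceding coordinates x_{<i}
            return 0.5 * x_prev.sum(), 0.1 * x_prev.sum()

        u = np.zeros(d)
        log_det = 0.0
        for i in range(d):
            mu, alpha = conditioner(x[:i])
            u[i] = (x[i] - mu) * np.exp(-alpha)   # invertible autoregressive transform
            log_det += -alpha                     # log |det du/dx| accumulates

        base_logp = -0.5 * (u @ u + d * np.log(2 * np.pi))  # standard normal base
        print(base_logp + log_det)                          # log-density of x under the flow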

    The Landscape of Deep Learning Algorithms

    Pan Zhou, Jiashi Feng
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Optimization and Control (math.OC)

    This paper studies the landscape of empirical risk of deep neural networks by
    theoretically analyzing its convergence behavior to the population risk as well
    as its stationary points and properties. For an (l)-layer linear neural
    network, we prove its empirical risk uniformly converges to its population risk
    at the rate of (mathcal{O}(r^{2l}sqrt{dlog(l)}/sqrt{n})) with training
    sample size of (n), the total weight dimension of (d) and the magnitude bound
    (r) of weight of each layer. We then derive the stability and generalization
    bounds for the empirical risk based on this result. Besides, we establish the
    uniform convergence of gradient of the empirical risk to its population
    counterpart. We prove the one-to-one correspondence of the non-degenerate
    stationary points between the empirical and population risks with convergence
    guarantees, which describes the landscape of deep neural networks. In addition,
    we analyze these properties for deep nonlinear neural networks with sigmoid
    activation functions. We prove similar results for convergence behavior of
    their empirical risks as well as the gradients and analyze properties of their
    non-degenerate stationary points.

    To the best of our knowledge, this work is the first to theoretically
    characterize the landscapes of deep learning algorithms. In addition, our
    results provide the sample complexity of training a good deep neural
    network. We also
    provide theoretical understanding on how the neural network depth (l), the
    layer width, the network size (d) and parameter magnitude determine the neural
    network landscapes.

    Learning Effective Representations from Clinical Notes

    Sebastien Dubois, David C. Kale, Nigam Shah, Kenneth Jung
    Comments: Under review for the Machine Learning for Healthcare Conference 2017
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Clinical notes are a rich source of information about patient state. However,
    using them effectively presents many challenges. In this work we present two
    methods for summarizing clinical notes into patient-level representations. The
    resulting representations are evaluated on a range of prediction tasks and
    cohort sizes. The new representations offer significant predictive performance
    gains over the common baselines of Bag of Words and topic model representations
    across all tested tasks and cohort sizes.

    Near-optimality of sequential joint detection and estimation via online mirror descent

    Yang Cao, Yao Xie, Huan Xu
    Subjects: Statistics Theory (math.ST); Learning (cs.LG)

    Sequential hypothesis test and change-point detection when the distribution
    parameters are unknown is a fundamental problem in statistics and machine
    learning. We show that for such problems, detection procedures based on
    sequential likelihood ratios with simple online mirror descent estimators are
    nearly optimal. This is a blessing: although the well-known generalized
    likelihood ratio statistics are theoretically optimal, their exact
    computation usually requires infinite memory of historical data. We prove
    the optimality by connecting sequential analysis with online convex
    optimization and leveraging the logarithmic regret bound of the online
    mirror descent algorithm. Numerical examples validate our theory.

    Atari games and Intel processors

    Robert Adamski, Tomasz Grel, Maciej Klimek, Henryk Michalewski
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Learning (cs.LG)

    The asynchronous nature of the state-of-the-art reinforcement learning
    algorithms, such as the Asynchronous Advantage Actor-Critic algorithm, makes
    them exceptionally suitable for CPU computations. However, given that deep
    reinforcement learning often deals with interpreting visual information, a
    large part of the training and inference time is spent performing
    convolutions. In
    this work we present our results on learning strategies in Atari games using a
    Convolutional Neural Network, the Math Kernel Library and TensorFlow 0.11rc0
    machine learning framework. We also analyze effects of asynchronous
    computations on the convergence of reinforcement learning algorithms.

    CDS Rate Construction Methods by Machine Learning Techniques

    Raymond Brummelhuis, Zhongmin Luo
    Comments: 51 pages; 21 Figures; 15 Tables
    Subjects: Statistical Finance (q-fin.ST); Learning (cs.LG); Risk Management (q-fin.RM); Machine Learning (stat.ML)

    Regulators require financial institutions to estimate counterparty default
    risks from liquid CDS quotes for the valuation and risk management of OTC
    derivatives. However, the vast majority of counterparties do not have liquid
    CDS quotes and need proxy CDS rates. Existing methods cannot account for
    counterparty-specific default risks; we propose to construct proxy CDS rates
    by associating liquid CDS proxies with illiquid counterparties using machine
    learning techniques. After testing 156 classifiers from the 8 most popular
    classifier families, we found that some classifiers achieve highly
    satisfactory accuracy rates. Furthermore, we have rank-ordered the
    performances and investigated performance variations amongst and within the
    8 classifier families. This paper
    is, to the best of our knowledge, the first systematic study of CDS Proxy
    construction by Machine Learning techniques, and the first systematic
    classifier comparison study based entirely on financial market data. Its
    findings both confirm and contrast existing classifier performance literature.
    Given the typically highly correlated nature of financial data, we investigated
    the impact of correlation on classifier performance. The techniques used in
    this paper should be of interest for financial institutions seeking a CDS Proxy
    method, and can serve for proxy construction for other financial variables.
    Some directions for future research are indicated.

    A Unified Framework for Stochastic Matrix Factorization via Variance Reduction

    Renbo Zhao, William B. Haskell, Jiashi Feng
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Optimization and Control (math.OC)

    We propose a unified framework to speed up the existing stochastic matrix
    factorization (SMF) algorithms via variance reduction. Our framework is general
    and it subsumes several well-known SMF formulations in the literature. We
    perform a non-asymptotic convergence analysis of our framework and derive
    computational and sample complexities for our algorithm to converge to an
    (epsilon)-stationary point in expectation. In addition, extensive experiments
    for a wide class of SMF formulations demonstrate that our framework
    consistently yields faster convergence and a more accurate output dictionary
    vis-à-vis state-of-the-art frameworks.

    Discovering the Graph Structure in the Clustering Results

    Evgeny Bauman, Konstantin Bauman
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    In a standard cluster analysis, such as k-means, in addition to cluster
    locations and the distances between them, it is important to know whether
    the clusters are connected or well separated from each other. The main focus
    of this paper is
    discovering the relations between the resulting clusters. We propose a new
    method which is based on pairwise overlapping k-means clustering, that in
    addition to means of clusters provides the graph structure of their relations.
    The proposed method has a set of parameters that can be tuned in order to
    control the sensitivity of the model and the desired relative size of the
    pairwise overlapping interval between means of two adjacent clusters, i.e.,
    level of overlapping. We present the exact formula for calculating that
    parameter. The empirical study presented in the paper demonstrates that our
    approach not only works well on toy data but also complements standard
    clustering results with a reasonable graph structure on real datasets, such
    as financial indices and restaurants.

    Monaural Audio Speaker Separation with Source Contrastive Estimation

    Cory Stephenson, Patrick Callier, Abhinav Ganesh, Karl Ni
    Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    We propose an algorithm to separate simultaneously speaking persons from each
    other, the “cocktail party problem”, using a single microphone. Our approach
    involves a deep recurrent neural network regressing to a vector space that is
    descriptive of independent speakers. Such a vector space can embed empirically
    determined speaker characteristics and is optimized by distinguishing between
    speaker masks. We call this technique source-contrastive estimation. The
    methodology is inspired by negative sampling, which has seen success in natural
    language processing, where an embedding is learned by correlating and
    de-correlating a given input vector with output weights. Although the matrix
    determined by the output weights is dependent on a set of known speakers, we
    only use the input vectors during inference. Doing so will ensure that source
    separation is explicitly speaker-independent. Our approach is similar to recent
    deep neural network clustering and permutation-invariant training research; we
    use weighted spectral features and masks to augment individual speaker
    frequencies while filtering out other speakers. We avoid, however, the severe
    computational burden of other approaches with our technique. Furthermore, by
    training a vector space rather than combinations of different speakers or
    differences thereof, we avoid the so-called permutation problem during
    training. Our algorithm offers an intuitive, computationally efficient response
    to the cocktail party problem, and most importantly boasts better empirical
    performance than other current techniques.


    Information Theory

    Secure Computation of Randomized Functions: Further Results

    Deepesh Data, Vinod M. Prabhakaran
    Comments: This is an extended version of a submission to ITW, 2017
    Subjects: Information Theory (cs.IT)

    We consider secure computation of randomized functions between two users,
    where both the users (Alice and Bob) have inputs, Alice sends a message to Bob
    over a rate-limited, noise-free link, and then Bob produces the output. We
    study two cases: (i) when privacy condition is required only against Bob, who
    tries to learn more about Alice’s input from the message than what can be
    inferred by his own input and output, and (ii) when there is no privacy
    requirement. For both the problems, we give single-letter expressions for the
    optimal rates. For the first problem, we also explicitly characterize securely
    computable randomized functions when input has full support, which leads to a
    much simpler expression for the optimal rate. Recently, Data (ISIT 2016)
    studied the remaining two cases (first, when privacy conditions are against
    both the users; and second, when privacy condition is only against Alice) and
    obtained single-letter expressions for optimal rates in both the scenarios.

    Achievable Information Rates for Coded Modulation with Hard Decision Decoding for Coherent Fiber-Optic Systems

    Alireza Sheikh, Alexandre Graell i Amat, Gianluigi Liva
    Subjects: Information Theory (cs.IT)

    We analyze the achievable information rates (AIRs) for coded modulation
    schemes with QAM constellations with both bit-wise and symbol-wise decoders,
    corresponding to the case where a binary code is used in combination with a
    higher-order modulation using the bit-interleaved coded modulation (BICM)
    paradigm and to the case where a nonbinary code over a field matched to the
    constellation size is used, respectively. In particular, we consider hard
    decision decoding, which is the preferable option for fiber-optic communication
    systems where decoding complexity is a concern. Recently, Liga et al.
    analyzed the AIRs for bit-wise and symbol-wise decoders considering what the
    authors called a hard decision decoder which, however, exploits soft
    information from the transition probabilities of the discrete-input
    discrete-output channel resulting from the hard detection. As such, the
    complexity of the
    decoder is essentially the same as the complexity of a soft decision decoder.
    In this paper, we analyze instead the AIRs for the standard hard decision
    decoder, commonly used in practice, where the decoding is based on the Hamming
    distance metric. We show that if standard hard decision decoding is used,
    bit-wise decoders yield significantly higher AIRs than symbol-wise decoders. As
    a result, contrary to the conclusion by Liga et al., binary decoders
    together with the BICM paradigm are preferable for spectrally-efficient
    fiber-optic systems. We also design binary and nonbinary staircase codes and
    show that, in agreement with the AIRs, binary codes yield better performance.

    Cooperative spectrum sensing with enhanced energy detection under GAUSSIAN noise uncertainty in cognitive radios

    He Huang, Chenglin Zhao
    Comments: 5 pages
    Subjects: Information Theory (cs.IT)

    This paper addresses the optimization of energy detection (ED) thresholds
    for cooperative spectrum sensing (CSS) over generalized Gaussian channels in
    cognitive radio (CR) systems. Enhanced ED thresholds are used to deal with
    the excessive sensitivity to Gaussian noise uncertainty, which is
    appropriate for interference from multiple, mal-distributed noises. A
    two-step decision pattern and a convex threshold solution within the voting
    process are used for secondary users (SUs) over coherent complex random
    Gaussian channels. By deriving the probabilities of detection and false
    alarm under independent and identically distributed (i.i.d.) conditions for
    multiple SUs, we obtain optimal parameters (such as ED thresholds and the
    number of collaborative users) that achieve fusion processing of mixed
    noises at low signal-to-noise-ratio (SNR) levels. Furthermore, simulation
    results show that the proposed schemes significantly outperform most other
    noise uncertainty schemes, greatly improving the total error rate in
    low-quality communication environments.
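
    A generic illustration of the underlying energy detector and OR-rule fusion
    (the paper's two-step decision pattern and threshold optimization are not
    reproduced; the threshold and SNR below are arbitrary): each secondary user
    compares the received energy with a threshold and the fusion center declares
    the band occupied if any user does.

        import numpy as np

        rng = np.random.default_rng(0)
        n_users, n_samples, snr_linear = 4, 200, 0.5

        def sense(signal_present):
            signal = np.sqrt(snr_linear) * rng.normal(size=n_samples) if signal_present else 0.0
            received = signal + rng.normal(size=n_samples)   # unit-variance Gaussian noise
            return np.sum(received ** 2)                     # energy test statistic

        threshold = 1.3 * n_samples                          # illustrative ED threshold
        decisions = [sense(signal_present=True) > threshold for _ in range(n_users)]
        print("fusion decision (OR-rule):", any(decisions))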

    Performance Analysis of Energy Detection over Composite kappa-miu Shadowed Fading Channels

    Huang He, Zhao Chenglin
    Comments: 9 pages
    Subjects: Information Theory (cs.IT)

    Energy detection is a reliable non-coherent signal processing technique for
    spectrum sensing in cognitive radio networks, thanks to its low complexity,
    fast sensing ability, and lack of any requirement for prior information
    about the received signal. Since the performance of energy detection is
    affected by physical multipath fading, this paper concentrates on analyzing
    the characteristics of energy detection over composite shadowed fading
    channels. The small-scale and line-of-sight fading distribution includes
    particular examples such as the Rayleigh, Hoyt, Nakagami-m and one-sided
    Gaussian distributions. Based on this, we derive the probability density
    functions of the signal envelope and the signal-to-noise ratio of the
    composite shadowed fading channels, which accurately capture the
    line-of-sight shadowed fading characteristics. Subsequently, exact
    closed-form expressions with an infinite-series formulation for the
    detection probability are extended for the first time to estimate the
    detection capacity of the above-mentioned model by adopting an Inverse
    Gaussian asymptotic distribution. In addition, the absolute truncation error
    is derived for evaluating the minimum detection efficiency. The established
    model can also be applied to detection estimation with non-integral fading
    parameters. Last but not least, the analytical results and the quantified
    performance are validated by numerical evaluation with MATHEMATICA and
    MATLAB as the power variables of the dominant components change.

    Cooperative Spectrum Sensing over Generalized Fading Channels Based on Energy Detection

    Huang He, Zhao Chenglin
    Comments: 14 pages
    Subjects: Information Theory (cs.IT)

    This paper analyzes the unified performance of energy detection (ED) for
    spectrum sensing (SS) over generalized fading channels in cognitive radio
    (CR) networks. The detection performance of SS schemes is clearly affected
    by the fading channel between communication nodes, and ED has the advantages
    of fast implementation, low complexity and no requirement for prior
    information about the received signal, so it is meaningful to investigate ED
    performed over fading channels such as the Nakagami-m and Rice channels, or
    generalized fading channels such as the κ-μ and η-μ fading distributions.
    The α-κ-μ fading distribution is a generalized fading model that represents
    the nonlinear and small-scale variation of fading channels. The probability
    density function (p.d.f.) of the instantaneous signal-to-noise ratio (SNR)
    of the α-κ-μ distribution is derived from the envelope p.d.f. to evaluate
    energy efficiency for sensing systems. Next, the detection probability model
    with the Marcum-Q function is derived, and closed-form detection expressions
    based on the moment generating function (MGF) method are deduced to achieve
    sensing communications over generalized fading channels. Furthermore, novel
    and exact closed-form analytic expressions for the average area under the
    receiver operating characteristic curve are also deduced to analyze the
    performance characteristics of ED over α-κ-μ fading channels. Besides,
    cooperative spectrum sensing (CSS) with diversity reception is applied to
    improve the detection accuracy and mitigate shadowed fading with the
    OR-rule. Finally, the results show that the detection capacity is evidently
    affected by α-κ-μ fading conditions, but appropriate channel parameters will
    improve sensing performance.

    Beam Design and User Scheduling for Non-Orthogonal Multiple Access with Multiple Antennas Based on Pareto-Optimality

    Junyeong Seo, Youngchul Sung
    Comments: 29 pages, 8 figures, submitted to IEEE Transactions on Signal Processing
    Subjects: Information Theory (cs.IT)

    In this paper, an efficient transmit beam design and user scheduling method
    is proposed for multi-user (MU) multiple-input single-output (MISO)
    non-orthogonal multiple access (NOMA) downlink, based on Pareto-optimality. The
    proposed beam design and user scheduling method groups simultaneously served
    users into multiple clusters, with a practical two users per cluster, and
    then applies spatial zero-forcing (ZF) across clusters to control
    inter-cluster interference (ICI) and Pareto-optimal beam design with
    successive interference cancellation (SIC) to the two users in each cluster
    to remove interference to strong users and leverage the
    signal-to-interference-plus-noise ratios (SINRs) of the
    interference-experiencing weak users. The proposed method has the
    flexibility to
    control the rates of strong and weak users and numerical results show that the
    proposed method yields good performance.

    Ultra-Reliable and Low Latency Communication in mmWave-Enabled Massive MIMO Networks

    Trung Kien Vu, Chen-Feng Liu, Mehdi Bennis, Mérouane Debbah, Matti Latva-aho, Choong Seon Hong
    Comments: Accepted May 12, 2017 by IEEE Communications Letters. Topic is Ultra-Reliable and Low Latency Communication in 5G mmWave Networks
    Subjects: Information Theory (cs.IT)

    Ultra-reliability and low-latency are two key components in 5G networks. In
    this letter, we investigate the problem of ultra-reliable and low-latency
    communication (URLLC) in millimeter wave (mmWave)-enabled massive
    multiple-input multiple-output (MIMO) networks. The problem is cast as a
    network utility maximization subject to probabilistic latency and reliability
    constraints. To solve this problem, we resort to the Lyapunov technique whereby
    a utility-delay control approach is proposed, which adapts to channel
    variations and queue dynamics. Numerical results demonstrate that our proposed
    approach ensures reliable communication with a guaranteed probability of
    99.99%, and reduces latency by 28.41% and 77.11% as compared to baselines with
    and without probabilistic latency constraints, respectively.
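
    The Lyapunov machinery referred to above can be sketched with a generic
    drift-plus-penalty controller. The sketch below trades a transmit-power
    penalty against queue backlog while adapting to the channel state; it is only
    illustrative of the technique, with invented arrival, channel, and weight
    values, and it does not model the paper's utility function or its latency and
    reliability constraints.

        import numpy as np

        # Generic drift-plus-penalty controller: each slot, pick the transmit
        # power that maximizes (backlog-weighted service rate) - V * power.
        rng = np.random.default_rng(0)
        V = 20.0          # larger V favors low power at the cost of more delay
        Q = 0.0           # queue backlog (packets)
        for t in range(1000):
            arrival = rng.poisson(1.0)             # new packets this slot
            channel = rng.exponential(1.0)         # instantaneous channel gain
            powers = np.linspace(0.0, 10.0, 200)   # candidate transmit powers
            rates = np.log1p(channel * powers)     # service rate per candidate
            p = powers[np.argmax(Q * rates - V * powers)]
            served = np.log1p(channel * p)
            Q = max(Q + arrival - served, 0.0)
        print("final backlog:", Q)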

    Energy-Efficient Resource Allocation for Elastic Optical Networks using Convex Optimization

    Mohammad Hadi, Mohammad Reza Pakravan
    Comments: 28 pages, 11 figures, 4 tables
    Subjects: Information Theory (cs.IT)

    We propose a two-stage algorithm for energy-efficient resource allocation
    subject to QoS and physical-layer requirements in OFDM-based EONs. The first
    stage deals with routing, grooming, and traffic ordering, and aims at
    minimizing amplifier power consumption and the number of active transponders.
    We provide a heuristic procedure which yields an acceptable solution to the
    complex ILP formulation of the routing and grooming problem. In the second
    stage, we optimize the transponder configuration, including spectrum and
    transmit power parameters, to minimize transponder power consumption. We show
    how QoS and transponder power consumption can be represented by convex
    expressions and use these results to formulate a convex problem for
    configuring transponders in which the transmit optical power is an
    optimization variable. Simulation results demonstrate that power consumption
    is reduced by 9% when the proposed routing and grooming algorithm is applied
    to the European Cost239 network with an aggregate traffic of 60 Tbps. It is
    shown that our convex formulation for transponder parameter assignment is
    considerably faster than its MINLP counterpart, and that its ability to
    optimize the transmit optical power improves transponder power consumption by
    8% for an aggregate traffic of 60 Tbps. Furthermore, we investigate the effect
    of adaptive modulation assignment and transponder capacity on the inherent
    tradeoff between network CAPEX and OPEX.
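
    To give a flavor of the second-stage convex formulation, the following CVXPY
    sketch allocates per-channel transmit powers under per-channel rate (QoS)
    floors; the gains, rate targets, and power cap are hypothetical, and the model
    deliberately omits the paper's transponder, spectrum, and physical-layer
    details.

        import cvxpy as cp
        import numpy as np

        # Toy convex allocation: minimize total transmit power while every
        # lightpath meets a rate floor (log of an affine expression is concave,
        # so the constraint set is convex).
        gain = np.array([1.0, 0.8, 0.5, 0.9])      # hypothetical channel gains
        r_min = np.array([1.0, 1.5, 0.8, 1.2])     # hypothetical rate floors
        p = cp.Variable(4, nonneg=True)
        constraints = [cp.log(1 + cp.multiply(gain, p)) >= r_min, p <= 10]
        prob = cp.Problem(cp.Minimize(cp.sum(p)), constraints)
        prob.solve()
        print(p.value)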

    Low-Complexity Iterative Algorithms for (Discrete) Compressed Sensing

    Robert F.H. Fischer, Susanne Sparrer
    Subjects: Information Theory (cs.IT)

    We consider iterative ("turbo") algorithms for compressed sensing. First, a
    unified exposition of the different approaches available in the literature is
    given, highlighting the general principles and main differences. In
    particular, we discuss i) the estimation step (matched filter vs. optimum MMSE
    estimator), ii) the unbiasing operation (done implicitly or explicitly, and
    equivalent to the calculation of extrinsic information), and iii) thresholding
    vs. the calculation of soft values. Based on these insights, we propose a
    low-complexity but well-performing variant utilizing a Krylov-space
    approximation of the optimum linear MMSE estimator. The derivations are valid
    for any probability density of the signal vector; however, numerical results
    are shown for the discrete case. The novel algorithm shows very good
    performance and even slightly faster convergence compared to approximate
    message passing.
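
    The Krylov-space idea can be illustrated with a few conjugate-gradient (CG)
    steps applied to the LMMSE system matrix, as in the sketch below; the signal
    model, prior variance, and iteration count are illustrative choices, not the
    paper's exact algorithm.

        import numpy as np

        # Approximate the LMMSE estimate x_hat = c A^T (c A A^T + s2 I)^{-1} y
        # with a few conjugate-gradient steps; the CG iterates span a Krylov
        # space, so truncating early gives a low-complexity approximation.
        def lmmse_cg(A, y, c, s2, iters=5):
            M = c * (A @ A.T) + s2 * np.eye(A.shape[0])
            z = np.zeros(A.shape[0])
            r = y - M @ z
            p = r.copy()
            for _ in range(iters):
                Mp = M @ p
                alpha = (r @ r) / (p @ Mp)
                z = z + alpha * p
                r_new = r - alpha * Mp
                beta = (r_new @ r_new) / (r @ r)
                p = r_new + beta * p
                r = r_new
            return c * (A.T @ z)

        rng = np.random.default_rng(1)
        n, m, k = 256, 100, 10                      # toy problem dimensions
        A = rng.normal(size=(m, n)) / np.sqrt(m)
        x = np.zeros(n)
        x[rng.choice(n, k, replace=False)] = rng.normal(size=k)
        y = A @ x + 0.01 * rng.normal(size=m)
        x_hat = lmmse_cg(A, y, c=k / n, s2=1e-4)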

    Beyond Massive-MIMO: The Potential of Positioning with Large Intelligent Surfaces

    Sha Hu, Fredrik Rusek, Ove Edfors
    Comments: Submitted to IEEE Trans. on Signal Processing on Apr. 2017; 30 pages; 13 figures
    Subjects: Information Theory (cs.IT)

    We consider the potential for positioning with a system where antenna arrays
    are deployed as a large intelligent surface (LIS), a newly proposed concept
    beyond massive MIMO in which future man-made structures are electronically
    active, with integrated electronics and wireless communication making the
    entire environment "intelligent". In a first step, we derive the Fisher
    information and Cramér-Rao lower bounds (CRLBs) in closed form for
    positioning a terminal located perpendicular to the center of the LIS, whose
    location we refer to as being on the central perpendicular line (CPL) of the
    LIS. For a terminal that is not on the CPL, closed-form expressions of the
    Fisher information and CRLB seem out of reach, and we instead derive
    approximations of them which are shown to be accurate. Under mild conditions,
    we show that the CRLB for all three Cartesian dimensions (x, y, and z)
    decreases quadratically in the surface area of the LIS, except for a terminal
    exactly on the CPL, where the CRLB for the z-dimension (distance from the
    LIS) decreases linearly in the same. In a second step, we analyze the CRLB
    for positioning when there is an unknown phase φ present in the analog
    circuits of the LIS. We then show that the CRLBs are dramatically increased
    for all three dimensions but decrease in the third order of the surface area.
    Moreover, with an infinitely large LIS the CRLB for the z-dimension with an
    unknown φ is 6 dB higher than in the case without phase uncertainty, and the
    CRLB for estimating φ converges to a constant that is independent of the
    wavelength λ. Finally, we extensively discuss the impact of centralized and
    distributed deployments of the LIS, and show that a distributed deployment
    can enlarge the coverage for terminal positioning and improve the overall
    positioning performance.
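
    For readers less familiar with these bounds, the quantities involved follow
    the standard Fisher-information machinery (textbook definitions, not the
    paper's closed-form expressions): with θ the unknown terminal position and
    p(r; θ) the likelihood of the received signal,

    $$\mathbf{J}(\boldsymbol{\theta}) = \mathbb{E}\!\left[\nabla_{\boldsymbol{\theta}}
    \ln p(\mathbf{r};\boldsymbol{\theta})\, \nabla_{\boldsymbol{\theta}}
    \ln p(\mathbf{r};\boldsymbol{\theta})^{\mathsf{T}}\right], \qquad
    \operatorname{Cov}(\hat{\boldsymbol{\theta}}) \succeq \mathbf{J}(\boldsymbol{\theta})^{-1},$$

    so the CRLB for, e.g., the z-coordinate is the diagonal entry
    $[\mathbf{J}^{-1}]_{zz}$; the scaling statements above describe how these
    diagonal entries shrink with the LIS surface area.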

    Syndrome-Coupled Rate-Compatible Error-Correcting Codes

    Pengfei Huang, Yi Liu, Xiaojie Zhang, Paul H. Siegel, Erich F. Haratsch
    Comments: Submitted to ITW 2017
    Subjects: Information Theory (cs.IT)

    Rate-compatible error-correcting codes (ECCs), which consist of a set of
    extended codes, are of practical interest in both wireless communications and
    data storage. In this work, we first study the lower bounds for rate-compatible
    ECCs, thus proving the existence of good rate-compatible codes. Then, we
    propose a general framework for constructing rate-compatible ECCs based on
    cosets and syndromes of a set of nested linear codes. We evaluate our
    construction from two points of view. From a combinatorial perspective, we show
    that we can construct rate-compatible codes with increasing minimum distances.
    From a probabilistic point of view, we prove that we are able to construct
    capacity-achieving rate-compatible codes.
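
    One plausible reading of the coset/syndrome construction can be sketched in a
    few lines of Python: given nested binary codes C2 ⊂ C1, the extra parity
    checks satisfied by C2 (but not necessarily by C1) generate incremental
    syndrome bits that identify the coset of C2 containing the transmitted C1
    codeword. The matrices and codeword below are hypothetical toy values, and
    this is only an illustration of the idea, not the paper's exact construction.

        import numpy as np

        # Incremental redundancy via syndromes of a nested code (toy GF(2)
        # sketch): H_extra holds the parity checks of the stronger code C2 that
        # go beyond those of C1; its syndrome identifies the coset of C2 that
        # contains the transmitted C1 codeword.
        def incremental_bits(codeword_c1, H_extra):
            return (H_extra @ codeword_c1) % 2

        c = np.array([1, 0, 1, 1, 0, 0, 1])             # hypothetical C1 codeword
        H_extra = np.array([[1, 1, 0, 1, 0, 1, 0],      # hypothetical extra checks
                            [0, 1, 1, 0, 1, 0, 1]])
        print(incremental_bits(c, H_extra))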

    Antenna Arrays for Line-of-Sight Massive MIMO: Half Wavelength is not Enough

    Daniele Pinchera, Marco Donald Migliore, Fulvio Schettino, Gaetano Panariello
    Comments: 9 Pages, 19 figures, to be submitted for publication
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    The aim of this paper is to analyse linear array synthesis for 5G massive
    MIMO systems in the line-of-sight working condition. The main result of this
    paper is that non-uniform arrays are the natural choice in this kind of
    application. In particular, by using non-equispaced arrays we show that it is
    possible to achieve a better average condition number of the channel matrix.
    Furthermore, we verify that increasing the array size is beneficial also for
    circular arrays, and we provide some useful rules of thumb for antenna array
    design for massive MIMO applications. These results are in contrast with the
    widely accepted idea in the 5G massive MIMO literature, in which the
    half-wavelength uniform linear array is universally adopted.
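
    The role of the condition number can be reproduced with a short numerical
    experiment: build the LoS channel matrix from the array steering vectors of a
    few users and compare its condition number for different element spacings.
    The geometry and angles below are invented for illustration and do not
    reproduce the paper's synthesis procedure.

        import numpy as np

        # Condition number of a LoS channel for a uniform linear array serving
        # several single-antenna users; a larger aperture (wider spacing) tends
        # to decorrelate the users' steering vectors.
        def los_condition_number(n_antennas, spacing_wl, user_angles_deg):
            n = np.arange(n_antennas)
            A = np.exp(1j * 2 * np.pi * spacing_wl *
                       np.outer(n, np.sin(np.deg2rad(user_angles_deg))))
            return np.linalg.cond(A)

        angles = [-10.0, -3.0, 5.0, 12.0]       # closely spaced users (toy values)
        for d in (0.5, 1.0, 2.0):               # element spacing in wavelengths
            print(d, los_condition_number(64, d, angles))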

    Joint Uplink and Downlink Coverage Analysis of Cellular-based RF-powered IoT Network

    Mustafa A. Kishk, Harpreet S. Dhillon
    Subjects: Information Theory (cs.IT)

    Ambient radio frequency (RF) energy harvesting has emerged as a promising
    solution for powering small devices and sensors in the massive Internet of
    Things (IoT) ecosystem due to its ubiquity and cost efficiency. In this paper,
    we
    study joint uplink and downlink coverage of cellular-based ambient RF energy
    harvesting IoT where the cellular network is assumed to be the only source of
    RF energy. We consider a time division-based approach for power and information
    transmission where each time-slot is partitioned into three sub-slots: (i)
    charging sub-slot during which the cellular base stations (BSs) act as RF
    chargers for the IoT devices, which then use the energy harvested in this
    sub-slot for information transmission and/or reception during the remaining two
    sub-slots, (ii) downlink sub-slot during which the IoT device receives
    information from the associated BS, and (iii) uplink sub-slot during which the
    IoT device transmits information to the associated BS. For this setup, we
    characterize the joint coverage probability, which is the joint probability of
    the events that the typical device harvests sufficient energy in the given time
    slot and is under both uplink and downlink signal-to-interference-plus-noise
    ratio (SINR) coverage with respect to its associated BS. This metric
    significantly generalizes the prior art on energy harvesting communications,
    which usually focused on downlink or uplink coverage separately. The key
    technical challenge is in handling the correlation between the amount of energy
    harvested in the charging sub-slot and the information signal quality (SINR) in
    the downlink and uplink sub-slots. A dominant-BS-based approach is developed
    to derive a tight approximation for this joint coverage probability. Several
    system design insights, including a comparison with a regularly powered IoT
    network and throughput-optimal slot partitioning, are also provided.
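
    The three-sub-slot protocol itself is easy to state as a toy feasibility
    check: a device is in joint coverage if the energy harvested in the charging
    sub-slot covers its uplink transmission and both link SINRs exceed their
    thresholds. The sketch below uses normalized, invented numbers and a single
    fixed interference value per link, whereas the paper treats these quantities
    as correlated random variables via stochastic geometry.

        # Toy joint-coverage check for one device over one time slot
        # (illustrative model, not the paper's analysis).
        def joint_coverage(p_bs, g_down, g_up, i_dl, i_ul, tau_c, tau_u,
                           eta=0.7, noise=1e-3, e_min=0.01,
                           sinr_dl_th=1.0, sinr_ul_th=1.0):
            e_harvested = eta * p_bs * g_down * tau_c      # charging sub-slot
            p_device = e_harvested / tau_u                 # power left for uplink
            sinr_dl = p_bs * g_down / (i_dl + noise)       # downlink sub-slot
            sinr_ul = p_device * g_up / (i_ul + noise)     # uplink sub-slot
            return (e_harvested >= e_min and
                    sinr_dl >= sinr_dl_th and sinr_ul >= sinr_ul_th)

        print(joint_coverage(p_bs=1.0, g_down=0.1, g_up=0.1,
                             i_dl=0.01, i_ul=1e-4, tau_c=0.5, tau_u=0.25))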

    Asymptotic Average Number of Different Categories of Trapping Sets, Absorbing Sets and Stopping Sets in Random Regular and Irregular LDPC Code Ensembles

    Ali Dehghan, Amir H. Banihashemi
    Subjects: Information Theory (cs.IT)

    The performance of low-density parity-check (LDPC) codes in the error floor
    region is closely related to some combinatorial structures of the code’s Tanner
    graph, collectively referred to as trapping sets (TSs). In this paper, we
    study the asymptotic average number of different types of trapping sets, such
    as elementary TSs (ETS), leafless ETSs (LETS), absorbing sets (ABS),
    elementary ABSs (EABS), and stopping sets (SS), in random
    variable-regular and irregular LDPC code ensembles. We demonstrate that,
    regardless of the type of the TS, as the code’s length tends to infinity, the
    average number of a given structure tends to infinity, to a positive constant,
    or to zero, if the structure contains no cycle, only one cycle, or more than
    one cycle, respectively. For the case where the structure contains a single
    cycle, we obtain an estimate of the expected number of the structure through
    the available approximations for the average number of its constituent cycle.
    These estimates, which are independent of the block length and only depend on
    the code’s degree distributions, are shown to be accurate even for
    finite-length codes.

    Interference Alignment with Power Splitting Relays in Multi-User Multi-Relay Networks

    Man Chu, Biao He, Xuewen Liao, Zhenzhen Gao, Shihua Zhu
    Comments: 6 pages, 6 figures, to appear in Proc. VTC Fall, Toronto, Canada
    Subjects: Information Theory (cs.IT)

    In this paper, we study a multi-user multi-relay interference-channel
    network, where energy-constrained relays harvest energy from sources’ radio
    frequency (RF) signals and use the harvested energy to forward the information
    to destinations. We adopt the interference alignment (IA) technique to address
    the issue of interference, and propose a novel transmission scheme with the IA
    at sources and the power splitting (PS) at relays. A distributed and iterative
    algorithm to obtain the optimal PS ratios is further proposed, aiming at
    maximizing the sum rate of the network. The analysis is then validated by
    simulation results. Our results show that the proposed scheme with the optimal
    design significantly improves the performance of the network.
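
    The power-splitting trade-off at a relay can be illustrated with a generic
    model (not the paper's joint IA design): a fraction ρ of the received power
    is harvested while the remaining 1-ρ is used for decoding, and the harvested
    energy then sets the relay's forward transmit power. All numbers below are
    toy values.

        import numpy as np

        # Power splitting at an energy-harvesting relay: the decoding SNR falls
        # and the forwarding power grows as the PS ratio rho increases.
        def relay_rates(p_source, g_sr, g_rd, rho, eta=0.6, noise=1e-3):
            sinr_decode = (1 - rho) * p_source * g_sr / noise   # decoding branch
            p_relay = eta * rho * p_source * g_sr               # harvested power
            sinr_forward = p_relay * g_rd / noise               # relay-to-destination
            return np.log2(1 + sinr_decode), np.log2(1 + sinr_forward)

        for rho in (0.2, 0.5, 0.8):     # sweep the PS ratio (toy sweep)
            print(rho, relay_rates(p_source=1.0, g_sr=0.5, g_rd=0.3, rho=rho))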

    Why Noise and Dispersion may Seriously Hamper Nonlinear Frequency-Division Multiplexing

    Stella Civelli, Enrico Forestieri, Marco Secondini
    Comments: The manuscript has been submitted to Photonics Technology Letters for publication
    Subjects: Information Theory (cs.IT); Applied Physics (physics.app-ph)

    The performance of optical fiber systems based on nonlinear
    frequency-division multiplexing (NFDM) or on more conventional transmission
    techniques is compared through numerical simulations. Some critical issues
    affecting NFDM systems, namely the strict requirements needed to avoid burst
    interaction due to signal dispersion and the unfavorable dependence of
    performance on burst length, are investigated, highlighting their potentially
    disruptive effect in terms of spectral efficiency. Two digital processing
    techniques are finally proposed to halve the guard time between NFDM symbol
    bursts and reduce the size of the processing window at the receiver, increasing
    spectral efficiency and reducing computational complexity.

    Technical Report – Millimeter Wave Communication in Vehicular Networks: Coverage and Connectivity Analysis

    Marco Giordani, Andrea Zanella, Michele Zorzi
    Comments: This document is a preliminary report which describes the mathematical model developed to carry out the results proposed in our work “Millimeter Wave Communication in Vehicular Networks: Challenges and Opportunities”, accepted to International Conference on Modern Circuits and Systems Technologies (MOCAST), 2017. A more updated version of this technical report will appear in the next weeks
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    In this technical report (TR), we describe the mathematical model we
    developed to carry out a preliminary coverage and connectivity analysis in an
    automotive communication scenario based on mmWave links. The purpose is to
    exemplify some of the complex and interesting tradeoffs that have to be
    considered when designing solutions for mmWave automotive scenarios.



