
    arXiv Paper Daily: Tue, 10 Jan 2017

    Published by 我爱机器学习 (52ml.net) on 2017-01-10 00:00:00

    Neural and Evolutionary Computing

    Classification Accuracy Improvement for Neuromorphic Computing Systems with One-level Precision Synapses

    Yandan Wang, Wei Wen, Linghao Song, Hai Li
    Comments: Best Paper Award of ASP-DAC 2017
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Brain inspired neuromorphic computing has demonstrated remarkable advantages
    over traditional von Neumann architecture for its high energy efficiency and
    parallel data processing. However, the limited resolution of synaptic weights
    degrades system accuracy and thus impedes the use of neuromorphic systems. In
    this work, we propose three orthogonal methods to learn synapses with one-level
    precision, namely, distribution-aware quantization, quantization regularization
    and bias tuning, to make image classification accuracy comparable to the
    state-of-the-art. Experiments on both multi-layer perceptron and convolutional
    neural networks show that the accuracy drop can be well controlled within 0.19%
    (5.53%) for the MNIST (CIFAR-10) database, compared to an ideal system without
    quantization.
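
    As a rough illustration of the idea, the following NumPy sketch quantizes
    weights to one-level precision with a distribution-aware magnitude and adds
    a quantization-regularization penalty; the magnitude rule (mean absolute
    value) and the penalty weight are illustrative assumptions, not the paper's
    exact formulation.

    import numpy as np

    def distribution_aware_quantize(w):
        # One-level precision: weights collapse to {-a, +a}, with the level
        # a drawn from the weight distribution (assumed heuristic: mean
        # absolute value).
        a = np.abs(w).mean()
        return a * np.sign(w)

    def quantization_regularizer(w, lam=1e-4):
        # Penalize the distance of each weight to its nearest quantization
        # level, pulling the weight distribution toward the levels.
        return lam * np.sum((w - distribution_aware_quantize(w)) ** 2)

    w = np.random.randn(256, 128) * 0.1
    print(quantization_regularizer(w))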

    Structural Attention Neural Networks for improved sentiment analysis

    Filippos Kokkinos, Alexandros Potamianos
    Comments: Submitted to EACL2017 for review
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)

    We introduce a tree-structured attention neural network for sentences and
    small phrases and apply it to the problem of sentiment classification. Our
    model extends current recursive models by incorporating structural
    information around a node of a syntactic tree using both bottom-up and top-down
    information propagation. Also, the model utilizes structural attention to
    identify the most salient representations during the construction of the
    syntactic tree. To our knowledge, the proposed models achieve state-of-the-art
    performance on the Stanford Sentiment Treebank dataset.
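
    A minimal sketch of the attention mechanism over a node's children,
    assuming generic dot-product scoring; the paper's exact scoring function
    and the bottom-up/top-down propagation are not reproduced here.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def structural_attention(children, query):
        # children: (n, d) representations of a node's children in the
        # syntactic tree; query: (d,) context vector scoring salience.
        alpha = softmax(children @ query)   # attention weights over children
        return alpha @ children             # attended node summary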


    Computer Vision and Pattern Recognition

    Visual Multiple-Object Tracking for Unknown Clutter Rate

    Du Yong Kim
    Comments: 10 pages, 2 figures, 2 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In most multi-object tracking algorithms, tuning of model parameters is of
    critical importance for reliable performance. In particular, we are interested
    in designing a robust tracking algorithm that is able to handle an unknown
    false measurement rate. The proposed algorithm is based on the coupling of two random
    finite set filters that share tracking parameters. Performance evaluation with
    visual surveillance and cell microscopy images demonstrates the effectiveness
    of the tracking algorithm for real-world scenarios.

    Multiple Instance Hybrid Estimator for Learning Target Signatures

    Changzhe Jiao, Alina Zare
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Signature-based detectors for hyperspectral target detection rely on knowing
    the specific target signature in advance. However, target signatures are often
    difficult or impossible to obtain. Furthermore, common methods for obtaining
    target signatures, such as from laboratory measurements or manual selection
    from an image scene, usually do not capture the discriminative features of the
    target class. In this paper, an approach for estimating a discriminative target
    signature from imprecise labels is presented. The proposed approach maximizes
    the response of the hybrid sub-pixel detector within a multiple instance
    learning framework and estimates a set of discriminative target signatures.
    After learning target signatures, any signature based detector can then be
    applied on test data. Both simulated and real hyperspectral target detection
    experiments are shown to illustrate the effectiveness of the method.

    A Learning-based Variable Size Part Extraction Architecture for 6D Object Pose Recovery in Depth

    Caner Sahin, Rigas Kouskouridas, Tae-Kyun Kim
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    State-of-the-art techniques for 6D object pose recovery depend on
    occlusion-free point clouds to accurately register objects in 3D space. To deal
    with this shortcoming, we introduce a novel architecture called Iterative Hough
    Forest with Histogram of Control Points that is capable of estimating the 6D
    pose of occluded and cluttered objects given a candidate 2D bounding box. Our
    Iterative Hough Forest (IHF) is learnt using parts extracted only from the
    positive samples. These parts are represented with Histogram of Control Points
    (HoCP), a “scale-variant” implicit volumetric description, which we derive from
    recently introduced Implicit B-Splines (IBS). The rich discriminative
    information provided by the scale-variant HoCP features is leveraged during
    inference. An automatic variable-size part extraction framework iteratively
    refines the object’s initial pose, which is roughly aligned by the extraction
    of the coarsest parts, the ones occupying the largest area in image pixels. The
    iterative refinement is accomplished based on finer (smaller) parts that are
    represented with more discriminative control point descriptors by using our
    Iterative Hough Forest. Experiments conducted on a publicly available dataset
    show that our approach achieves better registration performance than the
    state-of-the-art methods.

    Light Field Super-Resolution Via Graph-Based Regularization

    Mattia Rossi, Pascal Frossard
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

    Light field cameras can capture the 3D information in a scene with a single
    shot. This special feature makes light field cameras very appealing for a
    variety of applications: from the popular post-capture refocus, to depth
    estimation and image-based rendering. However, light field cameras suffer by
    design from strong limitations in their spatial resolution, which should
    therefore be augmented by computational methods. On the one hand, off-the-shelf
    single-frame and multi-frame super-resolution algorithms are not ideal for
    light field data, as they do not consider its particular structure. On the
    other hand, the few super-resolution algorithms explicitly tailored for light
    field data exhibit significant limitations, such as the need to estimate an
    explicit disparity map at each view. In this work we propose a new light field
    super-resolution algorithm meant to address these limitations. We adopt an
    approach akin to multi-frame super-resolution, where the complementary
    information in the different light field views is used to augment the spatial
    resolution of the whole light field. We show that coupling the multi-frame
    approach with a graph regularizer that enforces the light field structure via
    nonlocal self-similarities makes it possible to avoid the costly and
    challenging disparity estimation step for all the views. Extensive experiments show that
    the proposed algorithm compares favorably to the other state-of-the-art methods
    for light field super-resolution, both in terms of PSNR and in terms of visual
    quality. Moreover, unlike the other light field super-resolution methods, the
    new algorithm provides reconstructed light field views with uniform quality,
    which is an important feature for any light field application.
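
    A minimal sketch of the underlying optimization, assuming the degradation
    is given as an explicit matrix A (blur plus downsampling) and the graph
    Laplacian L has been built from nonlocal self-similarities across views;
    plain gradient descent stands in for whatever solver the paper uses.

    import numpy as np

    def graph_regularized_sr(A, y, L, lam=0.1, lr=1e-3, iters=500):
        # Minimize ||A x - y||^2 + lam * x^T L x over the vectorized
        # high-resolution view x.
        x = A.T @ y                       # simple back-projection start
        for _ in range(iters):
            grad = 2 * A.T @ (A @ x - y) + 2 * lam * (L @ x)
            x -= lr * grad
        return x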

    Discrete approximations of affine Gaussian receptive fields

    Tony Lindeberg
    Comments: 31 pages, 7 figures, submitted to Journal of Mathematical Imaging and Vision in October 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a theory for discretizing the affine Gaussian scale-space
    concept so that scale-space properties hold also for the discrete
    implementation.

    Two ways of discretizing spatial smoothing with affine Gaussian kernels are
    presented: (i) by solving a semi-discretized affine diffusion equation as derived
    by necessity from the requirement of a semi-group structure over a continuum of
    scale parameters as parameterized by a family of spatial covariance matrices
    and obeying non-creation of new structures from any finer to any coarser scale
    as formalized by the requirement of non-enhancement of local extrema and (ii) a
    set of parameterized 3×3-kernels as derived from an additional discretization
    of the above theory along the scale direction and with the parameters of the
    kernels having a direct interpretation in terms of the covariance matrix of the
    composed discrete smoothing operation.

    We show how convolutions with the first family of kernels can be implemented
    in terms of a closed form expression for the Fourier transform and analyse how
    a remaining degree of freedom in the theory can be explored to ensure a
    positive discretization and optionally also achieve higher-order discrete
    approximation of the angular dependency of the shapes of the affine Gaussian
    kernels.

    We also show how discrete directional derivative approximations can be
    efficiently implemented to approximate affine Gaussian derivatives as
    constituting a canonical model for receptive fields over a purely spatial image
    domain and with close relations to receptive fields in biological vision.
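
    A minimal sketch of the Fourier-domain implementation, assuming the
    continuous transform ghat(w) = exp(-w^T Sigma w / 2) sampled on the DFT
    grid; the paper derives the properly discretized counterpart.

    import numpy as np

    def affine_gaussian_smooth(img, Sigma):
        # Smooth a 2D image with an affine Gaussian of covariance Sigma
        # by multiplication in the Fourier domain.
        h, w = img.shape
        wy = 2 * np.pi * np.fft.fftfreq(h)
        wx = 2 * np.pi * np.fft.fftfreq(w)
        WX, WY = np.meshgrid(wx, wy)
        quad = (Sigma[0, 0] * WX ** 2
                + 2 * Sigma[0, 1] * WX * WY
                + Sigma[1, 1] * WY ** 2)
        return np.real(np.fft.ifft2(np.fft.fft2(img) * np.exp(-0.5 * quad)))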

    Green-Blue Stripe Pattern for Range Sensing from a Single Image

    Changsoo Je, Kyuhyoung Choi, Sang Wook Lee
    Comments: 8 pages, 5 figures. Updated version of a conference paper
    Journal-ref: Proc. 30th Fall Semiannual Conference of Korea Information Science
    Society, vol. 2, pp. 661-663, Seoul, Korea, October, 2003
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    In this paper, we present a novel method for rapid high-resolution range
    sensing using a green-blue stripe pattern. We use green and blue for designing
    a high-frequency stripe projection pattern. For accurate and reliable range
    recovery, we identify the stripe patterns by our color-stripe segmentation and
    unwrapping algorithms. The experimental result for a naked human face shows the
    effectiveness of our method.

    Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis

    Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The recent work of Gatys et al., who characterized the style of an image by
    the statistics of convolutional neural network filters, ignited a renewed
    interest in the texture generation and image stylization problems. While their
    image generation technique uses a slow optimization process, recently several
    authors have proposed to learn generator neural networks that can produce
    similar outputs in one quick forward pass. While generator networks are
    promising, they are still inferior in visual quality and diversity compared to
    generation-by-optimization. In this work, we advance them in two significant
    ways. First, we introduce an instance normalization module to replace batch
    normalization with significant improvements to the quality of image
    stylization. Second, we improve diversity by introducing a new learning
    formulation that encourages generators to sample unbiasedly from the Julesz
    texture ensemble, which is the equivalence class of all images characterized by
    certain filter responses. Together, these two improvements take feed-forward
    texture synthesis and image stylization much closer to the quality of
    generation-by-optimization, while retaining the speed advantage.
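
    Instance normalization itself is simple; a NumPy sketch (learned affine
    parameters omitted): statistics are computed per sample and per channel
    over spatial dimensions only, rather than across the batch.

    import numpy as np

    def instance_norm(x, eps=1e-5):
        # x: (N, C, H, W) feature maps.
        mu = x.mean(axis=(2, 3), keepdims=True)
        var = x.var(axis=(2, 3), keepdims=True)
        return (x - mu) / np.sqrt(var + eps)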

    MS and PAN image fusion by combining Brovey and wavelet methods

    Hamid Reza Shahdoosti
    Comments: 9 pages; 2 figures, conference
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Among the existing fusion algorithms, the wavelet fusion method is the most
    frequently discussed one in recent publications because the wavelet approach
    preserves the spectral characteristics of the multispectral image better than
    other methods. The Brovey is also a popular fusion method used for its ability
    in preserving the spatial information of the PAN image. This study presents a
    new fusion approach that integrates the advantages of both the Brovey (which
    preserves a high degree of spatial information) and the wavelet (which
    preserves a high degree of spectral information) techniques to reduce the
    colour distortion of fusion results. Visual and statistical analyses show that
    the proposed algorithm clearly improves the merging quality in terms of
    correlation coefficient and UIQI, compared to fusion methods including IHS,
    Brovey, PCA, HPF, the discrete wavelet transform (DWT), and the à-trous wavelet.
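
    For reference, a minimal sketch of the classic Brovey transform the method
    builds on; the wavelet side of the proposed combination is not shown.

    import numpy as np

    def brovey_fusion(ms, pan, eps=1e-8):
        # ms:  (H, W, B) multispectral bands resampled to PAN resolution
        # pan: (H, W) panchromatic band
        # Each band is scaled by PAN over the MS intensity, injecting the
        # PAN spatial detail into every band.
        intensity = ms.mean(axis=2)
        return ms * (pan / (intensity + eps))[..., None]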

    Multi-spectral Image Panchromatic Sharpening, Outcome and Process Quality Assessment Protocol

    Andrea Baraldi, Francesca Despini, Sergio Teggi
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Multispectral (MS) image panchromatic (PAN) sharpening algorithms proposed to
    the remote sensing community are ever increasing in number and variety. Their
    aim is to sharpen a coarse spatial resolution MS image with a fine spatial
    resolution PAN image acquired simultaneously by a spaceborne or airborne Earth
    observation (EO) optical imaging sensor pair. Unfortunately, to date, no
    standard evaluation procedure for MS image PAN sharpening outcome and process
    is community agreed upon, in contrast with the Quality Assurance Framework for
    Earth Observation (QA4EO) guidelines proposed by the intergovernmental Group on
    Earth Observations (GEO). In general, process is easier to measure, while
    outcome is more important. The original contribution of the present study is fourfold.
    First, existing procedures for quantitative quality assessment (Q2A) of the
    (sole) PAN sharpened MS product are critically reviewed. Their conceptual and
    implementation drawbacks are highlighted to be overcome for quality
    improvement. Second, a novel (to the best of these authors’ knowledge, the
    first) protocol for Q2A of MS image PAN sharpening product and process is
    designed, implemented and validated by independent means. Third, within this
    protocol, an innovative categorization of spectral and spatial image quality
    indicators and metrics is presented. Fourth, according to this new taxonomy, an
    original third-order isotropic multi-scale gray-level co-occurrence matrix
    (TIMS-GLCM) calculator and a TIMS-GLCM texture feature extractor are proposed
    to replace popular second-order GLCMs.

    Multi-Objective Software Suite of Two-Dimensional Shape Descriptors for Object-Based Image Analysis

    Andrea Baraldi, João V. B. Soares
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In recent years, two sets of planar (two-dimensional, 2D) shape attributes,
    provided with an intuitive physical meaning, were proposed to the remote
    sensing community by, respectively, Nagao & Matsuyama and Shackelford & Davis
    in their seminal works on the increasingly popular geographic object-based
    image analysis (GEOBIA) paradigm. These two known sets of simple geometric
    features were selected as initial conditions by the present research and
    technological development software project, whose multi-objective goal was to
    accomplish: (i) a minimally dependent and maximally informative design
    (representation) of a general-purpose (application-independent) dictionary of
    intuitive 2D shape terms and (ii) an effective and efficient implementation of
    2D shape descriptors. To comply with the Quality Assurance Framework for Earth
    Observation (QA4EO) guidelines, the proposed suite of geometric functions is
    validated by means of a novel quantitative quality assurance (Q2A) policy,
    centered on inter-feature dependence (causality) assessment. This multivariate
    feature validation strategy is alternative to traditional classification
    accuracy estimation, which is inherently case-specific, and cross correlation
    estimation, because statistical cross-correlation does not imply causation. The
    project deliverable is a validated general-purpose software suite of seven
    off-the-shelf (ready-for-use) 2D shape descriptors, provided with an intuitive
    physical meaning. Therefore, it is eligible for use in (GE)OBIA systems in
    operating mode, expected to mimic human reasoning based on a convergence of
    evidence approach.
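
    For flavor, two textbook planar descriptors of the kind the suite collects,
    each with an intuitive physical meaning; the paper's actual set of seven
    descriptors is not reproduced here.

    import numpy as np

    def compactness(area, perimeter):
        # Roundness: equals 1 for a disk, decreases for elongated or
        # ragged shapes.
        return 4.0 * np.pi * area / perimeter ** 2

    def elongatedness(width, length):
        # Minor-to-major axis ratio of the fitted rectangle: 1 for a
        # square, near 0 for a thin stripe.
        return min(width, length) / max(width, length)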

    Automated Linear-Time Detection and Quality Assessment of Superpixels in Uncalibrated True- or False-Color RGB Images

    Andrea Baraldi, Dirk Tiede, Stefan Lang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Capable of automated near-real-time superpixel detection and quality
    assessment in an uncalibrated monitor-typical red-green-blue (RGB) image,
    depicted in either true or false colors, an original low-level computer vision
    (CV) lightweight computer program, called RGB Image Automatic Mapper (RGBIAM),
    is designed and implemented. Constrained by the Calibration/Validation (CalVal)
    requirements of the Quality Assurance Framework for Earth Observation (QA4EO)
    guidelines, RGBIAM requires as mandatory an uncalibrated RGB image
    pre-processing first stage, consisting of an automated statistical model-based
    color constancy algorithm. The RGBIAM hybrid inference pipeline comprises: (I)
    a direct quantitative to nominal (QN) RGB variable transform, where RGB pixel
    values are mapped onto a prior dictionary of color names, equivalent to a
    static polyhedralization of the RGB cube. Prior color naming is the deductive
    counterpart of inductive vector quantization (VQ), whose typical VQ error
    function to minimize is a root mean square error (RMSE). In the output
    multi-level color map domain, superpixels are automatically detected in linear time
    as connected sets of pixels featuring the same color label. (II) An inverse
    nominal to quantitative (NQ) RGB variable transform, where a superpixelwise
    constant RGB image approximation is generated in linear time to assess a VQ
    error image. The hybrid direct and inverse RGBIAM QNQ transform is: (i)
    general-purpose, data- and application-independent; (ii) automated, i.e., it
    requires no user-machine interaction; (iii) near-real-time, with a
    computational complexity increasing linearly with the image size; (iv)
    implemented in tile streaming mode, to cope with massive images. Collected outcome and process quality
    indicators, including degree of automation, computational efficiency, VQ rate
    and VQ error, are consistent with theoretical expectations.
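
    A minimal sketch of the QNQ idea, with a nearest-prototype lookup standing
    in for the static polyhedralization of the RGB cube; the palette is an
    assumed input.

    import numpy as np

    def qnq_transform(img, palette):
        # img: (H, W, 3) RGB image; palette: (K, 3) prior color names.
        # Direct QN step: map each pixel to its nearest color name.
        d = np.linalg.norm(img[..., None, :] - palette, axis=-1)
        labels = d.argmin(axis=-1)
        # Inverse NQ step: piecewise-constant approximation, from which
        # the VQ error (RMSE) is assessed.
        approx = palette[labels]
        rmse = np.sqrt(np.mean((img - approx) ** 2))
        return labels, rmse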

    Stage 4 validation of the Satellite Image Automatic Mapper lightweight computer program for Earth observation Level 2 product generation, Part 2 Validation

    Andrea Baraldi, Michael Laurence Humber, Dirk Tiede, Stefan Lang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The European Space Agency (ESA) defines an Earth Observation (EO) Level 2
    product as a multispectral (MS) image corrected for geometric, atmospheric,
    adjacency and topographic effects, stacked with its scene classification map
    (SCM) whose legend includes quality layers such as cloud and cloud-shadow. No
    ESA EO Level 2 product has ever been systematically generated at the ground
    segment. To contribute toward filling an information gap from EO big sensory
    data to the ESA EO Level 2 product, a Stage 4 validation (Val) of an
    off-the-shelf Satellite Image Automatic Mapper (SIAM) lightweight computer
    program for prior-knowledge-based MS color naming was conducted by independent
    means. A
    time-series of annual Web Enabled Landsat Data (WELD) image composites of the
    conterminous U.S. (CONUS) was selected as input dataset. The annual SIAM WELD
    maps of the CONUS were validated in comparison with the U.S. National Land
    Cover Data (NLCD) 2006 map. These test and reference maps share the same
    spatial resolution and spatial extent, but their map legends are not the same
    and must be harmonized. For the sake of readability, this paper is split into
    two parts. The previous Part 1 Theory provided the multidisciplinary background of a
    priori color naming. The present Part 2 Validation presents and discusses Stage
    4 Val results collected from the test SIAM WELD map time series and the
    reference NLCD map by an original protocol for wall-to-wall thematic map
    quality assessment without sampling, where the test and reference map legends
    can differ, in agreement with Part 1. Conclusions are that the SIAM-WELD
    maps instantiate a Level 2 SCM product whose legend is the FAO Land Cover
    Classification System (LCCS) taxonomy at the Dichotomous Phase (DP) Level 1
    vegetation/nonvegetation, Level 2 terrestrial/aquatic or superior LCCS level.

    Stage 4 validation of the Satellite Image Automatic Mapper lightweight computer program for Earth observation Level 2 product generation, Part 1 Theory

    Andrea Baraldi, Michael Laurence Humber, Dirk Tiede, Stefan Lang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The European Space Agency (ESA) defines an Earth Observation (EO) Level 2
    product as a multispectral (MS) image corrected for geometric, atmospheric,
    adjacency and topographic effects, stacked with its scene classification map
    (SCM), whose legend includes quality layers such as cloud and cloud-shadow. No
    ESA EO Level 2 product has ever been systematically generated at the ground
    segment. To contribute toward filling an information gap from EO big data to
    the ESA EO Level 2 product, an original Stage 4 validation (Val) of the
    Satellite Image Automatic Mapper (SIAM) lightweight computer program was
    conducted by independent means on an annual Web-Enabled Landsat Data (WELD)
    image composite time-series of the conterminous U.S. The core of SIAM is a
    one-pass prior-knowledge-based decision tree for MS reflectance space
    hyperpolyhedralization into static color names, presented in the literature in
    recent years. For the sake of readability, this paper is split into two parts.
    present Part 1 Theory provides the multidisciplinary background of a priori
    color naming in cognitive science, from linguistics to computer vision. To cope
    with dictionaries of MS color names and land cover class names that do not
    coincide and must be harmonized, an original hybrid guideline is proposed to
    identify a categorical variable pair relationship. An original quantitative
    measure of categorical variable pair association is also proposed. The
    subsequent Part 2 Validation discusses Stage 4 Val results collected by an
    original protocol for wall-to-wall thematic map quality assessment without
    sampling where the test and reference map legends can differ. Conclusions are
    that the SIAM-WELD maps instantiate a Level 2 SCM product whose legend is the 4
    class taxonomy of the FAO Land Cover Classification System at the Dichotomous
    Phase Level 1 vegetation/nonvegetation and Level 2 terrestrial/aquatic.

    On Classification of Distorted Images with Deep Convolutional Neural Networks

    Yiren Zhou, Sibo Song, Ngai-Man Cheung
    Comments: 5 pages, 8 figures, ICASSP 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Image blur and image noise are common distortions during image acquisition.
    In this paper, we systematically study the effect of image distortions on the
    deep neural network (DNN) image classifiers. First, we examine the DNN
    classifier performance under four types of distortions. Second, we propose two
    approaches to alleviate the effect of image distortion: re-training and
    fine-tuning with noisy images. Our results suggest that, under certain
    conditions, fine-tuning with noisy images can alleviate much of the effect of
    distorted inputs, and is more practical than re-training.
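
    A minimal sketch of the fine-tuning data preparation, assuming Gaussian
    blur and additive Gaussian noise as the distortions; hyperparameters are
    illustrative.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def distort(img, blur_sigma=1.0, noise_std=0.05, rng=None):
        # Produce a blurred, noisy copy of a [0, 1] image so that a
        # pretrained classifier can be fine-tuned on distorted inputs.
        rng = np.random.default_rng() if rng is None else rng
        noisy = gaussian_filter(img, sigma=blur_sigma) \
            + rng.normal(0.0, noise_std, img.shape)
        return np.clip(noisy, 0.0, 1.0)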

    Random Sampling and Locality Constraint for Face Sketch

    Nannan Wang, Xinbo Gao, Jie Li
    Comments: A Submission to IEEE Transactions on Neural Networks and Learning Systems
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Face sketch synthesis plays an important role in both digital entertainment
    and law enforcement. It generally consists of two parts: neighbor selection and
    reconstruction weight representation. State-of-the-art face sketch synthesis
    methods perform neighbor selection online in a data-driven manner by K
    nearest neighbor (K-NN) search. However, the online search increases the
    time consumption of synthesis. Moreover, since these methods need to traverse
    the whole training dataset for neighbor selection, the computational complexity
    increases with the scale of the training database and hence these methods have
    limited scalability. In addition, state-of-the-art methods consider that all
    selected neighbors contribute equally to the reconstruction weight computation
    process, while the distinct similarity between the test patch and these
    neighbors is neglected. In this paper, we employ offline random sampling in
    place of online K-NN search to improve the synthesis efficiency. A locality
    constraint is introduced to model the distinct correlations between the test
    patch and randomly sampled patches. Extensive experiments on public face sketch
    databases demonstrate the superiority of the proposed method in comparison to
    state-of-the-art methods, in terms of both synthesis quality and time
    consumption. The proposed method could be extended to other heterogeneous face
    image transformation problems such as face hallucination. We will release the
    source codes of our proposed methods and the evaluation metrics for future
    study online: this http URL
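
    A minimal sketch of locality-constrained reconstruction weights in the
    style of locality-constrained linear coding, which may differ in detail
    from the paper's formulation; nearer sampled patches receive larger
    weights.

    import numpy as np

    def locality_constrained_weights(patch, samples, lam=1e-2):
        # patch: (d,) test patch; samples: (n, d) randomly sampled patches.
        diff = samples - patch                  # center on the test patch
        C = diff @ diff.T                       # local covariance
        dist = np.linalg.norm(diff, axis=1)     # locality penalty per sample
        w = np.linalg.solve(C + lam * np.diag(dist ** 2),
                            np.ones(len(samples)))
        return w / w.sum()                      # sum-to-one constraint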

    Tracking The Untrackable: Learning To Track Multiple Cues with Long-Term Dependencies

    Amir Sadeghian, Alexandre Alahi, Silvio Savarese
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a multi-cue metric learning framework to tackle the popular yet
    unsolved Multi-Object Tracking (MOT) problem. One of the key challenges of
    tracking methods is to effectively compute a similarity score that models
    multiple cues from the past such as object appearance, motion, or even
    interactions. This is particularly challenging when objects get occluded or
    share similar appearance properties with surrounding objects. To address this
    challenge, we cast the problem as a metric learning task that jointly reasons
    on multiple cues across time. Our framework learns to encode long-term temporal
    dependencies across multiple cues with a hierarchical Recurrent Neural Network.
    We demonstrate the strength of our approach by tracking multiple objects using
    their appearance, motion, and interactions. Our method outperforms previous
    works by a large margin on multiple publicly available datasets including the
    challenging MOT benchmark.

    Urban Scene Segmentation with Laser-Constrained CRFs

    Charika De Alvis, Lionel Ott, Fabio Ramos
    Comments: In International Conference On Intelligent Robots and Systems, 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Robots typically possess sensors of different modalities, such as colour
    cameras, inertial measurement units, and 3D laser scanners. Often, solving a
    particular problem becomes easier when more than one modality is used. However,
    while there are undeniable benefits to combining sensors of different
    modalities, the process tends to be complicated. Segmenting scenes observed by the robot
    into a discrete set of classes is a central requirement for autonomy as
    understanding the scene is the first step to reason about future situations.
    Scene segmentation is commonly performed using either image data or 3D point
    cloud data. In computer vision many successful methods for scene segmentation
    are based on conditional random fields (CRF) where the maximum a posteriori
    (MAP) solution to the segmentation can be obtained by inference. In this paper
    we devise a new CRF inference method for scene segmentation that incorporates
    global constraints, enforcing that sets of nodes are assigned the same class
    label. To do this efficiently, the CRF is formulated as a relaxed quadratic
    program whose MAP solution is found using a gradient-based optimisation
    approach. The proposed method is evaluated on images and 3D point cloud data
    gathered in urban environments where image data provides the appearance
    features needed by the CRF, while the 3D point cloud data provides global
    spatial constraints over sets of nodes. Comparisons with belief propagation,
    conventional quadratic programming relaxation, and higher order potential CRF
    show the benefits of the proposed method.
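
    A minimal sketch of MAP inference on such a relaxation, assuming a generic
    pairwise smoothness term through the graph Laplacian and projected gradient
    descent onto the probability simplex; the paper's laser-derived global
    constraints over sets of nodes are not modeled here.

    import numpy as np

    def project_simplex(v):
        # Row-wise Euclidean projection onto the probability simplex.
        u = np.sort(v, axis=1)[:, ::-1]
        css = np.cumsum(u, axis=1) - 1.0
        idx = np.arange(1, v.shape[1] + 1)
        rho = (u - css / idx > 0).sum(axis=1)
        theta = css[np.arange(len(v)), rho - 1] / rho
        return np.maximum(v - theta[:, None], 0.0)

    def relaxed_crf_map(unary, W, lr=0.05, iters=200):
        # unary: (n, k) potentials; W: (n, n) pairwise edge weights.
        # Minimize <q, unary> + tr(q^T L q) over row-stochastic q.
        L = np.diag(W.sum(axis=1)) - W
        q = project_simplex(np.zeros(unary.shape))   # uniform start
        for _ in range(iters):
            q = project_simplex(q - lr * (unary + 2 * L @ q))
        return q.argmax(axis=1)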

    Group Visual Sentiment Analysis

    Zeshan Hussain, Tariq Patanam, Hardie Cate
    Comments: 7 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we introduce a framework for classifying images according to
    high-level sentiment. We subdivide the task into three primary problems:
    emotion classification on faces, human pose estimation, and 3D estimation and
    clustering of groups of people. We introduce novel algorithms for matching body
    parts to a common individual and clustering people in images based on physical
    location and orientation. Our results outperform several baseline approaches.

    Greedy Search for Descriptive Spatial Face Features

    Caner Gacav, Burak Benligiray, Cihan Topal
    Comments: International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Facial expression recognition methods use a combination of geometric and
    appearance-based features. Spatial features are derived from displacements of
    facial landmarks, and carry geometric information. These features are either
    selected based on prior knowledge, or dimension-reduced from a large pool. In
    this study, we produce a large number of potential spatial features using two
    combinations of facial landmarks. Among these, we search for a descriptive
    subset of features using sequential forward selection. The chosen feature
    subset is used to classify facial expressions in the extended Cohn-Kanade
    dataset (CK+), and delivers 88.7% recognition accuracy without using any
    appearance-based features.
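
    Sequential forward selection itself is a simple greedy loop; a sketch,
    where score_fn is an assumed callback (e.g., cross-validated recognition
    accuracy of a classifier trained on the candidate subset).

    def sequential_forward_selection(candidates, score_fn, k):
        selected, remaining = [], list(candidates)
        while remaining and len(selected) < k:
            # Greedily add the feature that most improves the subset score.
            best = max(remaining, key=lambda f: score_fn(selected + [f]))
            selected.append(best)
            remaining.remove(best)
        return selected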

    DeepFace: Face Generation using Deep Learning

    Hardie Cate, Fahim Dalvi, Zeshan Hussain
    Comments: 8 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We use CNNs to build a system that both classifies images of faces based on a
    variety of different facial attributes and generates new faces given a set of
    desired facial characteristics. After introducing the problem and providing
    context in the first section, we discuss recent work related to image
    generation in Section 2. In Section 3, we describe the methods used to
    fine-tune our CNN and generate new images using a novel approach inspired by a
    Gaussian mixture model. In Section 4, we discuss our working dataset and
    describe our preprocessing steps and handling of facial attributes. Finally, in
    Sections 5, 6 and 7, we explain our experiments and results, and conclude.
    Our classification system has 82% test accuracy.
    Furthermore, our generation pipeline successfully creates well-formed faces.

    Sign Language Recognition Using Temporal Classification

    Hardie Cate, Fahim Dalvi, Zeshan Hussain
    Comments: 5 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Devices like the Myo armband available in the market today enable us to
    collect data about the position of a user’s hands and fingers over time. We can
    use these technologies for sign language translation since each sign is roughly
    a combination of gestures across time. In this work, we utilize a dataset
    collected by a group at the University of South Wales, which contains
    parameters, such as hand position, hand rotation, and finger bend, for 95
    unique signs. For each input stream representing a sign, we predict which sign
    class this stream falls into. We begin by implementing baseline SVM and
    logistic regression models, which perform reasonably well on high quality data.
    Lower quality data requires a more sophisticated approach, so we explore
    different methods in temporal classification, including long short-term memory
    architectures and sequential pattern mining methods.

    Oriented Response Networks

    Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao
    Comments: 10 pages, 10 figures. Submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep Convolutional Neural Networks (DCNNs) are capable of learning
    unprecedentedly effective image representations. However, their ability in
    handling significant local and global image rotations remains limited. In this
    paper, we propose Active Rotating Filters (ARFs) that actively rotate during
    convolution and produce feature maps with location and orientation explicitly
    encoded. An ARF acts as a virtual filter bank containing the filter itself and
    its multiple unmaterialised rotated versions. During back-propagation, an ARF
    is collectively updated using errors from all its rotated versions. DCNNs using
    ARFs, referred to as Oriented Response Networks (ORNs), can produce
    within-class rotation-invariant deep features while maintaining inter-class
    discrimination for classification tasks. The oriented response produced by ORNs
    can also be used for image and object orientation estimation tasks. Over
    multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we
    consistently observe that replacing regular filters with the proposed ARFs
    leads to significant reduction in the number of network parameters and
    improvement in classification performance. We report the best results on
    several commonly used benchmarks.

    Unsupervised Learning of Long-Term Motion Dynamics for Videos

    Zelun Luo, Boya Peng, De-An Huang, Alexandre Alahi, Li Fei-Fei
    Comments: Computer Vision and Pattern Recognition (CVPR) 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present an unsupervised representation learning approach that compactly
    encodes the motion dependencies in videos. Given a pair of images from a video
    clip, our framework learns to predict the long-term 3D motions. To reduce the
    complexity of the learning framework, we propose to describe the motion as a
    sequence of atomic 3D flows computed with RGB-D modality. We use a Recurrent
    Neural Network based Encoder-Decoder framework to predict these sequences of
    flows. We argue that in order for the decoder to reconstruct these sequences,
    the encoder must learn a robust video representation that captures long-term
    motion dependencies and spatial-temporal relations. We demonstrate the
    effectiveness of our learned temporal representations on activity
    classification across multiple modalities and datasets such as NTU RGB+D and
    MSR Daily Activity 3D. Our framework is generic to any input modality, i.e.,
    RGB, Depth, and RGB-D videos.

    Large-scale Isolated Gesture Recognition Using Convolutional Neural Networks

    Pichao Wang, Wanqing Li, Song Liu, Zhimin Gao, Chang Tang, Philip Ogunbona
    Comments: arXiv admin note: text overlap with arXiv:1608.06338
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes three simple, compact yet effective representations of
    depth sequences, referred to respectively as Dynamic Depth Images (DDI),
    Dynamic Depth Normal Images (DDNI) and Dynamic Depth Motion Normal Images
    (DDMNI). These dynamic images are constructed from a sequence of depth maps
    using bidirectional rank pooling to effectively capture the spatial-temporal
    information. Such image-based representations enable us to fine-tune the
    existing ConvNets models trained on image data for classification of depth
    sequences, without introducing large parameters to learn. Upon the proposed
    representations, a convolutional neural network (ConvNet) based method is
    developed for gesture recognition and evaluated on the Large-scale Isolated
    Gesture Recognition task at the ChaLearn Looking at People (LAP) challenge
    2016. The method achieved 55.57% classification accuracy and ranked 2nd place
    in this challenge, but was very close to the best performance even though we
    only used depth data.
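
    A minimal sketch of the pooling step, using the closed-form approximate
    rank pooling weights alpha_t = 2t - T - 1 (a common approximation) rather
    than solving the full ranking problem; the paper applies pooling in both
    temporal directions to build each dynamic image.

    import numpy as np

    def approximate_rank_pool(frames):
        # frames: (T, H, W) depth maps -> one dynamic image (H, W).
        T = frames.shape[0]
        alpha = 2.0 * np.arange(1, T + 1) - T - 1
        return np.tensordot(alpha, frames, axes=1)

    def bidirectional_dynamic_images(frames):
        # Forward and backward pooling, as in bidirectional rank pooling.
        return approximate_rank_pool(frames), approximate_rank_pool(frames[::-1])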

    Towards Accurate Multi-person Pose Estimation in the Wild

    George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan Tompson, Chris Bregler, Kevin Murphy
    Comments: Paper describing an improved version of the G-RMI entry to the 2016 COCO keypoints challenge (this http URL)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a method for multi-person detection and 2-D keypoint localization
    (human pose estimation) that achieves state-of-the-art results on the
    challenging COCO keypoints task. It is a simple, yet powerful, top-down
    approach consisting of two stages.

    In the first stage, we predict the location and scale of boxes which are
    likely to contain people; for this we use the Faster RCNN detector with an
    Inception-ResNet architecture. In the second stage, we estimate the keypoints
    of the person potentially contained in each proposed bounding box. For each
    keypoint type we predict dense heatmaps and offsets using a fully convolutional
    ResNet. To combine these outputs we introduce a novel aggregation procedure to
    obtain highly localized keypoint predictions. We also use a novel form of
    keypoint-based Non-Maximum-Suppression (NMS), instead of the cruder box-level
    NMS, and a novel form of keypoint-based confidence score estimation, instead of
    box-level scoring.

    Our final system achieves an average precision of 0.636 on the COCO test-dev
    set and 0.628 on the test-standard set, outperforming the CMU-Pose winner of the 2016
    COCO keypoints challenge. Further, by using additional labeled data we obtain
    an even higher average precision of 0.668 on the test-dev set and 0.658 on the
    test-standard set, thus achieving a roughly 10% improvement over the previous
    best performing method on the same challenge.

    Map-guided Hyperspectral Image Superpixel Segmentation Using Proportion Maps

    Hao Sun, Alina Zare
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A map-guided superpixel segmentation method for hyperspectral imagery is
    developed and introduced. The proposed approach develops a
    hyperspectral-appropriate version of the SLIC superpixel segmentation
    algorithm, leverages map information to guide segmentation, and incorporates
    the semi-supervised Partial Membership Latent Dirichlet Allocation (sPM-LDA) to
    obtain a final superpixel segmentation. The proposed method is applied to two
    real hyperspectral data sets and quantitative cluster validity metrics indicate
    that the proposed approach outperforms existing hyperspectral superpixel
    segmentation methods.

    A Framework for Wasserstein-1-Type Metrics

    Bernhard Schmitzer, Benedikt Wirth
    Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)

    We propose a unifying framework for generalising the Wasserstein-1 metric to
    a discrepancy measure between nonnegative measures of different mass. This
    generalization inherits the convexity and computational efficiency from the
    Wasserstein-1 metric, and it includes several previous approaches from the
    literature as special cases. For various specific instances of the generalized
    Wasserstein-1 metric we furthermore demonstrate their usefulness in
    applications by numerical experiments.
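
    In one dimension and for measures of equal mass, the Wasserstein-1 metric
    reduces to the L1 distance between cumulative distribution functions; a
    sketch of this classic balanced case (the paper's contribution is the
    generalization to nonnegative measures of different mass, not shown).

    import numpy as np

    def wasserstein1_1d(mu, nu):
        # mu, nu: histograms of equal total mass on a unit-spaced 1D grid.
        assert np.isclose(mu.sum(), nu.sum())
        return np.abs(np.cumsum(mu - nu)).sum()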


    Artificial Intelligence

    Just an Update on PMING Distance for Web-based Semantic Similarity in Artificial Intelligence and Data Mining

    Valentina Franzoni
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Probability (math.PR)

    One of the main problems that emerges in the classic approach to semantics is
    the difficulty in acquisition and maintenance of ontologies and semantic
    annotations. On the other hand, the Internet explosion and the massive
    diffusion of mobile smart devices have led to the creation of a worldwide
    system, whose information is checked and fueled daily by the contributions of
    millions of users who interact in a collaborative way. Search engines, continually
    exploring the Web, are a natural source of information on which to base a
    modern approach to semantic annotation. A promising idea is that it is possible
    to generalize the semantic similarity, under the assumption that semantically
    similar terms behave similarly, and define collaborative proximity measures
    based on the indexing information returned by search engines. The PMING
    Distance is a proximity measure used in data mining and information retrieval,
    whose collaborative information expresses the degree of relationship between
    two terms, using only the number of documents returned as results for a query
    on a search engine. In this work, the PMING Distance is updated, providing a
    novel formal algebraic definition, which corrects previous works. The novel
    point of view underlines that the PMING is a locally normalized linear
    combination of the Pointwise Mutual Information and the Normalized Google
    Distance. The analyzed measure dynamically reflects the collaborative changes
    made to web resources.
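
    A rough sketch of the ingredients from search-engine hit counts, assuming
    counts f(x), f(y), f(x, y) and index size N; the mixing weight and
    normalizations in pming() are illustrative placeholders, not the paper's
    algebraic definition.

    import math

    def pmi(fx, fy, fxy, N):
        # Pointwise mutual information from hit counts.
        return math.log2(fxy * N / (fx * fy))

    def ngd(fx, fy, fxy, N):
        # Normalized Google Distance (Cilibrasi & Vitanyi).
        lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
        return (max(lx, ly) - lxy) / (math.log(N) - min(lx, ly))

    def pming(fx, fy, fxy, N, mu=0.5):
        # Illustrative locally normalized linear combination of the two.
        return mu * (1.0 - pmi(fx, fy, fxy, N) / math.log2(N)) \
            + (1.0 - mu) * ngd(fx, fy, fxy, N)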

    Morphognosis: the shape of knowledge in space and time

    Thomas E. Portegys
    Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI)

    Artificial intelligence research to a great degree focuses on the brain and
    behaviors that the brain generates. But the brain, an extremely complex
    structure resulting from millions of years of evolution, can be viewed as a
    solution to problems posed by an environment existing in space and time. The
    environment generates signals that produce sensory events within an organism.
    Building an internal spatial and temporal model of the environment allows an
    organism to navigate and manipulate the environment. Higher intelligence might
    be the ability to process information coming from a larger extent of
    space-time. In keeping with nature’s penchant for extending rather than
    replacing, the purpose of the mammalian neocortex might then be to record
    events from distant reaches of space and time and render them, as though yet
    near and present, to the older, deeper brain whose instinctual roles have
    changed little over eons. Here this notion is embodied in a model called
    morphognosis (morpho = shape and gnosis = knowledge). Its basic structure is a
    pyramid of event recordings called a morphognostic. At the apex of the pyramid
    are the most recent and nearby events. Receding from the apex are less recent
    and possibly more distant events. A morphognostic can thus be viewed as a
    structure of progressively larger chunks of space-time knowledge. A set of
    morphognostics forms long-term memories that are learned by exposure to the
    environment. A cellular automaton is used as the platform to investigate the
    morphognosis model, using a simulated organism that learns and navigates its
    world in search of food, as well as the game of Pong.

    Coupled Compound Poisson Factorization

    Mehmet E. Basbug, Barbara E. Engelhardt
    Comments: Under review at AISTATS 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We present a general framework, the coupled compound Poisson factorization
    (CCPF), to capture the missing-data mechanism in extremely sparse data sets by
    coupling a hierarchical Poisson factorization with an arbitrary data-generating
    model. We derive a stochastic variational inference algorithm for the resulting
    model and, as examples of our framework, implement three different
    data-generating models—a mixture model, linear regression, and factor
    analysis—to robustly model non-random missing data in the context of
    clustering, prediction, and matrix factorization. In all three cases, we test
    our framework against models that ignore the missing-data mechanism on large
    scale studies with non-random missing data, and we show that explicitly
    modeling the missing-data mechanism substantially improves the quality of the
    results, as measured using data log likelihood on a held-out test set.

    Multi-level Representations for Fine-Grained Typing of Knowledge Base Entities

    Yadollah Yaghoobzadeh, Hinrich Schütze
    Comments: 13 pages, accepted in EACL 2016
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Entities are essential elements of natural language. In this paper, we
    present methods for learning multi-level representations of entities on three
    complementary levels: character (character patterns in entity names extracted,
    e.g., by neural networks), word (embeddings of words in entity names) and
    entity (entity embeddings). We investigate state-of-the-art learning methods on
    each level and find large differences, e.g., for deep learning models,
    traditional ngram features and the subword model of fasttext (Bojanowski et
    al., 2016) on the character level; for word2vec (Mikolov et al., 2013) on the
    word level; and for the order-aware model wang2vec (Ling et al., 2015a) on the
    entity level. We confirm experimentally that each level of representation
    contributes complementary information and a joint representation of all three
    levels improves the existing embedding based baseline for fine-grained entity
    typing by a large margin. Additionally, we show that adding information from
    entity descriptions further improves multi-level representations of entities.


    Information Retrieval

    Personalised Query Suggestion for Intranet Search with Temporal User Profiling

    Thanh Vu, Alistair Willis, Udo Kruschwitz, Dawei Song
    Comments: 4 pages, 2 figures, the 2017 ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR)
    Subjects: Information Retrieval (cs.IR)

    Recent research has shown the usefulness of using collective user interaction
    data (e.g., query logs) to recommend query modification suggestions for
    Intranet search. However, most of the query suggestion approaches for Intranet
    search follow an “one size fits all” strategy, whereby different users who
    submit an identical query would get the same query suggestion list. This is
    problematic, as even with the same query, different users may have different
    topics of interest, which may change over time in response to the user’s
    interaction with the system. We address the problem by proposing a personalised
    query suggestion framework for Intranet search. For each search session, we
    construct two temporal user profiles: a click user profile using the user’s
    clicked documents and a query user profile using the user’s submitted queries.
    We then use the two profiles to re-rank the non-personalised query suggestion
    list returned by a state-of-the-art query suggestion method for Intranet
    search. Experimental results on a large-scale query logs collection show that
    our personalised framework significantly improves the quality of suggested
    queries.
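
    A minimal sketch of the re-ranking step, assuming vector-space profiles
    and cosine scoring; the profile construction and the interpolation weight
    gamma are illustrative, not the paper's exact model.

    import numpy as np

    def rerank(suggestions, vectors, click_profile, query_profile, gamma=0.5):
        # suggestions: candidate queries from the non-personalised
        # suggester; vectors: (n, d) their term vectors; the two profiles
        # are the temporal click and query user profiles.
        def cos(a, b):
            return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        scores = np.array([gamma * cos(v, click_profile)
                           + (1 - gamma) * cos(v, query_profile)
                           for v in vectors])
        return [suggestions[i] for i in np.argsort(-scores)]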

    Toward Active Learning in Cross-domain Recommender Systems

    Roberto Pagano, Massimo Quadrana, Mehdi Elahi, Paolo Cremonesi
    Subjects: Information Retrieval (cs.IR)

    One of the main challenges in Recommender Systems (RSs) is the New User
    problem which happens when the system has to generate personalised
    recommendations for a new user whom the system has no information about. Active
    Learning tries to solve this problem by acquiring user preference data with the
    maximum quality, and with the minimum acquisition cost. Although there is a
    variety of work in the active learning for RSs research area, almost all of it
    has focused only on the single-domain recommendation scenario. However,
    several real-world RSs operate in the cross-domain scenario, where the system
    generates recommendations in the target domain by exploiting user preferences
    in both the target and auxiliary domains. In such a scenario, the performance
    of active learning strategies can be significantly influenced and typical
    active learning strategies may fail to perform properly. In this paper, we
    address this limitation, by evaluating active learning strategies in a novel
    evaluation framework, explicitly suited for the cross-domain recommendation
    scenario. We show that having access to the preferences of the users in the
    auxiliary domain may have a huge impact on the performance of active learning
    strategies w.r.t. the classical, single-domain scenario.

    Spotting Information biases in Chinese and Western Media

    Dominik Wurzer, Yumeng Qin
    Subjects: Information Retrieval (cs.IR)

    Newswire and Social Media are the major sources of information in our time.
    While the topical demographics of Western media have been the subject of
    studies in the past, less is known about Chinese media. In this paper, we apply
    event detection and tracking technology to examine the information overlap and
    differences between Chinese and Western media, both traditional and social.
    Our experiments reveal a biased interest of China towards the West, which
    becomes particularly apparent when comparing the interest in celebrities.

    Differentially Private Neighborhood-based Recommender Systems

    Jun Wang, Qiang Tang
    Comments: 14 pages
    Subjects: Cryptography and Security (cs.CR); Information Retrieval (cs.IR)

    Privacy issues of recommender systems have become a hot topic for the society
    as such systems are appearing in every corner of our life. In contrast to the
    fact that many secure multi-party computation protocols have been proposed to
    prevent information leakage in the process of recommendation computation, very
    little has been done to restrict the information leakage from the
    recommendation results. In this paper, we apply the differential privacy
    concept to neighborhood-based recommendation methods (NBMs) under a
    probabilistic framework. We first present a solution, by directly calibrating
    Laplace noise into the training process, to find the maximum a posteriori
    similarity parameters in a differentially private manner. Then we connect differential
    privacy to NBMs by exploiting a recent observation that sampling from the
    scaled posterior distribution of a Bayesian model results in provably
    differentially private systems. Our experiments show that both solutions allow
    promising accuracy with a modest privacy budget, and the second solution yields
    better accuracy if the sampling asymptotically converges. We also compare our
    solutions to the recent differentially private matrix factorization (MF)
    recommender systems, and show that our solutions achieve better accuracy when
    the privacy budget is reasonably small. This is an interesting result because
    MF systems often offer better accuracy when differential privacy is not
    applied.
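
    The first solution rests on the Laplace mechanism; a minimal sketch of
    that building block, releasing a noisy similarity matrix with noise scale
    calibrated to sensitivity/epsilon (the paper calibrates the noise inside
    MAP training rather than on the final similarities).

    import numpy as np

    def laplace_mechanism(sim, sensitivity, epsilon, rng=None):
        # sim: item-item similarity matrix; adding Laplace(0, s/eps) noise
        # yields epsilon-differential privacy for this release.
        rng = np.random.default_rng() if rng is None else rng
        return sim + rng.laplace(0.0, sensitivity / epsilon, size=sim.shape)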


    Computation and Language

    Crowdsourcing Ground Truth for Medical Relation Extraction

    Anca Dumitrache, Lora Aroyo, Chris Welty
    Comments: In review for ACM Transactions on Interactive Intelligent Systems (TiiS) Special Issue on Human-Centered Machine Learning
    Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)

    Cognitive computing systems require human labeled data for evaluation, and
    often for training. The standard practice used in gathering this data minimizes
    disagreement between annotators, and we have found this results in data that
    fails to account for the ambiguity inherent in language. We have proposed the
    CrowdTruth method for collecting ground truth through crowdsourcing, that
    reconsiders the role of people in machine learning based on the observation
    that disagreement between annotators provides a useful signal for phenomena
    such as ambiguity in the text. We report on using this method to build an
    annotated data set for medical relation extraction for the cause and treat
    relations, and how this data performed in a supervised training experiment. We
    demonstrate that by modeling ambiguity, labeled data gathered from crowd
    workers can (1) reach the level of quality of domain experts for this task
    while reducing the cost, and (2) provide better training data at scale than
    distant supervision. We further propose and validate new weighted measures for
    precision, recall, and F-measure, that account for ambiguity in both human and
    machine performance on this task.

    Task-Specific Attentive Pooling of Phrase Alignments Contributes to Sentence Matching

    Wenpeng Yin, Hinrich Schütze
    Comments: EACL’2017 long paper. arXiv admin note: substantial text overlap with arXiv:1604.06896
    Subjects: Computation and Language (cs.CL)

    This work studies comparatively two typical sentence matching tasks: textual
    entailment (TE) and answer selection (AS), observing that weaker phrase
    alignments are more critical in TE, while stronger phrase alignments deserve
    more attention in AS. The key to reaching this observation lies in phrase
    detection, phrase representation, phrase alignment, and more importantly how to
    connect those aligned phrases of different matching degrees with the final
    classifier. Prior work (i) has limitations in phrase generation and
    representation, or (ii) conducts alignment at word and phrase levels by
    handcrafted features or (iii) utilizes a single framework of alignment without
    considering the characteristics of specific tasks, which limits the framework’s
    effectiveness across tasks. We propose an architecture based on Gated Recurrent
    Unit that supports (i) representation learning of phrases of arbitrary
    granularity and (ii) task-specific attentive pooling of phrase alignments
    between two sentences. Experimental results on TE and AS match our observation
    and show the effectiveness of our approach.
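
    The generic attentive-pooling step described above can be sketched in a few
    lines of numpy. The GRU-based phrase detection and representation layers of
    the paper are omitted, and the attention vector w is assumed to be learned
    per task, so that strongly or weakly aligned phrase pairs dominate the
    pooled representation depending on the task.

        import numpy as np

        def attentive_pooling(alignments, w):
            """Task-specific attentive pooling over phrase-alignment features.

            alignments: (k, d) array, one d-dim feature per aligned phrase pair.
            w: (d,) task-specific attention vector (learned in practice).
            Returns a single d-dim representation of the sentence pair.
            """
            scores = alignments @ w                  # (k,) relevance scores
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()                 # softmax over phrase pairs
            return weights @ alignments              # (d,) pooled feature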

    Neural Personalized Response Generation as Domain Adaptation

    Weinan Zhang, Ting Liu, Yifa Wang, Qingfu Zhu
    Subjects: Computation and Language (cs.CL)

    In this paper, we focus on the personalized response generation for
    conversational systems. Based on sequence-to-sequence learning, especially
    the encoder-decoder framework, we propose a two-phase approach, namely
    initialization then adaptation, to model the responding style of humans and then
    generate personalized responses. For evaluation, we propose a novel human-aided
    method to evaluate the performance of personalized response generation
    models through online real-time conversation and offline human judgment. Moreover,
    the lexical divergence of the responses generated by the 5 personalized models
    indicates that the proposed two-phase approach achieves good results in
    modeling the responding style of humans and generating personalized responses
    for conversational systems.

    Multi-level Representations for Fine-Grained Typing of Knowledge Base Entities

    Yadollah Yaghoobzadeh, Hinrich Schütze
    Comments: 13 pages, accepted at EACL 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    Entities are essential elements of natural language. In this paper, we
    present methods for learning multi-level representations of entities on three
    complementary levels: character (character patterns in entity names extracted,
    e.g., by neural networks), word (embeddings of words in entity names) and
    entity (entity embeddings). We investigate state-of-the-art learning methods on
    each level and find large differences, e.g., for deep learning models,
    traditional n-gram features and the subword model of fastText (Bojanowski et
    al., 2016) on the character level; for word2vec (Mikolov et al., 2013) on the
    word level; and for the order-aware model wang2vec (Ling et al., 2015a) on the
    entity level. We confirm experimentally that each level of representation
    contributes complementary information and a joint representation of all three
    levels improves the existing embedding based baseline for fine-grained entity
    typing by a large margin. Additionally, we show that adding information from
    entity descriptions further improves multi-level representations of entities.

    Sentence-level dialects identification in the greater China region

    Fan Xu, Mingwen Wang, Maoxi Li
    Comments: 12
    Journal-ref: International Journal on Natural Language Computing (IJNLC) Vol.
    5, No.6, December 2016
    Subjects: Computation and Language (cs.CL)

    Identifying different varieties of the same language is more challenging
    than identifying unrelated languages. In this paper, we propose an approach
    to discriminate language varieties or dialects of Mandarin Chinese for
    Mainland China, Hong Kong, Taiwan, Macao, Malaysia and Singapore, a.k.a., the
    Greater China Region (GCR). When applied to dialect identification in the
    GCR, we find that the commonly used character-level or word-level uni-gram
    features are not very effective, since there exist several specific problems,
    such as the ambiguity and context-dependent nature of words in the dialects
    of the GCR. To overcome these challenges, we use not only general features
    like character-level n-grams, but also many new word-level features, including
    PMI-based and word alignment-based features. A series of evaluation results on
    both news and open-domain datasets from Wikipedia show the effectiveness of
    the proposed approach.
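
    As a concrete illustration of one of the word-level features mentioned
    above, a PMI score between a word and a dialect label can be computed as
    follows. The count dictionaries are assumed inputs, and the paper's full
    feature set (including the word alignment-based features) is not reproduced
    here.

        import math

        def pmi(word, dialect, word_counts, dialect_counts, joint_counts, total):
            """Pointwise mutual information between a word and a dialect label.

            PMI(w, d) = log( P(w, d) / (P(w) * P(d)) ); a large value suggests
            the word is characteristic of that dialect. Assumes all counts
            involved are positive.
            """
            p_joint = joint_counts[(word, dialect)] / total
            p_word = word_counts[word] / total
            p_dialect = dialect_counts[dialect] / total
            return math.log(p_joint / (p_word * p_dialect))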

    Neural Machine Translation on Scarce-Resource Condition: A case-study on Persian-English

    Mohaddeseh Bastan, Shahram Khadivi, Mohammad Mehdi Homayounpour
    Comments: 6 pages, Submitted in ICEE 2017
    Subjects: Computation and Language (cs.CL)

    Neural Machine Translation (NMT) is a new approach to Machine Translation
    (MT) that, owing to its success, has absorbed the attention of many researchers
    in the field. In this paper, we study the NMT model on Persian-English language
    pairs, to analyze the model and investigate its appropriateness
    for scarce-resource scenarios, the situation that exists for Persian-centered
    translation systems. We adjust the model for the Persian language and find the
    best parameters and hyperparameters for two tasks: translation and
    transliteration. We also apply some preprocessing to the Persian dataset,
    which yields an increase of about one point in terms of BLEU score. In
    addition, we modify the loss function to enhance the word alignment of the
    model; this new loss function yields a total improvement of 1.87 BLEU points
    in translation quality.

    Structural Attention Neural Networks for improved sentiment analysis

    Filippos Kokkinos, Alexandros Potamianos
    Comments: Submitted to EACL2017 for review
    Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)

    We introduce a tree-structured attention neural network for sentences and
    small phrases and apply it to the problem of sentiment classification. Our
    model expands the current recursive models by incorporating structural
    information around a node of a syntactic tree using both bottom-up and top-down
    information propagation. Also, the model utilizes structural attention to
    identify the most salient representations during the construction of the
    syntactic tree. To our knowledge, the proposed models achieve state of the art
    performance on the Stanford Sentiment Treebank dataset.

    Just an Update on PMING Distance for Web-based Semantic Similarity in Artificial Intelligence and Data Mining

    Valentina Franzoni
    Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Probability (math.PR)

    One of the main problems that emerges in the classic approach to semantics is
    the difficulty of acquiring and maintaining ontologies and semantic
    annotations. On the other hand, the Internet explosion and the massive
    diffusion of mobile smart devices have led to the creation of a worldwide
    system whose information is checked and fueled daily by the contributions of
    millions of users who interact in a collaborative way. Search engines,
    continually exploring the Web, are a natural source of information on which to
    base a modern approach to semantic annotation. A promising idea is that it is
    possible to generalize semantic similarity, under the assumption that
    semantically similar terms behave similarly, and to define collaborative
    proximity measures based on the indexing information returned by search
    engines. The PMING Distance is a proximity measure used in data mining and
    information retrieval, whose collaborative information expresses the degree of
    relationship between two terms, using only the number of documents returned as
    results for a query on a search engine. In this work, the PMING Distance is
    updated with a novel formal algebraic definition, which corrects previous
    works. The new point of view underlines that PMING is a locally normalized
    linear combination of the Pointwise Mutual Information and the Normalized
    Google Distance. The analyzed measure dynamically reflects collaborative
    changes made to web resources.
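
    For reference, the two ingredients named above have standard web-count
    definitions, with ( f(\cdot) ) the number of documents a search engine
    returns for a term (or a pair of terms) and ( M ) the number of indexed
    documents:

        PMI(x,y) = \log \frac{f(x,y)\, M}{f(x)\, f(y)}, \qquad
        NGD(x,y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x,y)}
                        {\log M - \min\{\log f(x), \log f(y)\}}.

    The PMING Distance then takes, schematically, the locally normalized form
    ( \mu\, \widetilde{PMI}(x,y) + (1-\mu)\, NGD(x,y) ), where ( \widetilde{PMI} )
    denotes PMI normalized over the local context and ( \mu ) is a weighting
    parameter; the exact definition is the subject of the paper and is not
    reproduced here.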


    Distributed, Parallel, and Cluster Computing

    Resource Management in Cloud Networking Using Economic Analysis and Pricing Models: A Survey

    Nguyen Cong Luong, Ping Wang, Dusit Niyato, Wen Yonggang, Zhu Han
    Comments: IEEE Communications Surveys and Tutorials, 2017
    Subjects: Computer Science and Game Theory (cs.GT); Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)

    This paper presents a comprehensive literature review on applications of
    economic and pricing models for resource management in cloud networking. To
    achieve sustainable profit advantage, cost reduction, and flexibility in
    provisioning of cloud resources, resource management in cloud networking
    requires adaptive and robust designs to address many issues, e.g., resource
    allocation, bandwidth reservation, request allocation, and workload allocation.
    Economic and pricing models have received a lot of attention as they can lead
    to desirable performance in terms of social welfare, fairness, truthfulness,
    profit, user satisfaction, and resource utilization. This paper reviews
    applications of the economic and pricing models to develop adaptive algorithms
    and protocols for resource management in cloud networking. Besides, we survey a
    variety of incentive mechanisms that use pricing strategies for sharing
    resources in edge computing. In addition, we consider the use of pricing models
    in cloud-based Software Defined Wireless Networking (cloud-based SDWN). Finally,
    we highlight important challenges, open issues and future research directions
    in applying economic and pricing models to cloud networking.

    Estimation of Graphlet Statistics

    Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed
    Subjects: Social and Information Networks (cs.SI); Distributed, Parallel, and Cluster Computing (cs.DC); Combinatorics (math.CO); Machine Learning (stat.ML)

    Graphlets are induced subgraphs of a large network and are important for
    understanding and modeling complex networks. Despite their practical
    importance, the use of graphlets has been severely limited to applications and
    domains with relatively small graphs. Most previous work has focused on exact
    algorithms; however, it is often too expensive to compute graphlets exactly in
    massive networks with billions of edges, and an approximate count is
    usually sufficient for many applications. In this work, we propose an unbiased
    graphlet estimation framework that is (a) fast with significant speedups
    compared to the state-of-the-art, (b) parallel with nearly linear-speedups, (c)
    accurate with <1% relative error, (d) scalable and space-efficient for massive
    networks with billions of edges, and (e) flexible for a variety of real-world
    settings, as well as estimating macro and micro-level graphlet statistics
    (e.g., counts) of both connected and disconnected graphlets. In addition, an
    adaptive approach is introduced that finds the smallest sample size required to
    obtain estimates within a given user-defined error bound. On 300 networks from
    20 domains, we obtain <1% relative error for all graphlets. This is
    significantly more accurate than existing methods while using less data.
    Moreover, it takes a few seconds on billion edge graphs (as opposed to
    days/weeks). These are by far the largest graphlet computations to date.
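
    To make the idea of unbiased graphlet estimation concrete, here is a minimal
    Python sketch for the simplest graphlet, the triangle, using uniform edge
    sampling; the paper's framework covers general connected and disconnected
    graphlets, adaptive sample sizes, and parallel execution, none of which is
    reproduced here.

        import random

        def estimate_triangles(adj, edges, num_samples, rng=random.Random(0)):
            """Unbiased triangle-count estimate via uniform edge sampling.

            adj: dict node -> set of neighbors; edges: list of (u, v) pairs.
            Each triangle contains 3 edges, so summing common-neighbor counts
            over all edges gives 3T; sampling edges uniformly and rescaling
            by |E| / (3 * num_samples) therefore yields an unbiased estimate.
            """
            total = 0
            for _ in range(num_samples):
                u, v = rng.choice(edges)
                total += len(adj[u] & adj[v])   # triangles through this edge
            return len(edges) * total / (3.0 * num_samples)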


    Learning

    QuickNet: Maximizing Efficiency and Efficacy in Deep Architectures

    Tapabrata Ghosh
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We present QuickNet, a network architecture that is both faster and
    significantly more accurate than other fast deep architectures such as
    SqueezeNet. Furthermore, it uses fewer parameters than previous networks, making
    it more memory efficient. We do this by making two major modifications to the
    reference Darknet model (Redmon et al., 2015): 1) the use of depthwise separable
    convolutions and 2) the use of parametric rectified linear units. We make the
    observation that parametric rectified linear units are computationally
    equivalent to leaky rectified linear units at test time and the observation
    that separable convolutions can be interpreted as a compressed Inception
    network (Chollet, 2016). Using these observations, we derive a network
    architecture, which we call QuickNet, that is both faster and more accurate
    than previous models. Our architecture provides at least four major advantages:
    (1) a smaller model size, which is more tenable on memory-constrained systems;
    (2) a significantly faster network, which is more tenable on computationally
    constrained systems; (3) a high accuracy of 95.7 percent on the CIFAR-10
    dataset, which outperforms all but one result published so far (although we
    note that the two works are orthogonal approaches and can be combined); and
    (4) orthogonality to previous model compression approaches, allowing further
    speed gains to be realized.
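
    A minimal Keras sketch of the two ingredients named in the abstract,
    depthwise separable convolutions and PReLU activations, is given below. The
    layer sizes, the batch normalization, and the overall topology are
    illustrative assumptions, not the actual QuickNet architecture.

        import tensorflow as tf
        from tensorflow.keras import layers

        def quicknet_style_block(x, filters):
            """One conv block in the spirit described above: a depthwise
            separable convolution followed by a parametric ReLU (PReLU)."""
            x = layers.SeparableConv2D(filters, 3, padding="same")(x)
            x = layers.BatchNormalization()(x)          # illustrative choice
            x = layers.PReLU(shared_axes=[1, 2])(x)     # one slope per channel
            return x

        inputs = tf.keras.Input(shape=(32, 32, 3))      # e.g., CIFAR-10 images
        outputs = quicknet_style_block(inputs, 64)
        model = tf.keras.Model(inputs, outputs)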

    Coupled Compound Poisson Factorization

    Mehmet E. Basbug, Barbara E. Engelhardt
    Comments: Under review at AISTATS 2017
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We present a general framework, the coupled compound Poisson factorization
    (CCPF), to capture the missing-data mechanism in extremely sparse data sets by
    coupling a hierarchical Poisson factorization with an arbitrary data-generating
    model. We derive a stochastic variational inference algorithm for the resulting
    model and, as examples of our framework, implement three different
    data-generating models—a mixture model, linear regression, and factor
    analysis—to robustly model non-random missing data in the context of
    clustering, prediction, and matrix factorization. In all three cases, we test
    our framework against models that ignore the missing-data mechanism in
    large-scale studies with non-random missing data, and we show that explicitly
    modeling the missing-data mechanism substantially improves the quality of the
    results, as measured using data log likelihood on a held-out test set.

    Large-Scale Network Motif Learning with Compression

    Peter Bloem, Steven de Rooij
    Subjects: Learning (cs.LG)

    We introduce a new learning method for network motifs: interesting or
    informative subgraph patterns in a network. Current methods for finding motifs
    rely on the frequency of the motif: specifically, subgraphs are motifs when
    their frequency in the data is high compared to the expected frequency under a
    null model. To compute this expectation, the search for motifs is normally
    repeated on as many as 1000 random graphs sampled from the null model, a
    prohibitively expensive step. We use ideas from the Minimum Description Length
    (MDL) literature to define a new measure of motif relevance. This has several
    advantages: the subgraph count on samples from the null model can be
    eliminated, and the search for motif candidates within the data itself can be
    greatly simplified. Our method allows motif analysis to scale to networks with
    billions of links, provided that a fast null model is used.

    See the Near Future: A Short-Term Predictive Methodology to Traffic Load in ITS

    Xun Zhou, Changle Li, Zhe Liu, Tom H. Luan, Zhifang Miao, Lina Zhu, Lei Xiong
    Subjects: Learning (cs.LG); Applications (stat.AP)

    The Intelligent Transportation System (ITS) targets a coordinated traffic
    system that applies advanced wireless communication technologies to road
    traffic scheduling. Towards accurate road traffic control, short-term
    traffic forecasting, which predicts the road traffic at a particular site over a
    short period, is often useful and important. In existing works, the Seasonal
    Autoregressive Integrated Moving Average (SARIMA) model is a popular approach.
    The scheme however encounters two challenges: 1) the analysis of related data
    is insufficient, so some important features of the data may be neglected; and
    2) with data presenting different features, it is unlikely that one
    predictive model can fit all situations. To tackle the above issues, in this
    work we develop a hybrid model to improve the accuracy of SARIMA. Specifically,
    we first explore the autocorrelation and distribution features present in the
    traffic flow to revise the structure of the time series model. Based on the
    Gaussian distribution of traffic flow, a hybrid model with a Bayesian learning
    algorithm is developed, which can effectively expand the application scenarios
    of SARIMA. We show the efficiency and accuracy of our proposal using both
    analysis and experimental studies. Using real-world trace data, we show that
    the proposed prediction approach achieves satisfactory performance in practice.
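
    As a baseline for the hybrid scheme described above, a plain SARIMA model
    can be fit in a few lines with statsmodels. The traffic series below is
    synthetic and the model orders are illustrative; the paper's Bayesian
    learning layer is not reproduced.

        import numpy as np
        from statsmodels.tsa.statespace.sarimax import SARIMAX

        # Hourly traffic counts with a daily (24-step) season; synthetic here.
        t = np.arange(240)
        y = 100 + 10 * np.sin(2 * np.pi * t / 24) + np.random.randn(240)

        # A plain SARIMA baseline; the hybrid model revises this structure
        # from autocorrelation analysis and adds Bayesian learning on top.
        model = SARIMAX(y, order=(1, 0, 1), seasonal_order=(1, 0, 1, 24))
        fitted = model.fit(disp=False)
        forecast = fitted.forecast(steps=24)   # next day's predicted load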

    Deep Learning for Time-Series Analysis

    John Cristian Borges Gamboa
    Comments: Written as part of the Seminar on Collaborative Intelligence in the TU Kaiserslautern. January 2016
    Subjects: Learning (cs.LG)

    In many real-world applications, e.g., speech recognition or sleep stage
    classification, data are captured over the course of time, constituting a
    time series. Time series often contain temporal dependencies that cause two
    otherwise identical points in time to belong to different classes or to predict
    different behavior. This characteristic generally increases the difficulty of
    analysing them. Existing techniques often depend on hand-crafted features
    that are expensive to create and require expert knowledge of the field. With
    the advent of Deep Learning, new models for unsupervised feature learning for
    time-series analysis and forecasting have been developed. Such new developments
    are the topic of this paper: a review of the main Deep Learning techniques is
    presented, and some applications to time-series analysis are summarized. The
    results make it clear that Deep Learning has a lot to contribute to the field.

    DeepDSL: A Compilation-based Domain-Specific Language for Deep Learning

    Tian Zhao, Xiaobing Huang, Yu Cao
    Subjects: Programming Languages (cs.PL); Learning (cs.LG)

    In recent years, Deep Learning (DL) has found great success in domains such
    as multimedia understanding. However, the complex nature of multimedia data
    makes it difficult to develop DL-based software. The state-of-the-art tools,
    such as Caffe, TensorFlow, Torch7, and CNTK, while successful in their
    applicable domains, are programming libraries with fixed user interfaces,
    internal representations, and execution environments. This makes it difficult
    to implement portable and customized DL applications.

    In this paper, we present DeepDSL, a domain-specific language (DSL) embedded
    in Scala that compiles deep networks written in DeepDSL to Java source code.
    DeepDSL provides (1) intuitive constructs to support compact encoding of deep
    networks; (2) symbolic gradient derivation of the networks; (3) static analysis
    for memory consumption and error detection; and (4) DSL-level optimization to
    improve memory and runtime efficiency.

    DeepDSL programs are compiled into compact, efficient, customizable, and
    portable Java source code, which drives the CUDA and cuDNN interfaces on
    Nvidia GPUs via a Java Native Interface (JNI) library. We evaluated DeepDSL
    with a number of popular DL networks. Our experiments show that the compiled
    programs have very competitive runtime performance and memory efficiency
    compared to the existing libraries.

    Shallow and Deep Networks Intrusion Detection System: A Taxonomy and Survey

    Elike Hodo, Xavier Bellekens, Andrew Hamilton, Christos Tachtatzis, Robert Atkinson
    Subjects: Cryptography and Security (cs.CR); Learning (cs.LG)

    Intrusion detection has attracted considerable interest from researchers
    and industry. The community, after many years of research, still faces the
    problem of building reliable and efficient IDS that are capable of handling
    large quantities of data with changing patterns in real-time situations. The
    work presented in this manuscript classifies intrusion detection systems (IDS).
    Moreover, a taxonomy and survey of shallow and deep networks intrusion
    detection systems is presented based on previous and current works. This
    taxonomy and survey reviews machine learning techniques and their performance
    in detecting anomalies. Feature selection which influences the effectiveness of
    machine learning (ML) IDS is discussed to explain the role of feature selection
    in the classification and training phase of ML IDS. Finally, a discussion of
    the false and true positive alarm rates is presented to help researchers model
    reliable and efficient machine learning based intrusion detection systems.

    Deep driven fMRI decoding of visual categories

    Michele Svanera, Sergio Benini, Gal Raz, Talma Hendler, Rainer Goebel, Giancarlo Valente
    Comments: Presented at MLINI-2016 workshop, 2016 (arXiv:1701.01437)
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neurons and Cognition (q-bio.NC)

    Deep neural networks have been developed drawing inspiration from the brain
    visual pathway, implementing an end-to-end approach: from image data to video
    object classes. However building an fMRI decoder with the typical structure of
    Convolutional Neural Network (CNN), i.e. learning multiple level of
    representations, seems impractical due to lack of brain data. As a possible
    solution, this work presents the first hybrid fMRI and deep features decoding
    approach: collected fMRI and deep learnt representations of video object
    classes are linked together by means of Kernel Canonical Correlation Analysis.
    In decoding, this allows exploiting the discriminatory power of CNN by relating
    the fMRI representation to the last layer of CNN (fc7). We show the
    effectiveness of embedding fMRI data onto a subspace related to deep features
    in distinguishing semantic visual categories based solely on brain imaging
    data.

    Tunable GMM Kernels

    Ping Li
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    The recently proposed “generalized min-max” (GMM) kernel can be efficiently
    linearized via hashing or the Nystrom method, with direct applications in
    large-scale machine learning and fast near neighbor search. As a tuning-free
    kernel, the linearized GMM kernel has been extensively compared with the
    normalized random Fourier features (NRFF). On classification tasks, the
    tuning-free GMM kernel performs (surprisingly) well compared to the best-tuned
    RBF kernel. Nevertheless, one would naturally expect that the GMM kernel ought
    to be further improved if we introduce tuning parameters.

    In this paper, we study two simple constructions of tunable GMM kernels: (i)
    the exponentiated-GMM (or eGMM) kernel, and (ii) the powered-GMM (or pGMM)
    kernel. One can of course introduce additional (and more complex) kernels by
    (e.g.,) combining eGMM and pGMM kernels. The pGMM kernel can still be
    efficiently linearized by modifying the original hashing procedure for the GMM
    kernel. On 47 publicly available classification datasets from the UCI
    repository, we verify that the eGMM and pGMM kernels (typically) improve over
    the original GMM kernel. On a fraction of datasets, the improvements can be
    astonishingly significant.

    We hope our introduction of tunable kernels could offer practitioners the
    flexibility of choosing appropriate kernels and methods for large-scale search
    & learning for their specific applications.
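
    A minimal numpy sketch of the pGMM construction, as we read it from the
    abstract, is shown below: the usual GMM transform splits each vector into
    positive and negative parts and takes a min/max ratio, and the power p is
    the added tuning parameter, with p = 1 recovering the tuning-free GMM
    kernel. The exact placement of the power is our assumption.

        import numpy as np

        def pgmm_kernel(x, y, p=1.0):
            """Tunable powered-GMM (pGMM) kernel between two vectors.

            Split each vector into positive and negative parts,
            u = [max(x, 0); max(-x, 0)], raise entries to the power p,
            then return sum(min(u, v)) / sum(max(u, v)). Assumes x, y
            are not both all-zero. Schematic, following the abstract.
            """
            u = np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)]) ** p
            v = np.concatenate([np.maximum(y, 0), np.maximum(-y, 0)]) ** p
            return np.minimum(u, v).sum() / np.maximum(u, v).sum()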


    Information Theory

    On short cycle enumeration in biregular bipartite graphs

    Ian Blake, Shu Lin
    Comments: 15 pages, 1 figure, 2 tables
    Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

    A number of recent works have used a variety of combinatorial constructions
    to derive Tanner graphs for LDPC codes and some of these have been shown to
    perform well in terms of their probability of error curves and error floors.
    Such graphs are bipartite and many of these constructions yield biregular
    graphs where the degree of left vertices is a constant (c+1) and that of the
    right vertices is a constant (d+1). Such graphs are termed ((c+1,d+1))
    biregular bipartite graphs here. One property of interest in such work is the
    girth of the graph and the number of short cycles in the graph, cycles of
    length either the girth or slightly larger. Such numbers have been shown to be
    related to the error floor of the probability of error curve of the related
    LDPC code. Using known results of graph theory, it is shown how the girth and
    the number of cycles of length equal to the girth may be computed for these
    ((c+1,d+1)) biregular bipartite graphs knowing only the parameters (c) and (d)
    and the numbers of left and right vertices. While numerous algorithms to
    determine the number of short cycles in arbitrary graphs exist, the reduction
    of the problem from an algorithm to a computation for these biregular bipartite
    graphs is of interest.

    Modeling a Spatially Correlated Cellular Network with Strong Repulsion

    Chang-Sik Choi, Jae Oh Woo, Jeffrey G. Andrews
    Comments: Submitted to IEEE Transactions on Wireless Communications
    Subjects: Information Theory (cs.IT)

    We propose a new cellular network model that captures both strong repulsion
    and no correlation between base stations. The base station locations are
    modeled as the superposition of two independent stationary point processes: a
    random shifted grid with intensity ( \lambda_g ) for the repulsive base
    stations and an independent Poisson point process with intensity ( \lambda_p )
    for the uncorrelated base stations. Assuming that users are associated with
    base stations that provide the strongest average receive signal power, we
    obtain the probability that a typical user is associated with either a
    repulsive base station or an uncorrelated base station. Assuming Rayleigh
    fading channels, we derive the expression for the coverage probability of the
    typical user. The following three observations can be made. First, the
    association and coverage probability of the typical user are fully
    characterized by two system parameters: the intensity ratio ( \rho_\lambda )
    and the transmit power ratio ( \eta ). Second, in user association, a bias
    toward the repulsive base stations exists. Finally, the coverage expression is
    a monotonically increasing function with respect to the number of repulsive
    base stations, or the inverse of the intensity ratio ( \rho_\lambda^{-1} ).

    Greedy-Merge Degrading has Optimal Power-Law

    Assaf Kartowsky, Ido Tal
    Comments: 5 pages, submitted to ISIT 2017
    Subjects: Information Theory (cs.IT)

    Consider a channel with a given input distribution. Our aim is to degrade it
    to a channel with at most L output letters. One such degradation method is the
    so-called “greedy-merge” algorithm. We derive an upper bound on the reduction
    in mutual information between input and output. For fixed input alphabet size
    and variable L, the upper bound is within a constant factor of an
    algorithm-independent lower bound. Thus, we establish that greedy-merge is
    optimal in the power-law sense.
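
    For concreteness, the greedy-merge idea can be sketched as follows: given
    the joint input-output distribution, repeatedly merge the pair of output
    letters whose merger loses the least mutual information until at most L
    letters remain. The brute-force pair search below is for clarity only and
    is not the paper's analysis.

        import numpy as np

        def mutual_information(joint):
            """I(X;Y) in nats for a joint distribution (inputs x outputs)."""
            px = joint.sum(axis=1, keepdims=True)
            py = joint.sum(axis=0, keepdims=True)
            mask = joint > 0
            return (joint[mask] * np.log(joint[mask] / (px @ py)[mask])).sum()

        def greedy_merge(joint, L):
            """Degrade a channel to at most L output letters by repeatedly
            merging the pair of output columns whose merger costs the least
            mutual information."""
            joint = joint.copy()
            while joint.shape[1] > L:
                best, best_cols = -np.inf, None
                for a in range(joint.shape[1]):
                    for b in range(a + 1, joint.shape[1]):
                        merged = np.delete(joint, b, axis=1)
                        merged[:, a] = joint[:, a] + joint[:, b]
                        mi = mutual_information(merged)
                        if mi > best:
                            best, best_cols = mi, (a, b)
                a, b = best_cols
                joint[:, a] += joint[:, b]
                joint = np.delete(joint, b, axis=1)
            return joint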

    On Achievable Rates of AWGN Energy-Harvesting Channels with Block Energy Arrival and Non-Vanishing Error Probabilities

    Silas L. Fong, Vincent Y. F. Tan, Ayfer Özgür
    Comments: 48 pages
    Subjects: Information Theory (cs.IT)

    This paper investigates the achievable rates of an additive white Gaussian
    noise (AWGN) energy-harvesting (EH) channel with an infinite battery under the
    assumption that the error probabilities do not vanish as the blocklength
    increases. The EH process is characterized by a sequence of blocks of harvested
    energy. The harvested energy remains constant within a block while the
    harvested energy across different blocks is characterized by a sequence of
    independent and identically distributed (i.i.d.) random variables. The blocks
    have length (L), which can be interpreted as the coherence time of the energy
    arrival process. If (L) is a constant or grows sublinearly in the blocklength
    (n), we fully characterize the first-order coding rate. In addition, we obtain
    lower and upper bounds on the second-order coding rate, which are proportional
    to ( -\sqrt{\frac{L}{n}} ) for any fixed error probability (<1/2). If (L) grows
    linearly in (n), we obtain lower and upper bounds on the first-order coding
    rate, which coincide whenever the EH random variable is continuous. Our results
    suggest that correlation in the energy-arrival process decreases the effective
    blocklength by a factor of (L).

    A Decentralized Optimization Framework for Energy Harvesting Devices

    Alessandro Biason, Subhrakanti Dey, Michele Zorzi
    Comments: 12 pages, 9 figures, submitted to IEEE Transactions on Mobile Computing, Dec. 2016. arXiv admin note: text overlap with arXiv:1610.07284
    Subjects: Information Theory (cs.IT)

    Designing decentralized policies for wireless communication networks is a
    crucial problem, which has only been partially solved in the literature so far.
    In this paper, we propose the Decentralized Markov Decision Process (Dec-MDP)
    framework to analyze a wireless sensor network with multiple users that access
    a common wireless channel. We consider devices with energy harvesting
    capabilities, which aim to balance the energy arrivals with the data
    departures and with the probability of colliding with other nodes. Randomly
    over time, an access point triggers a SYNC slot, wherein it recomputes the
    optimal transmission parameters of the whole network, and distributes this
    information. Every node receives its own policy, which specifies how it should
    access the channel in the future, and, thereafter, proceeds in a fully
    decentralized fashion, without interacting with other entities in the network.
    We propose a multi-layer Markov model, where an external MDP manages the jumps
    between SYNC slots, and an internal Dec-MDP computes the optimal policy in the
    near future. We numerically show that, because of the harvesting dynamics, a
    fully orthogonal scheme (e.g., TDMA-like) is suboptimal, and the optimal
    trade-off lies between an orthogonal and a random access system.

    A Construction of Linear Codes and Their Complete Weight Enumerators

    Shudi Yang, Xiangli Kong, Chunming Tang
    Comments: 19 pages
    Subjects: Information Theory (cs.IT)

    Recently, linear codes constructed from defining sets have been studied
    extensively. They may have nice parameters if the defining set is chosen
    properly. Let ( m > 2 ) be a positive integer. For an odd prime ( p ), let
    ( r = p^m ) and ( \mathrm{Tr} ) be the absolute trace function from
    ( \mathbb{F}_r ) onto ( \mathbb{F}_p ). In this paper, we give a construction
    of linear codes by defining the code
    ( C_{D} = \{ (\mathrm{Tr}(ax))_{x \in D} : a \in \mathbb{F}_{r} \} ), where
    ( D = \left\{ x \in \mathbb{F}_{r} : \mathrm{Tr}(x)=1, \mathrm{Tr}(x^2)=0
    \right\} ). Its complete weight enumerator and weight
    enumerator are determined explicitly by employing cyclotomic numbers and Gauss
    sums. In addition, we obtain several optimal linear codes with a few weights.
    They have higher rate compared with other codes, which enables them to have
    essential applications in areas such as association schemes and secret sharing
    schemes.

    Macro diversity in Cellular Networks with Random Blockages

    Abhishek K. Gupta, Jeffrey G. Andrews, Robert W. Heath Jr
    Subjects: Information Theory (cs.IT)

    Blocking objects (blockages) between a transmitter and receiver cause
    wireless communication links to transition from line-of-sight (LOS) to
    non-line-of-sight (NLOS) propagation, which can greatly reduce the received
    power, particularly at higher frequencies such as millimeter wave (mmWave). We
    consider a cellular network in which a mobile user attempts to connect to two
    or more base stations (BSs) simultaneously, to increase the probability of at
    least one LOS link, which is a form of macrodiversity. We develop a framework
    for determining the LOS probability as a function of the number of BSs, when
    taking into account the correlation between blockages: for example, a single
    blockage close to the device — including the user’s own body — could block
    multiple BSs. We consider the impact of the size of blocking objects on the
    system reliability probability and show that macrodiversity gains are higher
    when the blocking objects are small. We also show that the BS density must
    scale as the square of the blockage density to maintain a given level of
    reliability.

    Cover's Open Problem: "The Capacity of the Relay Channel"

    Xiugang Wu, Leighton Pate Barnes, Ayfer Ozgur
    Comments: Submitted to IEEE Trans. on Information Theory
    Subjects: Information Theory (cs.IT)

    Consider a memoryless relay channel, where the channel from the relay to the
    destination is an isolated bit pipe of capacity (C_0). Let (C(C_0)) denote the
    capacity of this channel as a function of (C_0). What is the critical value of
    (C_0) such that (C(C_0)) first equals (C(\infty))? This is a long-standing open
    problem posed by Cover and named “The Capacity of the Relay Channel,” in (Open
    Problems in Communication and Computation), Springer-Verlag, 1987. In
    this paper, we answer this question in the Gaussian case and show that (C(C_0))
    cannot equal (C(\infty)) unless (C_0=\infty), regardless of the SNR of the
    Gaussian channels, while the cut-set bound would suggest that (C(\infty)) can
    be achieved at finite (C_0). Our approach is geometric and relies on a
    strengthening of the isoperimetric inequality on the sphere by using the Riesz
    rearrangement inequality.

    Numerically Stable Evaluation of Moments of Random Gram Matrices with Applications

    Khalil Elkhalil, Abla Kammoun, Tareq Y. Al-Naffouri, Mohamed-Slim Alouini
    Comments: Submitted to IEEE signal processing letters
    Subjects: Information Theory (cs.IT)

    This paper focuses on the computation of the positive moments of one-sided
    correlated random Gram matrices. Closed-form expressions for the moments can be
    obtained easily, but their numerical evaluation is prone to numerical
    instability, especially in high-dimensional settings. This letter provides a
    numerically stable method that efficiently computes the positive moments in
    closed form. The developed expressions are more accurate and can lead to higher
    accuracy levels when fed to moment-based approaches. As an application, we show
    how the obtained moments can be used to approximate the marginal distribution
    of the eigenvalues of random Gram matrices.

    A Joint Resource Allocation Scheme for Multi-User Full-Duplex OFDMA Systems

    Yang You, Chong Qin, Yi Gong
    Subjects: Information Theory (cs.IT)

    Full-duplex is a promising technology, and great system performance
    enhancement can be achieved, especially when it is applied to base stations. In
    this paper, we consider a multi-user OFDMA network consisting of one
    full-duplex base station, several uplink users, and a number of downlink users.
    A joint subcarrier, downlink-user and uplink-user mapping and power allocation
    optimization problem is investigated. In detail, we first decompose the
    3-dimensional mapping problem into three 2-dimensional sub-problems and solve
    them by iteratively applying the classical Hungarian method. Then, based on the
    dual method, we sequentially solve the power allocation and 3-dimensional
    mapping problems at each dual point, and the optimal solution to the dual
    problem is derived using the sub-gradient method. Unlike existing methods that
    only solve one sub-problem, and at high computational complexity, we tackle
    both of them and successfully reduce the computational complexity from
    exponential to polynomial order. Numerical simulations are conducted to verify
    the proposed system.
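
    Each of the three 2-dimensional sub-problems mentioned above is a classical
    assignment problem. As an illustration, one such sub-problem can be solved
    with the Hungarian method via scipy; the rate matrix below is a random
    placeholder, not data from the paper.

        import numpy as np
        from scipy.optimize import linear_sum_assignment

        # Illustrative 2-D sub-problem: pair each subcarrier with the downlink
        # user that maximizes the achievable rate.
        rates = np.random.rand(8, 8)          # subcarriers x downlink users

        # linear_sum_assignment minimizes cost, so negate to maximize rate.
        rows, cols = linear_sum_assignment(-rates)
        total_rate = rates[rows, cols].sum()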

    IRA codes derived from Gruenbaum graph

    Alexander Zhdanov
    Subjects: Information Theory (cs.IT)

    In this paper, we consider the coding of short data frames (192 bits) with IRA
    codes. A new interleaver for IRA codes based on the Gruenbaum graph is
    proposed. The proposed algorithm differs from known methods in the following
    way: the permutation is performed using a much smaller interleaver, derived
    from the Gruenbaum graph by finding a Hamiltonian path in the graph,
    enumerating the visited vertices in ascending order, and passing them again, in
    the permuted order, through the edges not included in the Hamiltonian path. For
    the IRA code, the obtained interleaver provides a 0.7-0.8 dB gain over a
    convolutional code decoded by the Viterbi algorithm.

    Guessing Attacks on Distributed-Storage Systems

    Annina Bracher, Eran Hof, Amos Lapidoth
    Comments: 76 pages, submitted to IEEE Transactions on Information Theory
    Subjects: Information Theory (cs.IT)

    The secrecy of a distributed-storage system for passwords is studied. The
    encoder, Alice, observes a length-n password and describes it using two hints,
    which she stores in different locations. The legitimate receiver, Bob, observes
    both hints. In one scenario the requirement is that the expected number of
    guesses it takes Bob to guess the password approach one as n tends to infinity,
    and in the other that the expected size of the shortest list that Bob must form
    to guarantee that it contain the password approach one. The eavesdropper, Eve,
    sees only one of the hints. Assuming that Alice cannot control which hints Eve
    observes, the largest normalized (by n) exponent that can be guaranteed for the
    expected number of guesses it takes Eve to guess the password is characterized
    for each scenario. Key to the proof are new results on Arikan’s guessing and
    Bunte and Lapidoth’s task-encoding problem; in particular, the paper
    establishes a close relation between the two problems. A rate-distortion
    version of the model is also discussed, as is a generalization that allows for
    Alice to produce ( \delta ) (not necessarily two) hints, for Bob to observe
    ( \nu ) (not necessarily two) of the hints, and for Eve to observe ( \eta )
    (not necessarily one) of the hints. The generalized model is robust against
    ( \delta - \nu ) disk failures.

    Arimoto-Rényi Conditional Entropy and Bayesian (M)-ary Hypothesis Testing

    Igal Sason, Sergio Verdú
    Comments: Submitted to the IEEE Trans. on Information Theory, September 2016. A shortened version is submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Probability (math.PR); Statistics Theory (math.ST)

    This paper gives upper and lower bounds on the minimum error probability of
    Bayesian (M)-ary hypothesis testing in terms of the Arimoto-Rényi conditional
    entropy of an arbitrary order (\alpha). The improved tightness of these bounds
    over their specialized versions with the Shannon conditional entropy
    ((\alpha=1)) is demonstrated. In particular, in the case where (M) is finite,
    we show how to generalize Fano’s inequality under both the conventional and
    list-decision settings. As a counterpart to the generalized Fano’s inequality,
    allowing (M) to be infinite, a lower bound on the Arimoto-Rényi conditional
    entropy is derived as a function of the minimum error probability. Explicit
    upper and lower bounds on the minimum error probability are obtained as a
    function of the Arimoto-Rényi conditional entropy for both positive and
    negative (\alpha). Furthermore, we give upper bounds on the minimum error
    probability as a function of the Rényi divergence and the Chernoff
    information. In the setup of discrete memoryless channels, we analyze the
    exponentially vanishing decay of the Arimoto-Rényi conditional entropy of the
    transmitted codeword given the channel output when averaged over a code
    ensemble.
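
    For reference, Arimoto's conditional Rényi entropy of order ( \alpha ), in
    which the above bounds are expressed, is

        H_\alpha(X \mid Y) = \frac{\alpha}{1-\alpha} \log \sum_{y} P_Y(y)
            \left( \sum_{x} P_{X \mid Y}(x \mid y)^{\alpha} \right)^{1/\alpha},
        \qquad \alpha \in (0,1) \cup (1,\infty),

    which recovers the Shannon conditional entropy ( H(X \mid Y) ) in the limit
    ( \alpha \to 1 ).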

    Joint Power and Admission Control based on Channel Distribution Information: A Novel Two-Timescale Approach

    Qitian Chen, Dong Kang, Yichu He, Tsung-Hui Chang, Ya-Feng Liu
    Comments: 13 pages, 2 figures, Accepted by IEEE Signal Processing Letters
    Subjects: Information Theory (cs.IT)

    In this letter, we consider the joint power and admission control (JPAC)
    problem by assuming that only the channel distribution information (CDI) is
    available. Under this assumption, we formulate a new chance (probabilistic)
    constrained JPAC problem, where the signal to interference plus noise ratio
    (SINR) outage probability of the supported links is enforced to be not greater
    than a prespecified tolerance. To efficiently deal with the chance SINR
    constraint, we employ the sample approximation method to convert it into
    finitely many linear constraints. Then, we propose a convex approximation based
    deflation algorithm for solving the sample approximation JPAC problem. Compared
    to the existing works, this letter proposes a novel two-timescale JPAC
    approach, where admission control is performed by the proposed deflation
    algorithm based on the CDI on a large timescale, and transmission power is
    adapted instantly to fast fading on a small timescale. The effectiveness of
    the proposed algorithm is illustrated by simulations.

    Joint Optimization of Power Splitting and Allocation for SWIPT in Interference Alignment Networks

    Nan Zhao
    Subjects: Information Theory (cs.IT)

    Interference alignment (IA) is a promising solution for interference
    management in wireless networks. On the other hand, simultaneous wireless
    information and power transfer (SWIPT) has become an emerging technique.
    Although some works have been done on IA and SWIPT, these two important areas
    have traditionally been addressed separately in the literature. In this paper,
    we propose to use a common framework to jointly study IA and SWIPT. We analyze
    the performance of SWIPT in IA networks. Specifically, we derive the upper
    bound of the power that can be harvested in IA networks. In addition, we show
    that, to improve the performance of wireless power transfer and information
    transmission, users should be dynamically selected as energy harvesting (EH) or
    information decoding (ID) terminals. Furthermore, we design two
    easily implemented SWIPT-user selection (SWIPT-US) algorithms in IA networks. To
    optimize the ID and EH performance of SWIPT in IA networks, a power-splitting
    optimization (PSO) algorithm is proposed when power splitters are available,
    and its closed-form optimal solutions are derived. Power allocation in the PSO
    algorithm is also studied to further optimize the performance. Simulation
    results are presented to show the effectiveness of the proposed algorithms.

    Polar Coding for the Binary Erasure Channel with Deletions

    Eldho K. Thomas, Vincent Y. F. Tan, Alexander Vardy, Mehul Motani
    Comments: 4 pages; 1 figure; To appear in the IEEE Communication Letters
    Subjects: Information Theory (cs.IT)

    We study the application of polar codes in deletion channels by analyzing the
    cascade of a binary erasure channel (BEC) and a deletion channel. We show how
    polar codes can be used effectively on a BEC with a single deletion, and
    propose a list decoding algorithm with a cyclic redundancy check for this case.
    The decoding complexity is (O(N^2 \log N)), where (N) is the blocklength of the
    code. An important contribution is an optimization of the amount of redundancy
    added to minimize the overall error probability. Our theoretical results are
    corroborated by numerical simulations which show that the list size can be
    reduced to one and the original message can be recovered with high probability
    as the length of the code grows.

    Analysis of equivalence relation in joint sparse recovery

    Changlong Wang, Jigen Peng
    Comments: 22 pages, 7 figures
    Subjects: Information Theory (cs.IT)

    The joint sparse recovery problem is a generalization of the single
    measurement vector problem widely studied in Compressed Sensing; it aims to
    recover a set of jointly sparse vectors, i.e., vectors whose nonzero entries
    are concentrated at common locations. Meanwhile, (\ell_p)-minimization subject
    to matrices is widely used in a large number of algorithms designed for this
    problem. The main contribution of this paper is two theoretical results about
    this technique. The first is to prove that for every multiple system of
    linear equations there exists a constant (p^*) such that the original unique
    sparse solution can also be recovered from a minimization in the (\ell_p)
    quasi-norm subject to matrices whenever (0 < p < p^*). The other is to give an
    analytic expression for such a (p^*). Finally, we display the results of one
    example to confirm the validity of our conclusions.

    Power-Constrained Secrecy Rate Maximization for Joint Relay and Jammer Selection Assisted Wireless Networks

    Haiyan Guo, Zhen Yang, Linghua Zhang, Jia Zhu, Yulong Zou
    Comments: 14 pages, IEEE Transactions on Communications, 2017
    Subjects: Information Theory (cs.IT)

    In this paper, we examine the physical layer security for cooperative
    wireless networks with multiple intermediate nodes, where the
    decode-and-forward (DF) protocol is considered. We propose a new joint relay
    and jammer selection (JRJS) scheme for protecting wireless communications
    against eavesdropping, where an intermediate node is selected as the relay for
    the sake of forwarding the source signal to the destination and meanwhile, the
    remaining intermediate nodes are employed to act as friendly jammers which
    broadcast the artificial noise for disturbing the eavesdropper. We further
    investigate the power allocation among the source, relay and friendly jammers
    for maximizing the secrecy rate of the proposed JRJS scheme and derive a
    closed-form sub-optimal solution. Specifically, all the intermediate nodes
    which successfully decode the source signal are considered as relay candidates.
    For each candidate, we derive the sub-optimal closed-form power allocation
    solution and obtain the secrecy rate result of the corresponding JRJS scheme.
    Then, the candidate which is capable of achieving the highest secrecy rate is
    selected as the relay. Two assumptions about the channel state information
    (CSI), namely the full CSI (FCSI) and partial CSI (PCSI), are considered.
    Simulation results show that the proposed JRJS scheme outperforms the
    conventional pure relay selection, pure jamming and GSVD based beamforming
    schemes in terms of secrecy rate. Additionally, the proposed FCSI based power
    allocation (FCSI-PA) and PCSI based power allocation (PCSI-PA) schemes both
    achieve higher secrecy rates than the equal power allocation (EPA) scheme.

    Variable-Length Lossy Compression Allowing Positive Overflow and Excess Distortion Probabilities

    Shota Saito, Hideki Yagi, Toshiyasu Matsushima
    Subjects: Information Theory (cs.IT)

    This paper investigates the problem of variable-length lossy source coding.
    We deal with the case where both the excess distortion probability and the
    overflow probability of codeword length are less than or equal to positive
    constants. The infimum of the thresholds on the overflow probability is
    characterized by a smooth max entropy-based quantity. Both non-asymptotic and
    asymptotic cases are analyzed. To show the achievability results, we do not
    utilize the random coding argument but give an explicit code construction.

    Cyclotomic Construction of Strong External Difference Families in Finite Fields

    Jiejing Wen, Minghui Yang, Fangwei Fu, Keqin Feng
    Comments: 10 pages
    Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

    Strong external difference families (SEDF) and their generalizations GSEDF and
    BGSEDF in a finite abelian group (G) are combinatorial designs introduced by
    Paterson and Stinson [7] in 2016, with applications in communication theory
    to construct optimal strong algebraic manipulation detection codes. In this
    paper we first present some general constructions of these combinatorial
    designs by using difference sets and partial difference sets in (G). Then, as
    applications of the general constructions, we construct series of SEDFs, GSEDFs
    and BGSEDFs in finite fields by using cyclotomic classes.

    The Flow of Information in Interactive Quantum Protocols: the Cost of Forgetting

    Mathieu Lauriere, Dave Touchette
    Comments: v1, 39 pages, 2 figures
    Subjects: Quantum Physics (quant-ph); Computational Complexity (cs.CC); Information Theory (cs.IT)

    In the context of two-party interactive quantum communication protocols, we
    study a recently defined notion of quantum information cost (QIC), which
    possesses most of the important properties of its classical analogue. Although
    this definition has the advantage to be valid for fully quantum inputs and
    tasks, its interpretation for classical tasks remained rather obscure. Also,
    the link between this new notion and other notions of information cost for
    quantum protocols that had previously appeared in the literature was not clear,
    if existent at all.

    We settle both these issues: for quantum communication with classical inputs,
    we provide an alternate characterization of QIC in terms of information about
    the input registers, avoiding any reference to the notion of a purification of
    the classical input state. We provide an exact operational interpretation of
    this alternative characterization as the sum of the cost of transmitting
    information about the classical inputs and the cost of forgetting information
    about these inputs. To obtain this characterization, we prove a general lemma,
    the Information Flow Lemma, assessing exactly the transfer of information in
    general interactive quantum processes. Furthermore, we clarify the link between
    QIC and IC of classical protocols by quantumly simulating classical protocols.

    Finally, we apply these concepts to argue that any quantum protocol that does
    not forget information solves Disjointness on (n) bits in (\Omega(n))
    communication, completely losing the quadratic quantum speedup. This provides a
    specific sense in which forgetting information is a necessary feature of
    interactive quantum protocols. We also apply these concepts to prove that QIC
    at zero-error is exactly (n) for the Inner Product function, and (n(1 - o(1)))
    for a random Boolean function on (n+n) bits.

    Unitary Reconstruction of Secret for Stabilizer Based Quantum Secret Sharing

    Ryutaroh Matsumoto
    Comments: IEEEtran.cls, 4 pages, no figure. Submitted to the 2017 IEEE ISIT symposium
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)

    We propose a unitary procedure to reconstruct the quantum secret for a quantum
    secret sharing scheme constructed from stabilizer quantum error-correcting
    codes. Erasure correcting procedures for stabilizer codes need to add missing
    shares for reconstruction of the quantum secret, while unitary reconstruction
    procedures for a certain class of quantum secret sharing schemes are known to
    work without adding missing shares. The proposed procedure also works without
    adding missing shares.

    Relative Error Bounds for Nonnegative Matrix Factorization under a Geometric Assumption

    Zhaoqiang Liu, Vincent Y. F. Tan
    Comments: 13 pages, 4 figures, journal
    Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Methodology (stat.ME)

    We propose a geometric assumption on nonnegative data matrices such that
    under this assumption, we are able to provide upper bounds (both deterministic
    and probabilistic) on the relative error of nonnegative matrix factorization
    (NMF). The algorithm we propose first uses the geometric assumption to obtain
    an exact clustering of the columns of the data matrix; subsequently, it employs
    several rank-one NMFs to obtain the final decomposition. When applied to data
    matrices generated from our statistical model, we observe that our proposed
    algorithm produces factor matrices with comparable relative errors vis-à-vis
    classical NMF algorithms but with much faster speeds. On face image and
    hyperspectral imaging datasets, we demonstrate that our algorithm provides an
    excellent initialization for applying other NMF algorithms at a low
    computational cost. Finally, we show on face and text datasets that the
    combinations of our algorithm and several classical NMF algorithms outperform
    other algorithms in terms of clustering performance.
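
    The cluster-then-factor pipeline described above can be sketched in a few
    lines. Plain k-means stands in for the paper's geometry-based clustering, so
    this illustrates the structure of the algorithm rather than the authors'
    exact method.

        import numpy as np
        from sklearn.cluster import KMeans

        def cluster_then_rank_one_nmf(V, k):
            """Cluster the columns of a nonnegative matrix, then factor each
            cluster with a rank-one NMF. Assumes every cluster is non-empty."""
            labels = KMeans(n_clusters=k, n_init=10).fit_predict(V.T)
            W = np.zeros((V.shape[0], k))
            H = np.zeros((k, V.shape[1]))
            for j in range(k):
                cols = labels == j
                # Rank-one NMF of a nonnegative block: by Perron-Frobenius,
                # the leading singular vectors can be taken entrywise nonnegative.
                U, s, Vt = np.linalg.svd(V[:, cols], full_matrices=False)
                W[:, j] = np.abs(U[:, 0]) * np.sqrt(s[0])
                H[j, cols] = np.abs(Vt[0]) * np.sqrt(s[0])
            return W, H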



