IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Mon, 6 Mar 2017

    我爱机器学习(52ml.net)发表于 2017-03-06 00:00:00
    love 0

    Neural and Evolutionary Computing

    On the Behavior of Convolutional Nets for Feature Extraction

    Dario Garcia-Gasulla, Ferran Parés, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura
    Comments: Submitted to Journal of Artificial Intelligence Research Special Track on Deep Learning, Knowledge Representation, and Reasoning
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Convolutional neural networks (CNN) are representation learning techniques
    that achieve state-of-the-art performance on almost every image-related,
    machine learning task. Applying the representation languages build by these
    models to tasks beyond the one they were originally trained for is a field of
    interest known as transfer learning for feature extraction. Through this
    approach, one can apply the image descriptors learnt by a CNN after processing
    millions of images to any dataset, without an expensive training phase.
    Contributions to this field have so far focused on extracting CNN features from
    layers close to the output (e.g., fully connected layers), particularly because
    they work better when used out-of-the-box to feed a classifier. Nevertheless,
    the rest of CNN features is known to encode a wide variety of visual
    information, which could be potentially exploited on knowledge representation
    and reasoning tasks. In this paper we analyze the behavior of each feature
    individually, exploring their intra/inter class activations for all classes of
    three different datasets. From this study we learn that low and middle level
    features behave very differently to high level features, the former being more
    descriptive and the latter being more discriminant. We show how low and middle
    level features can be used for knowledge representation purposes both by their
    presence or by their absence. We also study how much noise these features may
    encode, and propose a thresholding approach to discard most of it. Finally, we
    discuss the potential implications of these results in the context of knowledge
    representation using features extracted from a CNN.

    Large-Scale Evolution of Image Classifiers

    Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Quoc Le, Alex Kurakin
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)

    Neural networks have proven effective at solving difficult problems but
    designing their architectures can be challenging, even for image classification
    problems alone. Evolutionary algorithms provide a technique to discover such
    networks automatically. Despite significant computational requirements, we show
    that evolving models that rival large, hand-designed architectures is possible
    today. We employ simple evolutionary techniques at unprecedented scales to
    discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial
    initial conditions. To do this, we use novel and intuitive mutation operators
    that navigate large search spaces. We stress that no human participation is
    required once evolution starts and that the output is a fully-trained model.
    Throughout this work, we place special emphasis on the repeatability of
    results, the variability in the outcomes and the computational requirements.

    Adversarial Examples for Semantic Image Segmentation

    Volker Fischer, Mummadi Chaithanya Kumar, Jan Hendrik Metzen, Thomas Brox
    Comments: ICLR 2017 workshop submission
    Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Machine learning methods in general and Deep Neural Networks in particular
    have shown to be vulnerable to adversarial perturbations. So far this
    phenomenon has mainly been studied in the context of whole-image
    classification. In this contribution, we analyse how adversarial perturbations
    can affect the task of semantic segmentation. We show how existing adversarial
    attackers can be transferred to this task and that it is possible to create
    imperceptible adversarial perturbations that lead a deep network to misclassify
    almost all pixels of a chosen class while leaving network prediction nearly
    unchanged outside this class.


    Computer Vision and Pattern Recognition

    Bridging Saliency Detection to Weakly Supervised Object Detection Based on Self-paced Curriculum Learning

    Dingwen Zhang, Deyu Meng, Long Zhao, Junwei Han
    Comments: Has published in IJCAI 16
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Weakly-supervised object detection (WOD) is a challenging problems in
    computer vision. The key problem is to simultaneously infer the exact object
    locations in the training images and train the object detectors, given only the
    training images with weak image-level labels. Intuitively, by simulating the
    selective attention mechanism of human visual system, saliency detection
    technique can select attractive objects in scenes and thus is a potential way
    to provide useful priors for WOD. However, the way to adopt saliency detection
    in WOD is not trivial since the detected saliency region might be possibly
    highly ambiguous in complex cases. To this end, this paper first
    comprehensively analyzes the challenges in applying saliency detection to WOD.
    Then, we make one of the earliest efforts to bridge saliency detection to WOD
    via the self-paced curriculum learning, which can guide the learning procedure
    to gradually achieve faithful knowledge of multi-class objects from easy to
    hard. The experimental results demonstrate that the proposed approach can
    successfully bridge saliency detection and WOD tasks and achieve the
    state-of-the-art object detection results under the weak supervision.

    Instance Flow Based Online Multiple Object Tracking

    Sebastian Bullinger, Christoph Bodensteiner, Michael Arens
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present a method to perform online Multiple Object Tracking (MOT) of known
    object categories in monocular video data. Current Tracking-by-Detection MOT
    approaches build on top of 2D bounding box detections. In contrast, we exploit
    state-of-the-art instance aware semantic segmentation techniques to compute 2D
    shape representations of target objects in each frame. We predict position and
    shape of segmented instances in subsequent frames by exploiting optical flow
    cues. We define an affinity matrix between instances of subsequent frames which
    reflects locality and visual similarity. The instance association is solved by
    applying the Hungarian method. We evaluate different configurations of our
    algorithm using the MOT 2D 2015 train dataset. The evaluation shows that our
    tracking approach is able to track objects with high relative motions. In
    addition, we provide results of our approach on the MOT 2D 2015 test set for
    comparison with previous works. We achieve a MOTA score of 32.1.

    Incident Light Frequency-based Image Defogging Algorithm

    Wenbo Zhang, Xiaorong Hou
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Considering the problem of color distortion caused by the defogging algorithm
    based on dark channel prior, an improved algorithm was proposed to calculate
    the transmittance of all channels respectively. First, incident light
    frequency’s effect on the transmittance of various color channels was analyzed
    according to the Beer-Lambert’s Law, from which a proportion among various
    channel transmittances was derived; afterwards, images were preprocessed by
    down-sampling to refine transmittance, and then the original size was restored
    to enhance the operational efficiency of the algorithm; finally, the
    transmittance of all color channels was acquired in accordance with the
    proportion, and then the corresponding transmittance was used for image
    restoration in each channel. The experimental results show that compared with
    the existing algorithm, this improved image defogging algorithm could make
    image colors more natural, solve the problem of slightly higher color
    saturation caused by the existing algorithm, and shorten the operation time by
    four to nine times.

    Augmented Reality for Depth Cues in Monocular Minimally Invasive Surgery

    Long Chen, Wen Tang, Nigel W. John, Tao Ruan Wan, Jian Jun Zhang
    Comments: 15 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    One of the major challenges in Minimally Invasive Surgery (MIS) such as
    laparoscopy is the lack of depth perception. In recent years, laparoscopic
    scene tracking and surface reconstruction has been a focus of investigation to
    provide rich additional information to aid the surgical process and compensate
    for the depth perception issue. However, robust 3D surface reconstruction and
    augmented reality with depth perception on the reconstructed scene are yet to
    be reported. This paper presents our work in this area. First, we adopt a
    state-of-the-art visual simultaneous localization and mapping (SLAM) framework
    – ORB-SLAM – and extend the algorithm for use in MIS scenes for reliable
    endoscopic camera tracking and salient point mapping. We then develop a robust
    global 3D surface reconstruction frame- work based on the sparse point clouds
    extracted from the SLAM framework. Our approach is to combine an outlier
    removal filter within a Moving Least Squares smoothing algorithm and then
    employ Poisson surface reconstruction to obtain smooth surfaces from the
    unstructured sparse point cloud. Our proposed method has been quantitatively
    evaluated compared with ground-truth camera trajectories and the organ model
    surface we used to render the synthetic simulation videos. In vivo laparoscopic
    videos used in the tests have demonstrated the robustness and accuracy of our
    proposed framework on both camera tracking and surface reconstruction,
    illustrating the potential of our algorithm for depth augmentation and
    depth-corrected augmented reality in MIS with monocular endoscopes.

    Deep Collaborative Learning for Visual Recognition

    Yan Wang, Lingxi Xie, Ya Zhang, Wenjun Zhang, Alan Yuille
    Comments: Submitted to CVPR 2017 (10 pages, 5 figures)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Deep neural networks are playing an important role in state-of-the-art visual
    recognition. To represent high-level visual concepts, modern networks are
    equipped with large convolutional layers, which use a large number of filters
    and contribute significantly to model complexity. For example, more than half
    of the weights of AlexNet are stored in the first fully-connected layer (4,096
    filters).

    We formulate the function of a convolutional layer as learning a large visual
    vocabulary, and propose an alternative way, namely Deep Collaborative Learning
    (DCL), to reduce the computational complexity. We replace a convolutional layer
    with a two-stage DCL module, in which we first construct a couple of smaller
    convolutional layers individually, and then fuse them at each spatial position
    to consider feature co-occurrence. In mathematics, DCL can be explained as an
    efficient way of learning compositional visual concepts, in which the
    vocabulary size increases exponentially while the model complexity only
    increases linearly. We evaluate DCL on a wide range of visual recognition
    tasks, including a series of multi-digit number classification datasets, and
    some generic image classification datasets such as SVHN, CIFAR and ILSVRC2012.
    We apply DCL to several state-of-the-art network structures, improving the
    recognition accuracy meanwhile reducing the number of parameters (16.82% fewer
    in AlexNet).

    Context Aware Query Image Representation for Particular Object Retrieval

    Zakaria Laskar, Juho Kannala
    Comments: 14 pages, Extended version of a manuscript submitted to SCIA 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The current models of image representation based on Convolutional Neural
    Networks (CNN) have shown tremendous performance in image retrieval. Such
    models are inspired by the information flow along the visual pathway in the
    human visual cortex. We propose that in the field of particular object
    retrieval, the process of extracting CNN representations from query images with
    a given region of interest (ROI) can also be modelled by taking inspiration
    from human vision. Particularly, we show that by making the CNN pay attention
    on the ROI while extracting query image representation leads to significant
    improvement over the baseline methods on challenging Oxford5k and Paris6k
    datasets. Furthermore, we propose an extension to a recently introduced
    encoding method for CNN representations, regional maximum activations of
    convolutions (R-MAC). The proposed extension weights the regional
    representations using a novel saliency measure prior to aggregation. This leads
    to further improvement in retrieval accuracy.

    Denoising Adversarial Autoencoders

    Antonia Creswell, Anil Anthony Bharath
    Comments: submitted to journal
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Unsupervised learning is of growing interest because it unlocks the potential
    held in vast amounts of unlabelled data to learn useful representations for
    inference. Autoencoders, a form of generative model, may be trained by learning
    to reconstruct unlabelled input data from a latent representation space. More
    robust representations may be produced by an autoencoder if it learns to
    recover clean input samples from corrupted ones. Representations may be further
    improved by introducing regularisation during training to shape the
    distribution of the encoded data in latent space. We suggest denoising
    adversarial autoencoders, which combine denoising and regularisation, shaping
    the distribution of latent space using adversarial training. We introduce a
    novel analysis that shows how denoising may be incorporated into the training
    and sampling of adversarial autoencoders. Experiments are performed to assess
    the contributions that denoising makes to the learning of representations for
    classification and sample synthesis. Our results suggest that autoencoders
    trained using a denoising criterion achieve higher classification performance,
    and can synthesise samples that are more consistent with the input data than
    those trained without a corruption process.

    EmotioNet Challenge: Recognition of facial expressions of emotion in the wild

    C. Fabian Benitez-Quiroz, Ramprakash Srinivasan, Qianli Feng, Yan Wang, Aleix M. Martinez
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper details the methodology and results of the EmotioNet challenge.
    This challenge is the first to test the ability of computer vision algorithms
    in the automatic analysis of a large number of images of facial expressions of
    emotion in the wild. The challenge was divided into two tracks. The first track
    tested the ability of current computer vision algorithms in the automatic
    detection of action units (AUs). Specifically, we tested the detection of 11
    AUs. The second track tested the algorithms’ ability to recognize emotion
    categories in images of facial expressions. Specifically, we tested the
    recognition of 16 basic and compound emotion categories. The results of the
    challenge suggest that current computer vision and machine learning algorithms
    are unable to reliably solve these two tasks. The limitations of current
    algorithms are more apparent when trying to recognize emotion. We also show
    that current algorithms are not affected by mild resolution changes, small
    occluders, gender or age, but that 3D pose is a major limiting factor on
    performance. We provide an in-depth discussion of the points that need special
    attention moving forward.

    A Survey on Content-Aware Video Analysis for Sports

    Huang-Chia Shih
    Comments: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

    Sports data analysis is becoming increasingly large-scale, diversified, and
    shared, but difficulty persists in rapidly accessing the most crucial
    information. Previous surveys have focused on the methodologies of sports video
    analysis from the spatiotemporal viewpoint instead of a content-based
    viewpoint, and few of these studies have considered semantics. This study
    develops a deeper interpretation of content-aware sports video analysis by
    examining the insight offered by research into the structure of content under
    different scenarios. On the basis of this insight, we provide an overview of
    the themes particularly relevant to the research on content-aware systems for
    broadcast sports. Specifically, we focus on the video content analysis
    techniques applied in sportscasts over the past decade from the perspectives of
    fundamentals and general review, a content hierarchical model, and trends and
    challenges. Content-aware analysis methods are discussed with respect to
    object-, event-, and context-oriented groups. In each group, the gap between
    sensation and content excitement must be bridged using proper strategies. In
    this regard, a content-aware approach is required to determine user demands.
    Finally, the paper summarizes the future trends and challenges for sports video
    analysis. We believe that our findings can advance the field of research on
    content-aware video analysis for broadcast sports.

    Deep Learning with Domain Adaptation for Accelerated Projection Reconstruction MR

    Yo Seob Han, Jaejun Yoo, Jong Chul Ye
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Purpose: A radial k-space trajectory is one of well-established sampling
    trajectory in magnetic resonance imaging. However, the radial k-space
    trajectory requires a large number of radial lines for high-resolution
    reconstruction. Increasing the number of lines causes longer sampling times,
    making it more difficult for routine clinical use. If we reduce the radial
    lines to reduce the sampling time, streaking artifact patterns are unavoidable.
    To solve this problem, we propose a novel deep learning approach to reconstruct
    high-resolution MR images from the under-sampled k-space data.

    Methods: The proposed deep network estimates the streaking artifacts. Once
    the streaking artifacts are estimated, an artifact-free image is then obtained
    by subtracting the estimated streaking artifacts from the distorted image. In
    the case of the limited number of available radial acquisition data, we apply a
    domain adaptation scheme, which first pre-trains the network with a large
    number of x-ray computed tomography (CT) data sets and then fine-tunes it with
    only a few MR data sets.

    Results: The proposed deep learning method shows better performance than the
    existing compressed sensing algorithms, such as total variation and PR-FOCUSS.
    In addition, the calculation time is several order of magnitude faster than
    total variation and PR-FOCUSS methods.

    Conclusion: The proposed deep learning method surpasses the image quality as
    well as the computation times against the existing compressed sensing
    algorithms. In addition, we demonstrate the possibilities of domain-adaptation
    approach when a limited number of MR data is available.

    Deep artifact learning for compressed sensing and parallel MRI

    Dongwook Lee, Jaejun Yoo, Jong Chul Ye
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Purpose: Compressed sensing MRI (CS-MRI) from single and parallel coils is
    one of the powerful ways to reduce the scan time of MR imaging with performance
    guarantee. However, the computational costs are usually expensive. This paper
    aims to propose a computationally fast and accurate deep learning algorithm for
    the reconstruction of MR images from highly down-sampled k-space data.

    Theory: Based on the topological analysis, we show that the data manifold of
    the aliasing artifact is easier to learn from a uniform subsampling pattern
    with additional low-frequency k-space data. Thus, we develop deep aliasing
    artifact learning networks for the magnitude and phase images to estimate and
    remove the aliasing artifacts from highly accelerated MR acquisition.

    Methods: The aliasing artifacts are directly estimated from the distorted
    magnitude and phase images reconstructed from subsampled k-space data so that
    we can get an aliasing-free images by subtracting the estimated aliasing
    artifact from corrupted inputs. Moreover, to deal with the globally distributed
    aliasing artifact, we develop a multi-scale deep neural network with a large
    receptive field.

    Results: The experimental results confirm that the proposed deep artifact
    learning network effectively estimates and removes the aliasing artifacts.
    Compared to existing CS methods from single and multi-coli data, the proposed
    network shows minimal errors by removing the coherent aliasing artifacts.
    Furthermore, the computational time is by order of magnitude faster.

    Conclusion: As the proposed deep artifact learning network immediately
    generates accurate reconstruction, it has great potential for clinical
    applications.

    Arbitrary-Oriented Scene Text Detection via Rotation Proposals

    Jianqi Ma, Weiyuan Shao, Hao Ye, Li Wang, Hong Wang, Yingbin Zheng, Xiangyang Xue
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper introduces a novel rotation-based framework for arbitrary-oriented
    text detection in natural scene images. We present the Rotation Region Proposal
    Networks (RRPN), which is designed to generate inclined proposals with text
    orientation angle information. The angle information is then adapted for
    bounding box regression to make the proposals more accurately fit into the text
    region in orientation. The Rotation Region-of-Interest (RRoI) pooling layer is
    proposed to project arbitrary-oriented proposals to the feature map for a text
    region classifier. The whole framework is built upon the Faster-RCNN
    architecture, which ensures the computational efficiency of the
    arbitrary-oriented text detection comparing with previous text detection
    systems. We conduct experiments using the rotation-based framework on three
    real-world scene text detection datasets, and demonstrate its superiority in
    terms of effectiveness and efficiency over previous approaches.

    Skin Lesion Classification using Class Activation Map

    Xi Jia, Linlin Shen
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We proposed a two stage framework with only one network to analyze skin
    lesion images, we firstly trained a convolutional network to classify these
    images, and cropped the import regions which the network has the maximum
    activation value. In the second stage, we retrained this CNN with the image
    regions extracted from stage one and output the final probabilities. The two
    stage framework achieved a mean AUC of 0.857 in ISIC-2017 skin lesion
    validation set and is 0.04 higher than that of the original inputs, 0.821.

    Outlier Cluster Formation in Spectral Clustering

    Takuro Ina, Atsushi Hashimoto, Masaaki Iiyama, Hidekazu Kasahara, Mikihiko Mori, Michihiko Minoh
    Comments: 10 pages, 2 figures, 2 tables
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Outlier detection and cluster number estimation is an important issue for
    clustering real data. This paper focuses on spectral clustering, a time-tested
    clustering method, and reveals its important properties related to outliers.
    The highlights of this paper are the following two mathematical observations:
    first, spectral clustering’s intrinsic property of an outlier cluster
    formation, and second, the singularity of an outlier cluster with a valid
    cluster number. Based on these observations, we designed a function that
    evaluates clustering and outlier detection results. In experiments, we prepared
    two scenarios, face clustering in photo album and person re-identification in a
    camera network. We confirmed that the proposed method detects outliers and
    estimates the number of clusters properly in both problems. Our method
    outperforms state-of-the-art methods in both the 128-dimensional sparse space
    for face clustering and the 4,096-dimensional non-sparse space for person
    re-identification.

    A Novel Multi-task Deep Learning Model for Skin Lesion Segmentation and Classification

    Xulei Yang, Zeng Zeng, Si Yong Yeo, Colin Tan, Hong Liang Tey, Yi Su
    Comments: Submission to support ISIC 2017 challenge results
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this study, a multi-task deep neural network is proposed for skin lesion
    analysis. The proposed multi-task learning model solves different tasks (e.g.,
    lesion segmentation and two independent binary lesion classifications) at the
    same time by exploiting commonalities and differences across tasks. This
    results in improved learning efficiency and potential prediction accuracy for
    the task-specific models, when compared to training the individual models
    separately. The proposed multi-task deep learning model is trained and
    evaluated on the dermoscopic image sets from the International Skin Imaging
    Collaboration (ISIC) 2017 Challenge – Skin Lesion Analysis towards Melanoma
    Detection, which consists of 2000 training samples and 150 evaluation samples.
    The experimental results show that the proposed multi-task deep learning model
    achieves promising performances on skin lesion segmentation and classification.
    The average value of Jaccard index for lesion segmentation is 0.724, while the
    average values of area under the receiver operating characteristic curve (AUC)
    on two individual lesion classifications are 0.880 and 0.972, respectively.

    Adversarial Examples for Semantic Image Segmentation

    Volker Fischer, Mummadi Chaithanya Kumar, Jan Hendrik Metzen, Thomas Brox
    Comments: ICLR 2017 workshop submission
    Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Machine learning methods in general and Deep Neural Networks in particular
    have shown to be vulnerable to adversarial perturbations. So far this
    phenomenon has mainly been studied in the context of whole-image
    classification. In this contribution, we analyse how adversarial perturbations
    can affect the task of semantic segmentation. We show how existing adversarial
    attackers can be transferred to this task and that it is possible to create
    imperceptible adversarial perturbations that lead a deep network to misclassify
    almost all pixels of a chosen class while leaving network prediction nearly
    unchanged outside this class.

    Large-Scale Evolution of Image Classifiers

    Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Quoc Le, Alex Kurakin
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)

    Neural networks have proven effective at solving difficult problems but
    designing their architectures can be challenging, even for image classification
    problems alone. Evolutionary algorithms provide a technique to discover such
    networks automatically. Despite significant computational requirements, we show
    that evolving models that rival large, hand-designed architectures is possible
    today. We employ simple evolutionary techniques at unprecedented scales to
    discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial
    initial conditions. To do this, we use novel and intuitive mutation operators
    that navigate large search spaces. We stress that no human participation is
    required once evolution starts and that the output is a fully-trained model.
    Throughout this work, we place special emphasis on the repeatability of
    results, the variability in the outcomes and the computational requirements.

    Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

    Jangwon Lee, Michael S. Ryoo
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We design a new approach that allows robot learning of new activities from
    unlabeled human example videos. Given videos of humans executing the same
    activity from a human’s viewpoint (i.e., first-person videos), our objective is
    to make the robot learn the temporal structure of the activity as its future
    regression network, and learn to transfer such model for its own motor
    execution. We present a new deep learning model: We extend the state-of-the-art
    convolutional object detection network for the detection of human hands in
    training videos based on image information, and newly introduce the concept of
    using a fully convolutional network to regress (i.e., predict) the intermediate
    scene representation corresponding to the future frame (e.g., 1-2 seconds
    later). Combining these allows direct prediction of future locations of human
    hands and objects, which enables the robot to infer the motor control plan
    using our manipulation network. We experimentally confirm that our approach
    makes learning of robot activities from unlabeled human interaction videos
    possible, and demonstrate that our robot is able to execute the learned
    collaborative activities in real-time directly based on its camera input.

    Estimating the resolution of real images

    Ryuta Mizutani, Rino Saiga, Susumu Takekoshi, Chie Inomoto, Naoya Nakamura, Makoto Arai, Kenichi Oshima, Masanari Itokawa, Akihisa Takeuchi, Kentaro Uesugi, Yasuko Terada, Yoshio Suzuki
    Comments: 4 pages, 2 figures
    Subjects: Data Analysis, Statistics and Probability (physics.data-an); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)

    Image resolvability is the primary concern in imaging. This paper reports an
    estimation of the full width at half maximum of the point spread function from
    a Fourier domain plot of real sample images by neither using test objects, nor
    defining a threshold criterion. We suggest that this method can be applied to
    any type of image, independently of the imaging modality.

    Belief Propagation in Conditional RBMs for Structured Prediction

    Wei Ping, Alexander Ihler
    Comments: Artificial Intelligence and Statistics (AISTATS) 2017
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Restricted Boltzmann machines~(RBMs) and conditional RBMs~(CRBMs) are popular
    models for a wide range of applications. In previous work, learning on such
    models has been dominated by contrastive divergence~(CD) and its variants.
    Belief propagation~(BP) algorithms are believed to be slow for structured
    prediction on conditional RBMs~(e.g., Mnih et al. [2011]), and not as good as
    CD when applied in learning~(e.g., Larochelle et al. [2012]). In this work, we
    present a matrix-based implementation of belief propagation algorithms on
    CRBMs, which is easily scalable to tens of thousands of visible and hidden
    units. We demonstrate that, in both maximum likelihood and max-margin learning,
    training conditional RBMs with BP as the inference routine can provide
    significantly better results than current state-of-the-art CD methods on
    structured prediction problems. We also include practical guidelines on
    training CRBMs with BP, and some insights on the interaction of learning and
    inference algorithms for CRBMs.

    A Restaurant Process Mixture Model for Connectivity Based Parcellation of the Cortex

    Daniel Moyer, Boris A Gutman, Neda Jahanshad, Paul M. Thompson
    Comments: In the Proceedings of Information Processing in Medical Imaging 2017
    Subjects: Neurons and Cognition (q-bio.NC); Computational Engineering, Finance, and Science (cs.CE); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM); Applications (stat.AP)

    One of the primary objectives of human brain mapping is the division of the
    cortical surface into functionally distinct regions, i.e. parcellation. While
    it is generally agreed that at macro-scale different regions of the cortex have
    different functions, the exact number and configuration of these regions is not
    known. Methods for the discovery of these regions are thus important,
    particularly as the volume of available information grows. Towards this end, we
    present a parcellation method based on a Bayesian non-parametric mixture model
    of cortical connectivity.

    Depth Estimation using Modified Cost Function for Occlusion Handling

    Krzysztof Wegner, Olgierd Stankiewicz
    Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)

    The paper presents a novel approach to occlusion handling problem in depth
    estimation using three views. A solution based on modification of similarity
    cost function is proposed. During the depth estimation via optimization
    algorithms like Graph Cut similarity metric is constantly updated so that only
    non-occluded fragments in side views are considered. At each iteration of the
    algorithm non-occluded fragments are detected based on side view virtual depth
    maps synthesized from the best currently estimated depth map of the center
    view. Then similarity metric is updated for correspondence search only in
    non-occluded regions of the side views. The experimental results, conducted on
    well-known 3D video test sequences, have proved that the depth maps estimated
    with the proposed approach provide about 1.25 dB virtual view quality
    improvement in comparison to the virtual view synthesized based on depth maps
    generated by the state-of-the-art MPEG Depth Estimation Reference Software.


    Artificial Intelligence

    Actor-Critic Reinforcement Learning with Simultaneous Human Control and Feedback

    Kory W. Mathewson, Patrick M. Pilarski
    Comments: 10 pages, 2 pages of references, 8 figures. Under review for the 34th International Conference on Machine Learning,Sydney, Australia, 2017. Copyright 2017 by the authors
    Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Robotics (cs.RO)

    This paper contributes a first study into how different human users deliver
    simultaneous control and feedback signals during human-robot interaction. As
    part of this work, we formalize and present a general interactive learning
    framework for online cooperation between humans and reinforcement learning
    agents. In many human-machine interaction settings, there is a growing gap
    between the degrees-of-freedom of complex semi-autonomous systems and the
    number of human control channels. Simple human control and feedback mechanisms
    are required to close this gap and allow for better collaboration between
    humans and machines on complex tasks. To better inform the design of concurrent
    control and feedback interfaces, we present experimental results from a
    human-robot collaborative domain wherein the human must simultaneously deliver
    both control and feedback signals to interactively train an actor-critic
    reinforcement learning robot. We compare three experimental conditions: 1)
    human delivered control signals, 2) reward-shaping feedback signals, and 3)
    simultaneous control and feedback. Our results suggest that subjects provide
    less feedback when simultaneously delivering feedback and control signals and
    that control signal quality is not significantly diminished. Our data suggest
    that subjects may also modify when and how they provide feedback. Through
    algorithmic development and tuning informed by this study, we expect
    semi-autonomous actions of robotic agents can be better shaped by human
    feedback, allowing for seamless collaboration and improved performance in
    difficult interactive domains.

    FeUdal Networks for Hierarchical Reinforcement Learning

    Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu
    Subjects: Artificial Intelligence (cs.AI)

    We introduce FeUdal Networks (FuNs): a novel architecture for hierarchical
    reinforcement learning. Our approach is inspired by the feudal reinforcement
    learning proposal of Dayan and Hinton, and gains power and efficacy by
    decoupling end-to-end learning across multiple levels — allowing it to utilise
    different resolutions of time. Our framework employs a Manager module and a
    Worker module. The Manager operates at a lower temporal resolution and sets
    abstract goals which are conveyed to and enacted by the Worker. The Worker
    generates primitive actions at every tick of the environment. The decoupled
    structure of FuN conveys several benefits — in addition to facilitating very
    long timescale credit assignment it also encourages the emergence of
    sub-policies associated with different goals set by the Manager. These
    properties allow FuN to dramatically outperform a strong baseline agent on
    tasks that involve long-term credit assignment or memorisation. We demonstrate
    the performance of our proposed system on a range of tasks from the ATARI suite
    and also from a 3D DeepMind Lab environment.

    Sequential Plan Recognition

    Reuth Mirsky, Roni Stern, Ya'akov (Kobi)Gal, Meir Kalech
    Subjects: Artificial Intelligence (cs.AI)

    Plan recognition algorithms infer agents’ plans from their observed actions.
    Due to imperfect knowledge about the agent’s behavior and the environment, it
    is often the case that there are multiple hypotheses about an agent’s plans
    that are consistent with the observations, though only one of these hypotheses
    is correct. This paper addresses the problem of how to disambiguate between
    hypotheses, by querying the acting agent about whether a candidate plan in one
    of the hypotheses matches its intentions. This process is performed
    sequentially and used to update the set of possible hypotheses during the
    recognition process. The paper defines the sequential plan recognition process
    (SPRP), which seeks to reduce the number of hypotheses using a minimal number
    of queries. We propose a number of policies for the SPRP which use maximum
    likelihood and information gain to choose which plan to query. We show this
    approach works well in practice on two domains from the literature,
    significantly reducing the number of hypotheses using fewer queries than a
    baseline approach. Our results can inform the design of future plan recognition
    systems that interleave the recognition process with intelligent interventions
    of their users.

    Unsupervised Basis Function Adaptation for Reinforcement Learning

    Edward W. Barker, Charl J. Ras
    Comments: Extended abstract submitted (3 March 2017) for 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    When using reinforcement learning (RL) algorithms to evaluate a policy it is
    common, given a large state space, to introduce some form of approximation
    architecture for the value function (VF). The exact form of this architecture
    can have a significant effect on the accuracy of the VF estimate, however, and
    determining a suitable approximation architecture can often be a highly complex
    task. Consequently there is a large amount of interest in the potential for
    allowing RL algorithms to adaptively generate approximation architectures.

    We investigate a method of adapting approximation architectures which uses
    feedback regarding the frequency with which an agent has visited certain states
    to guide which areas of the state space to approximate with greater detail.
    This method is “unsupervised” in the sense that it makes no direct reference to
    reward or the VF estimate. We introduce an algorithm based upon this idea which
    adapts a state aggregation approximation architecture on-line.

    A common method of scoring a VF estimate is to weight the squared Bellman
    error of each state-action by the probability of that state-action occurring.
    Adopting this scoring method, and assuming (S) states, we demonstrate
    theoretically that – provided (1) the number of cells (X) in the state
    aggregation architecture is of order (sqrt{S}log_2{S}ln{S}) or greater, (2)
    the policy and transition function are close to deterministic, and (3) the
    prior for the transition function is uniformly distributed – our algorithm,
    used in conjunction with a suitable RL algorithm, can guarantee a score which
    is arbitrarily close to zero as (S) becomes large. It is able to do this
    despite having only (O(X log_2S)) space complexity and negligible time
    complexity. The results take advantage of certain properties of the stationary
    distributions of Markov chains.

    Stochastic Separation Theorems

    A.N. Gorban, I.Y. Tyukin
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    A set (S) is linearly separable if each (xin S) can be separated from the
    rest of (S) by a linear functional. We study random (N)-element sets in
    (mathbb{R}^n) for large (n) and demonstrate that for (N<aexp(b{n})) they are
    linearly separable with probability (p), (p>1-vartheta), for a given (small)
    (vartheta>0). Constants (a,b>0) depend on the probability distribution and the
    constant (vartheta). The results are important for machine learning in high
    dimension, especially for correction of unavoidable mistakes of legacy
    Artificial Intelligence systems.

    On the Behavior of Convolutional Nets for Feature Extraction

    Dario Garcia-Gasulla, Ferran Parés, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura
    Comments: Submitted to Journal of Artificial Intelligence Research Special Track on Deep Learning, Knowledge Representation, and Reasoning
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Convolutional neural networks (CNN) are representation learning techniques
    that achieve state-of-the-art performance on almost every image-related,
    machine learning task. Applying the representation languages build by these
    models to tasks beyond the one they were originally trained for is a field of
    interest known as transfer learning for feature extraction. Through this
    approach, one can apply the image descriptors learnt by a CNN after processing
    millions of images to any dataset, without an expensive training phase.
    Contributions to this field have so far focused on extracting CNN features from
    layers close to the output (e.g., fully connected layers), particularly because
    they work better when used out-of-the-box to feed a classifier. Nevertheless,
    the rest of CNN features is known to encode a wide variety of visual
    information, which could be potentially exploited on knowledge representation
    and reasoning tasks. In this paper we analyze the behavior of each feature
    individually, exploring their intra/inter class activations for all classes of
    three different datasets. From this study we learn that low and middle level
    features behave very differently to high level features, the former being more
    descriptive and the latter being more discriminant. We show how low and middle
    level features can be used for knowledge representation purposes both by their
    presence or by their absence. We also study how much noise these features may
    encode, and propose a thresholding approach to discard most of it. Finally, we
    discuss the potential implications of these results in the context of knowledge
    representation using features extracted from a CNN.

    Large-Scale Evolution of Image Classifiers

    Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Quoc Le, Alex Kurakin
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)

    Neural networks have proven effective at solving difficult problems but
    designing their architectures can be challenging, even for image classification
    problems alone. Evolutionary algorithms provide a technique to discover such
    networks automatically. Despite significant computational requirements, we show
    that evolving models that rival large, hand-designed architectures is possible
    today. We employ simple evolutionary techniques at unprecedented scales to
    discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial
    initial conditions. To do this, we use novel and intuitive mutation operators
    that navigate large search spaces. We stress that no human participation is
    required once evolution starts and that the output is a fully-trained model.
    Throughout this work, we place special emphasis on the repeatability of
    results, the variability in the outcomes and the computational requirements.

    Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

    Jangwon Lee, Michael S. Ryoo
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We design a new approach that allows robot learning of new activities from
    unlabeled human example videos. Given videos of humans executing the same
    activity from a human’s viewpoint (i.e., first-person videos), our objective is
    to make the robot learn the temporal structure of the activity as its future
    regression network, and learn to transfer such model for its own motor
    execution. We present a new deep learning model: We extend the state-of-the-art
    convolutional object detection network for the detection of human hands in
    training videos based on image information, and newly introduce the concept of
    using a fully convolutional network to regress (i.e., predict) the intermediate
    scene representation corresponding to the future frame (e.g., 1-2 seconds
    later). Combining these allows direct prediction of future locations of human
    hands and objects, which enables the robot to infer the motor control plan
    using our manipulation network. We experimentally confirm that our approach
    makes learning of robot activities from unlabeled human interaction videos
    possible, and demonstrate that our robot is able to execute the learned
    collaborative activities in real-time directly based on its camera input.

    End-to-End Task-Completion Neural Dialogue Systems

    Xuijun Li, Yun-Nung Chen, Lihong Li, Jianfeng Gao
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper presents an end-to-end learning framework for task-completion
    neural dialogue systems, which leverages supervised and reinforcement learning
    with various deep-learning models. The system is able to interface with a
    structured database, and interact with users for assisting them to access
    information and complete tasks such as booking movie tickets. Our experiments
    in a movie-ticket booking domain show the proposed system outperforms a
    modular-based dialogue system and is more robust to noise produced by other
    components in the system.

    A Laplacian Framework for Option Discovery in Reinforcement Learning

    Marlos C. Machado, Marc G. Bellemare, Michael Bowling
    Comments: Version submitted to the 34th International Conference on Machine Learning (ICML)
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Representation learning and option discovery are two of the biggest
    challenges in reinforcement learning (RL). Proto-RL is a well known approach
    for representation learning in MDPs. The representations learned with this
    framework are called proto-value functions (PVFs). In this paper we address the
    option discovery problem by showing how PVFs implicitly define options. We do
    it by introducing eigenpurposes, intrinsic reward functions derived from the
    learned representations. The options discovered from eigenpurposes traverse the
    principal directions of the state space. They are useful for multiple tasks
    because they are independent of the agents’ intentions. Moreover, by capturing
    the diffusion process of a random walk, different options act at different time
    scales, making them helpful for exploration strategies. We demonstrate features
    of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.

    Controllable Text Generation

    Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, Eric P. Xing
    Comments: updated discussions and references
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)

    Generic generation and manipulation of text is challenging and has limited
    success compared to recent deep generative modeling in visual domain. This
    paper aims at generating plausible natural language sentences, whose attributes
    are dynamically controlled by learning disentangled latent representations with
    designated semantics. We propose a new neural generative model which combines
    variational auto-encoders and holistic attribute discriminators for effective
    imposition of semantic structures. With differentiable approximation to
    discrete text samples, explicit constraints on independent attribute controls,
    and efficient collaborative learning of generator and discriminators, our model
    learns highly interpretable representations from even only word annotations,
    and produces realistic sentences with desired attributes. Quantitative
    evaluation validates the accuracy of sentence and attribute generation.

    DAWT: Densely Annotated Wikipedia Texts across multiple languages

    Nemanja Spasojevic, Preeti Bhargava, Guoning Hu
    Comments: 8 pages, 3 figures, 7 tables, WWW2017, WWW 2017 Companion proceedings
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    In this work, we open up the DAWT dataset – Densely Annotated Wikipedia Texts
    across multiple languages. The annotations include labeled text mentions
    mapping to entities (represented by their Freebase machine ids) as well as the
    type of the entity. The data set contains total of 13.6M articles, 5.0B tokens,
    13.8M mention entity co-occurrences. DAWT contains 4.8 times more anchor text
    to entity links than originally present in the Wikipedia markup. Moreover, it
    spans several languages including English, Spanish, Italian, German, French and
    Arabic. We also present the methodology used to generate the dataset which
    enriches Wikipedia markup in order to increase number of links. In addition to
    the main dataset, we open up several derived datasets including mention entity
    co-occurrence counts and entity embeddings, as well as mappings between
    Freebase ids and Wikidata item ids. We also discuss two applications of these
    datasets and hope that opening them up would prove useful for the Natural
    Language Processing and Information Retrieval communities, as well as
    facilitate multi-lingual research.


    Information Retrieval

    Employing Spectral Domain Features for Efficient Collaborative Filtering

    Doaa M. Shawky
    Subjects: Information Retrieval (cs.IR)

    Collaborative filtering (CF) is a powerful recommender system that generates
    a list of recommended items for an active user based on the ratings of similar
    users. This paper presents a novel approach to CF by first finding the set of
    users similar to the active user by adopting self-organizing maps (SOM),
    followed by k-means clustering. Then, the ratings for each item in the cluster
    closest to the active user are mapped to the frequency domain using the
    Discrete Fourier Transform (DFT). The power spectra of the mapped ratings are
    generated, and a new similarity measure based on the coherence of these power
    spectra is calculated. The proposed similarity measure is more time efficient
    than current state-of-the-art measures. Moreover, it can capture the global
    similarity between the profiles of users. Experimental results show that the
    proposed approach overcomes the major problems in existing CF algorithms as
    follows: First, it mitigates the scalability problem by creating clusters of
    similar users and applying the time-efficient similarity measure. Second, its
    frequency-based similarity measure is less sensitive to sparsity problems
    because the DFT performs efficiently even with sparse data. Third, it
    outperforms standard similarity measures in terms of accuracy.

    DAWT: Densely Annotated Wikipedia Texts across multiple languages

    Nemanja Spasojevic, Preeti Bhargava, Guoning Hu
    Comments: 8 pages, 3 figures, 7 tables, WWW2017, WWW 2017 Companion proceedings
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    In this work, we open up the DAWT dataset – Densely Annotated Wikipedia Texts
    across multiple languages. The annotations include labeled text mentions
    mapping to entities (represented by their Freebase machine ids) as well as the
    type of the entity. The data set contains total of 13.6M articles, 5.0B tokens,
    13.8M mention entity co-occurrences. DAWT contains 4.8 times more anchor text
    to entity links than originally present in the Wikipedia markup. Moreover, it
    spans several languages including English, Spanish, Italian, German, French and
    Arabic. We also present the methodology used to generate the dataset which
    enriches Wikipedia markup in order to increase number of links. In addition to
    the main dataset, we open up several derived datasets including mention entity
    co-occurrence counts and entity embeddings, as well as mappings between
    Freebase ids and Wikidata item ids. We also discuss two applications of these
    datasets and hope that opening them up would prove useful for the Natural
    Language Processing and Information Retrieval communities, as well as
    facilitate multi-lingual research.

    Deconvolving Feedback Loops in Recommender Systems

    Ayan Sinha, David F. Gleich, Karthik Ramani
    Comments: Neural Information Processing Systems, 2016
    Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR)

    Collaborative filtering is a popular technique to infer users’ preferences on
    new content based on the collective information of all users preferences.
    Recommender systems then use this information to make personalized suggestions
    to users. When users accept these recommendations it creates a feedback loop in
    the recommender system, and these loops iteratively influence the collaborative
    filtering algorithm’s predictions over time. We investigate whether it is
    possible to identify items affected by these feedback loops. We state
    sufficient assumptions to deconvolve the feedback loops while keeping the
    inverse solution tractable. We furthermore develop a metric to unravel the
    recommender system’s influence on the entire user-item rating matrix. We use
    this metric on synthetic and real-world datasets to (1) identify the extent to
    which the recommender system affects the final rating matrix, (2) rank
    frequently recommended items, and (3) distinguish whether a user’s rated item
    was recommended or an intrinsic preference. Our results indicate that it is
    possible to recover the ratings matrix of intrinsic user preferences using a
    single snapshot of the ratings matrix without any temporal information.


    Computation and Language

    Exponential Moving Average Model in Parallel Speech Recognition Training

    Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei
    Comments: 5 pages
    Subjects: Computation and Language (cs.CL)

    As training data rapid growth, large-scale parallel training with multi-GPUs
    cluster is widely applied in the neural network model learning currently.We
    present a new approach that applies exponential moving average method in
    large-scale parallel training of neural network model. It is a non-interference
    strategy that the exponential moving average model is not broadcasted to
    distributed workers to update their local models after model synchronization in
    the training process, and it is implemented as the final model of the training
    system. Fully-connected feed-forward neural networks (DNNs) and deep
    unidirectional Long short-term memory (LSTM) recurrent neural networks (RNNs)
    are successfully trained with proposed method for large vocabulary continuous
    speech recognition on Shenma voice search data in Mandarin. The character error
    rate (CER) of Mandarin speech recognition further degrades than
    state-of-the-art approaches of parallel training.

    End-to-End Task-Completion Neural Dialogue Systems

    Xuijun Li, Yun-Nung Chen, Lihong Li, Jianfeng Gao
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    This paper presents an end-to-end learning framework for task-completion
    neural dialogue systems, which leverages supervised and reinforcement learning
    with various deep-learning models. The system is able to interface with a
    structured database, and interact with users for assisting them to access
    information and complete tasks such as booking movie tickets. Our experiments
    in a movie-ticket booking domain show the proposed system outperforms a
    modular-based dialogue system and is more robust to noise produced by other
    components in the system.

    A Comparative Study of Word Embeddings for Reading Comprehension

    Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, William W. Cohen
    Subjects: Computation and Language (cs.CL)

    The focus of past machine learning research for Reading Comprehension tasks
    has been primarily on the design of novel deep learning architectures. Here we
    show that seemingly minor choices made on (1) the use of pre-trained word
    embeddings, and (2) the representation of out-of-vocabulary tokens at test
    time, can turn out to have a larger impact than architectural choices on the
    final performance. We systematically explore several options for these choices,
    and provide recommendations to researchers working in this area.

    Controllable Text Generation

    Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, Eric P. Xing
    Comments: updated discussions and references
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)

    Generic generation and manipulation of text is challenging and has limited
    success compared to recent deep generative modeling in visual domain. This
    paper aims at generating plausible natural language sentences, whose attributes
    are dynamically controlled by learning disentangled latent representations with
    designated semantics. We propose a new neural generative model which combines
    variational auto-encoders and holistic attribute discriminators for effective
    imposition of semantic structures. With differentiable approximation to
    discrete text samples, explicit constraints on independent attribute controls,
    and efficient collaborative learning of generator and discriminators, our model
    learns highly interpretable representations from even only word annotations,
    and produces realistic sentences with desired attributes. Quantitative
    evaluation validates the accuracy of sentence and attribute generation.

    DAWT: Densely Annotated Wikipedia Texts across multiple languages

    Nemanja Spasojevic, Preeti Bhargava, Guoning Hu
    Comments: 8 pages, 3 figures, 7 tables, WWW2017, WWW 2017 Companion proceedings
    Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    In this work, we open up the DAWT dataset – Densely Annotated Wikipedia Texts
    across multiple languages. The annotations include labeled text mentions
    mapping to entities (represented by their Freebase machine ids) as well as the
    type of the entity. The data set contains total of 13.6M articles, 5.0B tokens,
    13.8M mention entity co-occurrences. DAWT contains 4.8 times more anchor text
    to entity links than originally present in the Wikipedia markup. Moreover, it
    spans several languages including English, Spanish, Italian, German, French and
    Arabic. We also present the methodology used to generate the dataset which
    enriches Wikipedia markup in order to increase number of links. In addition to
    the main dataset, we open up several derived datasets including mention entity
    co-occurrence counts and entity embeddings, as well as mappings between
    Freebase ids and Wikidata item ids. We also discuss two applications of these
    datasets and hope that opening them up would prove useful for the Natural
    Language Processing and Information Retrieval communities, as well as
    facilitate multi-lingual research.


    Distributed, Parallel, and Cluster Computing

    A Layered Architecture for Erasure-Coded Consistent Distributed Storage

    Kishori M. Konwar, N. Prakash, Nancy Lynch, Muriel Medard
    Comments: 22 pages
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    Motivated by emerging applications to the edge computing paradigm, we
    introduce a two-layer erasure-coded fault-tolerant distributed storage system
    offering atomic access for read and write operations. In edge computing,
    clients interact with an edge-layer of servers that is geographically near; the
    edge-layer in turn interacts with a back-end layer of servers. The edge-layer
    provides low latency access and temporary storage for client operations, and
    uses the back-end layer for persistent storage. Our algorithm, termed Layered
    Data Storage (LDS) algorithm, offers several features suitable for
    edge-computing systems, works under asynchronous message-passing environments,
    supports multiple readers and writers, and can tolerate (f_1 < n_1/2) and (f_2
    < n_2/3) crash failures in the two layers having (n_1) and (n_2) servers,
    respectively. We use a class of erasure codes known as regenerating codes for
    storage of data in the back-end layer. The choice of regenerating codes,
    instead of popular choices like Reed-Solomon codes, not only optimizes the cost
    of back-end storage, but also helps in optimizing communication cost of read
    operations, when the value needs to be recreated all the way from the back-end.
    The two-layer architecture permits a modular implementation of atomicity and
    erasure-code protocols; the implementation of erasure-codes is mostly limited
    to interaction between the two layers. We prove liveness and atomicity of LDS,
    and also compute performance costs associated with read and write operations.
    Further, in a multi-object system running (N) independent instances of LDS,
    where only a small fraction of the objects undergo concurrent accesses at any
    point during the execution, the overall storage cost is dominated by that of
    persistent storage in the back-end layer, and is given by (Theta(N)).

    Workload Analysis of Blue Waters

    Matthew D. Jones, Joseph P. White, Martins Innus, Robert L. DeLeon, Nikolay Simakov, Jeffrey T. Palmer, Steven M. Gallo, Thomas R. Furlani, Michael Showerman, Robert Brunner, Andry Kot, Gregory Bauer, Brett Bode, Jeremy Enos, William Kramer
    Comments: 107 pages, &gt;100 figures (figure sizes reduced to save space, contact authors for version with full resolution)
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Blue Waters is a Petascale-level supercomputer whose mission is to enable the
    national scientific and research community to solve “grand challenge” problems
    that are orders of magnitude more complex than can be carried out on other high
    performance computing systems. Given the important and unique role that Blue
    Waters plays in the U.S. research portfolio, it is important to have a detailed
    understanding of its workload in order to guide performance optimization both
    at the software and system configuration level as well as inform architectural
    balance tradeoffs. Furthermore, understanding the computing requirements of the
    Blue Water’s workload (memory access, IO, communication, etc.), which is
    comprised of some of the most computationally demanding scientific problems,
    will help drive changes in future computing architectures, especially at the
    leading edge. With this objective in mind, the project team carried out a
    detailed workload analysis of Blue Waters.

    Everware toolkit. Supporting reproducible science and challenge-driven education

    Andrey Ustyuzhanin, Timothy Daniel Head, Igor Babuschkin, Alexander Tiunov
    Subjects: Computers and Society (cs.CY); Distributed, Parallel, and Cluster Computing (cs.DC)

    Modern science clearly demands for a higher level of reproducibility and
    collaboration. To make research fully reproducible one has to take care of
    several aspects: research protocol description, data access, environment
    preservation, workflow pipeline, and analysis script preservation. Version
    control systems like git help with the workflow and analysis scripts part.
    Virtualization techniques like Docker or Vagrant can help deal with
    environments. Jupyter notebooks are a powerful platform for conducting research
    in a collaborative manner. We present project Everware that seamlessly
    integrates git repository management systems such as Github or Gitlab, Docker
    and Jupyter helping with a) sharing results of real research and b) boosts
    education activities. With the help of Everware one can not only share the
    final artifacts of research but all the depth of the research process. This
    been shown to be extremely helpful during organization of several data analysis
    hackathons and machine learning schools. Using Everware participants could
    start from an existing solution instead of starting from scratch. They could
    start contributing immediately. Everware allows its users to make use of their
    own computational resources to run the workflows they are interested in, which
    leads to higher scalability of the toolkit.

    Runtime Optimization of Join Location in Parallel Data Management Systems

    Bikash Chandra, S. Sudarshan
    Comments: 16 pages
    Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)

    Applications running on parallel systems often need to join a streaming
    relation or a stored relation with data indexed in a parallel data storage
    system. Some applications also compute UDFs on the joined tuples. The join can
    be done at the data storage nodes, corresponding to reduce side joins, or by
    fetching data from the storage system, corresponding to map side join. Both may
    be suboptimal: reduce side joins may cause skew, while map side joins may lead
    to a lot of data being transferred and replicated.

    In this paper, we present techniques to make runtime decisions between the
    two options on a per key basis, in order to improve the throughput of the join,
    accounting for UDF computation if any. Our techniques are based on an extended
    ski-rental algorithm and provide worst-case performance guarantees with respect
    to the optimal point in the space considered by us. Our techniques use load
    balancing taking into account the CPU, network and I/O costs as well as the
    load at clients and servers. We have implemented our techniques on Hadoop,
    Spark and the Muppet stream processing engine. Our experiments show that our
    optimization techniques provide significant improvement in throughput over
    existing techniques.

    When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors

    Aneesh Sharma, C. Seshadhri, Ashish Goel
    Subjects: Social and Information Networks (cs.SI); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

    Finding similar user pairs is a fundamental task in social networks, with
    numerous applications in ranking and personalization tasks such as link
    prediction and tie strength detection. A common manifestation of user
    similarity is based upon network structure: each user is represented by a
    vector that represents the user’s network connections, where pairwise cosine
    similarity among these vectors defines user similarity. The predominant task
    for user similarity applications is to discover all similar pairs that have a
    pairwise cosine similarity value larger than a given threshold ( au). In
    contrast to previous work where ( au) is assumed to be quite close to 1, we
    focus on recommendation applications where ( au) is small, but still
    meaningful. The all pairs cosine similarity problem is computationally
    challenging on networks with billions of edges, and especially so for settings
    with small ( au). To the best of our knowledge, there is no practical solution
    for computing all user pairs with, say ( au = 0.2) on large social networks,
    even using the power of distributed algorithms.

    Our work directly addresses this challenge by introducing a new algorithm —
    WHIMP — that solves this problem efficiently in the MapReduce model. The key
    insight in WHIMP is to combine the “wedge-sampling” approach of Cohen-Lewis for
    approximate matrix multiplication with the SimHash random projection techniques
    of Charikar. We provide a theoretical analysis of WHIMP, proving that it has
    near optimal communication costs while maintaining computation cost comparable
    with the state of the art. We also empirically demonstrate WHIMP’s scalability
    by computing all highly similar pairs on four massive data sets, and show that
    it accurately finds high similarity pairs. In particular, we note that WHIMP
    successfully processes the entire Twitter network, which has tens of billions
    of edges.

    Large-Scale Evolution of Image Classifiers

    Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Quoc Le, Alex Kurakin
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)

    Neural networks have proven effective at solving difficult problems but
    designing their architectures can be challenging, even for image classification
    problems alone. Evolutionary algorithms provide a technique to discover such
    networks automatically. Despite significant computational requirements, we show
    that evolving models that rival large, hand-designed architectures is possible
    today. We employ simple evolutionary techniques at unprecedented scales to
    discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial
    initial conditions. To do this, we use novel and intuitive mutation operators
    that navigate large search spaces. We stress that no human participation is
    required once evolution starts and that the output is a fully-trained model.
    Throughout this work, we place special emphasis on the repeatability of
    results, the variability in the outcomes and the computational requirements.


    Learning

    EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

    Justin Fu, John D. Co-Reyes, Sergey Levine
    Subjects: Learning (cs.LG)

    Efficient exploration in high-dimensional environments remains a key
    challenge in reinforcement learning (RL). Deep reinforcement learning methods
    have demonstrated the ability to learn with highly general policy classes for
    complex tasks with high-dimensional inputs, such as raw images. However, many
    of the most effective exploration techniques rely on tabular representations,
    or on the ability to construct a generative model over states and actions. Both
    are exceptionally difficult when these inputs are complex and high dimensional.
    On the other hand, it is comparatively easy to build discriminative models on
    top of complex states such as images using standard deep neural networks. This
    paper introduces a novel approach, EX2, which approximates state visitation
    densities by training an ensemble of discriminators, and assigns reward bonuses
    to rarely visited states. We demonstrate that EX2 achieves comparable
    performance to the state-of-the-art methods on low-dimensional tasks, and its
    effectiveness scales into high-dimensional state spaces such as visual domains
    without hand-designing features or density models.

    Learning Graphical Games from Behavioral Data: Sufficient and Necessary Conditions

    Asish Ghoshal, Jean Honorio
    Comments: Accepted to AISTATS 2017, Florida. arXiv admin note: substantial text overlap with arXiv:1607.02959
    Subjects: Learning (cs.LG)

    In this paper we obtain sufficient and necessary conditions on the number of
    samples required for exact recovery of the pure-strategy Nash equilibria (PSNE)
    set of a graphical game from noisy observations of joint actions. We consider
    sparse linear influence games — a parametric class of graphical games with
    linear payoffs, and represented by directed graphs of n nodes (players) and
    in-degree of at most k. We show that one can efficiently recover the PSNE set
    of a linear influence game with (O(k^2 log n)) samples, under very general
    observation models. On the other hand, we show that (Omega(k log n)) samples
    are necessary for any procedure to recover the PSNE set from observations of
    joint actions.

    Stochastic Separation Theorems

    A.N. Gorban, I.Y. Tyukin
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    A set (S) is linearly separable if each (xin S) can be separated from the
    rest of (S) by a linear functional. We study random (N)-element sets in
    (mathbb{R}^n) for large (n) and demonstrate that for (N<aexp(b{n})) they are
    linearly separable with probability (p), (p>1-vartheta), for a given (small)
    (vartheta>0). Constants (a,b>0) depend on the probability distribution and the
    constant (vartheta). The results are important for machine learning in high
    dimension, especially for correction of unavoidable mistakes of legacy
    Artificial Intelligence systems.

    Learning Identifiable Gaussian Bayesian Networks in Polynomial Time and Sample Complexity

    Asish Ghoshal, Jean Honorio
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Learning the directed acyclic graph (DAG) structure of a Bayesian network
    from observational data is a notoriously difficult problem for which many
    hardness results are known. In this paper we propose a provably polynomial-time
    algorithm for learning sparse Gaussian Bayesian networks with equal noise
    variance — a class of Bayesian networks for which the DAG structure can be
    uniquely identified from observational data — under high-dimensional
    settings. We show that (O(k^4 log p)) number of samples suffices for our
    method to recover the true DAG structure with high probability, where (p) is
    the number of variables and (k) is the maximum Markov blanket size. We obtain
    our theoretical guarantees under a condition called Restricted Strong Adjacency
    Faithfulness, which is strictly weaker than strong faithfulness — a condition
    that other methods based on conditional independence testing need for their
    success. The sample complexity of our method matches the information-theoretic
    limits in terms of the dependence on (p). We show that our method out-performs
    existing state-of-the-art methods for learning Gaussian Bayesian networks in
    terms of recovering the true DAG structure while being comparable in speed to
    heuristic methods.

    Dynamic State Warping

    Zhichen Gong, Huanhuan Chen
    Subjects: Learning (cs.LG)

    The ubiquity of sequences in many domains enhances significant recent
    interest in sequence learning, for which a basic problem is how to measure the
    distance between sequences. Dynamic time warping (DTW) aligns two sequences by
    nonlinear local warping and returns a distance value. DTW shows superior
    ability in many applications, e.g. video, image, etc. However, in DTW, two
    points are paired essentially based on point-to-point Euclidean distance (ED)
    without considering the autocorrelation of sequences. Thus, points with
    different semantic meanings, e.g. peaks and valleys, may be matched providing
    their coordinate values are similar. As a result, DTW is sensitive to noise and
    poorly interpretable. This paper proposes an efficient and flexible sequence
    alignment algorithm, dynamic state warping (DSW). DSW converts each time point
    into a latent state, which endows point-wise autocorrelation information.
    Alignment is performed by using the state sequences. Thus DSW is able to yield
    alignment that is semantically more interpretable than that of DTW. Using one
    nearest neighbor classifier, DSW shows significant improvement on
    classification accuracy in comparison to ED (70/85 wins) and DTW (74/85 wins).
    We also empirically demonstrate that DSW is more robust and scales better to
    long sequences than ED and DTW.

    Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction

    Wen Sun, Arun Venkatraman, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell
    Comments: 17 pages
    Subjects: Learning (cs.LG)

    Researchers have demonstrated state-of-the-art performance in sequential
    decision making problems (e.g., robotics control, sequential prediction) with
    deep neural network models. One often has access to near-optimal oracles that
    achieve good performance on the task during training. We demonstrate that
    AggreVaTeD — a policy gradient extension of the Imitation Learning (IL)
    approach of (Ross & Bagnell, 2014) — can leverage such an oracle to achieve
    faster and better solutions with less training data than a less-informed
    Reinforcement Learning (RL) technique. Using both feedforward and recurrent
    neural network predictors, we present stochastic gradient procedures on a
    sequential prediction task, dependency-parsing from raw image data, as well as
    on various high dimensional robotics control problems. We also provide a
    comprehensive theoretical study of IL that demonstrates we can expect up to
    exponentially lower sample complexity for learning with AggreVaTeD than with RL
    algorithms, which backs our empirical findings. Our results and theory indicate
    that the proposed approach can achieve superior performance with respect to the
    oracle when the demonstrator is sub-optimal.

    Active Learning for Cost-Sensitive Classification

    Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daume III, John Langford
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    We design an active learning algorithm for cost-sensitive multiclass
    classification: problems where different errors have different costs. Our
    algorithm, COAL, makes predictions by regressing on each label’s cost and
    predicting the smallest. On a new example, it uses a set of regressors that
    perform well on past data to estimate possible costs for each label. It queries
    only the labels that could be the best, ignoring the sure losers. We prove COAL
    can be efficiently implemented for any regression family that admits squared
    loss optimization; it also enjoys strong guarantees with respect to predictive
    performance and labeling effort. We empirically compare COAL to passive
    learning, showing significant improvements in labeling effort and test cost.

    Scalable Deep Traffic Flow Neural Networks for Urban Traffic Congestion Prediction

    Mohammadhani Fouladgar, Mostafa Parchami, Ramez Elmasri, Amir Ghaderi
    Subjects: Learning (cs.LG)

    Tracking congestion throughout the network road is a critical component of
    Intelligent transportation network management systems. Understanding how the
    traffic flows and short-term prediction of congestion occurrence due to
    rush-hour or incidents can be beneficial to such systems to effectively manage
    and direct the traffic to the most appropriate detours. Many of the current
    traffic flow prediction systems are designed by utilizing a central processing
    component where the prediction is carried out through aggregation of the
    information gathered from all measuring stations. However, centralized systems
    are not scalable and fail provide real-time feedback to the system whereas in a
    decentralized scheme, each node is responsible to predict its own short-term
    congestion based on the local current measurements in neighboring nodes.

    We propose a decentralized deep learning-based method where each node
    accurately predicts its own congestion state in real-time based on the
    congestion state of the neighboring stations. Moreover, historical data from
    the deployment site is not required, which makes the proposed method more
    suitable for newly installed stations. In order to achieve higher performance,
    we introduce a regularized Euclidean loss function that favors high congestion
    samples over low congestion samples to avoid the impact of the unbalanced
    training dataset. A novel dataset for this purpose is designed based on the
    traffic data obtained from traffic control stations in northern California.
    Extensive experiments conducted on the designed benchmark reflect a successful
    congestion prediction.

    Optimization of distributions differences for classification

    Mohammad Reza Bonyadi, Quang M. Tieng, David C. Reutens
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this paper we introduce a new classification algorithm called Optimization
    of Distributions Differences (ODD). The algorithm aims to find a transformation
    from the feature space to a new space where the instances in the same class are
    as close as possible to one another while the gravity centers of these classes
    are as far as possible from one another. This aim is formulated as a
    multiobjective optimization problem that is solved by a hybrid of an
    evolutionary strategy and the Quasi-Newton method. The choice of the
    transformation function is flexible and could be any continuous space function.
    We experiment with a linear and a non-linear transformation in this paper. We
    show that the algorithm can outperform 6 other state-of-the-art classification
    methods, namely naive Bayes, support vector machines, linear discriminant
    analysis, multi-layer perceptrons, decision trees, and k-nearest neighbors, in
    12 standard classification datasets. Our results show that the method is less
    sensitive to the imbalanced number of instances comparing to these methods. We
    also show that ODD maintains its performance better than other classification
    methods in these datasets, hence, offers a better generalization ability.

    Belief Propagation in Conditional RBMs for Structured Prediction

    Wei Ping, Alexander Ihler
    Comments: Artificial Intelligence and Statistics (AISTATS) 2017
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    Restricted Boltzmann machines~(RBMs) and conditional RBMs~(CRBMs) are popular
    models for a wide range of applications. In previous work, learning on such
    models has been dominated by contrastive divergence~(CD) and its variants.
    Belief propagation~(BP) algorithms are believed to be slow for structured
    prediction on conditional RBMs~(e.g., Mnih et al. [2011]), and not as good as
    CD when applied in learning~(e.g., Larochelle et al. [2012]). In this work, we
    present a matrix-based implementation of belief propagation algorithms on
    CRBMs, which is easily scalable to tens of thousands of visible and hidden
    units. We demonstrate that, in both maximum likelihood and max-margin learning,
    training conditional RBMs with BP as the inference routine can provide
    significantly better results than current state-of-the-art CD methods on
    structured prediction problems. We also include practical guidelines on
    training CRBMs with BP, and some insights on the interaction of learning and
    inference algorithms for CRBMs.

    A Laplacian Framework for Option Discovery in Reinforcement Learning

    Marlos C. Machado, Marc G. Bellemare, Michael Bowling
    Comments: Version submitted to the 34th International Conference on Machine Learning (ICML)
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Representation learning and option discovery are two of the biggest
    challenges in reinforcement learning (RL). Proto-RL is a well known approach
    for representation learning in MDPs. The representations learned with this
    framework are called proto-value functions (PVFs). In this paper we address the
    option discovery problem by showing how PVFs implicitly define options. We do
    it by introducing eigenpurposes, intrinsic reward functions derived from the
    learned representations. The options discovered from eigenpurposes traverse the
    principal directions of the state space. They are useful for multiple tasks
    because they are independent of the agents’ intentions. Moreover, by capturing
    the diffusion process of a random walk, different options act at different time
    scales, making them helpful for exploration strategies. We demonstrate features
    of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.

    Controllable Text Generation

    Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, Eric P. Xing
    Comments: updated discussions and references
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)

    Generic generation and manipulation of text is challenging and has limited
    success compared to recent deep generative modeling in visual domain. This
    paper aims at generating plausible natural language sentences, whose attributes
    are dynamically controlled by learning disentangled latent representations with
    designated semantics. We propose a new neural generative model which combines
    variational auto-encoders and holistic attribute discriminators for effective
    imposition of semantic structures. With differentiable approximation to
    discrete text samples, explicit constraints on independent attribute controls,
    and efficient collaborative learning of generator and discriminators, our model
    learns highly interpretable representations from even only word annotations,
    and produces realistic sentences with desired attributes. Quantitative
    evaluation validates the accuracy of sentence and attribute generation.

    Machine Learning on Sequential Data Using a Recurrent Weighted Average

    Jared Ostmeyer, Lindsay Cowell
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Recurrent Neural Networks (RNN) are a type of statistical model designed to
    handle sequential data. The model reads a sequence one symbol at a time. Each
    symbol is processed based on information collected from the previous symbols.
    With existing RNN architectures, each symbol is processed using only
    information from the previous processing step. To overcome this limitation, we
    propose a new kind of RNN model that computes a recurrent weighted average
    (RWA) over every past processing step. Because the RWA can be computed as a
    running average, the computational overhead scales like that of any other RNN.
    The approach essentially reformulates the attention mechanism into a
    stand-alone model. When assessing a RWA model, it is found to train faster and
    generalize better than a standard LSTM model when performing the variable copy
    problem, the adding problem, classification of artificial grammar,
    classification of sequences by length, and classification of MNIST handwritten
    digits (where the pixels are read sequentially one at a time).

    Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

    Alonso Marco, Felix Berkenkamp, Philipp Hennig, Angela P. Schoellig, Andreas Krause, Stefan Schaal, Sebastian Trimpe
    Comments: 7 pages, 6 figures, to appear in IEEE 2017 International Conference on Robotics and Automation (ICRA)
    Subjects: Robotics (cs.RO); Learning (cs.LG); Systems and Control (cs.SY)

    In practice, the parameters of control policies are often tuned manually.
    This is time-consuming and frustrating. Reinforcement learning is a promising
    alternative that aims to automate this process, yet often requires too many
    experiments to be practical. In this paper, we propose a solution to this
    problem by exploiting prior knowledge from simulations, which are readily
    available for most robotic platforms. Specifically, we extend Entropy Search, a
    Bayesian optimization algorithm that maximizes information gain from each
    experiment, to the case of multiple information sources. The result is a
    principled way to automatically combine cheap, but inaccurate information from
    simulations with expensive and accurate physical experiments in a
    cost-effective manner. We apply the resulting method to a cart-pole system,
    which confirms that the algorithm can find good control policies with fewer
    experiments than standard Bayesian optimization on the physical system only.

    Denoising Adversarial Autoencoders

    Antonia Creswell, Anil Anthony Bharath
    Comments: submitted to journal
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Unsupervised learning is of growing interest because it unlocks the potential
    held in vast amounts of unlabelled data to learn useful representations for
    inference. Autoencoders, a form of generative model, may be trained by learning
    to reconstruct unlabelled input data from a latent representation space. More
    robust representations may be produced by an autoencoder if it learns to
    recover clean input samples from corrupted ones. Representations may be further
    improved by introducing regularisation during training to shape the
    distribution of the encoded data in latent space. We suggest denoising
    adversarial autoencoders, which combine denoising and regularisation, shaping
    the distribution of latent space using adversarial training. We introduce a
    novel analysis that shows how denoising may be incorporated into the training
    and sampling of adversarial autoencoders. Experiments are performed to assess
    the contributions that denoising makes to the learning of representations for
    classification and sample synthesis. Our results suggest that autoencoders
    trained using a denoising criterion achieve higher classification performance,
    and can synthesise samples that are more consistent with the input data than
    those trained without a corruption process.

    On the Behavior of Convolutional Nets for Feature Extraction

    Dario Garcia-Gasulla, Ferran Parés, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura
    Comments: Submitted to Journal of Artificial Intelligence Research Special Track on Deep Learning, Knowledge Representation, and Reasoning
    Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Convolutional neural networks (CNN) are representation learning techniques
    that achieve state-of-the-art performance on almost every image-related,
    machine learning task. Applying the representation languages build by these
    models to tasks beyond the one they were originally trained for is a field of
    interest known as transfer learning for feature extraction. Through this
    approach, one can apply the image descriptors learnt by a CNN after processing
    millions of images to any dataset, without an expensive training phase.
    Contributions to this field have so far focused on extracting CNN features from
    layers close to the output (e.g., fully connected layers), particularly because
    they work better when used out-of-the-box to feed a classifier. Nevertheless,
    the rest of CNN features is known to encode a wide variety of visual
    information, which could be potentially exploited on knowledge representation
    and reasoning tasks. In this paper we analyze the behavior of each feature
    individually, exploring their intra/inter class activations for all classes of
    three different datasets. From this study we learn that low and middle level
    features behave very differently to high level features, the former being more
    descriptive and the latter being more discriminant. We show how low and middle
    level features can be used for knowledge representation purposes both by their
    presence or by their absence. We also study how much noise these features may
    encode, and propose a thresholding approach to discard most of it. Finally, we
    discuss the potential implications of these results in the context of knowledge
    representation using features extracted from a CNN.

    Differentially Private Bayesian Learning on Distributed Data

    Mikko Heikkilä, Yusuke Okimoto, Samuel Kaski, Kana Shimizu, Antti Honkela
    Comments: 12 pages, 11 figures
    Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Learning (cs.LG); Computation (stat.CO)

    Many applications of machine learning, for example in health care, would
    benefit from methods that can guarantee privacy of data subjects. Differential
    privacy (DP) has become established as a standard for protecting learning
    results, but the proposed algorithms require a single trusted party to have
    access to the entire data, which is a clear weakness. We consider DP Bayesian
    learning in a distributed setting, where each party only holds a single sample
    or a few samples of the data. We propose a novel method for DP learning in this
    distributed setting, based on a secure multi-party sum function for aggregating
    summaries from the data holders. Each data holder adds their share of Gaussian
    noise to make the total computation differentially private using the Gaussian
    mechanism. We prove that the system can be made secure against a desired number
    of colluding data owners and robust against faulting data owners. The method
    builds on an asymptotically optimal and practically efficient DP Bayesian
    inference with rapidly diminishing extra cost.

    Adversarial Examples for Semantic Image Segmentation

    Volker Fischer, Mummadi Chaithanya Kumar, Jan Hendrik Metzen, Thomas Brox
    Comments: ICLR 2017 workshop submission
    Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Machine learning methods in general and Deep Neural Networks in particular
    have shown to be vulnerable to adversarial perturbations. So far this
    phenomenon has mainly been studied in the context of whole-image
    classification. In this contribution, we analyse how adversarial perturbations
    can affect the task of semantic segmentation. We show how existing adversarial
    attackers can be transferred to this task and that it is possible to create
    imperceptible adversarial perturbations that lead a deep network to misclassify
    almost all pixels of a chosen class while leaving network prediction nearly
    unchanged outside this class.

    Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

    Jangwon Lee, Michael S. Ryoo
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We design a new approach that allows robot learning of new activities from
    unlabeled human example videos. Given videos of humans executing the same
    activity from a human’s viewpoint (i.e., first-person videos), our objective is
    to make the robot learn the temporal structure of the activity as its future
    regression network, and learn to transfer such model for its own motor
    execution. We present a new deep learning model: We extend the state-of-the-art
    convolutional object detection network for the detection of human hands in
    training videos based on image information, and newly introduce the concept of
    using a fully convolutional network to regress (i.e., predict) the intermediate
    scene representation corresponding to the future frame (e.g., 1-2 seconds
    later). Combining these allows direct prediction of future locations of human
    hands and objects, which enables the robot to infer the motor control plan
    using our manipulation network. We experimentally confirm that our approach
    makes learning of robot activities from unlabeled human interaction videos
    possible, and demonstrate that our robot is able to execute the learned
    collaborative activities in real-time directly based on its camera input.

    Unsupervised Basis Function Adaptation for Reinforcement Learning

    Edward W. Barker, Charl J. Ras
    Comments: Extended abstract submitted (3 March 2017) for 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    When using reinforcement learning (RL) algorithms to evaluate a policy it is
    common, given a large state space, to introduce some form of approximation
    architecture for the value function (VF). The exact form of this architecture
    can have a significant effect on the accuracy of the VF estimate, however, and
    determining a suitable approximation architecture can often be a highly complex
    task. Consequently there is a large amount of interest in the potential for
    allowing RL algorithms to adaptively generate approximation architectures.

    We investigate a method of adapting approximation architectures which uses
    feedback regarding the frequency with which an agent has visited certain states
    to guide which areas of the state space to approximate with greater detail.
    This method is “unsupervised” in the sense that it makes no direct reference to
    reward or the VF estimate. We introduce an algorithm based upon this idea which
    adapts a state aggregation approximation architecture on-line.

    A common method of scoring a VF estimate is to weight the squared Bellman
    error of each state-action by the probability of that state-action occurring.
    Adopting this scoring method, and assuming (S) states, we demonstrate
    theoretically that – provided (1) the number of cells (X) in the state
    aggregation architecture is of order (sqrt{S}log_2{S}ln{S}) or greater, (2)
    the policy and transition function are close to deterministic, and (3) the
    prior for the transition function is uniformly distributed – our algorithm,
    used in conjunction with a suitable RL algorithm, can guarantee a score which
    is arbitrarily close to zero as (S) becomes large. It is able to do this
    despite having only (O(X log_2S)) space complexity and negligible time
    complexity. The results take advantage of certain properties of the stationary
    distributions of Markov chains.

    Co-Clustering for Multitask Learning

    Keerthiram Murugesan, Jaime Carbonell, Yiming Yang
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    This paper presents a new multitask learning framework that learns a shared
    representation among the tasks, incorporating both task and feature clusters.
    The jointly-induced clusters yield a shared latent subspace where task
    relationships are learned more effectively and more generally than in
    state-of-the-art multitask learning methods. The proposed general framework
    enables the derivation of more specific or restricted state-of-the-art
    multitask methods. The paper also proposes a highly-scalable multitask learning
    algorithm, based on the new framework, using conjugate gradient descent and
    generalized extit{Sylvester equations}. Experimental results on synthetic and
    benchmark datasets show that the proposed method systematically outperforms
    several state-of-the-art multitask learning methods.

    Compositional Falsification of Cyber-Physical Systems with Machine Learning Components

    Tommaso Dreossi, Alexandre Donzé, Sanjit A. Seshia
    Subjects: Systems and Control (cs.SY); Learning (cs.LG)

    Cyber-physical systems (CPS), such as automotive systems, are starting to
    include sophisticated machine learning (ML) components. Their correctness,
    therefore, depends on properties of the inner ML modules. While learning
    algorithms aim to generalize from examples, they are only as good as the
    examples provided, and recent efforts have shown that they can produce
    inconsistent output under small adversarial perturbations. This raises the
    question: can the output from learning components can lead to a failure of the
    entire CPS? In this work, we address this question by formulating it as a
    problem of falsifying signal temporal logic (STL) specifications for CPS with
    ML components. We propose a compositional falsification framework where a
    temporal logic falsifier and a machine learning analyzer cooperate with the aim
    of finding falsifying executions of the considered model. The efficacy of the
    proposed technique is shown on an automatic emergency braking system model with
    a perception component based on deep neural networks.

    Self-Paced Multitask Learning with Shared Knowledge

    Keerthiram Murugesan, Jaime Carbonell
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    This paper introduces self-paced task selection to multitask learning, where
    instances from more closely related tasks are selected in a progression of
    easier-to-harder tasks, to emulate an effective human education strategy but
    applied to multitask machine learning. We develop the mathematical foundation
    for the approach based on an iterative selection of the most appropriate task,
    learning the task parameters, and updating the shared knowledge, optimizing a
    new bi-convex loss function. This proposed method applies quite generally,
    including to multitask feature learning, multitask learning with alternating
    structure optimization and multitask manifold regularization learning. Results
    show that in each of the above formulations self-paced (easier-to-harder) task
    selection outperforms the baseline version of these methods in all the
    experiments.


    Information Theory

    On the MISO Channel with Feedback: Can Infinitely Massive Antennas Achieve Infinite Capacity?

    Jinyuan Chen
    Subjects: Information Theory (cs.IT)

    We consider communication over a multiple-input single-output (MISO) block
    fading channel in the presence of an independent noiseless feedback link. We
    assume that the transmitter and receiver have no prior knowledge of the channel
    state realizations, but the transmitter and receiver can acquire the channel
    state information (CSIT/CSIR) via downlink training and feedback. For this
    channel, we show that increasing the number of transmit antennas to infinity
    will not achieve an infinite capacity, for a finite channel coherence and a
    finite input constraint on the second or fourth moment. This insight follows
    from our new capacity bounds that hold for any linear and nonlinear coding
    strategies, and any channel training schemes. In addition to the channel
    capacity bounds, we also provide a characterization on the beamforming gain
    that is also known as array gain or power gain, at the regime with large number
    of antennas.

    Downlink Cellular Network Analysis with LOS/NLOS Propagation and Elevated Base Stations

    Italo Atzeni, Jesús Arnau, Marios Kountouris
    Comments: Submitted to the IEEE for possible publication
    Subjects: Information Theory (cs.IT)

    In this paper, we investigate the downlink performance of dense cellular
    networks with elevated base stations (BSs) using a channel model that
    incorporates line-of-sight (LOS)/non-line-of-sight (NLOS) propagation in both
    small-scale and large-scale fading. Modeling LOS fading with Nakagami-(m)
    fading, we provide a unified framework based on stochastic geometry that
    encompasses both closest and strongest BS association. Our study is
    particularized to two distance-dependent LOS/NLOS models of practical interest.
    Considering the effect of LOS propagation alone, we derive closed-form
    expressions for the coverage probability with Nakagami-(m) fading, showing that
    the performance for strongest BS association is the same as in the case of
    Rayleigh fading, whereas for closest BS association it monotonically increases
    with the shape parameter (m). Then, focusing on the effect of elevated BSs, we
    show that network densification eventually leads to near-universal outage even
    for moderately low BS densities: in particular, the maximum area spectral
    efficiency is proportional to the inverse of the squared BS height.

    On squares of cyclic codes

    Ignacio Cascudo
    Subjects: Information Theory (cs.IT)

    The square (C^{*2}) of a linear error correcting code (C) is the linear code
    spanned by the coordinate-wise products of every pair of (non-necessarily
    distinct) words in (C). Squares of codes have gained attention for several
    applications mainly in the area of cryptography, where typically one is
    concerned about some of the parameters (dimension, minimum distance) of both
    (C^{*2}) and (C). In this paper, squares of cyclic codes are considered.
    General results on the minimum distance of the squares of cyclic codes are
    obtained and constructions of cyclic codes (C) with relatively large dimension
    of (C) and minimum distance of the square (C^{*2}) are discussed. In some
    cases, the constructions lead to codes (C) such that both (C) and (C^{*2})
    simultaneously have the largest possible minimum distances for their length and
    dimensions.

    Percentile Policies for Tracking of Markovian Random Processes with Asymmetric Cost and Observation

    Parisa Mansourifard, Tara Javidi, Bhaskar Krishnamachari
    Subjects: Information Theory (cs.IT)

    Motivated by wide-ranging applications such as video delivery over networks
    using Multiple Description Codes, congestion control, and inventory management,
    we study the state-tracking of a Markovian random process with a known
    transition matrix and a finite ordered state set. The decision-maker must
    select a state as an action at each time step to minimize the total expected
    cost. The decision-maker is faced with asymmetries both in cost and
    observation: in case the selected state is less than the actual state of the
    Markovian process, an under-utilization cost occurs and only partial
    observation about the actual state is revealed; otherwise, the decision incurs
    an over-utilization cost and reveals full information about the actual state.
    We can formulate this problem as a Partially Observable Markov Decision Process
    which can be expressed as a dynamic program based on the last full observed
    state and the time of full observation. This formulation determines the
    sequence of actions to be taken between any two consecutive full observations
    of the actual state. However, this DP grows exponentially in the number of
    states, with little hope for a computationally feasible solution. We present an
    interesting class of computationally tractable policies with a percentile
    structure. A generalization of binary search, this class of policies attempt at
    any given time to reduce the uncertainty by a given percentage. Among all
    percentile policies, we search for the one with the minimum expected cost. The
    result of this search is a heuristic policy which we evaluate through numerical
    simulations. We show that it outperforms the myopic policies and under some
    conditions performs close to the optimal policies. Furthermore, we derive a
    lower bound on the cost of the optimal policy which can be computed with low
    complexity and give a measure for how close our heuristic policy is to the
    optimal policy.

    The Global Optimization Geometry of Nonsymmetric Matrix Factorization and Sensing

    Zhihui Zhu, Qiuwei Li, Gongguo Tang, Michael B. Wakin
    Subjects: Information Theory (cs.IT); Optimization and Control (math.OC)

    In this paper we characterize the optimization geometry of a matrix
    factorization problem where we aim to find (n imes r) and (m imes r) matrices
    (U) and (V) such that (UV^T) approximates a given matrix (X^star). We show
    that the objective function of the matrix factorization problem has no spurious
    local minima and obeys the strict saddle property not only for the
    exact-parameterization case where (rank(X^star) = r), but also for the
    over-parameterization case where (rank(X^star) < r) and under-parameterization
    case where (rank(X^star) > r). These geometric properties imply that a number
    of iterative optimization algorithms (such as gradient descent) converge to a
    global solution with random initialization. For the exact-parameterization
    case, we further show that the objective function satisfies the robust strict
    saddle property, ensuring global convergence of many local search algorithms in
    polynomial time. We extend the geometric analysis to the matrix sensing problem
    with the factorization approach and prove that this global optimization
    geometry is preserved as long as the measurement operator satisfies the
    standard restricted isometry property.

    Preserving Confidentiality in The Gaussian Broadcast Channel Using Compute-and-Forward

    Parisa Babaheidarian, Somayeh Salimi, Panos Papadimitratos
    Comments: 6 pages, accepted to CISS 2017
    Subjects: Information Theory (cs.IT)

    We study the transmission of confidential messages across a wireless
    broadcast channel with K>2 receivers and K helpers. The goal is to transmit all
    messages reliably to their intended receivers while keeping them confidential
    from the unintended receivers. We design a codebook based on nested lattice
    structure, cooperative jamming, lattice alignment, and i.i.d. coding. Moreover,
    we exploit the asymmetric compute-and-forward decoding strategy to handle
    finite SNR regimes. Unlike previous alignment schemes, our achievable rates are
    attainable at any finite SNR value. Also, we show that our scheme achieves the
    optimal sum secure degrees of freedom of 1 for the K-receiver Gaussian
    broadcast channel with K confidential messages and K helpers.

    Multicast Transmissions in Directional mmWave Communications

    Alessandro Biason, Michele Zorzi
    Comments: 8 pages, 5 figures, submitted to European Wireless (EW), Feb. 2017
    Subjects: Information Theory (cs.IT)

    Multicast transmissions have been widely analyzed in traditional networks as
    a way to improve spectrum efficiency when multiple users are interested in the
    same data. However, their application to mmWave communications has been studied
    only marginally so far. The goal of this paper is to partially fill this gap by
    investigating optimal and suboptimal multicast schemes for mmWave
    communications with directional beams. In particular, we propose a Markov setup
    to model the retransmission status of the unsuccessfully transmitted packets
    and, because of the computational complexity of the optimal solution, we
    introduce a suboptimal hierarchical optimization procedure, which is much
    easier to derive. Finally, we numerically show that restricting the link to
    unicast beams is strongly suboptimal, especially when many packets have to be
    transmitted.

    Sum-set Inequalities from Aligned Image Sets: Instruments for Robust GDoF Bounds

    Arash Gholami Davoodi, Syed A. Jafar
    Comments: 22 pages, 1 figure
    Subjects: Information Theory (cs.IT)

    We present sum-set inequalities specialized to the generalized degrees of
    freedom (GDoF) framework. These are information theoretic lower bounds on the
    entropy of bounded density linear combinations of discrete, power-limited
    dependent random variables in terms of the joint entropies of arbitrary linear
    combinations of new random variables that are obtained by power level
    partitioning of the original random variables. The bounds are useful
    instruments to obtain GDoF characterizations for wireless interference
    networks, especially with multiple antenna nodes, subject to arbitrary channel
    strength and channel uncertainty levels.

    Symmetric Laplacians, Quantum Density Matrices and their Von-Neumann Entropy

    David E. Simmons, Justin P. Coon, Animesh Datta
    Subjects: Information Theory (cs.IT)

    We show that the (normalized) symmetric Laplacian of a simple graph can be
    obtained from the partial trace over a pure bipartite quantum state that
    resides in a bipartite Hilbert space (one part corresponding to the vertices,
    the other corresponding to the edges). This suggests an interpretation of the
    symmetric Laplacian’s Von Neumann entropy as a measure of bipartite
    entanglement present between the two parts of the state. We then study extreme
    values for a connected graph’s generalized R’enyi-(p) entropy. Specifically,
    we show that egin{enumerate}item the complete graph achieves maximum
    entropy, item the (2)-regular graph:egin{enumerate} item achieves minimum
    R’enyi-(2) entropy among all (k)-regular graphs, item is within (log 4/3) of
    the minimum R’enyi-(2) entropy and (log4sqrt{2}/3) of the minimum Von
    Neumann entropy among all connected graphs, item achieves a Von Neumann
    entropy less than the star graph. end{enumerate} end{enumerate} Point ((2))
    contrasts sharply with similar work applied to (normalized) combinatorial
    Laplacians, where it has been shown that the star graph almost always achieves
    minimum Von Neumann entropy. In this work we find that the star graph achieves
    maximum entropy in the limit as the number of vertices grows without bound.

    smallskip
    oindent extbf{Keywords:} Symmetric; Laplacian; Quantum;
    Entropy; Bounds; R’enyi.

    Zero-Delay Source-Channel Coding with a One-Bit ADC Front End and Correlated Side Information at the Receiver

    Morteza Varasteh, Borzoo Rassouli, Ozvaldo Simeone, Deniz Gunduz
    Subjects: Information Theory (cs.IT)

    Zero-delay transmission of a Gaussian source over an additive white Gaussian
    noise (AWGN) channel is considered with a one-bit analog-to-digital converter
    (ADC) front end and a correlated side information at the receiver. The design
    of the optimal encoder and decoder is studied for two performance criteria,
    namely, the mean squared error (MSE) distortion and the distortion outage
    probability (DOP), under an average power constraint on the channel input. For
    both criteria, necessary optimality conditions for the encoder and the decoder
    are derived. Using these conditions, it is observed that the numerically
    optimized encoder (NOE) under the MSE distortion criterion is periodic, and its
    period increases with the correlation between the source and the receiver side
    information. For the DOP, it is instead seen that the NOE mappings periodically
    acquire positive and negative values, which decay to zero with increasing
    source magnitude, and the interval over which the mapping takes non-zero
    values, becomes wider with the correlation between the source and the side
    information.

    Good cyclic codes and the uncertainty principle

    Shai Evra, Emmanuel Kowalski, Alexander Lubotzky
    Comments: 19 pages
    Subjects: Information Theory (cs.IT); Number Theory (math.NT); Representation Theory (math.RT)

    A long standing problem in the area of error correcting codes asks whether
    there exist good cyclic codes. Most of the known results point in the direction
    of a negative answer.

    The uncertainty principle is a classical result of harmonic analysis
    asserting that given a non-zero function (f) on some abelian group, either (f)
    or its Fourier transform (hat{f}) has large support.

    In this note, we observe a connection between these two subjects. We point
    out that even a weak version of the uncertainty principle for fields of
    positive characteristic would imply that good cyclic codes do exist. We also
    provide some heuristic arguments supporting that this is indeed the case.

    Full-Duplex Operations in Wireless Powered Communication Networks

    Hyungsik Ju, Yuro Lee, Tae-Joong Kim
    Comments: 5 pages, 6 figures
    Subjects: Information Theory (cs.IT)

    In this paper, we study a wireless powered communication network (WPCN) in
    which a hybrid access point (H-AP) and user equipments (UEs) all operate in
    full-duplex (FD). We first propose a transceiver structure that enables FD
    operation of UEs to receive energy in the downlink (DL) and transmit
    information in the uplink (UL) simultaneously. We then provide an energy usage
    model in the proposed UE transceiver accounting for energy leakage from
    transmit chain to receive chain. It is shown that throughput of FD WPCN with
    the proposed FD UEs can be maximized by optimally allocating UL transmission
    time to UEs by solving a simple convex optimization problem. Simulation results
    reveals that the use of the proposed FD UEs efficiently improves the throughput
    of WPCN with practical self-interference cancellation capability at the H-AP.

    Performance Analysis of a Hybrid Downlink-Uplink Cooperative NOMA Scheme

    Zhiqiang Wei, Linglong Dai, Derrick Wing Kwan Ng, Jinhong Yuan
    Comments: 7 pages, accepted for presentation at the IEEE VTC 2017 Spring, Sydney, Australia
    Subjects: Information Theory (cs.IT)

    This paper proposes a novel hybrid downlinkuplink cooperative NOMA
    (HDU-CNOMA) scheme to achieve a better tradeoff between spectral efficiency and
    signal reception reliability than the conventional cooperative NOMA schemes. In
    particular, the proposed scheme enables the strong user to perform a
    cooperative transmission and an interference-free uplink transmission
    simultaneously during the cooperative phase, at the expense of a slightly
    decrease in signal reception reliability at the weak user. We analyze the
    outage probability, diversity order, and outage throughput of the proposed
    scheme. Simulation results not only confirm the accuracy of the developed
    analytical results, but also unveil the spectral efficiency gains achieved by
    the proposed scheme over a baseline cooperative NOMA scheme and a
    non-cooperative NOMA scheme.

    Atomic Norm Minimization for Modal Analysis from Random and Compressed Samples

    Shuang Li, Dehui Yang, Gongguo Tang, Michael B. Wakin
    Subjects: Information Theory (cs.IT)

    Modal analysis is the process of estimating a system’s modal parameters such
    as its natural frequencies and mode shapes. One application of modal analysis
    is in structural health monitoring (SHM), where a network of sensors may be
    used to collect vibration data from a physical structure such as a building or
    bridge. There is a growing interest in developing automated techniques for SHM
    based on data collected in a wireless sensor network. In order to conserve
    power and extend battery life, however, it is desirable to minimize the amount
    of data that must be collected and transmitted in such a sensor network. In
    this paper, we highlight the fact that modal analysis can be formulated as an
    atomic norm minimization (ANM) problem, which can be solved efficiently and in
    some cases recover perfectly a structure’s mode shapes and frequencies. We
    survey a broad class of sampling and compression strategies that one might
    consider in a physical sensor network, and we provide bounds on the sample
    complexity of these compressive schemes in order to recover a structure’s mode
    shapes and frequencies via ANM. A main contribution of our paper is to
    establish a bound on the sample complexity of modal analysis with random
    temporal compression, and in this scenario we prove that the samples per sensor
    can actually decrease as the number of sensors increases. We also extend an
    atomic norm denoising problem to the multiple measurement vector (MMV) setting
    in the case of uniform sampling.

    A Layered Architecture for Erasure-Coded Consistent Distributed Storage

    Kishori M. Konwar, N. Prakash, Nancy Lynch, Muriel Medard
    Comments: 22 pages
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    Motivated by emerging applications to the edge computing paradigm, we
    introduce a two-layer erasure-coded fault-tolerant distributed storage system
    offering atomic access for read and write operations. In edge computing,
    clients interact with an edge-layer of servers that is geographically near; the
    edge-layer in turn interacts with a back-end layer of servers. The edge-layer
    provides low latency access and temporary storage for client operations, and
    uses the back-end layer for persistent storage. Our algorithm, termed Layered
    Data Storage (LDS) algorithm, offers several features suitable for
    edge-computing systems, works under asynchronous message-passing environments,
    supports multiple readers and writers, and can tolerate (f_1 < n_1/2) and (f_2
    < n_2/3) crash failures in the two layers having (n_1) and (n_2) servers,
    respectively. We use a class of erasure codes known as regenerating codes for
    storage of data in the back-end layer. The choice of regenerating codes,
    instead of popular choices like Reed-Solomon codes, not only optimizes the cost
    of back-end storage, but also helps in optimizing communication cost of read
    operations, when the value needs to be recreated all the way from the back-end.
    The two-layer architecture permits a modular implementation of atomicity and
    erasure-code protocols; the implementation of erasure-codes is mostly limited
    to interaction between the two layers. We prove liveness and atomicity of LDS,
    and also compute performance costs associated with read and write operations.
    Further, in a multi-object system running (N) independent instances of LDS,
    where only a small fraction of the objects undergo concurrent accesses at any
    point during the execution, the overall storage cost is dominated by that of
    persistent storage in the back-end layer, and is given by (Theta(N)).




沪ICP备19023445号-2号
友情链接