
    arXiv Paper Daily: Fri, 28 Oct 2016

    Published by 我爱机器学习 (52ml.net) on 2016-10-28 00:00:00

    Computer Vision and Pattern Recognition

    Local Similarity-Aware Deep Feature Embedding

    Chen Huang, Chen Change Loy, Xiaoou Tang
    Comments: 9 pages, 4 figures, 2 tables. Accepted to NIPS 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Existing deep embedding methods in vision tasks are capable of learning a
    compact Euclidean space from images, where Euclidean distances correspond to a
    similarity metric. To make learning more effective and efficient, hard sample
    mining is usually employed, with samples identified through computing the
    Euclidean feature distance. However, the global Euclidean distance cannot
    faithfully characterize the true feature similarity in a complex visual feature
    space, where the intraclass distance in a high-density region may be larger
    than the interclass distance in low-density regions. In this paper, we
    introduce a Position-Dependent Deep Metric (PDDM) unit, which is capable of
    learning a similarity metric adaptive to local feature structure. The metric
    can be used to select genuinely hard samples in a local neighborhood to guide
    the deep embedding learning in an online and robust manner. The new layer is
    appealing in that it is pluggable to any convolutional networks and is trained
    end-to-end. Our local similarity-aware feature embedding not only demonstrates
    faster convergence and boosted performance on two complex image retrieval
    datasets, its large margin nature also leads to superior generalization results
    under the large and open set scenarios of transfer learning and zero-shot
    learning on ImageNet 2010 and ImageNet-10K datasets.
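
    As a rough illustration of the mining step, the hypothetical sketch below
    picks the hardest positive and hardest negative inside an anchor's local
    neighborhood; the actual PDDM unit learns this similarity score end-to-end,
    whereas plain Euclidean distance stands in for it here.

    import numpy as np

    def mine_local_hard_pair(embeddings, labels, anchor, k=20):
        # Distances from the anchor; the paper's learned, position-dependent
        # metric is replaced here by Euclidean distance for illustration.
        d = np.linalg.norm(embeddings - embeddings[anchor], axis=1)
        neigh = np.argsort(d)[1:k + 1]          # local neighborhood (skip self)
        pos = neigh[labels[neigh] == labels[anchor]]
        neg = neigh[labels[neigh] != labels[anchor]]
        if len(pos) == 0 or len(neg) == 0:      # neighborhood lacks a class
            return None
        return pos[np.argmax(d[pos])], neg[np.argmin(d[neg])]  # hardest pair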

    Detecting People in Artwork with CNNs

    Nicholas Westlake, Hongping Cai, Peter Hall
    Comments: 14 pages, plus 3 pages of references; 7 figures in ECCV 2016 Workshops
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    CNNs have massively improved performance in object detection in photographs.
    However research into object detection in artwork remains limited. We show
    state-of-the-art performance on a challenging dataset, People-Art, which
    contains people from photos, cartoons and 41 different artwork movements. We
    achieve this high performance by fine-tuning a CNN for this task, thus also
    demonstrating that training CNNs on photos results in overfitting for photos:
    only the first three or four layers transfer from photos to artwork. Although
    the CNN’s performance is the highest yet, it remains less than 60% AP,
    suggesting further work is needed for the cross-depiction problem. The final
    publication is available at Springer via
    this http URL
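
    A minimal sketch of the fine-tuning idea, assuming a stock torchvision
    backbone rather than the authors' actual detection pipeline: the early,
    generic layers (the ones found to transfer) are frozen and only the later
    layers are retrained for the person-in-artwork task.

    import torch.nn as nn
    from torchvision import models

    model = models.vgg16(weights="IMAGENET1K_V1")
    for layer in model.features[:10]:        # freeze early conv layers only;
        for p in layer.parameters():         # the cut-off index is an assumption
            p.requires_grad = False
    model.classifier[6] = nn.Linear(4096, 2) # new person/background head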

    Tool and Phase recognition using contextual CNN features

    Manish Sahu, Anirban Mukhopadhyay, Angelika Szengel, Stefan Zachow
    Comments: MICCAI M2CAI 2016 Surgical tool & phase detection challenge report
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    A transfer learning method for generating features suitable for surgical
    tool and phase recognition from the ImageNet classification features [1] is
    proposed here. In addition, methods are developed for generating contextual
    features and combining them with time series analysis for final classification
    using multi-class random forest. The proposed pipeline is tested over the
    training and testing datasets of M2CAI16 challenges: tool and phase detection.
    Encouraging results are obtained by leave-one-out cross validation evaluation
    on the training dataset.

    Single- and Multi-Task Architectures for Tool Presence Detection Challenge at M2CAI 2016

    Andru P. Twinanda, Didier Mutter, Jacques Marescaux, Michel de Mathelin, Nicolas Padoy
    Comments: The dataset is available at this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The tool presence detection challenge at M2CAI 2016 consists of identifying
    the presence/absence of seven surgical tools in the images of cholecystectomy
    videos. Here, we propose to use deep architectures that are based on our
    previous work where we presented several architectures to perform multiple
    recognition tasks on laparoscopic videos. In this technical report, we present
    the tool presence detection results using two architectures: (1) a single-task
    architecture designed to perform solely the tool presence detection task and
    (2) a multi-task architecture designed to perform jointly phase recognition and
    tool presence detection. The results show that the multi-task network only
    slightly improves the tool presence detection results. In contrast, a
    significant improvement is obtained when there are more data available to train
    the networks. This significant improvement can be regarded as a call for action
    for other institutions to start working toward publishing more datasets into
    the community, so that better models could be generated to perform the task.

    Single- and Multi-Task Architecture for Surgical Workflow at M2CAI 2016

    Andru P. Twinanda, Didier Mutter, Jacques Marescaux, Michel de Mathelin, Nicolas Padoy
    Comments: The dataset is available at this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The surgical workflow challenge at M2CAI 2016 consists of identifying 8
    surgical phases in cholecystectomy procedures. Here, we propose to use deep
    architectures that are based on our previous work where we presented several
    architectures to perform multiple recognition tasks on laparoscopic videos. In
    this technical report, we present the phase recognition results using two
    architectures: (1) a single-task architecture designed to perform solely the
    surgical phase recognition task and (2) a multi-task architecture designed to
    perform jointly phase recognition and tool presence detection. On top of these
    architectures we propose to use two different approaches to enforce the
    temporal constraints of the surgical workflow: (1) HMM-based and (2) LSTM-based
    pipelines. The results show that the LSTM-based approach is able to outperform
    the HMM-based approach and also to properly enforce the temporal constraints
    into the recognition process.

    Automated Management of Pothole related Disasters Using Image Processing and Geotagging

    Madhura Katageri, Manisha Mandal, Mansi Gandhi, Navin Koregaonkar, Prof. Sharmila Sengupta
    Journal-ref: International Journal of Computer Science and Information
    Technology (IJCSIT) December 2015, volume 7, number 6
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Potholes, though they may seem inconsequential, can cause accidents resulting in loss
    of human life. In this paper, we present an automated system to efficiently
    manage the potholes in a ward by deploying geotagging and image processing
    techniques that overcomes the drawbacks associated with the existing
    survey-oriented systems. Image processing is used for identification of target
    pothole regions in the 2D images using edge detection and morphological image
    processing operations. A method is developed to accurately estimate the
    dimensions of the potholes from their images, analyze their area and depth,
    and estimate the quantity of filling material required, thereby enabling
    potholes to be attended to on a priority basis. This will further provide
    government officials with a fully automated system for effectively managing
    pothole-related disasters.
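
    A hedged sketch of such an edge-detection-plus-morphology stage in OpenCV;
    the filename, thresholds, and minimum-area filter are placeholder
    assumptions, and converting pixel areas to physical dimensions still
    requires a camera calibration factor.

    import cv2
    import numpy as np

    img = cv2.imread("road.jpg", cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(cv2.GaussianBlur(img, (5, 5), 0), 50, 150)
    kernel = np.ones((7, 7), np.uint8)
    closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)  # bridge edge gaps
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    candidate_areas = [cv2.contourArea(c) for c in contours
                       if cv2.contourArea(c) > 500]            # drop speckle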

    Estimation of Bandlimited Grayscale Images From the Single Bit Observations of Pixels Affected by Additive Gaussian Noise

    Abhinav Kumar, Animesh Kumar
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST)

    The estimation of grayscale images from their single-bit, zero-mean
    Gaussian noise-affected pixels is presented in this paper. The images are
    assumed to be bandlimited in the Fourier cosine transform (FCT) domain and
    are oversampled above their Nyquist rate in the FCT domain. We propose a
    non-recursive approach, based on a first-order approximation of the
    cumulative distribution function (CDF) and on Banach’s contraction theorem,
    to estimate the image from its single-bit pixels. The decay rate of the
    mean squared error of estimating such images is found to be independent of
    the precision of the quantizer, and it varies as \(O(1/N)\), where \(N\) is
    the “effective” oversampling ratio with respect to the Nyquist rate in the
    FCT domain.
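
    The per-pixel principle can be seen in a toy example: with zero-mean
    Gaussian noise, the probability of observing a positive bit is
    \(\Phi(x/\sigma)\), so inverting the Gaussian CDF at the empirical bit rate
    recovers the pixel value. The sketch below shows only this principle, not
    the paper's first-order CDF approximation or the FCT-domain machinery.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    x_true, sigma, N = 0.3, 1.0, 10_000
    bits = (x_true + sigma * rng.standard_normal(N)) > 0  # one-bit observations
    p_hat = np.clip(bits.mean(), 1e-6, 1 - 1e-6)          # empirical P(bit = 1)
    x_hat = sigma * norm.ppf(p_hat)                       # invert Phi(x / sigma)
    # x_hat ≈ 0.3; the MSE of this estimate shrinks on the order of 1/N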

    PCM and APCM Revisited: An Uncertainty Perspective

    Peixin Hou, Hao Deng, Jiguang Yue, Shuguang Liu
    Comments: 8 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    In this paper, we take a new look at the possibilistic c-means (PCM) and
    adaptive PCM (APCM) clustering algorithms from the perspective of uncertainty.
    This new perspective offers insights into the clustering process and also
    provides a greater degree of flexibility. We analyze the clustering behavior
    of PCM-based algorithms and introduce parameters \(\sigma_v\) and \(\alpha\)
    to characterize the uncertainty of the estimated bandwidth and the noise
    level of the dataset, respectively. The uncertainty (fuzziness) of
    membership values caused by uncertainty of the estimated bandwidth parameter
    is then modeled by a conditional fuzzy set, which is a new formulation of
    the type-2 fuzzy set. Experiments show that the parameters \(\sigma_v\) and
    \(\alpha\) make the clustering process easier to control, and the main
    features of PCM and APCM are unified in this new clustering framework
    (UPCM). More specifically, UPCM reduces to PCM when we set a small
    \(\alpha\) or a large \(\sigma_v\), and UPCM reduces to APCM when clusters
    are confined within their physical clusters and possible cluster elimination
    is ensured. Finally, we outline directions for further research.
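
    For orientation, a minimal sketch of the classical PCM machinery that UPCM
    builds on (Krishnapuram-Keller typicality update with fuzzifier m = 2); the
    paper's contribution, the uncertainty parameters \(\sigma_v\) and
    \(\alpha\), sits on top of updates like these.

    import numpy as np

    def pcm_step(X, centers, eta):
        """One PCM iteration: typicalities U and updated centers."""
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (n, c)
        U = 1.0 / (1.0 + d2 / eta)       # typicality, per-cluster bandwidth eta
        centers = (U[..., None] * X[:, None, :]).sum(0) / U.sum(0)[:, None]
        return U, centers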

    Exploiting Structure Sparsity for Covariance-based Visual Representation

    Jianjia Zhang, Lei Wang, Luping Zhou
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The past few years have witnessed increasing research interest on
    covariance-based feature representation. A variety of methods have been
    proposed to boost its efficacy, with some recent ones resorting to nonlinear
    kernel technique. Noting that the essence of this feature representation is to
    characterise the underlying structure of visual features, this paper argues
    that an equally, if not more, important approach to boosting its efficacy shall
    be to improve the quality of this characterisation. Following this idea, we
    propose to exploit the structure sparsity of visual features in skeletal human
    action recognition, and compute sparse inverse covariance estimate (SICE) as
    feature representation. We discuss the advantage of this new representation on
    dealing with small sample, high dimensionality, and modelling capability.
    Furthermore, utilising the monotonicity property of SICE, we efficiently
    generate a hierarchy of SICE matrices to characterise the structure of visual
    features at different sparsity levels, and two discriminative learning
    algorithms are then developed to adaptively integrate them to perform
    recognition. As demonstrated by extensive experiments, the proposed
    representation leads to significantly improved recognition performance over the
    state-of-the-art comparable methods. In particular, as a method fully based on
    linear technique, it is comparable or even better than those employing
    nonlinear kernel technique. This result well demonstrates the value of
    exploiting structure sparsity for covariance-based feature representation.
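
    A sketch of computing the SICE representation with scikit-learn's graphical
    lasso as a stand-in solver; the feature dimensions are made up, and sweeping
    the regularization strength yields the hierarchy of SICE matrices at
    different sparsity levels described above.

    import numpy as np
    from sklearn.covariance import GraphicalLasso

    frames = np.random.randn(100, 20)     # e.g. 100 frames x 20 joint features
    sice_hierarchy = [GraphicalLasso(alpha=a).fit(frames).precision_
                      for a in (0.05, 0.1, 0.2)]  # denser -> sparser structure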

    Joint Detection and Tracking for Multipath Targets: A Variational Bayesian Approach

    Hua Lan, Shuai Sun, Zengfu Wang, Quan Pan, Zhishan Zhang
    Comments: 21 pages, 15 figures, has been submitted to IEEE Transactions on Signal Processing
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Different from traditional point target tracking systems assuming that a
    target generates at most one single measurement per scan, there exists a class
    of multipath target tracking systems where each measurement may originate from
    the interested target via one of multiple propagation paths or from clutter,
    while the correspondence among targets, measurements, and propagation paths is
    unknown. The performance of multipath target tracking systems can be improved
    if multiple measurements from the same target are effectively utilized, but
    suffers from two major challenges. The first is multipath detection that
    detects appearing and disappearing targets automatically, while one target may
    produce \(s\) tracks for \(s\) propagation paths. The second is multipath tracking
    that calculates the target-to-measurement-to-path assignment matrices to
    estimate target states, which is computationally intractable due to the
    combinatorial explosion. Based on variational Bayesian framework, this paper
    introduces a novel probabilistic joint detection and tracking algorithm
    (JDT-VB) that incorporates data association, path association, state estimation
    and automatic track management. The posterior probabilities of these latent
    variables are derived in a closed-form iterative manner, which is effective for
    dealing with the coupling issue of multipath data association identification
    risk and state estimation error. Loopy belief propagation (LBP) is exploited to
    approximate the multipath data association, which significantly reduces the
    computational cost. The proposed JDT-VB algorithm can simultaneously deal with
    the track initiation, maintenance, and termination for multiple multipath
    target tracking with time-varying number of targets, and its performance is
    verified by a numerical simulation of over-the-horizon radar.

    Fast Low-rank Shared Dictionary Learning for Image Classification

    Tiep Vu, Vishal Monga
    Comments: First submission to TIP
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Despite the fact that different objects possess distinct class-specific
    features, they also usually share common patterns. This observation has been
    exploited partially in a recently proposed dictionary learning framework by
    separating the particularity and the commonality (COPAR). Inspired by this, we
    propose a novel method to explicitly and simultaneously learn a set of common
    patterns as well as class-specific features for classification with more
    intuitive constraints. Our dictionary learning framework is hence characterized
    by both a shared dictionary and particular (class-specific) dictionaries. For
    the shared dictionary, we enforce a low-rank constraint, i.e. claim that its
    spanning subspace should have low dimension and the coefficients corresponding
    to this dictionary should be similar. For the particular dictionaries, we
    impose on them the well-known constraints stated in the Fisher discrimination
    dictionary learning (FDDL). Further, we develop new fast and accurate
    algorithms to solve the subproblems in the learning step, accelerating its
    convergence. These algorithms can also be applied to FDDL and its
    extensions. The efficiencies of these algorithms are theoretically and
    experimentally verified by comparing their complexities and running time with
    those of other well-known dictionary learning methods. Experimental results on
    widely used image datasets establish the advantages of our method over
    state-of-the-art dictionary learning methods.
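
    Low-rank constraints of this kind are commonly enforced through
    nuclear-norm regularization, whose proximal step is singular value
    soft-thresholding; the sketch below shows that standard step as one
    plausible building block of the fast solvers described (an assumption about
    their structure, not the authors' exact algorithm).

    import numpy as np

    def svt(D, tau):
        """Singular value thresholding: prox of tau * ||.||_* at D."""
        U, s, Vt = np.linalg.svd(D, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt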

    Iterative Inversion of Deformation Vector Fields with Feedback Control

    Abhishek Kumar Dubey, Alexandros-Stavros Iliopoulos, Xiaobai Sun, Fang-Fang Yin, Lei Ren
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Purpose: The inverse of a deformation vector field (DVF) is often needed in
    deformable registration, 4D image reconstruction, and adaptive radiation
    therapy. This study aims at improving both the accuracy with respect to inverse
    consistency and efficiency of the numerical DVF inversion by developing a
    fixed-point iteration method with feedback control.

    Method: We introduce an iterative method with active feedback control for DVF
    inversion. The method is built upon a previous fixed-point iteration method,
    which is represented as a particular passive instance in the new method. At
    each iteration step, we measure the inconsistency, namely the residual, between
    the iterative inverse estimate and the input DVF. The residual is modulated by
    a feedback control mechanism before being incorporated into the next iterate.
    The feedback control design is based on analysis of error propagation in the
    iteration process. The design goal is to make the convergence region as large
    as possible, and to make estimate errors vanish faster. The feedback control
    design is assessed with two data sets: an analytic DVF pair, and a DVF between
    two phases of a 4D XCAT phantom.

    Results: The single-parameter feedback control improved both the convergence
    region and convergence rate of the iterative algorithm, for both datasets. With
    the analytic data, the iteration becomes convergent over the entire image
    domain, whereas the precursor method diverges as the deformation becomes
    larger. With the XCAT DVF data, feedback control reduced the 95th percentile of
    residuals from 1 mm to 1E-6 mm. Additionally, convergence rate was accelerated
    by at least a factor of 2 for both datasets.

    Conclusion: The introduced method shows the previously unexplored possibility
    in exercising active feedback control in DVF inversion, and the unexploited
    potential in improving both numerical accuracy and computational efficiency.
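
    A 1-D toy sketch of the scheme, assuming the classic fixed-point update
    v <- -u(x + v) as the passive precursor: the inconsistency residual
    r = v + u(x + v) is scaled by a feedback gain mu before being folded back
    in (mu = 1 recovers the precursor; the gain value below is illustrative).

    import numpy as np

    x = np.linspace(0.0, 1.0, 200)
    u = 0.1 * np.sin(2 * np.pi * x)        # forward DVF (toy example)
    v = np.zeros_like(x)
    mu = 0.7                               # feedback gain (illustrative)
    for _ in range(50):
        residual = v + np.interp(x + v, x, u)  # inconsistency with input DVF
        v -= mu * residual                     # controlled correction
    max_err = np.abs(u + np.interp(x + u, x, v)).max()  # inverse consistency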

    Volumetric Light-field Encryption at the Microscopic Scale

    Haoyu Li, Changliang Guo, Inbarasan Muniraj, Bryce C. Schroeder, John T. Sheridan, Shu Jia
    Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Biological Physics (physics.bio-ph); Optics (physics.optics)

    We report a light-field based method that allows the optical encryption of
    three-dimensional (3D) volumetric information at the microscopic scale in a
    single 2D light-field image. The system consists of a microlens array and an
    array of random phase/amplitude masks. The method utilizes a wave optics model
    to account for the dominant diffraction effect at this new scale, and the
    system point-spread function (PSF) serves as the key for encryption and
    decryption. We successfully developed and demonstrated a deconvolution
    algorithm to retrieve spatially multiplexed discrete and continuous volumetric
    data from 2D light-field images. Showing that the method is practical for data
    transmission and storage, we obtained a faithful reconstruction of the 3D
    volumetric information from a digital copy of the encrypted light-field image.
    The method represents a new level of optical encryption, paving the way for
    broad industrial and biomedical applications in processing and securing 3D data
    at the microscopic scale.


    Artificial Intelligence

    Personalized Risk Scoring for Critical Care Prognosis using Mixtures of Gaussian Processes

    Ahmed M. Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar
    Subjects: Artificial Intelligence (cs.AI)

    Objective: In this paper, we develop a personalized real-time risk scoring
    algorithm that provides timely and granular assessments for the clinical acuity
    of ward patients based on their (temporal) lab tests and vital signs; the
    proposed risk scoring system ensures timely intensive care unit (ICU)
    admissions for clinically deteriorating patients. Methods: The risk scoring
    system learns a set of latent patient subtypes from the offline electronic
    health record data, and trains a mixture of Gaussian Process (GP) experts,
    where each expert models the physiological data streams associated with a
    specific patient subtype. Transfer learning techniques are used to learn the
    relationship between a patient’s latent subtype and her static admission
    information (e.g. age, gender, transfer status, ICD-9 codes, etc). Results:
    Experiments conducted on data from a heterogeneous cohort of 6,321 patients
    admitted to Ronald Reagan UCLA medical center show that our risk score
    significantly and consistently outperforms the currently deployed risk scores,
    such as the Rothman index, MEWS, APACHE and SOFA scores, in terms of
    timeliness, true positive rate (TPR), and positive predictive value (PPV).
    Conclusion: Our results reflect the importance of adopting the concepts of
    personalized medicine in critical care settings; significant accuracy and
    timeliness gains can be achieved by accounting for the patients’ heterogeneity.
    Significance: The proposed risk scoring methodology can confer huge clinical
    and social benefits on the more than 200,000 critically ill inpatients who
    experience cardiac arrest in the US every year.

    Anomaly Detection with the Voronoi Diagram Evolutionary Algorithm

    Marti Luis (TAO, LRI), Fansi-Tchango Arsene (TRT), Navarro Laurent (TRT), Marc Schoenauer (TAO, LRI)
    Journal-ref: Parallel Problem Solving from Nature — PPSN XIV, Sep 2016,
    Edinburgh, United Kingdom. Springer Verlag, 9921, pp. 697-706, 2016, LNCS
    Subjects: Artificial Intelligence (cs.AI)

    This paper presents the Voronoi diagram-based evolutionary algorithm
    (VorEAl). VorEAl partitions the input space into abnormal/normal subsets using
    Voronoi diagrams. Diagrams are evolved using a multi-objective bio-inspired
    approach in order to conjointly optimize classification metrics while also
    being able to represent areas of the data space that are not present in the
    training dataset. As part of the paper VorEAl is experimentally validated and
    contrasted with similar approaches.
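
    The classification core of such a Voronoi-based detector is simple: a query
    point inherits the normal/abnormal label of its nearest site, as in the
    sketch below; evolving the sites with a multi-objective EA is the paper's
    actual contribution and is omitted.

    import numpy as np

    def voronoi_predict(x, sites, site_labels):
        """Label of the Voronoi cell containing x (nearest site wins)."""
        return site_labels[np.argmin(np.linalg.norm(sites - x, axis=1))]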

    A Review of 40 Years of Cognitive Architecture Research: Focus on Perception, Attention, Learning and Applications

    Iuliia Kotseruba, Oscar J. Avella Gonzalez, John K. Tsotsos
    Comments: 37 pages, 11 figures
    Subjects: Artificial Intelligence (cs.AI)

    In this paper we present a broad overview of the last 40 years of research on
    cognitive architectures. Although the number of existing architectures is
    nearing several hundred, most of the existing surveys do not reflect this
    growth and focus on a handful of well-established architectures. While their
    contributions are undeniable, they represent only a part of the research in the
    field. Thus, in this survey we wanted to shift the focus towards a more
    inclusive and high-level overview of the research in cognitive architectures.
    Our final set of 86 architectures includes 55 that are still actively
    developed, and borrow from a diverse set of disciplines, spanning areas from
    psychoanalysis to neuroscience. To keep the length of this paper within
    reasonable limits we discuss only the core cognitive abilities, such as
    perception, attention mechanisms, learning and memory structure. To assess the
    breadth of practical applications of cognitive architectures we gathered
    information on over 700 practical projects implemented using the cognitive
    architectures in our list. We use various visualization techniques to highlight
    overall trends in the development of the field. Our analysis of practical
    applications shows that most architectures are very narrowly focused on a
    particular application domain. Furthermore, there is an apparent gap between
    general research in robotics and computer vision and research in these areas
    within the cognitive architectures field. It is very clear that biologically
    inspired models do not have the same range and efficiency as
    systems based on engineering principles and heuristics. Another observation is
    related to a general lack of collaboration. Several factors hinder
    communication, such as the closed nature of the individual projects (only
    one-third of the architectures reviewed here are open-source) and
    terminological differences.

    Learning Scalable Deep Kernels with Recurrent Structure

    Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P. Xing
    Comments: 29 pages, 7 figures
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Many applications in speech, robotics, finance, and biology deal with
    sequential data, where ordering matters and recurrent structures are common.
    However, this structure cannot be easily captured by standard kernel functions.
    To model such structure, we propose expressive closed-form kernel functions for
    Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the
    inductive biases of long short-term memory (LSTM) recurrent networks, while
    retaining the non-parametric probabilistic advantages of Gaussian processes. We
    learn the properties of the proposed kernels by optimizing the Gaussian process
    marginal likelihood using a new provably convergent semi-stochastic procedure
    and exploit the structure of these kernels for fast and scalable training and
    prediction. We demonstrate state-of-the-art performance on several benchmarks,
    and thoroughly investigate a consequential autonomous driving application,
    where the predictive uncertainties provided by GP-LSTM are uniquely valuable.
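
    Structurally, the model is a deep kernel: sequences are pushed through a
    recurrent encoder phi and an ordinary base kernel is applied to the
    encodings, k(s, s') = k_rbf(phi(s), phi(s')). The sketch below uses a
    trivial placeholder encoder; in the paper phi is an LSTM trained by GP
    marginal likelihood with a semi-stochastic scheme.

    import numpy as np

    def rbf(A, B, ell=1.0):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / ell ** 2)

    phi = lambda S: S.mean(axis=1)      # placeholder for the LSTM encoder
    S = np.random.randn(8, 20, 3)       # 8 sequences, length 20, 3 features
    K = rbf(phi(S), phi(S))             # Gram matrix of the recurrent kernel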

    Hit-and-Run for Sampling and Planning in Non-Convex Spaces

    Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek
    Subjects: Computation (stat.CO); Artificial Intelligence (cs.AI); Combinatorics (math.CO); Probability (math.PR)

    We propose the Hit-and-Run algorithm for planning and sampling problems in
    non-convex spaces. For sampling, we show the first analysis of the Hit-and-Run
    algorithm in non-convex spaces and show that it mixes fast as long as certain
    smoothness conditions are satisfied. In particular, our analysis reveals an
    intriguing connection between fast mixing and the existence of smooth
    measure-preserving mappings from a convex space to the non-convex space. For
    planning, we show advantages of Hit-and-Run compared to state-of-the-art
    planning methods such as Rapidly-Exploring Random Trees.
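
    A minimal Hit-and-Run sketch for a non-convex set given only a membership
    oracle; the chord search below is a crude grid over a bounded segment, an
    implementation shortcut rather than anything from the paper.

    import numpy as np

    def hit_and_run(inside, x0, n_steps=500, span=2.0, seed=0):
        rng = np.random.default_rng(seed)
        x, samples = np.asarray(x0, float), []
        for _ in range(n_steps):
            d = rng.standard_normal(x.shape)
            d /= np.linalg.norm(d)                 # uniform random direction
            ts = [t for t in np.linspace(-span, span, 513)
                  if inside(x + t * d)]            # chord ∩ set (grid search)
            if ts:                                 # grid may miss thin slivers
                x = x + rng.choice(ts) * d         # uniform point on the chord
            samples.append(x.copy())
        return np.array(samples)

    ring = lambda p: 0.5 < np.linalg.norm(p) < 1.0  # a non-convex annulus
    pts = hit_and_run(ring, x0=[0.75, 0.0])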

    Synthesis of Shared Control Protocols with Provable Safety and Performance Guarantees

    Nils Jansen, Murat Cubuktepe, Ufuk Topcu
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We formalize synthesis of shared control protocols with correctness
    guarantees for temporal logic specifications. More specifically, we introduce a
    modeling formalism in which both a human and an autonomy protocol can issue
    commands to a robot towards performing a certain task. These commands are
    blended into a joint input to the robot. The autonomy protocol is synthesized
    using an abstraction of possible human commands accounting for randomness in
    decisions caused by factors such as fatigue or incomprehensibility of the
    problem at hand. The synthesis is designed to ensure that the resulting robot
    behavior satisfies given safety and performance specifications, e.g., in
    temporal logic. Our solution is based on nonlinear programming and we address
    the inherent scalability issue by presenting alternative methods. We assess the
    feasibility and the scalability of the approach by an experimental evaluation.

    Fast Bayesian Non-Negative Matrix Factorisation and Tri-Factorisation

    Thomas Brouwer, Jes Frellsen, Pietro Lio'
    Comments: NIPS 2016 Workshop on Advances in Approximate Bayesian Inference
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Numerical Analysis (cs.NA); Machine Learning (stat.ML)

    We present a fast variational Bayesian algorithm for performing non-negative
    matrix factorisation and tri-factorisation. We show that our approach achieves
    faster convergence per iteration and timestep (wall-clock) than Gibbs sampling
    and non-probabilistic approaches, and does not require additional samples to
    estimate the posterior. We show that in particular for matrix tri-factorisation
    convergence is difficult, but our variational Bayesian approach offers a fast
    solution, allowing the tri-factorisation approach to be used more effectively.
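
    For orientation, the kind of non-probabilistic baseline the comparison
    refers to: Lee-Seung multiplicative updates for R ≈ U V under
    nonnegativity. The variational Bayesian method places priors on U and V and
    updates approximate posteriors instead; the sizes below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    R = np.abs(rng.standard_normal((50, 40)))     # nonnegative data matrix
    U = np.abs(rng.standard_normal((50, 5)))
    V = np.abs(rng.standard_normal((5, 40)))
    for _ in range(200):                          # multiplicative updates
        U *= (R @ V.T) / (U @ V @ V.T + 1e-9)
        V *= (U.T @ R) / (U.T @ U @ V + 1e-9)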


    Information Retrieval

    Word Embeddings to Enhance Twitter Gang Member Profile Identification

    Sanjaya Wijeratne, Lakshika Balasuriya, Derek Doran, Amit Sheth
    Comments: 7 pages, 1 figure, 2 tables, Published at IJCAI Workshop on Semantic Machine Learning (SML 2016)
    Journal-ref: IJCAI Workshop on Semantic Machine Learning (SML 2016). pp. 18-24.
    CEUR-WS, New York City, NY (07 2016)
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Computers and Society (cs.CY); Information Retrieval (cs.IR)

    Gang affiliates have joined the masses who use social media to share thoughts
    and actions publicly. Interestingly, they use this public medium to express
    recent illegal actions, to intimidate others, and to share outrageous images
    and statements. Agencies able to unearth these profiles may thus be able to
    anticipate, stop, or hasten the investigation of gang-related crimes. This
    paper investigates the use of word embeddings to help identify gang members on
    Twitter. Building on our previous work, we generate word embeddings that
    translate what Twitter users post in their profile descriptions, tweets,
    profile images, and linked YouTube content to a real vector format amenable for
    machine learning classification. Our experimental results show that pre-trained
    word embeddings can boost the accuracy of supervised learning algorithms
    trained over gang members’ social media posts.


    Computation and Language

    Ex Machina: Personal Attacks Seen at Scale

    Ellery Wulczyn, Nithum Thain, Lucas Dixon
    Subjects: Computation and Language (cs.CL)

    The damage personal attacks make to online discourse motivates many platforms
    to try to curb the phenomenon. However, understanding the prevalence and impact
    of personal attacks in online platforms at scale remains surprisingly
    difficult. The contribution of this paper is to develop and illustrate a method
    that combines crowdsourcing and machine learning to analyze personal attacks at
    scale. We show an evaluation method for a classifier in terms of the aggregated
    number of crowd-workers it can approximate. We apply our methodology to English
    Wikipedia, generating a corpus of over 100k high quality human-labeled comments
    and 63M machine-labeled ones from a classifier that is as good as the aggregate
    of 3 crowd-workers. Using the corpus of machine-labeled scores, our methodology
    allows us to explore some of the open questions about the nature of online
    personal attacks. This reveals that the majority of personal attacks on
    Wikipedia are not the result of a few malicious users, nor primarily the
    consequence of allowing anonymous contributions.

    A Deeper Look into Sarcastic Tweets Using Deep Convolutional Neural Networks

    Soujanya Poria, Erik Cambria, Devamanyu Hazarika, Prateek Vij
    Comments: 12 pages, 4 figures in COLING 2016
    Subjects: Computation and Language (cs.CL)

    Sarcasm detection is a key task for many natural language processing tasks.
    In sentiment analysis, for example, sarcasm can flip the polarity of an
    “apparently positive” sentence and, hence, negatively affect polarity detection
    performance. To date, most approaches to sarcasm detection have treated the
    task primarily as a text categorization problem. Sarcasm, however, can be
    expressed in very subtle ways and requires a deeper understanding of natural
    language that standard text categorization techniques cannot grasp. In this
    work, we develop models based on a pre-trained convolutional neural network for
    extracting sentiment, emotion and personality features for sarcasm detection.
    Such features, along with the network’s baseline features, allow the proposed
    models to outperform the state of the art on benchmark datasets. We also
    address the often ignored generalizability issue of classifying data that the
    models have not seen during the learning phase.

    CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases

    Xiang Ren, Zeqiu Wu, Wenqi He, Meng Qu, Clare R. Voss, Heng Ji, Tarek F. Abdelzaher, Jiawei Han
    Comments: submitted to WWW 2017
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Extracting entities and relations for types of interest from text is
    important for understanding massive text corpora. Traditionally, systems of
    entity relation extraction have relied on human-annotated corpora for training
    and adopted an incremental pipeline. Such systems require additional human
    expertise to be ported to a new domain, and are vulnerable to errors cascading
    down the pipeline. In this paper, we investigate joint extraction of typed
    entities and relations with labeled data heuristically obtained from knowledge
    bases (i.e., distant supervision). As our algorithm for type labeling via
    distant supervision is context-agnostic, noisy training data poses unique
    challenges for the task. We propose a novel domain-independent framework,
    called CoType, that runs a data-driven text segmentation algorithm to extract
    entity mentions, and jointly embeds entity mentions, relation mentions, text
    features and type labels into two low-dimensional spaces (for entity and
    relation mentions respectively), where, in each space, objects whose types are
    close will also have similar representations. CoType then uses these learned
    embeddings to estimate the types of test (unlinkable) mentions. We formulate a
    joint optimization problem to learn embeddings from text corpora and knowledge
    bases, adopting a novel partial-label loss function for noisy labeled data and
    introducing an object “translation” function to capture the cross-constraints
    of entities and relations on each other. Experiments on three public datasets
    demonstrate the effectiveness of CoType across different domains (e.g., news,
    biomedical), with an average of 25% improvement in F1 score compared to the
    next best method.

    CogALex-V Shared Task: LexNET – Integrated Path-based and Distributional Method for the Identification of Semantic Relations

    Vered Shwartz, Ido Dagan
    Comments: 5 pages, accepted to the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex-V), in COLING 2016
    Subjects: Computation and Language (cs.CL)

    We present a submission to the CogALex 2016 shared task on the corpus-based
    identification of semantic relations, using LexNET (Shwartz and Dagan, 2016),
    an integrated path-based and distributional method for semantic relation
    classification. The reported results in the shared task bring this submission
    to the third place on subtask 1 (word relatedness), and the first place on
    subtask 2 (semantic relation classification), demonstrating the utility of
    integrating the complementary path-based and distributional information sources
    in recognizing semantic relatedness. Combined with a common similarity measure,
    LexNET performs fairly well on the word relatedness task (subtask 1). The
    relatively lower performance of LexNET and the various other systems on subtask
    2, however, confirms the difficulty of the semantic relation classification
    task, and stresses the need to develop additional methods for this task.

    Knowledge-Based Biomedical Word Sense Disambiguation with Neural Concept Embeddings and Distant Supervision

    A.K.M. Sabbir, Antonio Jimeno Yepes, Ramakanth Kavuluru
    Subjects: Computation and Language (cs.CL)

    Biomedical word sense disambiguation (WSD) is an important intermediate task
    in many natural language processing applications such as named entity
    recognition, syntactic parsing, and relation extraction. In this paper, we
    employ knowledge-based approaches that also exploit recent advances in neural
    word/concept embeddings to improve over the state-of-the-art in biomedical WSD
    using the MSH WSD dataset as the test set. Our methods involve distant
    supervision – we do not use any hand-labeled examples for WSD to build our
    prediction models; however, we employ an existing well known named entity
    recognition and concept mapping program, MetaMap, to obtain our concept
    vectors. Over the MSH WSD dataset, our linear time (in terms of numbers of
    senses and words in the test instance) method achieves an accuracy of 92.24%
    which is an absolute 3% improvement over the best known results obtained via
    unsupervised or knowledge-based means. A more expensive approach that we
    developed relies on a nearest neighbor framework and achieves an accuracy of
    94.34%. Employing dense vector representations learned from unlabeled free text
    has been shown to benefit many language processing tasks recently and our
    efforts show that biomedical WSD is no exception to this trend. For a complex
    and rapidly evolving domain such as biomedicine, building labeled datasets for
    larger sets of ambiguous terms may be impractical. Here we demonstrate that
    distant supervision that leverages recent advances in representation learning
    can rival supervised approaches in biomedical WSD.
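
    The linear-time rule reduces to a nearest-concept decision, roughly as
    sketched below: score each candidate sense by cosine similarity between its
    concept vector and the mean word vector of the context. The vectors
    themselves are assumed given (the authors derive theirs from
    MetaMap-processed corpora).

    import numpy as np

    def disambiguate(context_vecs, sense_vecs):
        """Pick the sense whose concept vector best matches the context."""
        ctx = np.mean(context_vecs, axis=0)
        ctx = ctx / np.linalg.norm(ctx)
        score = {s: float(v @ ctx) / np.linalg.norm(v)
                 for s, v in sense_vecs.items()}
        return max(score, key=score.get)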

    Can Active Memory Replace Attention?

    Łukasz Kaiser, Samy Bengio
    Subjects: Learning (cs.LG); Computation and Language (cs.CL)

    Several mechanisms to focus attention of a neural network on selected parts
    of its input or memory have been used successfully in deep learning models in
    recent years. Attention has improved image classification, image captioning,
    speech recognition, generative models, and learning algorithmic tasks, but it
    had probably the largest impact on neural machine translation.

    Recently, similar improvements have been obtained using alternative
    mechanisms that do not focus on a single part of a memory but operate on all of
    it in parallel, in a uniform way. Such a mechanism, which we call active memory,
    improved over attention in algorithmic tasks, image processing, and in
    generative modelling.

    So far, however, active memory has not improved over attention for most
    natural language processing tasks, in particular for machine translation. We
    analyze this shortcoming in this paper and propose an extended model of active
    memory that matches existing attention models on neural machine translation and
    generalizes better to longer sentences. We investigate this model and explain
    why previous active memory models did not succeed. Finally, we discuss when
    active memory brings most benefits and where attention can be a better choice.

    Word Embeddings to Enhance Twitter Gang Member Profile Identification

    Sanjaya Wijeratne, Lakshika Balasuriya, Derek Doran, Amit Sheth
    Comments: 7 pages, 1 figure, 2 tables, Published at IJCAI Workshop on Semantic Machine Learning (SML 2016)
    Journal-ref: IJCAI Workshop on Semantic Machine Learning (SML 2016). pp. 18-24.
    CEUR-WS, New York City, NY (07 2016)
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Computers and Society (cs.CY); Information Retrieval (cs.IR)

    Gang affiliates have joined the masses who use social media to share thoughts
    and actions publicly. Interestingly, they use this public medium to express
    recent illegal actions, to intimidate others, and to share outrageous images
    and statements. Agencies able to unearth these profiles may thus be able to
    anticipate, stop, or hasten the investigation of gang-related crimes. This
    paper investigates the use of word embeddings to help identify gang members on
    Twitter. Building on our previous work, we generate word embeddings that
    translate what Twitter users post in their profile descriptions, tweets,
    profile images, and linked YouTube content to a real vector format amenable for
    machine learning classification. Our experimental results show that pre-trained
    word embeddings can boost the accuracy of supervised learning algorithms
    trained over gang members’ social media posts.


    Distributed, Parallel, and Cluster Computing

    A Survey of High Level Frameworks in Block-Structured Adaptive Mesh Refinement Packages

    Anshu Dubey, Ann Almgren, John Bell, Martin Berzins, Steve Brandt, Greg Bryan, Phillip Colella, Daniel Graves, Michael Lijewski, Frank Löffler, Brian O'Shea, Erik Schnetter, Brian Van Straalen, Klaus Weide
    Journal-ref: Journal of Parallel and Distributed Computing, Volume 74, Issue
    12, December 2014, Pages 3217-3227
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); High Energy Astrophysical Phenomena (astro-ph.HE); General Relativity and Quantum Cosmology (gr-qc)

    Over the last decade block-structured adaptive mesh refinement (SAMR) has
    found increasing use in large, publicly available codes and frameworks. SAMR
    frameworks have evolved along different paths. Some have stayed focused on
    specific domain areas, others have pursued a more general functionality,
    providing the building blocks for a larger variety of applications. In this
    survey paper we examine a representative set of SAMR packages and SAMR-based
    codes that have been in existence for half a decade or more, have a reasonably
    sized and active user base outside of their home institutions, and are publicly
    available. The set consists of a mix of SAMR packages and application codes
    that cover a broad range of scientific domains. We look at their high-level
    frameworks, and their approach to dealing with the advent of radical changes in
    hardware architecture. The codes included in this survey are BoxLib, Cactus,
    Chombo, Enzo, FLASH, and Uintah.

    A Framework for Certified Self-Stabilization

    Karine Altisen, Pierre Corbineau, Stephane Devismes
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Logic in Computer Science (cs.LO)

    We propose a general framework to build certified proofs of distributed
    self-stabilizing algorithms with the proof assistant Coq. We first define in
    Coq the locally shared memory model with composite atomicity, the most commonly
    used model in the self-stabilizing area. We then validate our framework by
    certifying a non-trivial part of an existing silent self-stabilizing algorithm
    which builds a k-clustering of the network. We also certified a quantitative
    property related to the output of this algorithm. Precisely, we show that the
    computed k-clustering contains at most \(\lfloor (n-1)/(k+1) \rfloor + 1\)
    clusterheads, where \(n\) is the number of nodes in the network. To obtain
    these results, we also
    developed a library which contains general tools related to potential functions
    and cardinality of sets.
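
    A quick numeric check of the certified bound
    \(\lfloor (n-1)/(k+1) \rfloor + 1\) for a few arbitrary network sizes:

    for n, k in [(10, 1), (100, 3), (1000, 10)]:
        print(n, k, (n - 1) // (k + 1) + 1)   # e.g. n=100, k=3 -> 25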


    Learning

    Learning Scalable Deep Kernels with Recurrent Structure

    Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P. Xing
    Comments: 29 pages, 7 figures
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Many applications in speech, robotics, finance, and biology deal with
    sequential data, where ordering matters and recurrent structures are common.
    However, this structure cannot be easily captured by standard kernel functions.
    To model such structure, we propose expressive closed-form kernel functions for
    Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the
    inductive biases of long short-term memory (LSTM) recurrent networks, while
    retaining the non-parametric probabilistic advantages of Gaussian processes. We
    learn the properties of the proposed kernels by optimizing the Gaussian process
    marginal likelihood using a new provably convergent semi-stochastic procedure
    and exploit the structure of these kernels for fast and scalable training and
    prediction. We demonstrate state-of-the-art performance on several benchmarks,
    and thoroughly investigate a consequential autonomous driving application,
    where the predictive uncertainties provided by GP-LSTM are uniquely valuable.

    Compressive K-means

    Nicolas Keriven (UR1, PANAMA), Nicolas Tremblay (PANAMA), Yann Traonmilin (PANAMA), Rémi Gribonval (PANAMA)
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    The Lloyd-Max algorithm is a classical approach to perform K-means
    clustering. Unfortunately, its cost becomes prohibitive as the training dataset
    grows large. We propose a compressive version of K-means (CKM), that estimates
    cluster centers from a sketch, i.e. from a drastically compressed
    representation of the training dataset. We demonstrate empirically that CKM
    performs similarly to Lloyd-Max, for a sketch size proportional to the number
    of centroids times the ambient dimension, and independent of the size of the
    original dataset. Given the sketch, the computational complexity of CKM is also
    independent of the size of the dataset. Unlike Lloyd-Max which requires several
    replicates, we further demonstrate that CKM is almost insensitive to
    initialization. For a large dataset of 10^7 data points, we show that CKM can
    run two orders of magnitude faster than five replicates of Lloyd-Max, with
    similar clustering performance on artificial data. Finally, CKM achieves lower
    classification errors on handwritten digits classification.
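
    The sketching step, in a hedged sketch: the whole dataset collapses into
    one fixed-size vector of random generalized moments, after which the data
    can be discarded; recovering centroids from the sketch uses a separate
    greedy solver omitted here. The frequency scaling below is an arbitrary
    placeholder.

    import numpy as np

    def sketch(X, Omega):
        """Empirical sketch z = mean_j exp(i * Omega @ x_j)."""
        return np.exp(1j * X @ Omega.T).mean(axis=0)

    rng = np.random.default_rng(0)
    X = rng.standard_normal((10_000, 2))      # large dataset (n = 10^4)
    m = 5 * 10 * 2                            # ~ const * K * dimension
    Omega = rng.standard_normal((m, 2))       # random frequencies
    z = sketch(X, Omega)                      # size depends on m, not on n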

    Can Active Memory Replace Attention?

    Łukasz Kaiser, Samy Bengio
    Subjects: Learning (cs.LG); Computation and Language (cs.CL)

    Several mechanisms to focus attention of a neural network on selected parts
    of its input or memory have been used successfully in deep learning models in
    recent years. Attention has improved image classification, image captioning,
    speech recognition, generative models, and learning algorithmic tasks, but it
    had probably the largest impact on neural machine translation.

    Recently, similar improvements have been obtained using alternative
    mechanisms that do not focus on a single part of a memory but operate on all of
    it in parallel, in a uniform way. Such a mechanism, which we call active memory,
    improved over attention in algorithmic tasks, image processing, and in
    generative modelling.

    So far, however, active memory has not improved over attention for most
    natural language processing tasks, in particular for machine translation. We
    analyze this shortcoming in this paper and propose an extended model of active
    memory that matches existing attention models on neural machine translation and
    generalizes better to longer sentences. We investigate this model and explain
    why previous active memory models did not succeed. Finally, we discuss when
    active memory brings most benefits and where attention can be a better choice.

    Local Similarity-Aware Deep Feature Embedding

    Chen Huang, Chen Change Loy, Xiaoou Tang
    Comments: 9 pages, 4 figures, 2 tables. Accepted to NIPS 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Existing deep embedding methods in vision tasks are capable of learning a
    compact Euclidean space from images, where Euclidean distances correspond to a
    similarity metric. To make learning more effective and efficient, hard sample
    mining is usually employed, with samples identified through computing the
    Euclidean feature distance. However, the global Euclidean distance cannot
    faithfully characterize the true feature similarity in a complex visual feature
    space, where the intraclass distance in a high-density region may be larger
    than the interclass distance in low-density regions. In this paper, we
    introduce a Position-Dependent Deep Metric (PDDM) unit, which is capable of
    learning a similarity metric adaptive to local feature structure. The metric
    can be used to select genuinely hard samples in a local neighborhood to guide
    the deep embedding learning in an online and robust manner. The new layer is
    appealing in that it is pluggable to any convolutional networks and is trained
    end-to-end. Our local similarity-aware feature embedding not only demonstrates
    faster convergence and boosted performance on two complex image retrieval
    datasets, its large margin nature also leads to superior generalization results
    under the large and open set scenarios of transfer learning and zero-shot
    learning on ImageNet 2010 and ImageNet-10K datasets.

    On Bochner's and Polya's Characterizations of Positive-Definite Kernels and the Respective Random Feature Maps

    Jie Chen, Dehua Cheng, Yan Liu
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Positive-definite kernel functions are fundamental elements of kernel methods
    and Gaussian processes. A well-known construction of such functions comes from
    Bochner’s characterization, which connects a positive-definite function with a
    probability distribution. Another construction, which appears to have attracted
    less attention, is Polya’s criterion that characterizes a subset of these
    functions. In this paper, we study the latter characterization and derive a
    number of novel kernels little known previously.

    In the context of large-scale kernel machines, Rahimi and Recht (2007)
    proposed a random feature map (random Fourier) that approximates a kernel
    function, through independent sampling of the probability distribution in
    Bochner’s characterization. The authors also suggested another feature map
    (random binning), which, although not explicitly stated, comes from Polya’s
    characterization. We show that with the same number of random samples, the
    random binning map results in a Euclidean inner product closer to the kernel
    than does the random Fourier map. The superiority of the random binning map is
    confirmed empirically through regressions and classifications in the
    reproducing kernel Hilbert space.
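
    The random Fourier map from Bochner's characterization, sketched for the
    standard Gaussian kernel (the paper's Polya-type kernels and the random
    binning map follow the same pattern with different sampling distributions):
    with enough features, z(x) · z(y) ≈ k(x, y).

    import numpy as np

    def rff(X, W, b):
        """z(x) = sqrt(2/D) * cos(W x + b); W ~ N(0, I / sigma^2)."""
        return np.sqrt(2.0 / W.shape[0]) * np.cos(X @ W.T + b)

    rng = np.random.default_rng(0)
    d, D, sigma = 5, 2000, 1.0
    W = rng.standard_normal((D, d)) / sigma
    b = rng.uniform(0.0, 2.0 * np.pi, D)
    x, y = rng.standard_normal(d), rng.standard_normal(d)
    approx = float(rff(x[None], W, b) @ rff(y[None], W, b).T)
    exact = np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))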

    A Category Space Approach to Supervised Dimensionality Reduction

    Anthony O. Smith, Anand Rangarajan
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Supervised dimensionality reduction has emerged as an important theme in the
    last decade. Despite the plethora of models and formulations, there is a lack
    of a simple model which aims to project the set of patterns into a space
    defined by the classes (or categories). To this end, we set up a model in which
    each class is represented as a 1D subspace of the vector space formed by the
    features. Assuming the set of classes does not exceed the cardinality of the
    features, the model results in multi-class supervised learning in which the
    features of each class are projected into the class subspace. Class
    discrimination is automatically guaranteed via the imposition of orthogonality
    of the 1D class subspaces. The resulting optimization problem – formulated as
    the minimization of a sum of quadratic functions on a Stiefel manifold – while
    being non-convex (due to the constraints), nevertheless has a structure for
    which we can identify when we have reached a global minimum. After formulating
    a version with standard inner products, we extend the formulation to
    reproducing kernel Hilbert spaces in a straightforward manner. The optimization
    approach also extends in a similar fashion to the kernel version. Results and
    comparisons with the multi-class Fisher linear (and kernel) discriminants and
    principal component analysis (linear and kernel) showcase the relative merits
    of this approach to dimensionality reduction.

    CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases

    Xiang Ren, Zeqiu Wu, Wenqi He, Meng Qu, Clare R. Voss, Heng Ji, Tarek F. Abdelzaher, Jiawei Han
    Comments: submitted to WWW 2017
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Extracting entities and relations for types of interest from text is
    important for understanding massive text corpora. Traditionally, systems of
    entity relation extraction have relied on human-annotated corpora for training
    and adopted an incremental pipeline. Such systems require additional human
    expertise to be ported to a new domain, and are vulnerable to errors cascading
    down the pipeline. In this paper, we investigate joint extraction of typed
    entities and relations with labeled data heuristically obtained from knowledge
    bases (i.e., distant supervision). As our algorithm for type labeling via
    distant supervision is context-agnostic, noisy training data poses unique
    challenges for the task. We propose a novel domain-independent framework,
    called CoType, that runs a data-driven text segmentation algorithm to extract
    entity mentions, and jointly embeds entity mentions, relation mentions, text
    features and type labels into two low-dimensional spaces (for entity and
    relation mentions respectively), where, in each space, objects whose types are
    close will also have similar representations. CoType then uses these learned
    embeddings to estimate the types of test (unlinkable) mentions. We formulate a
    joint optimization problem to learn embeddings from text corpora and knowledge
    bases, adopting a novel partial-label loss function for noisy labeled data and
    introducing an object “translation” function to capture the cross-constraints
    of entities and relations on each other. Experiments on three public datasets
    demonstrate the effectiveness of CoType across different domains (e.g., news,
    biomedical), with an average of 25% improvement in F1 score compared to the
    next best method.

    Differentially Private Variational Inference for Non-conjugate Models

    Joonas Jälkö, Onur Dikmen, Antti Honkela
    Comments: 8 pages, 7 figures
    Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Learning (cs.LG); Methodology (stat.ME)

    As the collection of huge amounts of personal data from individuals has
    become standard practice, it is important to use these data in a
    conscientious way. For example, when performing inference on these data,
    one has to make sure individuals’ identities and the privacy of the data are
    not compromised. Differential privacy is a powerful framework that
    introduces stochasticity into the computation to guarantee that it is
    difficult to breach privacy using the output of the computation.
    Differentially private versions of many important machine learning methods
    have been proposed, but there is still a long way to go towards an
    efficient, unified approach applicable to many models. In this paper, we
    propose a differentially
    private variational inference method with a very wide applicability. The
    variational inference is based on stochastic gradient ascent and can handle
    non-conjugate models as well as conjugate ones. Differential privacy is
    achieved by perturbing the gradients. We explore ways to make the algorithm
    more efficient through privacy amplification from subsampling and through
    clipping the gradients to limit the amount of information they leak. We explore
    the effect of different parameter combinations in logistic regression problems
    where the method can reach an accuracy close to non-private level under
    reasonably strong privacy guarantees.
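
    The perturbation mechanism, in schematic form: clip each per-example
    gradient, average, and add Gaussian noise before the ascent step. The clip
    bound and noise scale below are placeholders; calibrating sigma to a target
    (epsilon, delta) budget, and the amplification from subsampling, are the
    substantive parts omitted.

    import numpy as np

    def private_gradient(per_example_grads, clip=1.0, sigma=1.0, seed=0):
        """Clipped, noised mean gradient (Gaussian mechanism sketch)."""
        rng = np.random.default_rng(seed)
        norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
        clipped = per_example_grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        noise = sigma * clip * rng.standard_normal(per_example_grads.shape[1])
        return clipped.mean(axis=0) + noise / len(per_example_grads)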

    Learning Bound for Parameter Transfer Learning

    Wataru Kumagai
    Comments: This paper was accepted at NIPS 2016 as a poster presentation
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We consider a transfer-learning problem using the parameter transfer
    approach, in which a suitable parameter of a feature mapping is learned on
    one task and applied to another target task. We then introduce the notions of
    the local stability of parametric feature mapping and parameter transfer
    learnability, and thereby derive a learning bound for parameter transfer
    algorithms. As an application of parameter transfer learning, we discuss the
    performance of sparse coding in self-taught learning. Although self-taught
    learning algorithms with plentiful unlabeled data often show excellent
    empirical performance, they have received little theoretical attention. In
    this paper, we also provide the first theoretical learning bound for
    self-taught learning.
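
    The self-taught learning pipeline analyzed above is straightforward to
    sketch. The snippet below is an illustration on synthetic data, not the
    paper's experiments: a dictionary (the transferred parameter) is learned on
    plentiful unlabeled data, and the induced sparse-coding feature map is then
    applied, fixed, to a small labeled target task.

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Stage 1: learn the transferred parameter (a dictionary) on plentiful
    # unlabeled data.
    X_unlabeled = rng.normal(size=(500, 20))
    dico = MiniBatchDictionaryLearning(n_components=15, random_state=0)
    dico.fit(X_unlabeled)

    # Stage 2: apply the fixed feature mapping (sparse coding against the
    # learned dictionary) to a small labeled target task.
    X_labeled = rng.normal(size=(100, 20))
    y = (X_labeled[:, 0] > 0).astype(int)   # toy labels
    codes = dico.transform(X_labeled)
    clf = LogisticRegression().fit(codes, y)
    print(clf.score(codes, y))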

    A random version of principal component analysis in data clustering

    Luigi Leonardo Palese
    Comments: 18 pages, 6 figures, 2 tables
    Subjects: Quantitative Methods (q-bio.QM); Learning (cs.LG)

    Principal component analysis (PCA) is a widespread technique for data
    analysis that relies on the covariance-correlation matrix of the analyzed data.
    However, to work properly with high-dimensional data, PCA imposes severe
    mathematical constraints on the minimum number of distinct replicates or
    samples that must be included in the analysis. Here we show that a modified
    algorithm works not only on well-dimensioned datasets but also on degenerate
    ones.
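
    The abstract does not spell out the modified algorithm, so the following is
    only one plausible randomized variant, offered to fix ideas and not to be
    read as the paper's method: repeatedly subsample the observations, compute
    the leading principal axis of each subsample, and average the sign-aligned
    axes. This remains usable when the number of samples is far smaller than
    the number of variables.

    import numpy as np

    def random_pca(X, n_rounds=100, sample_frac=0.7, rng=None):
        # Randomized-PCA sketch: aggregate leading axes over many subsamples.
        rng = rng or np.random.default_rng(0)
        n = X.shape[0]
        axes = []
        for _ in range(n_rounds):
            idx = rng.choice(n, size=max(2, int(sample_frac * n)), replace=False)
            Xs = X[idx] - X[idx].mean(axis=0)
            _, _, vt = np.linalg.svd(Xs, full_matrices=False)
            v = vt[0]
            if axes and np.dot(v, axes[0]) < 0:   # fix the sign ambiguity
                v = -v
            axes.append(v)
        mean_axis = np.mean(axes, axis=0)
        return mean_axis / np.linalg.norm(mean_axis)

    # Degenerate case: far fewer samples (10) than variables (50).
    X = np.random.default_rng(1).normal(size=(10, 50))
    print(random_pca(X)[:5])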

    Regret Bounds for Lifelong Learning

    Pierre Alquier, The Tien Mai, Massimiliano Pontil
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We consider the problem of transfer learning in an online setting. Different
    tasks are presented sequentially and processed by a within-task algorithm. We
    propose a lifelong learning strategy which refines the underlying data
    representation used by the within-task algorithm, thereby transferring
    information from one task to the next. We show that when the within-task
    algorithm comes with some regret bound, our strategy inherits this good
    property. Our bounds are in expectation for a general loss function, and
    uniform for a convex loss. We discuss applications to dictionary learning
    and to a finite set of predictors. In the latter case, we improve previous
    (O(1/sqrt{m})) bounds to (O(1/m)), where (m) is the per-task sample size.
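
    For the finite-set-of-predictors case, the meta-level transfer can be
    sketched with an exponential-weights meta-learner (an illustration with
    assumed toy tasks, not the paper's exact strategy or bounds): after each
    task, the weights over the predictor pool are updated with that task's
    cumulative loss, so information flows from one task to the next.

    import numpy as np

    def lifelong_exp_weights(tasks, predictors, eta=0.5):
        # Meta-learner: exponential weights over a fixed pool of predictors,
        # updated once per task with that task's average loss.
        weights = np.ones(len(predictors)) / len(predictors)
        for X, y in tasks:                        # tasks arrive sequentially
            losses = np.array([np.mean((f(X) - y) ** 2) for f in predictors])
            weights *= np.exp(-eta * losses)      # meta-update after the task
            weights /= weights.sum()
        return weights

    rng = np.random.default_rng(0)
    predictors = [lambda X, c=c: c * X[:, 0] for c in (-1.0, 0.0, 1.0)]
    tasks = []
    for _ in range(5):                            # five sequential toy tasks
        X = rng.normal(size=(20, 3))
        tasks.append((X, X[:, 0] + 0.1 * rng.normal(size=20)))
    print(lifelong_exp_weights(tasks, predictors))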

    Causal Network Learning from Multiple Interventions of Unknown Manipulated Targets

    Yango He, Zhi Geng
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    In this paper, we discuss structure learning of causal networks from multiple
    data sets obtained by external intervention experiments where we do not know
    what variables are manipulated. For example, the conditions in these
    experiments are changed by changing temperature or using drugs, but we do not
    know what target variables are manipulated by the external interventions.
    Learning structure from such data sets is more difficult. For this case, we
    first discuss the identifiability of causal structures. Next, we present a
    graph-merging method for learning causal networks when the sample sizes of
    these interventions are large. Then, for the case in which the sample sizes
    are relatively small, we propose a data-pooling method in which all data
    sets from these interventions are pooled together for learning. Further, we
    propose a re-sampling approach to evaluate the edges of the causal network
    learned by the data-pooling method. Finally, we illustrate the proposed
    learning methods with simulations.
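
    The data-pooling and re-sampling steps can be sketched as follows. The
    snippet is illustrative: the structure learner is a placeholder correlation
    threshold rather than a real causal discovery method, and all names and
    data are hypothetical. It pools the intervention data sets, bootstraps the
    pooled sample, and reports how often each edge is recovered.

    import numpy as np

    def learn_edges(X, thresh=0.3):
        # Placeholder learner: an (undirected) edge wherever the absolute
        # correlation exceeds a threshold; a real method would be a score-
        # or constraint-based causal structure learner.
        C = np.corrcoef(X, rowvar=False)
        return set(zip(*np.where(np.triu(np.abs(C) > thresh, k=1))))

    def pooled_edge_confidence(datasets, n_boot=200, rng=None):
        # Pool all intervention data sets, then bootstrap the pooled sample
        # and count how often each edge is recovered.
        rng = rng or np.random.default_rng(0)
        pooled = np.vstack(datasets)
        counts = {}
        for _ in range(n_boot):
            idx = rng.integers(0, len(pooled), size=len(pooled))
            for e in learn_edges(pooled[idx]):
                counts[e] = counts.get(e, 0) + 1
        return {e: c / n_boot for e, c in counts.items()}

    rng = np.random.default_rng(1)
    d1 = rng.normal(size=(50, 4)); d1[:, 1] += d1[:, 0]        # X0 -> X1
    d2 = rng.normal(size=(50, 4)); d2[:, 1] += d2[:, 0] + 1.0  # intervened
    print(pooled_edge_confidence([d1, d2]))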

    Synthesis of Shared Control Protocols with Provable Safety and Performance Guarantees

    Nils Jansen, Murat Cubuktepe, Ufuk Topcu
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Learning (cs.LG)

    We formalize synthesis of shared control protocols with correctness
    guarantees for temporal logic specifications. More specifically, we introduce a
    modeling formalism in which both a human and an autonomy protocol can issue
    commands to a robot towards performing a certain task. These commands are
    blended into a joint input to the robot. The autonomy protocol is synthesized
    using an abstraction of possible human commands accounting for randomness in
    decisions caused by factors such as fatigue or incomprehensibility of the
    problem at hand. The synthesis is designed to ensure that the resulting robot
    behavior satisfies given safety and performance specifications, e.g., in
    temporal logic. Our solution is based on nonlinear programming and we address
    the inherent scalability issue by presenting alternative methods. We assess
    the feasibility and the scalability of the approach through an experimental
    evaluation.
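
    The command blending step can be sketched as a convex combination of the
    human's and the autonomy protocol's distributions over actions (an
    illustrative weighting with hypothetical numbers; the paper synthesizes the
    autonomy side so that the blended behavior provably satisfies the given
    specifications).

    import numpy as np

    def blend_commands(human_dist, autonomy_dist, b=0.5):
        # Joint input to the robot: a convex combination of the two command
        # distributions, renormalized to a valid distribution over actions.
        blended = b * np.asarray(human_dist) + (1 - b) * np.asarray(autonomy_dist)
        return blended / blended.sum()

    human = [0.7, 0.2, 0.1]      # human favors a risky action
    autonomy = [0.1, 0.3, 0.6]   # autonomy steers toward a safe one
    print(blend_commands(human, autonomy, b=0.4))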


    Information Theory

    Buffer-Aided Relaying For The Two-Hop Full-Duplex Relay Channel With Self-Interference

    Mohsen Mohammadkhani Razlighi, Nikola Zlatanov
    Comments: 22 pages, 2 figures; prepared as an extended version of a conference paper (IEEE ICC 2017, 21-25 May 2017, Paris, France)
    Subjects: Information Theory (cs.IT)

    In this paper, we investigate the two-hop full-duplex (FD) relay channel
    with self-interference and fading, which comprises a source, an FD relay,
    and a destination, where a direct source-destination link does not exist
    and the FD relay is impaired by self-interference. For this channel, we propose three
    buffer-aided relaying schemes with adaptive reception-transmission at the FD
    relay for the cases when the source and the relay both perform adaptive-power
    allocation, fixed-power allocation, and fixed-rate transmission, respectively.
    The proposed buffer-aided relaying schemes significantly improve the achievable
    rate and the throughput of the considered relay channel by enabling the FD
    relay to adaptively choose to receive, transmit, or simultaneously
    receive and transmit in a given time slot based on the qualities of the
    receiving, transmitting, and self-interference channels. Our numerical results
    show that significant performance gains are achieved using the proposed
    buffer-aided relaying schemes compared to conventional FD relaying, where the
    FD relay is forced to always simultaneously receive and transmit, and to
    buffer-aided half-duplex relaying, where the half-duplex relay cannot
    simultaneously receive and transmit.
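
    The adaptive reception-transmission selection can be illustrated with a
    small Monte-Carlo sketch (fixed-power case, with Rayleigh-distributed gains
    and all parameter values assumed for illustration; the paper derives the
    optimal policies analytically): in each slot the relay picks whichever of
    the three modes yields the largest throughput given its current buffer
    state, with residual self-interference penalizing simultaneous operation.

    import numpy as np

    def simulate_buffer_aided_fd(n_slots=10000, p=1.0, si_gain=0.1, rng=None):
        # Per-slot mode selection: receive only, transmit only, or both.
        rng = rng or np.random.default_rng(0)
        buffer, delivered = 0.0, 0.0
        for _ in range(n_slots):
            g_sr, g_rd = rng.exponential(size=2)        # fading power gains
            r_rx = np.log2(1 + p * g_sr)                # receive only
            r_tx = np.log2(1 + p * g_rd)                # transmit only
            r_rx_fd = np.log2(1 + p * g_sr / (1 + p * si_gain))  # with SI
            # Candidate modes as (incoming rate, outgoing rate) pairs; the
            # outgoing rate is limited by what the buffer currently holds.
            modes = [(r_rx, 0.0), (0.0, min(r_tx, buffer)),
                     (r_rx_fd, min(r_tx, buffer))]
            _, (inc, out) = max((inc + out, (inc, out)) for inc, out in modes)
            buffer += inc - out
            delivered += out
        return delivered / n_slots

    print(simulate_buffer_aided_fd())  # average throughput (bits/s/Hz)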

    Performance Analysis of M-QAM Multihop Relaying over mmWave Weibull Fading Channels

    Abdelaziz Soulimani, Mustapha Benjillali, Hatim Chergui, Daniel B. da Costa
    Comments: Draft, V1
    Subjects: Information Theory (cs.IT)

    The paper presents a comprehensive closed-form performance analysis
    framework for multihop cooperative communications, a promising scheme for
    next-generation mmWave systems. As the fading channel model, we adopt the
    Weibull distribution for its flexibility in covering different channel
    conditions, as supported by many recent measurement campaigns in various
    emerging 5G scenarios. The analyzed scheme consists of multiple
    “Detect-and-Forward” relays
    with generalized high-order M-QAM transmissions. The end-to-end performance of
    the multihop communication is evaluated in terms of outage probability, ergodic
    capacity, bit error rate, and symbol error rate. For all these metrics, we
    present exact closed-form expressions and their asymptotic behavior, some in
    terms of generalized hypergeometric functions. The exactness of our
    analysis is verified numerically, and Monte-Carlo simulations confirm the
    accuracy of the obtained results for different system and channel
    parameters. Moreover, since the Fox H and bivariate Fox H functions are not
    yet implemented in Matlab, we provide new code for computing these
    functions in general contexts.
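
    A Monte-Carlo sketch of the kind used to validate such closed-form results
    is shown below (illustrative parameters and a simplified Weibull power-gain
    model, not the paper's code): a multihop Detect-and-Forward link is in
    outage whenever any single hop cannot support the target rate.

    import numpy as np

    def multihop_outage(n_hops=3, snr_db=10.0, rate=2.0, shape=2.0, n_mc=100000):
        # End-to-end outage of a Detect-and-Forward chain: the weakest hop
        # determines whether the target rate is supported.
        rng = np.random.default_rng(0)
        snr = 10 ** (snr_db / 10)
        # Weibull-distributed per-hop channel power gains (unit scale).
        gains = rng.weibull(shape, size=(n_mc, n_hops))
        per_hop_rate = np.log2(1 + snr * gains)
        return np.mean(np.any(per_hop_rate < rate, axis=1))

    print(multihop_outage())  # estimated end-to-end outage probability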

    Continuity bounds for information characteristics of quantum channels depending on input dimension and on input energy

    M.E. Shirokov
    Comments: 34 pages, radically extended version of arXiv:1602.00568 (the finite energy case is added), any comments are welcome
    Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Mathematical Physics (math-ph)

    We obtain continuity bounds for basic information characteristics of quantum
    channels depending on their input dimension (when it is finite) and on the
    maximal level of input energy (when the input dimension is infinite).

    First we prove continuity bounds for the output conditional mutual
    information for a single channel and for (n) copies of a channel.

    Then we obtain estimates for variation of the output Holevo quantity with
    respect to simultaneous variations of a channel and of an input ensemble.

    As a result, tight and close-to-tight continuity bounds for basic
    capacities of quantum channels depending on the input dimension are
    obtained. They complement the Leung-Smith continuity bounds, which depend
    on the output dimension.

    Finally, continuity bounds for basic capacities of infinite-dimensional
    channels with input energy constraints are proved. They show uniform
    continuity of these capacities on the set of all channels with respect to the
    norm of complete boundedness.



