    arXiv Paper Daily: Fri, 7 Oct 2016

    Published by 我爱机器学习 (52ml.net) on 2016-10-07 00:00:00

    Neural and Evolutionary Computing

    Sequence-based Sleep Stage Classification using Conditional Neural Fields

    Intan Nurma Yulita, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: 14 pages. Submitted to Computational and Mathematical Methods in Medicine (Hindawi Publishin). Article ID 7163687
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Sleep signals from a polysomnographic database are sequences in nature.
    Commonly employed analysis and classification methods, however, ignore this
    fact and treat the sleep signals as non-sequence data. Treating the sleep
    signals as sequences, this paper compares two powerful unsupervised feature
    extractors and three sequence-based classifiers in terms of accuracy and
    computational (training and testing) time after 10-fold cross-validation.
    The compared feature extractors are Deep Belief Networks (DBN) and Fuzzy
    C-Means (FCM) clustering, whereas the compared sequence-based classifiers
    are Hidden Markov Models (HMM); Conditional Random Fields (CRF) and its
    variants, i.e., Hidden-state CRF (HCRF) and Latent-Dynamic CRF (LDCRF); and
    Conditional Neural Fields (CNF) and its variant (LDCNF). In this study, we
    use two datasets. The first is an open (public) polysomnographic dataset
    downloadable from the Internet, while the second is our own polysomnographic
    dataset (also available for download). For the first dataset, the
    combination of FCM and CNF gives the highest accuracy (96.75%) with a
    relatively short training time (0.33 hours). For the second dataset, the
    combination of DBN and CRF gives an accuracy of 99.96% but with 1.02 hours
    of training time, whereas the combination of DBN and CNF gives slightly
    lower accuracy (99.69%) but also less computation time (0.89 hours).
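
    The evaluation protocol described above (10-fold cross-validation, reporting
    accuracy together with training and testing time) is straightforward to
    reproduce for any feature-extractor/classifier pair. Below is a minimal,
    illustrative sketch using scikit-learn, with KMeans standing in for the FCM
    feature extractor and a logistic-regression classifier standing in for the
    sequence models (which scikit-learn does not provide); all data and names
    here are synthetic placeholders rather than the authors' setup.

        # Sketch of the 10-fold evaluation protocol: accuracy plus training and
        # testing time for one feature-extractor/classifier combination.
        import time
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import StratifiedKFold

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 30))       # epochs x raw features (synthetic)
        y = rng.integers(0, 5, size=1000)     # five sleep stages (synthetic)

        accs, train_t, test_t = [], [], []
        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
        for tr, te in cv.split(X, y):
            extractor = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X[tr])
            F_tr, F_te = extractor.transform(X[tr]), extractor.transform(X[te])

            t0 = time.time()
            clf = LogisticRegression(max_iter=1000).fit(F_tr, y[tr])
            train_t.append(time.time() - t0)

            t0 = time.time()
            pred = clf.predict(F_te)
            test_t.append(time.time() - t0)
            accs.append(accuracy_score(y[te], pred))

        print(f"accuracy={np.mean(accs):.4f}  "
              f"train={np.mean(train_t):.3f}s  test={np.mean(test_t):.3f}s")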

    Regularized Dynamic Boltzmann Machine with Delay Pruning for Unsupervised Learning of Temporal Sequences

    Sakyasingha Dasgupta, Takayuki Yoshizumi, Takayuki Osogami
    Comments: 6 pages, 5 figures, accepted full paper (oral presentation) at ICPR 2016
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    We introduce Delay Pruning, a simple yet powerful technique to regularize
    dynamic Boltzmann machines (DyBM). The recently introduced DyBM provides a
    particularly structured Boltzmann machine, as a generative model of a
    multi-dimensional time-series. This Boltzmann machine can have infinitely many
    layers of units but allows exact inference and learning based on its
    biologically motivated structure. DyBM uses the idea of conduction delays
    in the form of fixed-length first-in first-out (FIFO) queues: each neuron is
    connected to another via such a queue, and spikes from a pre-synaptic neuron
    travel along the queue to the post-synaptic neuron with a constant delay.
    Here, we present Delay Pruning as a mechanism to prune the lengths of the
    FIFO queues (making them zero) by setting some delay lengths to one with a
    fixed probability, and finally selecting the best-performing model with
    fixed delays. The unique structure and non-sampling-based learning rule of
    DyBM make the application of previously proposed regularization techniques
    such as Dropout or DropConnect difficult, leading to poor generalization.
    First, we evaluate the performance of Delay Pruning by letting DyBM learn a
    multidimensional temporal sequence generated by a Markov chain. Finally, we
    show the effectiveness of Delay Pruning in learning high-dimensional
    sequences using the moving MNIST dataset, and compare it with the Dropout
    and DropConnect methods.
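
    As described, the pruning step amounts to resetting a random subset of the
    conduction delays to length one with a fixed probability and keeping the
    best-performing resulting model. A rough numpy sketch of that selection
    loop follows; the scoring function is a placeholder, since the DyBM itself
    is not reproduced here, and all sizes and probabilities are invented.

        # Rough sketch of the Delay Pruning selection loop described above.
        import numpy as np

        rng = np.random.default_rng(0)

        def evaluate(delays):
            # Placeholder: in practice, train a DyBM with these FIFO-queue
            # lengths and return its validation score.
            return -float(np.sum(delays))

        n_connections = 16
        base_delays = rng.integers(2, 9, size=n_connections)  # initial queue lengths

        best_delays, best_score = base_delays, evaluate(base_delays)
        for _ in range(20):                                  # candidate pruned models
            prune_mask = rng.random(n_connections) < 0.3     # fixed pruning probability
            candidate = np.where(prune_mask, 1, base_delays) # pruned delays set to one
            score = evaluate(candidate)
            if score > best_score:
                best_delays, best_score = candidate, score

        print("selected delays:", best_delays)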

    Metaheuristic Algorithms for Convolution Neural Network

    L. M. Rasdi Rere, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: Article ID 1537325, 13 pages. Received 29 January 2016; Revised 15 April 2016; Accepted 10 May 2016. Academic Editor: Martin Hagan. in Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    A typical modern optimization technique is usually either heuristic or
    metaheuristic. Such techniques have solved many optimization problems in
    science, engineering, and industry. However, implementation strategies for
    using metaheuristics to improve the accuracy of convolutional neural
    networks (CNN), a well-known deep learning method, are still rarely
    investigated. Deep learning is a class of machine learning techniques whose
    aim is to move closer to the goal of artificial intelligence: creating a
    machine that can successfully perform any intellectual task a human can. In
    this paper, we propose implementation strategies for three popular
    metaheuristic approaches, namely simulated annealing, differential
    evolution, and harmony search, to optimize CNN. The performance of these
    metaheuristic methods in optimizing CNN on the MNIST and CIFAR
    classification datasets was evaluated and compared. Furthermore, the
    proposed methods are also compared with the original CNN. Although the
    proposed methods increase computation time, they also improve accuracy (by
    up to 7.14 percent).
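
    Of the three metaheuristics, simulated annealing is the easiest to
    illustrate in isolation: perturb a parameter vector, accept worse solutions
    with a temperature-dependent probability, and cool the temperature over
    time. The generic numpy sketch below uses a toy quadratic objective as a
    stand-in for a CNN's validation error; it is not the authors' setup.

        # Generic simulated-annealing sketch over a parameter vector.
        import numpy as np

        rng = np.random.default_rng(0)

        def loss(w):
            return float(np.sum((w - 1.0) ** 2))   # placeholder objective

        w = rng.normal(size=50)
        best_w, best_loss = w.copy(), loss(w)
        temperature = 1.0

        for step in range(2000):
            candidate = w + rng.normal(scale=0.1, size=w.shape)  # random perturbation
            delta = loss(candidate) - loss(w)
            if delta < 0 or rng.random() < np.exp(-delta / temperature):
                w = candidate                       # accept the move
                if loss(w) < best_loss:
                    best_w, best_loss = w.copy(), loss(w)
            temperature *= 0.995                    # geometric cooling schedule

        print(f"best loss found: {best_loss:.4f}")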

    Adaptive Online Sequential ELM for Concept Drift Tackling

    Arif Budiman, Mohamad Ivan Fanany, Chan Basaruddin
    Comments: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 8091267, 17 pages Received 29 January 2016, Accepted 17 May 2016. Special Issue on “Advances in Neural Networks and Hybrid-Metaheuristics: Theory, Algorithms, and Novel Engineering Applications”. Academic Editor: Stefan Haufe
    Journal-ref: Computational Intelligence and Neuroscience Volume 2016 (2016),
    Article ID 8091267, 17 pages
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    A machine learning method needs to adapt to changes in the environment over
    time. Such changes are known as concept drift. In this paper, we propose a
    concept drift tackling method as an enhancement of the Online Sequential
    Extreme Learning Machine (OS-ELM) and Constructive Enhancement OS-ELM
    (CEOS-ELM), adding adaptive capability for classification and regression
    problems. The scheme is named adaptive OS-ELM (AOS-ELM). It is a
    single-classifier scheme that handles real drift, virtual drift, and hybrid
    drift well. AOS-ELM also works well for sudden drift and recurrent context
    changes. The scheme is a simple unified method implemented in a few lines
    of code. We evaluated AOS-ELM on regression and classification problems
    using public concept drift data sets (SEA and STAGGER) and other public
    data sets such as MNIST, USPS, and IDS. Experiments show that our method
    gives a higher kappa value than a multi-classifier ELM ensemble. Even
    though AOS-ELM in practice does not need an increase in hidden nodes, we
    address some issues related to increasing the hidden nodes, such as error
    conditions and rank values. We propose taking the rank of the pseudoinverse
    matrix as an indicator parameter to detect underfitting.
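
    The rank-based indicator is concrete enough to sketch: in an ELM, the
    output weights are computed from the pseudoinverse of the hidden-layer
    activation matrix, so a rank deficiency in that matrix is a natural warning
    signal. The minimal numpy illustration below (random data, sigmoid hidden
    layer) shows the idea only; it is not the AOS-ELM implementation.

        # Minimal ELM sketch: random hidden layer, output weights via the
        # pseudoinverse, and the rank of the hidden activation matrix used as
        # an underfitting indicator.
        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 10))                        # inputs
        T = rng.integers(0, 2, size=(200, 1)).astype(float)   # targets

        n_hidden = 50
        W = rng.normal(size=(10, n_hidden))                   # fixed random input weights
        b = rng.normal(size=n_hidden)

        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                # hidden activations
        beta = np.linalg.pinv(H) @ T                          # output weights (least squares)

        rank = np.linalg.matrix_rank(H)
        print("rank of H:", rank, "out of", min(H.shape))
        if rank < min(H.shape):
            print("rank-deficient hidden matrix: possible underfitting")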

    A New Data Representation Based on Training Data Characteristics to Extract Drug Named-Entity in Medical Text

    Sadikin Mujiono, Mohamad Ivan Fanany, Chan Basaruddin
    Comments: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 3483528, 24 pages Received 27 May 2016; Revised 8 August 2016; Accepted 18 September 2016. Special Issue on “Smart Data: Where the Big Data Meets the Semantics”. Academic Editor: Trong H. Duong
    Journal-ref: Computational Intelligence and Neuroscience Volume 2016 (2016),
    Article ID 3483528, 24 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    One essential task in information extraction from the medical corpus is
    drug name recognition. Compared with text from other domains, medical text
    is special and has unique characteristics. In addition, medical text mining
    poses more challenges, e.g., more unstructured text, the fast growth of new
    terms, and a wide range of name variations for the same drug. The mining is
    even more challenging due to the lack of labeled datasets and external
    knowledge, as well as multiple token representations for a single drug
    name, which are more common in real application settings. Although many
    approaches have been proposed to tackle the task, some problems remain,
    with poor F-score performance (less than 0.75). This paper presents new
    data representation techniques to overcome some of those challenges. We
    propose three data representation techniques based on the characteristics
    of word distribution and word similarities obtained from word embedding
    training. The first technique is evaluated with a standard NN model, i.e.,
    an MLP (Multi-Layer Perceptron). The second technique involves two deep
    network classifiers, i.e., DBN (Deep Belief Networks) and SAE (Stacked
    Denoising Autoencoders). The third technique represents the sentence as a
    sequence that is evaluated with a recurrent NN model, i.e., LSTM (Long
    Short-Term Memory). In extracting drug name entities, the third technique
    gives the best F-score performance compared to the state of the art, with
    an average F-score of 0.8645.
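
    As a rough illustration of the first kind of representation (features
    derived from word embeddings, classified by an MLP), one can concatenate
    the embedding vectors of a small context window around each token and train
    a standard classifier on the result. In the sketch below, the embeddings,
    token stream, and labels are random placeholders, so it only shows the
    shape of such a pipeline, not the paper's representation techniques.

        # Illustrative token classifier: each token is represented by the
        # concatenated embeddings of its context window and classified as
        # drug / non-drug by an MLP.
        import numpy as np
        from sklearn.neural_network import MLPClassifier

        rng = np.random.default_rng(0)
        vocab_size, dim, window = 500, 50, 2

        embeddings = rng.normal(size=(vocab_size, dim))   # stand-in for trained vectors
        tokens = rng.integers(0, vocab_size, size=2000)   # synthetic token stream
        labels = rng.integers(0, 2, size=2000)            # 1 = drug-name token

        def features(i):
            # embeddings of tokens i-window .. i+window, clipped at the ends
            idx = np.clip(np.arange(i - window, i + window + 1), 0, len(tokens) - 1)
            return embeddings[tokens[idx]].ravel()

        X = np.stack([features(i) for i in range(len(tokens))])
        clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
        clf.fit(X[:1500], labels[:1500])
        print("held-out accuracy:", clf.score(X[1500:], labels[1500:]))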

    Multiple Regularizations Deep Learning for Paddy Growth Stages Classification from LANDSAT-8

    Ines Heidieni Ikasari, Vina Ayumi, Mohamad Ivan Fanany, Sidik Mulyono
    Comments: 11 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    This study uses remote sensing technology, which can provide rapid, spatial
    information about the condition of the earth's surface. The study area was
    Karawang District, in the northern part of West Java, Indonesia. We address
    paddy growth stage classification using LANDSAT-8 image data obtained from
    multi-sensor remote sensing images taken from October 2015 to August 2016.
    This study pursues fast and accurate classification of paddy growth stages
    by employing multiple regularizations on deep learning methods such as DNN
    (Deep Neural Networks) and 1-D CNN (1-D Convolutional Neural Networks). The
    regularizations used are Fast Dropout, Dropout, and Batch Normalization. To
    evaluate effectiveness, we also compared our method with other machine
    learning methods (Logistic Regression, SVM, Random Forest, and XGBoost).
    The data used are seven bands of LANDSAT-8 spectral data samples that
    correspond to paddy growth stage data obtained from the i-Sky (eye in the
    sky) Innovation system. The growth stages are determined based on the paddy
    crop phenology profile from a time series of LANDSAT-8 images. The
    classification results show that an MLP using the multiple regularizations
    Dropout and Batch Normalization achieves the highest accuracy on this
    dataset.
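
    A minimal PyTorch sketch of the kind of regularized network described (a
    small fully connected model over the seven LANDSAT-8 bands with Batch
    Normalization and Dropout) is shown below. The layer sizes and the number
    of growth-stage classes are assumptions for illustration, not the authors'
    configuration.

        # Small fully connected network over seven spectral bands with Batch
        # Normalization and Dropout as regularizers.
        import torch
        import torch.nn as nn

        n_bands, n_stages = 7, 4            # class count is an assumption

        model = nn.Sequential(
            nn.Linear(n_bands, 64),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(64, 32),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(32, n_stages),
        )

        x = torch.randn(16, n_bands)        # a batch of spectral samples
        logits = model(x)                   # per-class scores
        loss = nn.CrossEntropyLoss()(logits, torch.randint(0, n_stages, (16,)))
        loss.backward()                     # a training loop would step an optimizer here
        print(logits.shape, float(loss))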

    Ischemic Stroke Identification Based on EEG and EOG using 1D Convolutional Neural Network and Batch Normalization

    Endang Purnama Giri, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: 13 pages. To be published in ICACSIS 2016
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    In 2015, stroke was the number one cause of death in Indonesia. The
    majority of strokes are ischemic. The standard tool for diagnosing stroke
    is the CT scan. For developing countries like Indonesia, the availability
    of CT scanners is very limited and they are still relatively expensive.
    Given this limited availability, another device with the potential to
    diagnose stroke in Indonesia is EEG. Ischemic stroke occurs because of an
    obstruction that makes the cerebral blood flow (CBF) of a stroke patient
    lower than the CBF of a normal person (control), so that the EEG signal
    shows a deceleration. In this study, we examine the ability of a 1D
    Convolutional Neural Network (1DCNN) to construct a classification model
    that can distinguish EEG and EOG stroke data from EEG and EOG control data.
    To accelerate the training of our model, we use Batch Normalization. With
    data from 62 subjects and a leave-one-out scenario with five repeated
    measurements, we obtain an average accuracy of 0.86 (F-score 0.861) at only
    200 epochs. This result is better than all of the shallow, popular
    classifiers used as comparators (whose best result was an accuracy of 0.69
    and an F-score of 0.72). The features used in our study were only 24
    handcrafted features obtained with a simple feature extraction process.
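
    For concreteness, a PyTorch sketch of a 1D convolutional network with Batch
    Normalization for the two-class (stroke vs. control) setting is shown
    below. The channel count, sequence length, and filter sizes are
    illustrative assumptions; the paper's exact architecture is not reproduced.

        # 1D CNN with Batch Normalization for two-class signal classification.
        import torch
        import torch.nn as nn

        n_channels, seq_len = 24, 128       # assumed input layout

        model = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(64, 2),               # stroke vs. control
        )

        x = torch.randn(8, n_channels, seq_len)
        print(model(x).shape)               # -> torch.Size([8, 2])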

    Combining Generative and Discriminative Neural Networks for Sleep Stages Classification

    Endang Purnama Giri, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: Submitted to Computational Intelligence and Neuroscience (Hindawi Publishing). 13 pages
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Sleep stage patterns provide important clues in diagnosing the presence of
    sleep disorders. By analyzing sleep stage patterns and extracting their
    features from EEG, EOG, and EMG signals, we can classify sleep stages. This
    study presents a novel classification model for predicting sleep stages
    with high accuracy. The main idea is to combine the generative capability
    of a Deep Belief Network (DBN) with the discriminative ability and sequence
    pattern recognition capability of Long Short-Term Memory (LSTM). We use a
    DBN that is treated as an automatic higher-level feature generator. The
    input to the DBN is the 28 “handcrafted” features used in previous sleep
    stage studies. We compared our method with other techniques that combine a
    DBN with a Hidden Markov Model (HMM). In this study, we exploit the
    sequence, or time-series, characteristics of the sleep dataset. To the best
    of our knowledge, most current sleep analysis from polysomnograms relies
    only on single-instance (non-sequence) labels for classification. In this
    study, we used two datasets: an open dataset that is treated as a
    benchmark, and our own sleep stage dataset (available for download) to
    verify the results further. Our experiments showed that the combination of
    DBN with LSTM gives better overall accuracy: 98.75% (F-score = 0.9875) for
    the benchmark dataset and 98.94% (F-score = 0.9894) for the MKG dataset.
    This result is better than the state of the art of sleep stage
    classification, which was 91.31%.
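
    Schematically, the model treats a (pre-trained) DBN as a per-epoch feature
    transform and an LSTM as the sequence classifier on top of it. The PyTorch
    sketch below substitutes an untrained feed-forward encoder for the DBN and
    uses invented sizes throughout, so it only conveys the shape of the
    pipeline.

        # Schematic DBN-then-LSTM pipeline: per-epoch feature encoder
        # (standing in for a pre-trained DBN), an LSTM over the sequence of
        # epochs, and a per-epoch sleep-stage classifier.
        import torch
        import torch.nn as nn

        n_handcrafted, hidden, n_stages = 28, 64, 5

        encoder = nn.Sequential(            # stand-in for the DBN features
            nn.Linear(n_handcrafted, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        lstm = nn.LSTM(input_size=hidden, hidden_size=hidden, batch_first=True)
        head = nn.Linear(hidden, n_stages)

        x = torch.randn(4, 30, n_handcrafted)   # 4 recordings, 30 epochs each
        feats = encoder(x)                      # per-epoch higher-level features
        seq_out, _ = lstm(feats)                # sequence modelling across epochs
        logits = head(seq_out)                  # one stage prediction per epoch
        print(logits.shape)                     # -> torch.Size([4, 30, 5])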


    Computer Vision and Pattern Recognition

    Driving in the Matrix: Can Virtual Worlds Replace Human-Generated Annotations for Real World Tasks?

    Matthew Johnson-Roberson, Charles Barto, Rounak Mehta, Sharath Nittur Sridhar, Ram Vasudevan
    Comments: 8 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    Deep learning has rapidly transformed the state-of-the-art algorithms used
    to address a variety of problems in computer vision and robotics. These
    breakthroughs have, however, relied upon massive amounts of human-annotated
    training data. This time-consuming process has begun impeding the progress
    of these deep learning efforts. This paper describes a method to
    incorporate photo-realistic computer images from a simulation engine to
    rapidly generate annotated data that can be used for training machine
    learning algorithms. We demonstrate that a state-of-the-art architecture,
    trained only using these synthetic annotations, performs better than the
    identical architecture trained on human-annotated real-world data, when
    tested on the KITTI data set for vehicle detection. By training machine
    learning algorithms on a rich virtual world, this paper illustrates that
    real objects in real scenes can be learned and classified using synthetic
    data. This approach offers the possibility of accelerating deep learning’s
    application to sensor-based classification problems like those that appear
    in self-driving cars.

    PetroSurf3D – A high-resolution 3D Dataset of Rock Art for Surface Segmentation

    Georg Poier, Markus Seidl, Matthias Zeppelzauer, Christian Reinbacher, Martin Schaich, Giovanna Bellando, Alberto Marretta, Horst Bischof
    Comments: Dataset and more information can be found at this http URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Ancient rock engravings (so-called petroglyphs) represent one of the earliest
    surviving artifacts describing the life of our ancestors. Recently, modern 3D
    scanning techniques found their application in the domain of rock art
    documentation by providing high-resolution reconstructions of rock surfaces.
    Reconstruction results demonstrate the strengths of novel 3D techniques and
    have the potential to replace the traditional (manual) documentation techniques
    of archaeologists.

    An important analysis task in rock art documentation is the segmentation of
    petroglyphs. To foster automation of this tedious step, we present a
    high-resolution 3D surface dataset of natural rock surfaces which exhibit
    different petroglyphs together with accurate expert ground-truth annotations.
    To our knowledge, this dataset is the first public 3D surface dataset which
    allows for surface segmentation at sub-millimeter scale. We conduct experiments
    with state-of-the-art methods to generate a baseline for the dataset and verify
    that the size and variability of the data is sufficient to successfully adopt
    even recent data-hungry Convolutional Neural Networks (CNNs). Furthermore, we
    experimentally demonstrate that the provided geometric information is key to
    successful automatic segmentation and strongly outperforms color-based
    segmentation. The introduced dataset represents a novel benchmark for 3D
    surface segmentation methods in general and is intended to foster comparability
    among different approaches in the future.

    Metaheuristic Algorithms for Convolution Neural Network

    L. M. Rasdi Rere, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: Article ID 1537325, 13 pages. Received 29 January 2016; Revised 15 April 2016; Accepted 10 May 2016. Academic Editor: Martin Hagan. in Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    A typical modern optimization technique is usually either heuristic or
    metaheuristic. Such techniques have solved many optimization problems in
    science, engineering, and industry. However, implementation strategies for
    using metaheuristics to improve the accuracy of convolutional neural
    networks (CNN), a well-known deep learning method, are still rarely
    investigated. Deep learning is a class of machine learning techniques whose
    aim is to move closer to the goal of artificial intelligence: creating a
    machine that can successfully perform any intellectual task a human can. In
    this paper, we propose implementation strategies for three popular
    metaheuristic approaches, namely simulated annealing, differential
    evolution, and harmony search, to optimize CNN. The performance of these
    metaheuristic methods in optimizing CNN on the MNIST and CIFAR
    classification datasets was evaluated and compared. Furthermore, the
    proposed methods are also compared with the original CNN. Although the
    proposed methods increase computation time, they also improve accuracy (by
    up to 7.14 percent).

    A Vision-based Indoor Positioning System on Shopping Mall Context

    Ziwei Xu, Haitian Zheng, Minjian Pang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    With the help of a map and GPS, outdoor navigation from one spot to another
    can be done quickly and well. Unfortunately, inside a shopping mall, where
    a GPS signal is hardly available, navigation becomes troublesome. In this
    paper, we propose an indoor navigation system to address the problem.
    Unlike most existing indoor navigation systems, which rely heavily on
    infrastructure and pre-labelled maps, our system uses only photos taken by
    cellphone cameras as input. We utilize multiple image processing techniques
    to parse photos of a mall’s shopping instructions and construct a
    topological map of the mall. During navigation, we make use of deep neural
    networks to extract information from the environment and find out the
    real-time position of the user. We propose a new feature fusion method to
    help automatically identify shops in a photo.

    Do They All Look the Same? Deciphering Chinese, Japanese and Koreans by Fine-Grained Deep Learning

    Yu Wang, Haofu Liao, Yang Feng, Xiangyang Xu, Jiebo Luo
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We study to what extent Chinese, Japanese and Korean faces can be
    classified and which facial attributes offer the most important cues.
    First, we propose a novel way of obtaining large numbers of facial images
    with nationality labels. Then we train state-of-the-art neural networks
    with these labeled images. We are able to achieve an accuracy of 75.03% in
    the classification task, with chance being 33.33% and human accuracy
    38.89%. Further, we train multiple
    facial attribute classifiers to identify the most distinctive features for each
    group. We find that Chinese, Japanese and Koreans do exhibit substantial
    differences in certain attributes, such as bangs, smiling, and bushy eyebrows.
    Along the way, we uncover several gender-related cross-country patterns as
    well. Our work, which complements existing APIs such as Microsoft Cognitive
    Services and Face++, could find potential applications in tourism, e-commerce,
    social media marketing, criminal justice and even counter-terrorism.

    Compressive Imaging with Iterative Forward Models

    Hsiou-Yuan Liu, Ulugbek S. Kamilov, Dehong Liu, Hassan Mansour, Petros T. Boufounos
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)

    We propose a new compressive imaging method for reconstructing 2D or 3D
    objects from their scattered wave-field measurements. Our method relies on a
    novel, nonlinear measurement model that can account for the multiple scattering
    phenomenon, which makes the method preferable in applications where linear
    measurement models are inaccurate. We construct the measurement model by
    expanding the scattered wave-field with an accelerated-gradient method, which
    is guaranteed to converge and is suitable for large-scale problems. We provide
    explicit formulas for computing the gradient of our measurement model with
    respect to the unknown image, which enables image formation with a sparsity-
    driven numerical optimization algorithm. We validate the method both
    analytically and with numerical simulations.
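
    The image-formation step described above, gradient-based and
    sparsity-driven optimization against a measurement model, can be
    illustrated with the standard ISTA iteration. The numpy sketch below uses
    a random linear operator as a simplified stand-in for the paper's nonlinear
    scattering model; all sizes and parameters are illustrative.

        # ISTA sketch: minimize 0.5*||y - A x||^2 + lam*||x||_1 by alternating
        # a gradient step on the data term with soft-thresholding.
        import numpy as np

        rng = np.random.default_rng(0)
        m, n = 80, 200
        A = rng.normal(size=(m, n)) / np.sqrt(m)            # stand-in measurement operator
        x_true = np.zeros(n)
        x_true[rng.choice(n, 10, replace=False)] = rng.normal(size=10)  # sparse object
        y = A @ x_true + 0.01 * rng.normal(size=m)          # noisy measurements

        lam = 0.05
        step = 1.0 / np.linalg.norm(A, 2) ** 2              # 1 / Lipschitz constant
        x = np.zeros(n)
        for _ in range(500):
            grad = A.T @ (A @ x - y)                        # gradient of the data term
            z = x - step * grad
            x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold

        print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))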

    Searching Scenes by Abstracting Things

    Svetlana Kordumova, Jan C. van Gemert, Cees G. M. Snoek, Arnold W. M. Smeulders
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we propose to represent a scene as an abstraction of ‘things’.
    We start from ‘things’ as generated by modern object proposals, and we
    investigate their immediately observable properties: position, size, aspect
    ratio and color, and those only. Where the recent successes and excitement of
    the field lie in object identification, we represent the scene composition
    independent of object identities. We make three contributions in this work.
    First, we study simple observable properties of ‘things’, which we call the
    things syntax. Second, we propose translating the things syntax into
    abstract linguistic statements and study their descriptive effect for
    retrieving scenes. Third, we
    propose querying of scenes with abstract block illustrations and study their
    effectiveness to discriminate among different types of scenes. The benefit of
    abstract statements and block illustrations is that we generate them directly
    from the images, without any learning beforehand as in the standard attribute
    learning. Surprisingly, we show that even though we use the simplest of
    features from ‘things’ layout and no learning at all, we can still retrieve
    scenes reasonably well.

    Multiple Regularizations Deep Learning for Paddy Growth Stages Classification from LANDSAT-8

    Ines Heidieni Ikasari, Vina Ayumi, Mohamad Ivan Fanany, Sidik Mulyono
    Comments: 11 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    This study uses remote sensing technology, which can provide rapid, spatial
    information about the condition of the earth's surface. The study area was
    Karawang District, in the northern part of West Java, Indonesia. We address
    paddy growth stage classification using LANDSAT-8 image data obtained from
    multi-sensor remote sensing images taken from October 2015 to August 2016.
    This study pursues fast and accurate classification of paddy growth stages
    by employing multiple regularizations on deep learning methods such as DNN
    (Deep Neural Networks) and 1-D CNN (1-D Convolutional Neural Networks). The
    regularizations used are Fast Dropout, Dropout, and Batch Normalization. To
    evaluate effectiveness, we also compared our method with other machine
    learning methods (Logistic Regression, SVM, Random Forest, and XGBoost).
    The data used are seven bands of LANDSAT-8 spectral data samples that
    correspond to paddy growth stage data obtained from the i-Sky (eye in the
    sky) Innovation system. The growth stages are determined based on the paddy
    crop phenology profile from a time series of LANDSAT-8 images. The
    classification results show that an MLP using the multiple regularizations
    Dropout and Batch Normalization achieves the highest accuracy on this
    dataset.

    PCA-aided Fully Convolutional Networks for Semantic Segmentation of Multi-channel fMRI

    Lei Tai, Qiong Ye, Ming Liu
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

    Semantic segmentation of functional magnetic resonance imaging (fMRI) is of
    great value for pathology diagnosis and the decision systems of medical
    robots. Multi-channel fMRI data provide more information about pathological
    features, but the increased amount of data causes complexity in feature
    detection. This paper proposes a principal component analysis (PCA)-aided
    fully convolutional network to deal specifically with multi-channel fMRI.
    We transfer the learned weights of contemporary classification networks to
    the segmentation task by fine-tuning. The experimental results are compared
    with various methods, e.g., k-NN. A new labelling strategy is proposed to
    solve the semantic segmentation problem with unclear boundaries. Even with
    a small training dataset, the test results demonstrate that our model
    outperforms other pathological feature detection methods. Besides, its
    forward inference takes only 90 milliseconds for a single set of fMRI data.
    To our knowledge, this is the first work to realize pixel-wise labeling of
    multi-channel magnetic resonance images using an FCN.
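
    The PCA-aided preprocessing idea can be sketched compactly: compress the
    many fMRI channels at each pixel down to a few principal components before
    feeding a segmentation network. In the scikit-learn sketch below, the
    volume shape and the number of retained components are assumptions made
    for illustration.

        # Reduce a multi-channel volume to 3 principal-component channels per
        # pixel, e.g. as the input to a fully convolutional network.
        import numpy as np
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(0)
        H, W, C = 64, 64, 12                 # image size and channel count (assumed)
        volume = rng.normal(size=(H, W, C))

        pixels = volume.reshape(-1, C)       # one row per pixel, one column per channel
        pca = PCA(n_components=3)
        reduced = pca.fit_transform(pixels).reshape(H, W, 3)

        print(reduced.shape, pca.explained_variance_ratio_.round(3))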

    A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

    Nian Liu, Junwei Han
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Traditional saliency models usually adopt hand-crafted image features and
    human-designed mechanisms to calculate local or global contrast. In this
    paper, we propose a novel computational saliency model, the deep spatial
    contextual long-term recurrent convolutional network (DSCLRCN), to predict
    where people look in natural scenes. DSCLRCN first automatically learns
    saliency-related local features at each image location in parallel. Then,
    in contrast with most other deep-network-based saliency models, which infer
    saliency in local contexts, DSCLRCN can mimic the cortical lateral
    inhibition mechanisms of the human visual system and incorporate global
    contexts to assess the saliency of each image location by leveraging a deep
    spatial long short-term memory (DSLSTM) model. Moreover, we also integrate
    scene context modulation in DSLSTM for saliency inference, leading to a
    novel deep spatial contextual LSTM (DSCLSTM) model. The whole network can
    be trained end-to-end and works efficiently at test time. Experimental
    results on two benchmark datasets show that DSCLRCN can achieve
    state-of-the-art performance on saliency detection. Furthermore, the
    proposed DSCLSTM model can significantly boost saliency detection
    performance by incorporating both global spatial interconnections and scene
    context modulation, which may provide novel inspiration for future studies
    of computational saliency models.

    Exploiting Depth from Single Monocular Images for Object Detection and Semantic Segmentation

    Yuanzhouhan Cao, Chunhua Shen, Heng Tao Shen
    Comments: 14 pages. Accepted to IEEE T. Image Processing
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Augmenting RGB data with measured depth has been shown to improve the
    performance of a range of tasks in computer vision including object detection
    and semantic segmentation. Although depth sensors such as the Microsoft Kinect
    have facilitated easy acquisition of such depth information, the vast majority
    of images used in vision tasks do not contain depth information. In this paper,
    we show that augmenting RGB images with estimated depth can also improve the
    accuracy of both object detection and semantic segmentation. Specifically, we
    first exploit the recent success of depth estimation from monocular images and
    learn a deep depth estimation model. Then we learn deep depth features from the
    estimated depth and combine with RGB features for object detection and semantic
    segmentation. Additionally, we propose an RGB-D semantic segmentation method
    which applies a multi-task training scheme: semantic label prediction and depth
    value regression. We test our methods on several datasets and demonstrate that
    incorporating information from estimated depth improves the performance of
    object detection and semantic segmentation remarkably.

    Distortion Varieties

    Joe Kileel, Zuzana Kukelova, Tomas Pajdla, Bernd Sturmfels
    Comments: 25 pages
    Subjects: Algebraic Geometry (math.AG); Computer Vision and Pattern Recognition (cs.CV)

    The distortion varieties of a given projective variety are parametrized by
    duplicating coordinates and multiplying them with monomials. We study their
    degrees and defining equations. Exact formulas are obtained for the case of
    one-parameter distortions. These are based on Chow polytopes and Gröbner
    bases. Multi-parameter distortions are studied using tropical geometry. The
    motivation for distortion varieties comes from multi-view geometry in computer
    vision. Our theory furnishes a new framework for formulating and solving
    minimal problems for camera models with image distortion.

    Supervision via Competition: Robot Adversaries for Learning Tasks

    Lerrel Pinto, James Davidson, Abhinav Gupta
    Comments: Submission to ICRA 2017
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    There has been a recent paradigm shift in robotics to data-driven learning
    for planning and control. Due to the large number of experiences required
    for training, most of these approaches use a self-supervised paradigm:
    using sensors to measure success/failure. However, in most cases, these
    sensors provide weak supervision at best. In this work, we propose an
    adversarial learning framework that pits an adversary against the robot
    learning the task. In an effort to defeat the adversary, the original robot
    learns to perform the task with more robustness, leading to overall
    improved performance. We show that this adversarial framework forces the
    robot to learn a better grasping model in order to overcome the adversary.
    By grasping 82% of presented novel objects, compared to 68% without an
    adversary, we demonstrate the utility of creating adversaries. We also
    demonstrate via experiments that having robots in an adversarial setting
    might be a better learning strategy than having multiple collaborative
    robots.

    Much Ado About Time: Exhaustive Annotation of Temporal Data

    Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta
    Comments: HCOMP 2016 Camera Ready
    Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)

    Large-scale annotated datasets allow AI systems to learn from and build upon
    the knowledge of the crowd. Many crowdsourcing techniques have been developed
    for collecting image annotations. These techniques often implicitly rely on the
    fact that a new input image takes a negligible amount of time to perceive. In
    contrast, we investigate and determine the most cost-effective way of obtaining
    high-quality multi-label annotations for temporal data such as videos. Watching
    even a short 30-second video clip requires a significant time investment from a
    crowd worker; thus, requesting multiple annotations following a single viewing
    is an important cost-saving strategy. But how many questions should we ask per
    video? We conclude that the optimal strategy is to ask as many questions as
    possible in a HIT (up to 52 binary questions after watching a 30-second video
    clip in our experiments). We demonstrate that while workers may not correctly
    answer all questions, the cost-benefit analysis nevertheless favors consensus
    from multiple such cheap-yet-imperfect iterations over more complex
    alternatives. When compared with a one-question-per-video baseline, our method
    is able to achieve a 10% improvement in recall (76.7% ours versus 66.7%
    baseline) at comparable precision (83.8% ours versus 83.0% baseline) in about
    half the annotation time (3.8 minutes ours compared to 7.1 minutes baseline).
    We demonstrate the effectiveness of our method by collecting multi-label
    annotations of 157 human activities on 1,815 videos.


    Artificial Intelligence

    Adaptive Online Sequential ELM for Concept Drift Tackling

    Arif Budiman, Mohamad Ivan Fanany, Chan Basaruddin
    Comments: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 8091267, 17 pages Received 29 January 2016, Accepted 17 May 2016. Special Issue on “Advances in Neural Networks and Hybrid-Metaheuristics: Theory, Algorithms, and Novel Engineering Applications”. Academic Editor: Stefan Haufe
    Journal-ref: Computational Intelligence and Neuroscience Volume 2016 (2016),
    Article ID 8091267, 17 pages
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    A machine learning method needs to adapt to changes in the environment over
    time. Such changes are known as concept drift. In this paper, we propose a
    concept drift tackling method as an enhancement of the Online Sequential
    Extreme Learning Machine (OS-ELM) and Constructive Enhancement OS-ELM
    (CEOS-ELM), adding adaptive capability for classification and regression
    problems. The scheme is named adaptive OS-ELM (AOS-ELM). It is a
    single-classifier scheme that handles real drift, virtual drift, and hybrid
    drift well. AOS-ELM also works well for sudden drift and recurrent context
    changes. The scheme is a simple unified method implemented in a few lines
    of code. We evaluated AOS-ELM on regression and classification problems
    using public concept drift data sets (SEA and STAGGER) and other public
    data sets such as MNIST, USPS, and IDS. Experiments show that our method
    gives a higher kappa value than a multi-classifier ELM ensemble. Even
    though AOS-ELM in practice does not need an increase in hidden nodes, we
    address some issues related to increasing the hidden nodes, such as error
    conditions and rank values. We propose taking the rank of the pseudoinverse
    matrix as an indicator parameter to detect underfitting.

    Metaheuristic Algorithms for Convolution Neural Network

    L. M. Rasdi Rere, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: Article ID 1537325, 13 pages. Received 29 January 2016; Revised 15 April 2016; Accepted 10 May 2016. Academic Editor: Martin Hagan. in Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    A typical modern optimization technique is usually either heuristic or
    metaheuristic. Such techniques have solved many optimization problems in
    science, engineering, and industry. However, implementation strategies for
    using metaheuristics to improve the accuracy of convolutional neural
    networks (CNN), a well-known deep learning method, are still rarely
    investigated. Deep learning is a class of machine learning techniques whose
    aim is to move closer to the goal of artificial intelligence: creating a
    machine that can successfully perform any intellectual task a human can. In
    this paper, we propose implementation strategies for three popular
    metaheuristic approaches, namely simulated annealing, differential
    evolution, and harmony search, to optimize CNN. The performance of these
    metaheuristic methods in optimizing CNN on the MNIST and CIFAR
    classification datasets was evaluated and compared. Furthermore, the
    proposed methods are also compared with the original CNN. Although the
    proposed methods increase computation time, they also improve accuracy (by
    up to 7.14 percent).

    A New Data Representation Based on Training Data Characteristics to Extract Drug Named-Entity in Medical Text

    Sadikin Mujiono, Mohamad Ivan Fanany, Chan Basaruddin
    Comments: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 3483528, 24 pages Received 27 May 2016; Revised 8 August 2016; Accepted 18 September 2016. Special Issue on “Smart Data: Where the Big Data Meets the Semantics”. Academic Editor: Trong H. Duong
    Journal-ref: Computational Intelligence and Neuroscience Volume 2016 (2016),
    Article ID 3483528, 24 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    One essential task in information extraction from the medical corpus is
    drug name recognition. Compared with text from other domains, medical text
    is special and has unique characteristics. In addition, medical text mining
    poses more challenges, e.g., more unstructured text, the fast growth of new
    terms, and a wide range of name variations for the same drug. The mining is
    even more challenging due to the lack of labeled datasets and external
    knowledge, as well as multiple token representations for a single drug
    name, which are more common in real application settings. Although many
    approaches have been proposed to tackle the task, some problems remain,
    with poor F-score performance (less than 0.75). This paper presents new
    data representation techniques to overcome some of those challenges. We
    propose three data representation techniques based on the characteristics
    of word distribution and word similarities obtained from word embedding
    training. The first technique is evaluated with a standard NN model, i.e.,
    an MLP (Multi-Layer Perceptron). The second technique involves two deep
    network classifiers, i.e., DBN (Deep Belief Networks) and SAE (Stacked
    Denoising Autoencoders). The third technique represents the sentence as a
    sequence that is evaluated with a recurrent NN model, i.e., LSTM (Long
    Short-Term Memory). In extracting drug name entities, the third technique
    gives the best F-score performance compared to the state of the art, with
    an average F-score of 0.8645.

    Parallel Large-Scale Attribute Reduction on Cloud Systems

    Junbo Zhang, Tianrui Li, Yi Pan
    Comments: 14 pages, 10 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)

    The rapid growth of emerging information technologies and application
    patterns in modern society, e.g., the Internet, the Internet of Things,
    Cloud Computing and Tri-network Convergence, has brought about the era of
    big data. Big data contains huge value; however, mining knowledge from big
    data is a tremendously challenging task because of data uncertainty and
    inconsistency. Attribute reduction (also known as feature selection) can
    not only be used as an effective preprocessing step, but also exploits data
    redundancy to reduce the uncertainty. However, existing solutions are
    designed either 1) for a single machine, which means the entire data set
    must fit in main memory and parallelism is limited; or 2) for the Hadoop
    platform, which means that the data have to be loaded into the distributed
    memory frequently and processing therefore becomes inefficient. In this
    paper, we overcome these shortcomings to achieve the maximum efficiency
    possible, and propose a unified framework for Parallel Large-scale
    Attribute Reduction, termed PLAR, for big data analysis. PLAR consists of
    three components: 1) Granular Computing (GrC)-based initialization, which
    converts a decision table (i.e., the original data representation) into a
    granularity representation that reduces the amount of space and hence can
    be easily cached in the distributed memory; 2) model-parallelism, which
    simultaneously evaluates all feature candidates and makes attribute
    reduction highly parallelizable; and 3) data-parallelism, which computes
    the significance of an attribute in parallel in a MapReduce style. We
    implement PLAR with four representative heuristic feature selection
    algorithms on Spark, and evaluate them on various huge datasets, including
    UCI and astronomical datasets, demonstrating our method's advantages over
    existing solutions.
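
    The model-parallel step, evaluating every candidate attribute
    independently, maps naturally onto any map-style parallel backend. The toy
    Python sketch below uses multiprocessing in place of Spark and a simple
    placeholder significance measure (how consistently a single attribute
    predicts the decision by majority vote within its value groups); it is a
    sketch of the general pattern, not PLAR itself.

        # Evaluate the significance of each candidate attribute in parallel.
        import numpy as np
        from multiprocessing import Pool

        rng = np.random.default_rng(0)
        data = rng.integers(0, 3, size=(1000, 8))    # 8 condition attributes
        decision = rng.integers(0, 2, size=1000)     # decision attribute

        def significance(attr):
            # Placeholder measure: fraction of objects whose decision matches
            # the majority decision within their attribute-value group.
            correct = 0
            for value in np.unique(data[:, attr]):
                mask = data[:, attr] == value
                majority = np.bincount(decision[mask]).argmax()
                correct += int(np.sum(decision[mask] == majority))
            return attr, correct / len(decision)

        if __name__ == "__main__":
            with Pool(processes=4) as pool:
                scores = pool.map(significance, range(data.shape[1]))
            print("attribute significances:", scores)
            print("best attribute:", max(scores, key=lambda s: s[1])[0])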

    Human Decision-Making under Limited Time

    Pedro A. Ortega, Alan A. Stocker
    Comments: 9 pages, 4 figures, NIPS Advances in Neural Information Processing Systems 29, 2016
    Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI)

    Subjective expected utility theory assumes that decision-makers possess
    unlimited computational resources to reason about their choices; however,
    virtually all decisions in everyday life are made under resource
    constraints, i.e., decision-makers are bounded in their rationality. Here
    we experimentally tested the predictions made by a formalization of bounded
    rationality based on ideas from statistical mechanics and information
    theory. We systematically tested human subjects on their ability to solve
    combinatorial puzzles under different time limitations. We found that our
    bounded-rational model accounts well for the data. The decomposition of the
    fitted model parameter into the subjects' expected utility function and a
    resource parameter provides interesting insight into the subjects'
    information capacity limits. Our results confirm that humans gradually fall
    back on their learned prior choice patterns when confronted with increasing
    resource limitations.


    Information Retrieval

    Discriminative Information Retrieval for Knowledge Discovery

    Tongfei Chen, Benjamin Van Durme
    Subjects: Information Retrieval (cs.IR)

    We propose a framework for discriminative Information Retrieval (IR) atop
    linguistic features, trained to improve the recall of tasks such as answer
    candidate passage retrieval, the initial step in text-based Question Answering
    (QA). We formalize this as an instance of linear feature-based IR (Metzler and
    Croft, 2007), illustrating how a variety of knowledge discovery tasks are
    captured under this approach, leading to a 44% improvement in recall for
    candidate triage for QA.

    A Robust Framework for Classifying Evolving Document Streams in an Expert-Machine-Crowd Setting

    Muhammad Imran, Sanjay Chawla, Carlos Castillo
    Comments: Accepted at ICDM 2016
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    An emerging challenge in the online classification of social media data
    streams is to keep the categories used for classification up-to-date. In this
    paper, we propose an innovative framework based on an Expert-Machine-Crowd
    (EMC) triad to help categorize items by continuously identifying novel concepts
    in heterogeneous data streams often riddled with outliers. We unify constrained
    clustering and outlier detection by formulating a novel optimization problem:
    COD-Means. We design an algorithm to solve the COD-Means problem and show that
    COD-Means will not only help detect novel categories but also seamlessly
    discover human annotation errors and improve the overall quality of the
    categorization process. Experiments on diverse real data sets demonstrate that
    our approach is both effective and efficient.
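
    The abstract does not spell out the COD-Means objective, but the general
    idea of unifying clustering with outlier detection is commonly realized by
    setting aside the points farthest from their assigned centroids and
    excluding them from the centroid update. The toy numpy sketch below
    illustrates that generic idea only; it is not the authors' algorithm.

        # Toy joint clustering / outlier detection: k-means updates that
        # exclude the n_outliers farthest points from the centroid computation.
        import numpy as np

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(loc=c, size=(100, 2)) for c in (0, 5, 10)])
        X = np.vstack([X, rng.uniform(-5, 15, size=(10, 2))])  # injected outliers

        k, n_outliers = 3, 10
        centroids = X[rng.choice(len(X), k, replace=False)]

        for _ in range(20):
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            assign = dists.argmin(axis=1)
            point_dist = dists[np.arange(len(X)), assign]
            outliers = np.argsort(point_dist)[-n_outliers:]    # farthest points
            keep = np.setdiff1d(np.arange(len(X)), outliers)
            for j in range(k):
                members = keep[assign[keep] == j]
                if len(members):
                    centroids[j] = X[members].mean(axis=0)

        print("centroids:\n", centroids.round(2))
        print("flagged outlier indices:", np.sort(outliers))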


    Computation and Language

    Scalable Machine Translation in Memory Constrained Environments

    Paul Baltescu
    Comments: Master Thesis
    Subjects: Computation and Language (cs.CL)

    Machine translation is the discipline concerned with developing automated
    tools for translating from one human language to another. Statistical machine
    translation (SMT) is the dominant paradigm in this field. In SMT, translations
    are generated by means of statistical models whose parameters are learned from
    bilingual data. Scalability is a key concern in SMT, as one would like to make
    use of as much data as possible to train better translation systems.

    In recent years, mobile devices with adequate computing power have become
    widely available. Despite being very successful, mobile applications relying on
    NLP systems continue to follow a client-server architecture, which is of
    limited use because access to the internet is often limited and expensive. The goal
    of this dissertation is to show how to construct a scalable machine translation
    system that can operate with the limited resources available on a mobile
    device.

    The main challenge for porting translation systems on mobile devices is
    memory usage. The amount of memory available on a mobile device is far less
    than what is typically available on the server side of a client-server
    application. In this thesis, we investigate alternatives for the two components
    which prevent standard translation systems from working on mobile devices due
    to high memory usage. We show that once these standard components are replaced
    with our proposed alternatives, we obtain a scalable translation system that
    can work on a device with limited memory.

    Toward Automatic Understanding of the Function of Affective Language in Support Groups

    Amit Navindgi, Caroline Brun, Cécile Boulard Masson, Scott Nowson
    Comments: 9 pages, 1 figure, conference workshop
    Subjects: Computation and Language (cs.CL)

    Understanding expressions of emotions in support forums has considerable
    value and NLP methods are key to automating this. Many approaches
    understandably use subjective categories which are more fine-grained than a
    straightforward polarity-based spectrum. However, the definition of such
    categories is non-trivial and, in fact, we argue for a need to incorporate
    communicative elements even beyond subjectivity. To support our position, we
    report experiments on a sentiment-labelled corpus of posts taken from a medical
    support forum. We argue that not only is a more fine-grained approach to text
    analysis important, but simultaneously recognising the social function behind
    affective expressions enables a more accurate and valuable level of
    understanding.

    A New Data Representation Based on Training Data Characteristics to Extract Drug Named-Entity in Medical Text

    Sadikin Mujiono, Mohamad Ivan Fanany, Chan Basaruddin
    Comments: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 3483528, 24 pages Received 27 May 2016; Revised 8 August 2016; Accepted 18 September 2016. Special Issue on “Smart Data: Where the Big Data Meets the Semantics”. Academic Editor: Trong H. Duong
    Journal-ref: Computational Intelligence and Neuroscience Volume 2016 (2016),
    Article ID 3483528, 24 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    One essential task in information extraction from the medical corpus is
    drug name recognition. Compared with text from other domains, medical text
    is special and has unique characteristics. In addition, medical text mining
    poses more challenges, e.g., more unstructured text, the fast growth of new
    terms, and a wide range of name variations for the same drug. The mining is
    even more challenging due to the lack of labeled datasets and external
    knowledge, as well as multiple token representations for a single drug
    name, which are more common in real application settings. Although many
    approaches have been proposed to tackle the task, some problems remain,
    with poor F-score performance (less than 0.75). This paper presents new
    data representation techniques to overcome some of those challenges. We
    propose three data representation techniques based on the characteristics
    of word distribution and word similarities obtained from word embedding
    training. The first technique is evaluated with a standard NN model, i.e.,
    an MLP (Multi-Layer Perceptron). The second technique involves two deep
    network classifiers, i.e., DBN (Deep Belief Networks) and SAE (Stacked
    Denoising Autoencoders). The third technique represents the sentence as a
    sequence that is evaluated with a recurrent NN model, i.e., LSTM (Long
    Short-Term Memory). In extracting drug name entities, the third technique
    gives the best F-score performance compared to the state of the art, with
    an average F-score of 0.8645.

    Neural-based Noise Filtering from Word Embeddings

    Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
    Comments: 9 pages, 4 figures, COLING 2016
    Subjects: Computation and Language (cs.CL)

    Word embeddings have been demonstrated to benefit NLP tasks impressively.
    Yet, there is room for improvement in the vector representations, because
    current word embeddings typically contain unnecessary information, i.e., noise.
    We propose two novel models to improve word embeddings by unsupervised
    learning, in order to yield word denoising embeddings. The word denoising
    embeddings are obtained by strengthening salient information and weakening
    noise in the original word embeddings, based on a deep feed-forward neural
    network filter. Results from benchmark tasks show that the filtered word
    denoising embeddings outperform the original word embeddings.

    A Robust Framework for Classifying Evolving Document Streams in an Expert-Machine-Crowd Setting

    Muhammad Imran, Sanjay Chawla, Carlos Castillo
    Comments: Accepted at ICDM 2016
    Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

    An emerging challenge in the online classification of social media data
    streams is to keep the categories used for classification up-to-date. In this
    paper, we propose an innovative framework based on an Expert-Machine-Crowd
    (EMC) triad to help categorize items by continuously identifying novel concepts
    in heterogeneous data streams often riddled with outliers. We unify constrained
    clustering and outlier detection by formulating a novel optimization problem:
    COD-Means. We design an algorithm to solve the COD-Means problem and show that
    COD-Means will not only help detect novel categories but also seamlessly
    discover human annotation errors and improve the overall quality of the
    categorization process. Experiments on diverse real data sets demonstrate that
    our approach is both effective and efficient.

    Automatic Detection of Small Groups of Persons, Influential Members, Relations and Hierarchy in Written Conversations Using Fuzzy Logic

    French Pope III, Rouzbeh A. Shirvani, Mugizi Robert Rwebangira, Mohamed Chouikha, Ayo Taylor, Andres Alarcon Ramirez, Amirsina Torfi
    Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)

    Nowadays a lot of data is collected in online forums. One of the key tasks
    is to determine the social structure of these online groups, for example
    the identification of subgroups within a larger group. We approach the
    grouping of individuals as a classification problem. The classifier is
    based on fuzzy logic. The inputs to the classifier are linguistic features
    and the degree of relationship among individuals, and its output is the
    grouping of individuals. We also incorporate a method that ranks the
    members of each detected subgroup to identify the hierarchies within it.
    Data from the HBO television show The Wire is used to analyze the efficacy
    and usefulness of fuzzy-logic-based methods as alternatives to the
    classical statistical methods usually used for these problems. The proposed
    methodology could automatically detect the most influential members of each
    organization in The Wire with 90% accuracy.

    Generating Simulations of Motion Events from Verbal Descriptions

    James Pustejovsky, Nikhil Krishnaswamy
    Comments: 11 pages, 5 figures, *SEM workshop, COLING 2014
    Subjects: Computation and Language (cs.CL)

    In this paper, we describe a computational model for motion events in natural
    language that maps from linguistic expressions, through a dynamic event
    interpretation, into three-dimensional temporal simulations in a model.
    Starting with the model from (Pustejovsky and Moszkowicz, 2011), we analyze
    motion events using temporally-traced Labelled Transition Systems. We model the
    distinction between path- and manner-motion in an operational semantics, and
    further distinguish different types of manner-of-motion verbs in terms of the
    mereo-topological relations that hold throughout the process of movement. From
    these representations, we generate minimal models, which are realized as
    three-dimensional simulations in software developed with the game engine,
    Unity. The generated simulations act as a conceptual “debugger” for the
    semantics of different motion verbs: that is, by testing for consistency and
    informativeness in the model, simulations expose the presuppositions associated
    with linguistic expressions and their compositions. Because the model
    generation component is still incomplete, this paper focuses on an
    implementation which maps directly from linguistic interpretations into the
    Unity code snippets that create the simulations.


    Distributed, Parallel, and Cluster Computing

    Parallel Large-Scale Attribute Reduction on Cloud Systems

    Junbo Zhang, Tianrui Li, Yi Pan
    Comments: 14 pages, 10 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)

    The rapid growth of emerging information technologies and application
    patterns in modern society, e.g., Internet, Internet of Things, Cloud Computing
    and Tri-network Convergence, has ushered in the era of big data. Big data
    contains huge value; however, mining knowledge from big data is a
    tremendously challenging task because of data uncertainty and inconsistency.
    Attribute reduction (also known as feature selection) can not only be used as
    an effective preprocessing step, but also exploits the data redundancy to
    reduce the uncertainty. However, existing solutions are designed either 1)
    for a single machine, which means the entire dataset must fit in main memory
    and parallelism is limited, or 2) for the Hadoop platform, which means the
    data have to be loaded into distributed memory frequently and therefore
    become inefficient. In this paper, we overcome these shortcomings to achieve
    the maximum efficiency possible and propose a unified framework for Parallel
    Large-scale Attribute Reduction, termed PLAR, for big data analysis. PLAR
    consists of three components: 1) Granular Computing (GrC)-based
    initialization, which converts a decision table (i.e., the original data
    representation) into a granularity representation that reduces the amount of
    space and hence can be easily cached in the distributed memory; 2)
    model-parallelism, which simultaneously evaluates all feature candidates and
    makes attribute reduction highly parallelizable; 3) data-parallelism, which
    computes the significance of an attribute in parallel in a MapReduce style.
    We implement PLAR with four representative heuristic feature selection
    algorithms on Spark, and evaluate them on various huge datasets, including
    UCI and astronomical datasets, demonstrating our method’s advantages over
    existing solutions.
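
    As an illustration of the data-parallel step, the sketch below computes a
    standard rough-set attribute significance (the dependency degree based on the
    positive region) in a map/reduce shape. The exact significance measure and
    Spark plumbing used by PLAR may differ, so treat this as an assumed,
    simplified stand-in written in plain Python.

        from collections import defaultdict

        def dependency_degree(records, attribute_subset, decision_index):
            # Map: group records by their values on the selected condition attributes.
            groups = defaultdict(lambda: [0, set()])
            for rec in records:
                key = tuple(rec[a] for a in attribute_subset)
                groups[key][0] += 1
                groups[key][1].add(rec[decision_index])
            # Reduce: equivalence classes with a single decision form the positive region.
            positive = sum(n for n, decisions in groups.values() if len(decisions) == 1)
            return positive / len(records)

        # Toy decision table: condition attributes at indices 0-2, decision at index 3.
        table = [(1, 0, 1, 'y'), (1, 0, 1, 'y'), (0, 1, 1, 'n'),
                 (0, 1, 0, 'n'), (1, 1, 1, 'n')]
        print(dependency_degree(table, attribute_subset=(0,), decision_index=3))  # 0.4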

    A Survey and Measurement Study of GPU DVFS on Energy Conservation

    Xinxin Mei, Qiang Wang, Xiaowen Chu
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Energy efficiency has become one of the top design criteria for current
    computing systems. The dynamic voltage and frequency scaling (DVFS) has been
    widely adopted by laptop computers, servers, and mobile devices to conserve
    energy, while GPU DVFS is still at a relatively early stage. This paper aims at
    exploring the impact of GPU DVFS on the application performance and power
    consumption, and furthermore, on energy conservation. We survey the
    state-of-the-art GPU DVFS characterizations, and then summarize recent research
    works on GPU power and performance models. We also conduct real GPU DVFS
    experiments on NVIDIA Fermi and Maxwell GPUs. According to our experimental
    results, GPU DVFS has significant potential for energy saving. The effect of
    scaling core voltage/frequency and memory voltage/frequency depends not only
    on the GPU architecture, but also on the characteristics of GPU applications.

    RedThreads: An Interface for Application-level Fault Detection/Correction through Adaptive Redundant Multithreading

    Saurabh Hukerikar, Keita Teranishi, Pedro C. Diniz, Robert F. Lucas
    Comments: Submitted to the International Journal of Parallel Programming
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    In the presence of accelerated fault rates, which are projected to be the
    norm on future exascale systems, it will become increasingly difficult for
    high-performance computing (HPC) applications to accomplish useful computation.
    Due to the fault-oblivious nature of current HPC programming paradigms and
    execution environments, HPC applications are insufficiently equipped to deal
    with errors. We believe that HPC applications should be enabled with
    capabilities to actively search for and correct errors in their computations.
    The redundant multithreading (RMT) approach offers lightweight replicated
    execution streams of program instructions within the context of a single
    application process. However, the use of complete redundancy imposes
    significant overhead on application performance.

    In this paper we present RedThreads, an interface that provides
    application-level fault detection and correction based on RMT, but applies the
    thread-level redundancy adaptively. We describe the RedThreads syntax and
    semantics, and the supporting compiler infrastructure and runtime system. Our
    approach enables application programmers to scope the extent of redundant
    computation. Additionally, the runtime system permits the use of RMT to be
    dynamically enabled, or disabled, based on the resiliency needs of the
    application and the state of the system. Our experimental results demonstrate
    how adaptive RMT exploits programmer insight and runtime inference to
    dynamically navigate the trade-off space between an application’s resilience
    coverage and the associated performance overhead of redundant computation.


    Learning

    Regularized Dynamic Boltzmann Machine with Delay Pruning for Unsupervised Learning of Temporal Sequences

    Sakyasingha Dasgupta, Takayuki Yoshizumi, Takayuki Osogami
    Comments: 6 pages, 5 figures, accepted full paper (oral presentation) at ICPR 2016
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    We introduce Delay Pruning, a simple yet powerful technique to regularize
    dynamic Boltzmann machines (DyBM). The recently introduced DyBM provides a
    particularly structured Boltzmann machine, as a generative model of a
    multi-dimensional time-series. This Boltzmann machine can have infinitely many
    layers of units but allows exact inference and learning based on its
    biologically motivated structure. DyBM uses the idea of conduction delays in
    the form of fixed length first-in first-out (FIFO) queues, with a neuron
    connected to another via this FIFO queue, and spikes from a pre-synaptic neuron
    travel along the queue to the post-synaptic neuron with a constant period of
    delay. Here, we present Delay Pruning as a mechanism to prune the lengths of
    the FIFO queues (making them zero) by setting some delay lengths to one with a
    fixed probability, and finally selecting the best performing model with fixed
    delays. The uniqueness of structure and a non-sampling based learning rule in
    DyBM, make the application of previously proposed regularization techniques
    like Dropout or DropConnect difficult, leading to poor generalization. First,
    we evaluate the performance of Delay Pruning to let DyBM learn a
    multidimensional temporal sequence generated by a Markov chain. Finally, we
    show the effectiveness of delay pruning in learning high dimensional sequences
    using the moving MNIST dataset, and compare it with Dropout and DropConnect
    methods.
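
    A minimal sketch of the pruning step described above, assuming each connection
    is represented only by its FIFO delay length and that DyBM training itself is
    supplied by the caller (the function name train_and_score is a placeholder,
    not something from the paper):

        import random

        def prune_delays(delay_lengths, prune_prob=0.5, seed=0):
            # Set some conduction delays to 1 (i.e., prune the FIFO queue to zero
            # length) with a fixed probability.
            rng = random.Random(seed)
            return [1 if rng.random() < prune_prob else d for d in delay_lengths]

        def delay_pruning(candidate_count, delay_lengths, train_and_score):
            # Generate several pruned candidates, train each with its delays held
            # fixed, and keep the best-performing model.
            best_score, best_delays = float("-inf"), delay_lengths
            for seed in range(candidate_count):
                pruned = prune_delays(delay_lengths, seed=seed)
                score = train_and_score(pruned)  # user-supplied DyBM training
                if score > best_score:
                    best_score, best_delays = score, pruned
            return best_delays, best_score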

    Active exploration in parameterized reinforcement learning

    Mehdi Khamassi, Costas Tzafestas
    Comments: Submitted to EWRL2016
    Subjects: Learning (cs.LG)

    Online model-free reinforcement learning (RL) methods with continuous actions
    play a prominent role in real-world applications such as robotics. However,
    when confronted with non-stationary environments, these
    methods crucially rely on an exploration-exploitation trade-off which is rarely
    dynamically and automatically adjusted to changes in the environment. Here we
    propose an active exploration algorithm for RL in structured (parameterized)
    continuous action space. This framework deals with a set of discrete actions,
    each of which is parameterized with continuous variables. Discrete exploration
    is controlled through a Boltzmann softmax function with an inverse temperature
    $\beta$ parameter. In parallel, a Gaussian exploration is applied to the
    continuous action parameters. We apply a meta-learning algorithm based on the
    comparison between variations of short-term and long-term reward running
    averages to simultaneously tune $\beta$ and the width of the Gaussian
    distribution from which continuous action parameters are drawn. When applied to
    a simple virtual human-robot interaction task, we show that this algorithm
    outperforms continuous parameterized RL both without active exploration and
    with active exploration based on uncertainty variations measured by a
    Kalman-Q-learning algorithm.
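
    A rough sketch of the exploration machinery described in the abstract: a
    Boltzmann softmax over discrete actions with inverse temperature $\beta$,
    Gaussian noise on the continuous action parameters, and a meta-learning step
    that adjusts $\beta$ and the Gaussian width from the gap between short- and
    long-term reward averages. The exact update rule (here a multiplicative,
    clipped adjustment) is an assumption for illustration, not the paper's.

        import numpy as np

        def boltzmann_choice(q_values, beta, rng):
            # Discrete exploration: softmax with inverse temperature beta.
            p = np.exp(beta * (q_values - q_values.max()))
            p /= p.sum()
            return rng.choice(len(q_values), p=p)

        def gaussian_params(mean_params, sigma, rng):
            # Continuous exploration: Gaussian noise around the current parameters.
            return rng.normal(mean_params, sigma)

        def meta_update(beta, sigma, r_short, r_long, gain=0.1,
                        beta_range=(0.5, 20.0), sigma_range=(0.01, 1.0)):
            # Illustrative meta-learning step: when the short-term reward average
            # exceeds the long-term one, exploit more (raise beta, shrink sigma);
            # otherwise explore more.
            delta = r_short - r_long
            beta = float(np.clip(beta * np.exp(gain * delta), *beta_range))
            sigma = float(np.clip(sigma * np.exp(-gain * delta), *sigma_range))
            return beta, sigma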

    Connecting Generative Adversarial Networks and Actor-Critic Methods

    David Pfau, Oriol Vinyals
    Comments: Submission to NIPS 2016 Workshop on Adversarial Training
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Both generative adversarial networks (GAN) in unsupervised learning and
    actor-critic methods in reinforcement learning (RL) have gained a reputation
    for being difficult to optimize. Practitioners in both fields have amassed a
    large number of strategies to mitigate these instabilities and improve
    training. Here we show that GANs can be viewed as actor-critic methods in an
    environment where the actor cannot affect the reward. We review the strategies
    for stabilizing training for each class of models, both those that generalize
    between the two and those that are particular to that model. We also review a
    number of extensions to GANs and RL algorithms with even more complicated
    information flow. We hope that by highlighting this formal connection we will
    encourage both GAN and RL communities to develop general, scalable, and stable
    algorithms for multilevel optimization with deep networks, and to draw
    inspiration across communities.

    Using Non-invertible Data Transformations to Build Adversary-Resistant Deep Neural Networks

    Qinglong Wang, Wenbo Guo, Alexander G. Ororbia II, Xinyu Xing, Lin Lin, C. Lee Giles, Xue Liu, Peng Liu, Gang Xiong
    Subjects: Learning (cs.LG)

    Deep neural networks have proven to be quite effective in a wide variety of
    machine learning tasks, ranging from improved speech recognition systems to
    advancing the development of autonomous vehicles. However, despite their
    superior performance in many applications, these models have been recently
    shown to be susceptible to a particular type of attack possible through the
    generation of particular synthetic examples referred to as adversarial samples.
    These samples are constructed by manipulating real examples from the training
    data distribution in order to “fool” the original neural model, resulting in
    misclassification (with high confidence) of previously correctly classified
    samples. Addressing this weakness is of utmost importance if deep neural
    architectures are to be applied to critical applications, such as those in the
    domain of cybersecurity. In this paper, we present an analysis of this
    fundamental flaw lurking in all neural architectures to uncover limitations of
    previously proposed defense mechanisms. More importantly, we present a unifying
    framework for protecting deep neural models using a non-invertible data
    transformation–developing two adversary-resilient architectures utilizing both
    linear and nonlinear dimensionality reduction. Empirical results indicate that
    our framework provides better robustness than state-of-the-art solutions
    while incurring negligible degradation in accuracy.
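
    The unifying idea is a non-invertible, dimensionality-reducing front end
    placed before the classifier, so that an input-space adversarial perturbation
    cannot be exactly preserved. A minimal sketch with a linear reduction (PCA)
    and a logistic-regression stand-in for the deep model; the data, component
    count, and classifier here are placeholders, not the authors' architectures.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Toy stand-in data; in the paper's setting the inputs would be images, etc.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 784))
        y = (X[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)

        # Projecting onto a low-dimensional subspace discards information, so the
        # transformation has no inverse for an attacker to exploit exactly.
        model = make_pipeline(PCA(n_components=32), LogisticRegression(max_iter=1000))
        model.fit(X, y)
        print("train accuracy:", model.score(X, y))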

    Ischemic Stroke Identification Based on EEG and EOG using 1D Convolutional Neural Network and Batch Normalization

    Endang Purnama Giri, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: 13 pages. To be published in ICACSIS 2016
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    In 2015, stroke was the number one cause of death in Indonesia. The majority
    of strokes are ischemic. The standard tool for diagnosing stroke is the
    CT-Scan. For developing countries like Indonesia, the availability of CT-Scan
    is very limited and it remains relatively expensive. Given this limited
    availability, another device with the potential to diagnose stroke in
    Indonesia is the EEG. Ischemic stroke occurs because of an obstruction that
    makes the cerebral blood flow (CBF) of a stroke patient lower than the CBF of
    a normal person (control), so that the EEG signal shows a slowing. In this
    study, we examine the ability of a 1D Convolutional Neural Network (1DCNN) to
    construct a classification model that can distinguish EEG and EOG stroke data
    from EEG and EOG control data. To accelerate the training process of our
    model, we use Batch Normalization. Involving data from 62 subjects and a
    leave-one-out scenario with five repeated measurements, we obtain an average
    accuracy of 0.86 (F-score 0.861) at only 200 epochs. This result is better
    than all of the shallow and popular classifiers used as comparators (best
    comparator accuracy 0.69 and F-score 0.72). The features used in our study
    were only 24 handcrafted features obtained with a simple feature extraction
    process.
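
    A minimal sketch of a 1DCNN with Batch Normalization of the kind described,
    written in PyTorch; the channel counts, kernel sizes, and the mapping of the
    24 handcrafted features to input channels are illustrative assumptions rather
    than the authors' exact architecture.

        import torch
        import torch.nn as nn

        class Simple1DCNN(nn.Module):
            # 1D convolutions + BatchNorm for two-class EEG/EOG classification.
            def __init__(self, in_channels=24, n_classes=2):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
                    nn.BatchNorm1d(32), nn.ReLU(), nn.MaxPool1d(2),
                    nn.Conv1d(32, 64, kernel_size=5, padding=2),
                    nn.BatchNorm1d(64), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
                )
                self.classifier = nn.Linear(64, n_classes)

            def forward(self, x):  # x: (batch, channels, time)
                h = self.features(x).squeeze(-1)
                return self.classifier(h)

        model = Simple1DCNN()
        logits = model(torch.randn(8, 24, 128))  # batch of 8 dummy sequences
        print(logits.shape)                      # torch.Size([8, 2])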

    Combining Generative and Discriminative Neural Networks for Sleep Stages Classification

    Endang Purnama Giri, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: Submitted to Computational Intelligence and Neuroscience (Hindawi Publishing). 13 pages
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Sleep stage patterns provide important clues in diagnosing the presence of
    sleep disorders. By analyzing sleep stage patterns and extracting their
    features from EEG, EOG, and EMG signals, we can classify sleep stages. This
    study presents a novel classification model for predicting sleep stages with
    high accuracy. The main idea is to combine the generative capability of a
    Deep Belief Network (DBN) with the discriminative ability and
    sequence-pattern-recognition capability of Long Short-Term Memory (LSTM). We
    use the DBN as an automatic generator of higher-level features. The input to
    the DBN is 28 “handcrafted” features as used in previous sleep stage studies.
    We compared our method with other techniques that combine the DBN with a
    Hidden Markov Model (HMM). In this study, we exploit the sequence, or time
    series, characteristics of the sleep dataset. To the best of our knowledge,
    most present sleep analyses from polysomnograms rely only on single-instance
    (non-sequence) labels for classification. In this study, we used two
    datasets: an open dataset that is treated as a benchmark, and our own sleep
    stages dataset (available for download) to verify the results further. Our
    experiments showed that the combination of DBN and LSTM gives a better
    overall accuracy of 98.75% (F-score = 0.9875) on the benchmark dataset and
    98.94% (F-score = 0.9894) on the MKG dataset. This result is better than the
    state of the art in sleep stage classification, which was 91.31%.

    A Methodology for Customizing Clinical Tests for Esophageal Cancer based on Patient Preferences

    Asis Roy, Sourangshu Bhattacharya, Kalyan Guin
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Tests for esophageal cancer can be expensive and uncomfortable and can have
    side effects. For many patients, we can predict the non-existence of the
    disease with 100% certainty, using only demographics, lifestyle, and medical
    history information. Our objective is to devise a general methodology for
    customizing tests using user preferences so that expensive or uncomfortable
    tests can be avoided. We propose to use classifiers trained from electronic
    health records (EHR) for the selection of tests. The key idea is to design
    classifiers with a 100% false normal rate, possibly at the cost of higher
    false abnormal rates. We compare Naive Bayes classification (NB), Random
    Forests (RF), Support Vector Machines (SVM) and Logistic Regression (LR), and
    find kernel Logistic Regression to be most suitable for the task. We propose
    an algorithm for finding the best probability threshold for kernel LR, based
    on test set accuracy. Using the proposed algorithm, we describe schemes for
    selecting tests, which appear as features in the automatic classification
    algorithm, using preferences on costs and discomfort of the users. We test
    our methodology with EHRs collected for more than 3000 patients, as part of a
    project carried out by a reputed hospital in Mumbai, India. Kernel SVM and
    kernel LR with a polynomial kernel of degree 3 yield an accuracy of 99.8% and
    sensitivity of 100%, without the MP features, i.e. using only clinical tests.
    We demonstrate our test selection algorithm using two case studies, one using
    the cost of clinical tests, and the other using “discomfort” values for
    clinical tests. We compute the test sets corresponding to the lowest false
    abnormals for each criterion described above, using exhaustive enumeration of
    the 15 clinical tests. The sets turn out to be different, substantiating our
    claim that one can customize test sets based on user preferences.
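
    The thresholding idea can be sketched as follows: on a labeled set, take the
    largest probability threshold that still flags every abnormal case (a 0%
    false normal rate) and read off the resulting false abnormal rate. This is an
    assumed, simplified version of the search; the paper selects the kernel LR
    threshold based on test set accuracy.

        import numpy as np

        def threshold_for_full_sensitivity(p_abnormal, y_true):
            # Largest threshold that still flags every abnormal case, i.e., a 0%
            # false normal rate on the given set.
            return p_abnormal[y_true == 1].min()

        def false_abnormal_rate(p_abnormal, y_true, threshold):
            normals = y_true == 0
            return float(np.mean(p_abnormal[normals] >= threshold))

        # Example with synthetic scores standing in for kernel LR probabilities.
        rng = np.random.default_rng(1)
        y = rng.integers(0, 2, size=500)
        p = np.clip(0.5 * y + rng.normal(0.3, 0.2, size=500), 0, 1)
        t = threshold_for_full_sensitivity(p, y)
        print("threshold:", t, "false abnormal rate:", false_abnormal_rate(p, y, t))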

    Low-tubal-rank Tensor Completion using Alternating Minimization

    Xiao-Yang Liu, Shuchin Aeron, Vaneet Aggarwal, Xiaodong Wang
    Subjects: Learning (cs.LG); Information Theory (cs.IT)

    The low-tubal-rank tensor model has been recently proposed for real-world
    multidimensional data. In this paper, we study the low-tubal-rank tensor
    completion problem, i.e., to recover a third-order tensor by observing a subset
    of its elements selected uniformly at random. We propose a fast iterative
    algorithm, called {\em Tubal-Alt-Min}, that is inspired by a similar approach
    for low-rank matrix completion. The unknown low-tubal-rank tensor is
    represented as the product of two much smaller tensors with the low-tubal-rank
    property being automatically incorporated, and Tubal-Alt-Min alternates between
    estimating those two tensors using tensor least squares minimization. First, we
    note that tensor least squares minimization is different from its matrix
    counterpart and nontrivial as the circular convolution operator of the
    low-tubal-rank tensor model is intertwined with the sub-sampling operator.
    Second, the theoretical performance guarantee is challenging since
    Tubal-Alt-Min is iterative and nonconvex in nature. We prove that 1)
    Tubal-Alt-Min guarantees exponential convergence to the global optima, and 2)
    for an $n \times n \times k$ tensor with tubal-rank $r \ll n$, the required
    sampling complexity is $O(nr^2k \log^3 n)$ and the computational complexity is
    $O(n^2rk^2 \log^2 n)$. Third, on both synthetic data and real-world video data,
    evaluation results show that compared with tensor-nuclear norm minimization
    (TNN-ADMM), Tubal-Alt-Min improves the recovery error dramatically (by orders
    of magnitude). It is estimated that Tubal-Alt-Min converges at an exponential
    rate $10^{-0.4423\,\text{Iter}}$ where $\text{Iter}$ denotes the number of
    iterations, which is much faster than TNN-ADMM’s $10^{-0.0332\,\text{Iter}}$,
    and the running time can be accelerated by more than $5$ times for a
    $200 \times 200 \times 20$ tensor.
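
    For intuition, the t-product behind the low-tubal-rank model reduces to
    independent matrix products on the frontal slices after a DFT along the third
    mode. The sketch below runs alternating least squares in that Fourier domain
    for a fully observed tensor; it is a simplified illustration under that
    full-observation assumption, not the paper's sampling-aware Tubal-Alt-Min.

        import numpy as np

        def tubal_altmin_full(T, r, iters=10):
            # Factor a fully observed n1 x n2 x k tensor as a t-product A * B with
            # tubal-rank r by alternating least squares, slice by slice in the
            # Fourier domain along the third mode.
            n1, n2, k = T.shape
            Tf = np.fft.fft(T, axis=2)
            rng = np.random.default_rng(0)
            Af = np.fft.fft(rng.normal(size=(n1, r, k)), axis=2)
            Bf = np.empty((r, n2, k), dtype=complex)
            for _ in range(iters):
                for j in range(k):
                    Bf[:, :, j] = np.linalg.lstsq(Af[:, :, j], Tf[:, :, j],
                                                  rcond=None)[0]
                    Af[:, :, j] = np.linalg.lstsq(Bf[:, :, j].conj().T,
                                                  Tf[:, :, j].conj().T,
                                                  rcond=None)[0].conj().T
            slices = [Af[:, :, j] @ Bf[:, :, j] for j in range(k)]
            return np.fft.ifft(np.stack(slices, axis=2), axis=2).real

        # Example: X_hat = tubal_altmin_full(np.random.rand(20, 20, 5), r=3)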

    Generalized Inverse Classification

    Michael T. Lash, Qihang Lin, W. Nick Street, Jennifer G. Robinson, Jeffrey Ohlmann
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Inverse classification is the process of perturbing an instance in a
    meaningful way such that it is more likely to conform to a specific class.
    Historical methods that address such a problem are often framed to leverage
    only a single classifier, or specific set of classifiers. These works are often
    accompanied by naive assumptions. In this work we propose generalized inverse
    classification (GIC), which avoids restricting the classification model that
    can be used. We incorporate this formulation into a refined framework in which
    GIC takes place. Under this framework, GIC operates on features that are
    immediately actionable. Each change incurs an individual cost, either linear or
    non-linear, and the changes are constrained to stay within a specified level of
    cumulative change (budget). Furthermore, our framework incorporates the
    estimation of features that change as a consequence of direct actions taken
    (indirectly changeable features). To solve such a problem, we propose three
    real-valued heuristic-based methods and two sensitivity analysis-based
    comparison methods, each of which is evaluated on two freely available
    real-world datasets. Our results demonstrate the validity and benefits of our
    formulation, framework, and methods.
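
    As a rough illustration of the budgeted setting only (not the authors'
    heuristics, and ignoring indirectly changeable features), the sketch below
    greedily nudges directly actionable features, charging a linear cost per unit
    of change, until the cumulative-cost budget is spent or the predicted risk
    stops improving.

        import numpy as np

        def inverse_classify(x, risk, actionable, costs, budget, step=0.1, iters=200):
            # Greedy sketch: nudge the actionable feature with the best risk
            # reduction per unit of (linear) cost within the remaining budget.
            x = np.asarray(x, dtype=float).copy()
            spent = 0.0
            for _ in range(iters):
                best = None
                for j in actionable:
                    for delta in (step, -step):
                        cost = costs[j] * abs(delta)
                        if spent + cost > budget:
                            continue
                        x_try = x.copy()
                        x_try[j] += delta
                        gain = risk(x) - risk(x_try)
                        if gain > 0 and (best is None or gain / cost > best[0]):
                            best = (gain / cost, j, delta, cost)
                if best is None:
                    break
                _, j, delta, cost = best
                x[j] += delta
                spent += cost
            return x, spent

        # Toy risk function (a probability-like score we want to reduce).
        risk = lambda v: 1.0 / (1.0 + np.exp(-(v[0] + 2.0 * v[1])))
        x_new, used = inverse_classify(np.array([1.0, 1.0, 5.0]), risk,
                                       actionable=[0, 1],
                                       costs={0: 1.0, 1: 2.0}, budget=1.5)
        print(x_new, used)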

    Polynomial-time Tensor Decompositions with Sum-of-Squares

    Tengyu Ma, Jonathan Shi, David Steurer
    Comments: to appear in FOCS 2016
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    We give new algorithms based on the sum-of-squares method for tensor
    decomposition. Our results improve the best known running times from
    quasi-polynomial to polynomial for several problems, including decomposing
    random overcomplete 3-tensors and learning overcomplete dictionaries with
    constant relative sparsity. We also give the first robust analysis for
    decomposing overcomplete 4-tensors in the smoothed analysis model. A key
    ingredient of our analysis is to establish small spectral gaps in moment
    matrices derived from solutions to sum-of-squares relaxations. To enable this
    analysis we augment sum-of-squares relaxations with spectral analogs of maximum
    entropy constraints.

    Efficient L1-Norm Principal-Component Analysis via Bit Flipping

    Panos P. Markopoulos, Sandipan Kundu, Shubham Chamadia, Dimitris A. Pados
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Machine Learning (stat.ML)

    It was shown recently that the $K$ L1-norm principal components (L1-PCs) of a
    real-valued data matrix $\mathbf X \in \mathbb R^{D \times N}$ ($N$ data
    samples of $D$ dimensions) can be exactly calculated with cost
    $\mathcal{O}(2^{NK})$ or, when advantageous, $\mathcal{O}(N^{dK - K + 1})$
    where $d=\mathrm{rank}(\mathbf X)$, $K<d$ [1],[2]. In applications where
    $\mathbf X$ is large (e.g., “big” data of large $N$ and/or “heavy” data of
    large $d$), these costs are prohibitive. In this work, we present a novel
    suboptimal algorithm for the calculation of the $K < d$ L1-PCs of $\mathbf X$
    of cost $\mathcal O(ND\,\mathrm{min}\{N,D\} + N^2(K^4 + dK^2) + dNK^3)$, which
    is comparable to that of standard (L2-norm) PC analysis. Our theoretical and
    experimental studies show that the proposed algorithm calculates the exact
    optimal L1-PCs with high frequency and achieves higher value in the L1-PC
    optimization metric than any known alternative algorithm of comparable
    computational cost. The superiority of the calculated L1-PCs over standard
    L2-PCs (singular vectors) in characterizing potentially faulty
    data/measurements is demonstrated with experiments on data dimensionality
    reduction and disease diagnosis from genomic data.
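
    For intuition, the first L1-PC can be written as
    $\mathbf X \mathbf b / \|\mathbf X \mathbf b\|_2$ for the antipodal vector
    $\mathbf b \in \{\pm 1\}^N$ maximizing $\|\mathbf X \mathbf b\|_2$, and bit
    flipping greedily flips one entry of $\mathbf b$ at a time. The sketch below
    is a single-component ($K=1$) illustration of that idea, not the paper's full
    algorithm.

        import numpy as np

        def l1_pc_bitflip(X, max_sweeps=100, seed=0):
            # Greedy bit flipping for the first L1 principal component of X (D x N):
            # search for b in {-1, +1}^N that (approximately) maximizes ||X b||_2,
            # then return q = X b / ||X b||_2.
            D, N = X.shape
            rng = np.random.default_rng(seed)
            b = rng.choice([-1.0, 1.0], size=N)
            for _ in range(max_sweeps):
                improved = False
                for n in range(N):
                    b_try = b.copy()
                    b_try[n] = -b_try[n]
                    if np.linalg.norm(X @ b_try) > np.linalg.norm(X @ b):
                        b, improved = b_try, True
                if not improved:
                    break
            v = X @ b
            return v / np.linalg.norm(v)

        # Example: q = l1_pc_bitflip(np.random.randn(5, 40))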

    Sequence-based Sleep Stage Classification using Conditional Neural Fields

    Intan Nurma Yulita, Mohamad Ivan Fanany, Aniati Murni Arymurthy
    Comments: 14 pages. Submitted to Computational and Mathematical Methods in Medicine (Hindawi Publishin). Article ID 7163687
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Sleep signals from a polysomnographic database are sequences in nature.
    Commonly employed analysis and classification methods, however, ignored this
    fact and treated the sleep signals as non-sequence data. Treating the sleep
    signals as sequences, this paper compared two powerful unsupervised feature
    extractors and three sequence-based classifiers regarding accuracy and
    computational (training and testing) time after 10-folds cross-validation. The
    compared feature extractors are Deep Belief Networks (DBN) and Fuzzy C-Means
    (FCM) clustering. Whereas the compared sequence-based classifiers are Hidden
    Markov Models (HMM), Conditional Random Fields (CRF) and its variants, i.e.,
    Hidden-state CRF (HCRF) and Latent-Dynamic CRF (LDCRF); and Conditional Neural
    Fields (CNF) and its variant (LDCNF). In this study, we use two datasets. The
    first dataset is an open (public) polysomnographic dataset downloadable from
    the Internet, while the second dataset is our polysomnographic dataset (also
    available for download). For the first dataset, the combination of FCM and CNF
    gives the highest accuracy (96.75\%) with relatively short training time (0.33
    hours). For the second dataset, the combination of DBN and CRF gives the
    accuracy of 99.96\% but with 1.02 hours training time, whereas the combination
    of DBN and CNF gives slightly less accuracy (99.69\%) but also less computation
    time (0.89 hours).

    Adaptive Online Sequential ELM for Concept Drift Tackling

    Arif Budiman, Mohamad Ivan Fanany, Chan Basaruddin
    Comments: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 8091267, 17 pages Received 29 January 2016, Accepted 17 May 2016. Special Issue on “Advances in Neural Networks and Hybrid-Metaheuristics: Theory, Algorithms, and Novel Engineering Applications”. Academic Editor: Stefan Haufe
    Journal-ref: Computational Intelligence and Neuroscience Volume 2016 (2016),
    Article ID 8091267, 17 pages
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    A machine learning method needs to adapt to changes in the environment over
    time. Such changes are known as concept drift. In this paper, we propose a
    concept drift tackling method as an enhancement of the Online Sequential
    Extreme Learning Machine (OS-ELM) and Constructive Enhancement OS-ELM
    (CEOS-ELM) by adding adaptive capability for classification and regression
    problems. The scheme is named adaptive OS-ELM (AOS-ELM). It is a single
    classifier scheme that works well to handle real drift, virtual drift, and
    hybrid drift. The AOS-ELM also works well for sudden drift and recurrent
    context change types. The scheme is a simple unified method implemented in a
    few lines of code. We evaluated AOS-ELM on regression and classification
    problems using public concept drift datasets (SEA and STAGGER) and other
    public datasets such as MNIST, USPS, and IDS. Experiments show that our
    method gives a higher kappa value than a multiclassifier ELM ensemble. Even
    though AOS-ELM in practice does not need an increase in hidden nodes, we
    address some issues related to increasing the hidden nodes, such as error
    conditions and rank values. We propose taking the rank of the pseudoinverse
    matrix as an indicator parameter to detect the underfitting condition.
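
    For context, the OS-ELM sequential update that AOS-ELM extends is a standard
    recursive least-squares step on the hidden-layer output matrix; the sketch
    below shows that update together with the rank indicator mentioned at the end
    of the abstract. The adaptive, drift-handling part of AOS-ELM itself is not
    shown here.

        import numpy as np

        def os_elm_init(H0, T0):
            # Initial batch: H0 is the hidden-layer output matrix, T0 the targets.
            P = np.linalg.inv(H0.T @ H0)  # assumes H0 has full column rank
            beta = P @ H0.T @ T0
            return P, beta

        def os_elm_update(P, beta, H, T):
            # Standard OS-ELM recursive least-squares update for a new chunk (H, T).
            K = np.linalg.inv(np.eye(H.shape[0]) + H @ P @ H.T)
            P = P - P @ H.T @ K @ H @ P
            beta = beta + P @ H.T @ (T - H @ beta)
            return P, beta

        def underfitting_indicator(H):
            # Rank of the hidden-layer output: the same as the rank of its
            # pseudoinverse, which the abstract proposes as an underfitting signal.
            return np.linalg.matrix_rank(H)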

    A New Data Representation Based on Training Data Characteristics to Extract Drug Named-Entity in Medical Text

    Sadikin Mujiono, Mohamad Ivan Fanany, Chan Basaruddin
    Comments: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 3483528, 24 pages Received 27 May 2016; Revised 8 August 2016; Accepted 18 September 2016. Special Issue on “Smart Data: Where the Big Data Meets the Semantics”. Academic Editor: Trong H. Duong
    Journal-ref: Computational Intelligence and Neuroscience Volume 2016 (2016),
    Article ID 3483528, 24 pages
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    One essential task in information extraction from the medical corpus is drug
    name recognition. Compared with text sources from other domains, medical text
    is special and has unique characteristics. In addition, medical text mining
    poses more challenges, e.g., more unstructured text, the fast-growing
    addition of new terms, and a wide range of name variations for the same drug.
    The mining is even more challenging due to the lack of labeled dataset
    sources and external knowledge, as well as multiple token representations for
    a single drug name, which is more common in real application settings.
    Although many approaches have been proposed to tackle the task, some problems
    remain, with poor F-score performance (less than 0.75). This paper presents a
    new treatment in data representation techniques to overcome some of those
    challenges. We propose three data representation techniques based on the
    characteristics of word distribution and word similarities as a result of word
    embedding training. The first technique is evaluated with the standard NN
    model, i.e., MLP (Multi-Layer Perceptrons). The second technique involves two
    deep network classifiers, i.e., DBN (Deep Belief Networks), and SAE (Stacked
    Denoising Encoders). The third technique represents the sentence as a sequence
    that is evaluated with a recurrent NN model, i.e., LSTM (Long Short Term
    Memory). In extracting the drug name entities, the third technique gives the
    best F-score performance compared to the state of the art, with its average
    F-score being 0.8645.

    Sampled Fictitious Play is Hannan Consistent

    Zifan Li, Ambuj Tewari
    Subjects: Computer Science and Game Theory (cs.GT); Learning (cs.LG); Machine Learning (stat.ML)

    Fictitious play is a simple and widely studied adaptive heuristic for playing
    repeated games. It is well known that fictitious play fails to be Hannan
    consistent. Several variants of fictitious play including regret matching,
    generalized regret matching and smooth fictitious play, are known to be Hannan
    consistent. In this note, we consider sampled fictitious play: at each round,
    the player samples past times and plays the best response to previous moves of
    other players at the sampled time points. We show that sampled fictitious play,
    using Bernoulli sampling, is Hannan consistent. Unlike several existing Hannan
    consistency proofs that rely on concentration of measure results, ours instead
    uses anti-concentration results from Littlewood-Offord theory.
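
    A small sketch of one round of sampled fictitious play for a two-player matrix
    game: sample past rounds with independent Bernoulli draws and best-respond to
    the opponent's empirical play on the sampled rounds (the fallback used when no
    round is sampled is an assumption for illustration).

        import numpy as np

        def sampled_fp_move(payoff, opponent_history, p=0.5, rng=None):
            # payoff[a, b]: our payoff when we play a and the opponent plays b.
            rng = rng or np.random.default_rng()
            if len(opponent_history) == 0:
                return int(rng.integers(payoff.shape[0]))
            mask = rng.random(len(opponent_history)) < p  # Bernoulli sampling
            sampled = np.asarray(opponent_history)[mask]
            if sampled.size == 0:
                sampled = np.asarray(opponent_history)
            counts = np.bincount(sampled, minlength=payoff.shape[1])
            expected = payoff @ (counts / counts.sum())
            return int(np.argmax(expected))  # best response to the sampled play

        # Example: matching-pennies payoffs for the row player.
        A = np.array([[1.0, -1.0], [-1.0, 1.0]])
        print(sampled_fp_move(A, [0, 1, 1, 0, 1]))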

    Supervision via Competition: Robot Adversaries for Learning Tasks

    Lerrel Pinto, James Davidson, Abhinav Gupta
    Comments: Submission to ICRA 2017
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    There has been a recent paradigm shift in robotics to data-driven learning
    for planning and control. Due to the large number of experiences required for
    training, most of these approaches use a self-supervised paradigm: using
    sensors to measure success/failure. However, in most cases, these sensors
    provide weak supervision at best. In this work, we propose an adversarial
    learning framework that pits an adversary against the robot learning the task.
    In an effort to defeat the adversary, the original robot learns to perform the
    task with more robustness, leading to overall improved performance. We show
    that this adversarial framework forces the robot to learn a better grasping
    model in order to overcome the adversary. By grasping 82% of presented novel
    objects compared to 68% without an adversary, we demonstrate the utility of
    creating adversaries. We also demonstrate via experiments that placing robots
    in an adversarial setting might be a better learning strategy than having
    multiple collaborative robots.

    Automatic Sleep Stage Scoring with Single-Channel EEG Using Convolutional Neural Networks

    Orestis Tsinalis, Paul M. Matthews, Yike Guo, Stefanos Zafeiriou
    Comments: 12 pages
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We used convolutional neural networks (CNNs) for automatic sleep stage
    scoring based on single-channel electroencephalography (EEG) to learn
    task-specific filters for classification without using prior domain knowledge.
    We used an openly available dataset from 20 healthy young adults for evaluation
    and applied 20-fold cross-validation. We used class-balanced random sampling
    within the stochastic gradient descent (SGD) optimization of the CNN to avoid
    skewed performance in favor of the most represented sleep stages. We achieved
    high mean F1-score (81%, range 79-83%), mean accuracy across individual sleep
    stages (82%, range 80-84%) and overall accuracy (74%, range 71-76%) over all
    subjects. By analyzing and visualizing the filters that our CNN learns, we
    found that rules learned by the filters correspond to sleep scoring criteria in
    the American Academy of Sleep Medicine (AASM) manual that human experts follow.
    Our method’s performance is balanced across classes and our results are
    comparable to state-of-the-art methods with hand-engineered features. We show
    that, without using prior domain knowledge, a CNN can automatically learn to
    distinguish among different normal sleep stages.
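
    The class-balanced random sampling mentioned above can be sketched as a batch
    sampler that draws roughly the same number of examples per sleep stage for
    each SGD mini-batch; the sampler below is an illustrative stand-in, not the
    authors' exact implementation.

        import numpy as np

        def balanced_batch_indices(labels, batch_size, rng=None):
            # Draw (approximately) the same number of examples per class so that
            # rare sleep stages are not swamped by the most frequent ones.
            rng = rng or np.random.default_rng()
            classes = np.unique(labels)
            per_class = max(1, batch_size // len(classes))
            idx = [rng.choice(np.where(labels == c)[0], size=per_class, replace=True)
                   for c in classes]
            return rng.permutation(np.concatenate(idx))

        # Example: 5 sleep stages with a heavily skewed distribution.
        y = np.repeat([0, 1, 2, 3, 4], [2000, 300, 800, 150, 100])
        batch = balanced_batch_indices(y, batch_size=50)
        print(np.bincount(y[batch]))  # roughly uniform counts per class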


    Information Theory

    Geometric decoding of subspace codes with explicit Schubert calculus applied to spread codes

    Klara Stokes
    Subjects: Information Theory (cs.IT)

    This article is about a decoding algorithm for error-correcting subspace
    codes. A version of this algorithm was previously described by Rosenthal,
    Silberstein and Trautmann. The decoding algorithm requires the code to be
    defined as the intersection of the Plücker embedding of the Grassmannian and
    an algebraic variety. We call such codes \emph{geometric subspace codes}.
    Complexity is substantially improved compared to the algorithm by Rosenthal,
    Silberstein and Trautmann and connections to finite geometry are given. The
    decoding algorithm is applied to Desarguesian spread codes, which are known to
    be defined as the intersection of the Plücker embedding of the Grassmannian
    with a linear space.

    Fundamental properties of solutions to utility maximization problems in wireless networks

    Renato Luis Garrido Cavalcante, Slawomir Stanczak
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    We introduce a unified framework for the study of utility maximization
    problems in interference-coupled wireless networks. The framework can be
    applied to a large class of utilities, but in this study special attention is
    devoted to the rate. In more detail, we resort to results from concave
    Perron-Frobenius theory to show that, within the class of problems we consider
    here, each problem has a unique solution. As a consequence, given any network
    utility maximization problem belonging to this class, we can define two
    functions that relate the power budget $\bar{p}$ of a network to the network
    utility and to the energy efficiency achieved by the solution to the given
    problem. Among many interesting properties, we prove that these functions are
    continuous and monotonic. In addition, we derive bounds revealing that the
    solution is characterized by a low and a high power regime. In the low power
    regime, the energy efficiency can decrease slowly as the power budget
    increases, and the network utility grows linearly at best. In contrast, in the
    high power regime, the energy efficiency typically scales as
    $\mathcal{O}(1/\bar{p})$ as $\bar{p} \to \infty$, and the network utility scales
    as $\mathcal{O}(1)$. We apply the theoretical findings to a novel weighted rate
    maximization problem involving the joint optimization of the uplink power and
    the base station assignment. For this novel problem formulation, we also
    propose a simple and practical iterative solver.

    Quantum Game Theory for Beam Alignment in Millimeter Wave Device-to-Device Communications

    Qianqian Zhang, Walid Saad, Mehdi Bennis, Merouane Debbah
    Comments: 6 pages, 6 figures
    Subjects: Information Theory (cs.IT); Computer Science and Game Theory (cs.GT)

    In this paper, the problem of optimized beam alignment for wearable
    device-to-device (D2D) communications over millimeter wave (mmW) frequencies is
    studied. In particular, a noncooperative game is formulated between wearable
    communication pairs that engage in D2D communications. In this game, wearable
    devices acting as transmitters autonomously select the directions of their
    beams so as to maximize the data rate to their receivers. To solve the game, an
    algorithm based on best response dynamics is proposed that allows the
    transmitters to reach a Nash equilibrium in a distributed manner. To further
    improve the performance of mmW D2D communications, a novel quantum game model
    is formulated to enable the wearable devices to exploit new quantum directions
    during their beam alignment so as to further enhance their data rate.
    Simulation results show that the proposed game-theoretic approach improves
    performance, in terms of data rate, by about 75% compared to a uniform beam
    alignment. The results also show that the quantum game model can further yield
    up to a 20% improvement in data rates, relative to the classical game approach.

    Downlink Coordinated Joint Transmission for Mutual Information Accumulation

    Amogh Rajanna, Martin Haenggi
    Comments: 10 pages, 2 figures, IEEE Wireless Communications Letters (submitted 5 Oct 2016)
    Subjects: Information Theory (cs.IT)

    In this letter, we propose a new coordinated multipoint (CoMP) technique
    based on mutual information (MI) accumulation using rateless codes. Using a
    stochastic geometry model for the cellular downlink, we quantify the
    performance enhancements in coverage probability and rate due to MI
    accumulation. By simulation and analysis, we show that MI accumulation using
    rateless codes leads to remarkable improvements in coverage and rate for
    general users and specific cell edge users.

    Variable-Length Coding with Cost Allowing Non-Vanishing Error Probability

    Hideki Yagi, Ryo Nomura
    Comments: 7 pages; extended version of a paper accepted by ISITA2016
    Subjects: Information Theory (cs.IT)

    We derive a general formula of the minimum achievable rate for
    fixed-to-variable length coding with a regular cost function by allowing the
    error probability up to a constant $\varepsilon$. For a fixed-to-variable
    length code, we call the set of source sequences that can be decoded without
    error the dominant set of source sequences. For any two regular cost functions,
    it is revealed that the dominant set of source sequences for a code attaining
    the minimum achievable rate with a cost function is also the dominant set for a
    code attaining the minimum achievable rate with the other cost function. We
    also give a general formula of the second-order minimum achievable rate.

    Learning with Finite Memory for Machine Type Communication

    Taehyeun Park, Walid Saad
    Comments: 6 pages, 3 figures
    Subjects: Information Theory (cs.IT)

    Machine-type devices (MTDs) will lie at the heart of the Internet of Things
    (IoT) system. A key challenge in such a system is sharing network resources
    between small MTDs, which have limited memory and computational capabilities.
    In this paper, a novel learning \emph{with finite memory} framework is proposed
    to enable MTDs to effectively learn about each other's message state, so as to
    properly adapt their transmission parameters. In particular, an IoT system in
    which MTDs can transmit both delay-tolerant, periodic messages and critical
    alarm messages is studied. For this model, the characterization of the
    exponentially growing delay for critical alarm messages and the convergence of
    the proposed learning framework in an IoT are analyzed. Simulation results show
    that the delay of critical alarm messages is significantly reduced, by up to
    94%, with very minimal memory requirements. The results also show that the
    proposed learning with finite memory framework is very effective in mitigating
    the limiting factors that prevent proper learning procedures.

    Low-tubal-rank Tensor Completion using Alternating Minimization

    Xiao-Yang Liu, Shuchin Aeron, Vaneet Aggarwal, Xiaodong Wang
    Subjects: Learning (cs.LG); Information Theory (cs.IT)

    The low-tubal-rank tensor model has been recently proposed for real-world
    multidimensional data. In this paper, we study the low-tubal-rank tensor
    completion problem, i.e., to recover a third-order tensor by observing a subset
    of its elements selected uniformly at random. We propose a fast iterative
    algorithm, called {\em Tubal-Alt-Min}, that is inspired by a similar approach
    for low-rank matrix completion. The unknown low-tubal-rank tensor is
    represented as the product of two much smaller tensors with the low-tubal-rank
    property being automatically incorporated, and Tubal-Alt-Min alternates between
    estimating those two tensors using tensor least squares minimization. First, we
    note that tensor least squares minimization is different from its matrix
    counterpart and nontrivial as the circular convolution operator of the
    low-tubal-rank tensor model is intertwined with the sub-sampling operator.
    Second, the theoretical performance guarantee is challenging since
    Tubal-Alt-Min is iterative and nonconvex in nature. We prove that 1)
    Tubal-Alt-Min guarantees exponential convergence to the global optima, and 2)
    for an $n \times n \times k$ tensor with tubal-rank $r \ll n$, the required
    sampling complexity is $O(nr^2k \log^3 n)$ and the computational complexity is
    $O(n^2rk^2 \log^2 n)$. Third, on both synthetic data and real-world video data,
    evaluation results show that compared with tensor-nuclear norm minimization
    (TNN-ADMM), Tubal-Alt-Min improves the recovery error dramatically (by orders
    of magnitude). It is estimated that Tubal-Alt-Min converges at an exponential
    rate $10^{-0.4423\,\text{Iter}}$ where $\text{Iter}$ denotes the number of
    iterations, which is much faster than TNN-ADMM’s $10^{-0.0332\,\text{Iter}}$,
    and the running time can be accelerated by more than $5$ times for a
    $200 \times 200 \times 20$ tensor.



