
    arXiv Paper Daily: Thu, 10 Nov 2016

    Published by 我爱机器学习 (52ml.net) on 2016-11-10 00:00:00

    Neural and Evolutionary Computing

    Bio-Inspired Spiking Convolutional Neural Network using Layer-wise Sparse Coding and STDP Learning

    Amirhossein Tavanaei, Anthony S. Maida
    Subjects: Neural and Evolutionary Computing (cs.NE)

    Hierarchical feature discovery using non-spiking convolutional neural
    networks (CNNs) has attracted much recent interest in machine learning and
    computer vision. However, it is still not well understood how to create spiking
    deep networks with multi-layer, unsupervised learning. One advantage of spiking
    CNNs is their bio-realism. Another advantage is that they represent information
    using sparse spike-trains which enable power-efficient implementation. This
    paper explores a novel bio-inspired spiking CNN that is trained in a greedy,
    layer-wise fashion. The proposed network consists of a spiking
    convolutional-pooling layer followed by a feature discovery layer. Kernels for
    the convolutional layer are trained using local learning. The learning is
    implemented using a sparse, spiking auto-encoder representing primary visual
    features. The feature discovery layer is equipped with a probabilistic
    spike-timing-dependent plasticity (STDP) learning rule. This layer represents
    complex visual features using probabilistic leaky, integrate-and-fire (LIF)
    neurons. Our results show that the convolutional layer is stack-admissible,
    enabling it to support multi-layer learning. The visual features obtained
    from the proposed probabilistic LIF neurons in the feature discovery layer are
    used to train a classifier. The classification results indicate that
    independent and informative visual features are extracted in the hierarchy of
    convolutional and feature discovery layers. The proposed model is evaluated on
    the MNIST digit dataset using clean and noisy images. The recognition
    performance for clean images is above 98%. The performance loss for recognizing
    the noisy images is in the range 0.1% to 8.5% depending on noise types and
    densities. This level of performance loss indicates that the network is robust
    to additive noise.
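
    As a rough illustration of the ingredients named above, the following is a
    minimal sketch, not the authors' implementation, of a leaky integrate-and-fire
    neuron with probabilistic spiking and a simplified STDP-style weight update;
    the time constant, threshold, traces and learning rate are illustrative
    assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def lif_step(v, input_current, tau=20.0, v_thresh=1.0, dt=1.0):
        # One Euler step of a leaky integrate-and-fire membrane. Spiking is
        # probabilistic: the spike probability grows with the membrane potential,
        # a simplified stand-in for the paper's probabilistic LIF neuron.
        v = v + dt * (-v / tau + input_current)
        p_spike = 1.0 / (1.0 + np.exp(-(v - v_thresh) / 0.1))
        spike = rng.random() < p_spike
        if spike:
            v = 0.0  # reset after a spike
        return v, spike

    def stdp_update(w, pre_trace, post_spike, lr=0.01):
        # Simplified STDP-like rule: when the postsynaptic neuron spikes,
        # potentiate weights of recently active inputs (tracked by pre_trace)
        # with a mild normalizing depression, then keep weights in [0, 1].
        if post_spike:
            w = w + lr * (pre_trace - w * pre_trace.mean())
        return np.clip(w, 0.0, 1.0)

    # toy run: one neuron, 10 inputs, 100 time steps of Poisson-like input spikes
    w = rng.random(10) * 0.5
    pre_trace = np.zeros(10)
    v = 0.0
    for t in range(100):
        pre_spikes = (rng.random(10) < 0.1).astype(float)
        pre_trace = 0.9 * pre_trace + pre_spikes  # exponential presynaptic trace
        v, post_spike = lif_step(v, float(w @ pre_spikes))
        w = stdp_update(w, pre_trace, post_spike)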

    Lie-Access Neural Turing Machines

    Greg Yang, Alexander M. Rush
    Comments: Submitted to ICLR. Rewrite and improvement of this https URL
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Recent work has demonstrated the effectiveness of employing explicit external
    memory structures in conjunction with deep neural models for algorithmic
    learning (Graves et al. 2014; Weston et al. 2014). These models utilize
    differentiable versions of traditional discrete memory-access structures
    (random access, stacks, tapes) to provide the variable-length storage necessary
    for computational tasks. In this work, we propose an alternative model,
    Lie-access memory, that is explicitly designed for the neural setting. In this
    paradigm, memory is accessed using a continuous head in a key-space manifold.
    The head is moved via Lie group actions, such as shifts or rotations, generated
    by a controller, and soft memory access is performed by considering the
    distance to keys associated with each memory. We argue that Lie groups provide
    a natural generalization of discrete memory structures, such as Turing
    machines, as they provide inverse and identity operators while maintaining
    differentiability. To experiment with this approach, we implement several
    simplified Lie-access neural Turing machines (LANTMs) with different Lie
    groups.
    We find that this approach is able to perform well on a range of algorithmic
    tasks.
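
    As a rough sketch of the access mechanism (under assumptions; this is not the
    paper's exact architecture), the head lives in a continuous key space, the
    controller emits a group action such as a planar rotation, and the read is a
    softmax over negative distances to the stored keys:

    import numpy as np

    def rotate(head, theta):
        # Apply a Lie group action (here a 2-D rotation) to the head position.
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s], [s, c]]) @ head

    def lie_access_read(head, keys, values, sharpness=5.0):
        # Soft read: weight each memory cell by the softmax-normalised
        # negative distance between the head and the cell's key.
        dists = np.linalg.norm(keys - head, axis=1)
        logits = -sharpness * dists
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        return weights @ values

    # toy memory of 4 cells with 2-D keys and 3-D values
    keys = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
    values = np.random.randn(4, 3)
    head = np.array([1.0, 0.0])
    head = rotate(head, np.pi / 2)  # controller-generated action: quarter turn
    read_vector = lie_access_read(head, keys, values)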

    Incremental Sequence Learning

    Edwin D. de Jong
    Comments: 18 pages
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Deep learning research over the past years has shown that by increasing the
    scope or difficulty of the learning problem over time, increasingly complex
    learning problems can be addressed. We study incremental learning in the
    context of sequence learning, using generative RNNs in the form of multi-layer
    recurrent Mixture Density Networks. We introduce Incremental Sequence Learning,
    a simple incremental approach to sequence learning. Incremental Sequence
    Learning starts out by using only the first few steps of each sequence as
    training data. Each time a performance criterion has been reached, the length
    of the parts of the sequences used for training is increased. To evaluate
    Incremental Sequence Learning and comparison methods, we introduce and make
    available a novel sequence learning task and data set: predicting and
    classifying MNIST pen stroke sequences, where the familiar handwritten digit
    images have been transformed to pen stroke sequences representing the skeletons
    of the digits. We find that Incremental Sequence Learning greatly speeds up
    sequence learning and reaches the best test performance level of regular
    sequence learning 20 times faster, reduces the test error by 74%, and in
    general performs more robustly; it displays lower variance and achieves
    sustained progress after all three comparison methods have stopped improving. A
    trained sequence prediction model is also used in transfer learning to the task
    of sequence classification, where it is found that transfer learning realizes
    improved classification performance compared to methods that learn to classify
    from scratch.
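
    The training schedule itself is simple; a hedged sketch of the curriculum
    loop (the model, data handling and performance criterion are placeholders,
    not the paper's code) could look like this:

    def incremental_sequence_training(model, sequences, eval_loss,
                                      start_len=2, step=2, target_loss=1.0,
                                      max_epochs_per_stage=50):
        # Train on progressively longer prefixes of each sequence.
        # model.fit_epoch and eval_loss are assumed helper callables.
        max_len = max(len(s) for s in sequences)
        cur_len = start_len
        while cur_len <= max_len:
            prefixes = [s[:cur_len] for s in sequences]
            for _ in range(max_epochs_per_stage):
                model.fit_epoch(prefixes)  # one pass over the truncated data
                if eval_loss(model, prefixes) <= target_loss:
                    break  # criterion reached: grow the prefixes
            cur_len += step
        return model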

    RL(^2): Fast Reinforcement Learning via Slow Reinforcement Learning

    Yan Duan, John Schulman, Xi Chen, Peter Bartlett, Ilya Sutskever, Pieter Abbeel
    Comments: 14 pages. Under review as a conference paper at ICLR 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Deep reinforcement learning (deep RL) has been successful in learning
    sophisticated behaviors automatically; however, the learning process requires a
    huge number of trials. In contrast, animals can learn new tasks in just a few
    trials, benefiting from their prior knowledge about the world. This paper seeks
    to bridge this gap. Rather than designing a “fast” reinforcement learning
    algorithm, we propose to represent it as a recurrent neural network (RNN) and
    learn it from data. In our proposed method, RL(^2), the algorithm is encoded in
    the weights of the RNN, which are learned slowly through a general-purpose
    (“slow”) RL algorithm. The RNN receives all information a typical RL algorithm
    would receive, including observations, actions, rewards, and termination flags;
    and it retains its state across episodes in a given Markov Decision Process
    (MDP). The activations of the RNN store the state of the “fast” RL algorithm on
    the current (previously unseen) MDP. We evaluate RL(^2) experimentally on both
    small-scale and large-scale problems. On the small-scale side, we train it to
    solve randomly generated multi-armed bandit problems and finite MDPs. After
    RL(^2) is trained, its performance on new MDPs is close to human-designed
    algorithms with optimality guarantees. On the large-scale side, we test RL(^2)
    on a vision-based navigation task and show that it scales up to
    high-dimensional problems.
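
    The key implementation detail is what the RNN sees at each step and when its
    state is reset; a minimal sketch (the environment sampler and policy network
    are placeholders, not the authors' code) follows:

    import numpy as np

    def rl2_trial(policy_rnn, env_sampler, episodes_per_trial=2):
        # One RL^2 trial: a fresh MDP is sampled, and the RNN state is kept
        # across the episodes within that MDP so it can adapt on the fly.
        env = env_sampler()                  # previously unseen MDP
        hidden = policy_rnn.initial_state()  # reset only at the start of the trial
        prev_action, prev_reward, prev_done = 0, 0.0, 0.0
        for _ in range(episodes_per_trial):
            obs, done = env.reset(), False
            while not done:
                # the input bundles observation, last action, last reward, done flag
                rnn_input = np.concatenate([np.atleast_1d(obs),
                                            [prev_action, prev_reward, prev_done]])
                action, hidden = policy_rnn.step(rnn_input, hidden)
                obs, reward, done, _ = env.step(action)
                prev_action, prev_reward, prev_done = action, reward, float(done)
        return hidden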


    Computer Vision and Pattern Recognition

    Optimal Multiple Surface Segmentation with Convex Priors in Irregularly Sampled Space

    Abhay Shah, Junjie Bai, Michael D. Abramoff, Xiaodong Wu
    Comments: 20 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Optimal surface segmentation is widely used in numerous medical image
    segmentation applications. However, nodes in the graph based optimal surface
    segmentation method typically encode uniformly distributed orthogonal voxels of
    the volume. Thus the segmentation cannot attain an accuracy greater than a
    single unit voxel, i.e. the distance between two adjoining nodes in graph
    space. Segmentation accuracy higher than a unit voxel is achievable by
    exploiting partial volume information in the voxels, which results in
    non-equidistant spacing between adjoining graph nodes. This paper reports a
    generalized graph based optimal multiple surface segmentation method with
    convex priors which segments the target surfaces in irregularly sampled space.
    The proposed method allows non-equidistant spacing between the adjoining graph
    nodes to achieve subvoxel accurate segmentation by utilizing the partial volume
    information in the voxels. The partial volume information in the voxels is
    exploited by computing a displacement field from the original volume data to
    identify the subvoxel accurate centers within each voxel resulting in
    non-equidistant spacing between the adjoining graph nodes. The smoothness of
    each surface modelled as a convex constraint governs the connectivity and
    regularity of the surface. We employ an edge-based graph representation to
    incorporate the necessary constraints and the globally optimal solution is
    obtained by computing a minimum s-t cut. The proposed method was validated on
    25 optical coherence tomography image volumes of the retina and 10
    intravascular multi-frame ultrasound image datasets for subvoxel and super
    resolution segmentation accuracy. In all cases, the approach yielded highly
    accurate results. Our approach can be readily extended to higher-dimensional
    segmentations.
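
    The global optimum rests on a standard minimum s-t cut over the constructed
    graph; a toy illustration of that primitive (a hand-made capacitated graph
    and networkx, not the authors' graph construction) is:

    import networkx as nx

    # A tiny capacitated graph standing in for the much larger segmentation graph.
    G = nx.DiGraph()
    G.add_edge("s", "a", capacity=3.0)
    G.add_edge("s", "b", capacity=2.0)
    G.add_edge("a", "b", capacity=1.0)
    G.add_edge("a", "t", capacity=2.0)
    G.add_edge("b", "t", capacity=3.0)

    cut_value, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
    # In the segmentation formulation, nodes on the source side of the cut would
    # correspond to graph nodes lying below the sought surface.
    print(cut_value, sorted(source_side), sorted(sink_side))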

    Node-Adapt, Path-Adapt and Tree-Adapt: Model-Transfer Domain Adaptation for Random Forest

    Azadeh S. Mozafari, David Vazquez, Mansour Jamzad, Antonio M. Lopez
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Random Forest (RF) is a successful paradigm for learning classifiers due to
    its ability to learn from large feature spaces and seamlessly integrate
    multi-class classification, as well as the achieved accuracy and processing
    efficiency. However, like many other classifiers, RF requires domain
    adaptation (DA) when there is a mismatch between the training (source) and
    testing (target) domains, which degrades classification performance.
    Consequently, different RF-DA methods have been proposed, which require not
    only target-domain samples but also revisiting the source-domain ones. As a
    novel contribution, we propose three inherently different methods (Node-Adapt,
    Path-Adapt and Tree-Adapt) that only require the learned source-domain RF and
    relatively few target-domain samples for DA, i.e. source-domain samples do not
    need to be
    available. To assess the performance of our proposals we focus on image-based
    object detection, using the pedestrian detection problem as challenging
    proof-of-concept. Moreover, we use the RF with expert nodes because it is a
    competitive patch-based pedestrian model. We test our Node-, Path- and
    Tree-Adapt methods in standard benchmarks, showing that DA is largely achieved.

    Audio Visual Speech Recognition using Deep Recurrent Neural Networks

    Abhinav Thanda, Shankar M Venkatesan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    In this work, we propose a training algorithm for an audio-visual automatic
    speech recognition (AV-ASR) system using deep recurrent neural network
    (RNN). First, we train a deep RNN acoustic model with a Connectionist Temporal
    Classification (CTC) objective function. The frame labels obtained from the
    acoustic model are then used to perform a non-linear dimensionality reduction
    of the visual features using a deep bottleneck network. Audio and visual
    features are fused and used to train a fusion RNN. The use of bottleneck
    features for the visual modality helps the model converge properly during
    training. Our system is evaluated on the GRID corpus. Our results show that
    the presence of the visual modality gives a significant improvement in
    character error rate (CER) at various levels of noise even when the model is
    trained without
    noisy data. We also provide a comparison of two fusion methods: feature fusion
    and decision fusion.
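
    The distinction between the two fusion strategies can be summarised in a few
    lines (a schematic sketch with placeholder models, not the paper's networks):

    import numpy as np

    def feature_fusion(audio_feat, visual_feat, fusion_model):
        # Feature fusion: concatenate the modalities, then let one model decode.
        fused = np.concatenate([audio_feat, visual_feat], axis=-1)
        return fusion_model.predict(fused)

    def decision_fusion(audio_posteriors, visual_posteriors, audio_weight=0.7):
        # Decision fusion: combine the per-frame class posteriors of two models.
        combined = audio_weight * audio_posteriors + (1 - audio_weight) * visual_posteriors
        return combined.argmax(axis=-1)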

    The Little Engine that Could: Regularization by Denoising (RED)

    Yaniv Romano, Michael Elad, Peyman Milanfar
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (cs.NA)

    Removal of noise from an image is an extensively studied problem in image
    processing. Indeed, the recent advent of sophisticated and highly effective
    denoising algorithms has led some to believe that existing methods are touching
    the ceiling in terms of noise removal performance. Can we leverage this
    impressive achievement to treat other tasks in image processing? Recent work
    has answered this question positively, in the form of the Plug-and-Play Prior
    ((P^3)) method, showing that any inverse problem can be handled by sequentially
    applying image denoising steps. This relies heavily on the ADMM optimization
    technique in order to obtain this chained denoising interpretation.

    Is this the only way in which tasks in image processing can exploit the image
    denoising engine? In this paper we provide an alternative, more powerful and
    more flexible framework for achieving the same goal. As opposed to the (P^3)
    method, we offer Regularization by Denoising (RED): using the denoising engine
    in defining the regularization of the inverse problem. We propose an explicit
    image-adaptive Laplacian-based regularization functional, making the overall
    objective functional clearer and better defined. With a complete flexibility to
    choose the iterative optimization procedure for minimizing the above
    functional, RED can incorporate any image denoising algorithm, treat general
    inverse problems very effectively, and is guaranteed to converge to the
    globally optimal result. We test this approach and demonstrate
    state-of-the-art results in the image deblurring and super-resolution problems.
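
    To give a feel for how a denoising engine can act as a regularizer, here is a
    heavily simplified gradient-descent sketch in the spirit of RED, assuming the
    regularizer gradient takes the form x - f(x) for a generic denoiser f and a
    symmetric blur operator; it is an illustration, not the authors' algorithm:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def denoiser(x, sigma=1.0):
        # Stand-in denoising engine f(x); any off-the-shelf denoiser could be used.
        return gaussian_filter(x, sigma)

    def red_deblur(y, blur, lam=0.2, step=0.1, iters=200):
        # Minimize 0.5*||blur(x) - y||^2 + lam * 0.5 * x^T (x - f(x))
        # by gradient descent, using the gradient blur(blur(x) - y) + lam*(x - f(x)).
        # The blur is assumed symmetric, so it serves as its own adjoint.
        x = y.copy()
        for _ in range(iters):
            data_grad = blur(blur(x) - y)
            reg_grad = x - denoiser(x)
            x = x - step * (data_grad + lam * reg_grad)
        return x

    # toy usage with a Gaussian blur as the (symmetric) forward operator
    blur = lambda img: gaussian_filter(img, 2.0)
    clean = np.zeros((64, 64)); clean[24:40, 24:40] = 1.0
    y = blur(clean) + 0.01 * np.random.randn(64, 64)
    restored = red_deblur(y, blur)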

    Semi-Supervised Recognition of the Diploglossus Millepunctatus Lizard Species using Artificial Vision Algorithms

    Jhony-Heriberto Giraldo-Zuluaga, Augusto Salazar, Juan M. Daza
    Comments: arXiv admin note: text overlap with arXiv:1603.00841
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Animal biometrics is an important requirement for monitoring and conservation
    tasks. Classical animal biometrics methods risk the animals’ integrity, are
    expensive for numerous animals, and depend on expert criteria. Non-invasive
    biometrics techniques offer alternatives to manage the aforementioned
    problems. In this paper we propose an automatic segmentation and
    identification algorithm based on artificial vision algorithms to recognize
    Diploglossus millepunctatus, an endangered lizard species. The algorithm
    consists of two stages: automatic segmentation to remove the subjective
    evaluation, and an identification stage to reduce the analysis time. An
    average of 82.87% correct segmentation is achieved. The identification stage
    uses Euclidean-distance point-matching algorithms such as Iterative Closest
    Point and Procrustes Analysis. A performance of 92.99% is reached in the top
    1 and 96.82% in the top 5. The
    developed software, and the database used in this paper are publicly available
    for download from the web page of the project.
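
    For the point-matching step, a minimal sketch of the Procrustes comparison of
    two spot patterns (using SciPy's generic routine on toy coordinates; the
    actual extraction pipeline and decision thresholds are not shown) is:

    import numpy as np
    from scipy.spatial import procrustes

    # Two sets of 2-D spot coordinates extracted from two photographs (toy data).
    pattern_a = np.array([[0.0, 0.0], [1.0, 0.2], [2.1, 0.1], [3.0, 1.0]])
    pattern_b = pattern_a * 1.3 + 0.5 + 0.01 * np.random.randn(4, 2)

    # procrustes aligns the two point sets (translation, scaling, rotation) and
    # returns a disparity; small values suggest the same individual.
    _, _, disparity = procrustes(pattern_a, pattern_b)
    print("disparity:", disparity)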

    Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data

    Xinghua Lou, Ken Kansky, Wolfgang Lehrach, CC Laan, Bhaskara Marthi, D. Scott Phoenix, Dileep George
    Journal-ref: Advances in Neural Information Processing Systems 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We demonstrate that a generative model for object shapes can achieve
    state-of-the-art results on challenging scene text recognition tasks with
    orders of magnitude fewer training images than required for competing
    discriminative methods. In addition to transcribing text from challenging
    images, our method
    performs fine-grained instance segmentation of characters. We show that our
    model is more robust to both affine transformations and non-affine deformations
    compared to previous approaches.

    Deep Convolutional Neural Network for 6-DOF Image Localization

    Daoyuan Jia, Yongchi Su, Chunping Li
    Comments: will update soon
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We present an accurate and robust method for six-degree-of-freedom image
    localization. There are two key points in our method: 1) automatic synthesis
    and labeling of an immense number of photos from a point cloud model and 2)
    pose estimation with deep convolutional neural network regression. Our model
    directly regresses 6-DOF camera poses from images, accurately describing where
    and how they were captured. We achieved an accuracy within 1 meter and 1
    degree on our outdoor dataset, which covers about 2 acres of our school campus.

    A backward pass through a CNN using a generative model of its activations

    Huayan Wang, Anna Chen, Yi Liu, Dileep George, D. Scott Phoenix
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Neural networks have been shown to be a practical way of building a very
    complex mapping between a pre-specified input space and output space. For
    example, a convolutional neural network (CNN) mapping an image into one of a
    thousand object labels is approaching human performance in this particular
    task. However, the mapping (neural network) does not automatically lend itself
    to other forms
    of queries, for example, to detect/reconstruct object instances, to enforce
    top-down signal on ambiguous inputs, or to recover object instances from
    occlusion. One way to address these queries is a backward pass through the
    network that fuses top-down and bottom-up information. In this paper, we show a
    way of building such a backward pass by defining a generative model of the
    neural network’s activations. Approximate inference of the model would
    naturally take the form of a backward pass through the CNN layers, and it
    addresses the aforementioned queries in a unified framework.

    Robust Cardiac Motion Estimation using Ultrafast Ultrasound Data: A Low-Rank-Topology-Preserving Approach

    Angelica I. Aviles, Thomas Widlak, Alicia Casals, Maartje M. Nillesen, Habib Ammari
    Comments: 15 pages, 10 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Cardiac motion estimation is an important diagnostic tool to detect heart
    diseases and it has been explored with modalities such as MRI and conventional
    ultrasound (US) sequences. US cardiac motion estimation still presents
    challenges because of the complex motion patterns and the presence of noise. In
    this work, we propose a novel approach to estimate the cardiac motion using
    ultrafast ultrasound data. Our solution is based on a variational
    formulation characterized by the L2-regularized class. The displacement is
    represented by a lattice of B-splines and we ensure robustness by applying a
    maximum likelihood type estimator. While this is an important part of our
    solution, the main highlight of this paper is to combine a low-rank data
    representation with topology preservation. Low-rank data representation
    (achieved by finding the k-dominant singular values of a Casorati Matrix
    arranged from the data sequence) speeds up the global solution and achieves
    noise reduction. On the other hand, topology preservation (achieved by
    monitoring the Jacobian determinant) makes it possible to radically rule out
    distortions while carefully controlling the size of allowed expansions and
    contractions.
    Our variational approach is carried out on a realistic dataset as well as on a
    simulated one. We demonstrate how our proposed variational solution deals with
    complex deformations through careful numerical experiments. While maintaining
    the accuracy of the solution, the low-rank preprocessing is shown to speed up
    the convergence of the variational problem. Beyond cardiac motion estimation,
    our approach is promising for the analysis of other organs that experience
    motion.
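
    Read this way, the low-rank preprocessing amounts to a truncated SVD of the
    Casorati matrix, with each frame flattened into one column; a short sketch
    under that reading (the rank k and toy data are arbitrary assumptions) is:

    import numpy as np

    def lowrank_denoise(frames, k=5):
        # frames: array of shape (n_frames, H, W). Build the Casorati matrix
        # (one flattened frame per column), keep the k dominant singular values,
        # and reshape back to a frame sequence.
        n, h, w = frames.shape
        casorati = frames.reshape(n, h * w).T  # shape (H*W, n_frames)
        U, s, Vt = np.linalg.svd(casorati, full_matrices=False)
        s[k:] = 0.0  # discard the non-dominant components
        approx = (U * s) @ Vt
        return approx.T.reshape(n, h, w)

    # toy usage: 20 noisy frames of a slowly varying pattern
    t = np.linspace(0, 1, 20)[:, None, None]
    frames = np.sin(2 * np.pi * t) * np.ones((20, 32, 32)) \
        + 0.1 * np.random.randn(20, 32, 32)
    smoothed = lowrank_denoise(frames, k=2)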

    Gaussian process regression can turn non-uniform and undersampled diffusion MRI data into diffusion spectrum imaging

    Jens Sjölund, Anders Eklund, Evren Özarslan, Hans Knutsson
    Comments: 5 pages
    Subjects: Applications (stat.AP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    We propose to use Gaussian process regression to accurately estimate the
    diffusion MRI signal at arbitrary locations in q-space. By estimating the
    signal on a grid, we can do synthetic diffusion spectrum imaging:
    reconstructing the ensemble averaged propagator (EAP) by an inverse Fourier
    transform. We also propose an alternative reconstruction method guaranteeing a
    nonnegative EAP that integrates to unity. The reconstruction is validated on
    data simulated from two Gaussians at various crossing angles. Moreover, we
    demonstrate on non-uniformly sampled in vivo data that the method is far
    superior to linear interpolation, and allows a drastic undersampling of the
    data with only a minor loss of accuracy. We envision the method as a potential
    replacement for standard diffusion spectrum imaging, in particular when
    acquisition time is limited.
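
    A bare-bones sketch of the core idea, fitting a GP to scattered q-space
    samples and predicting on a grid, is given below; the plain RBF kernel and
    synthetic signal are placeholder assumptions, whereas the paper designs a
    kernel suited to diffusion MRI:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    # toy stand-in: signal sampled at 200 scattered q-space locations (3-D)
    q_train = np.random.randn(200, 3)
    signal = np.exp(-np.sum(q_train**2, axis=1))  # synthetic Gaussian decay

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-3),
                                  normalize_y=True)
    gp.fit(q_train, signal)

    # predict on a regular grid; an EAP could then be obtained by an inverse FFT
    axis = np.linspace(-2, 2, 11)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
    signal_on_grid = gp.predict(grid).reshape(11, 11, 11)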

    Inferring low-dimensional microstructure representations using convolutional neural networks

    Nicholas Lubbers, Turab Lookman, Kipton Barros
    Comments: 25 Pages, 12 Figures
    Subjects: Computational Physics (physics.comp-ph); Materials Science (cond-mat.mtrl-sci); Computer Vision and Pattern Recognition (cs.CV)

    We apply recent advances in machine learning and computer vision to a central
    problem in materials informatics: The statistical representation of
    microstructural images. We use activations in a pre-trained convolutional
    neural network to provide a high-dimensional characterization of a set of
    synthetic microstructural images. Next, we use manifold learning to obtain a
    low-dimensional embedding of this statistical characterization. We show that
    the low-dimensional embedding extracts the parameters used to generate the
    images. According to a variety of metrics, the convolutional neural network
    method yields dramatically better embeddings than the analogous method derived
    from two-point correlations alone.


    Artificial Intelligence

    Encoding monotonic multi-set preferences using CI-nets: preliminary report

    Martin Diller, Anthony Hunter
    Subjects: Artificial Intelligence (cs.AI)

    CP-nets and their variants constitute one of the main AI approaches for
    specifying and reasoning about preferences. CI-nets, in particular, are a
    CP-inspired formalism for representing ordinal preferences over sets of goods,
    which are typically required to be monotonic.

    Considering also that goods often come in multi-sets rather than sets, a
    natural question is whether CI-nets can be used more or less directly to encode
    preferences over multi-sets. We here provide some initial ideas on how to
    achieve this, in the sense that at least a restricted form of reasoning on our
    framework, which we call “confined reasoning”, can be efficiently reduced to
    reasoning on CI-nets. Our framework nevertheless allows for encoding
    preferences over multi-sets with unbounded multiplicities. We also show the
    extent to which it can be used to represent preferences where multiplicities of
    the goods are not stated explicitly (“purely qualitative preferences”) as well
    as a potential use of our generalization of CI-nets as a component of a recent
    system for evidence aggregation.

    RL(^2): Fast Reinforcement Learning via Slow Reinforcement Learning

    Yan Duan, John Schulman, Xi Chen, Peter Bartlett, Ilya Sutskever, Pieter Abbeel
    Comments: 14 pages. Under review as a conference paper at ICLR 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Deep reinforcement learning (deep RL) has been successful in learning
    sophisticated behaviors automatically; however, the learning process requires a
    huge number of trials. In contrast, animals can learn new tasks in just a few
    trials, benefiting from their prior knowledge about the world. This paper seeks
    to bridge this gap. Rather than designing a “fast” reinforcement learning
    algorithm, we propose to represent it as a recurrent neural network (RNN) and
    learn it from data. In our proposed method, RL(^2), the algorithm is encoded in
    the weights of the RNN, which are learned slowly through a general-purpose
    (“slow”) RL algorithm. The RNN receives all information a typical RL algorithm
    would receive, including observations, actions, rewards, and termination flags;
    and it retains its state across episodes in a given Markov Decision Process
    (MDP). The activations of the RNN store the state of the “fast” RL algorithm on
    the current (previously unseen) MDP. We evaluate RL(^2) experimentally on both
    small-scale and large-scale problems. On the small-scale side, we train it to
    solve randomly generated multi-armed bandit problems and finite MDPs. After
    RL(^2) is trained, its performance on new MDPs is close to human-designed
    algorithms with optimality guarantees. On the large-scale side, we test RL(^2)
    on a vision-based navigation task and show that it scales up to
    high-dimensional problems.

    Recursive Decomposition for Nonconvex Optimization

    Abram L. Friesen, Pedro Domingos
    Comments: 11 pages, 7 figures, pdflatex
    Journal-ref: Proceedings of the 24th International Joint Conference on
    Artificial Intelligence (2015), pp. 253-259
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Continuous optimization is an important problem in many areas of AI,
    including vision, robotics, probabilistic inference, and machine learning.
    Unfortunately, most real-world optimization problems are nonconvex, causing
    standard convex techniques to find only local optima, even with extensions like
    random restarts and simulated annealing. We observe that, in many cases, the
    local modes of the objective function have combinatorial structure, and thus
    ideas from combinatorial optimization can be brought to bear. Based on this, we
    propose a problem-decomposition approach to nonconvex optimization. Similarly
    to DPLL-style SAT solvers and recursive conditioning in probabilistic
    inference, our algorithm, RDIS, recursively sets variables so as to simplify
    and decompose the objective function into approximately independent
    sub-functions, until the remaining functions are simple enough to be optimized
    by standard techniques like gradient descent. The variables to set are chosen
    by graph partitioning, ensuring decomposition whenever possible. We show
    analytically that RDIS can solve a broad class of nonconvex optimization
    problems exponentially faster than gradient descent with random restarts.
    Experimentally, RDIS outperforms standard techniques on problems like structure
    from motion and protein folding.

    Tuning Recurrent Neural Networks with Reinforcement Learning

    Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Sequence models can be trained using supervised learning and a next-step
    prediction objective. This approach, however, suffers from known failure modes.
    For example, it is notoriously difficult to ensure multi-step generated
    sequences have coherent global structure. Motivated by the fact that
    reinforcement learning (RL) can be used to impose arbitrary properties on
    generated data by choosing appropriate reward functions, in this paper we
    propose a novel approach for sequence training which combines Maximum
    Likelihood (ML) and RL training. We refine a sequence predictor by optimizing
    for some imposed reward functions, while maintaining good predictive properties
    learned from data. We propose efficient ways to solve this by augmenting deep
    Q-learning with a cross-entropy reward and deriving novel off-policy methods
    for RNNs from stochastic optimal control (SOC). We explore the usefulness of
    our approach in the context of music generation. An LSTM is trained on a large
    corpus of songs to predict the next note in a musical sequence. This Note-RNN
    is then refined using RL, where the reward function is a combination of rewards
    based on rules of music theory, as well as the output of another trained
    Note-RNN. We show that by combining ML and RL, this RL Tuner method can not
    only produce more pleasing melodies, but also significantly reduce unwanted
    behaviors and failure modes of the RNN.
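
    The reward shaping can be sketched in a single expression (a paraphrase of
    the description above with an assumed weighting constant; the helper
    functions stand in for the trained Note-RNN and the music-theory rules):

    def rl_tuner_reward(note, context, note_rnn_log_prob, theory_reward, c=0.5):
        # Combined reward for the refined Note-RNN: stay close to what the
        # trained Note-RNN would generate, plus hand-written music-theory rewards.
        return c * note_rnn_log_prob(note, context) + theory_reward(note, context)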


    Information Retrieval

    An Automated System for Essay Scoring of Online Exams in Arabic based on Stemming Techniques and Levenshtein Edit Operations

    Emad Fawzi Al-Shalabi
    Comments: 5 pages, 2 figures
    Journal-ref: IJCSI International Journal of Computer Science Issues, Volume 13,
    Issue 5, September 2016
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    In this article, an automated system is proposed for essay scoring of online
    exams in the Arabic language, based on stemming techniques and Levenshtein
    edit operations. An online exam has been developed using the proposed
    mechanisms, exploiting the capabilities of light and heavy stemming. The
    implemented online grading system has been shown to be an efficient tool for
    automated scoring of essay questions.
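
    For reference, a compact sketch of the Levenshtein edit-distance computation
    that such scoring relies on (a textbook dynamic program, not the paper's
    code) is:

    def levenshtein(a: str, b: str) -> int:
        # Minimum number of insertions, deletions and substitutions turning a into b.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, start=1):
            cur = [i]
            for j, cb in enumerate(b, start=1):
                cur.append(min(prev[j] + 1,               # deletion
                               cur[j - 1] + 1,            # insertion
                               prev[j - 1] + (ca != cb))) # substitution
            prev = cur
        return prev[-1]

    # e.g. comparing a (stemmed) student answer against a model answer
    print(levenshtein("kitten", "sitting"))  # 3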


    Computation and Language

    When silver glitters more than gold: Bootstrapping an Italian part-of-speech tagger for Twitter

    Barbara Plank, Malvina Nissim
    Comments: Proceedings of the 5th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2016)
    Subjects: Computation and Language (cs.CL)

    We bootstrap a state-of-the-art part-of-speech tagger to tag Italian Twitter
    data, in the context of the Evalita 2016 PoSTWITA shared task. We show that
    training the tagger on native Twitter data enriched with small amounts of
    specifically selected gold data and additional silver-labelled data scraped
    from Facebook, yields better results than using large amounts of manually
    annotated data from a mix of genres.

    Distant supervision for emotion detection using Facebook reactions

    Chris Pool, Malvina Nissim
    Comments: Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES 2016), held in conjunction with COLING 2016, Osaka, Japan
    Subjects: Computation and Language (cs.CL)

    We exploit the Facebook reaction feature in a distant supervised fashion to
    train a support vector machine classifier for emotion detection, using several
    feature combinations and combining different Facebook pages. We test our models
    on existing benchmarks for emotion detection and show that by employing only
    information that is derived completely automatically, thus without relying on
    any handcrafted lexicon as is usually done, we can achieve competitive
    results. The results also show that there is large room for improvement,
    especially by tailoring the collection of Facebook pages to the target
    domain.
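
    In spirit, the pipeline is a standard supervised classifier trained on
    distantly labelled posts; a hedged scikit-learn sketch (the toy posts and
    reaction-derived labels below are placeholders, not the paper's data or
    feature combinations) is:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # toy posts with labels derived from the dominant Facebook reaction
    posts = ["what wonderful news today", "this is outrageous and unfair",
             "so sad to hear this", "haha that made my day"]
    labels = ["joy", "anger", "sadness", "joy"]

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    model.fit(posts, labels)
    print(model.predict(["terrible and unfair decision"]))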

    A Comparison of Word Embeddings for English and Cross-Lingual Chinese Word Sense Disambiguation

    Hong Jin Kang, Tao Chen, Muthu Kumar Chandrasekaran, Min-Yen Kan
    Comments: 10 pages
    Subjects: Computation and Language (cs.CL)

    Word embeddings are now ubiquitous forms of word representation in natural
    language processing. There have been applications of word embeddings for
    monolingual word sense disambiguation (WSD) in English, but few comparisons
    have been done. This paper attempts to bridge that gap by examining popular
    embeddings for the task of monolingual English WSD. Our simplified method
    achieves performance comparable to the state of the art without expensive
    retraining.
    Cross-Lingual WSD – where the word senses of a word in a source language e come
    from a separate target translation language f – can also assist in language
    learning; for example, when providing translations of target vocabulary for
    learners. Thus we have also applied word embeddings to the novel task of
    cross-lingual WSD for Chinese and provide a public dataset for further
    benchmarking. We have also experimented with using word embeddings for LSTM
    networks and found surprisingly that a basic LSTM network does not work well.
    We discuss the ramifications of this outcome.

    Increasing the throughput of machine translation systems using clouds

    Jernej Vičič, Andrej Brodnik
    Comments: 20 pages, 7 figures
    Subjects: Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC)

    The manuscript presents an experiment in implementing a Machine
    Translation system in the MapReduce model. The empirical evaluation was done
    using fully implemented translation systems embedded into the MapReduce
    programming model. Two machine translation paradigms were studied: shallow
    transfer Rule Based Machine Translation and Statistical Machine Translation.

    The results show that the MapReduce model can be successfully used to
    increase the throughput of a machine translation system. Furthermore, this
    method enhances the throughput of a machine translation system without
    decreasing the quality of the translation output.

    Thus, the present manuscript also represents a contribution to work in
    natural language processing, specifically Machine Translation. It first
    points toward the importance of defining a throughput metric for translation
    systems and, second, toward the applicability of the machine translation task
    to the MapReduce paradigm.

    Old Content and Modern Tools – Searching Named Entities in a Finnish OCRed Historical Newspaper Collection 1771-1910

    Kimmo Kettunen, Eetu Mäkelä, Teemu Ruokolainen, Juha Kuokkala, Laura Löfberg
    Comments: 24 pages, 13 tables
    Subjects: Computation and Language (cs.CL)

    Named Entity Recognition (NER), the search, classification and tagging of
    names and name-like frequent informational elements in texts, has become a
    standard information extraction procedure for textual data. NER has been
    applied to many types of texts and different types of entities: newspapers,
    fiction, historical records, persons, locations, chemical compounds, protein
    families, animals etc. In general a NER system’s performance is genre and
    domain dependent, and the entity categories used also vary (Nadeau and Sekine,
    2007). The most general set of named entities is usually some version of a
    tripartite categorization of locations, persons and organizations. In this
    paper we report the first large-scale
    trials and evaluation of NER with data out of a digitized Finnish historical
    newspaper collection Digi. Experiments, results and discussion of this research
    serve development of the Web collection of historical Finnish newspapers.

    The Digi collection contains 1,960,921 pages of newspaper material from the
    years 1771-1910, in both Finnish and Swedish. We use only the Finnish-language
    material in our evaluation. The OCRed newspaper collection has many OCR
    errors; its estimated word-level correctness is about 70-75 % (Kettunen and
    Pääkkönen, 2016). Our principal NER tagger is a rule-based tagger of Finnish,
    FiNER, provided by the FIN-CLARIN consortium. We also show results of
    limited-category semantic tagging with tools of the Semantic Computing
    Research Group (SeCo) of Aalto University. Three other tools are also
    evaluated
    briefly.

    This research reports the first published large-scale results of NER in a
    historical Finnish OCRed newspaper collection. The results supplement NER
    results for other languages with similar noisy data.

    Automatic recognition of child speech for robotic applications in noisy environments

    Samuel Fernando, Roger K. Moore, David Cameron, Emily C. Collins, Abigail Millings, Amanda J. Sharkey, Tony J. Prescott
    Comments: Submission to Computer Speech and Language, special issue on Interaction Technologies for Children
    Subjects: Computation and Language (cs.CL); Sound (cs.SD)

    Automatic speech recognition (ASR) allows a natural and intuitive interface
    for robotic educational applications for children. However, there are a number
    of challenges to overcome to allow such an interface to operate robustly in
    realistic settings, including the intrinsic difficulties of recognising child
    speech and high levels of background noise often present in classrooms. As part
    of the EU EASEL project we have provided several contributions to address these
    challenges, implementing our own ASR module for use in robotics applications.
    We use the latest deep neural network algorithms, which provide a leap in
    performance over the traditional GMM approach, and apply data augmentation
    methods to improve robustness to noise and speaker variation. We provide a
    close integration between the ASR module and the rest of the dialogue system,
    allowing the ASR to receive in real-time the language models relevant to the
    current section of the dialogue, greatly improving the accuracy. We integrated
    our ASR module into an interactive, multimodal system using a small humanoid
    robot to help children learn about exercise and energy. The system was
    installed at a public museum event as part of a research study where 320
    children (aged 3 to 14) interacted with the robot, with our ASR achieving 90%
    accuracy for fluent and near-fluent speech.

    Audio Visual Speech Recognition using Deep Recurrent Neural Networks

    Abhinav Thanda, Shankar M Venkatesan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    In this work, we propose a training algorithm for an audio-visual automatic
    speech recognition (AV-ASR) system using deep recurrent neural network
    (RNN). First, we train a deep RNN acoustic model with a Connectionist Temporal
    Classification (CTC) objective function. The frame labels obtained from the
    acoustic model are then used to perform a non-linear dimensionality reduction
    of the visual features using a deep bottleneck network. Audio and visual
    features are fused and used to train a fusion RNN. The use of bottleneck
    features for the visual modality helps the model converge properly during
    training. Our system is evaluated on the GRID corpus. Our results show that
    the presence of the visual modality gives a significant improvement in
    character error rate (CER) at various levels of noise even when the model is
    trained without
    noisy data. We also provide a comparison of two fusion methods: feature fusion
    and decision fusion.

    An Automated System for Essay Scoring of Online Exams in Arabic based on Stemming Techniques and Levenshtein Edit Operations

    Emad Fawzi Al-Shalabi
    Comments: 5 pages, 2 figures
    Journal-ref: IJCSI International Journal of Computer Science Issues, Volume 13,
    Issue 5, September 2016
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

    In this article, an automated system is proposed for essay scoring of online
    exams in the Arabic language, based on stemming techniques and Levenshtein
    edit operations. An online exam has been developed using the proposed
    mechanisms, exploiting the capabilities of light and heavy stemming. The
    implemented online grading system has been shown to be an efficient tool for
    automated scoring of essay questions.


    Distributed, Parallel, and Cluster Computing

    Coarse mesh partitioning for tree based AMR

    Carsten Burstedde, Johannes Holke
    Comments: 28 pages, 11 figures, 6 tables
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    In tree based adaptive mesh refinement, elements are partitioned between
    processes using a space filling curve. The curve establishes an ordering
    between all elements that derive from the same root element, the tree. When
    representing more complex geometries by patching together several trees, the
    roots of these trees form an unstructured coarse mesh. We present an algorithm
    to partition the elements of the coarse mesh such that (a) the fine mesh can be
    load-balanced to equal element counts per process regardless of the
    element-to-tree map and (b) each process that holds fine mesh elements has
    access to the meta data of all relevant trees. As an additional feature, the
    algorithm partitions the meta data of relevant ghost (halo) trees as well. We
    develop in detail how each process computes the communication pattern for the
    partition routine without handshaking and with minimal data movement. We
    demonstrate the scalability of this approach on up to 917e3 MPI ranks and
    0.37e12 coarse mesh elements, measuring run times of one second or less.
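
    The load-balancing goal in (a) can be illustrated with a small serial sketch
    that assigns each process an equal, contiguous range of fine elements in
    space-filling-curve order and derives which coarse trees that process
    therefore needs; the real algorithm computes this in parallel and without
    handshaking:

    def partition_fine_elements(elements_per_tree, num_procs):
        # elements_per_tree[i]: number of fine elements in coarse tree i, in SFC order.
        # Returns, for each process, the tree ids whose meta data it needs so that
        # every process holds (almost) the same number of fine elements.
        total = sum(elements_per_tree)
        bounds = [(p * total // num_procs, (p + 1) * total // num_procs)
                  for p in range(num_procs)]
        offsets, acc = [], 0
        for n in elements_per_tree:          # prefix offsets along the SFC
            offsets.append((acc, acc + n))
            acc += n
        trees_per_proc = []
        for lo, hi in bounds:
            trees_per_proc.append([t for t, (a, b) in enumerate(offsets)
                                   if a < hi and b > lo])  # tree overlaps the range
        return trees_per_proc

    print(partition_fine_elements([5, 1, 9, 3], num_procs=3))  # [[0, 1], [2], [2, 3]]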

    SLA-aware Interactive Workflow Assistant for HPC Parameter Sweeping Experiments

    Bruno Silva, Marco A. S. Netto, Renato L. F. Cunha
    Comments: 11 pages, 9 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    A common workflow in science and engineering is to (i) setup and deploy large
    experiments with tasks comprising an application and multiple parameter values;
    (ii) generate intermediate results; (iii) analyze them; and (iv) reprioritize
    the tasks. These steps are repeated until the desired goal is achieved, which
    can be the evaluation/simulation of complex systems or model calibration. Due
    to time and cost constraints, sweeping all possible parameter values of the
    user application is not always feasible. Experimental Design techniques can
    help users reorganize submission-execution-analysis workflows to arrive at a
    solution in a more timely manner. This paper introduces a novel tool that
    leverages users’ feedback on analyzing intermediate results of parameter
    sweeping experiments to advise them about their strategies on parameter
    selections tied to their SLA constraints. We evaluated our tool with three
    applications of distinct domains and search space shapes. Our main finding is
    that users with submission-execution-analysis workflows can benefit from their
    interaction with intermediate results and adapt themselves according to their
    domain expertise and SLA constraints.

    Helping HPC Users Specify Job Memory Requirements via Machine Learning

    Eduardo R. Rodrigues, Renato L. F. Cunha, Marco A. S. Netto, Michael Spriggs
    Comments: 8 pages, 3 figures, presented at the Third Annual Workshop on HPC User Support Tools
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Resource allocation in High Performance Computing (HPC) settings is still not
    easy for end-users due to the wide variety of application and environment
    configuration options. Users have difficulty estimating the number of
    processors and amount of memory required by their jobs, selecting the queue
    and partition, and estimating when job output will be available to plan their
    next
    experiments. Apart from wasting infrastructure resources by making wrong
    allocation decisions, overall user response time can also be negatively
    impacted. Techniques that exploit batch scheduler systems to predict waiting
    time and runtime of user jobs have already been proposed. However, we observed
    that such techniques are not suitable for predicting job memory usage. In this
    paper we introduce a tool to help users predict their memory requirements using
    machine learning. We describe the integration of the tool with a batch
    scheduler system, discuss how batch scheduler log data can be exploited to
    generate memory usage predictions through machine learning, and present
    results from two production systems containing thousands of jobs.

    Language Support for Reliable Memory Regions

    Saurabh Hukerikar, Christian Engelmann
    Comments: The 29th International Workshop on Languages and Compilers for Parallel Computing
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Programming Languages (cs.PL); Software Engineering (cs.SE)

    The path to exascale computational capabilities in high-performance computing
    (HPC) systems is challenged by the inadequacy of present software technologies
    to adapt to the rapid evolution of architectures of supercomputing systems. The
    constraints of power have driven system designs to include increasingly
    heterogeneous architectures and diverse memory technologies and interfaces.
    Future systems are also expected to experience an increased rate of errors,
    such that the applications will no longer be able to assume correct behavior of
    the underlying machine. To enable the scientific community to succeed in
    scaling their applications, and to harness the capabilities of exascale
    systems, we need software strategies that provide mechanisms for explicit
    management of resilience to errors in the system, in addition to locality of
    reference in the complex memory hierarchies of future HPC systems.

    In prior work, we introduced the concept of explicitly reliable memory
    regions, called havens. Memory management using havens supports reliability
    management through a region-based approach to memory allocations. Havens enable
    the creation of robust memory regions, whose resilient behavior is guaranteed
    by software-based protection schemes. In this paper, we propose language
    support for havens through type annotations that make the structure of a
    program’s havens more explicit and convenient for HPC programmers to use. We
    describe how the extended haven-based memory management model is implemented,
    and demonstrate the use of the language-based annotations to affect the
    resiliency of a conjugate gradient solver application.

    ZeroTouch Provisioning (ZTP) Model and Infrastructure Components for Multi-provider Cloud Services Provisioning

    Yuri Demchenko, Paola Grosso, Cees de Laat, Sonja Filiposka, Migiel de Vos
    Comments: 6 pages, 2 figures
    Journal-ref: Fifth IEEE International Workshop on Cloud Computing Interclouds,
    Multiclouds, Federations, and Interoperability (Intercloud 2016), In Proc.
    IEEE International Conference on Cloud Engineering (IC2E), April 4 – 8, 2016,
    Berlin, Germany
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    This paper presents results of the ongoing development of the Cloud Services
    Delivery Infrastructure (CSDI) that provides a basis for infrastructure centric
    cloud services provisioning, operation and management in multi-cloud
    multi-provider environment defined as a Zero Touch Provisioning, Operation and
    Management (ZTP/ZTPOM) model. The presented work refers to use cases from data
    intensive research that require high performance computation resources and
    large storage volumes that are typically distributed between datacenters often
    involving multiple cloud providers. Automation for large scale scientific (and
    industrial) applications should include provisioning of both inter-cloud
    network infrastructure and intra-cloud application resources. It should provide
    support for the complete application operation workflow together with the
    possible application infrastructure and resources changes that can occur during
    the application lifecycle. The authors investigate existing technologies for
    automation of the service provisioning and management processes aiming to
    cross-pollinate best practices from currently disconnected domains such as
    cloud based applications provisioning and multi-domain high-performance network
    provisioning. The paper refers to the previous and legacy research by authors,
    the Open Cloud eXchange (OCX), that has been proposed to address the last mile
    problem in cloud services delivery to campuses over trans-national backbone
    networks such as GEANT. OCX will serve as an integral component of the
    prospective ZTP infrastructure over the GEANT network. Another important
    component, the Marketplace, is defined for providing cloud services and
    applications discovery (in a general intercloud environment) and may also
    support additional services such as service composition and trust brokering
    for establishing customer-provider federations.

    Resilience Design Patterns – A Structured Approach to Resilience at Extreme Scale

    Saurabh Hukerikar, Christian Engelmann
    Comments: Oak Ridge National Laboratory Technical Report version 1.0
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Software Engineering (cs.SE)

    In this document, we develop a structured approach to the management of HPC
    resilience based on the concept of resilience-based design patterns. A design
    pattern is a general repeatable solution to a commonly occurring problem. We
    identify the commonly occurring problems and solutions used to deal with
    faults, errors and failures in HPC systems. The catalog of resilience design
    patterns provides designers with reusable design elements. We define a design
    framework that enhances our understanding of the important constraints and
    opportunities for solutions deployed at various layers of the system stack. The
    framework may be used to establish mechanisms and interfaces to coordinate
    flexible fault management across hardware and software components. The
    framework also enables optimization of the cost-benefit trade-offs among
    performance, resilience, and power consumption. The overall goal of this work
    is to enable a systematic methodology for the design and evaluation of
    resilience technologies in extreme-scale HPC systems that keep scientific
    applications running to a correct solution in a timely and cost-efficient
    manner in spite of frequent faults, errors, and failures of various types.

    Increasing the throughput of machine translation systems using clouds

    Jernej Vičič, Andrej Brodnik
    Comments: 20 pages, 7 figures
    Subjects: Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC)

    The manuscript presents an experiment in implementing a Machine
    Translation system in the MapReduce model. The empirical evaluation was done
    using fully implemented translation systems embedded into the MapReduce
    programming model. Two machine translation paradigms were studied: shallow
    transfer Rule Based Machine Translation and Statistical Machine Translation.

    The results show that the MapReduce model can be successfully used to
    increase the throughput of a machine translation system. Furthermore, this
    method enhances the throughput of a machine translation system without
    decreasing the quality of the translation output.

    Thus, the present manuscript also represents a contribution to work in
    natural language processing, specifically Machine Translation. It first
    points toward the importance of defining a throughput metric for translation
    systems and, second, toward the applicability of the machine translation task
    to the MapReduce paradigm.


    Learning

    Fair Learning in Markovian Environments

    Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth
    Subjects: Learning (cs.LG)

    We initiate the study of fair learning in Markovian settings, where the
    actions of a learning algorithm may affect its environment and future rewards.
    Working in the model of reinforcement learning, we define a fairness constraint
    requiring that an algorithm never prefers one action over another if the
    long-term (discounted) reward of choosing the latter action is higher.

    Our first result is negative: despite the fact that fairness is consistent
    with the optimal policy, any learning algorithm satisfying fairness must take
    exponentially many rounds in the number of states to achieve non-trivial
    approximation to the optimal policy. Our main result is a polynomial time
    algorithm that is provably fair under an approximate notion of fairness, thus
    establishing an exponential gap between exact and approximate fairness.
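
    Stated operationally, the constraint says the algorithm may place higher
    probability on one action than on another only if the first action's
    long-term value is at least as high. A tiny check of a policy against known
    action values (an illustration of the definition, not the paper's algorithm):

    def is_fair(policy_probs, q_values, tol=1e-9):
        # policy_probs[s][a]: probability of picking action a in state s.
        # q_values[s][a]: long-term (discounted) value of action a in state s.
        # Fair means: never prefer an action over one with strictly higher value.
        for s in range(len(q_values)):
            for a in range(len(q_values[s])):
                for b in range(len(q_values[s])):
                    if (q_values[s][b] > q_values[s][a] + tol
                            and policy_probs[s][a] > policy_probs[s][b] + tol):
                        return False
        return True

    # two actions, action 1 strictly better: preferring action 0 would be unfair
    print(is_fair([[0.5, 0.5]], [[1.0, 2.0]]))  # True
    print(is_fair([[0.9, 0.1]], [[1.0, 2.0]]))  # False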

    Incremental Sequence Learning

    Edwin D. de Jong
    Comments: 18 pages
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    Deep learning research over the past years has shown that by increasing the
    scope or difficulty of the learning problem over time, increasingly complex
    learning problems can be addressed. We study incremental learning in the
    context of sequence learning, using generative RNNs in the form of multi-layer
    recurrent Mixture Density Networks. We introduce Incremental Sequence Learning,
    a simple incremental approach to sequence learning. Incremental Sequence
    Learning starts out by using only the first few steps of each sequence as
    training data. Each time a performance criterion has been reached, the length
    of the parts of the sequences used for training is increased. To evaluate
    Incremental Sequence Learning and comparison methods, we introduce and make
    available a novel sequence learning task and data set: predicting and
    classifying MNIST pen stroke sequences, where the familiar handwritten digit
    images have been transformed to pen stroke sequences representing the skeletons
    of the digits. We find that Incremental Sequence Learning greatly speeds up
    sequence learning and reaches the best test performance level of regular
    sequence learning 20 times faster, reduces the test error by 74%, and in
    general performs more robustly; it displays lower variance and achieves
    sustained progress after all three comparison methods have stopped improving. A
    trained sequence prediction model is also used in transfer learning to the task
    of sequence classification, where it is found that transfer learning realizes
    improved classification performance compared to methods that learn to classify
    from scratch.

    Attributing Hacks

    Ziqi Liu, Alexander J. Smola, Kyle Soska, Yu-Xiang Wang, Qinghua Zheng
    Subjects: Learning (cs.LG); Cryptography and Security (cs.CR); Applications (stat.AP)

    In this paper we describe an algorithm for estimating the provenance of hacks
    on websites. That is, given properties of sites and the temporal occurrence of
    attacks, we are able to attribute individual attacks to joint causes and
    vulnerabilities, as well as to estimate the evolution of these vulnerabilities
    over time. Specifically, we use hazard regression with a time-varying additive
    hazard function parameterized in a generalized linear form. The activation
    coefficients on each feature are continuous-time functions over time. We
    formulate the problem of learning these functions as a constrained variational
    maximum likelihood estimation problem with total variation penalty and show
    that the optimal solution is a 0th order spline (a piecewise constant function)
    with a finite number of known knots. This allows the inference problem to be
    solved efficiently and at scale by solving a finite dimensional optimization
    problem. Extensive experiments on real data sets show that our method
    significantly outperforms Cox’s proportional hazard model. We also conduct a
    case study and verify that the fitted functions are indeed recovering
    vulnerable features and real-life events such as the release of code to exploit
    these features in hacker blogs.
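
    For illustration only, a toy sketch of the 0th-order spline (piecewise
    constant) coefficients and the additive hazard they parameterize; the knot
    locations, segment levels, and feature values below are hypothetical, and
    the paper's penalized variational estimator itself is not shown:

    ```python
    import numpy as np

    def piecewise_constant(knots, levels, t):
        """0th-order spline: `knots` are sorted change points, `levels` holds
        the value on each of the len(knots) + 1 segments."""
        return levels[int(np.searchsorted(knots, t, side="right"))]

    def additive_hazard(coeff_fns, features, t):
        """Time-varying additive hazard: lambda(t) = sum_j beta_j(t) * x_j."""
        return sum(fn(t) * x for fn, x in zip(coeff_fns, features))

    # Example: a single coefficient jumping at t = 10 (say, an exploit release).
    beta = lambda t: piecewise_constant(np.array([10.0]), [0.1, 0.9], t)
    print(additive_hazard([beta], [1.0], t=12.0))  # -> 0.9
    ```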

    Online Learning for Wireless Distributed Computing

    Yi-Hsuan Kao, Kwame Wright, Bhaskar Krishnamachari, Fan Bai
    Comments: 10 pages, 8 figures, conference
    Subjects: Learning (cs.LG)

    There has been growing interest in Wireless Distributed Computing (WDC),
    which leverages collaborative computing over multiple wireless devices. WDC
    enables complex applications that a single device cannot support individually.
    However, the problem of assigning tasks over multiple devices becomes
    challenging in the dynamic environments encountered in real-world settings,
    considering that the resource availability and channel conditions change over
    time in unpredictable ways due to mobility and other factors. In this paper, we
    formulate a task assignment problem as an online learning problem using an
    adversarial multi-armed bandit framework. We propose MABSTA, a novel online
    learning algorithm that learns the performance of unknown devices and channel
    qualities continually through exploratory probing and makes task assignment
    decisions by exploiting the gained knowledge. For maximal adaptability, MABSTA
    is designed to make no stochastic assumption about the environment. We analyze
    it mathematically and provide a worst-case performance guarantee for any
    dynamic environment. We also compare it with the optimal offline policy as well
    as other baselines via emulations on trace-data obtained from a wireless IoT
    testbed, and show that it offers competitive and robust performance in all
    cases. To the best of our knowledge, MABSTA is the first online algorithm in
    this domain of task assignment problems and provides a provable performance
    guarantee.
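
    Since the abstract casts task assignment as an adversarial multi-armed
    bandit, here is a minimal Exp3-style sketch of that framework. This is
    generic exponential weighting over single arms, not the authors' MABSTA
    algorithm (which handles combinatorial assignments of tasks to devices);
    `reward_fn` is a hypothetical stand-in for the observed performance:

    ```python
    import numpy as np

    def exp3(reward_fn, n_arms, horizon, gamma=0.1, rng=None):
        """Exp3 for adversarial bandits: sample an arm from a mixture of the
        exponential weights and uniform exploration, then update with an
        importance-weighted reward estimate."""
        rng = rng or np.random.default_rng(0)
        weights = np.ones(n_arms)
        for t in range(horizon):
            probs = (1 - gamma) * weights / weights.sum() + gamma / n_arms
            arm = rng.choice(n_arms, p=probs)
            r = reward_fn(t, arm)                      # reward in [0, 1]
            weights[arm] *= np.exp(gamma * (r / probs[arm]) / n_arms)
        return weights / weights.sum()
    ```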

    Tuning Recurrent Neural Networks with Reinforcement Learning

    Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Sequence models can be trained using supervised learning and a next-step
    prediction objective. This approach, however, suffers from known failure modes.
    For example, it is notoriously difficult to ensure multi-step generated
    sequences have coherent global structure. Motivated by the fact that
    reinforcement learning (RL) can be used to impose arbitrary properties on
    generated data by choosing appropriate reward functions, in this paper we
    propose a novel approach for sequence training which combines Maximum
    Likelihood (ML) and RL training. We refine a sequence predictor by optimizing
    for some imposed reward functions, while maintaining good predictive properties
    learned from data. We propose efficient ways to solve this by augmenting deep
    Q-learning with a cross-entropy reward and deriving novel off-policy methods
    for RNNs from stochastic optimal control (SOC). We explore the usefulness of
    our approach in the context of music generation. An LSTM is trained on a large
    corpus of songs to predict the next note in a musical sequence. This Note-RNN
    is then refined using RL, where the reward function is a combination of rewards
    based on rules of music theory, as well as the output of another trained
    Note-RNN. We show that by combining ML and RL, the RL Tuner method not only
    produces more pleasing melodies, but also significantly reduces unwanted
    behaviors and failure modes of the RNN.
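
    A toy sketch of the combined reward described above: a rule-based
    music-theory term plus a log-probability term from the pretrained Note-RNN.
    The callables `theory_reward` and `note_rnn_prob` and the weight `c` are
    hypothetical placeholders, not the paper's exact formulation:

    ```python
    import math

    def rl_tuner_reward(note, history, theory_reward, note_rnn_prob, c=0.5):
        """Combined RL reward: music-theory rules plus a term that keeps the
        refined policy close to what the data-trained Note-RNN would predict."""
        r_theory = theory_reward(note, history)        # rule-based component
        r_ml = math.log(note_rnn_prob(note, history))  # log p(note | history)
        return r_theory + c * r_ml
    ```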

    Delving into Transferable Adversarial Examples and Black-box Attacks

    Yanpei Liu, Xinyun Chen, Chang Liu, Dawn Song
    Subjects: Learning (cs.LG)

    An intriguing property of deep neural networks is the existence of
    adversarial examples, which can transfer among different architectures. These
    transferable adversarial examples may severely hinder deep neural network-based
    applications. Previous works mostly study the transferability using small scale
    datasets. In this work, we are the first to conduct an extensive study of the
    transferability over large models and a large scale dataset, and we are also
    the first to study the transferability of targeted adversarial examples with
    their target labels. We study both non-targeted and targeted adversarial
    examples, and show that while transferable non-targeted adversarial examples
    are easy to find, targeted adversarial examples generated using existing
    approaches almost never transfer with their target labels. Therefore, we
    propose novel ensemble-based approaches to generating transferable adversarial
    examples. Using such approaches, we observe a large proportion of targeted
    adversarial examples that are able to transfer with their target labels for the
    first time. We also present some geometric studies to help understanding the
    transferable adversarial examples. Finally, we show that the adversarial
    examples generated using ensemble-based approaches can successfully attack
    Clarifai.com, which is a black-box image classification system.
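
    A rough sketch of the ensemble idea on toy linear softmax models: the
    targeted perturbation is optimized against the average loss of several
    models, in the hope that it transfers to a held-out one. The toy models, the
    signed-gradient update, and the step sizes are illustrative assumptions, not
    the paper's exact procedure:

    ```python
    import numpy as np

    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    def ensemble_targeted_attack(x, target, weight_mats, eps=0.3, steps=40, lr=0.05):
        """Push x toward `target` under the average cross-entropy of all models
        (each a toy softmax classifier with logits W @ x)."""
        x_adv, x0 = x.copy(), x.copy()
        for _ in range(steps):
            grad = np.zeros_like(x_adv)
            for W in weight_mats:
                p = softmax(W @ x_adv)
                onehot = np.zeros_like(p)
                onehot[target] = 1.0
                grad += W.T @ (p - onehot) / len(weight_mats)  # d(-log p[target])/dx
            x_adv -= lr * np.sign(grad)                 # targeted descent step
            x_adv = np.clip(x_adv, x0 - eps, x0 + eps)  # stay in the eps-ball
        return x_adv
    ```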

    Recursive Regression with Neural Networks: Approximating the HJI PDE Solution

    Vicenç Rubies Royo
    Subjects: Learning (cs.LG); Dynamical Systems (math.DS)

    Most machine learning applications using neural networks seek to approximate
    some function g(x) by minimizing some cost criterion. In the simplest case, if
    one has access to pairs of the form (x, y) where y = g(x), the problem can be
    framed as a simple regression problem. Beyond this family of problems, we find
    many important cases where g(x) is unknown so this approach is not always
    viable. However, similar to what we find in the work of Mnih et al. (2013), if
    we have some known properties of the function we are seeking to approximate,
    there is still hope to frame the problem as a regression problem. In this work,
    we show this in the context of trying to approximate the solution to a
    particular partial differential equation known as the Hamilton-Jacobi-Isaacs
    PDE found in the fields of control theory and robotics.

    Variational Lossy Autoencoder

    Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Representation learning seeks to expose certain aspects of observed data in a
    learned representation that’s amenable to downstream tasks like classification.
    For instance, a good representation for 2D images might be one that describes
    only global structure and discards information about detailed texture. In this
    paper, we present a simple but principled method to learn such global
    representations by combining Variational Autoencoder (VAE) with neural
    autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE
    model allows us to have control over what the global latent code can learn
    and, by designing the architecture accordingly, we can force the global latent
    code to discard irrelevant information such as texture in 2D images, and hence
    the code only “autoencodes” data in a lossy fashion. In addition, by leveraging
    autoregressive models as both prior distribution (p(z)) and decoding
    distribution (p(x|z)), we can greatly improve generative modeling performance
    of VAEs, achieving new state-of-the-art results on MNIST, OMNIGLOT and
    Caltech-101 Silhouettes density estimation tasks.

    A Unified Maximum Likelihood Approach for Optimal Distribution Property Estimation

    Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    The advent of data science has spurred interest in estimating properties of
    distributions over large alphabets. Fundamental symmetric properties such as
    support size, support coverage, entropy, and proximity to uniformity, received
    most attention, with each property estimated using a different technique and
    often intricate analysis tools.

    We prove that for all these properties, a single, simple, plug-in
    estimator—profile maximum likelihood (PML)—performs as well as the best
    specialized techniques. This raises the possibility that PML may optimally
    estimate many other symmetric properties.
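
    For concreteness, the central object here is the sample's profile: for each
    j, the number of distinct symbols appearing exactly j times. PML then picks
    the distribution maximizing the probability of that profile; only the
    profile computation is sketched below, not the maximization:

    ```python
    from collections import Counter

    def sample_profile(sample):
        """Profile of a sample: map multiplicity j -> number of distinct
        symbols that occur exactly j times."""
        multiplicities = Counter(sample).values()      # per-symbol counts
        return dict(sorted(Counter(multiplicities).items()))

    # sample_profile("abracadabra") -> {1: 2, 2: 2, 5: 1}
    ```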

    Heter-LP: A heterogeneous label propagation algorithm and its application in drug repositioning

    Maryam Lotfi Shahreza, Nasser Ghadiri, Seyed Rasul Mossavi, Jaleh Varshosaz, James Green
    Subjects: Quantitative Methods (q-bio.QM); Learning (cs.LG)

    Drug repositioning offers an effective solution to drug discovery, saving
    both time and resources by finding new indications for existing drugs.
    Typically, a drug takes effect via its protein targets in the cell. As a
    result, it is necessary for drug development studies to conduct an
    investigation into the interrelationships of drugs, protein targets, and
    diseases. Although previous studies have made a strong case for the
    effectiveness of integrative network-based methods for predicting these
    interrelationships, little progress has been achieved in this regard within
    drug repositioning research. Moreover, the interactions of new drugs and
    targets (lacking any known targets and drugs, respectively) cannot be
    accurately predicted by most established methods. In this paper, we propose a
    novel semi-supervised heterogeneous label propagation algorithm named Heter-LP,
    which applies both local as well as global network features for data
    integration. To predict drug-target, disease-target, and drug-disease
    associations, we use information about drugs, diseases, and targets as
    collected from multiple sources at different levels. Our algorithm integrates
    these various types of data into a heterogeneous network and implements a label
    propagation algorithm to find new interactions. Statistical analyses of 10-fold
    cross-validation results and experimental analysis support the effectiveness of
    the proposed algorithm.
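
    As a point of reference, a minimal homogeneous label-propagation iteration
    of the usual F <- alpha * S * F + (1 - alpha) * Y form; Heter-LP itself
    operates on a heterogeneous drug/target/disease network and uses both local
    and global features, which this sketch does not capture:

    ```python
    import numpy as np

    def label_propagation(adj, labels, alpha=0.8, n_iter=50):
        """Propagate initial labels/associations Y over a graph with symmetric
        nonnegative affinity matrix `adj`."""
        d = adj.sum(axis=1).astype(float)
        d[d == 0] = 1.0
        s = adj / np.sqrt(np.outer(d, d))              # D^-1/2 A D^-1/2
        f = labels.astype(float)
        for _ in range(n_iter):
            f = alpha * s @ f + (1 - alpha) * labels   # propagate, then re-anchor
        return f
    ```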

    Audio Visual Speech Recognition using Deep Recurrent Neural Networks

    Abhinav Thanda, Shankar M Venkatesan
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    In this work, we propose a training algorithm for an audio-visual automatic
    speech recognition (AV-ASR) system using a deep recurrent neural network
    (RNN). First, we train a deep RNN acoustic model with a Connectionist Temporal
    Classification (CTC) objective function. The frame labels obtained from the
    acoustic model are then used to perform a non-linear dimensionality reduction
    of the visual features using a deep bottleneck network. Audio and visual
    features are fused and used to train a fusion RNN. The use of bottleneck
    features for visual modality helps the model to converge properly during
    training. Our system is evaluated on the GRID corpus. Our results show that
    the presence of the visual modality gives a significant improvement in
    character error rate (CER) at various noise levels, even when the model is trained without
    noisy data. We also provide a comparison of two fusion methods: feature fusion
    and decision fusion.
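
    A small sketch contrasting the two fusion strategies mentioned at the end of
    the abstract; the concatenation axis, the log-linear posterior combination,
    and the weight `w` are illustrative assumptions rather than the paper's
    exact rules:

    ```python
    import numpy as np

    def feature_fusion(audio_feats, visual_bottleneck):
        """Feature fusion: concatenate per-frame audio and visual bottleneck
        features before the fusion RNN."""
        return np.concatenate([audio_feats, visual_bottleneck], axis=-1)

    def decision_fusion(audio_post, visual_post, w=0.7):
        """Decision fusion: combine per-frame posteriors of separately trained
        audio and visual models with a log-linear weighting."""
        log_p = w * np.log(audio_post) + (1 - w) * np.log(visual_post)
        p = np.exp(log_p - log_p.max(axis=-1, keepdims=True))
        return p / p.sum(axis=-1, keepdims=True)
    ```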

    Lie-Access Neural Turing Machines

    Greg Yang, Alexander M. Rush
    Comments: Submitted to ICLR. Rewrite and improvement of this https URL
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    Recent work has demonstrated the effectiveness of employing explicit external
    memory structures in conjunction with deep neural models for algorithmic
    learning (Graves et al. 2014; Weston et al. 2014). These models utilize
    differentiable versions of traditional discrete memory-access structures
    (random access, stacks, tapes) to provide the variable-length storage necessary
    for computational tasks. In this work, we propose an alternative model,
    Lie-access memory, that is explicitly designed for the neural setting. In this
    paradigm, memory is accessed using a continuous head in a key-space manifold.
    The head is moved via Lie group actions, such as shifts or rotations, generated
    by a controller, and soft memory access is performed by considering the
    distance to keys associated with each memory. We argue that Lie groups provide
    a natural generalization of discrete memory structures, such as Turing
    machines, as they provide inverse and identity operators while maintaining
    differentiability. To experiment with this approach, we implement several
    simplified Lie-access neural Turing machines (LANTMs) with different Lie groups.
    We find that this approach is able to perform well on a range of algorithmic
    tasks.
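
    A minimal numerical sketch of the two ingredients described above for a 2-D
    key space: the head is moved by a Lie group action (here a rotation), and
    memory is read softly by weighting stored values with a softmax over
    distances to the keys. This illustrates the idea, not the authors' LANTM
    architecture:

    ```python
    import numpy as np

    def rotate(head, theta):
        """Move the head by a Lie group action: a 2-D rotation."""
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s], [s, c]]) @ head

    def soft_read(head, keys, values, temperature=1.0):
        """Soft access: softmax over negative squared head-to-key distances."""
        d2 = ((keys - head) ** 2).sum(axis=1)
        w = np.exp(-d2 / temperature)
        w /= w.sum()
        return w @ values
    ```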

    RL(^2): Fast Reinforcement Learning via Slow Reinforcement Learning

    Yan Duan, John Schulman, Xi Chen, Peter Bartlett, Ilya Sutskever, Pieter Abbeel
    Comments: 14 pages. Under review as a conference paper at ICLR 2017
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Deep reinforcement learning (deep RL) has been successful in learning
    sophisticated behaviors automatically; however, the learning process requires a
    huge number of trials. In contrast, animals can learn new tasks in just a few
    trials, benefiting from their prior knowledge about the world. This paper seeks
    to bridge this gap. Rather than designing a “fast” reinforcement learning
    algorithm, we propose to represent it as a recurrent neural network (RNN) and
    learn it from data. In our proposed method, RL(^2), the algorithm is encoded in
    the weights of the RNN, which are learned slowly through a general-purpose
    (“slow”) RL algorithm. The RNN receives all information a typical RL algorithm
    would receive, including observations, actions, rewards, and termination flags;
    and it retains its state across episodes in a given Markov Decision Process
    (MDP). The activations of the RNN store the state of the “fast” RL algorithm on
    the current (previously unseen) MDP. We evaluate RL(^2) experimentally on both
    small-scale and large-scale problems. On the small-scale side, we train it to
    solve randomly generated multi-armed bandit problems and finite MDPs. After
    RL(^2) is trained, its performance on new MDPs is close to human-designed
    algorithms with optimality guarantees. On the large-scale side, we test RL(^2)
    on a vision-based navigation task and show that it scales up to
    high-dimensional problems.
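
    The per-step input described in the abstract (observation, previous action,
    reward, and termination flag) can be sketched as one concatenated vector fed
    to the RNN, whose hidden state persists across episodes within a trial; the
    one-hot encoding and the ordering are assumptions made for illustration:

    ```python
    import numpy as np

    def rl2_step_input(obs, prev_action, prev_reward, done, n_actions):
        """Concatenate everything a typical RL algorithm would observe into a
        single input vector for the recurrent policy."""
        a = np.zeros(n_actions)
        a[prev_action] = 1.0
        return np.concatenate([np.asarray(obs, dtype=float), a,
                               [prev_reward], [float(done)]])
    ```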

    Recursive Decomposition for Nonconvex Optimization

    Abram L. Friesen, Pedro Domingos
    Comments: 11 pages, 7 figures, pdflatex
    Journal-ref: Proceedings of the 24th International Joint Conference on
    Artificial Intelligence (2015), pp. 253-259
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    Continuous optimization is an important problem in many areas of AI,
    including vision, robotics, probabilistic inference, and machine learning.
    Unfortunately, most real-world optimization problems are nonconvex, causing
    standard convex techniques to find only local optima, even with extensions like
    random restarts and simulated annealing. We observe that, in many cases, the
    local modes of the objective function have combinatorial structure, and thus
    ideas from combinatorial optimization can be brought to bear. Based on this, we
    propose a problem-decomposition approach to nonconvex optimization. Similarly
    to DPLL-style SAT solvers and recursive conditioning in probabilistic
    inference, our algorithm, RDIS, recursively sets variables so as to simplify
    and decompose the objective function into approximately independent
    sub-functions, until the remaining functions are simple enough to be optimized
    by standard techniques like gradient descent. The variables to set are chosen
    by graph partitioning, ensuring decomposition whenever possible. We show
    analytically that RDIS can solve a broad class of nonconvex optimization
    problems exponentially faster than gradient descent with random restarts.
    Experimentally, RDIS outperforms standard techniques on problems like structure
    from motion and protein folding.


    Information Theory

    NP-Hardness of Reed-Solomon Decoding, and the Prouhet-Tarry-Escott Problem

    Venkata Gandikota, Badih Ghazi, Elena Grigorescu
    Subjects: Information Theory (cs.IT); Computational Complexity (cs.CC)

    Establishing the complexity of Bounded Distance Decoding for
    Reed-Solomon codes is a fundamental open problem in coding theory, explicitly
    asked by Guruswami and Vardy (IEEE Trans. Inf. Theory, 2005). The problem is
    motivated by the large current gap between the regime when it is NP-hard, and
    the regime when it is efficiently solvable (i.e., the Johnson radius).

    We show the first NP-hardness results for asymptotically smaller decoding
    radii than the maximum likelihood decoding radius of Guruswami and Vardy.
    Specifically, for Reed-Solomon codes of length (N) and dimension (K=O(N)), we
    show that it is NP-hard to decode more than (N-K-c\frac{\log N}{\log\log N})
    errors (with (c>0) an absolute constant). Moreover, we show that the problem is
    NP-hard under quasipolynomial-time reductions for an error amount
    (> N-K-c\log N) (with (c>0) an absolute constant).

    These results follow from the NP-hardness of a generalization of the
    classical Subset Sum problem to higher moments, called Moments Subset Sum,
    which has been a known open problem, and which may be of independent
    interest.

    We further reveal a strong connection with the well-studied
    Prouhet-Tarry-Escott problem in Number Theory, which turns out to capture a
    main barrier in extending our techniques. We believe the Prouhet-Tarry-Escott
    problem deserves further study in the theoretical computer science community.

    The Nonconvex Geometry of Low-Rank Matrix Optimizations with General Objective Functions

    Qiuwei Li, Gongguo Tang
    Subjects: Information Theory (cs.IT); Optimization and Control (math.OC)

    This work considers the minimization of a general convex function (f(X)) over
    the cone of positive semi-definite matrices whose optimal solution (X^\star) is
    of low-rank. Standard first-order convex solvers require performing an
    eigenvalue decomposition in each iteration, severely limiting their
    scalability. A natural nonconvex reformulation of the problem factors the
    variable (X) into the product of a rectangular matrix with fewer columns and
    its transpose. For a special class of matrix sensing and completion problems
    with quadratic objective functions, local search algorithms applied to the
    factored problem have been shown to be much more efficient and, in spite of
    being nonconvex, to converge to the global optimum. The purpose of this work is
    to extend this line of study to general convex objective functions (f(X)) and
    investigate the geometry of the resulting factored formulations. Specifically,
    we prove that when (f(X)) satisfies restricted strong convexity and smoothness,
    each critical point of the factored problem either corresponds to the optimal
    solution (X^\star) or is a strict saddle point where the Hessian matrix has a
    negative eigenvalue. Such a geometric structure of the factored formulation
    ensures that many local search algorithms can converge to the global optimum
    with random initializations.
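
    A minimal sketch of the factored (Burer-Monteiro-style) reformulation
    analyzed here: minimize f(X) over PSD X by writing X = U U^T and running
    plain gradient descent on U. The callable `grad_f`, returning the symmetric
    gradient of the convex objective, and the step size are hypothetical:

    ```python
    import numpy as np

    def factored_gradient_descent(grad_f, n, r, steps=500, lr=1e-2, rng=None):
        """Gradient descent on the factor U of X = U U^T."""
        rng = rng or np.random.default_rng(0)
        u = 0.1 * rng.standard_normal((n, r))      # random initialization
        for _ in range(steps):
            x = u @ u.T
            u = u - lr * 2.0 * grad_f(x) @ u       # chain rule: dL/dU = 2 grad_f(X) U
        return u @ u.T
    ```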

    Throughput Analysis of Decentralized Coded Content Caching in Cellular Networks

    Mohsen Karimzadeh Kiskani, Hamid R. Sadjadpour
    Comments: Accepted to be published in IEEE Transactions on Wireless Communications (November 2016)
    Subjects: Information Theory (cs.IT)

    Decentralized coded content caching for next generation cellular networks is
    studied. The contents are linearly combined and cached in under-utilized caches
    of User Terminals (UTs) and its throughput capacity is compared with
    decentralized uncoded content caching. In both scenarios, we consider multihop
    Device-to-Device (D2D) communications and the use of femtocaches in the
    network. It is shown that decentralized coded content caching can increase the
    network throughput capacity compared to decentralized uncoded caching by
    reducing the number of hops needed to deliver the desired content. Further, the
    throughput capacity for Zipfian content request distribution is computed and it
    is shown that the decentralized coded content cache placement can increase the
    throughput capacity of cellular networks by a factor of ((\log(n))^2) where (n)
    is the number of nodes served by a femtocache.

    Channel Estimation for Hybrid Architecture Based Wideband Millimeter Wave Systems

    Kiran Venugopal, Ahmed Alkhateeb, Nuria González Prelcic, Robert W. Heath, Jr
    Comments: 31 pages, 13 figures
    Subjects: Information Theory (cs.IT)

    Hybrid analog and digital precoding allows millimeter wave (mmWave) systems
    to achieve both array and multiplexing gain. The design of the hybrid precoders
    and combiners, though, is usually based on knowledge of the channel. Prior work
    on mmWave channel estimation with hybrid architectures focused on narrowband
    channels. Since mmWave systems will be wideband with frequency selectivity, it
    is vital to develop channel estimation solutions for hybrid architectures based
    wideband mmWave systems. In this paper, we develop a sparse formulation and
    compressed sensing based solutions for the wideband mmWave channel estimation
    problem for hybrid architectures. First, we leverage the sparse structure of
    the frequency selective mmWave channels and formulate the channel estimation
    problem as a sparse recovery in both time and frequency domains. Then, we
    propose explicit channel estimation techniques for purely time or frequency
    domains and for combined time/frequency domains. Our solutions are suitable for
    both SC-FDE and OFDM systems. Simulation results show that the proposed
    solutions achieve good channel estimation quality, while requiring small
    training overhead. Leveraging the hybrid architecture at the transceivers gives
    further improvement in estimation error performance and achievable rates.

    Energy and Bursty Packet Loss Tradeoff over Fading Channels: A System Level Model

    M. Majid Butt, Eduard A. Jorswieck, Amr Mohamed
    Journal-ref: IEEE Systems Journal, 2016
    Subjects: Information Theory (cs.IT)

    Energy efficiency and quality of service (QoS) guarantees are the key design
    goals for the 5G wireless communication systems. In this context, we discuss a
    multiuser scheduling scheme over fading channels for loss tolerant
    applications. The loss tolerance of the application is characterized in terms
    of different parameters that contribute to quality of experience for the
    application. The mobile users are scheduled opportunistically such that a
    minimum QoS is guaranteed. We propose an opportunistic scheduling scheme and
    address the cross layer design framework when channel state information is not
    perfectly available at the transmitter and the receiver. We characterize the
    system energy as a function of different QoS and channel state estimation error
    parameters. The optimization problem is formulated using Markov chain framework
    and solved using stochastic optimization techniques. The results demonstrate
    that the parameters characterizing the packet loss are tightly coupled and
    relaxation of one parameter does not benefit the system much if the other
    constraints are tight. We evaluate the energy-performance trade-off numerically
    and show the effect of channel uncertainty on the packet scheduler design.

    New User-Irrepressible Protocol Sequences

    Yijin Zhang, Yuan-Hsun Lo, Kenneth W. Shum, Wing Shing Wong
    Comments: 15 pages, 3 tables
    Subjects: Information Theory (cs.IT); Combinatorics (math.CO)

    Protocol sequences are binary and periodic sequences used for deterministic
    multiple access in a collision channel without feedback. In this paper, we
    focus on user-irrepressible (UI) protocol sequences that can guarantee a
    positive individual throughput per sequence period with probability one for a
    slot-synchronous channel. As the sequence period has a fundamental impact on
    the worst-case channel access delay, a common objective of designing UI
    sequences is to make the sequence period as short as possible. Consider a
    communication channel that is shared by (M) active users, and assume that each
    protocol sequence has a constant Hamming weight (w). To attain a better delay
    performance than previously known UI sequences, this paper presents a CRTm
    construction of UI sequences with (w=M+1). For all non-prime (M\geq 8), our
    construction produces the shortest known sequence period of UI sequences.
    Simulation results further show that the new construction enjoys a better
    average delay performance than other constructions. In addition, we derive an
    asymptotic lower bound on the minimum sequence period for (w=M+1) if the
    sequence structure satisfies some technical conditions, called equi-difference,
    and prove the tightness of this lower bound by using the CRTm construction.

    Bayesian data assimilation based on a family of outer measures

    Jeremie Houssineau, Daniel E. Clark
    Subjects: Information Theory (cs.IT)

    A flexible representation of uncertainty that remains within the standard
    framework of probabilistic measure theory is presented along with a study of
    its properties. This representation relies on a specific type of outer measure
    that is based on the measure of a supremum, hence combining additive and highly
    sub-additive components. It is shown that this type of outer measure enables
    the introduction of intuitive concepts such as pullback and general data
    assimilation operations.

    A Unified Maximum Likelihood Approach for Optimal Distribution Property Estimation

    Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    The advent of data science has spurred interest in estimating properties of
    distributions over large alphabets. Fundamental symmetric properties such as
    support size, support coverage, entropy, and proximity to uniformity, received
    most attention, with each property estimated using a different technique and
    often intricate analysis tools.

    We prove that for all these properties, a single, simple, plug-in
    estimator—profile maximum likelihood (PML)—performs as well as the best
    specialized techniques. This raises the possibility that PML may optimally
    estimate many other symmetric properties.

    Dynamic TDD: the Asynchronous Case and the Synchronous Case

    Ming Ding, David Lopez Perez, Guoqiang Mao, Zihuai Lin
    Comments: 7 pages, 5 figures, submitted to IEEE ICC 2017
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Small cell networks (SCNs) are envisioned to embrace dynamic time division
    duplexing (TDD) in order to tailor downlink (DL)/uplink (UL) subframe resources
    to quick variations and burstiness of DL/UL traffic. The study of dynamic TDD
    is particularly important because it serves as the predecessor of the full
    duplex transmission technology, which has been identified as one of the
    candidate technologies for the 5th-generation (5G) networks. Up to now, the
    existing work on dynamic TDD only considered an asynchronous network scenario.
    However, the current 4G network is a synchronous one and it is very likely that
    the future 5G networks will follow the same system design philosophy for easy
    implementation. In this paper, for the first time, we consider dynamic TDD in
    synchronous networks and present analytical results on the probabilities of
    inter-cell inter-link interference and the DL/UL time resource utilization.
    Based on our analytical results, the area spectral efficiency is further
    investigated to shed new light on the performance and the deployment of dynamic
    TDD in future synchronous networks.

    Unit circle MVDR beamformer

    Saurav R Tuladhar, John R Buck
    Comments: Accepted to ICASSP 2015
    Subjects: Information Theory (cs.IT)

    The array polynomial is the z-transform of the array weights for a narrowband
    planewave beamformer using a uniform linear array (ULA). Evaluating the array
    polynomial on the unit circle in the complex plane yields the beampattern. The
    locations of the polynomial zeros on the unit circle indicate the nulls of the
    beampattern. For planewave signals measured with a ULA, the locations of the
    ensemble MVDR polynomial zeros are constrained on the unit circle. However,
    sample matrix inversion (SMI) MVDR polynomial zeros generally do not fall on
    the unit circle. The proposed unit circle MVDR (UC MVDR) projects the zeros of
    the SMI MVDR polynomial radially on the unit circle. This satisfies the
    constraint on the zeros of ensemble MVDR polynomial. Numerical simulations show
    that the UC MVDR beamformer suppresses interferers better than the SMI MVDR and
    the diagonal loaded MVDR beamformer and also improves the white noise gain
    (WNG).
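
    A small numpy sketch of the projection step described above: take the zeros
    of the sample (SMI) MVDR weight polynomial, push each zero radially onto the
    unit circle, rebuild the weights, and rescale toward the look-direction
    steering vector. The scaling and ordering conventions of the paper may
    differ:

    ```python
    import numpy as np

    def unit_circle_mvdr(w_smi, steering):
        """Radially project the zeros of the SMI MVDR array polynomial onto the
        unit circle and rebuild the beamformer weights."""
        zeros = np.roots(w_smi)              # zeros of the array polynomial
        zeros = zeros / np.abs(zeros)        # radial projection onto |z| = 1
        w_uc = np.poly(zeros)                # weights with projected zeros
        c = np.vdot(w_uc, steering)          # response toward the look direction
        return w_uc / np.conj(c)             # re-impose w^H v(theta0) = 1
    ```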

    Probabilistic Caching in Wireless D2D Networks: Cache Hit Optimal vs. Throughput Optimal

    Zheng Chen, Nikolaos Pappas, Marios Kountouris
    Comments: IEEE Communications Letters
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    Departing from the conventional cache hit optimization in cache-enabled
    wireless networks, we consider an alternative optimization approach for the
    probabilistic caching placement in stochastic wireless D2D caching networks
    taking into account the reliability of D2D transmissions. Using tools from
    stochastic geometry, we provide a closed-form approximation of cache-aided
    throughput, which measures the density of successfully served requests by local
    device caches, and we obtain the optimal caching probabilities with numerical
    optimization. Compared to the cache-hit-optimal case, the optimal caching
    probabilities obtained by cache-aided throughput optimization show notable gain
    in terms of the density of successfully served user requests, particularly in
    dense user environments.

    Mobile Lattice-Coded Physical-Layer Network Coding With Practical Channel Alignment

    Yihua Tan, Soung Chang Liew, Tao Huang
    Comments: 18 pages, 15 figures
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    The original concept of physical-layer network coding (PNC) was first
    proposed in a MobiCom challenge paper in 2006 as a new paradigm to boost the
    throughput of wireless relay networks. Since then, PNC has attracted a wide
    following within the research community. A high point of PNC research was a
    theoretical proof that the use of nested lattice codes in PNC could achieve the
    capacity of a two-way relay network to within half a bit. Many practical
    challenges, however, remain to be addressed before the full potential of
    lattice-coded PNC can be realized. Two major challenges are channel alignment
    of distributed nodes and complexity reduction of lattice encoding-decoding.
    This paper reports a first comprehensive implementation of a lattice-coded PNC
    system. Our contributions are twofold: 1) we design and demonstrate a
    low-overhead channel precoding scheme that can accurately align the channels of
    distributed nodes driven by independent low-cost temperature-compensated
    oscillators (TCXO); 2) we adapt the low-density lattice code (LDLC) for use in
    practical PNC systems. Our lattice-coded PNC implementation yields good
    throughput performance in a static line-of-sight (LoS) scenario and in mobile
    non-LoS scenarios.

    Minimum node degree in inhomogeneous random key graphs with unreliable links

    Rashad Eletreby, Osman Yağan
    Comments: In proceedings of the IEEE International Symposium on Information Theory (ISIT) 2016. arXiv admin note: text overlap with arXiv:1610.07576
    Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT); Probability (math.PR)

    We consider wireless sensor networks under a heterogeneous random key
    predistribution scheme and an on-off channel model. The heterogeneous key
    predistribution scheme has recently been introduced by Yağan – as an
    extension to the Eschenauer and Gligor scheme – for cases where the network
    consists of sensor nodes with varying levels of resources and/or connectivity
    requirements, e.g., regular nodes vs. cluster heads. The network is modeled by
    the intersection of the inhomogeneous random key graph (induced by the
    heterogeneous scheme) with an Erdős–Rényi graph (induced by the on/off
    channel model). We present conditions (in the form of zero-one laws) on how to
    scale the parameters of the intersection model so that with high probability
    all of its nodes are connected to at least (k) other nodes; i.e., the minimum
    node degree of the graph is no less than (k). We also present numerical results
    to support our results in the finite-node regime. The numerical results suggest
    that the conditions that ensure (k)-connectivity coincide with those ensuring
    the minimum node degree being no less than (k).



