
    arXiv Paper Daily: Thu, 15 Dec 2016

    Posted by 我爱机器学习 (52ml.net) on 2016-12-15 00:00:00

    Neural and Evolutionary Computing

    Stable Memory Allocation in the Hippocampus: Fundamental Limits and Neural Realization

    Wenlong Mou, Zhi Wang, Liwei Wang
    Subjects: Neural and Evolutionary Computing (cs.NE); Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    It is believed that the hippocampus functions as a memory allocator in the brain,
    though its mechanism remains unknown. In Valiant’s neuroidal model, the
    hippocampus was described as a randomly connected graph, the computation on
    which maps input to a set of activated neuroids with stable size. Valiant
    proposed three requirements for the hippocampal circuit to become a stable
    memory allocator (SMA): stability, continuity and orthogonality. The
    functionality of SMA in hippocampus is essential in further computation within
    cortex, according to Valiant’s model.

    In this paper, we put these requirements for memorization functions into
    rigorous mathematical formulation and introduce the concept of capacity, based
    on the probability of erroneous allocation. We prove fundamental limits for the
    capacity and error probability of SMA, in both data-independent and
    data-dependent settings. We also construct an example of a stable memory
    allocator that can be implemented via neuroidal circuits. Both theoretical
    bounds and simulation results show that the neural SMA functions well.

    Imposing higher-level Structure in Polyphonic Music Generation using Convolutional Restricted Boltzmann Machines and Constraints

    Stefan Lattner, Maarten Grachten, Gerhard Widmer
    Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    We introduce a method for imposing higher-level structure on generated,
    polyphonic music. A Convolutional Restricted Boltzmann Machine (C-RBM) as a
    generative model is combined with gradient descent constraint optimization to
    provide further control over the generation process. Among other things, this
    allows for the use of a “template” piece, from which some structural properties
    can be extracted, and transferred as constraints to newly generated material.
    The sampling process is guided with Simulated Annealing in order to avoid local
    optima, and find solutions that both satisfy the constraints, and are
    relatively stable with respect to the C-RBM. Results show that with this
    approach it is possible to control the higher level self-similarity structure,
    the meter, as well as tonal properties of the resulting musical piece while
    preserving its local musical coherence.


    Computer Vision and Pattern Recognition

    Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks

    Seyed A. Esmaeili, Bharat Singh, Larry S. Davis
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Fast-AT is an automatic thumbnail generation system based on deep neural
    networks. It is a fully-convolutional CNN, which learns specific filters for
    thumbnails of different sizes and aspect ratios. During inference, the
    appropriate filter is selected depending on the dimensions of the target
    thumbnail. Unlike most previous work, Fast-AT does not utilize saliency but
    addresses the problem directly. In addition, it eliminates the need to conduct
    region search on the saliency map. The model generalizes to thumbnails of
    different sizes including those with extreme aspect ratios and can generate
    thumbnails in real time. A data set of more than 70,000 thumbnail annotations
    was collected to train Fast-AT. We show competitive results in comparison to
    existing techniques.

    Spectral video construction from RGB video: Application to Image Guided Neurosurgery

    Md. Abul Hasnat, Jussi Parkkinen, Markku Hauta-Kasari
    Comments: Experiments were conducted in 2011, Paper rewritten with recent review in 2015
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Spectral imaging has received enormous interest in the field of medical
    imaging modalities. It provides a powerful tool for the analysis of different
    organs and non-invasive tissues. Therefore, a significant amount of research has
    been conducted to explore the possibility of using spectral imaging in
    biomedical applications. To observe spectral image information in real time
    during surgery and monitor the temporal changes in the organs and tissues is a
    demanding task. Available spectral imaging devices are not sufficient to
    accomplish this task with an acceptable spatial and spectral resolution. A
    solution to this problem is to estimate the spectral video from RGB video and
    perform visualization with the most prominent spectral bands. In this research,
    we propose a framework to generate neurosurgery spectral video from RGB video.
    A spectral estimation technique is applied to each RGB video frame. The RGB
    video is captured using a digital camera connected with an operational
    microscope dedicated to neurosurgery. A database of neurosurgery spectral
    images is used to collect training data and evaluate the estimation accuracy. A
    searching technique is used to identify the best training set. Five different
    spectrum estimation techniques are tested to identify the best method.
    Although this framework is established for neurosurgery spectral video
    generation, the methodology outlined here would also be applicable to
    other similar research.

    Registering large volume serial-section electron microscopy image sets for neural circuit reconstruction using FFT signal whitening

    Arthur W. Wetzel, Jennifer Bakal, Markus Dittrich, David G. C. Hildebrand, Josh L. Morgan, Jeff W. Lichtman
    Comments: 10 pages, 4 figures as submitted for the 2016 IEEE Applied Imagery and Pattern Recognition Workshop proceedings, Oct 18-20, 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The detailed reconstruction of neural anatomy for connectomics studies
    requires a combination of resolution and large three-dimensional data capture
    provided by serial section electron microscopy (ssEM). The convergence of high
    throughput ssEM imaging and improved tissue preparation methods now allows ssEM
    capture of complete specimen volumes up to cubic millimeter scale. The
    resulting multi-terabyte image sets span thousands of serial sections and must
    be precisely registered into coherent volumetric forms in which neural circuits
    can be traced and segmented. This paper introduces a Signal Whitening Fourier
    Transform Image Registration approach (SWiFT-IR) under development at the
    Pittsburgh Supercomputing Center and its use to align mouse and zebrafish brain
    datasets acquired using the wafer mapper ssEM imaging technology recently
    developed at Harvard University. Unlike other methods now used for ssEM
    registration, SWiFT-IR modifies its spatial frequency response during image
    matching to maximize a signal-to-noise measure used as its primary indicator of
    alignment quality. This alignment signal is more robust to rapid variations in
    biological content and unavoidable data distortions than either phase-only or
    standard Pearson correlation, thus allowing more precise alignment and
    statistical confidence. These improvements in turn enable an iterative
    registration procedure based on projections through multiple sections rather
    than more typical adjacent-pair matching methods. This projection approach,
    when coupled with known anatomical constraints and iteratively applied in a
    multi-resolution pyramid fashion, drives the alignment into a smooth form that
    properly represents complex and widely varying anatomical content such as the
    full cross-section zebrafish data.

    Beam Search for Learning a Deep Convolutional Neural Network of 3D Shapes

    Xu Xu, Sinisa Todorovic
    Comments: ICPR 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG)

    This paper addresses 3D shape recognition. Recent work typically represents a
    3D shape as a set of binary variables corresponding to 3D voxels of a uniform
    3D grid centered on the shape, and resorts to deep convolutional neural
    networks (CNNs) for modeling these binary variables. Robust learning of such
    CNNs is currently limited by the small datasets of 3D shapes available, an
    order of magnitude smaller than other common datasets in computer vision.
    Related work typically deals with the small training datasets using a number of
    ad hoc, hand-tuning strategies. To address this issue, we formulate CNN
    learning as a beam search aimed at identifying an optimal CNN architecture,
    namely, the number of layers, nodes, and their connectivity in the network, as
    well as estimating parameters of such an optimal CNN. Each state of the beam
    search corresponds to a candidate CNN. Two types of actions are defined to add
    new convolutional filters or new convolutional layers to a parent CNN, and thus
    transition to children states. The utility function of each action is
    efficiently computed by transferring parameter values of the parent CNN to its
    children, thereby enabling an efficient beam search. Our experimental
    evaluation on the 3D ModelNet dataset demonstrates that our model pursuit using
    the beam search yields a CNN that outperforms the state of the art on 3D
    shape classification.
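
    The search procedure described above can be illustrated with a generic beam
    search over architectures. This is only a sketch under stated assumptions:
    an architecture is represented as a tuple of filter counts per layer, the
    two actions widen a layer by 8 filters or append a new 8-filter layer, and
    `toy_score` is a hypothetical stand-in for the utility that the paper
    computes by transferring parent parameters to the children. None of these
    names or constants come from the paper.

```python
def beam_search(start, expand, score, beam_width, steps):
    """Keep the `beam_width` best states at every step; return the best seen."""
    beam = [start]
    best = start
    for _ in range(steps):
        # Collect all children of the current beam, dropping duplicates.
        candidates = {child for state in beam for child in expand(state)}
        if not candidates:
            break
        # Sort by score, breaking ties by the state itself for determinism.
        ranked = sorted(candidates, key=lambda s: (score(s), s), reverse=True)
        beam = ranked[:beam_width]
        if score(beam[0]) > score(best):
            best = beam[0]
    return best


def expand(arch):
    """Two action types: add filters to one layer, or add a new layer."""
    widen = [arch[:i] + (arch[i] + 8,) + arch[i + 1:] for i in range(len(arch))]
    deepen = [arch + (8,)]
    return widen + deepen


def toy_score(arch):
    """Hypothetical utility: prefers 3 layers with 16 filters each."""
    return -abs(len(arch) - 3) - sum(abs(w - 16) for w in arch) / 8


result = beam_search((8,), expand, toy_score, beam_width=2, steps=6)
print(result)
```

    In a real setting the score would come from validating each candidate CNN,
    which is why reusing the parent's parameters matters for efficiency.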

    Detect, Replace, Refine: Deep Structured Prediction For Pixel Wise Labeling

    Spyros Gidaris, Nikos Komodakis
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Pixel wise image labeling is an interesting and challenging problem with
    great significance in the computer vision community. In order for a dense
    labeling algorithm to be able to achieve accurate and precise results, it has
    to consider the dependencies that exist in the joint space of both the input
    and the output variables. An implicit approach for modeling those dependencies
    is to train a deep neural network that, given as input an initial estimate
    of the output labels and the input image, predicts a new
    refined estimate for the labels. In this context, our work is concerned with
    finding the optimal architecture for performing the label improvement task. We
    argue that the prior approaches of either directly predicting new label
    estimates or predicting residual corrections w.r.t. the initial labels with
    feed-forward deep network architectures are sub-optimal. Instead, we propose a
    generic architecture that decomposes the label improvement task into three steps:
    1) detecting the initial label estimates that are incorrect, 2) replacing the
    incorrect labels with new ones, and finally 3) refining the renewed labels by
    predicting residual corrections w.r.t. them. Furthermore, we explore and
    compare various other alternative architectures that consist of the
    aforementioned Detection, Replace, and Refine components. We extensively
    evaluate the examined architectures in the challenging task of dense disparity
    estimation (stereo matching) and we report both quantitative and qualitative
    results on three different datasets. Finally, our dense disparity estimation
    network, which implements the proposed generic architecture, achieves
    state-of-the-art results on the KITTI 2015 test, surpassing prior approaches by
    a significant margin.

    Attentive Explanations: Justifying Decisions and Pointing to the Evidence

    Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Bernt Schiele, Trevor Darrell, Marcus Rohrbach
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Deep models are the de facto standard in visual decision models due to their
    impressive performance on a wide array of visual tasks. However, they are
    frequently seen as opaque and are unable to explain their decisions. In
    contrast, humans can justify their decisions with natural language and point to
    the evidence in the visual world which led to their decisions. We postulate
    that deep models can do this as well and propose our Pointing and Justification
    (PJ-X) model which can justify its decision with a sentence and point to the
    evidence by introspecting its decision and explanation process using an
    attention mechanism. Unfortunately there is no dataset available with reference
    explanations for visual decision making. We thus collect two datasets in two
    domains where it is interesting and challenging to explain decisions. First, we
    extend the visual question answering task to not only provide an answer but
    also a natural language explanation for the answer. Second, we focus on
    explaining human activities which is traditionally more challenging than object
    classification. We extensively evaluate our PJ-X model, both on the
    justification and pointing tasks, by comparing it to prior models and ablations
    using both automatic and human evaluations.

    Super-resolution Reconstruction of SAR Image based on Non-Local Means Denoising Combined with BP Neural Network

    Zeling Wu, Haoxiang Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this article, we propose a super-resolution method to address the low
    spatial resolution of images caused by the limitations of imaging devices. We
    make use of the strong non-linear mapping ability of back-propagation neural
    networks (BPNN). Training sample images are obtained by undersampling. The
    inputs of the BPNN are pixels selected by Non-Local Means (NL-Means).
    Exploiting the self-similarity of images, these inputs are the pixels
    returned by a modified NL-Means tailored to super-resolution. In addition, a
    small change to the core function of NL-Means is applied so that edges in
    the shrunk image are clearer. Experimental results, measured by the Peak
    Signal to Noise Ratio (PSNR) and the Equivalent Number of Looks (ENL),
    indicate that adding the similar pixels as inputs improves the results
    compared with leaving them out.
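
    For readers unfamiliar with NL-Means, the following is a minimal sketch of
    the standard algorithm on a 1D signal, not the modified variant the
    abstract refers to. Each sample is replaced by a weighted average of nearby
    samples, with weights that decay with the squared distance between the
    patches around them. The function name, the border clamping and the default
    parameters (`patch_radius`, `search_radius`, `h`) are assumptions made for
    illustration.

```python
import math


def nl_means_1d(signal, patch_radius=1, search_radius=5, h=0.5):
    """Standard non-local means on a 1D list of floats (illustrative only)."""
    n = len(signal)

    def patch(i):
        # Patch centred on i; indices are clamped at the signal borders.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        weight_sum = 0.0
        value_sum = 0.0
        lo = max(0, i - search_radius)
        hi = min(n, i + search_radius + 1)
        for j in range(lo, hi):
            # Mean squared difference between the two patches.
            d2 = sum((a - b) ** 2 for a, b in zip(p_i, patch(j))) / len(p_i)
            w = math.exp(-d2 / (h * h))  # similar patches get larger weights
            weight_sum += w
            value_sum += w * signal[j]
        out.append(value_sum / weight_sum)
    return out


noisy_step = [0.0, 0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 0.95]
denoised = nl_means_1d(noisy_step)
print(denoised)
```

    Because every output is a convex combination of input samples, the result
    always stays within the range of the input.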

    UnrealStereo: A Synthetic Dataset for Analyzing Stereo Vision

    Yi Zhang, Weichao Qiu, Qi Chen, Xiaolin Hu, Alan Yuille
    Comments: Tech report
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Stereo algorithms are important for robotics applications such as quadcopters
    and autonomous driving. They need to be robust enough to handle images taken
    under challenging conditions, such as rain or strong lighting. Textureless and
    specular regions of these images make feature matching difficult and the
    smoothness assumption invalid. It is important to understand whether an
    algorithm is robust to these hazardous regions. Many stereo benchmarks have
    been developed to evaluate performance and track progress, but it is not easy
    to quantify the effect of these hazardous regions. In this paper, we develop a synthetic
    image generation tool and build a benchmark with synthetic images. First, we
    manually tweak hazardous factors in a virtual world, such as making objects
    more specular or transparent, to simulate corner cases to test the robustness
    of stereo algorithms. Second, we use ground truth information, such as object
    masks and material properties, to automatically identify hazardous regions and
    evaluate the accuracy in these regions. Our tool is based on the popular game
    engine Unreal Engine 4 and will be open-source. Many publicly available
    realistic game contents can be used by our tool, which can provide an enormous
    resource for algorithm development and evaluation.

    Harmonic Networks: Deep Translation and Rotation Equivariance

    Daniel E. Worrall, Stephan J. Garbin, Daniyar Turmukhambetov, Gabriel J. Brostow
    Comments: Submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Translating or rotating an input image should not affect the results of many
    computer vision tasks. Convolutional neural networks (CNNs) are already
    translation equivariant: input image translations produce proportionate feature
    map translations. This is not the case for rotations. Global rotation
    equivariance is typically sought through data augmentation, but patch-wise
    equivariance is more difficult. We present Harmonic Networks or H-Nets, a CNN
    exhibiting equivariance to patch-wise translation and 360° rotation. We achieve
    this by replacing regular CNN filters with circular harmonics, returning a
    maximal response and orientation for every receptive field patch.

    H-Nets use a rich, parameter-efficient and low computational complexity
    representation, and we show that deep feature maps within the network encode
    complicated rotational invariants. We demonstrate that our layers are general
    enough to be used in conjunction with the latest architectures and techniques,
    such as deep supervision and batch normalization. We also achieve
    state-of-the-art classification on rotated-MNIST, and competitive results on
    other benchmark challenges.

    Defining the Pose of any 3D Rigid Object and an Associated Metric

    Romain Brégier (IMAGINE), Frédéric Devernay (IMAGINE), Laetitia Leyrit, James Crowley
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Metric Geometry (math.MG); Classical Physics (physics.class-ph)

    A pose of a rigid object is usually regarded as a rigid transformation,
    described by a translation and a rotation. In this article, we define a pose as
    a distinguishable static state of the considered object, and show that the
    usual identification of the pose space with the space of rigid transformations
    is abusive, as it is not adapted to objects with proper symmetries. Based
    solely on geometric considerations, we propose a frame-invariant metric on the
    pose space, valid for any physical object, and requiring no arbitrary tuning.
    This distance can be evaluated efficiently thanks to a representation of poses
    within a low-dimensional Euclidean space, and enables efficient
    neighborhood queries such as radius searches or k-nearest neighbor searches
    within a large set of poses using off-the-shelf methods. Lastly, we solve the
    problems of projection from the Euclidean space onto the pose space, and of
    pose averaging for this metric. The practical value of those theoretical
    developments is illustrated with an application to pose estimation of instances
    of a 3D rigid object given an input depth map, via a Mean Shift procedure.

    The Mehler-Fock Transform and some Applications in Texture Analysis and Color Processing

    Reiner Lenz
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Many stochastic processes are defined on special geometrical objects like
    spheres and cones. We describe how tools from harmonic analysis, i.e. Fourier
    analysis on groups, can be used to investigate probability density functions
    (pdfs) on groups and homogeneous spaces. We consider the special case of the
    Lorentz group SU(1,1) and the unit disk with its hyperbolic geometry, but the
    procedure can be generalized to a much wider class of Lie-groups. We mainly
    concentrate on the Mehler-Fock transform which is the radial part of the
    Fourier transform on the disk. Some of the characteristic features of this
    transform are the relation to group-convolutions, the isometry between signal
    and transform space, the relation to the Laplace-Beltrami operator and the
    relation to group representation theory. We will give an overview of these
    properties and their applications in signal processing. We will illustrate the
    theory with two examples from low-level vision and color image processing.

    Permutation-equivariant neural networks applied to dynamics prediction

    Nicholas Guttenberg, Nathaniel Virgo, Olaf Witkowski, Hidetoshi Aoki, Ryota Kanai
    Comments: 7 pages, 4 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

    The introduction of convolutional layers greatly advanced the performance of
    neural networks on image tasks due to innately capturing a way of encoding and
    learning translation-invariant operations, matching one of the underlying
    symmetries of the image domain. In comparison, there are a number of problems
    in which there are a number of different inputs which are all ‘of the same
    type’ — multiple particles, multiple agents, multiple stock prices, etc. The
    corresponding symmetry to this is permutation symmetry, in that the algorithm
    should not depend on the specific ordering of the input data. We discuss a
    permutation-invariant neural network layer in analogy to convolutional layers,
    and show the ability of this architecture to learn to predict the motion of a
    variable number of interacting hard discs in 2D. In the same way that
    convolutional layers can generalize to different image sizes, the permutation
    layer we describe generalizes to different numbers of objects.
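
    The symmetry described above can be checked with a small sketch. It assumes
    scalar inputs and a pairwise form in which each output is the average over
    all objects of a shared function of the pair `(x_i, x_j)`. The function
    name, the `tanh` nonlinearity and the weights are hypothetical choices for
    illustration and are not taken from the paper.

```python
import math


def pairwise_permutation_layer(xs, w_self=0.7, w_other=-0.3, bias=0.1):
    """y_i = mean_j tanh(w_self * x_i + w_other * x_j + bias).

    The same weights are shared by every pair, so reordering the inputs
    reorders the outputs in the same way, and any number of inputs works.
    """
    n = len(xs)
    return [
        sum(math.tanh(w_self * xi + w_other * xj + bias) for xj in xs) / n
        for xi in xs
    ]


xs = [0.5, -1.0, 2.0]
ys = pairwise_permutation_layer(xs)

# Reorder the inputs as (x_2, x_0, x_1); the outputs follow the same order.
ys_permuted = pairwise_permutation_layer([xs[2], xs[0], xs[1]])
print(ys, ys_permuted)

# The same layer, with the same three weights, accepts five objects.
ys_five = pairwise_permutation_layer([0.1, 0.2, 0.3, 0.4, 0.5])
```

    Averaging over `j` is what removes the dependence on input order. Strictly
    speaking the outputs move with the inputs, so this sketch is equivariant
    rather than invariant.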

    Astronomical image reconstruction with convolutional neural networks

    Rémi Flamary
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Instrumentation and Methods for Astrophysics (astro-ph.IM); Machine Learning (stat.ML)

    State of the art methods in astronomical image reconstruction rely on the
    resolution of a regularized or constrained optimization problem. Solving this
    problem can be computationally intensive and usually leads to a quadratic or at
    least superlinear complexity w.r.t. the number of pixels in the image. We
    investigate in this work the use of convolutional neural networks for image
    reconstruction in astronomy. With neural networks, the computationally
    intensive task is the training step, but the prediction step has a fixed
    complexity per pixel, i.e. a linear complexity. Numerical experiments show that
    our approach is both computationally efficient and competitive with other state
    of the art methods in addition to being interpretable.

    Single Image Action Recognition using Semantic Body Part Actions

    Zhichen Zhao, Huimin Ma, Shaodi You
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose a novel single image action recognition algorithm
    which is based on the idea of semantic body part actions. Unlike existing
    bottom up methods, we argue that the human action is a combination of
    meaningful body part actions. In detail, we divide the human body into five parts:
    head, torso, arms, hands and legs. And for each of the body parts, we define
    several semantic body part actions, e.g., hand holding, hand waving. These
    semantic body part actions are strongly related to the body actions, e.g.,
    writing, and jogging. Based on the idea, we propose a deep neural network based
    system: first, body parts are localized by a Semi-FCN network. Second, for each
    body part, a Part Action Res-Net is used to predict semantic body part
    actions. Finally, we use an SVM to fuse the body part actions and predict the
    entire body action. Experiments on two datasets, PASCAL VOC 2012 and Stanford-40,
    report mAP improvements over the state of the art of 3.8% and 2.6%, respectively.

    Sparse Factorization Layers for Neural Networks with Limited Supervision

    Parker Koch, Jason J. Corso
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Whereas CNNs have demonstrated immense progress in many vision problems, they
    suffer from a dependence on monumental amounts of labeled training data. On the
    other hand, dictionary learning does not scale to the size of problems that
    CNNs can handle, despite being very effective at low-level vision tasks such as
    denoising and inpainting. Recently, interest has grown in adapting dictionary
    learning methods for supervised tasks such as classification and inverse
    problems. We propose two new network layers that are based on dictionary
    learning: a sparse factorization layer and a convolutional sparse factorization
    layer, analogous to fully-connected and convolutional layers, respectively.
    Using our derivations, these layers can be dropped into existing CNNs, trained
    together in an end-to-end fashion with back-propagation, and leverage
    semisupervision in ways classical CNNs cannot. We experimentally compare
    networks with these two new layers against a baseline CNN. Our results
    demonstrate that networks with either of the sparse factorization layers are
    able to outperform classical CNNs when supervised data are scarce. They also show
    performance improvements in certain tasks when compared to the CNN with no
    sparse factorization layers with the same exact number of parameters.

    Analysis of proposed PDE-based underwater image enhancement algorithms

    U. A. Nnolim
    Comments: 57 pages, 9 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This report describes the experimental analysis of proposed underwater image
    enhancement algorithms based on partial differential equations (PDEs). The
    algorithms perform simultaneous smoothing and enhancement due to the
    combination of both processes within the PDE-formulation. The framework enables
    the incorporation of suitable colour and contrast enhancement algorithms within
    one unified functional. Additional modification of the formulation includes the
    combination of the popular Contrast Limited Adaptive Histogram Equalization
    (CLAHE) with the proposed approach. This modification enables the hybrid
    algorithm to provide both local enhancement (due to the CLAHE) and global
    enhancement (due to the proposed contrast term). Additionally, the CLAHE clip
    limit parameter is computed dynamically in each iteration and used to gauge the
    amount of local enhancement performed by the CLAHE within the formulation. This
    enables the algorithm to reduce or prevent the enhancement of noisy artifacts,
    which if present, are also smoothed out by the anisotropic diffusion term
    within the PDE formulation. In other words, the modified algorithm combines the
    strength of the CLAHE, AD and the contrast term while minimizing their
    weaknesses. Ultimately, the system is optimized using image data metrics for
    automated enhancement and compromise between visual and quantitative results.
    Experiments indicate that the proposed algorithms perform a series of functions
    such as illumination correction, colour enhancement correction and restoration,
    contrast enhancement and noise suppression. Moreover, the proposed approaches
    surpass most other conventional algorithms found in the literature.

    Disentangling Space and Time in Video with Hierarchical Variational Auto-encoders

    Will Grathwohl, Aaron Wilson
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    There are many forms of feature information present in video data. Principal
    among them are object identity information, which is largely static across
    multiple video frames, and object pose and style information, which continuously
    transforms from frame to frame. Most existing models confound these two types
    of representation by mapping them to a shared feature space. In this paper we
    propose a probabilistic approach for learning separable representations of
    object identity and pose information using unsupervised video data. Our
    approach leverages a deep generative model with a factored prior distribution
    that encodes properties of temporal invariances in the hidden feature set.
    Learning is achieved via variational inference. We present results of learning
    identity and pose information on a dataset of moving characters as well as a
    dataset of rotating 3D objects. Our experimental results demonstrate our
    model’s success in factoring its representation, and demonstrate that the model
    achieves improved performance in transfer learning tasks.

    Finding Tiny Faces

    Peiyun Hu, Deva Ramanan
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Though tremendous strides have been made in object recognition, one of the
    remaining open challenges is detecting small objects. We explore three aspects
    of the problem in the context of finding small faces: the role of scale
    invariance, image resolution, and contextual reasoning. While most recognition
    approaches aim to be scale-invariant, the cues for recognizing a 3px tall face
    are fundamentally different than those for recognizing a 300px tall face. We
    take a different approach and train separate detectors for different scales. To
    maintain efficiency, detectors are trained in a multi-task fashion: they make
    use of features extracted from multiple layers of a single (deep) feature
    hierarchy. While training detectors for large objects is straightforward, the
    crucial challenge remains training detectors for small objects. We show that
    context is crucial, and define templates that make use of massively-large
    receptive fields (where 99% of the template extends beyond the object of
    interest). Finally, we explore the role of scale in pre-trained deep networks,
    providing ways to extrapolate networks tuned for limited scales to rather
    extreme ranges. We demonstrate state-of-the-art results on
    massively-benchmarked face datasets (FDDB and WIDER FACE). In particular, when
    compared to prior art on WIDER FACE, our results reduce error by a factor of 2
    (our models produce an AP of 81% while prior art ranges from 29-64%).
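
    The "factor of 2" claim can be checked from the quoted numbers. The sketch
    below assumes that error means 100% minus AP and that the comparison is
    against the best prior result (the 64% upper end of the quoted range).
    Both are readings of the abstract rather than definitions it states.

```python
# AP values in percent, taken from the abstract above.
prior_best_ap = 64  # upper end of the quoted 29-64% range
ours_ap = 81

# Assumed definition: error = 100 - AP.
error_prior = 100 - prior_best_ap  # 36
error_ours = 100 - ours_ap         # 19

reduction_factor = error_prior / error_ours
print(round(reduction_factor, 2))  # 1.89
```

    Under these assumptions the error drops from 36% to 19%, a factor of about
    1.9, which is consistent with the rounded figure in the abstract.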

    Deep Function Machines: Generalized Neural Networks for Topological Layer Expression

    William H. Guss
    Comments: Before empirical experiments–Preprint
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    In this paper we propose a generalization of deep neural networks called deep
    function machines (DFMs). DFMs act on vector spaces of arbitrary (possibly
    infinite) dimension and we show that a family of DFMs are invariant to the
    dimension of input data; that is, the parameterization of the model does not
    directly hinge on the quality of the input (e.g., high-resolution images). Using
    this generalization we provide a new theory of universal approximation of
    bounded non-linear operators between function spaces on locally compact Hausdorff
    spaces. We then suggest that DFMs provide an expressive framework for designing
    new neural network layer types with topological considerations in mind.
    Finally, we provide several examples of DFMs and in particular give a practical
    algorithm for neural networks approximating infinite dimensional operators.


    Artificial Intelligence

    Scalable Computation of Optimized Queries for Sequential Diagnosis

    Patrick Rodler, Wolfgang Schmid, Kostyantyn Shchekotykhin
    Subjects: Artificial Intelligence (cs.AI)

    In many model-based diagnosis applications it is impossible to provide a set
    of observations and/or measurements that identifies the real cause of a
    fault. Therefore, diagnosis systems often return many possible candidates,
    leaving the burden of selecting the correct diagnosis to a user. Sequential
    diagnosis techniques solve this problem by automatically generating a sequence
    of queries to some oracle. The answers to these queries provide additional
    information necessary to gradually restrict the search space by removing
    diagnosis candidates inconsistent with the answers.

    During query computation, existing sequential diagnosis methods often require
    the generation of many unnecessary query candidates and strongly rely on
    expensive logical reasoners. We tackle this issue by devising efficient
    heuristic query search methods. The proposed methods enable, for the first time,
    completely reasoner-free query generation while at the same time guaranteeing
    optimality conditions, e.g. minimal cardinality or best understandability, of
    the returned query that existing methods cannot realize. Hence, the performance
    of this approach is independent of the (complexity of the) diagnosed system.
    Experiments conducted using real-world problems show that the new approach is
    highly scalable and outperforms existing methods by orders of magnitude.
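
    The query-and-prune loop described in the first paragraph can be sketched as
    follows. This is only an illustration, not the authors' reasoner-free search:
    the candidate sets, the precomputed `predictions` table, and the
    split-in-half scoring rule are all assumptions made for the example.

```python
from typing import Callable, Dict, FrozenSet, List

Diagnosis = FrozenSet[str]  # a candidate: the set of components assumed faulty
# For every query, the answer that each candidate diagnosis predicts.
Predictions = Dict[str, Dict[Diagnosis, bool]]


def pick_query(candidates: List[Diagnosis], predictions: Predictions) -> str:
    """Choose the query whose predicted answers split the candidates most evenly."""
    def imbalance(query: str) -> int:
        yes = sum(1 for d in candidates if predictions[query][d])
        return abs(2 * yes - len(candidates))
    return min(sorted(predictions), key=imbalance)


def sequential_diagnosis(candidates: List[Diagnosis],
                         predictions: Predictions,
                         oracle: Callable[[str], bool]) -> List[Diagnosis]:
    """Ask queries until one candidate remains or no unused query is left."""
    remaining = list(candidates)
    unused = dict(predictions)
    while len(remaining) > 1 and unused:
        query = pick_query(remaining, unused)
        answer = oracle(query)
        table = unused.pop(query)
        # Drop every candidate that is inconsistent with the oracle's answer.
        remaining = [d for d in remaining if table[d] == answer]
    return remaining


# Hypothetical example: four candidates, two yes/no queries.
c0, c1, c2, c3 = (frozenset({"a"}), frozenset({"b"}),
                  frozenset({"c"}), frozenset({"a", "b"}))
example_predictions = {
    "q1": {c0: True, c1: True, c2: False, c3: False},
    "q2": {c0: True, c1: False, c2: True, c3: False},
}
true_answers = {"q1": True, "q2": False}  # what the oracle would say if c1 is real
result = sequential_diagnosis([c0, c1, c2, c3], example_predictions,
                              lambda q: true_answers[q])
```

    In this toy run each query halves the candidate set, so two answers are
    enough to isolate a single diagnosis.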

    Encapsulating models and approximate inference programs in probabilistic modules

    Marco F. Cusumano-Towner, Vikash K. Mansinghka
    Subjects: Artificial Intelligence (cs.AI)

    This paper introduces the probabilistic module interface, which allows
    encapsulation of complex probabilistic models with latent variables alongside
    custom stochastic approximate inference machinery, and provides a
    platform-agnostic abstraction barrier separating the model internals from the
    host probabilistic inference system. The interface can be seen as a stochastic
    generalization of a standard simulation and density interface for probabilistic
    primitives. We show that sound approximate inference algorithms can be
    constructed for networks of probabilistic modules, and we demonstrate that the
    interface can be implemented using learned stochastic inference networks and
    MCMC and SMC approximate inference programs.

    Real-time interactive sequence generation and control with Recurrent Neural Network ensembles

    Memo Akten, Mick Grierson
    Comments: Demo presentation at NIPS 2016, and poster presentation at the RNN Symposium at NIPS 2016. 7 pages including 1 page references, 1 page appendix, 2 figures
    Subjects: Artificial Intelligence (cs.AI)

    Recurrent Neural Networks (RNN), particularly Long Short Term Memory (LSTM)
    RNNs, are a popular and very successful method for learning and generating
    sequences. However, current generative RNN techniques do not allow real-time
    interactive control of the sequence generation process and are thus not well
    suited for live creative expression. We propose a method of real-time continuous
    control and ‘steering’ of sequence generation using an ensemble of RNNs and
    dynamically altering the mixture weights of the models. We demonstrate the
    method using character based LSTM networks and a gestural interface allowing
    users to ‘conduct’ the generation of text.
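
    One way to read "dynamically altering the mixture weights" is as a weighted
    average of each model's next-character distribution, re-weighted at every
    step by the user. The sketch below shows that reading only; the two
    hard-coded distributions stand in for LSTM outputs, and `mix` and `sample`
    are hypothetical helper names, not the authors' code.

```python
import random
from typing import Dict, Sequence

Distribution = Dict[str, float]  # character -> probability


def mix(distributions: Sequence[Distribution], weights: Sequence[float]) -> Distribution:
    """Weighted average of per-model next-character distributions."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one mixture weight must be positive")
    mixed: Distribution = {}
    for dist, weight in zip(distributions, weights):
        for char, prob in dist.items():
            mixed[char] = mixed.get(char, 0.0) + (weight / total) * prob
    return mixed


def sample(dist: Distribution, rng: random.Random) -> str:
    """Draw one character from a distribution."""
    chars = list(dist)
    return rng.choices(chars, weights=[dist[c] for c in chars], k=1)[0]


# Stand-ins for the outputs of two character-level models at one time step.
model_a = {"a": 0.9, "b": 0.1}
model_b = {"a": 0.2, "b": 0.8}

# The weights would come from the gestural interface and may change every step.
steered = mix([model_a, model_b], weights=[0.25, 0.75])
next_char = sample(steered, random.Random(0))
```

    Because the weights are normalized inside `mix`, the interface can pass raw
    control values without having to keep them summing to one.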

    Web-based Argumentation

    Kenrick
    Subjects: Artificial Intelligence (cs.AI)

    Assumption-Based Argumentation (ABA) is an argumentation framework that was
    proposed in the late 20th century. Since then, no solver has been implemented
    in a programming language that is easy to set up, and no solver has been
    interfaced to the web, which hampers public interest. This project aims to
    implement an ABA solver in a modern programming language that performs
    reasonably well and to interface it to the web for easier access by the
    public. The project’s novelty is an ABA solver, written in Python, that
    computes conflict-free, stable, admissible, grounded, ideal, and complete
    semantics and that can be used via an easy-to-use web interface for
    visualization of the argument and dispute trees. Experiments were conducted
    to determine the project’s best configuration and to compare the project with
    proxdd, a state-of-the-art ABA solver that has no web interface and computes
    fewer semantics. In these experiments, the best configuration was achieved by
    combining the “pickle” technique with a tree caching technique. With this
    configuration, the project achieved a lower average runtime than proxdd. On
    the other hand, the project encountered more cases with exceptions than
    proxdd, which might be because it computes more semantics and hence requires
    more resources. Overall, the project runs comparably well to the
    state-of-the-art ABA solver proxdd. Future work includes computational
    complexity and efficiency analysis of the implemented algorithms,
    implementation of more semantics of the argumentation framework, and
    usability testing of the web interface.

    Anomaly Detection Using the Knowledge-based Temporal Abstraction Method

    Asaf Shabtai
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    The rapid growth in stored time-oriented data necessitates the development of
    new methods for handling, processing, and interpreting large amounts of
    temporal data. One important example of such processing is detecting anomalies
    in time-oriented data. The Knowledge-Based Temporal Abstraction method was
    previously proposed for intelligent interpretation of temporal data based on
    predefined domain knowledge. In this study we propose a framework that
    integrates the KBTA method with a temporal pattern mining process for anomaly
    detection. In the proposed method, a temporal pattern mining process is
    applied to a dataset of basic temporal abstractions in order to
    extract patterns representing normal behavior. These patterns are then analyzed
    in order to identify abnormal time periods characterized by a significantly
    small number of normal patterns. The proposed approach was demonstrated using a
    dataset collected from a real server.
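
    The final step, flagging periods that contain unusually few normal patterns,
    might look like the sketch below. The per-window counts and the
    mean-minus-k-standard-deviations threshold are assumptions for illustration;
    the abstract does not specify how "significantly small" is decided.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def abnormal_windows(normal_pattern_counts: Sequence[int], k: float = 2.0) -> List[int]:
    """Return indices of time windows with unusually few normal patterns.

    A window is flagged when its count falls more than k population standard
    deviations below the mean count (an assumed rule, not the paper's).
    """
    mu = mean(normal_pattern_counts)
    sigma = pstdev(normal_pattern_counts)
    threshold = mu - k * sigma
    return [i for i, count in enumerate(normal_pattern_counts) if count < threshold]


# Number of mined "normal" patterns matched in each window (made-up values).
counts = [41, 39, 40, 42, 38, 40, 41, 5, 39, 40]
flagged = abnormal_windows(counts)
```

    With these made-up counts only the window at index 7 falls below the
    threshold.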

    Attentive Explanations: Justifying Decisions and Pointing to the Evidence

    Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Bernt Schiele, Trevor Darrell, Marcus Rohrbach
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

    Deep models are the de facto standard in visual decision models due to their
    impressive performance on a wide array of visual tasks. However, they are
    frequently seen as opaque and are unable to explain their decisions. In
    contrast, humans can justify their decisions with natural language and point to
    the evidence in the visual world which led to their decisions. We postulate
    that deep models can do this as well and propose our Pointing and Justification
    (PJ-X) model which can justify its decision with a sentence and point to the
    evidence by introspecting its decision and explanation process using an
    attention mechanism. Unfortunately, there is no dataset available with reference
    explanations for visual decision making. We thus collect two datasets in two
    domains where it is interesting and challenging to explain decisions. First, we
    extend the visual question answering task to not only provide an answer but
    also a natural language explanation for the answer. Second, we focus on
    explaining human activities which is traditionally more challenging than object
    classification. We extensively evaluate our PJ-X model, both on the
    justification and pointing tasks, by comparing it to prior models and ablations
    using both automatic and human evaluations.

    Sparse Factorization Layers for Neural Networks with Limited Supervision

    Parker Koch, Jason J. Corso
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Whereas CNNs have demonstrated immense progress in many vision problems, they
    suffer from a dependence on monumental amounts of labeled training data. On the
    other hand, dictionary learning does not scale to the size of problems that
    CNNs can handle, despite being very effective at low-level vision tasks such as
    denoising and inpainting. Recently, interest has grown in adapting dictionary
    learning methods for supervised tasks such as classification and inverse
    problems. We propose two new network layers that are based on dictionary
    learning: a sparse factorization layer and a convolutional sparse factorization
    layer, analogous to fully-connected and convolutional layers, respectively.
    Using our derivations, these layers can be dropped in to existing CNNs, trained
    together in an end-to-end fashion with back-propagation, and leverage
    semisupervision in ways classical CNNs cannot. We experimentally compare
    networks with these two new layers against a baseline CNN. Our results
    demonstrate that networks with either of the sparse factorization layers are
    able to outperform classical CNNs when supervised data are scarce. They also show
    performance improvements in certain tasks when compared to a CNN without
    sparse factorization layers that has exactly the same number of parameters.

    An argumentative agent-based model of scientific inquiry

    Annemarie Borg, Daniel Frey, Dunja Šešelja, Christian Straßer
    Comments: 14 pages, 3 figures
    Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

    In this paper we present an agent-based model (ABM) of scientific inquiry
    aimed at investigating how different social networks impact the efficiency of
    scientists in acquiring knowledge. As such, the ABM is a computational tool for
    tackling issues in the domain of scientific methodology and science policy. In
    contrast to existing ABMs of science, our model aims to represent the
    argumentative dynamics that underlie scientific practice. To this end we
    employ abstract argumentation theory as the core design feature of the model.
    This helps to avoid a number of problematic idealizations which are present in
    other ABMs of science and which impede their relevance for actual scientific
    practice.


    Information Retrieval

    User Model-Based Intent-Aware Metrics for Multilingual Search Evaluation

    Alexey Drutsa (Yandex, Moscow, Russia), Andrey Shutovich (Yandex, Moscow, Russia), Philipp Pushnyakov (Yandex, Moscow, Russia), Evgeniy Krokhalyov (Yandex, Moscow, Russia), Gleb Gusev (Yandex, Moscow, Russia), Pavel Serdyukov (Yandex, Moscow, Russia)
    Comments: 7 pages, 1 figure, 3 tables
    Journal-ref: NIPS 2016 Workshop “What If? Inference and Learning of
    Hypothetical and Counterfactual Interventions in Complex Systems” (What If
    2016) pre-print
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Learning (cs.LG); Machine Learning (stat.ML)

    Despite the growing importance of the multilingual aspect of web search, no
    appropriate offline metrics for evaluating its quality have been proposed so far. At the
    same time, personal language preferences can be regarded as intents of a query.
    This approach translates the multilingual search problem into a particular task
    of search diversification. Furthermore, the standard intent-aware approach
    could be adopted to build a diversified metric for multilingual search on the
    basis of a classical IR metric such as ERR. The intent-aware approach estimates
    user satisfaction under a user behavior model. We show, however, that the
    underlying user behavior model is not realistic in the multilingual case, and
    that the resulting intent-aware metric does not appropriately estimate user
    satisfaction. We develop a novel approach to building intent-aware user behavior
    models, which overcome these limitations and convert to quality metrics that
    better correlate with standard online metrics of user satisfaction.
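
    For reference, the standard intent-aware construction that the abstract
    starts from can be sketched for ERR as follows. The cascade formula for ERR
    and the probability-weighted average over intents are the commonly used
    definitions; the grade scale, the language-preference probabilities, and the
    example ranking are assumptions for illustration. This is the baseline
    metric the authors criticize, not their proposed one.

```python
from typing import Dict, Sequence


def err(grades: Sequence[int], max_grade: int = 4) -> float:
    """Expected Reciprocal Rank under the cascade user model."""
    score = 0.0
    p_continue = 1.0  # probability that the user reaches the current rank
    for rank, grade in enumerate(grades, start=1):
        p_satisfied = (2 ** grade - 1) / (2 ** max_grade)
        score += p_continue * p_satisfied / rank
        p_continue *= 1.0 - p_satisfied
    return score


def err_ia(grades_by_intent: Dict[str, Sequence[int]],
           intent_probs: Dict[str, float],
           max_grade: int = 4) -> float:
    """Intent-aware ERR: per-intent ERR weighted by P(intent | query)."""
    return sum(intent_probs[intent] * err(grades, max_grade)
               for intent, grades in grades_by_intent.items())


# One ranking of three documents, graded separately for users who prefer
# each language (hypothetical grades and preference probabilities).
grades_by_language = {"ru": [4, 0, 2], "en": [0, 4, 0]}
language_probs = {"ru": 0.7, "en": 0.3}
score = err_ia(grades_by_language, language_probs)
```

    Treating each preferred language as an intent is what turns multilingual
    evaluation into a diversification problem in this baseline.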

    You Are What You Eat… Listen to, Watch, and Read

    Mason Bretan
    Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)

    This article describes a data-driven method for deriving the relationship
    between personality and media preferences. A quantifiable representation of
    such a relationship can be leveraged in recommendation systems to
    ameliorate the “cold start” problem. Here, the data comprise an original
    collection of 1,316 OkCupid dating profiles. Of these profiles, 800 are labeled
    with one of 16 possible Myers-Briggs Type Indicators (MBTI). A personality
    specific topic model describing a person’s favorite books, movies, shows,
    music, and food was generated using latent Dirichlet allocation (LDA). There
    were several significant findings, for example, intuitive thinking types
    preferred sci-fi/fantasy entertainment, extraversion correlated positively with
    upbeat dance music, and jazz, folk, and international cuisine correlated
    positively with those characterized by openness to experience. Many other
    correlations confirmed previous findings describing the relationship among
    personality, writing style, and personal preferences. (For complete
    word/personality type associations, see the Appendix.)


    Computation and Language

    CoPaSul Manual – Contour-based parametric and superpositional intonation stylization

    Uwe D. Reichel
    Subjects: Computation and Language (cs.CL)

    The purposes of the CoPaSul toolkit are (1) automatic prosodic annotation and
    (2) prosodic feature extraction from syllable to utterance level. CoPaSul
    stands for contour-based, parametric, superpositional intonation stylization.
    In this framework intonation is represented as a superposition of global and
    local contours that are described parametrically in terms of polynomial
    coefficients. On the global level (usually associated with, but not necessarily
    restricted to, intonation phrases) the stylization serves to represent register
    in terms of time-varying F0 level and range. On the local level (e.g. accent
    groups), local contour shapes are described. From this parameterization several
    features related to prosodic boundaries and prominence can be derived.
    Furthermore, by coefficient clustering prosodic contour classes can be derived
    in a bottom-up way. In addition to the stylization-based feature extraction,
    standard F0 and energy measures (e.g. mean and variance) as well as rhythmic
    aspects can be calculated. Currently, automatic annotation comprises
    segmentation into interpausal chunks, syllable nucleus extraction, and
    unsupervised localization of prosodic phrase boundaries and prominent
    syllables. F0 and, in part, energy feature sets can be derived for standard
    measurements (such as median and IQR), register in terms of F0 level and range,
    prosodic boundaries, local contour shapes, bottom-up derived contour classes,
    Gestalt of accent groups in terms of their deviation from higher level prosodic
    units, as well as for rhythmic aspects quantifying the relation between F0 and
    energy contours and prosodic event rates.

    Incorporating Language Level Information into Acoustic Models

    Peidong Wang, Deliang Wang
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    This paper proposes a class of novel Deep Recurrent Neural Networks that can
    incorporate language-level information into acoustic models. For simplicity, we
    name these networks Recurrent Deep Language Networks (RDLNs). Multiple
    variants of RDLNs are considered, including two kinds of context information,
    two methods to process the context, and two methods to incorporate the
    language-level information. RDLNs provide possible methods to fine-tune the
    whole Automatic Speech Recognition (ASR) system in the acoustic modeling
    process.

    Multilingual Word Embeddings using Multigraphs

    Radu Soricut, Nan Ding
    Comments: 12 pages
    Subjects: Computation and Language (cs.CL)

    We present a family of neural-network–inspired models for computing
    continuous word representations, specifically designed to exploit both
    monolingual and multilingual text. This framework allows us to perform
    unsupervised training of embeddings that exhibit higher accuracy on syntactic
    and semantic compositionality, as well as multilingual semantic similarity,
    compared to previous models trained in an unsupervised fashion. We also show
    that such multilingual embeddings, optimized for semantic similarity, can
    improve the performance of statistical machine translation with respect to how
    it handles words not present in the parallel data.

    Unsupervised Clustering of Commercial Domains for Adaptive Machine Translation

    Mauro Cettolo, Mara Chinea Rios, Roldano Cattoni
    Comments: 9-page report on a summer internship at FBK
    Subjects: Computation and Language (cs.CL)

    In this paper, we report on domain clustering in the ambit of an adaptive MT
    architecture. A standard bottom-up hierarchical clustering algorithm has been
    instantiated with five different distances, which have been compared on an MT
    benchmark built on 40 commercial domains in terms of dendrograms and of
    intrinsic and extrinsic evaluations. The main outcome is that the most expensive
    distance is also the only one able to allow the MT engine to guarantee good
    performance even with few but highly populated clusters of domains.
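
    A bottom-up hierarchical clustering with a pluggable distance, as described
    above, can be sketched in a few lines. The abstract does not name the five
    distances or the linkage criterion, so the Jaccard distance over domain
    vocabularies, the average linkage, and the four toy domains below are all
    assumptions.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a & b| / |a | b|; an assumed example distance between vocabularies."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerative(domains: Dict[str, Set[str]],
                  distance: Callable[[Set[str], Set[str]], float],
                  num_clusters: int) -> List[List[str]]:
    """Bottom-up clustering with average linkage (the linkage is an assumption)."""
    clusters: List[List[str]] = [[name] for name in domains]

    def linkage(c1: List[str], c2: List[str]) -> float:
        total = sum(distance(domains[x], domains[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > max(num_clusters, 1):
        # Merge the two closest clusters; ties go to the earliest pair.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


toy_domains = {
    "legal": {"contract", "court", "law"},
    "patents": {"claim", "court", "law"},
    "it_support": {"server", "software", "install"},
    "software": {"software", "install", "license"},
}
clusters = agglomerative(toy_domains, jaccard_distance, num_clusters=2)
```

    Swapping `jaccard_distance` for another callable is all it takes to compare
    distances on the same data, which is the kind of comparison the report
    describes.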

    Recurrent Deep Stacking Networks for Speech Recognition

    Peidong Wang, Zhongqiu Wang, Deliang Wang
    Subjects: Computation and Language (cs.CL); Sound (cs.SD)

    This paper presents our work on applying Recurrent Deep Stacking Networks
    (RDSNs) to Robust Automatic Speech Recognition (ASR) tasks. In the paper, we
    also propose a more efficient yet comparable substitute for RDSN, the Bi-Pass
    Stacking Network (BPSN). The main idea of these two models is to add
    phoneme-level information into acoustic models, transforming an acoustic model
    into the combination of an acoustic model and a phoneme-level N-gram model.
    Experiments show that RDSN and BPSN can substantially improve
    performance over conventional DNNs.

    How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs

    Rico Sennrich
    Subjects: Computation and Language (cs.CL)

    Analysing translation quality with regard to specific linguistic phenomena has
    historically been difficult and time-consuming. Neural machine translation has
    the attractive property that it can produce scores for arbitrary translations,
    and we propose a novel method to assess how well NMT systems model specific
    linguistic phenomena such as agreement over long distances, the production of
    novel words, and the faithful translation of polarity. The core idea is that we
    measure whether a reference translation is more probable under an NMT model than
    a contrastive translation which introduces a specific type of error. We present
    LingEval90, a large-scale data set of 90000 contrastive translation pairs based
    on the WMT English→German translation task, with errors automatically created
    with simple rules. We report a number of baseline results, and find that
    recently introduced character-level NMT systems perform better at
    transliteration than models with BPE segmentation, but perform more poorly at
    morphosyntactic agreement, and translating discontiguous units of meaning.
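
    The evaluation idea reduces to a simple accuracy computation once a scoring
    function is available. In the sketch below, `score` is a hypothetical
    stand-in for an NMT model's log-probability of a translation given the
    source; the toy scorer and the example pairs are invented for illustration.

```python
from typing import Callable, Iterable, Tuple

# (source, reference translation, contrastive translation with one injected error)
Pair = Tuple[str, str, str]


def contrastive_accuracy(pairs: Iterable[Pair],
                         score: Callable[[str, str], float]) -> float:
    """Fraction of pairs where the reference outscores the contrastive variant."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one contrastive pair")
    correct = sum(1 for src, ref, contrastive in pairs
                  if score(src, ref) > score(src, contrastive))
    return correct / len(pairs)


def toy_score(source: str, translation: str) -> float:
    """Placeholder scorer that only knows about one agreement error."""
    return -1.0 if "Hunde bellt" in translation else 0.0


example_pairs = [
    ("the dogs bark", "die Hunde bellen", "die Hunde bellt"),  # agreement error
    ("the dog barks", "der Hund bellt", "der Hund bellen"),    # agreement error
]
accuracy = contrastive_accuracy(example_pairs, toy_score)
```

    The toy scorer catches the first error but not the second, so it gets half
    of the pairs right. Grouping pairs by error type before calling
    `contrastive_accuracy` would give one accuracy per linguistic phenomenon.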

    Neural Emoji Recommendation in Dialogue Systems

    Ruobing Xie, Zhiyuan Liu, Rui Yan, Maosong Sun
    Comments: 7 pages
    Subjects: Computation and Language (cs.CL)

    Emoji are an essential component of dialogues and are broadly used on almost
    all social platforms. They can express more delicate feelings than plain
    text and thus smooth communication between users, making dialogue
    systems more anthropomorphic and vivid. In this paper, we focus on
    automatically recommending appropriate emojis given the contextual information
    in multi-turn dialogue systems, where the challenge lies in understanding
    the whole conversation. More specifically, we propose the hierarchical long
    short-term memory model (H-LSTM) to construct dialogue representations,
    followed by a softmax classifier for emoji classification. We evaluate our
    models on the task of emoji classification on a real-world dataset, with some
    further exploration of parameter sensitivity and a case study. Experimental
    results demonstrate that our method achieves the best performance on all
    evaluation metrics. This indicates that our method captures the contextual
    information and emotion flow in dialogues well, which is significant for
    emoji recommendation.

    Grammatical Constraints on Intra-sentential Code-Switching: From Theories to Working Models

    Gayatri Bhat, Monojit Choudhury, Kalika Bali
    Comments: 13 pages
    Subjects: Computation and Language (cs.CL)

    We make one of the first attempts to build working models for
    intra-sentential code-switching based on the Equivalence-Constraint (Poplack
    1980) and Matrix-Language (Myers-Scotton 1993) theories. We conduct a detailed
    theoretical analysis, and a small-scale empirical study of the two models for
    Hindi-English CS. Our analyses show that the models are neither sound nor
    complete. Taking insights from the errors made by the models, we propose a new
    model that combines features of both theories.

    Mining Compatible/Incompatible Entities from Question and Answering via Yes/No Answer Classification using Distant Label Expansion

    Hu Xu, Lei Shu, Jingyuan Zhang, Philip S. Yu
    Comments: 9 pages, 1 figure
    Subjects: Computation and Language (cs.CL)

    Product Community Question Answering (PCQA) provides useful information about
    products and their features (aspects) that may not be well addressed by product
    descriptions and reviews. We observe that a product’s compatibility issues with
    other products are frequently discussed in PCQA and that such issues are more
    frequently raised for accessories, e.g., via a yes/no question such as “Does this
    mouse work with windows 10?”. In this paper, we address the problem of
    extracting compatible and incompatible products from yes/no questions in PCQA.
    This problem naturally lends itself to a two-stage framework: first, we perform
    Complementary Entity (product) Recognition (CER) on yes/no questions; second,
    we identify the polarities of yes/no answers to assign the complementary
    entities a compatibility label (compatible, incompatible or unknown). We
    leverage an existing unsupervised method for the first stage and, for the
    second stage, build a 3-class classifier by combining a distant PU-learning
    method (learning from positive and unlabeled examples) with a binary classifier.
    The benefit of using distant PU-learning is that it can help to expand labels
    to more implicit yes/no answers without using any human-annotated data. We
    conduct experiments on 4 products to show that the proposed method is effective.

    Hypernyms under Siege: Linguistically-motivated Artillery for Hypernymy Detection

    Vered Shwartz, Enrico Santus, Dominik Schlechtweg
    Comments: EACL 2017. 9 pages
    Subjects: Computation and Language (cs.CL)

    The fundamental role of hypernymy in NLP has motivated the development of
    many methods for the automatic identification of this relation, most of which
    rely on word distribution. We investigate an extensive number of such
    unsupervised measures, using several distributional semantic models that differ
    by context type and feature weighting. We analyze the performance of the
    different methods based on their linguistic motivation. Comparison to the
    state-of-the-art supervised methods shows that while supervised methods
    generally outperform the unsupervised ones, the former are sensitive to the
    distribution of training instances, hurting their reliability. Being based on
    general linguistic hypotheses and independent from training data, unsupervised
    measures are more robust, and therefore are still useful artillery for
    hypernymy detection.

    Improving Neural Language Models with a Continuous Cache

    Edouard Grave, Armand Joulin, Nicolas Usunier
    Comments: Submitted to ICLR 2017
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    We propose an extension to neural network language models to adapt their
    prediction to the recent history. Our model is a simplified version of memory
    augmented networks, which stores past hidden activations as memory and accesses
    them through a dot product with the current hidden activation. This mechanism
    is very efficient and scales to very large memory sizes. We also draw a link
    between the use of external memory in neural networks and cache models used with
    count-based language models. We demonstrate on several language model datasets
    that our approach performs significantly better than recent memory augmented
    networks.
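
    The cache mechanism described above can be illustrated with a small sketch.
    The abstract only states that stored hidden activations are matched against
    the current one by dot product; turning the scores into a distribution with
    an exponential, the scaling parameter `theta`, and the linear interpolation
    weight `lam` are assumptions here, as are the toy vectors.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def cache_distribution(memory: List[Tuple[Vector, str]],
                       hidden: Vector,
                       theta: float = 1.0) -> Dict[str, float]:
    """Distribution over the words stored in the cache.

    Each memory slot pairs a past hidden activation with the word that followed
    it. A slot's weight is exp(theta * <hidden, past_hidden>); weights of slots
    that share a word are summed, then everything is normalized.
    """
    if not memory:
        return {}
    scores: Dict[str, float] = {}
    for past_hidden, next_word in memory:
        weight = math.exp(theta * dot(hidden, past_hidden))
        scores[next_word] = scores.get(next_word, 0.0) + weight
    total = sum(scores.values())
    return {word: s / total for word, s in scores.items()}


def interpolate(p_model: Dict[str, float], p_cache: Dict[str, float],
                lam: float = 0.2) -> Dict[str, float]:
    """Mix the base model with the cache: (1 - lam) * model + lam * cache."""
    words = set(p_model) | set(p_cache)
    return {w: (1 - lam) * p_model.get(w, 0.0) + lam * p_cache.get(w, 0.0)
            for w in words}


memory = [([1.0, 0.0], "paris"), ([0.0, 1.0], "london"), ([1.0, 0.0], "paris")]
p_cache = cache_distribution(memory, hidden=[1.0, 0.0])
p_final = interpolate({"paris": 0.1, "london": 0.1, "the": 0.8}, p_cache)
```

    Because the cache holds plain activations and needs no training of its own,
    a sketch like this can sit on top of an already trained model.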


    Distributed, Parallel, and Cluster Computing

    Data allocation on disks with solution reconfiguration (problems, heuristics)

    Mark Sh. Levin
    Comments: 10 pages, 9 figures, 9 tables
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)

    The paper addresses the problem of data allocation in two-layer computer storage
    while taking into account dynamic digraph(s) over computing tasks. The basic
    version of data file allocation on parallel hard magnetic disks is considered
    as a special bin packing model. Two problems of allocation solution
    reconfiguration (restructuring) are suggested: (i) a one-stage restructuring
    model and (ii) multistage restructuring models. Solving schemes are based on
    simplified heuristics. Numerical examples illustrate problems and solving
    schemes.
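
    Since the abstract frames basic file allocation as a bin packing model, a
    minimal sketch may help. It uses the classical first-fit decreasing
    heuristic, which is an assumption here: the abstract does not say which
    simplified heuristics the paper uses. The names file_sizes and
    disk_capacity are hypothetical.

```python
def first_fit_decreasing(file_sizes, disk_capacity):
    """Place files on identical disks; returns one list of file sizes per disk."""
    disks = []  # files placed on each disk
    free = []   # remaining capacity of each disk
    for size in sorted(file_sizes, reverse=True):
        if size > disk_capacity:
            raise ValueError("file does not fit on any disk")
        for i, room in enumerate(free):
            if size <= room:
                # First disk with enough room takes the file.
                disks[i].append(size)
                free[i] -= size
                break
        else:
            # No existing disk had room, so open a new one.
            disks.append([size])
            free.append(disk_capacity - size)
    return disks
```

    Reconfiguration, as described in the abstract, would then amount to moving
    from one such allocation to another; that step is not sketched here.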

    On the Convergence of Asynchronous Parallel Iteration with Arbitrary Delays

    Zhimin Peng, Yangyang Xu, Ming Yan, Wotao Yin
    Comments: submitted to Journal of Machine Learning Research
    Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (math.NA); Machine Learning (stat.ML)

    Recent years have witnessed the surge of asynchronous parallel
    (async-parallel) iterative algorithms due to problems involving very
    large-scale data and a large number of decision variables. Because of
    asynchrony, the iterates are computed with outdated information, and the age of
    the outdated information, which we call delay, is the number of times it has
    been updated since its creation. Almost all recent works prove convergence
    under the assumption of a finite maximum delay and set their stepsize
    parameters accordingly. However, the maximum delay is practically unknown.

    This paper presents convergence analysis of an async-parallel method from a
    probabilistic viewpoint, and it allows for arbitrarily large delays. An
    explicit formula of stepsize that guarantees convergence is given depending on
    delays’ statistics. With \(p+1\) identical processors, we empirically measured
    that delays closely follow the Poisson distribution with parameter \(p\),
    matching our theoretical model, and thus the stepsize can be set accordingly.
    Simulations on both convex and nonconvex optimization problems demonstrate the
    validity of our analysis and also show that the existing maximum-delay induced
    stepsize is too conservative, often slowing down the convergence of the
    algorithm.


    Learning

    Anomaly Detection Using the Knowledge-based Temporal Abstraction Method

    Asaf Shabtai
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    The rapid growth in stored time-oriented data necessitates the development of
    new methods for handling, processing, and interpreting large amounts of
    temporal data. One important example of such processing is detecting anomalies
    in time-oriented data. The Knowledge-Based Temporal Abstraction method was
    previously proposed for intelligent interpretation of temporal data based on
    predefined domain knowledge. In this study we propose a framework that
    integrates the KBTA method with a temporal pattern mining process for anomaly
    detection. According to the proposed method, a temporal pattern mining process
    is applied to a database of basic temporal abstractions in order to
    extract patterns representing normal behavior. These patterns are then analyzed
    in order to identify abnormal time periods characterized by a significantly
    small number of normal patterns. The proposed approach was demonstrated using a
    dataset collected from a real server.

    An Architecture for Deep, Hierarchical Generative Models

    Philip Bachman
    Comments: Published in NIPS 2016
    Subjects: Learning (cs.LG)

    We present an architecture which lets us train deep, directed generative
    models with many layers of latent variables. We include deterministic paths
    between all latent variables and the generated output, and provide a richer set
    of connections between computations for inference and generation, which enables
    more effective communication of information throughout the model during
    training. To improve performance on natural images, we incorporate a
    lightweight autoregressive model in the reconstruction distribution. These
    techniques permit end-to-end training of models with 10+ layers of latent
    variables. Experiments show that our approach achieves state-of-the-art
    performance on standard image modelling benchmarks, can expose latent class
    structure in the absence of label information, and can provide convincing
    imputations of occluded regions in natural images.

    Predicting Process Behaviour using Deep Learning

    Joerg Evermann, Jana-Rebecca Rehse, Peter Fettke
    Comments: 34 pages, 10 figures
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    Predicting business process behaviour, such as the final state of a running
    process, the remaining time to completion or the next activity of a running
    process, is an important aspect of business process management. Motivated by
    research in natural language processing, this paper describes an application of
    deep learning with recurrent neural networks to the problem of predicting the
    next event in a business process. This is both a novel method in process
    prediction, which has largely relied on explicit process models, and also a
    novel application of deep learning methods. The approach is evaluated on two
    real datasets and our results surpass the state-of-the-art in prediction
    precision. The paper offers recommendations for researchers and practitioners
    and points out areas for future applications of deep learning in business
    process management.

    Deep Function Machines: Generalized Neural Networks for Topological Layer Expression

    William H. Guss
    Comments: Before empirical experiments–Preprint
    Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    In this paper we propose a generalization of deep neural networks called deep
    function machines (DFMs). DFMs act on vector spaces of arbitrary (possibly
    infinite) dimension and we show that a family of DFMs is invariant to the
    dimension of input data; that is, the parameterization of the model does not
    directly hinge on the quality of the input (e.g., high-resolution images). Using
    this generalization we provide a new theory of universal approximation of
    bounded non-linear operators between function spaces on locally compact Hausdorff
    spaces. We then suggest that DFMs provide an expressive framework for designing
    new neural network layer types with topological considerations in mind.
    Finally, we provide several examples of DFMs and in particular give a practical
    algorithm for neural networks approximating infinite dimensional operators.

    Detect, Replace, Refine: Deep Structured Prediction For Pixel Wise Labeling

    Spyros Gidaris, Nikos Komodakis
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Pixel wise image labeling is an interesting and challenging problem with
    great significance in the computer vision community. In order for a dense
    labeling algorithm to be able to achieve accurate and precise results, it has
    to consider the dependencies that exist in the joint space of both the input
    and the output variables. An implicit approach for modeling those dependencies
    is to train a deep neural network that, given as input an initial estimate
    of the output labels and the input image, is able to predict a new
    refined estimate for the labels. In this context, our work is concerned with
    what is the optimal architecture for performing the label improvement task. We
    argue that the prior approaches of either directly predicting new label
    estimates or predicting residual corrections w.r.t. the initial labels with
    feed-forward deep network architectures are sub-optimal. Instead, we propose a
    generic architecture that decomposes the label improvement task into three steps:
    1) detecting the initial label estimates that are incorrect, 2) replacing the
    incorrect labels with new ones, and finally 3) refining the renewed labels by
    predicting residual corrections w.r.t. them. Furthermore, we explore and
    compare various other alternative architectures that consist of the
    aforementioned Detection, Replace, and Refine components. We extensively
    evaluate the examined architectures in the challenging task of dense disparity
    estimation (stereo matching) and we report both quantitative and qualitative
    results on three different datasets. Finally, our dense disparity estimation
    network, which implements the proposed generic architecture, achieves
    state-of-the-art results on the KITTI 2015 test, surpassing prior approaches by
    a significant margin.

    Incorporating Language Level Information into Acoustic Models

    Peidong Wang, Deliang Wang
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

    This paper proposed a class of novel Deep Recurrent Neural Networks which can
    incorporate language-level information into acoustic models. For simplicity, we
    named these networks Recurrent Deep Language Networks (RDLNs). Multiple
    variants of RDLNs were considered, including two kinds of context information,
    two methods to process the context, and two methods to incorporate the
    language-level information. RDLNs provided possible methods to fine-tune the
    whole Automatic Speech Recognition (ASR) system in the acoustic modeling
    process.

    Stable Memory Allocation in the Hippocampus: Fundamental Limits and Neural Realization

    Wenlong Mou, Zhi Wang, Liwei Wang
    Subjects: Neural and Evolutionary Computing (cs.NE); Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    It is believed that the hippocampus functions as a memory allocator in the brain,
    the mechanism of which remains unrevealed. In Valiant’s neuroidal model, the
    hippocampus was described as a randomly connected graph, the computation on
    which maps input to a set of activated neuroids with stable size. Valiant
    proposed three requirements for the hippocampal circuit to become a stable
    memory allocator (SMA): stability, continuity and orthogonality. The
    functionality of SMA in the hippocampus is essential in further computation within
    cortex, according to Valiant’s model.

    In this paper, we put these requirements for memorization functions into
    rigorous mathematical formulation and introduce the concept of capacity, based
    on the probability of erroneous allocation. We prove fundamental limits for the
    capacity and error probability of SMA, in both data-independent and
    data-dependent settings. We also establish an example of a stable memory
    allocator that can be implemented via neuroidal circuits. Both theoretical
    bounds and simulation results show that the neural SMA functions well.

    Harmonic Networks: Deep Translation and Rotation Equivariance

    Daniel E. Worrall, Stephan J. Garbin, Daniyar Turmukhambetov, Gabriel J. Brostow
    Comments: Submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Translating or rotating an input image should not affect the results of many
    computer vision tasks. Convolutional neural networks (CNNs) are already
    translation equivariant: input image translations produce proportionate feature
    map translations. This is not the case for rotations. Global rotation
    equivariance is typically sought through data augmentation, but patch-wise
    equivariance is more difficult. We present Harmonic Networks or H-Nets, a CNN
    exhibiting equivariance to patch-wise translation and 360° rotation. We achieve
    this by replacing regular CNN filters with circular harmonics, returning a
    maximal response and orientation for every receptive field patch.

    H-Nets use a rich, parameter-efficient and low computational complexity
    representation, and we show that deep feature maps within the network encode
    complicated rotational invariants. We demonstrate that our layers are general
    enough to be used in conjunction with the latest architectures and techniques,
    such as deep supervision and batch normalization. We also achieve
    state-of-the-art classification on rotated-MNIST, and competitive results on
    other benchmark challenges.

    Retrieving sinusoids from nonuniformly sampled data using recursive formulation

    Ivan Maric
    Comments: 16 pages, 17 figures, Expert Systems with Applications – accepted
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    A heuristic procedure based on a novel recursive formulation of sinusoid (RFS)
    and on regression with predictive least-squares (LS) makes it possible to decompose both
    uniformly and nonuniformly sampled 1-d signals into a sparse set of sinusoids
    (SSS). An optimal SSS is found by Levenberg-Marquardt (LM) optimization of RFS
    parameters of near-optimal sinusoids combined with common criteria for the
    estimation of the number of sinusoids embedded in noise. The procedure
    estimates both the cardinality and the parameters of SSS. The proposed
    algorithm makes it possible to identify the RFS parameters of a sinusoid from a data
    sequence containing only a fraction of its cycle. In extreme cases when the
    frequency of a sinusoid approaches zero the algorithm is able to detect a
    linear trend in data. Also, an irregular sampling pattern enables the algorithm
    to correctly reconstruct the under-sampled sinusoid. The parsimonious nature of the
    obtained models opens the possibility of using the proposed method in
    machine learning and in expert and intelligent systems needing analysis and
    simple representation of 1-d signals. The properties of the proposed algorithm
    are evaluated on examples of irregularly sampled artificial signals in noise
    and are compared with high accuracy frequency estimation algorithms based on
    linear prediction (LP) approach, particularly with respect to Cramer-Rao Bound
    (CRB).

    Disentangling Space and Time in Video with Hierarchical Variational Auto-encoders

    Will Grathwohl, Aaron Wilson
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    There are many forms of feature information present in video data. Principal
    among them are object identity information which is largely static across
    multiple video frames, and object pose and style information which continuously
    transforms from frame to frame. Most existing models confound these two types
    of representation by mapping them to a shared feature space. In this paper we
    propose a probabilistic approach for learning separable representations of
    object identity and pose information using unsupervised video data. Our
    approach leverages a deep generative model with a factored prior distribution
    that encodes properties of temporal invariances in the hidden feature set.
    Learning is achieved via variational inference. We present results of learning
    identity and pose information on a dataset of moving characters as well as a
    dataset of rotating 3D objects. Our experimental results demonstrate our
    model’s success in factoring its representation, and demonstrate that the model
    achieves improved performance in transfer learning tasks.

    Improving Neural Language Models with a Continuous Cache

    Edouard Grave, Armand Joulin, Nicolas Usunier
    Comments: Submitted to ICLR 2017
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    We propose an extension to neural network language models to adapt their
    prediction to the recent history. Our model is a simplified version of memory
    augmented networks, which stores past hidden activations as memory and accesses
    them through a dot product with the current hidden activation. This mechanism
    is very efficient and scales to very large memory sizes. We also draw a link
    between the use of external memory in neural networks and cache models used with
    count based language models. We demonstrate on several language model datasets
    that our approach performs significantly better than recent memory augmented
    networks.
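
    The cache mechanism described above can be sketched in a few lines. This
    is a minimal illustration, not the authors' implementation: the
    scaling parameter theta, the interpolation weight lam, and the choice to
    mix the cache with the model distribution by linear interpolation are
    assumptions, as are all function names.

```python
import math

def cache_distribution(history, h_t, theta=1.0):
    """history: list of (hidden_vector, next_word) pairs stored so far.
    Each entry is weighted by exp(theta * <h_t, h_i>); weights for the same
    word are summed and the result is normalised over the cache."""
    if not history:
        return {}
    scores = {}
    for h_i, word in history:
        dot = sum(a * b for a, b in zip(h_t, h_i))
        scores[word] = scores.get(word, 0.0) + math.exp(theta * dot)
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

def interpolate(p_model, p_cache, lam=0.2):
    """Mix the language model's distribution with the cache distribution."""
    words = set(p_model) | set(p_cache)
    return {w: (1 - lam) * p_model.get(w, 0.0) + lam * p_cache.get(w, 0.0)
            for w in words}
```

    Because the cache only stores past activations and needs one dot product
    per entry, no extra parameters have to be trained in this sketch.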

    User Model-Based Intent-Aware Metrics for Multilingual Search Evaluation

    Alexey Drutsa (Yandex, Moscow, Russia), Andrey Shutovich (Yandex, Moscow, Russia), Philipp Pushnyakov (Yandex, Moscow, Russia), Evgeniy Krokhalyov (Yandex, Moscow, Russia), Gleb Gusev (Yandex, Moscow, Russia), Pavel Serdyukov (Yandex, Moscow, Russia)
    Comments: 7 pages, 1 figure, 3 tables
    Journal-ref: NIPS 2016 Workshop “What If? Inference and Learning of
    Hypothetical and Counterfactual Interventions in Complex Systems” (What If
    2016) pre-print
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Learning (cs.LG); Machine Learning (stat.ML)

    Despite the growing importance of the multilingual aspect of web search, no
    appropriate offline metrics to evaluate its quality have been proposed so far. At the
    same time, personal language preferences can be regarded as intents of a query.
    This approach translates the multilingual search problem into a particular task
    of search diversification. Furthermore, the standard intent-aware approach
    could be adopted to build a diversified metric for multilingual search on the
    basis of a classical IR metric such as ERR. The intent-aware approach estimates
    user satisfaction under a user behavior model. We show however that the
    underlying user behavior model is not realistic in the multilingual case, and
    the produced intent-aware metric does not appropriately estimate the user
    satisfaction. We develop a novel approach to build intent-aware user behavior
    models, which overcome these limitations and convert to quality metrics that
    better correlate with standard online metrics of user satisfaction.
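
    For reference, the standard intent-aware construction that the abstract
    takes as its starting point can be sketched as follows. This shows the
    usual cascade-model ERR and its intent-weighted average, not the new
    metric proposed in the paper; the grade scale (0 to max_grade) and the
    function names are assumptions.

```python
def err(grades, max_grade=4):
    """Expected Reciprocal Rank for one ranked list of relevance grades."""
    p_reach = 1.0  # probability the user reaches the current rank
    total = 0.0
    for rank, g in enumerate(grades, start=1):
        r = (2 ** g - 1) / 2 ** max_grade  # chance the user is satisfied here
        total += p_reach * r / rank
        p_reach *= 1 - r
    return total

def err_ia(intent_probs, grades_by_intent, max_grade=4):
    """Intent-aware ERR: ERR per intent, averaged with intent probabilities."""
    return sum(p * err(grades_by_intent[intent], max_grade)
               for intent, p in intent_probs.items())
```

    In the multilingual setting of the abstract, each intent would correspond
    to a language preference, with the same result list graded once per
    intent.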


    Information Theory

    The Capacity of Gaussian MIMO Channels Under Total and Per-Antenna Power Constraints

    S. Loyka
    Comments: accepted by IEEE Trans. Communications
    Subjects: Information Theory (cs.IT)

    The capacity of a fixed Gaussian multiple-input multiple-output (MIMO)
    channel and the optimal transmission strategy under the total power (TP)
    constraint and full channel state information are well-known. This problem
    remains open in the general case under individual per-antenna (PA) power
    constraints, while some special cases have been solved. These include a
    full-rank solution for the MIMO channel and a general solution for the
    multiple-input single-output (MISO) channel. In this paper, the fixed Gaussian
    MISO channel is considered and its capacity as well as optimal transmission
    strategies are determined in closed form under the joint total and
    per-antenna power constraints in the general case. In particular, the optimal
    strategy is hybrid and includes two parts: the first is equal-gain transmission and
    the second is maximum-ratio transmission, which are responsible for the PA and TP
    constraints respectively. The optimal beamforming vector is given in
    closed form and an accurate yet simple approximation to the capacity is
    proposed. Finally, the above results are extended to the MIMO case by
    establishing the ergodic capacity of fading MIMO channels under the joint power
    constraints when the fading distribution is right unitary-invariant (of which
    i.i.d. and semi-correlated Rayleigh fading are special cases). Unlike the fixed
    MISO case, the optimal signaling is shown to be isotropic in this case.

    Operating Massive MIMO in Unlicensed Bands for Enhanced Coexistence and Spatial Reuse

    Giovanni Geraci, Adrian Garcia-Rodriguez, David López-Pérez, Andrea Bonfante, Lorenzo Galati Giordano, Holger Claussen
    Subjects: Information Theory (cs.IT)

    We propose to operate massive multiple-input multiple-output (MIMO) cellular
    base stations (BSs) in unlicensed bands. We denote such a system as massive MIMO
    unlicensed (mMIMO-U). We design the key procedures required at a cellular BS to
    guarantee coexistence with nearby Wi-Fi devices operating in the same band. In
    particular, spatial reuse is enhanced by actively suppressing interference
    towards neighboring Wi-Fi devices. Wi-Fi interference rejection is also
    performed during an enhanced listen-before-talk (LBT) phase. These operations
    enable Wi-Fi devices to access the channel as though no cellular BSs were
    transmitting, and vice versa. Under concurrent Wi-Fi and BS transmissions, the
    downlink rates attainable by cellular user equipment (UEs) are degraded by the
    Wi-Fi-generated interference. To mitigate this effect, we select a suitable set
    of UEs to be served in the unlicensed band accounting for a measure of the
    Wi-Fi/UE proximity. Our results show that the so-designed mMIMO-U allows
    simultaneous cellular and Wi-Fi transmissions by keeping their mutual
    interference below the regulatory threshold. Compared to a system without
    interference suppression, Wi-Fi devices enjoy a median interference power
    reduction of between 3 dB with 16 antennas and 18 dB with 128 antennas. With
    mMIMO-U, cellular BSs can also achieve large data rates without significantly
    degrading the performance of Wi-Fi networks deployed within their coverage
    area.

    Lexicodes over Finite Principal Left Ideal Rings

    Jared Antrobus, Heide Gluesing-Luerssen
    Subjects: Information Theory (cs.IT)

    Let R be a finite principal left ideal ring. Via a total ordering of the ring
    elements and an ordered basis a lexicographic ordering of the module R^n is
    produced. This is used to set up a greedy algorithm that selects vectors for
    which all linear combinations with the previously selected vectors satisfy a
    pre-specified selection property and updates the to-be-constructed code to the
    linear hull of the vectors selected so far. The output is called a lexicode.
    This process was discussed earlier in the literature for fields and chain
    rings. In this paper we investigate the properties of such lexicodes over
    finite principal left ideal rings and show that the total ordering of the ring
    elements has to respect containment of ideals in order for the algorithm to
    produce meaningful results. Only then is it guaranteed that the algorithm is
    exhaustive and thus produces codes that are maximal with respect to inclusion.
    It is further illustrated that the output of the algorithm heavily depends on
    the total ordering and chosen basis.
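
    The greedy procedure is easiest to see over the binary field, the case the
    abstract says was treated earlier in the literature. The sketch below
    assumes that setting, with vectors encoded as integers, the numeric order
    as the lexicographic order, and "minimum Hamming weight at least d" as the
    selection property. It does not model the ring case or the ordering
    conditions studied in the paper.

```python
def lexicode(n, d):
    """Greedy binary lexicode of length n with minimum distance d.
    Returns (generators, code), where code is the set of all codewords."""
    def weight(v):
        return bin(v).count("1")

    code = {0}
    generators = []
    for v in range(1, 2 ** n):  # vectors in lexicographic order
        # Over F_2, the combinations involving v are exactly v + c, c in code.
        if all(weight(v ^ c) >= d for c in code):
            generators.append(v)
            code |= {v ^ c for c in code}  # update to the linear hull
    return generators, code
```

    For n = 7 and d = 3 this selects four generators and yields a 16-word code
    of minimum distance 3.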

    Lightweight compression with encryption based on Asymmetric Numeral Systems

    Jarek Duda, Marcin Niemiec
    Comments: 10 pages, 6 figures
    Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)

    Data compression combined with effective encryption is a common requirement
    of data storage and transmission. Low cost of these operations is often a high
    priority in order to increase transmission speed and reduce power usage. This
    requirement is crucial for battery-powered devices with limited resources, such
    as autonomous remote sensors or implants. Well-known and popular encryption
    techniques are frequently too expensive. This problem is on the increase as
    machine-to-machine communication and the Internet of Things are becoming a
    reality. Therefore, there is growing demand for finding trade-offs between
    security, cost and performance in lightweight cryptography. This article
    discusses Asymmetric Numeral Systems — an innovative approach to entropy
    coding which can be used for compression with encryption. It provides
    a compression ratio comparable to arithmetic coding at a speed similar to Huffman
    coding; hence, this coding is starting to replace both in new compressors.
    Additionally, by perturbing its coding tables, the Asymmetric Numeral System
    makes it possible to simultaneously encrypt the encoded message at nearly no
    additional cost. The article introduces this approach and analyzes its security
    level. The basic application is reducing the number of rounds of some cipher
    used on ANS-compressed data, or completely removing the additional encryption layer
    if a satisfactory protection level is reached.

    Performance Analysis for Training-Based Multi-Pair Two-Way Full-Duplex Relaying with Massive Antennas

    Zhanzhan Zhang, Zhiyong Chen, Manyuan Shen, Bin Xia, Weiliang Xie, Yong Zhao
    Comments: Accepted by IEEE Transactions on Vehicular Technology
    Subjects: Information Theory (cs.IT)

    This paper considers a multi-pair two-way amplify-and-forward relaying
    system, where multiple pairs of full-duplex users are served via a full-duplex
    relay with massive antennas, and the relay adopts maximum-ratio
    combining/maximum-ratio transmission (MRC/MRT) processing. The orthogonal pilot
    scheme and the least-squares method are first exploited to estimate the
    channel state information (CSI). When the number of relay antennas is finite,
    we derive an approximate sum rate expression which is shown to be a good
    predictor of the ergodic sum rate, especially for a large number of antennas. Then
    the corresponding achievable rate expression is obtained by adopting another
    pilot scheme which estimates the composite CSI for each user pair to reduce the
    pilot overhead of channel estimation. We analyze the achievable rates of the
    two pilot schemes and then show the relative merits of the two methods.
    Furthermore, power allocation strategies for users and the relay are proposed
    based on sum rate maximization and max-min fairness criterion, respectively.
    Finally, numerical results verify the accuracy of the analytical results and
    show the performance gains achieved by the proposed power allocation.

    Blind Measurement Selection: A Random Matrix Theory Approach

    Khalil Elkhalil, Abla Kammoun, Tareq Y. Al-Naffouri, Mohamed-Slim Alouini
    Subjects: Information Theory (cs.IT)

    This paper considers the problem of selecting a set of \(k\) measurements from
    \(n\) available sensor observations. The selected measurements should minimize a
    certain error function assessing the error in estimating a certain \(m\)
    dimensional parameter vector. The exhaustive search inspecting each of the
    \(\binom{n}{k}\) possible choices would require a very high computational
    complexity and as such is not practical for large \(n\) and \(k\). Alternative
    methods with low complexity have recently been investigated but their main
    drawbacks are that 1) they require perfect knowledge of the measurement matrix
    and 2) they need to be applied at the pace of change of the measurement matrix.
    To overcome these issues, we consider the asymptotic regime in which \(k\), \(n\)
    and \(m\) grow large at the same pace. Tools from random matrix theory are then
    used to approximate in closed-form the most important error measures that are
    commonly used. The asymptotic approximations are then leveraged to properly
    select \(k\) measurements exhibiting low values for the asymptotic error
    measures. Two heuristic algorithms are proposed: the first one merely consists
    in applying the convex optimization artifice to the asymptotic error measure.
    The second algorithm is a low-complexity greedy algorithm that attempts to look
    for a sufficiently good solution for the original minimization problem. The
    greedy algorithm can be applied to both the exact and the asymptotic error
    measures and can be thus implemented in blind and channel-aware fashions. We
    present two potential applications where the proposed algorithms can be used,
    namely antenna selection for uplink transmissions in large scale multi-user
    systems and sensor selection for wireless sensor networks. Numerical results
    are also presented and confirm the efficiency of the proposed blind methods in
    reaching the performances of channel-aware algorithms.
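
    A channel-aware greedy selection of the kind mentioned above might look
    like the following sketch. It assumes the error measure is the trace of
    the inverse Gram matrix of the selected rows (the least-squares
    mean-square error up to a noise factor) and adds a small ridge term eps so
    that the matrix stays invertible while fewer than m rows are selected.
    Both choices, and all names, are assumptions; the blind variant based on
    asymptotic approximations is not reproduced.

```python
def trace_inverse(a):
    """Trace of the inverse of a square matrix, via Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return sum(aug[i][n + i] for i in range(n))

def gram(rows, eps):
    """Regularised Gram matrix sum_r r^T r + eps * I of the given rows."""
    m = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) + (eps if i == j else 0.0)
             for j in range(m)] for i in range(m)]

def greedy_select(H, k, eps=1e-6):
    """Pick k row indices of H, each time adding the row that most reduces
    the trace of the inverse regularised Gram matrix."""
    selected = []
    for _ in range(k):
        remaining = [i for i in range(len(H)) if i not in selected]
        best = min(remaining, key=lambda i: trace_inverse(
            gram([H[j] for j in selected + [i]], eps)))
        selected.append(best)
    return selected
```

    Each greedy step evaluates at most n candidates, so the total cost grows
    with n times k rather than with the number of all possible subsets.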

    Retrieving sinusoids from nonuniformly sampled data using recursive formulation

    Ivan Maric
    Comments: 16 pages, 17 figures, Expert Systems with Applications – accepted
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    A heuristic procedure based on a novel recursive formulation of sinusoid (RFS)
    and on regression with predictive least-squares (LS) makes it possible to decompose both
    uniformly and nonuniformly sampled 1-d signals into a sparse set of sinusoids
    (SSS). An optimal SSS is found by Levenberg-Marquardt (LM) optimization of RFS
    parameters of near-optimal sinusoids combined with common criteria for the
    estimation of the number of sinusoids embedded in noise. The procedure
    estimates both the cardinality and the parameters of SSS. The proposed
    algorithm makes it possible to identify the RFS parameters of a sinusoid from a data
    sequence containing only a fraction of its cycle. In extreme cases when the
    frequency of a sinusoid approaches zero the algorithm is able to detect a
    linear trend in data. Also, an irregular sampling pattern enables the algorithm
    to correctly reconstruct the under-sampled sinusoid. The parsimonious nature of the
    obtained models opens the possibility of using the proposed method in
    machine learning and in expert and intelligent systems needing analysis and
    simple representation of 1-d signals. The properties of the proposed algorithm
    are evaluated on examples of irregularly sampled artificial signals in noise
    and are compared with high accuracy frequency estimation algorithms based on
    linear prediction (LP) approach, particularly with respect to Cramer-Rao Bound
    (CRB).

    New few weight codes from trace codes over a local Ring

    Shi Minjia, Qian Liqin, Sole Patrick
    Comments: 19 pages. arXiv admin note: text overlap with arXiv:1612.00128
    Subjects: Information Theory (cs.IT)

    In this paper, new few-weight linear codes over the local ring
    \(R=\mathbb{F}_p+u\mathbb{F}_p+v\mathbb{F}_p+uv\mathbb{F}_p\), with \(u^2=v^2=0\),
    \(uv=vu\), are constructed by using the trace function defined over an extension
    ring of degree \(m\). These trace codes have
    the algebraic structure of abelian codes. Their weight distributions are
    evaluated explicitly by means of Gaussian sums over finite fields. Two
    different defining sets are explored.

    Using a linear Gray map from \(R\) to \(\mathbb{F}_p^4\), we obtain several
    families of new \(p\)-ary codes from trace codes of dimension \(4m\). For the first
    defining set: when \(m\) is even, or \(m\) is odd and \(p\equiv 3 \pmod{4}\),
    we obtain a new family of two-weight codes, which are shown to be optimal by
    the application of the Griesmer bound; when \(m\) is even and under some special
    conditions, we obtain two new classes of three-weight codes. For the second
    defining set: we obtain a new class of two-weight codes and prove that it meets
    the Griesmer bound. In addition, we give the minimum distance of the dual code.
    Finally, applications of the \(p\)-ary image codes in secret sharing schemes are
    presented.
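
    The optimality claims above rest on the Griesmer bound, which says that a
    linear code of dimension k and minimum distance d over a field with q
    elements has length at least the sum of ceil(d / q^i) for i = 0, ..., k-1.
    The short sketch below evaluates that bound; the function names are
    hypothetical and the parameters of the paper's codes are not reproduced.

```python
def griesmer_length(k, d, q):
    """Smallest length allowed by the Griesmer bound for a q-ary [n, k, d] code."""
    # -(-d // x) is integer ceiling division.
    return sum(-(-d // q ** i) for i in range(k))

def meets_griesmer(n, k, d, q):
    """True when an [n, k, d] code over F_q attains the bound with equality."""
    return n == griesmer_length(k, d, q)
```

    For example, the binary [7, 4, 3] Hamming code attains the bound, since
    3 + 2 + 1 + 1 = 7.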

    Binary Linear Codes From Vectorial Boolean Functions and Their Weight Distribution

    Deng Tang, Claude Carlet, Zhengchun Zhou
    Comments: 30 pages
    Subjects: Information Theory (cs.IT)

    Binary linear codes with good parameters have important applications in
    secret sharing schemes, authentication codes, association schemes, and consumer
    electronics and communications. In this paper, we construct several classes of
    binary linear codes from vectorial Boolean functions and determine their
    parameters, by further studying a generic construction developed by Ding
    et al. recently. First, by employing perfect nonlinear functions and
    almost bent functions, we obtain several classes of six-weight linear codes
    which contain the all-one codeword. Second, we investigate a subcode of any
    linear code mentioned above and consider its parameters. When the vectorial
    Boolean function is a perfect nonlinear function or a Gold function in odd
    dimension, we can completely determine the weight distribution of this subcode.
    Besides, our linear codes have larger dimensions than the ones by Ding et al.’s
    generic construction.



