
    arXiv Paper Daily: Fri, 3 Feb 2017

    Published by 我爱机器学习 (52ml.net) on 2017-02-03 00:00:00

    Neural and Evolutionary Computing

    Dominance Move: A Measure of Comparing Solution Sets in Multiobjective Optimization

    Miqing Li, Xin Yao
    Comments: 23 pages, 10 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)

    One of the most common approaches for multiobjective optimization is to
    generate a solution set that well approximates the whole Pareto-optimal
    frontier to facilitate the later decision-making process. However, how to
    evaluate and compare the quality of different solution sets remains
    challenging. Existing measures typically require additional problem knowledge
    and information, such as a reference point or a substituted set of the
    Pareto-optimal frontier. In this paper, we propose a quality measure, called
    dominance move (DoM), to compare solution sets generated by multiobjective
    optimizers. Given two solution sets, DoM measures the minimum sum of move
    distances for one set to weakly Pareto dominate the other set. DoM can be seen
    as a natural reflection of the difference between two solution sets: it captures
    all aspects of solution set quality, complies with Pareto dominance, and
    requires no additional problem knowledge or parameters. We present an
    exact method to calculate the DoM in the biobjective case. We show the
    necessary condition of constructing the optimal partition for a solution set’s
    minimum move, and accordingly propose an efficient algorithm to recursively
    calculate the DoM. Finally, DoM is evaluated on several groups of artificial
    and real test cases as well as by a comparison with two well-established
    quality measures.
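
    The abstract does not spell out the computation, so the following is a minimal brute-force sketch of the idea for tiny bi-objective sets. Minimization and Euclidean move distance are assumptions here, and enumerating every assignment of points is purely illustrative; it is not the authors' efficient recursive algorithm.

```python
import itertools
import math

def move_cost(p, group):
    """Cheapest Euclidean move of point p so that it weakly dominates every q in group
    (minimization assumed): the optimal target is the component-wise minimum."""
    target = [min(p[i], min(q[i] for q in group)) for i in range(len(p))]
    return math.dist(p, target)

def dominance_move_bruteforce(P, Q):
    """Illustrative DoM: minimum total move distance for P to weakly dominate Q,
    found by enumerating every assignment of Q's points to P's points."""
    best = float("inf")
    for assignment in itertools.product(range(len(P)), repeat=len(Q)):
        groups = {}
        for qi, pi in enumerate(assignment):
            groups.setdefault(pi, []).append(Q[qi])
        cost = sum(move_cost(P[pi], g) for pi, g in groups.items())
        best = min(best, cost)
    return best

# Tiny bi-objective example (hypothetical objective vectors, both minimized)
P = [(1.0, 4.0), (3.0, 2.0)]
Q = [(0.5, 4.5), (2.5, 1.5)]
print(dominance_move_bruteforce(P, Q))
```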

    Scaling Properties of Human Brain Functional Networks

    Riccardo Zucca, Xerxes D. Arsiwalla, Hoang Le, Mikail Rubinov, Paul Verschure
    Comments: International Conference on Artificial Neural Networks – ICANN 2016
    Journal-ref: Artificial Neural Networks and Machine Learning, Lecture Notes in
    Computer Science, vol 9886, 2016
    Subjects: Neurons and Cognition (q-bio.NC); Disordered Systems and Neural Networks (cond-mat.dis-nn); Neural and Evolutionary Computing (cs.NE); Data Analysis, Statistics and Probability (physics.data-an)

    We investigate scaling properties of human brain functional networks in the
    resting-state. Analyzing network degree distributions, we statistically test
    whether their tails scale as power-law or not. Initial studies, based on
    least-squares fitting, were shown to be inadequate for precise estimation of
    power-law distributions. Subsequently, methods based on maximum-likelihood
    estimators have been proposed and applied to address this question.
    Nevertheless, no clear consensus has emerged, mainly because results have shown
    substantial variability depending on the data-set used or its resolution. In
    this study, we work with high-resolution data (10K nodes) from the Human
    Connectome Project and take into account network weights. We test for the
    power-law, exponential, log-normal and generalized Pareto distributions. Our
    results show that the statistics generally do not support a power-law, but
    instead these degree distributions tend towards the thin-tail limit of the
    generalized Pareto model. This may have implications for the number of hubs in
    human brain functional networks.
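
    As a rough illustration of the kind of tail comparison described, the sketch below fits candidate distributions to the upper tail of a degree sequence by maximum likelihood with scipy and compares their log-likelihoods. The synthetic data, the tail cutoff, and the use of generic continuous fits are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
degrees = rng.pareto(2.5, 10_000) + 1           # synthetic degree-like data (assumption)
xmin = np.quantile(degrees, 0.9)                # tail cutoff (assumption)
tail = degrees[degrees >= xmin]

candidates = {
    "power law (Pareto)": stats.pareto,
    "exponential": stats.expon,
    "log-normal": stats.lognorm,
    "generalized Pareto": stats.genpareto,
}

for name, dist in candidates.items():
    params = dist.fit(tail, floc=0)             # MLE with the location fixed at zero
    loglik = np.sum(dist.logpdf(tail, *params))
    print(f"{name:20s} log-likelihood = {loglik:.1f}")
```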

    Learning Criticality in an Embodied Boltzmann Machine

    Miguel Aguilera, Manuel G. Bedia
    Subjects: Adaptation and Self-Organizing Systems (nlin.AO); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)

    Many biological and cognitive systems do not operate deep into one or other
    regime of activity. Instead, they exploit critical surfaces poised at
    transitions in their parameter space. The pervasiveness of criticality in
    natural systems suggests that there may be general principles inducing this
    behaviour. However, there is a lack of conceptual models explaining how
    embodied agents propel themselves towards these critical points. In this paper,
    we present a learning model driving an embodied Boltzmann Machine towards
    critical behaviour by maximizing the heat capacity of the network. We test and
    corroborate the model by implementing an embodied agent in the mountain car
    benchmark, controlled by a Boltzmann Machine that adjusts its weights according
    to the model. We find that the neural controller reaches a point of
    criticality, which coincides with a transition point of the behaviour of the
    agent between two regimes of behaviour, maximizing the synergistic information
    between its sensors and the hidden and motor neurons. Finally, we discuss the
    potential of our learning model to study the contribution of criticality to the
    behaviour of embodied living systems in scenarios not necessarily constrained
    by biological restrictions of the examples of criticality we find in nature.

    Information-theoretic interpretation of tuning curves for multiple motion directions

    Wentao Huang, Xin Huang, Kechen Zhang
    Comments: The 51st Annual Conference on Information Sciences and Systems (CISS), 2017
    Subjects: Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC); Quantitative Methods (q-bio.QM)

    We have developed an efficient information-maximization method for computing
    the optimal shapes of tuning curves of sensory neurons by optimizing the
    parameters of the underlying feedforward network model. When applied to the
    problem of population coding of visual motion with multiple directions, our
    method yields several types of tuning curves with both symmetric and asymmetric
    shapes that resemble those found in the visual cortex. Our result
    suggests that the diversity or heterogeneity of tuning curve shapes as observed
    in neurophysiological experiments might actually constitute an optimal
    population representation of visual motions with multiple components.

    On orthogonality and learning recurrent networks with long term dependencies

    Eugene Vorontsov, Chiheb Trabelsi, Samuel Kadoury, Chris Pal
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    It is well known that it is challenging to train deep neural networks and
    recurrent neural networks for tasks that exhibit long term dependencies. The
    vanishing or exploding gradient problem is a well known issue associated with
    these challenges. One approach to addressing vanishing and exploding gradients
    is to use either soft or hard constraints on weight matrices so as to encourage
    or enforce orthogonality. Orthogonal matrices preserve gradient norm during
    backpropagation, and orthogonality can therefore be a desirable property; however, we find
    that hard constraints on orthogonality can negatively affect the speed of
    convergence and model performance. This paper explores the issues of
    optimization convergence, speed and gradient stability using a variety of
    different methods for encouraging or enforcing orthogonality. In particular we
    propose a weight matrix factorization and parameterization strategy through
    which we can bound matrix norms and therein control the degree of expansivity
    induced during backpropagation.
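
    For context, a common way to encourage (rather than enforce) orthogonality is to add a soft penalty on the recurrent weight matrix to the training loss. The sketch below shows that generic penalty in PyTorch; it is not the factorization and parameterization strategy proposed in the paper, and the penalty weight is an arbitrary choice.

```python
import torch

def orthogonality_penalty(W: torch.Tensor) -> torch.Tensor:
    """Soft orthogonality regularizer ||W^T W - I||_F^2."""
    eye = torch.eye(W.shape[1], device=W.device, dtype=W.dtype)
    return torch.linalg.norm(W.t() @ W - eye, ord="fro") ** 2

# Hypothetical usage inside a training step
rnn = torch.nn.RNN(input_size=32, hidden_size=64, batch_first=True)
x = torch.randn(8, 20, 32)
out, _ = rnn(x)
task_loss = out.pow(2).mean()        # stand-in for the real task loss
lam = 1e-3                           # penalty strength (assumption)
loss = task_loss + lam * orthogonality_penalty(rnn.weight_hh_l0)
loss.backward()
```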


    Computer Vision and Pattern Recognition

    Pixel Recursive Super Resolution

    Ryan Dahl, Mohammad Norouzi, Jonathon Shlens
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We present a pixel recursive super resolution model that synthesizes
    realistic details into images while enhancing their resolution. A low
    resolution image may correspond to multiple plausible high resolution images,
    thus modeling the super resolution process with a pixel independent conditional
    model often results in averaging different details–hence blurry edges. By
    contrast, our model is able to represent a multimodal conditional distribution
    by properly modeling the statistical dependencies among the high resolution
    image pixels, conditioned on a low resolution input. We employ a PixelCNN
    architecture to define a strong prior over natural images and jointly optimize
    this prior with a deep conditioning convolutional network. Human evaluations
    indicate that samples from our proposed model look more photo realistic than a
    strong L2 regression baseline.
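
    A tiny worked example of why a pixel-independent L2 model blurs: if two equally plausible high-resolution completions disagree on a pixel, the MSE-optimal point prediction is their average, which matches neither. The numbers below are invented purely for illustration.

```python
import numpy as np

# Two equally likely high-resolution hypotheses for the same 2x2 patch (made-up values)
hypothesis_a = np.array([[0.0, 1.0], [0.0, 1.0]])   # e.g. an edge on the right
hypothesis_b = np.array([[1.0, 0.0], [1.0, 0.0]])   # e.g. an edge on the left

# A pixel-independent model minimizing expected squared error predicts the mean...
mse_optimal = 0.5 * (hypothesis_a + hypothesis_b)
print(mse_optimal)   # [[0.5 0.5] [0.5 0.5]] -- a blurry patch matching neither hypothesis

# ...whereas a model of the joint (multimodal) distribution can commit to one sharp mode.
```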

    Maritime situational awareness using adaptive multi-sensor management under hazy conditions

    D. K. Prasad, C. K. Prasath, D. Rajan, L. Rachmawati, E. Rajabally, C. Quek
    Comments: 11 pages, 2 figures, MTEC 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a multi-sensor architecture with an adaptive multi-sensor
    management system suitable for control and navigation of autonomous maritime
    vessels in hazy and poor-visibility conditions. This architecture resides in
    the autonomous maritime vessels. It augments the data from on-board imaging
    sensors and weather sensors with the AIS data and weather data from sensors on
    other vessels and the on-shore vessel traffic surveillance system. The combined
    data is analyzed using computational intelligence and data analytics to
    determine a suitable course of action while utilizing historically learnt
    knowledge and performing live learning from the current situation. Such a
    framework is expected to be useful in diverse weather conditions and should be a
    useful architecture for providing autonomy to maritime vessels.

    Handwritten Recognition Using SVM, KNN and Neural Network

    Norhidayu Abdul Hamid, Nilam Nur Amir Sjarif
    Comments: 11 pages ; 22 Figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Handwriting recognition (HWR) is the ability of a computer to receive and
    interpret intelligible handwritten input from sources such as paper documents,
    photographs, touch-screens and other devices. In this paper we use three (3)
    classifiers to recognize the handwriting: SVM, KNN and Neural Network.
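
    The abstract names the three classifiers without implementation details; a minimal sketch of such a comparison on scikit-learn's built-in digits dataset (a stand-in for the authors' handwriting data, with default-ish hyperparameters assumed) might look like this.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

classifiers = {
    "SVM": SVC(kernel="rbf", gamma="scale"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: test accuracy = {clf.score(X_test, y_test):.3f}")
```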

    Learning a time-dependent master saliency map from eye-tracking data in videos

    Antoine Coutrot, Nathalie Guyader
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    To predict the most salient regions of complex natural scenes, saliency
    models commonly compute several feature maps (contrast, orientation, motion…)
    and linearly combine them into a master saliency map. Since feature maps have
    different spatial distribution and amplitude dynamic ranges, determining their
    contributions to overall saliency remains an open problem. Most
    state-of-the-art models do not take time into account and give feature maps
    constant weights across the stimulus duration. However, visual exploration is a
    highly dynamic process shaped by many time-dependent factors. For instance,
    some systematic viewing patterns such as the center bias are known to
    dramatically vary across the time course of the exploration. In this paper, we
    use maximum likelihood and shrinkage methods to dynamically and jointly learn
    feature map and systematic viewing pattern weights directly from eye-tracking
    data recorded on videos. We show that these weights systematically vary as a
    function of time, and heavily depend upon the semantic visual category of the
    videos being processed. Our fusion method allows taking these variations into
    account, and outperforms other state-of-the-art fusion schemes using constant
    weights over time. The code, videos and eye-tracking data we used for this
    study are available online:
    this http URL

    Side Information in Robust Principal Component Analysis: Algorithms and Applications

    Niannan Xue, Yannis Panagakis, Stefanos Zafeiriou
    Comments: Submitted to CVPR
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Robust rank minimisation aims at recovering a low-rank subspace from grossly
    corrupted high-dimensional (often visual) data and is a cornerstone in many
    machine learning and computer vision applications. The most prominent method
    for this task is the Robust Principal Component Analysis (PCA). It recovers a
    low-rank matrix from sparse corruptions of unknown magnitude and support by
    Principal Component Pursuit (PCP), which is a convex approximation to the
    otherwise NP-hard rank minimisation problem. Even though PCP has been shown to
    be very successful in solving many rank minimisation problems, there are cases
    where degenerate or suboptimal solutions are obtained. This can be attributed
    to the fact that domain-dependent prior knowledge is not taken into account by
    PCP. In this paper, we address the problem of PCP when prior information is
    available. To this end, we propose algorithms for solving the PCP problem with
    the aid of prior information on the low-rank structure of the data. The
    versatility of the proposed methods is demonstrated by applying them to four
    applications, namely background subtraction, facial image denoising, face and
    facial expression recognition. Experimental results on synthetic and five real
    world datasets indicate the robustness and effectiveness of the proposed
    methods on these application domains, largely outperforming previous approaches
    that incorporate side information within Robust PCA.
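
    For reference, the baseline PCP decomposition that the paper builds on is commonly solved with an inexact augmented Lagrangian / ADMM iteration: singular-value thresholding for the low-rank part and soft thresholding for the sparse part. The sketch below is that vanilla algorithm with common default parameters, not the proposed side-information variants.

```python
import numpy as np

def soft_threshold(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def robust_pca(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Vanilla Principal Component Pursuit: M ~ L (low rank) + S (sparse)."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else (m * n) / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)       # low-rank update
        S = soft_threshold(M - L + Y / mu, lam / mu)      # sparse update
        residual = M - L - S
        Y += mu * residual                                # dual update
        if np.linalg.norm(residual) <= tol * np.linalg.norm(M):
            break
    return L, S

# Synthetic sanity check: low-rank matrix plus sparse corruptions
rng = np.random.default_rng(0)
L0 = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 100))
S0 = (rng.random((100, 100)) < 0.05) * rng.standard_normal((100, 100)) * 10
L_hat, S_hat = robust_pca(L0 + S0)
print(np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))    # relative recovery error
```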

    A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network

    Xuanyang Xi, Yongkang Luo, Fengfu Li, Peng Wang, Hong Qiao
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Visual saliency detection aims at identifying the most visually distinctive
    parts in an image, and serves as a pre-processing step for a variety of
    computer vision and image processing tasks. To this end, the saliency detection
    procedure should be as fast and compact as possible, ideally processing input
    images in real time. However, contemporary detection methods often take
    hundreds of milliseconds in pursuit of marginal improvements in detection
    precision. In this paper, we tackle this problem by proposing a fast and
    compact salient score regression network which employs deep convolutional
    neural networks (CNN) to estimate the saliency of objects in images. It
    operates (including training and testing) in an end-to-end manner
    (image-to-image prediction) and directly produces whole saliency maps from
    original images without any pre-processing or post-processing. Compared
    with contemporary CNN-based saliency detection methods, the proposed method
    greatly simplifies the detection procedure and further improves the
    representational ability of the CNN for saliency detection. Our method is
    evaluated on six public datasets, and experimental results show that the
    precision can be comparable to the published state-of-the-art methods while the
    speed gets a significant improvement (35 FPS, processing in real time).

    Automating Image Analysis by Annotating Landmarks with Deep Neural Networks

    Mikhail Breslav, Tyson L. Hedrick, Stan Sclaroff, Margrit Betke
    Comments: 30 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Image and video analysis is often a crucial step in the study of animal
    behavior and kinematics. Often these analyses require that the positions of one
    or more animal landmarks be annotated (marked) in numerous images. The process
    of annotating landmarks can require a significant amount of time and tedious
    labor, which motivates the need for algorithms that can automatically annotate
    landmarks. In the community of scientists that use image and video analysis to
    study the 3D flight of animals, there has been a trend of developing more
    automated approaches for annotating landmarks, yet they fall short of being
    generally applicable. Inspired by the success of Deep Neural Networks (DNNs) on
    many problems in the field of computer vision, we investigate how suitable DNNs
    are for accurate and automatic annotation of landmarks in video datasets
    representative of those collected by scientists studying animals.

    Our work shows, through extensive experimentation on videos of hawkmoths,
    that DNNs are suitable for automatic and accurate landmark localization. In
    particular, we show that one of our proposed DNNs is more accurate than the
    current best algorithm for automatic localization of landmarks on hawkmoth
    videos. Moreover, we demonstrate how these annotations can be used to
    quantitatively analyze the 3D flight of a hawkmoth. To facilitate the use of
    DNNs by scientists from many different fields, we provide a self contained
    explanation of what DNNs are, how they work, and how to apply them to other
    datasets using the freely available library Caffe and supplemental code that we
    provide.

    Deep Learning the Indus Script

    Satish Palaniappan, Ronojoy Adhikari
    Comments: 17 pages, 10 figures, 7 supporting figures (2 pages)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    Standardized corpora of undeciphered scripts, a necessary starting point for
    computational epigraphy, require laborious human effort for their preparation
    from raw archaeological records. Automating this process through machine
    learning algorithms can be of significant aid to epigraphical research. Here,
    we take the first steps in this direction and present a deep learning pipeline
    that takes as input images of the undeciphered Indus script, as found in
    archaeological artifacts, and returns as output a string of graphemes, suitable
    for inclusion in a standard corpus. The image is first decomposed into regions
    using Selective Search and these regions are classified as containing textual
    and/or graphical information using a convolutional neural network. Regions
    classified as potentially containing text are hierarchically merged and trimmed
    to remove non-textual information. The remaining textual part of the image is
    segmented using standard image processing techniques to isolate individual
    graphemes. This set is finally passed to a second convolutional neural network
    to classify the graphemes, based on a standard corpus. The classifier can
    identify the presence or absence of the most frequent Indus grapheme, the “jar”
    sign, with an accuracy of 92%. Our results demonstrate the great potential of
    deep learning approaches in computational epigraphy and, more generally, in the
    digital humanities.

    Segmentation of optic disc, fovea and retinal vasculature using a single convolutional neural network

    Jen Hong Tan, U. Rajendra Acharya, Sulatha V. Bhandary, Kuang Chua Chua, Sobha Sivaprasad
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We have developed and trained a convolutional neural network to automatically
    and simultaneously segment optic disc, fovea and blood vessels. Fundus images
    were normalised before segmentation was performed to enforce consistency in
    background lighting and contrast. For every effective point in the fundus
    image, our algorithm extracted three channels of input from the neighbourhood
    of the point and forwarded the response across the 7-layer network. On average,
    our segmentation achieved an accuracy of 92.68 percent on the testing set from
    the DRIVE database.

    Solving Uncalibrated Photometric Stereo Using Fewer Images by Jointly Optimizing Low-rank Matrix Completion and Integrability

    Soumyadip Sengupta, Hao Zhou, Walter Forkel, Ronen Basri, Tom Goldstein, David W. Jacobs
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce a new, integrated approach to uncalibrated photometric stereo.
    We perform 3D reconstruction of Lambertian objects using multiple images
    produced by unknown, directional light sources. We show how to formulate a
    single optimization that includes rank and integrability constraints, allowing
    also for missing data. We then solve this optimization using the Alternate
    Direction Method of Multipliers (ADMM). We conduct extensive experimental
    evaluation on real and synthetic data sets. Our integrated approach is
    particularly valuable when performing photometric stereo using as few as 4-6
    images, since the integrability constraint is capable of improving estimation
    of the linear subspace of possible solutions. We show good improvements over
    prior work in these cases.

    Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper

    Luigi Nardi, Bruno Bodin, Sajad Saeedi, Emanuele Vespa, Andrew J. Davison, Paul H. J. Kelly
    Comments: 10 pages, Keywords: design space exploration, machine learning, computer vision, SLAM, embedded systems, GPU, crowd-sourcing
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Performance (cs.PF)

    In this paper we investigate an emerging application, 3D scene understanding,
    likely to be significant in the mobile space in the near future. The goal of
    this exploration is to reduce execution time while meeting our quality of
    result objectives. In previous work we showed for the first time that it is
    possible to map this application to power constrained embedded systems,
    highlighting that decision choices made at the algorithmic design-level have
    the most impact.

    As the algorithmic design space is too large to be exhaustively evaluated, we
    use a previously introduced multi-objective Random Forest Active Learning
    prediction framework dubbed HyperMapper, to find good algorithmic designs. We
    show that HyperMapper generalizes on a recent cutting edge 3D scene
    understanding algorithm and on a modern GPU-based computer architecture.
    HyperMapper is able to automatically beat an expert human hand-tuning the
    algorithmic parameters of the class of computer vision applications considered
    in this paper. In addition, we use crowd-sourcing via a 3D scene understanding
    Android app to show that the Pareto front
    obtained on an embedded system can be used to accelerate the same application
    on all the 83 smart-phones and tablets crowd-sourced with speedups ranging from
    2 to over 12.
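
    A heavily simplified sketch of the surrogate-modelling idea behind such a framework: fit a random-forest predictor per objective on a handful of measured configurations, predict both objectives over the rest of the design space, and keep the predicted Pareto front. The design space, the synthetic objectives, and the absence of the active-learning loop are all simplifications relative to HyperMapper itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical algorithmic design space: 3 numeric knobs, 500 candidate configurations
designs = rng.random((500, 3))

# Pretend we measured runtime and error for 50 of them (synthetic ground truth)
measured = rng.choice(len(designs), 50, replace=False)
runtime = designs[measured, 0] + 0.2 * designs[measured, 1] + 0.05 * rng.random(50)
error = 1.0 - designs[measured, 1] + 0.1 * designs[measured, 2] + 0.05 * rng.random(50)

# One random-forest surrogate per objective
rt_model = RandomForestRegressor(n_estimators=100, random_state=0).fit(designs[measured], runtime)
err_model = RandomForestRegressor(n_estimators=100, random_state=0).fit(designs[measured], error)
pred = np.column_stack([rt_model.predict(designs), err_model.predict(designs)])

def pareto_mask(points):
    """True for points not dominated by any other point (both objectives minimized)."""
    dominates = (points[None, :, :] <= points[:, None, :]).all(-1) & \
                (points[None, :, :] < points[:, None, :]).any(-1)
    return ~dominates.any(axis=1)

print("Predicted Pareto-optimal configurations:", np.flatnonzero(pareto_mask(pred)))
```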

    Learning to Compose with Professional Photographs on the Web

    Yi-Ling Chen, Jan Klopp, Min Sun, Shao-Yi Chien, Kwan-Liu Ma
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Photo composition is an important factor affecting the aesthetics in
    photography. However, it is a highly challenging task to model the aesthetic
    properties of good compositions due to the lack of globally applicable rules to
    the wide variety of photographic styles. Inspired by the thinking process of
    photo taking, we treat the photo composition problem as a view finding process
    which successively examines pairs of views and determines the aesthetic
    preference. Without devising complex hand-crafted features, the ranking model
    is built upon a deep convolutional neural network through joint representation
    learning from raw pixels. Exploiting rich professional photographs on the web
    as a data source, we devise a nearly unsupervised approach to generate unlimited
    high-quality image pairs for training the network. The resulting ranking model
    is generic and requires no heuristics. The experimental results show that the
    proposed view finding network achieves state-of-the-art performance with a simple
    sliding window search strategy on two image cropping datasets.

    HashNet: Deep Learning to Hash by Continuation

    Zhangjie Cao, Mingsheng Long, Jianmin Wang, Philip S. Yu
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    Learning to hash has been widely applied to approximate nearest neighbor
    search for large-scale multimedia retrieval, due to its computation efficiency
    and retrieval quality. Deep learning to hash, which improves retrieval quality
    by end-to-end representation learning and hash encoding, has received
    increasing attention recently. Subject to the vanishing gradient difficulty in
    the optimization with binary activations, existing deep learning to hash
    methods need to first learn continuous representations and then generate binary
    hash codes in a separate binarization step, which suffers from a substantial loss
    of retrieval quality. This paper presents HashNet, a novel deep architecture
    for deep learning to hash by a continuation method, which learns exactly binary
    hash codes from imbalanced similarity data where the number of similar pairs is
    much smaller than the number of dissimilar pairs. The key idea is to attack the
    vanishing gradient problem in optimizing deep networks with non-smooth binary
    activations by a continuation method, in which we begin by learning an easier
    network with a smoothed activation function and let it evolve during
    training, until it eventually becomes the original, difficult-to-optimize
    deep network with the sign activation function. Comprehensive
    empirical evidence shows that HashNet can generate exactly binary hash codes
    and yield state-of-the-art multimedia retrieval performance on standard
    benchmarks.
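
    Read literally, the continuation idea described amounts to replacing the non-smooth sign activation with a smooth surrogate whose sharpness grows during training. A minimal numeric sketch of that schedule (scaled tanh, with an invented beta schedule) is shown below; the actual architecture and schedule are those of the paper, not this toy.

```python
import numpy as np

def smoothed_sign(z, beta):
    """Smooth surrogate tanh(beta * z); approaches sign(z) as beta grows."""
    return np.tanh(beta * z)

z = np.linspace(-1.0, 1.0, 9)
for beta in [1, 5, 25, 100]:                     # an assumed continuation schedule
    gap = np.max(np.abs(smoothed_sign(z, beta) - np.sign(z)))
    print(f"beta={beta:4d}  max |tanh(beta*z) - sign(z)| = {gap:.3f}")
```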


    Artificial Intelligence

    Two forms of minimality in ASPIC+

    Zimi Li, Andrea Cohen, Simon Parsons
    Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)

    Many systems of structured argumentation explicitly require that the facts
    and rules that make up the argument for a conclusion be the minimal set
    required to derive the conclusion. ASPIC+ does not place such a requirement on
    arguments, instead requiring that every rule and fact that are part of an
    argument be used in its construction. Thus ASPIC+ arguments are minimal in the
    sense that removing any element of the argument would lead to a structure that
    is not an argument. In this brief note we discuss these two types of minimality
    and show how the first kind of minimality can, if desired, be recovered in
    ASPIC+.

    Procedural Content Generation via Machine Learning (PCGML)

    Adam Summerville, Sam Snodgrass, Matthew Guzdial, Christoffer Holmgård, Amy K. Hoover, Aaron Isaksen, Andy Nealen, Julian Togelius
    Subjects: Artificial Intelligence (cs.AI)

    This survey explores Procedural Content Generation via Machine Learning
    (PCGML), defined as the generation of game content using machine learning
    models trained on existing content. As the importance of PCG for game
    development increases, researchers explore new avenues for generating
    high-quality content with or without human involvement; this paper addresses
    the relatively new paradigm of using machine learning (in contrast with
    search-based, solver-based, and constructive methods). We focus on what is most
    often considered functional game content such as platformer levels, game maps,
    interactive fiction stories, and cards in collectible card games, as opposed to
    cosmetic content such as sprites and sound effects. In addition to using PCG
    for autonomous generation, co-creativity, mixed-initiative design, and
    compression, PCGML is suited for repair, critique, and content analysis because
    of its focus on modeling existing content. We discuss various data sources and
    representations that affect the resulting generated content. Multiple PCGML
    methods are covered, including neural networks, long short-term memory (LSTM)
    networks, autoencoders, and deep convolutional networks; Markov models,
    (n)-grams, and multi-dimensional Markov chains; clustering; and matrix
    factorization. Finally, we discuss open problems in the application of PCGML,
    including learning from small datasets, lack of training data, multi-layered
    learning, style-transfer, parameter tuning, and PCG as a game mechanic.

    Multilingual and Cross-lingual Timeline Extraction

    Egoitz Laparra, Rodrigo Agerri, Itziar Aldabe, German Rigau
    Comments: 20 pages, 7 tables, 7 figures; submitted to Knowledge Based Systems (Elsevier), January, 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    In this paper we present an approach to extract ordered timelines of events,
    their participants, locations and times from a set of multilingual and
    cross-lingual data sources. Based on the assumption that event-related
    information can be recovered from different documents written in different
    languages, we extend the Cross-document Event Ordering task presented at
    SemEval 2015 by specifying two new tasks for, respectively, Multilingual and
    Cross-lingual Timeline Extraction. We then develop three deterministic
    algorithms for timeline extraction based on two main ideas. First, we address
    implicit temporal relations at document level since explicit time-anchors are
    too scarce to build a wide coverage timeline extraction system. Second, we
    leverage several multilingual resources to obtain a single, inter-operable,
    semantic representation of events across documents and across languages. The
    result is a highly competitive system that strongly outperforms the current
    state-of-the-art. Nonetheless, further analysis of the results reveals that
    linking the event mentions with their target entities and time-anchors remains
    a difficult challenge. The systems, resources and scorers are freely available
    to facilitate their use and guarantee the reproducibility of results.


    Information Retrieval

    Semantic URL Analytics to Support Efficient Annotation of Large Scale Web Archives

    Tarcisio Souza, Elena Demidova, Thomas Risse, Helge Holzmann, Gerhard Gossen, Julian Szymanski
    Subjects: Information Retrieval (cs.IR)

    Long-term Web archives comprise Web documents gathered over longer time
    periods and can easily reach hundreds of terabytes in size. Semantic
    annotations such as named entities can facilitate intelligent access to the Web
    archive data. However, the annotation of the entire archive content on this
    scale is often infeasible. The most efficient way to access the documents
    within Web archives is provided through their URLs, which are typically stored
    in dedicated index files. The URLs of the archived Web documents can contain
    semantic information and can offer an efficient way to obtain initial semantic
    annotations for the archived documents. In this paper, we analyse the
    applicability of semantic analysis techniques such as named entity extraction
    to the URLs in a Web archive. We evaluate the precision of the named entity
    extraction from the URLs in the Popular German Web dataset and analyse the
    proportion of the archived URLs from 1,444 popular domains in the time interval
    from 2000 to 2012 to which these techniques are applicable. Our results
    demonstrate that named entity recognition can be successfully applied to a
    large number of URLs in our Web archive and provide a good starting point to
    efficiently annotate large scale collections of Web documents.
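
    As a rough illustration of applying named entity extraction to URLs, the sketch below splits a URL path into words and runs an off-the-shelf spaCy model over the result. The example URL, the tokenization heuristic, and the use of the small English model (rather than a German one matching the dataset) are assumptions for illustration only.

```python
import re
import spacy

nlp = spacy.load("en_core_web_sm")   # assumes this model is installed

def url_to_text(url: str) -> str:
    """Turn a URL path into a rough word sequence suitable for NER."""
    path = re.sub(r"^https?://", "", url)
    tokens = re.split(r"[/\-_.?=&+]+", path)
    return " ".join(t.title() for t in tokens if t and not t.isdigit())

url = "http://example.org/news/2001/angela-merkel-visits-berlin.html"   # hypothetical URL
doc = nlp(url_to_text(url))
for ent in doc.ents:
    print(ent.text, ent.label_)
```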


    Computation and Language

    Symbolic, Distributed and Distributional Representations for Natural Language Processing in the Era of Deep Learning: a Survey

    Lorenzo Ferrone, Fabio Massimo Zanzotto
    Comments: 25 pages
    Subjects: Computation and Language (cs.CL)

    Natural language and symbols are intimately correlated. Recent advances in
    machine learning (ML) and in natural language processing (NLP) seem to
    contradict the above intuition: symbols are fading away, erased by vectors or
    tensors called distributed and distributional representations. However, there
    is a strict link between distributed/distributional representations and
    symbols, the former being an approximation of the latter. A clearer
    understanding of the strict link between distributed/distributional
    representations and symbols will certainly lead to radically new deep learning
    networks. In this paper we present a survey that aims to draw the link between
    symbolic representations and distributed/distributional representations. This
    is the right time to revitalize the area of interpreting how symbols are
    represented inside neural networks.

    Analysing Temporal Evolution of Interlingual Wikipedia Article Pairs

    Simon Gottschalk, Elena Demidova
    Comments: Published in the SIGIR ’16 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
    Subjects: Computation and Language (cs.CL)

    Wikipedia articles representing an entity or a topic in different language
    editions evolve independently within the scope of the language-specific user
    communities. This can lead to different points of view reflected in the
    articles, as well as complementary and inconsistent information. An analysis of
    how the information is propagated across the Wikipedia language editions can
    provide important insights into the article evolution along the temporal and
    cultural dimensions and support quality control. To facilitate such analysis,
    we present MultiWiki – a novel web-based user interface that provides an
    overview of the similarities and differences across the article pairs
    originating from different language editions on a timeline. MultiWiki enables
    users to observe the changes in the interlingual article similarity over time
    and to perform a detailed visual comparison of the article snapshots at a
    particular time point.

    Multilingual and Cross-lingual Timeline Extraction

    Egoitz Laparra, Rodrigo Agerri, Itziar Aldabe, German Rigau
    Comments: 20 pages, 7 tables, 7 figures; submitted to Knowledge Based Systems (Elsevier), January, 2017
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

    In this paper we present an approach to extract ordered timelines of events,
    their participants, locations and times from a set of multilingual and
    cross-lingual data sources. Based on the assumption that event-related
    information can be recovered from different documents written in different
    languages, we extend the Cross-document Event Ordering task presented at
    SemEval 2015 by specifying two new tasks for, respectively, Multilingual and
    Cross-lingual Timeline Extraction. We then develop three deterministic
    algorithms for timeline extraction based on two main ideas. First, we address
    implicit temporal relations at document level since explicit time-anchors are
    too scarce to build a wide coverage timeline extraction system. Second, we
    leverage several multilingual resources to obtain a single, inter-operable,
    semantic representation of events across documents and across languages. The
    result is a highly competitive system that strongly outperforms the current
    state-of-the-art. Nonetheless, further analysis of the results reveals that
    linking the event mentions with their target entities and time-anchors remains
    a difficult challenge. The systems, resources and scorers are freely available
    to facilitate their use and guarantee the reproducibility of results.

    AMR-to-text Generation with Synchronous Node Replacement Grammar

    Linfeng Song, Xiaochang Peng, Yue Zhang, Zhiguo Wang, Daniel Gildea
    Subjects: Computation and Language (cs.CL)

    This paper addresses the task of AMR-to-text generation by leveraging
    synchronous node replacement grammar. During training, graph-to-string rules
    are learned using a heuristic extraction algorithm. At test time, a graph
    transducer is applied to collapse input AMRs and generate output sentences.
    Evaluated on SemEval-2016 Task 8, our method gives a BLEU score of 25.62, which
    is the best reported so far.

    Modelling dependency completion in sentence comprehension as a Bayesian hierarchical mixture process: A case study involving Chinese relative clauses

    Shravan Vasishth, Nicolas Chopin, Robin Ryder, Bruno Nicenboim
    Comments: 6 pages, 3 figures. Submitted to the conference Cognitive Science 2017, London, UK
    Subjects: Applications (stat.AP); Computation and Language (cs.CL); Methodology (stat.ME); Machine Learning (stat.ML)

    In sentence comprehension, it is widely assumed (Gibson 2000, Lewis &
    Vasishth, 2005) that the distance between linguistic co-dependents affects the
    latency of dependency resolution: the longer the distance, the longer the
    retrieval time (the distance-based account). An alternative theory of
    dependency resolution difficulty is the direct-access model (McElree et al.,
    2003); this model assumes that retrieval times are a mixture of two
    distributions: one distribution represents successful retrieval and the other
    represents an initial failure to retrieve the correct dependent, followed by a
    reanalysis that leads to successful retrieval. The time needed for a successful
    retrieval is independent of the dependency distance (cf. the distance-based
    account), but reanalyses cost extra time, and the proportion of failures
    increases with increasing dependency distance. We implemented a series of
    increasingly complex hierarchical Bayesian models to compare the distance-based
    account and the direct-access model; the latter was implemented as a
    hierarchical finite mixture model with heterogeneous variances for the two
    mixture distributions. We evaluated the models using two published data-sets on
    Chinese relative clauses which have been used to argue in favour of the
    distance account, but this account has found little support in subsequent work
    (e.g., Jäger et al., 2015). The hierarchical finite mixture model, i.e., an
    implementation of direct-access, is shown to provide a better account of the
    data than the distance account.
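
    As a non-hierarchical simplification of the direct-access idea (not the paper's full Bayesian model), a reading time can be treated as a draw from a two-component lognormal mixture whose reanalysis proportion grows with dependency distance; every parameter value below is invented for illustration.

```python
import numpy as np
from scipy import stats

def direct_access_loglik(rt, distance, mu_ok=5.8, sigma_ok=0.3,
                         mu_re=6.4, sigma_re=0.5, base=0.1, slope=0.15):
    """Log-likelihood of reading times under a two-component lognormal mixture:
    successful retrieval vs. reanalysis, with the reanalysis proportion
    increasing with dependency distance (all parameter values are made up)."""
    theta = 1.0 / (1.0 + np.exp(-(base + slope * distance)))       # P(reanalysis)
    lik = (1 - theta) * stats.lognorm.pdf(rt, s=sigma_ok, scale=np.exp(mu_ok)) \
          + theta * stats.lognorm.pdf(rt, s=sigma_re, scale=np.exp(mu_re))
    return np.sum(np.log(lik))

rts = np.array([350.0, 420.0, 510.0, 800.0])    # hypothetical reading times in ms
print(direct_access_loglik(rts, distance=3))
```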

    Deep Learning the Indus Script

    Satish Palaniappan, Ronojoy Adhikari
    Comments: 17 pages, 10 figures, 7 supporting figures (2 pages)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    Standardized corpora of undeciphered scripts, a necessary starting point for
    computational epigraphy, require laborious human effort for their preparation
    from raw archaeological records. Automating this process through machine
    learning algorithms can be of significant aid to epigraphical research. Here,
    we take the first steps in this direction and present a deep learning pipeline
    that takes as input images of the undeciphered Indus script, as found in
    archaeological artifacts, and returns as output a string of graphemes, suitable
    for inclusion in a standard corpus. The image is first decomposed into regions
    using Selective Search and these regions are classified as containing textual
    and/or graphical information using a convolutional neural network. Regions
    classified as potentially containing text are hierarchically merged and trimmed
    to remove non-textual information. The remaining textual part of the image is
    segmented using standard image processing techniques to isolate individual
    graphemes. This set is finally passed to a second convolutional neural network
    to classify the graphemes, based on a standard corpus. The classifier can
    identify the presence or absence of the most frequent Indus grapheme, the “jar”
    sign, with an accuracy of 92%. Our results demonstrate the great potential of
    deep learning approaches in computational epigraphy and, more generally, in the
    digital humanities.

    Predicting Auction Price of Vehicle License Plate with Deep Recurrent Neural Network

    Vinci Chow
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Economics (q-fin.EC); Machine Learning (stat.ML)

    In Chinese societies where superstition is of paramount importance, vehicle
    license plates with desirable numbers can fetch very high prices in
    auctions. Unlike auctions of other valuable items, however, license plates do
    not get an estimated price before auction. In this paper, I propose that the
    task of predicting plate prices can be viewed as a natural language processing
    task, because the value of a plate depends on the meaning of each individual
    character on the plate as well as the semantics. I construct a deep recurrent
    neural network to predict the prices of vehicle license plates in Hong Kong
    based on the characters on a plate. Trained on 13 years of historical auction
    prices, the deep RNN outperforms previous models by a significant margin.
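
    A minimal sketch of the kind of character-level recurrent regressor described (plate characters in, predicted price out), written in PyTorch with invented vocabulary size and layer dimensions; the paper's actual architecture and training procedure are not reproduced here.

```python
import torch
import torch.nn as nn

class PlatePriceRNN(nn.Module):
    """Character-level RNN regressor: plate characters -> predicted (log-)price."""
    def __init__(self, vocab_size=40, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, char_ids):                  # char_ids: (batch, plate_length)
        x = self.embed(char_ids)
        _, (h_n, _) = self.rnn(x)
        return self.head(h_n[-1]).squeeze(-1)     # one predicted value per plate

# Hypothetical usage: a batch of two plates, each encoded as 6 character ids
model = PlatePriceRNN()
plates = torch.randint(0, 40, (2, 6))
target_log_price = torch.tensor([9.2, 11.5])      # made-up log prices
loss = nn.functional.mse_loss(model(plates), target_log_price)
loss.backward()
```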


    Distributed, Parallel, and Cluster Computing

    Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper

    Luigi Nardi, Bruno Bodin, Sajad Saeedi, Emanuele Vespa, Andrew J. Davison, Paul H. J. Kelly
    Comments: 10 pages, Keywords: design space exploration, machine learning, computer vision, SLAM, embedded systems, GPU, crowd-sourcing
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Performance (cs.PF)

    In this paper we investigate an emerging application, 3D scene understanding,
    likely to be significant in the mobile space in the near future. The goal of
    this exploration is to reduce execution time while meeting our quality of
    result objectives. In previous work we showed for the first time that it is
    possible to map this application to power constrained embedded systems,
    highlighting that decision choices made at the algorithmic design-level have
    the most impact.

    As the algorithmic design space is too large to be exhaustively evaluated, we
    use a previously introduced multi-objective Random Forest Active Learning
    prediction framework dubbed HyperMapper, to find good algorithmic designs. We
    show that HyperMapper generalizes on a recent cutting edge 3D scene
    understanding algorithm and on a modern GPU-based computer architecture.
    HyperMapper is able to automatically beat an expert human hand-tuning the
    algorithmic parameters of the class of computer vision applications considered
    in this paper. In addition, we use crowd-sourcing via a 3D scene understanding
    Android app to show that the Pareto front
    obtained on an embedded system can be used to accelerate the same application
    on all the 83 smart-phones and tablets crowd-sourced with speedups ranging from
    2 to over 12.


    Learning

    HashNet: Deep Learning to Hash by Continuation

    Zhangjie Cao, Mingsheng Long, Jianmin Wang, Philip S. Yu
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    Learning to hash has been widely applied to approximate nearest neighbor
    search for large-scale multimedia retrieval, due to its computation efficiency
    and retrieval quality. Deep learning to hash, which improves retrieval quality
    by end-to-end representation learning and hash encoding, has received
    increasing attention recently. Subject to the vanishing gradient difficulty in
    the optimization with binary activations, existing deep learning to hash
    methods need to first learn continuous representations and then generate binary
    hash codes in a separate binarization step, which suffers from a substantial loss
    of retrieval quality. This paper presents HashNet, a novel deep architecture
    for deep learning to hash by a continuation method, which learns exactly binary
    hash codes from imbalanced similarity data where the number of similar pairs is
    much smaller than the number of dissimilar pairs. The key idea is to attack the
    vanishing gradient problem in optimizing deep networks with non-smooth binary
    activations by a continuation method, in which we begin by learning an easier
    network with a smoothed activation function and let it evolve during
    training, until it eventually becomes the original, difficult-to-optimize
    deep network with the sign activation function. Comprehensive
    empirical evidence shows that HashNet can generate exactly binary hash codes
    and yield state-of-the-art multimedia retrieval performance on standard
    benchmarks.

    Optimal Schemes for Discrete Distribution Estimation under Locally Differential Privacy

    Min Ye, Alexander Barg
    Comments: A short version is submitted to ISIT 2017
    Subjects: Learning (cs.LG); Information Theory (cs.IT)

    We consider the minimax estimation problem of a discrete distribution with
    support size (k) under privacy constraints. A privatization scheme is applied
    to each raw sample independently, and we need to estimate the distribution of
    the raw samples from the privatized samples. A positive number (\epsilon)
    measures the privacy level of a privatization scheme. For a given (\epsilon,)
    we consider the problem of constructing optimal privatization schemes with
    (\epsilon)-privacy level, i.e., schemes that minimize the expected estimation
    loss for the worst-case distribution. Two schemes in the literature provide
    order optimal performance in the high privacy regime where (\epsilon) is very
    close to (0,) and in the low privacy regime where (e^{\epsilon}\approx k,)
    respectively.

    In this paper, we propose a new family of schemes which substantially improve
    the performance of the existing schemes in the medium privacy regime when (1\ll
    e^{\epsilon} \ll k.) More concretely, we prove that when (3.8 < \epsilon
    < \ln(k/9),) our schemes reduce the expected estimation loss by (50\%) under
    (\ell_2^2) metric and by (30\%) under (\ell_1) metric over the existing
    schemes. We also prove a lower bound for the region (e^{\epsilon} \ll k,) which
    implies that our schemes are order optimal in this regime.
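
    For concreteness, one of the classical schemes alluded to (k-ary randomized response, the order-optimal choice in the low privacy regime) and its unbiased frequency estimator can be sketched as follows. This is the standard baseline, not the new family of schemes proposed in the paper.

```python
import numpy as np

def k_rr_privatize(samples, k, epsilon, rng):
    """k-ary randomized response: keep the true symbol w.p. e^eps / (e^eps + k - 1),
    otherwise report one of the other k - 1 symbols uniformly at random."""
    p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    keep = rng.random(len(samples)) < p_true
    other = (samples + rng.integers(1, k, size=len(samples))) % k   # never the true symbol
    return np.where(keep, samples, other)

def k_rr_estimate(reports, k, epsilon):
    """Unbiased estimate of the raw-sample distribution from privatized reports."""
    freq = np.bincount(reports, minlength=k) / len(reports)
    return (freq * (np.exp(epsilon) + k - 1) - 1) / (np.exp(epsilon) - 1)

rng = np.random.default_rng(0)
k, epsilon = 10, 1.0
raw = rng.choice(k, size=200_000, p=np.full(k, 1.0 / k))   # uniform true distribution
reports = k_rr_privatize(raw, k, epsilon, rng)
print(np.round(k_rr_estimate(reports, k, epsilon), 3))     # should be close to 0.1 each
```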

    Pixel Recursive Super Resolution

    Ryan Dahl, Mohammad Norouzi, Jonathon Shlens
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We present a pixel recursive super resolution model that synthesizes
    realistic details into images while enhancing their resolution. A low
    resolution image may correspond to multiple plausible high resolution images,
    thus modeling the super resolution process with a pixel independent conditional
    model often results in averaging different details–hence blurry edges. By
    contrast, our model is able to represent a multimodal conditional distribution
    by properly modeling the statistical dependencies among the high resolution
    image pixels, conditioned on a low resolution input. We employ a PixelCNN
    architecture to define a strong prior over natural images and jointly optimize
    this prior with a deep conditioning convolutional network. Human evaluations
    indicate that samples from our proposed model look more photo realistic than a
    strong L2 regression baseline.

    Natasha: Faster Stochastic Non-Convex Optimization via Strongly Non-Convex Parameter

    Zeyuan Allen-Zhu
    Subjects: Optimization and Control (math.OC); Data Structures and Algorithms (cs.DS); Learning (cs.LG); Machine Learning (stat.ML)

    Given a non-convex function (f(x)) that is an average of (n) smooth
    functions, we design stochastic first-order methods to find its approximate
    stationary points. The performance of our new methods depends on the smallest
    (negative) eigenvalue (-\sigma) of the Hessian. This parameter (\sigma)
    captures how strongly non-convex (f(x)) is, and is analogous to the strong
    convexity parameter for convex optimization.

    Our methods outperform the best known results for a wide range of (\sigma),
    and can also be used to find approximate local minima.

    In particular, we find an interesting dichotomy: there exists a threshold
    (\sigma_0) so that the fastest methods for (\sigma>\sigma_0) and for
    (\sigma<\sigma_0) have drastically different behaviors: the former scales with
    (n^{2/3}) and the latter scales with (n^{3/4}).

    IQN: An Incremental Quasi-Newton Method with Local Superlinear Convergence Rate

    Aryan Mokhtari, Mark Eisen, Alejandro Ribeiro
    Subjects: Optimization and Control (math.OC); Learning (cs.LG)

    This paper studies the problem of minimizing a global objective function
    which can be written as the average of a set of (n) smooth and strongly convex
    functions. Quasi-Newton methods, which build on the idea of approximating the
    Newton step using the first-order information of the objective function, are
    successful in reducing the computational complexity of Newton’s method by
    avoiding the Hessian and its inverse computation at each iteration, while
    converging at a superlinear rate to the optimal argument. However, quasi-Newton
    methods are impractical for solving the finite sum minimization problem since
    they operate on the information of all (n) functions at each iteration. This
    issue has been addressed by incremental quasi-Newton methods which use the
    information of a subset of functions at each iteration. Although incremental
    quasi-Newton methods are able to reduce the computational complexity of
    traditional quasi-Newton methods significantly, they fail to converge at a
    superlinear rate. In this paper, we propose the IQN method as the first
    incremental quasi-Newton method with a local superlinear convergence rate. In
    IQN, we compute and update the information of only a single function at each
    iteration and use the gradient information to approximate the Newton direction
    without a computationally expensive inversion. IQN differs from
    state-of-the-art incremental quasi-Newton methods in three criteria. First, the
    use of aggregated information of variables, gradients, and quasi-Newton Hessian
    approximations; second, the approximation of each individual function by its
    Taylor’s expansion in which the linear and quadratic terms are evaluated with
    respect to the same iterate; and third, the use of a cyclic scheme to update
    the functions in lieu of a random selection routine. We use these fundamental
    properties of IQN to establish its local superlinear convergence rate.

    Deep Learning the Indus Script

    Satish Palaniappan, Ronojoy Adhikari
    Comments: 17 pages, 10 figures, 7 supporting figures (2 pages)
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Learning (cs.LG)

    Standardized corpora of undeciphered scripts, a necessary starting point for
    computational epigraphy, require laborious human effort for their preparation
    from raw archaeological records. Automating this process through machine
    learning algorithms can be of significant aid to epigraphical research. Here,
    we take the first steps in this direction and present a deep learning pipeline
    that takes as input images of the undeciphered Indus script, as found in
    archaeological artifacts, and returns as output a string of graphemes, suitable
    for inclusion in a standard corpus. The image is first decomposed into regions
    using Selective Search and these regions are classified as containing textual
    and/or graphical information using a convolutional neural network. Regions
    classified as potentially containing text are hierarchically merged and trimmed
    to remove non-textual information. The remaining textual part of the image is
    segmented using standard image processing techniques to isolate individual
    graphemes. This set is finally passed to a second convolutional neural network
    to classify the graphemes, based on a standard corpus. The classifier can
    identify the presence or absence of the most frequent Indus grapheme, the “jar”
    sign, with an accuracy of 92%. Our results demonstrate the great potential of
    deep learning approaches in computational epigraphy and, more generally, in the
    digital humanities.

    Recovering True Classifier Performance in Positive-Unlabeled Learning

    Shantanu Jain, Martha White, Predrag Radivojac
    Comments: Full paper with supplement
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    A common approach in positive-unlabeled learning is to train a classification
    model between labeled and unlabeled data. This strategy is in fact known to
    give an optimal classifier under mild conditions; however, it results in biased
    empirical estimates of the classifier performance. In this work, we show that
    the typically used performance measures such as the receiver operating
    characteristic curve, or the precision-recall curve obtained on such data can
    be corrected with the knowledge of class priors; i.e., the proportions of the
    positive and negative examples in the unlabeled data. We extend the results to
    a noisy setting where some of the examples labeled positive are in fact
    negative and show that the correction also requires the knowledge of the
    proportion of noisy examples in the labeled positives. Using state-of-the-art
    algorithms to estimate the positive class prior and the proportion of noise, we
    experimentally evaluate two correction approaches and demonstrate their
    efficacy on real-life data.
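
    A sketch of the kind of ROC correction described, under the standard assumption that the unlabeled set contains a known proportion alpha of positives (and ignoring the noisy-label extension): the false positive rate measured against unlabeled examples is de-mixed using the identity FPR_unlabeled = alpha * TPR + (1 - alpha) * FPR_true. The data are synthetic and this is the textbook mixture identity, not necessarily the exact estimator of the paper.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
alpha = 0.3                                     # proportion of positives hidden in the unlabeled set

# Synthetic classifier scores: labeled positives vs. an unlabeled mixture
scores_pos = rng.normal(2.0, 1.0, 2_000)
scores_unl = np.concatenate([rng.normal(2.0, 1.0, 3_000),    # hidden positives
                             rng.normal(0.0, 1.0, 7_000)])   # true negatives

y = np.concatenate([np.ones_like(scores_pos), np.zeros_like(scores_unl)])
s = np.concatenate([scores_pos, scores_unl])
fpr_obs, tpr, _ = roc_curve(y, s)               # "negatives" here are really unlabeled

# De-mix: FPR_true = (FPR_unlabeled - alpha * TPR) / (1 - alpha)
fpr_corrected = np.clip((fpr_obs - alpha * tpr) / (1 - alpha), 0.0, 1.0)

for i in np.linspace(0, len(tpr) - 1, 5, dtype=int):
    print(f"TPR={tpr[i]:.2f}  FPR(observed)={fpr_obs[i]:.2f}  FPR(corrected)={fpr_corrected[i]:.2f}")
```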

    Segmentation of optic disc, fovea and retinal vasculature using a single convolutional neural network

    Jen Hong Tan, U. Rajendra Acharya, Sulatha V. Bhandary, Kuang Chua Chua, Sobha Sivaprasad
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    We have developed and trained a convolutional neural network to automatically
    and simultaneously segment optic disc, fovea and blood vessels. Fundus images
    were normalised before segmentation was performed to enforce consistency in
    background lighting and contrast. For every effective point in the fundus
    image, our algorithm extracted three channels of input from the neighbourhood
    of the point and forwarded the response across the 7-layer network. On average,
    our segmentation achieved an accuracy of 92.68 percent on the testing set from
    the DRIVE database.

    Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper

    Luigi Nardi, Bruno Bodin, Sajad Saeedi, Emanuele Vespa, Andrew J. Davison, Paul H. J. Kelly
    Comments: 10 pages, Keywords: design space exploration, machine learning, computer vision, SLAM, embedded systems, GPU, crowd-sourcing
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Learning (cs.LG); Performance (cs.PF)

    In this paper we investigate an emerging application, 3D scene understanding,
    likely to be significant in the mobile space in the near future. The goal of
    this exploration is to reduce execution time while meeting our quality of
    result objectives. In previous work we showed for the first time that it is
    possible to map this application to power constrained embedded systems,
    highlighting that decision choices made at the algorithmic design-level have
    the most impact.

    As the algorithmic design space is too large to be exhaustively evaluated, we
    use a previously introduced multi-objective Random Forest Active Learning
    prediction framework dubbed HyperMapper, to find good algorithmic designs. We
    show that HyperMapper generalizes on a recent cutting edge 3D scene
    understanding algorithm and on a modern GPU-based computer architecture.
    HyperMapper is able to automatically beat an expert human hand-tuning the
    algorithmic parameters of the class of computer vision applications considered
    in this paper. In addition, we use crowd-sourcing via a 3D scene understanding
    Android app to show that the Pareto front
    obtained on an embedded system can be used to accelerate the same application
    on all the 83 smart-phones and tablets crowd-sourced with speedups ranging from
    2 to over 12.
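
    A toy sketch of the random-forest, multi-objective active-learning loop that
    such a framework implements: surrogate forests predict the objectives from a
    configuration vector, and predicted-Pareto candidates are measured next. The
    objective function and parameter space below are placeholders, not the SLAM
    pipeline or HyperMapper's actual interface.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def evaluate(cfg):                      # hypothetical (runtime, error) objectives
            return np.array([cfg[0] + 0.1 * cfg[1], 1.0 / (1e-3 + cfg[0]) + cfg[2]])

        def pareto_mask(pts):                   # non-dominated points (minimisation)
            return np.array([not np.any(np.all(pts <= p, axis=1) &
                                        np.any(pts < p, axis=1)) for p in pts])

        rng = np.random.default_rng(0)
        X = rng.uniform(size=(20, 3))           # initial random configurations
        Y = np.array([evaluate(x) for x in X])
        for _ in range(5):                      # active-learning iterations
            models = [RandomForestRegressor(n_estimators=50, random_state=0).fit(X, Y[:, j])
                      for j in range(2)]
            cand = rng.uniform(size=(500, 3))
            pred = np.column_stack([m.predict(cand) for m in models])
            new = cand[pareto_mask(pred)][:5]   # measure a few predicted-Pareto configs
            X, Y = np.vstack([X, new]), np.vstack([Y, [evaluate(x) for x in new]])
        print(Y[pareto_mask(Y)])                # Pareto front of measured points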

    Electron-Proton Dynamics in Deep Learning

    Qiuyi Zhang, Rina Panigrahy, Sushant Sachdeva, Ali Rahimi
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Data Analysis, Statistics and Probability (physics.data-an)

    We study the efficacy of learning neural networks with neural networks by the
    (stochastic) gradient descent method. While gradient descent enjoys empirical
    success in a variety of applications, there is a lack of theoretical guarantees
    that explains the practical utility of deep learning. We focus on two-layer
    neural networks with a linear activation on the output node. We show that under
    some mild assumptions and certain classes of activation functions, gradient
    descent does learn the parameters of the neural network and converges to the
    global minimum. Using a node-wise gradient descent algorithm, we show that
    learning can be done in finite, sometimes (poly(d,1/epsilon)), time and sample
    complexity.
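
    The toy teacher-student experiment below illustrates node-wise gradient
    descent on a two-layer network with a linear output node, updating one hidden
    unit's weight vector per step; the ReLU activation, dimensions and step size
    are illustrative assumptions rather than the paper's exact setting.

        import numpy as np

        rng = np.random.default_rng(1)
        d, k, n = 10, 3, 2000
        W_true = rng.normal(size=(k, d))             # planted (teacher) weights
        X = rng.normal(size=(n, d))
        y = np.maximum(X @ W_true.T, 0).sum(axis=1)  # teacher output: sum_j relu(w_j . x)

        W = rng.normal(size=(k, d))                  # student initialisation
        lr = 0.05
        for it in range(300):
            j = it % k                               # node-wise: update one hidden unit
            resid = np.maximum(X @ W.T, 0).sum(axis=1) - y
            gate = (X @ W[j] > 0).astype(float)      # ReLU derivative for unit j
            W[j] -= lr * (resid * gate) @ X / n      # gradient step on w_j only
            if it % 100 == 0:
                print(it, 0.5 * np.mean(resid ** 2))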

    Generative Adversarial Networks recover features in astrophysical images of galaxies beyond the deconvolution limit

    Kevin Schawinski, Ce Zhang, Hantian Zhang, Lucas Fowler, Gokula Krishnan Santhanam
    Comments: Accepted for publication in MNRAS, for the full code and a virtual machine set up to run it, see this http URL
    Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Astrophysics of Galaxies (astro-ph.GA); Learning (cs.LG); Machine Learning (stat.ML)

    Observations of astrophysical objects such as galaxies are limited by various
    sources of random and systematic noise from the sky background, the optical
    system of the telescope and the detector used to record the data. Conventional
    deconvolution techniques are limited in their ability to recover features in
    imaging data by the Shannon-Nyquist sampling theorem. Here we train a
    generative adversarial network (GAN) on a sample of (4,550) images of nearby
    galaxies at (0.01<z<0.02) from the Sloan Digital Sky Survey and conduct
    (10 times) cross validation to evaluate the results. We present a method using
    a GAN trained on galaxy images that can recover features from artificially
    degraded images with worse seeing and higher noise than the original with a
    performance which far exceeds simple deconvolution. The ability to better
    recover detailed features such as galaxy morphology from low-signal-to-noise
    and low angular resolution imaging data significantly increases our ability to
    study existing data sets of astrophysical objects as well as future
    observations with observatories such as the Large Synoptic Survey Telescope (LSST)
    and the Hubble and James Webb space telescopes.
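
    As a sketch of the training-data construction the abstract describes, the
    snippet below artificially degrades an image with a wider point-spread
    function ("worse seeing") and extra background noise; (degraded, original)
    pairs like this would then be used to train the GAN. The PSF width and noise
    level are arbitrary assumptions.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def degrade(img, seeing_sigma=2.5, noise_sigma=0.05, seed=0):
            rng = np.random.default_rng(seed)
            blurred = gaussian_filter(img, sigma=seeing_sigma)   # worse seeing
            return blurred + rng.normal(scale=noise_sigma, size=img.shape)

        clean = np.zeros((64, 64))
        clean[28:36, 28:36] = 1.0          # toy stand-in for a galaxy image
        pair = (degrade(clean), clean)     # one (input, target) training pair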

    Pose-Selective Max Pooling for Measuring Similarity

    Xiang Xiang, Trac D. Tran
    Comments: The tutorial and program associated with this paper are available at this https URL yet for non-commercial use
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we deal with two challenges for measuring the similarity of
    the subject identities in practical video-based face recognition – the
    variation of the head pose in uncontrolled environments and the computational
    expense of processing videos. Since the frame-wise feature mean is unable to
    characterize the pose diversity among frames, we define and preserve the
    overall pose diversity and closeness in a video. Then, identity will be the
    only source of variation across videos since the pose varies even within a
    single video. Instead of simply using all the frames, we select those faces
    whose pose point is closest to the centroid of the K-means cluster containing
    that pose point. Then, we represent a video as a bag of frame-wise deep face
    features while the number of features has been reduced from hundreds to K.
    Since the video representation can well represent the identity, now we measure
    the subject similarity between two videos as the max correlation among all
    possible pairs in the two bags of features. On the official 5,000 video-pairs
    of the YouTube Face dataset for face verification, our algorithm achieves a
    comparable performance with VGG-face that averages over deep features of all
    frames. Other vision tasks can also benefit from the generic idea of employing
    geometric cues to improve the descriptiveness of deep features.
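
    A small sketch of the frame selection and matching idea, assuming per-frame
    pose points and deep face features are already available: K-means clusters
    the poses, the frame nearest each centroid is kept, and two videos are scored
    by the maximum correlation over all cross pairs of the retained features.

        import numpy as np
        from sklearn.cluster import KMeans

        def select_frames(poses, K=9):
            km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(poses)
            keep = []
            for c in range(K):
                members = np.where(km.labels_ == c)[0]
                d = np.linalg.norm(poses[members] - km.cluster_centers_[c], axis=1)
                keep.append(members[np.argmin(d)])      # frame closest to the centroid
            return np.array(keep)

        def video_similarity(feats_a, feats_b):         # feats: (K, feature_dim)
            fa = feats_a - feats_a.mean(axis=1, keepdims=True)
            fb = feats_b - feats_b.mean(axis=1, keepdims=True)
            fa /= np.linalg.norm(fa, axis=1, keepdims=True)
            fb /= np.linalg.norm(fb, axis=1, keepdims=True)
            return float(np.max(fa @ fb.T))             # max correlation over all pairs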


    Information Theory

    Complexity-Aware Scheduling for an LDPC Encoded C-RAN Uplink

    Kyle Whetzel, Matthew C. Valenti
    Comments: Conference on Information Sciences and Systems (CISS) 2017, to appear
    Subjects: Information Theory (cs.IT)

    Centralized Radio Access Network (C-RAN) is a new paradigm for wireless
    networks that centralizes the signal processing in a computing cloud, allowing
    commodity computational resources to be pooled. While C-RAN improves
    utilization and efficiency, the computational load occasionally exceeds the
    available resources, creating a computational outage. This paper provides a
    mathematical characterization of the computational outage probability for
    low-density parity check (LDPC) codes, a common class of error-correcting
    codes. For tractability, a binary erasure channel is assumed. Using the
    concept of density evolution, the computational demand is determined for a
    given ensemble of codes as a function of the erasure probability. The analysis
    reveals a trade-off: aggressively signaling at a high rate stresses the
    computing pool, while conservatively backing-off the rate can avoid
    computational outages. Motivated by this trade-off, an effective
    computationally aware scheduling algorithm is developed that balances demands
    for high throughput and low outage rates.
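
    On the binary erasure channel, density evolution reduces to the scalar
    recursion x_{l+1} = eps * lambda(1 - rho(1 - x_l)), and the number of
    iterations needed to push the erasure probability below a target is a simple
    proxy for the decoding workload. The sketch below uses a regular (3,6)
    ensemble and an illustrative target, not the paper's exact complexity model.

        def de_iterations(eps, lam=lambda x: x**2, rho=lambda x: x**5,
                          target=1e-6, max_iter=10_000):
            x = eps
            for it in range(1, max_iter + 1):
                x = eps * lam(1.0 - rho(1.0 - x))   # (3,6)-regular ensemble by default
                if x < target:
                    return it                       # decoder converges after `it` iterations
            return None                             # stuck above target: decoding fails

        for eps in (0.30, 0.40, 0.42, 0.44):
            print(eps, de_iterations(eps))          # near-threshold channels need many more iterations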

    On the Input-Degradedness and Input-Equivalence Between Channels

    Rajai Nasser
    Comments: 30 pages. Submitted to IEEE Trans. Inform. Theory and in part to ISIT2017. arXiv admin note: substantial text overlap with arXiv:1701.04467
    Subjects: Information Theory (cs.IT)

    A channel (W) is said to be input-degraded from another channel (W’) if (W)
    can be simulated from (W’) by randomization at the input. We provide a
    necessary and sufficient condition for a channel to be input-degraded from
    another one. We show that any decoder that is good for (W’) is also good for
    (W). We provide two characterizations for input-degradedness, one of which is
    similar to the Blackwell-Sherman-Stein theorem. We say that two channels are
    input-equivalent if they are input-degraded from each other. We study the
    topologies that can be constructed on the space of input-equivalent channels,
    and we investigate their properties. Moreover, we study the continuity of
    several channel parameters and operations under these topologies.
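
    Directly from the definition, whether (W) is input-degraded from (W') can be
    checked row by row as a small linear-programming feasibility problem: find a
    row-stochastic (V) with (W = V W'). The sketch below does exactly that and is
    only an illustrative check, not the characterization derived in the paper.

        import numpy as np
        from scipy.optimize import linprog

        def input_degraded(W, W2):
            """True if W(y|x) = sum_x' V(x'|x) W2(y|x') for some stochastic V."""
            n_in2 = W2.shape[0]
            A_eq = np.vstack([W2.T, np.ones((1, n_in2))])      # v @ W2 = W[x], sum(v) = 1
            for row in W:
                res = linprog(c=np.zeros(n_in2), A_eq=A_eq,
                              b_eq=np.concatenate([row, [1.0]]),
                              bounds=[(0, 1)] * n_in2, method="highs")
                if not res.success:
                    return False
            return True

        BSC = np.array([[0.9, 0.1], [0.1, 0.9]])
        print(input_degraded(BSC, BSC))     # a channel is input-degraded from itself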

    On the Information Dimension of Stochastic Processes

    Bernhard C. Geiger, Tobias Koch
    Comments: 13 pages
    Subjects: Information Theory (cs.IT)

    Jalali and Poor (“Universal compressed sensing,” arXiv:1406.7807v3, Jan.
    2016) have recently proposed a generalization of Rényi’s information
    dimension to stationary stochastic processes by defining the information
    dimension rate as the information dimension of (k) samples divided by (k) in
    the limit as (k to infty). This paper proposes an alternative definition of
    information dimension rate as the entropy rate of the uniformly-quantized
    stochastic process divided by minus the logarithm of the quantizer step size
    (1/m) in the limit as (m to infty). It is demonstrated that both definitions
    are equivalent for stochastic processes that are (psi^*)-mixing, but may
    differ in general. In particular, it is shown that for Gaussian processes with
    essentially-bounded power spectral density (PSD), the proposed information
    dimension rate equals the Lebesgue measure of the PSD’s support. This is in
    stark contrast to the information dimension rate proposed by Jalali and Poor,
    which is (1) if the process’s PSD is positive on any set with positive Lebesgue
    measure, irrespective of its support size.
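
    In symbols (notation paraphrased from the abstract, so read this as a sketch
    rather than the paper's exact statement), the two definitions being compared
    are

        % Jalali-Poor: per-sample information dimension of k samples
        d_{JP}(\{X_t\}) = \lim_{k \to \infty} \frac{d(X_1, \dots, X_k)}{k}

        % proposed: entropy rate H' of the process quantized with step 1/m,
        % normalised by \log m
        d(\{X_t\}) = \lim_{m \to \infty} \frac{H'\big(\lfloor m X_t \rfloor / m\big)}{\log m}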

    Random Ensembles of Lattices from Generalized Reductions

    Antonio Campello
    Comments: 15 pages
    Subjects: Information Theory (cs.IT); Metric Geometry (math.MG)

    We propose a general framework to study constructions of Euclidean lattices
    from linear codes over finite fields. In particular, we prove general
    conditions for an ensemble constructed using linear codes to contain dense
    lattices (i.e., with packing density comparable to the Minkowski-Hlawka lower
    bound). Specializing to number field lattices, we obtain a number of
    interesting corollaries – for instance, the best known packing density of ideal
    lattices, and an elementary coding-theoretic construction of asymptotically
    dense Hurwitz lattices. All results are algorithmically effective, in the sense
    that, for any dimension, a finite family containing dense lattices is
    exhibited. For suitable constructions based on Craig’s lattices, this family is
    significantly smaller, in terms of alphabet-size, than previous ones in the
    literature.
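
    The classical Construction A is the simplest instance of a code-to-lattice
    construction of this kind: the lattice consists of all integer vectors that
    reduce, modulo the field size, to a codeword. The toy sketch below only
    illustrates this membership test and is not the generalized ensemble studied
    in the paper.

        import numpy as np

        p = 3
        G = np.array([[1, 0, 1], [0, 1, 2]])              # generator of a [3,2] code over F_3
        codewords = {tuple((np.array(u) @ G) % p) for u in np.ndindex(p, p)}

        def in_lattice(x):                                 # x in Lambda_A(C) iff x mod p in C
            return tuple(np.mod(x, p)) in codewords

        print(in_lattice([1, 0, 1]), in_lattice([4, 3, 7]), in_lattice([1, 1, 1]))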

    Joint Offloading and Computing Optimization in Wireless Powered Mobile-Edge Computing Systems

    Feng Wang, Jie Xu, Xin Wang, Shuguang Cui
    Comments: Accepted by IEEE ICC 2017
    Subjects: Information Theory (cs.IT)

    Integrating mobile-edge computing (MEC) and wireless power transfer (WPT) is
    a promising technique in the Internet of Things (IoT) era. It can provide
    massive low-power mobile devices with enhanced computation capability and
    sustainable energy supply. In this paper, we consider a wireless powered
    multiuser MEC system, where a multi-antenna access point (AP) (integrated with
    an MEC server) broadcasts wireless power to charge multiple users and each user
    node relies on the harvested energy to execute latency-sensitive computation
    tasks. With MEC, these users can execute their respective tasks locally by
    themselves or offload all or part of the tasks to the AP based on a time
    division multiple access (TDMA) protocol. Under this setup, we pursue an
    energy-efficient wireless powered MEC system design by jointly optimizing the
    transmit energy beamformer at the AP, the central processing unit (CPU)
    frequency and the offloaded bits at each user, as well as the time allocation
    among different users. In particular, we minimize the energy consumption at the
    AP over a particular time block subject to the computation latency and energy
    harvesting constraints per user. By formulating this problem into a convex
    framework and employing the Lagrange duality method, we obtain its optimal
    solution in a semi-closed form. Numerical results demonstrate the benefit of
    the proposed joint design over alternative benchmark schemes in terms of the
    achieved energy efficiency.

    Ultra Reliable Short Message Relaying with Wireless Power Transfer

    Onel L. Alcaraz López, Hirley Alves, Richard Demo Souza, Evelio Martín García Fernández
    Comments: Accepted in ICC 2017
    Subjects: Information Theory (cs.IT); Applications (stat.AP)

    We consider a dual-hop wireless network where an energy constrained relay
    node first harvests energy through the received radio-frequency signal from the
    source, and then uses the harvested energy to forward the source’s information
    to the destination node. The throughput and delay metrics are investigated for
    a decode-and-forward relaying mechanism in the finite blocklength regime and
    a delay-limited transmission mode. We consider ultra-reliable communication
    scenarios under discussion for the upcoming fifth generation (5G) of wireless systems,
    with error and latency constraints. The impact on these metrics of the
    blocklength, information bits, and relay position is investigated.
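
    A standard tool for this kind of analysis is the normal approximation to the
    finite-blocklength error probability. The sketch below applies it (ignoring
    the O(log n) correction term) to a two-hop decode-and-forward link with
    illustrative SNRs, blocklength and payload; it is a generic approximation,
    not the paper's exact throughput and delay model.

        import numpy as np
        from scipy.stats import norm

        def fbl_error(snr, n, k):
            """Approximate error probability of sending k bits in n AWGN channel uses."""
            C = np.log2(1 + snr)                                          # capacity, bits/use
            V = (snr * (snr + 2) / (snr + 1) ** 2) * np.log2(np.e) ** 2   # channel dispersion
            return norm.sf((n * C - k) / np.sqrt(n * V))                  # Q-function

        n, k = 200, 160                          # 160 information bits per 200-use hop
        eps1 = fbl_error(snr=1.0, n=n, k=k)      # source -> relay
        eps2 = fbl_error(snr=0.8, n=n, k=k)      # relay -> destination
        print(eps1, eps2, 1 - (1 - eps1) * (1 - eps2))   # end-to-end error, decode-and-forward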

    Uplink Multiuser Massive MIMO Systems with One-Bit ADCs: A Coding-Theoretic Viewpoint

    Seonho Kim, Namyoon Lee, Songnam Hong
    Comments: to be published in IEEE WCNC 2017
    Subjects: Information Theory (cs.IT)

    This paper investigates an uplink multiuser massive multiple-input
    multiple-output (MIMO) system with one-bit analog-to-digital converters (ADCs),
    in which (K) single-antenna users communicate with one base station (BS)
    with (n_r) antennas. In this system, we propose a novel MIMO detection
    framework, which is inspired by coding theory. The key idea of the proposed
    framework is to create a non-linear code (Cc) of length (n_r) and rate (K/n_r)
    using the encoding function that is completely characterized by a non-linear
    MIMO channel matrix. From this, a multiuser MIMO detection problem is converted
    into an equivalent channel coding problem, in which a codeword of the (Cc) is
    sent over (n_r) parallel binary symmetric channels, each with different
    crossover probabilities. Leveraging this framework, we develop a maximum
    likelihood decoding method, and show that the minimum distance of the (Cc) is
    strongly related to a diversity order. Furthermore, we propose a practical
    implementation method of the proposed framework when the channel state
    information is not known to the BS. The proposed method is to estimate the code
    (Cc) at the BS using a training sequence. Then, the proposed weighted
    minimum distance decoding is applied. Simulation results show that the
    proposed method almost achieves an ideal performance with a reasonable training
    overhead.
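
    The weighted rule has a simple closed form: over parallel binary symmetric
    channels with crossover probabilities p_i, maximum-likelihood decoding
    minimises a Hamming distance in which position i is weighted by
    log((1 - p_i)/p_i). The sketch below uses a toy codebook as a stand-in for
    the channel-induced non-linear code (Cc).

        import numpy as np

        def weighted_md_decode(r, codebook, p):
            w = np.log((1 - p) / p)                           # per-antenna reliability weight
            return min((np.sum(w[r != c]), msg) for msg, c in codebook.items())[1]

        codebook = {                                          # toy code: 2 users -> 4 codewords
            (0, 0): np.array([0, 0, 0, 0, 0]),
            (0, 1): np.array([0, 1, 1, 0, 1]),
            (1, 0): np.array([1, 0, 1, 1, 0]),
            (1, 1): np.array([1, 1, 0, 1, 1]),
        }
        p = np.array([0.05, 0.2, 0.1, 0.3, 0.05])             # crossover probability per antenna
        print(weighted_md_decode(np.array([1, 1, 1, 1, 0]), codebook, p))   # -> (1, 0)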

    Information-theoretic interpretation of tuning curves for multiple motion directions

    Wentao Huang, Xin Huang, Kechen Zhang
    Comments: The 51st Annual Conference on Information Sciences and Systems (CISS), 2017
    Subjects: Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC); Quantitative Methods (q-bio.QM)

    We have developed an efficient information-maximization method for computing
    the optimal shapes of tuning curves of sensory neurons by optimizing the
    parameters of the underlying feedforward network model. When applied to the
    problem of population coding of visual motion with multiple directions, our
    method yields several types of tuning curves with both symmetric and asymmetric
    shapes that resemble what have been found in the visual cortex. Our result
    suggests that the diversity or heterogeneity of tuning curve shapes as observed
    in neurophysiological experiments might actually constitute an optimal
    population representation of visual motions with multiple components.

    Optimal Schemes for Discrete Distribution Estimation under Locally Differential Privacy

    Min Ye, Alexander Barg
    Comments: A short version is submitted to ISIT 2017
    Subjects: Learning (cs.LG); Information Theory (cs.IT)

    We consider the minimax estimation problem of a discrete distribution with
    support size (k) under privacy constraints. A privatization scheme is applied
    to each raw sample independently, and we need to estimate the distribution of
    the raw samples from the privatized samples. A positive number (epsilon)
    measures the privacy level of a privatization scheme. For a given (epsilon,)
    we consider the problem of constructing optimal privatization schemes with
    (epsilon)-privacy level, i.e., schemes that minimize the expected estimation
    loss for the worst-case distribution. Two schemes in the literature provide
    order optimal performance in the high privacy regime where (epsilon) is very
    close to (0), and in the low privacy regime where (e^{epsilon} approx k),
    respectively.

    In this paper, we propose a new family of schemes which substantially improve
    the performance of the existing schemes in the medium privacy regime when
    (1 ll e^{epsilon} ll k). More concretely, we prove that when (3.8 < epsilon <
    ln(k/9)), our schemes reduce the expected estimation loss by 50% under the
    (ell_2^2) metric and by 30% under the (ell_1) metric over the existing
    schemes. We also prove a lower bound for the region (e^{epsilon} ll k,) which
    implies that our schemes are order optimal in this regime.
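
    For concreteness, the sketch below implements k-ary randomised response, one
    of the baseline locally private schemes (order optimal in the low privacy
    regime), together with the unbiased estimator obtained by inverting the
    privatisation channel; the schemes proposed in the paper improve on such
    baselines in the medium privacy regime.

        import numpy as np

        def privatize(samples, k, eps, seed=0):
            rng = np.random.default_rng(seed)
            keep = rng.random(len(samples)) < np.exp(eps) / (np.exp(eps) + k - 1)
            other = (samples + rng.integers(1, k, size=len(samples))) % k   # a different symbol
            return np.where(keep, samples, other)

        def estimate(private, k, eps):
            p_keep = np.exp(eps) / (np.exp(eps) + k - 1)
            freq = np.bincount(private, minlength=k) / len(private)
            # invert: freq_j = p_keep * p_j + (1 - p_keep) * (1 - p_j) / (k - 1)
            return (freq - (1 - p_keep) / (k - 1)) / (p_keep - (1 - p_keep) / (k - 1))

        k, eps = 5, 1.0
        p_true = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
        raw = np.random.default_rng(1).choice(k, size=50_000, p=p_true)
        print(estimate(privatize(raw, k, eps), k, eps).round(3))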



