IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    arXiv Paper Daily: Thu, 2 Feb 2017

    我爱机器学习(52ml.net)发表于 2017-02-02 00:00:00
    love 0

    Neural and Evolutionary Computing

    Robust Order Scheduling in the Fashion Industry: A Multi-Objective Optimization Approach

    Wei Du, Yang Tang, Sunney Yung Sun Leung, Le Tong, Athanasios V. Vasilakos, Feng Qian
    Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    In the fashion industry, order scheduling focuses on the assignment of
    production orders to appropriate production lines. In reality, before a new
    order can be put into production, a series of activities known as
    pre-production events need to be completed. In addition, in real production
    process, owing to various uncertainties, the daily production quantity of each
    order is not always as expected. In this research, by considering the
    pre-production events and the uncertainties in the daily production quantity,
    robust order scheduling problems in the fashion industry are investigated with
    the aid of a multi-objective evolutionary algorithm (MOEA) called nondominated
    sorting adaptive differential evolution (NSJADE). The experimental results
    illustrate that it is of paramount importance to consider pre-production events
    in order scheduling problems in the fashion industry. We also unveil that the
    existence of the uncertainties in the daily production quantity heavily affects
    the order scheduling.

    Low-Dose CT with a Residual Encoder-Decoder Convolutional Neural Network (RED-CNN)

    Hu Chen, Yi Zhang, Mannudeep K. Kalra, Feng Lin, Peixi Liao, Jiliu Zhou, Ge Wang
    Subjects: Medical Physics (physics.med-ph); Neural and Evolutionary Computing (cs.NE)

    Given the potential X-ray radiation risk to the patient, low-dose CT has
    attracted a considerable interest in the medical imaging field. The current
    main stream low-dose CT methods include vendor-specific sinogram domain
    filtration and iterative reconstruction, but they need to access original raw
    data whose formats are not transparent to most users. Due to the difficulty of
    modeling the statistical characteristics in the image domain, the existing
    methods for directly processing reconstructed images cannot eliminate image
    noise very well while keeping structural details. Inspired by the idea of deep
    learning, here we combine the autoencoder, the deconvolution network, and
    shortcut connections into the residual encoder-decoder convolutional neural
    network (RED-CNN) for low-dose CT imaging. After patch-based training, the
    proposed RED-CNN achieves a competitive performance relative to
    the-state-of-art methods in both simulated and clinical cases. Especially, our
    method has been favorably evaluated in terms of noise suppression, structural
    preservation and lesion detection.


    Computer Vision and Pattern Recognition

    Product Graph-based Higher Order Contextual Similarities for Inexact Subgraph Matching

    Anjan Dutta, Josep Lladós, Horst Bunke, Umapada Pal
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Many algorithms formulate graph matching as an optimization of an objective
    function of pairwise quantification of nodes and edges of two graphs to be
    matched. Pairwise measurements usually consider local attributes but disregard
    contextual information involved in graph structures. We address this issue by
    proposing contextual similarities between pairs of nodes. This is done by
    considering the tensor product graph (TPG) of two graphs to be matched, where
    each node is an ordered pair of nodes of the operand graphs. Contextual
    similarities between a pair of nodes are computed by accumulating weighted
    walks (normalized pairwise similarities) terminating at the corresponding
    paired node in TPG. Once the contextual similarities are obtained, we formulate
    subgraph matching as a node and edge selection problem in TPG. We use
    contextual similarities to construct an objective function and optimize it with
    a linear programming approach. Since random walk formulation through TPG takes
    into account higher order information, it is not a surprise that we obtain more
    reliable similarities and better discrimination among the nodes and edges.
    Experimental results shown on synthetic as well as real benchmarks illustrate
    that higher order contextual similarities add discriminating power and allow
    one to find approximate solutions to the subgraph matching problem.

    Understanding trained CNNs by indexing neuron selectivity

    Ivet Rafegas, Maria Vanrell, Luis A. Alexandre
    Comments: Under review as a conference paper at ICLR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The impressive performance and plasticity of convolutional neural networks to
    solve different vision problems are shadowed by their black-box nature and its
    consequent lack of full understanding. To reduce this gap we propose to
    describe the activity of individual neurons by quantifying their inherent
    selectivity to specific properties. Our approach is based on the definition of
    feature selectivity indexes that allow the ranking of neurons according to
    specific properties. Here we report the results of exploring selectivity
    indexes for: (a) an image feature (color); and (b) an image label (class
    membership). Our contribution is a framework to seek or classify neurons by
    indexing on these selectivity properties. It helps to find color selective
    neurons, such as a red-mushroom neuron in layer conv4 or class selective
    neurons such as dog-face neurons in layer conv5, and establishes a methodology
    to derive other selectivity properties. Indexing on neuron selectivity can
    statistically draw how features and classes are represented through layers at a
    moment when the size of trained nets is growing and automatic tools to index
    can be helpful.

    Visual Saliency Prediction Using a Mixture of Deep Neural Networks

    Samuel Dodge, Lina Karam
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Visual saliency models have recently begun to incorporate deep learning to
    achieve predictive capacity much greater than previous unsupervised methods.
    However, most existing models predict saliency using local mechanisms limited
    to the receptive field of the network. We propose a model that incorporates
    global scene semantic information in addition to local information gathered by
    a convolutional neural network. Our model is formulated as a mixture of
    experts. Each expert network is trained to predict saliency for a set of
    closely related images. The final saliency map is computed as a weighted
    mixture of the expert networks’ output, with weights determined by a separate
    gating network. This gating network is guided by global scene information to
    predict weights. The expert networks and the gating network are trained
    simultaneously in an end-to-end manner. We show that our mixture formulation
    leads to improvement in performance over an otherwise identical non-mixture
    model that does not incorporate global scene information.

    Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval

    Eng-Jon Ong, Sameed Husain, Miroslaw Bober
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper addresses the problem of large scale image retrieval, with the aim
    of accurately ranking the similarity of a large number of images to a given
    query image. To achieve this, we propose a novel Siamese network. This network
    consists of two computational strands, each comprising of a CNN component
    followed by a Fisher vector component. The CNN component produces dense, deep
    convolutional descriptors that are then aggregated by the Fisher Vector method.
    Crucially, we propose to simultaneously learn both the CNN filter weights and
    Fisher Vector model parameters. This allows us to account for the evolving
    distribution of deep descriptors over the course of the learning process. We
    show that the proposed approach gives significant improvements over the
    state-of-the-art methods on the Oxford and Paris image retrieval datasets.
    Additionally, we provide a baseline performance measure for both these datasets
    with the inclusion of 1 million distractors.

    Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks

    Žiga Emeršič, Luka Lan Gabriel, Vitomir Štruc, Peter Peer
    Comments: 12 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Object detection and segmentation represents the basis for many tasks in
    computer and machine vision. In biometric recognition systems the detection of
    the region-of-interest (ROI) is one of the most crucial steps in the overall
    processing pipeline, significantly impacting the performance of the entire
    recognition system. Existing approaches to ear detection, for example, are
    commonly susceptible to the presence of severe occlusions, ear accessories or
    variable illumination conditions and often deteriorate in their performance if
    applied on ear images captured in unconstrained settings. To address these
    shortcomings, we present in this paper a novel ear detection technique based on
    convolutional encoder-decoder networks (CEDs). For our technique, we formulate
    the problem of ear detection as a two-class segmentation problem and train a
    convolutional encoder-decoder network based on the SegNet architecture to
    distinguish between image-pixels belonging to either the ear or the non-ear
    class. The output of the network is then post-processed to further refine the
    segmentation result and return the final locations of the ears in the input
    image. Different from competing techniques from the literature, our approach
    does not simply return a bounding box around the detected ear, but provides
    detailed, pixel-wise information about the location of the ears in the image.
    Our experiments on a dataset gathered from the web (a.k.a. in the wild) show
    that the proposed technique ensures good detection results in the presence of
    various covariate factors and significantly outperforms the existing
    state-of-the-art.

    Evolving Boxes for fast Vehicle Detection

    Li Wang, Yao Lu, Hong Wang, Yinbin Zheng, Hao Ye, Xiangyang Xue
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We perform fast vehicle detection from traffic surveillance cameras. The
    classic cascade object detection is revisited and a novel deep learning
    framework, namely Evolving Boxes, is developed that proposes and refines the
    object boxes under different feature representations. Specifically, our
    framework is embedded with a light-weight proposal network to generate initial
    anchor boxes as well as to early discard unlikely regions; a fine-turning
    network produces detailed features for these candidate boxes. We show
    intriguingly that by apply different feature fusion techniques, the initial
    boxes can be refined in terms of both localization and recognition, leading to
    evolved boxes. We evaluate our network on the recent DETRAC benchmark and
    obtain a significant improvement over the state-of-the-art Faster RCNN by 9.5%
    mAP. Further, our network achieves 9-13 FPS detection speed on a moderate
    commercial GPU.

    ImageNet MPEG-7 Visual Descriptors – Technical Report

    Frédéric Rayar
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    ImageNet is a large scale and publicly available image database. It currently
    offers more than 14 millions of images, organised according to the WordNet
    hierarchy. One of the main objective of the creators is to provide to the
    research community a relevant database for visual recognition applications such
    as object recognition, image classification or object localisation. However,
    only a few visual descriptors of the images are available to be used by the
    researchers. Only SIFT-based features have been extracted from a subset of the
    collection. This technical report presents the extraction of some MPEG-7 visual
    descriptors from the ImageNet database. These descriptors are made publicly
    available in an effort towards open research.

    A Kinematic Chain Space for Monocular Motion Capture

    Bastian Wandt, Hanno Ackermann, Bodo Rosenhahn
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper deals with motion capture of kinematic chains (e.g. human
    skeletons) from monocular image sequences taken by uncalibrated cameras. We
    present a method based on projecting an observation into a kinematic chain
    space (KCS). An optimization of the nuclear norm is proposed that implicitly
    enforces structural properties of the kinematic chain. Unlike other approaches
    our method does not require specific camera or object motion and is not relying
    on training data or previously determined constraints such as particular body
    lengths. The proposed algorithm is able to reconstruct scenes with limited
    camera motion and previously unseen motions. It is not only applicable to human
    skeletons but also to other kinematic chains for instance animals or industrial
    robots. We achieve state-of-the-art results on different benchmark data bases
    and real world scenes.

    Design, Analysis and Application of A Volumetric Convolutional Neural Network

    Xiaqing Pan, Yueru Chen, C.-C. Jay Kuo
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The design, analysis and application of a volumetric convolutional neural
    network (VCNN) are studied in this work. Although many CNNs have been proposed
    in the literature, their design is empirical. In the design of the VCNN, we
    propose a feed-forward K-means clustering algorithm to determine the filter
    number and size at each convolutional layer systematically. For the analysis of
    the VCNN, the cause of confusing classes in the output of the VCNN is explained
    by analyzing the relationship between the filter weights (also known as anchor
    vectors) from the last fully-connected layer to the output. Furthermore, a
    hierarchical clustering method followed by a random forest classification
    method is proposed to boost the classification performance among confusing
    classes. For the application of the VCNN, we examine the 3D shape
    classification problem and conduct experiments on a popular ModelNet40 dataset.
    The proposed VCNN offers the state-of-the-art performance among all
    volume-based CNN methods.

    High Order Stochastic Graphlet Embedding for Graph-Based Pattern Recognition

    Anjan Dutta
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Graph-based methods are known to be successful for pattern description and
    comparison purpose. However, a lot of mathematical tools are unavailable in
    graph domain, thus restricting the generic graph-based techniques to be
    applicable within the machine learning framework. A way to tackle this problem
    is graph embedding into high dimensional space in either an explicit or
    implicit manner. In this paper, we propose high order stochastic graphlet
    embedding (SGE) that explicitly embed a graph into a real vector space. Our
    main contribution includes a new stochastic search procedure that allows one to
    efficiently parse a given graph and extract or sample unlimitedly high order
    graphlets. We consider these graphlets with increasing size in order to model
    local features, as well as, their complex interactions. We also introduce or
    design graph hash functions with very low probability of collision to hash
    those sampled graphlets for partitioning them into sets of isomorphic ones and
    measure their distribution in large graph collections, which results in
    accurate graph descriptions. When combined with support vector machines, these
    high order graphlet-based descriptions have positive impact on the performance
    of graph-based pattern comparison and classification as corroborated through
    experiments on different standard benchmark databases.

    Denoising Hyperspectral Image with Non-i.i.d. Noise Structure

    Yang Chen, Xiangyong Cao, Qian Zhao, Deyu Meng, Zongben Xu
    Comments: 13 pages, 14 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Hyperspectral image (HSI) denoising has been attracting much research
    attention in remote sensing area due to its importance in improving the HSI
    qualities. The existing HSI denoising methods mainly focus on specific spectral
    and spatial prior knowledge in HSIs, and share a common underlying assumption
    that the embedded noise in HSI is independent and identically distributed
    (i.i.d.). In real scenarios, however, the noise existed in a natural HSI is
    always with much more complicated non-i.i.d. statistical structures and the
    under-estimation to this noise complexity often tends to evidently degenerate
    the robustness of current methods. To alleviate this issue, this paper attempts
    the first effort to model the HSI noise using a non-i.i.d. mixture of Gaussians
    (NMoG) noise assumption, which is finely in accordance with the noise
    characteristics possessed by a natural HSI and thus is capable of adapting
    various noise shapes encountered in real applications. Then we integrate such
    noise modeling strategy into the low-rank matrix factorization (LRMF) model and
    propose a NMoG-LRMF model in the Bayesian framework. A variational Bayes
    algorithm is designed to infer the posterior of the proposed model. All
    involved parameters can be recursively updated in closed-form. Compared with
    the current techniques, the proposed method performs more robust beyond the
    state-of-the-arts, as substantiated by our experiments implemented on synthetic
    and real noisy HSIs.

    Vertical Landing for Micro Air Vehicles using Event-Based Optical Flow

    Bas J. Pijnacker Hordijk, Kirk Y.W. Scheper, Guido C.H.E. de Croon
    Comments: 29 pages, 14 figures, under peer review
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Small flying robots can perform landing maneuvers using bio-inspired optical
    flow by maintaining a constant divergence. However, optical flow is typically
    estimated from frame sequences recorded by standard miniature cameras. This
    requires processing full images on-board, limiting the update rate of
    divergence measurements, and thus the speed of the control loop and the robot.
    Event-based cameras overcome these limitations by only measuring pixel-level
    brightness changes at microsecond temporal accuracy, hence providing an
    efficient mechanism for optical flow estimation. This paper presents, to the
    best of our knowledge, the first work integrating event-based optical flow
    estimation into the control loop of a flying robot. We extend an existing
    ‘local plane fitting’ algorithm to obtain an improved and more computationally
    efficient optical flow estimation method, valid for a wide range of optical
    flow velocities. This method is validated for real event sequences. In
    addition, a method for estimating the divergence from event-based optical flow
    is introduced, which accounts for the aperture problem. The developed
    algorithms are implemented in a constant divergence landing controller on-board
    of a quadrotor. Experiments show that, using event-based optical flow, accurate
    divergence estimates can be obtained over a wide range of speeds. This enables
    the quadrotor to perform very fast landing maneuvers.

    Spatial Aggregation of Holistically-Nested Convolutional Neural Networks for Automated Pancreas Localization and Segmentation

    Holger R. Roth, Le Lu, Nathan Lay, Adam P. Harrison, Amal Farag, Andrew Sohn, Ronald M. Summers
    Comments: This version was submitted to IEEE Trans. on Medical Imaging on Dec. 18th, 2016. The content of this article is covered by US Patent Applications of 62/345,606# and 62/450,681#
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Accurate and automatic organ segmentation from 3D radiological scans is an
    important yet challenging problem for medical image analysis. Specifically, the
    pancreas demonstrates very high inter-patient anatomical variability in both
    its shape and volume. In this paper, we present an automated system using 3D
    computed tomography (CT) volumes via a two-stage cascaded approach: pancreas
    localization and segmentation. For the first step, we localize the pancreas
    from the entire 3D CT scan, providing a reliable bounding box for the more
    refined segmentation step. We introduce a fully deep-learning approach, based
    on an efficient application of holistically-nested convolutional networks
    (HNNs) on the three orthogonal axial, sagittal, and coronal views. The
    resulting HNN per-pixel probability maps are then fused using pooling to
    reliably produce a 3D bounding box of the pancreas that maximizes the recall.
    We show that our introduced localizer compares favorably to both a conventional
    non-deep-learning method and a recent hybrid approach based on spatial
    aggregation of superpixels using random forest classification. The second,
    segmentation, phase operates within the computed bounding box and integrates
    semantic mid-level cues of deeply-learned organ interior and boundary maps,
    obtained by two additional and separate realizations of HNNs. By integrating
    these two mid-level cues, our method is capable of generating
    boundary-preserving pixel-wise class label maps that result in the final
    pancreas segmentation. Quantitative evaluation is performed on a publicly
    available dataset of 82 patient CT scans using 4-fold cross-validation (CV). We
    achieve a Dice similarity coefficient (DSC) of 81.27+/-6.27% in validation,
    which significantly outperforms previous state-of-the art methods that report
    DSCs of 71.80+/-10.70% and 78.01+/-8.20%, respectively, using the same dataset.


    Artificial Intelligence

    A Hybrid Evolutionary Algorithm Based on Solution Merging for the Longest Arc-Preserving Common Subsequence Problem

    Christian Blum, Maria J. Blesa
    Subjects: Artificial Intelligence (cs.AI)

    The longest arc-preserving common subsequence problem is an NP-hard
    combinatorial optimization problem from the field of computational biology.
    This problem finds applications, in particular, in the comparison of
    arc-annotated Ribonucleic acid (RNA) sequences. In this work we propose a
    simple, hybrid evolutionary algorithm to tackle this problem. The most
    important feature of this algorithm concerns a crossover operator based on
    solution merging. In solution merging, two or more solutions to the problem are
    merged, and an exact technique is used to find the best solution within this
    union. It is experimentally shown that the proposed algorithm outperforms a
    heuristic from the literature.

    Blue Sky Ideas in Artificial Intelligence Education from the EAAI 2017 New and Future AI Educator Program

    Eric Eaton, Sven Koenig, Claudia Schulz, Francesco Maurelli, John Lee, Joshua Eckroth, Mark Crowley, Richard G. Freedman, Rogelio E. Cardona-Rivera, Tiago Machado, Tom Williams
    Comments: Working paper in the 7th Symposium on Educational Advances in Artificial Intelligence (EAAI-17)
    Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)

    The 7th Symposium on Educational Advances in Artificial Intelligence
    (EAAI’17, co-chaired by Sven Koenig and Eric Eaton) launched the EAAI New and
    Future AI Educator Program to support the training of early-career university
    faculty, secondary school faculty, and future educators (PhD candidates or
    postdocs who intend a career in academia). As part of the program, awardees
    were asked to address one of the following “blue sky” questions:

    * How could/should Artificial Intelligence (AI) courses incorporate ethics
    into the curriculum?

    * How could we teach AI topics at an early undergraduate or a secondary
    school level?

    * AI has the potential for broad impact to numerous disciplines. How could we
    make AI education more interdisciplinary, specifically to benefit
    non-engineering fields?

    This paper is a collection of their responses, intended to help motivate
    discussion around these issues in AI education.

    Towards "AlphaChem": Chemical Synthesis Planning with Tree Search and Deep Neural Network Policies

    Marwin Segler, Mike Preuß, Mark P. Waller
    Comments: 4 pages, 1 figure
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Chemical Physics (physics.chem-ph)

    Retrosynthesis is a technique to plan the chemical synthesis of organic
    molecules, for example drugs, agro- and fine chemicals. In retrosynthesis, a
    search tree is built by analysing molecules recursively and dissecting them
    into simpler molecular building blocks until one obtains a set of known
    building blocks. The search space is intractably large, and it is difficult to
    determine the value of retrosynthetic positions. Here, we propose to model
    retrosynthesis as a Markov Decision Process. In combination with a Deep Neural
    Network policy learned from essentially the complete published knowledge of
    chemistry, Monte Carlo Tree Search (MCTS) can be used to evaluate positions. In
    exploratory studies, we demonstrate that MCTS with neural network policies
    outperforms the traditionally used best-first search with hand-coded
    heuristics.

    Robust Order Scheduling in the Fashion Industry: A Multi-Objective Optimization Approach

    Wei Du, Yang Tang, Sunney Yung Sun Leung, Le Tong, Athanasios V. Vasilakos, Feng Qian
    Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

    In the fashion industry, order scheduling focuses on the assignment of
    production orders to appropriate production lines. In reality, before a new
    order can be put into production, a series of activities known as
    pre-production events need to be completed. In addition, in real production
    process, owing to various uncertainties, the daily production quantity of each
    order is not always as expected. In this research, by considering the
    pre-production events and the uncertainties in the daily production quantity,
    robust order scheduling problems in the fashion industry are investigated with
    the aid of a multi-objective evolutionary algorithm (MOEA) called nondominated
    sorting adaptive differential evolution (NSJADE). The experimental results
    illustrate that it is of paramount importance to consider pre-production events
    in order scheduling problems in the fashion industry. We also unveil that the
    existence of the uncertainties in the daily production quantity heavily affects
    the order scheduling.


    Information Retrieval

    Inferring Conceptual Relationships When Ranking Patients

    Nut Limsopatham, Craig Macdonald, Iadh Ounis
    Subjects: Information Retrieval (cs.IR)

    Searching patients based on the relevance of their medical records is
    challenging because of the inherent implicit knowledge within the patients’
    medical records and queries. Such knowledge is known to the medical
    practitioners but may be hidden from a search system. For example, when
    searching for the patients with a heart disease, medical practitioners commonly
    know that patients who are taking the amiodarone medicine are relevant, since
    this drug is used to combat heart disease. In this article, we argue that
    leveraging such implicit knowledge improves the retrieval effectiveness, since
    it provides new evidence to infer the relevance of patients’ medical records
    towards a query. Specifically, built upon existing conceptual representation
    for both medical records and queries, we proposed a novel expansion of queries
    that infers additional conceptual relationships from domain-specific resources
    as well as by extracting informative concepts from the top-ranked patients’
    medical records. We evaluate the retrieval effectiveness of our proposed
    approach in the context of the TREC 2011 and 2012 Medical Records track. Our
    results show the effectiveness of our approach to model the implicit knowledge
    in patient search, whereby the retrieval performance is significantly improved
    over both an effective conceptual representation baseline and an existing
    semantic query expansion baseline. In addition, we provide an analysis of the
    types of queries that the proposed approach is likely to be effective.

    ImageNet MPEG-7 Visual Descriptors – Technical Report

    Frédéric Rayar
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

    ImageNet is a large scale and publicly available image database. It currently
    offers more than 14 millions of images, organised according to the WordNet
    hierarchy. One of the main objective of the creators is to provide to the
    research community a relevant database for visual recognition applications such
    as object recognition, image classification or object localisation. However,
    only a few visual descriptors of the images are available to be used by the
    researchers. Only SIFT-based features have been extracted from a subset of the
    collection. This technical report presents the extraction of some MPEG-7 visual
    descriptors from the ImageNet database. These descriptors are made publicly
    available in an effort towards open research.

    Batch Incremental Shared Nearest Neighbor Density Based Clustering Algorithm for Dynamic Datasets

    Panthadeep Bhattacharjee, Amit Awekar
    Comments: 6 pages, Accepted at ECIR 2017
    Subjects: Databases (cs.DB); Information Retrieval (cs.IR)

    Incremental data mining algorithms process frequent updates to dynamic
    datasets efficiently by avoiding redundant computation. Existing incremental
    extension to shared nearest neighbor density based clustering (SNND) algorithm
    cannot handle deletions to dataset and handles insertions only one point at a
    time. We present an incremental algorithm to overcome both these bottlenecks by
    efficiently identifying affected parts of clusters while processing updates to
    dataset in batch mode. We show effectiveness of our algorithm by performing
    experiments on large synthetic as well as real world datasets. Our algorithm is
    up to four orders of magnitude faster than SNND and requires up to 60% extra
    memory than SNND while providing output identical to SNND.

    TOFEC: Achieving Optimal Throughput-Delay Trade-off of Cloud Storage Using Erasure Codes

    Guanfeng Liang, Ulas C. Kozat
    Subjects: Networking and Internet Architecture (cs.NI); Information Retrieval (cs.IR); Performance (cs.PF)

    Our paper presents solutions using erasure coding, parallel connections to
    storage cloud and limited chunking (i.e., dividing the object into a few
    smaller segments) together to significantly improve the delay performance of
    uploading and downloading data in and out of cloud storage.

    TOFEC is a strategy that helps front-end proxy adapt to level of workload by
    treating scalable cloud storage (e.g. Amazon S3) as a shared resource requiring
    admission control. Under light workloads, TOFEC creates more smaller chunks and
    uses more parallel connections per file, minimizing service delay. Under heavy
    workloads, TOFEC automatically reduces the level of chunking (fewer chunks with
    increased size) and uses fewer parallel connections to reduce overhead,
    resulting in higher throughput and preventing queueing delay. Our trace-driven
    simulation results show that TOFEC’s adaptation mechanism converges to an
    appropriate code that provides the optimal delay-throughput trade-off without
    reducing system capacity. Compared to a non-adaptive strategy optimized for
    throughput, TOFEC delivers 2.5x lower latency under light workloads; compared
    to a non-adaptive strategy optimized for latency, TOFEC can scale to support
    over 3x as many requests.


    Computation and Language

    SMPOST: Parts of Speech Tagger for Code-Mixed Indic Social Media Text

    Deepak Gupta, Shubham Tripathi, Asif Ekbal, Pushpak Bhattacharyya
    Comments: 5 pages, ICON 2016
    Subjects: Computation and Language (cs.CL)

    Use of social media has grown dramatically during the last few years. Users
    follow informal languages in communicating through social media. The language
    of communication is often mixed in nature, where people transcribe their
    regional language with English and this technique is found to be extremely
    popular. Natural language processing (NLP) aims to infer the information from
    these text where Part-of-Speech (PoS) tagging plays an important role in
    getting the prosody of the written text. For the task of PoS tagging on
    Code-Mixed Indian Social Media Text, we develop a supervised system based on
    Conditional Random Field classifier. In order to tackle the problem
    effectively, we have focused on extracting rich linguistic features. We
    participate in three different language pairs, ie. English-Hindi,
    English-Bengali and English-Telugu on three different social media platforms,
    Twitter, Facebook & WhatsApp. The proposed system is able to successfully
    assign coarse as well as fine-grained PoS tag labels for a given a code-mixed
    sentence. Experiments show that our system is quite generic that shows
    encouraging performance levels on all the three language pairs in all the
    domains.

    Foreign-language Reviews: Help or Hindrance?

    Scott A. Hale, Irene Eleta
    Journal-ref: Proceedings of the SIGCHI Conference on Human Factors in Computing
    Systems, CHI 2017
    Subjects: Human-Computer Interaction (cs.HC); Computation and Language (cs.CL); Computers and Society (cs.CY)

    The number and quality of user reviews greatly affects consumer purchasing
    decisions. While reviews in all languages are increasing, it is still often the
    case (especially for non-English speakers) that there are only a few reviews in
    a person’s first language. Using an online experiment, we examine the value
    that potential purchasers receive from interfaces showing additional reviews in
    a second language. The results paint a complicated picture with both positive
    and negative reactions to the inclusion of foreign-language reviews. Roughly
    26-28% of subjects clicked to see translations of the foreign-language content
    when given the opportunity, and those who did so were more likely to select the
    product with foreign-language reviews than those who did not.


    Distributed, Parallel, and Cluster Computing

    Agreement Functions for Distributed Computing Models

    Petr Kuznetsov, Thibault Rieutord
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The paper proposes a surprisingly simple characterization of a large class of
    models of distributed computing, via an agreement function: for each set of
    processes, the function determines the best level of set consensus these
    processes can reach. We show that the task computability of a large class of
    steady adversaries that includes, in particular superset-closed and symmetric
    one, is precisely captured by agreement functions.

    Dynamic load balancing for large-scale adaptive finite element computation

    Hui Liu, Tao Cui, Wei Leng, Linbo Zhang
    Comments: in Chinese
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    For the parallel computation of partial differential equations, one key is
    the grid partitioning. It requires that each process owns the same amount of
    computations, and also, the partitioning quality should be proper to reduce the
    communications among processes. When calculating the partial differential
    equations using adaptive finite element methods, the grid and the basis
    functions adjust in each iteration, which introduce load balancing issues. The
    grid should be redistributed dynamically. This paper studies dynamic load
    balancing algorithms and the implementation on the adaptive finite element
    platform PHG. The numerical experiments show that algorithms studied in this
    paper have good partitioning quality, and they are efficient.

    Transaction Support over Redis: An Overview

    Yuqing Zhu, Jianxun Liu, Mengying Guo, Wenlong Ma, Yungang Bao
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    This document outlines the approach to supporting cross-node transactions
    over a Redis cluster.

    Fault diagnosability of data center networks

    Mei-Mei Gu, Rong-Xia Hao, Shuming Zhou
    Comments: 16 pages, 2 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Combinatorics (math.CO)

    The data center networks (D_{n,k}), proposed in 2008, has many desirable
    features such as high network capacity. A kind of generalization of
    diagnosability for network (G) is (g)-good-neighbor diagnosability which is
    denoted by (t_g(G)). Let (kappa^g(G)) be the (R^g)-connectivity. Lin et. al.
    in [IEEE Trans. on Reliability, 65 (3) (2016) 1248–1262] and Xu et. al in
    [Theor. Comput. Sci. 659 (2017) 53–63] gave the same problem independently
    that: the relationship between the (R^g)-connectivity (kappa^g(G)) and
    (t_g(G)) of a general graph (G) need to be studied in the future. In this
    paper, this open problem is solved for general regular graphs. We firstly
    establish the relationship of (kappa^g(G)) and (t_g(G)), and obtain that
    (t_g(G)=kappa^g(G)+g) under some conditions. Secondly, we obtain the
    (g)-good-neighbor diagnosability of (D_{k,n}) which are
    (t_g(D_{k,n})=(g+1)(k-1)+n+g) for (1leq gleq n-1) under the PMC model and the
    MM model, respectively. Further more, we show that (D_{k,n}) is tightly super
    ((n+k-1))-connected for (ngeq 2) and (kgeq 2) and we also prove that the
    largest connected component of the survival graph contains almost all of the
    remaining vertices in (D_{k,n}) when (2k+n-2) vertices removed.


    Learning

    PCA-Initialized Deep Neural Networks Applied To Document Image Analysis

    Mathias Seuret, Michele Alberti, Rolf Ingold, Marcus Liwicki
    Subjects: Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we present a novel approach for initializing deep neural
    networks, i.e., by turning PCA into neural layers. Usually, the initialization
    of the weights of a deep neural network is done in one of the three following
    ways: 1) with random values, 2) layer-wise, usually as Deep Belief Network or
    as auto-encoder, and 3) re-use of layers from another network (transfer
    learning). Therefore, typically, many training epochs are needed before
    meaningful weights are learned, or a rather similar dataset is required for
    seeding a fine-tuning of transfer learning. In this paper, we describe how to
    turn a PCA into an auto-encoder, by generating an encoder layer of the PCA
    parameters and furthermore adding a decoding layer. We analyze the
    initialization technique on real documents. First, we show that a PCA-based
    initialization is quick and leads to a very stable initialization. Furthermore,
    for the task of layout analysis we investigate the effectiveness of PCA-based
    initialization and show that it outperforms state-of-the-art random weight
    initialization methods.

    On orthogonality and learning recurrent networks with long term dependencies

    Eugene Vorontsov, Chiheb Trabelsi, Samuel Kadoury, Chris Pal
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    It is well known that it is challenging to train deep neural networks and
    recurrent neural networks for tasks that exhibit long term dependencies. The
    vanishing or exploding gradient problem is a well known issue associated with
    these challenges. One approach to addressing vanishing and exploding gradients
    is to use either soft or hard constraints on weight matrices so as to encourage
    or enforce orthogonality. Orthogonal matrices preserve gradient norm during
    backpropagation and can therefore be a desirable property; however, we find
    that hard constraints on orthogonality can negatively affect the speed of
    convergence and model performance. This paper explores the issues of
    optimization convergence, speed and gradient stability using a variety of
    different methods for encouraging or enforcing orthogonality. In particular we
    propose a weight matrix factorization and parameterization strategy through
    which we can bound matrix norms and therein control the degree of expansivity
    induced during backpropagation.

    Learning the distribution with largest mean: two bandit frameworks

    Emilie Kaufmann (SEQUEL, CRIStAL, CNRS), Aurélien Garivier (IMT)
    Subjects: Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)

    Over the past few years, the multi-armed bandit model has become increasingly
    popular in the machine learning community, in part because of applications
    including online content optimization. This paper reviews two different
    sequential learning tasks that have been considered in the bandit literature ;
    they can be formulated as (sequentially) learning the distribution that has the
    highest mean among a set of distributions, with some constraints on the
    learning process. For both of them (regret minimization and best arm
    identification), we present (asymptotically) optimal algorithms, some of which
    are quite recent. We compare the behavior of the sampling rule of each
    algorithm as well as the complexity terms associated to each problem.

    On SGD's Failure in Practice: Characterizing and Overcoming Stalling

    Vivak Patel
    Comments: 17 pages, 4 figures, 4 Tables
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Optimization and Control (math.OC); Computation (stat.CO)

    Stochastic Gradient Descent (SGD) is widely used in machine learning problems
    to efficiently perform empirical risk minimization, yet, in practice, SGD is
    known to stall before reaching the actual minimizer of the empirical risk. SGD
    stalling has often been attributed to its sensitivity to the conditioning of
    the problem; however, as we demonstrate, SGD will stall even when applied to a
    simple linear regression problem with unity condition number for standard
    learning rates. Thus, in this work, we numerically demonstrate and
    mathematically argue that stalling is a crippling and generic limitation of SGD
    and its variants in practice. Once we have established the problem of stalling,
    we introduce a framework for hedging against its effects, which (1) deters SGD
    and its variants from stalling, (2) still provides convergence guarantees, and
    (3) makes SGD and its variants more practical methods for minimization.

    Machine learning based compact photonic structure design for strong light confinement

    Mirbek Turduev, Çağrı Latifoğlu, İbrahim Halil Giden, Y. Sinan Hanay
    Comments: 7 pages, 4 figures
    Subjects: Optics (physics.optics); Learning (cs.LG)

    We present a novel approach based on machine learning for designing photonic
    structures. In particular, we focus on strong light confinement that allows the
    design of an efficient free-space-to-waveguide coupler which is made of Si-
    slab overlying on the top of silica substrate. The learning algorithm is
    implemented using bitwise square Si- cells and the whole optimized device has a
    footprint of (oldsymbol{2 , mu m imes 1, mu m}), which is the smallest
    size ever achieved numerically. To find the effect of Si- slab thickness on the
    sub-wavelength focusing and strong coupling characteristics of optimized
    photonic structure, we carried out three-dimensional time-domain numerical
    calculations. Corresponding optimum values of full width at half maximum and
    coupling efficiency were calculated as (oldsymbol{0.158 lambda}) and
    (oldsymbol{-1.87,dB}) with slab thickness of (oldsymbol{280nm}). Compared
    to the conventional counterparts, the optimized lens and coupler designs are
    easy-to-fabricate via optical lithography techniques, quite compact, and can
    operate at telecommunication wavelengths. The outcomes of the presented study
    show that machine learning can be beneficial for efficient photonic designs in
    various potential applications such as polarization-division, beam manipulation
    and optical interconnects.

    Communication-Optimal Distributed Clustering

    Jiecao Chen, He Sun, David P. Woodruff, Qin Zhang
    Comments: A preliminary version of this paper appeared at the 30th Annual Conference on Neural Information Processing Systems (NIPS), 2016
    Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG)

    Clustering large datasets is a fundamental problem with a number of
    applications in machine learning. Data is often collected on different sites
    and clustering needs to be performed in a distributed manner with low
    communication. We would like the quality of the clustering in the distributed
    setting to match that in the centralized setting for which all the data resides
    on a single site. In this work, we study both graph and geometric clustering
    problems in two distributed models: (1) a point-to-point model, and (2) a model
    with a broadcast channel. We give protocols in both models which we show are
    nearly optimal by proving almost matching communication lower bounds. Our work
    highlights the surprising power of a broadcast channel for clustering problems;
    roughly speaking, to spectrally cluster (n) points or (n) vertices in a graph
    distributed across (s) servers, for a worst-case partitioning the communication
    complexity in a point-to-point model is (n cdot s), while in the broadcast
    model it is (n + s). A similar phenomenon holds for the geometric setting as
    well. We implement our algorithms and demonstrate this phenomenon on real life
    datasets, showing that our algorithms are also very efficient in practice.

    On the Futility of Learning Complex Frame-Level Language Models for Chord Recognition

    Filip Korzeniowski, Gerhard Widmer
    Subjects: Sound (cs.SD); Learning (cs.LG)

    Chord recognition systems use temporal models to post-process frame-wise
    chord preditions from acoustic models. Traditionally, first-order models such
    as Hidden Markov Models were used for this task, with recent works suggesting
    to apply Recurrent Neural Networks instead. Due to their ability to learn
    longer-term dependencies, these models are supposed to learn and to apply
    musical knowledge, instead of just smoothing the output of the acoustic model.
    In this paper, we argue that learning complex temporal models at the level of
    audio frames is futile on principle, and that non-Markovian models do not
    perform better than their first-order counterparts. We support our argument
    through three experiments on the McGill Billboard dataset. The first two show
    1) that when learning complex temporal models at the frame level, improvements
    in chord sequence modelling are marginal; and 2) that these improvements do not
    translate when applied within a full chord recognition system. The third, still
    rather preliminary experiment gives first indications that the use of complex
    sequential models for chord prediction at higher temporal levels might be more
    promising.

    Representation of big data by dimension reduction

    A.G.Ramm, C. Van
    Subjects: Information Theory (cs.IT); Learning (cs.LG); Machine Learning (stat.ML)

    Suppose the data consist of a set (S) of points (x_j, 1 leq j leq J),
    distributed in a bounded domain (D subset R^N), where (N) and (J) are large
    numbers. In this paper an algorithm is proposed for checking whether there
    exists a manifold (mathbb{M}) of low dimension near which many of the points
    of (S) lie and finding such (mathbb{M}) if it exists. There are many dimension
    reduction algorithms, both linear and non-linear. Our algorithm is simple to
    implement and has some advantages compared with the known algorithms. If there
    is a manifold of low dimension near which most of the data points lie, the
    proposed algorithm will find it. Some numerical results are presented
    illustrating the algorithm and analyzing its performance compared to the
    classical PCA (principal component analysis) and Isomap.

    Towards "AlphaChem": Chemical Synthesis Planning with Tree Search and Deep Neural Network Policies

    Marwin Segler, Mike Preuß, Mark P. Waller
    Comments: 4 pages, 1 figure
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Chemical Physics (physics.chem-ph)

    Retrosynthesis is a technique to plan the chemical synthesis of organic
    molecules, for example drugs, agro- and fine chemicals. In retrosynthesis, a
    search tree is built by analysing molecules recursively and dissecting them
    into simpler molecular building blocks until one obtains a set of known
    building blocks. The search space is intractably large, and it is difficult to
    determine the value of retrosynthetic positions. Here, we propose to model
    retrosynthesis as a Markov Decision Process. In combination with a Deep Neural
    Network policy learned from essentially the complete published knowledge of
    chemistry, Monte Carlo Tree Search (MCTS) can be used to evaluate positions. In
    exploratory studies, we demonstrate that MCTS with neural network policies
    outperforms the traditionally used best-first search with hand-coded
    heuristics.


    Information Theory

    New Beam Tracking Technique for Millimeter Wave-band Communications

    Jisu Bae, Sun Hong Lim, Jin Hyeok Yoo, Jun Won Choi
    Comments: 6 pages
    Subjects: Information Theory (cs.IT)

    In this paper, we propose an efficient beam tracking method for mobility
    scenario in mmWave-band communications. When the position of the mobile changes
    in mobility scenario, the base-station needs to perform beam training
    frequently to track the time-varying channel, thereby spending significant
    resources for training beams. In order to reduce the training overhead, we
    propose a new beam training approach called “beam tracking” which exploits the
    continuous nature of time varying angle of departure (AoD) for beam selection.
    We show that transmission of only two training beams is enough to track the
    time-varying AoD at good accuracy. We derive the optimal selection of beam pair
    which minimizes Cramer-Rao Lower Bound (CRLB) for AoD estimation averaged over
    statistical distribution of the AoD. Our numerical results demonstrate that the
    proposed beam tracking scheme produces better AoD estimation than the
    conventional beam training protocol with less training overhead.

    The Average Dimension of the Hermitian Hull of Constayclic Codes over Finite Fields

    Somphong Jitman, Ekkasit Sangwisut
    Subjects: Information Theory (cs.IT)

    The hulls of linear and cyclic codes have been extensively studied due to
    their wide applications.

    In this paper, the average dimension of the Hermitian hull of constacyclic
    codes of length (n) over a finite field (mathbb{F}_{q^2}) is determined
    together with some upper and lower bounds. It turns out that either the average
    dimension of the Hermitian hull of constacyclic codes of length (n) over
    (mathbb{F}_{q^2}) is zero or it grows the same rate as (n). Comparison to the
    average dimension of the Euclidean hull of cyclic codes is discussed as well.

    Collision vs non-Collision Distributed Time Synchronization for Dense IoT Deployments

    Maria Antonieta Alvarez, Umberto Spagnolini
    Comments: to be published in IEEE Int. Conf. Commun. (ICC), 2017
    Subjects: Information Theory (cs.IT)

    Massive co-located devices require new paradigms to allow proper network
    connectivity. Internet of things (IoT) is the paradigm that offers a solution
    for the inter-connectivity of devices, but in dense IoT networks time
    synchronization is a critical aspect. Further, the scalability is another
    crucial aspect. This paper focuses on synchronization for uncoordinated dense
    networks without any external timing reference. Two synchronization methods are
    proposed and compared: i) conventional synchronization that copes with the high
    density of nodes by frame collision-avoidance methods (e.g., CSMA/CA) to avoid
    the superimposition (or collision) of synchronization signals; and ii)
    distributed synchronization that exploits the frames’ collision to drive the
    network to a global synchronization. The distributed synchronization algorithm
    allows the network to reach a timing synchronization status based on a common
    beacon with the same signature broadcasted by every device. The superimposition
    of beacons from all the other devices enables the network synchronization,
    rather than preventing it. Numerical analysis evaluates the synchronization
    performance based on the convergence time and synchronization dispersion, both
    on collision and non-collision scenario, by investigating the scalability of
    the network. Results prove that in dense network the ensemble of signatures
    provides remarkable improvements of synchronization performance compared to
    conventional master-slave reference.

    Location and Orientation Optimisation for Spatially Stretched Tripole Arrays Based on Compressive Sensing

    Matthew Hawes, Lyudmila Mihaylova, Wei Liu
    Comments: This is an extended version of a paper published in IEEE Transactions of Signal Processing. If citing this work please use the information for the published version
    Subjects: Information Theory (cs.IT)

    The design of sparse spatially stretched tripole arrays is an important but
    also challenging task and this paper proposes for the very first time efficient
    solutions to this problem. Unlike for the design of traditional sparse antenna
    arrays, the developed approaches optimise both the dipole locations and
    orientations. The novelty of the paper consists in formulating these
    optimisation problems into a form that can be solved by the proposed
    compressive sensing and Bayesian compressive sensing based approaches. The
    performance of the developed approaches is validated and it is shown that
    accurate approximation of a reference response can be achieved with a 67%
    reduction in the number of dipoles required as compared to an equivalent
    uniform spatially stretched tripole array, leading to a significant reduction
    in the cost associated with the resulting arrays.

    Short-Message Communication and FIR System Identification using Huffman Sequences

    Philipp Walk, Peter Jung, Babak Hassibi
    Subjects: Information Theory (cs.IT)

    Providing short-message communication and simultaneous channel estimation for
    sporadic and fast fading scenarios is a challenge for future wireless networks.
    In this work we propose a novel blind communication and deconvolution scheme by
    using Huffman sequences, which allows to solve three important tasks in one
    step: (i) determination of the transmit power (ii) identification of the
    discrete-time FIR channel by providing a maximum delay of less than (L/2) and
    (iii) simultaneously communicating (L-1) bits of information. Our signal
    reconstruction uses a recent semi-definite program that can recover two unknown
    signals from their auto-correlations and cross-correlations. This convex
    algorithm is stable and operates fully deterministic without any further
    channel assumptions.

    Structure and Performance of Generalized Quasi-Cyclic Codes

    Cem Güneri, Ferruh Özbudak, Buket Özkaya, Elif Saçıkara, Zahra Sepasdar, Patrick Solé
    Subjects: Information Theory (cs.IT)

    Generalized quasi-cyclic (GQC) codes form a natural generalization of
    quasi-cyclic (QC) codes. They are viewed here as mixed alphabet codes over a
    family of ring alphabets. Decomposing these rings into local rings by the
    Chinese Remainder Theorem yields a decomposition of GQC codes into a sum of
    concatenated codes. This decomposition leads to a trace formula, a minimum
    distance bound, and to a criteria for the GQC code to be self-dual or to be
    linear complementary dual (LCD). Explicit long GQC codes that are LCD, but not
    QC, are exhibited.

    Optimal Caching in Mobile HetNets: A Throughput–Delay Trade-off Perspective

    Trung-Anh Do, Sang-Woon Jeon, Won-Yong Shin
    Comments: 18 pages, 7 figures, Submitted to IEEE Transactions on Mobile Computing. Part of this paper was presented at the IEEE International Symposium on Information Theory, Barcelona, Spain, July 2016
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    We analyze the optimal throughput–delay trade-off in content-centric mobile
    heterogeneous networks (HetNets), where each node moves according to the random
    walk mobility model and requests a content object from the library
    independently at random, according to a Zipf popularity distribution. Instead
    of allowing access to all content objects via costly backhaul, we consider a
    more practical scenario where mobile nodes and base stations (BSs), each having
    a finite-size cache space, are able to cache a subset of content objects so
    that each request is served by other mobile nodes or BSs via multihop
    transmissions. Under a given caching strategy, we first characterize a
    fundamental throughput–delay trade-off in terms of scaling laws by introducing
    a general content delivery multihop routing protocol. Then, the optimal
    throughput–delay trade-off is characterized by presenting the optimal cache
    allocation strategy, which jointly finds the replication sets at mobile nodes
    and BSs via nontrivial variable decoupling. In addition, computer simulations
    are performed to validate our analytical results. We also show that the optimal
    strategy strictly outperforms a baseline approach, where the replication sets
    at mobile nodes and BSs are optimized separately.

    Info-Clustering: An Efficient Algorithm by Network Information Flow

    Chung Chan, Ali Al-Bashabsheh, Qiaoqiao Zhou
    Subjects: Information Theory (cs.IT)

    Motivated by the fact that entities in a social network or biological system
    often interact by exchanging information, we propose an efficient
    info-clustering algorithm that can group entities into communities using a
    parametric max-flow algorithm. This is a meaningful special case of the
    info-clustering paradigm where the dependency structure is graphical and can be
    learned readily from data.

    Expansion of the Kullback-Leibler Divergence, and a new class of information metrics

    David J. Galas, T. Gregory Dewey, James Kunert-Graf, Nikita A. Sakhanenko
    Comments: 22 pages, 1 figure
    Subjects: Information Theory (cs.IT)

    Inferring and comparing complex, multivariable probability density functions
    is a fundamental problem in several fields, including probabilistic learning,
    network theory, and data analysis. Classification and prediction are the two
    faces of this class of problem. We take an approach here that simplifies many
    aspects of these problems by presenting a structured series expansion of the
    Kullback-Leibler divergence – a function central to information theory – and
    devise a distance metric based on this divergence. Using the M”obius inversion
    duality between multivariable entropies and multivariable interaction
    information, we express the divergence as an additive series in the number of
    interacting variables. Truncations of this series yield approximations based on
    the number of interacting variables. The first few terms of the
    expansion-truncation are illustrated and shown to lead naturally to familiar
    approximations, including the well-known Kirkwood superposition approximation.
    Truncation can also induce a simple relation between the multi-information and
    the interaction information. A measure of distance between distributions, based
    on Kullback-Leibler divergence, is then described and shown to be a true
    metric. Finally, the expansion is shown to generate a hierarchy of metrics and
    connects it to information geometry formalisms. These results provide a general
    approach for systematic approximations in numbers of interactions or
    connections, and a related quantitative metric.

    Representation of big data by dimension reduction

    A.G.Ramm, C. Van
    Subjects: Information Theory (cs.IT); Learning (cs.LG); Machine Learning (stat.ML)

    Suppose the data consist of a set (S) of points (x_j, 1 leq j leq J),
    distributed in a bounded domain (D subset R^N), where (N) and (J) are large
    numbers. In this paper an algorithm is proposed for checking whether there
    exists a manifold (mathbb{M}) of low dimension near which many of the points
    of (S) lie and finding such (mathbb{M}) if it exists. There are many dimension
    reduction algorithms, both linear and non-linear. Our algorithm is simple to
    implement and has some advantages compared with the known algorithms. If there
    is a manifold of low dimension near which most of the data points lie, the
    proposed algorithm will find it. Some numerical results are presented
    illustrating the algorithm and analyzing its performance compared to the
    classical PCA (principal component analysis) and Isomap.

    Sharp Bounds on Arimoto's Conditional Rényi Entropies Between Two Distinct Orders

    Yuta Sakai, Ken-ichi Iwata
    Comments: 61 pages, 6 figures; typos added, references added for Section V; A brief version of this paper was submitted to the 2017 IEEE International Symposium on Information Theory (ISIT2017)
    Subjects: Information Theory (cs.IT)

    This study examines sharp bounds on Arimoto’s conditional R’enyi entropy of
    order (eta) with a fixed another one of distinct order (alpha
    eq eta).
    Arimoto inspired the relation between the R’enyi entropy and the
    (ell_{r})-norm of probability distributions, and he introduced a conditional
    version of the R’enyi entropy. From this perspective, we analyze the
    (ell_{r})-norms of particular distributions. As results, we identify specific
    probability distributions whose achieve our sharp bounds on the conditional
    R’enyi entropy. The sharp bounds derived in this study can be applicable to
    other information measures, e.g., the minimum average probability of error, the
    Bhattacharyya parameter, Gallager’s reliability function (E_{0}), and Sibson’s
    (alpha)-mutual information, whose are strictly monotone functions of the
    conditional R’enyi entropy.

    Experimental Investigation of Optimum Beam Size for FSO Uplink

    Hemani Kaushal, Georges Kaddoum, Virander Kumar Jain, Subrat Kar
    Comments: 14 pages, 11 figures
    Subjects: Instrumentation and Detectors (physics.ins-det); Information Theory (cs.IT)

    In this paper, the effect of transmitter beam size on the performance of free
    space optical (FSO) communication has been determined experimentally.
    Irradiance profile for varying turbulence strength is obtained using optical
    turbulence generating (OTG) chamber inside laboratory environment. Based on the
    results, an optimum beam size is investigated using the semi-analytical method.
    Moreover, the combined effects of atmospheric scintillation and beam wander
    induced pointing errors are considered in order to determine the optimum beam
    size that minimizes the bit error rate (BER) of the system for a fixed
    transmitter power and link length. The results show that the optimum beam size
    increases with the increase in zenith angle but has negligible effect with the
    increase in fade threshold level at low turbulence levels and has a marginal
    effect at high turbulence levels. Finally, the obtained outcome is useful for
    FSO system design and BER performance analysis.

    Coded Computation over Heterogeneous Clusters

    Amirhossein Reisizadehmobarakeh, Saurav Prakash, Ramtin Pedarsani, Salman Avestimehr
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    In large-scale distributed computing clusters, such as Amazon EC2, there are
    several types of “system noise” that can result in major degradation of
    performance: system failures, bottlenecks due to limited communication
    bandwidth, latency due to straggler nodes, etc. On the other hand, these
    systems enjoy abundance of redundancy — a vast number of computing nodes and
    large storage capacity. There have been recent results that demonstrate the
    impact of coding for efficient utilization of computation and storage
    redundancy to alleviate the effect of stragglers and communication bottlenecks
    in homogeneous clusters. In this paper, we focus on general heterogeneous
    distributed computing clusters consisting of a variety of computing machines
    with different capabilities. We propose a coding framework for speeding up
    distributed computing in heterogeneous clusters with straggling servers by
    trading redundancy for reducing the latency of computation. In particular, we
    propose Heterogeneous Coded Matrix Multiplication (HCMM) algorithm for
    performing distributed matrix multiplication over heterogeneous clusters that
    is provably asymptotically optimal. Moreover, if the number of worker nodes in
    the cluster is (n), we show that HCMM is (Theta(log n)) times faster than any
    uncoded scheme. We further provide numerical results demonstrating significant
    speedups of up to (49\%) and (34\%) for HCMM in comparison to the “uncoded” and
    “homogeneous coded” schemes, respectively.




沪ICP备19023445号-2号
友情链接