
    arXiv Paper Daily: Fri, 17 Feb 2017

    Published by 我爱机器学习 (52ml.net) on 2017-02-17 00:00:00

    Neural and Evolutionary Computing

    Unbiased Online Recurrent Optimization

    Corentin Tallec, Yann Ollivier
    Comments: 11 pages, 5 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    The novel Unbiased Online Recurrent Optimization (UORO) algorithm allows for
    online learning of general recurrent computational graphs such as recurrent
    network models. It works in a streaming fashion and avoids backtracking through
    past activations and inputs. UORO is a modification of NoBackTrack that
    bypasses the need for model sparsity and makes implementation easy in current
    deep learning frameworks, even for complex models. Computationally, UORO is as
    costly as Truncated Backpropagation Through Time (TBPTT). Contrary to TBPTT,
    UORO is guaranteed to provide unbiased gradient estimates, and does not favor
    short-term dependencies. The downside is added noise, requiring smaller
    learning rates.

    On synthetic tasks, UORO is found to overcome several deficiencies of TBPTT.
    For instance, when a parameter has a positive short-term but negative long-term
    influence, TBPTT may require truncation lengths substantially larger than the
    intrinsic temporal range of the interactions, while UORO performs well thanks
    to the unbiasedness of its gradients.
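
    As a concrete illustration of the streaming, backtracking-free setting described
    above, the following is a rough numpy sketch of a UORO-style rank-one gradient
    estimator on a toy linear recurrent model. The model, the loss, the
    variance-normalization constants rho0/rho1 and the learning rate are our own
    illustrative assumptions, not the paper's experimental setup.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy linear recurrent model: s_{t+1} = W s_t + U x_t, prediction y_t = V s_{t+1}.
    d_s, d_x = 8, 4
    W = 0.1 * rng.standard_normal((d_s, d_s))
    U = 0.1 * rng.standard_normal((d_s, d_x))
    V = 0.1 * rng.standard_normal((1, d_s))

    def flat(dW, dU):
        return np.concatenate([dW.ravel(), dU.ravel()])

    # Rank-one approximation of ds/dtheta, kept as outer(tilde_s, tilde_theta).
    tilde_s = np.zeros(d_s)
    tilde_theta = np.zeros(d_s * d_s + d_s * d_x)
    s = np.zeros(d_s)
    lr = 1e-3

    for t in range(2000):
        x = rng.standard_normal(d_x)
        target = x.sum()                      # arbitrary streaming regression target
        s_new = W @ s + U @ x                 # forward transition (no past activations kept)

        # Rank-one update with random signs; the cross terms vanish in expectation.
        nu = rng.choice([-1.0, 1.0], size=d_s)
        Fs_ts = W @ tilde_s                                  # (dF/ds) tilde_s (exact here)
        nuFth = flat(np.outer(nu, s), np.outer(nu, x))       # nu^T (dF/dtheta)
        eps = 1e-12
        rho0 = np.sqrt((np.linalg.norm(tilde_theta) + eps) / (np.linalg.norm(Fs_ts) + eps))
        rho1 = np.sqrt((np.linalg.norm(nuFth) + eps) / (np.linalg.norm(nu) + eps))
        tilde_s = rho0 * Fs_ts + rho1 * nu
        tilde_theta = tilde_theta / rho0 + nuFth / rho1

        # Unbiased estimate of the loss gradient w.r.t. the recurrent parameters (W, U).
        y = V @ s_new
        dl_ds = (V.T * (y - target)).ravel()
        g = (dl_ds @ tilde_s) * tilde_theta

        W -= lr * g[:d_s * d_s].reshape(d_s, d_s)
        U -= lr * g[d_s * d_s:].reshape(d_s, d_x)
        V -= lr * np.outer(y - target, s_new)  # output layer gets its exact gradient
        s = s_new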

    Precise Recovery of Latent Vectors from Generative Adversarial Networks

    Zachary C. Lipton, Subarna Tripathi
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Generative adversarial networks (GANs) transform latent vectors into visually
    plausible images. It is generally thought that the original GAN formulation
    gives no out-of-the-box method to reverse the mapping, projecting images back
    into latent space. We introduce a simple, gradient-based technique called
    stochastic clipping. In experiments, for images generated by the GAN, we
    exactly recover their latent vector pre-images 100% of the time. Additional
    experiments demonstrate that this method is robust to noise. Finally, we show
    that even for unseen images, our method appears to recover unique encodings.
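
    Since the abstract describes stochastic clipping only at a high level, here is a
    minimal numpy sketch of the idea on a toy inversion problem: gradient descent on
    the reconstruction error, with any latent coordinate that leaves [-1, 1] re-drawn
    uniformly at random instead of being clamped. The linear stand-in generator and
    all constants are illustrative assumptions; the paper applies the procedure to a
    trained GAN generator.

    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in "generator": a fixed linear map from latent space to pixel space.
    # (The real method back-propagates through a trained GAN generator instead.)
    d_z, d_img = 16, 64
    A = rng.standard_normal((d_img, d_z))
    generate = lambda z: A @ z

    z_true = rng.uniform(-1.0, 1.0, size=d_z)   # GAN latents typically live in [-1, 1]
    x = generate(z_true)                        # "image" whose pre-image we want

    z = rng.uniform(-1.0, 1.0, size=d_z)        # random initial guess
    lr = 1.0 / (2.0 * np.linalg.eigvalsh(A.T @ A).max())   # safe step for the linear toy
    for _ in range(500):
        grad = 2.0 * A.T @ (generate(z) - x)    # exact gradient for the linear stand-in
        z = z - lr * grad
        escaped = np.abs(z) > 1.0               # stochastic clipping step
        z[escaped] = rng.uniform(-1.0, 1.0, size=escaped.sum())

    print("latent reconstruction error:", np.linalg.norm(z - z_true))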

    Training Language Models Using Target-Propagation

    Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    While Truncated Back-Propagation through Time (BPTT) is the most popular
    approach to training Recurrent Neural Networks (RNNs), it suffers from being
    inherently sequential (making parallelization difficult) and from truncating
    gradient flow between distant time-steps. We investigate whether Target
    Propagation (TPROP) style approaches can address these shortcomings.
    Unfortunately, extensive experiments suggest that TPROP generally underperforms
    BPTT, and we end with an analysis of this phenomenon, and suggestions for
    future work.


    Computer Vision and Pattern Recognition

    Improving Text Proposals for Scene Images with Fully Convolutional Networks

    Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, Andrew D. Bagdanov
    Comments: 6 pages, 8 figures, International Conference on Pattern Recognition (ICPR) – DLPR (Deep Learning for Pattern Recognition) workshop
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Text Proposals have emerged as a class-dependent version of object proposals
    – efficient approaches to reduce the search space of possible text object
    locations in an image. Combined with strong word classifiers, text proposals
    currently yield top state of the art results in end-to-end scene text
    recognition. In this paper we propose an improvement over the original Text
    Proposals algorithm of Gomez and Karatzas (2016), combining it with Fully
    Convolutional Networks to improve the ranking of proposals. Results on the
    ICDAR RRC and the COCO-text datasets show superior performance over current
    state-of-the-art.
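
    The abstract does not spell out how the FCN output is combined with the
    proposals, so the snippet below sketches one simple, plausible re-ranking rule:
    score each proposal by the mean per-pixel text confidence it encloses. The
    heatmap and boxes here are synthetic placeholders, not the paper's pipeline.

    import numpy as np

    rng = np.random.default_rng(0)

    # Placeholder FCN output: a per-pixel "textness" heatmap in [0, 1].
    H, W = 120, 160
    heatmap = rng.random((H, W))

    # Placeholder text proposals as (x0, y0, x1, y1) boxes.
    boxes = np.array([[10, 10, 60, 30],
                      [50, 40, 140, 80],
                      [0, 90, 40, 119]])

    def textness_score(box, heatmap):
        """Mean heatmap value inside the box: one simple way to score a proposal."""
        x0, y0, x1, y1 = box
        return float(heatmap[y0:y1 + 1, x0:x1 + 1].mean())

    scores = np.array([textness_score(b, heatmap) for b in boxes])
    ranking = np.argsort(-scores)               # proposals re-ranked by FCN evidence
    print(ranking, scores[ranking])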

    KEPLER: Keypoint and Pose Estimation of Unconstrained Faces by Learning Efficient H-CNN Regressors

    Amit Kumar, Azadeh Alavi, Rama Chellappa
    Comments: Accept as Oral FG’17
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Keypoint detection is one of the most important pre-processing steps in tasks
    such as face modeling, recognition and verification. In this paper, we present
    an iterative method for Keypoint Estimation and Pose prediction of
    unconstrained faces by Learning Efficient H-CNN Regressors (KEPLER) for
    addressing the face alignment problem. Recent state of the art methods have
    shown improvements in face keypoint detection by employing Convolutional Neural
    Networks (CNNs). Although a simple feed-forward neural network can learn the
    mapping between input and output spaces, it cannot learn the inherent
    structural dependencies. We present a novel architecture called H-CNN
    (Heatmap-CNN) which captures structured global and local features and thus
    favors accurate keypoint detection. H-CNN is jointly trained on the visibility,
    fiducials and 3D-pose of the face. As the iterations proceed, the error
    decreases, making the gradients small and thus requiring efficient training of
    DCNNs to mitigate this. KEPLER performs global corrections in pose and
    fiducials for the first four iterations followed by local corrections in the
    subsequent stage. As a by-product, KEPLER also provides 3D pose (pitch, yaw and
    roll) of the face accurately. In this paper, we show that without using any 3D
    information, KEPLER outperforms state of the art methods for alignment on
    challenging datasets such as AFW and AFLW.

    Improving automated multiple sclerosis lesion segmentation with a cascaded 3D convolutional neural network approach

    Sergi Valverde, Mariano Cabezas, Eloy Roura, Sandra González-Villà, Deborah Pareto, Joan-Carles Vilanova, LLuís Ramió-Torrentà, Àlex Rovira, Arnau Oliver, Xavier Lladó
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we present a novel automated method for White Matter (WM)
    lesion segmentation of Multiple Sclerosis (MS) patient images. Our approach is
    based on a cascade of two 3D patch-wise convolutional neural networks (CNN).
    The first network is trained to be more sensitive, revealing possible candidate
    lesion voxels, while the second network is trained to reduce the number of
    misclassified voxels coming from the first network. This cascaded CNN
    architecture tends to learn well from small sets of training data, which can be
    very interesting in practice, given the difficulty to obtain manual label
    annotations and the large amount of available unlabeled Magnetic Resonance
    Imaging (MRI) data. We evaluate the accuracy of the proposed method on the
    public MS lesion segmentation challenge MICCAI2008 dataset, comparing it with
    respect to other state-of-the-art MS lesion segmentation tools. Furthermore,
    the proposed method is also evaluated on two private MS clinical datasets,
    where the performance of our method is also compared with different recent
    publicly available state-of-the-art MS lesion segmentation methods. At the time
    of writing this paper, our method is the best ranked approach on the MICCAI2008
    challenge, outperforming the rest of the 60 participant methods when using all
    the available input modalities (T1-w, T2-w and FLAIR), while still in the top
    rank (3rd position) when using only T1-w and FLAIR modalities. On clinical MS
    data, our approach exhibits a significant increase in the accuracy of WM lesion
    segmentation when compared with the rest of the evaluated methods, also
    correlating highly (\(r \ge 0.97\)) with the expected lesion volume.
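
    A structural sketch of how such a two-stage cascade could look at inference time
    is given below; the patch size, the thresholds and the predict_net1/predict_net2
    callables are placeholders standing in for the two trained 3D patch-wise CNNs.

    import numpy as np

    def cascade_segment(volume, predict_net1, predict_net2, patch=11, t1=0.5, t2=0.8):
        """Two-stage cascade: a sensitive first network proposes candidate voxels,
        a second network prunes false positives among those candidates."""
        half = patch // 2
        D, H, W = volume.shape
        mask = np.zeros_like(volume, dtype=bool)
        for z in range(half, D - half):
            for y in range(half, H - half):
                for x in range(half, W - half):
                    p = volume[z - half:z + half + 1,
                               y - half:y + half + 1,
                               x - half:x + half + 1]
                    if predict_net1(p) < t1:                 # stage 1: candidate detection
                        continue
                    mask[z, y, x] = predict_net2(p) >= t2    # stage 2: refinement
        return mask

    # Toy usage with dummy "networks" (simple intensity statistics).
    vol = np.random.default_rng(0).random((20, 20, 20))
    seg = cascade_segment(vol, lambda p: p.mean(), lambda p: p.max())
    print(int(seg.sum()), "voxels flagged")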

    Deep Hybrid Similarity Learning for Person Re-identification

    Jianqing Zhu, Huanqiang Zeng, Shengcai Liao, Zhen Lei, Canhui Cai, LiXin Zheng
    Comments: 10 pages, 12 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Person Re-IDentification (Re-ID) aims to match person images captured from
    two non-overlapping cameras. In this paper, a deep hybrid similarity learning
    (DHSL) method for person Re-ID based on a convolution neural network (CNN) is
    proposed. In our approach, a CNN learning feature pair is simultaneously
    extracted for the input image pair. Then, both the element-wise absolute
    difference and multiplication of the CNN feature pair are calculated. Finally, a
    hybrid similarity function is designed to measure the similarity between the
    feature pair, which is realized by learning a group of weight coefficients to
    project the element-wise absolute difference and multiplication into a
    similarity score. Consequently, the proposed DHSL method is able to reasonably
    assign parameters of feature learning and metric learning in a CNN so that the
    performance of person Re-ID is improved. Experiments on three challenging
    person Re-ID databases, QMUL GRID, VIPeR and CUHK03, illustrate that the
    proposed DHSL method is superior to multiple state-of-the-art person Re-ID
    methods.
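
    The hybrid similarity itself is simple to state; below is a small numpy sketch of
    the scoring step described in the abstract, with random placeholder features and
    weights where the paper learns them jointly with the CNN.

    import numpy as np

    rng = np.random.default_rng(0)

    d = 128
    f_a, f_b = rng.standard_normal(d), rng.standard_normal(d)   # CNN feature pair (stand-ins)

    # Weights projecting the element-wise absolute difference and product into a score.
    w_diff = rng.standard_normal(d)
    w_prod = rng.standard_normal(d)
    bias = 0.0

    def hybrid_similarity(fa, fb):
        diff = np.abs(fa - fb)          # metric-learning style term
        prod = fa * fb                  # feature-correlation style term
        return float(w_diff @ diff + w_prod @ prod + bias)

    print(hybrid_similarity(f_a, f_b))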

    Chord Angle Deviation using Tangent (CADT), an Efficient and Robust Contour-based Corner Detector

    Mohammad Asiful Hossain, Abdul Kawsar Tushar
    Comments: Conference Name – 2017 IEEE International Conference on Imaging, Vision & Pattern Recognition (icIVPR17); Conference Date – 13 Feb, 2017; Conference Venue – University of Dhaka, Dhaka, Bangladesh
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Corner detection is one of the most essential steps in a large number of
    computer vision and image processing applications. We review a number of
    popular contour-based corner detectors in this paper. Among these detectors,
    the chord-to-triangular-arm angle (CTAA) detector has been shown to be the most
    dominant in terms of average repeatability. In this paper we introduce a new,
    effective method to calculate the value of curvature. Experimental results
    demonstrate that our proposed technique outperforms CTAA and the other
    detectors mentioned in this paper. The results show that our proposed method is
    simple yet efficient at detecting corners more accurately and reliably.

    Discovering objects and their relations from entangled scene representations

    David Raposo, Adam Santoro, David Barrett, Razvan Pascanu, Timothy Lillicrap, Peter Battaglia
    Comments: ICLR Workshop 2017
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    Our world can be succinctly and compactly described as structured scenes of
    objects and relations. A typical room, for example, contains salient objects
    such as tables, chairs and books, and these objects typically relate to each
    other by their underlying causes and semantics. This gives rise to correlated
    features, such as position, function and shape. Humans exploit knowledge of
    objects and their relations for learning a wide spectrum of tasks, and more
    generally when learning the structure underlying observed data. In this work,
    we introduce relation networks (RNs) – a general purpose neural network
    architecture for object-relation reasoning. We show that RNs are capable of
    learning object relations from scene description data. Furthermore, we show
    that RNs can act as a bottleneck that induces the factorization of objects from
    entangled scene description inputs, and from distributed deep representations
    of scene images provided by a variational autoencoder. The model can also be
    used in conjunction with differentiable memory mechanisms for implicit relation
    discovery in one-shot learning tasks. Our results suggest that relation
    networks are a potentially powerful architecture for solving a variety of
    problems that require object relation reasoning.
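
    For reference, a common relation-network form aggregates a learned pairwise
    function g over all object pairs and feeds the sum to a second learned function
    f; the numpy sketch below uses tiny random MLPs as stand-ins for those learned
    functions and random vectors as object descriptors. Details may differ from the
    architecture used in the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def mlp(sizes):
        """Tiny random MLP used as a stand-in for the learned functions g and f."""
        Ws = [0.1 * rng.standard_normal((m, n)) for m, n in zip(sizes[1:], sizes[:-1])]
        def forward(x):
            for W in Ws[:-1]:
                x = np.maximum(W @ x, 0.0)       # ReLU hidden layers
            return Ws[-1] @ x
        return forward

    objects = rng.standard_normal((5, 8))        # a scene as a set of object descriptors

    # Relation network: RN(O) = f( sum over object pairs of g(o_i, o_j) ).
    g = mlp([16, 32, 32])                        # pairwise relation function
    f = mlp([32, 32, 4])                         # aggregates the summed relations
    pair_sum = sum(g(np.concatenate([oi, oj]))
                   for i, oi in enumerate(objects)
                   for j, oj in enumerate(objects) if i != j)
    print(f(pair_sum))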


    Artificial Intelligence

    Reflexive Regular Equivalence for Bipartite Data

    Aaron Gerow, Mingyang Zhou, Stan Matwin, Feng Shi
    Comments: A condensed version of this paper will appear in Proceedings of the 30th Canadian Conference on Artificial Intelligence, Edmonton, Alberta, Canada
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Bipartite data is common in data engineering and brings unique challenges,
    particularly when it comes to clustering tasks that impose strong structural
    assumptions. This work presents an unsupervised method for assessing similarity
    in bipartite data. Similar to some co-clustering methods, the method is based
    on regular equivalence in graphs. The algorithm uses spectral properties of a
    bipartite adjacency matrix to estimate similarity in both dimensions. The
    method is reflexive in that similarity in one dimension is used to inform
    similarity in the other. Reflexive regular equivalence can also use the
    structure of transitivities — in a network sense — the contribution of which
    is controlled by the algorithm’s only free parameter, \(\alpha\). The method is
    completely unsupervised and can be used to validate assumptions of
    co-similarity, which are required but often untested, in co-clustering
    analyses. Three variants of the method with different normalizations are tested
    on synthetic data. The method is found to be robust to noise and well-suited to
    asymmetric co-similar structure, making it particularly informative for cluster
    analysis and recommendation in bipartite data of unknown structure. In
    experiments, the convergence and speed of the algorithm are found to be stable
    for different levels of noise. Real-world data from a network of malaria genes
    are analyzed, where the similarity produced by the reflexive method is shown to
    outperform other measures at correctly classifying genes.
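
    The abstract leaves the exact update implicit; the sketch below shows one
    plausible reading of the reflexive iteration, in which row similarities are
    recomputed through the current column similarities and vice versa. The update,
    the normalization and the omission of the alpha-weighted transitivity term are
    our own simplifying assumptions, not the paper's algorithm.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy bipartite adjacency matrix (rows: one node type, columns: the other).
    A = (rng.random((30, 20)) < 0.15).astype(float)

    def normalize(S):
        d = np.sqrt(np.outer(np.diag(S), np.diag(S)))
        return np.divide(S, d, out=np.zeros_like(S), where=d > 0)

    S_rows = np.eye(A.shape[0])
    S_cols = np.eye(A.shape[1])
    for _ in range(50):                     # iterate until the two matrices stabilize
        S_rows = normalize(A @ S_cols @ A.T)
        S_cols = normalize(A.T @ S_rows @ A)

    print(S_rows.shape, S_cols.shape)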

    Theoretical and Practical Advances on Smoothing for Extensive-Form Games

    Christian Kroer, Kevin Waugh, Fatma Kilinc-Karzan, Tuomas Sandholm
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI)

    Sparse iterative methods, in particular first-order methods, are known to be
    among the most effective in solving large-scale two-player zero-sum
    extensive-form games. The convergence rates of these methods depend heavily on
    the properties of the distance-generating function that they are based on. We
    investigate the acceleration of first-order methods for solving extensive-form
    games through better design of the dilated entropy function—a class of
    distance-generating functions related to the domains associated with the
    extensive-form games. By introducing a new weighting scheme for the dilated
    entropy function, we develop the first distance-generating function for the
    strategy spaces of sequential games that has no dependence on the branching
    factor of the player. This result improves the convergence rate of several
    first-order methods by a factor of \(\Omega(b^d d)\), where \(b\) is the branching
    factor of the player, and \(d\) is the depth of the game tree.

    Thus far, counterfactual regret minimization methods have been faster in
    practice, and more popular, than first-order methods despite their
    theoretically inferior convergence rates. Using our new weighting scheme and
    practical tuning we show that, for the first time, the excessive gap technique
    can be made faster than the fastest counterfactual regret minimization
    algorithm, CFR+, in practice.

    Efficient Computation of Moments in Sum-Product Networks

    Han Zhao, Geoff Gordon
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Bayesian online learning algorithms for Sum-Product Networks (SPNs) need to
    compute moments of model parameters under the one-step update posterior
    distribution. The best existing method for computing such moments scales
    quadratically in the size of the SPN, although it scales linearly for trees. We
    propose a linear-time algorithm that works even when the SPN is a directed
    acyclic graph (DAG). We achieve this goal by reducing the moment computation
    problem into a joint inference problem in SPNs and by taking advantage of a
    special structure of the one-step update posterior distribution: it is a
    multilinear polynomial with exponentially many monomials, and we can evaluate
    moments by differentiating. The latter is known as the \emph{differential
    trick}. We apply the proposed algorithm to develop a linear-time assumed
    density filter (ADF) for SPN parameter learning. As an additional contribution,
    we conduct extensive experiments comparing seven different online learning
    algorithms for SPNs on 20 benchmark datasets. The new linear-time ADF method
    consistently achieves low runtime due to the efficient linear-time algorithm
    for moment computation; however, we discover that two other methods (CCCP and
    SMA) typically perform better statistically, while a third (BMM) is comparable
    to ADF. Interestingly, CCCP can be viewed as implicitly using the same
    differentiation trick that we make explicit here. The fact that two of the top
    four fastest methods use this trick suggests that the same trick might find
    other uses for SPN learning in the future.


    Information Retrieval

    Luandri: a Clean Lua Interface to the Indri Search Engine

    Bhaskar Mitra, Fernando Diaz, Nick Craswell
    Comments: Under review for SIGIR’17
    Subjects: Information Retrieval (cs.IR)

    In recent years, the information retrieval (IR) community has witnessed the
    first successful applications of deep neural network models to short-text
    matching and ad-hoc retrieval. It is exciting to see the research on deep
    neural networks and IR converge on these tasks of shared interest. However, the
    two communities have less in common when it comes to the choice of programming
    languages. Indri, an indexing framework popularly used by the IR community, is
    written in C++, while Torch, a popular machine learning library for deep
    learning, is written in the light-weight scripting language Lua. To bridge this
    gap, we introduce Luandri (pronounced “laundry”), a simple interface for
    exposing the search capabilities of Indri to Torch models implemented in Lua.

    Multimodal Content Representation and Similarity Ranking of Movies

    Konstantinos Bougiatiotis
    Comments: Preliminary work
    Subjects: Information Retrieval (cs.IR)

    In this paper we examine the existence of correlation between movie
    similarity and low level features from respective movie content. In particular,
    we demonstrate the extraction of multi-modal representation models of movies
    based on subtitles, audio and metadata mining. We focus our research on
    topic modeling of movies based on their subtitles. In order to demonstrate the
    proposed content representation approach, we have built a small dataset of 160
    widely known movies. We assess movie similarities, as propagated by the
    individual modalities and fusion models, in the form of recommendation rankings.
    We showcase a novel topic model browser for movies that allows for exploration
    of the different aspects of similarities between movies and an information
    retrieval system for movie similarity based on multi-modal content.
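
    As a toy illustration of turning multi-modal representations into similarity
    rankings, the sketch below fuses per-modality cosine similarities with fixed
    weights and ranks the catalogue against a query movie; the data, the weights and
    the late-fusion rule are illustrative assumptions rather than the paper's models.

    import numpy as np

    rng = np.random.default_rng(0)

    # Placeholder per-modality movie representations for a tiny catalogue of 6 films.
    n_movies = 6
    subtitles = rng.random((n_movies, 20))    # e.g. topic proportions from subtitles
    audio = rng.random((n_movies, 10))
    metadata = rng.random((n_movies, 5))

    def cosine_sim(X):
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        return Xn @ Xn.T

    # Simple late fusion: weighted sum of per-modality similarity matrices.
    S = 0.5 * cosine_sim(subtitles) + 0.3 * cosine_sim(audio) + 0.2 * cosine_sim(metadata)

    query = 0                                 # recommend movies similar to movie 0
    ranking = np.argsort(-S[query])
    print([int(m) for m in ranking if m != query])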


    Computation and Language

    Addressing the Data Sparsity Issue in Neural AMR Parsing

    Xiaochang Peng, Chuan Wang, Daniel Gildea, Nianwen Xue
    Comments: Accepted by EACL-17
    Subjects: Computation and Language (cs.CL)

    Neural attention models have achieved great success in different NLP tasks.
    However, they have not fulfilled their promise on the AMR parsing task due to
    the data sparsity issue. In this paper, we describe a sequence-to-sequence
    model for AMR parsing and present different ways to tackle the data sparsity
    problem. We show that our methods achieve significant improvement over a
    baseline neural attention model and our results are also competitive
    against state-of-the-art systems that do not use extra linguistic resources.

    Fast and unsupervised methods for multilingual cognate clustering

    Taraka Rama, Johannes Wahle, Pavel Sofroniev, Gerhard Jäger
    Subjects: Computation and Language (cs.CL)

    In this paper we explore the use of unsupervised methods for detecting
    cognates in multilingual word lists. We use online EM to train sound segment
    similarity weights for computing similarity between two words. We tested our
    online systems on sixteen geographically spread language groups of the world
    and show that the Online PMI (Pointwise Mutual Information) system outperforms
    an HMM-based system and two linguistically motivated systems, LexStat and
    ALINE. Our results suggest that a PMI system trained in an online fashion can
    be used by historical linguists for fast and accurate identification of
    cognates in less well-studied language families.
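
    The PMI weighting at the core of the approach is easy to state; the snippet below
    computes PMI scores for aligned sound-segment pairs from toy counts. The counts
    are made up for illustration: the paper estimates them with online EM over real
    multilingual word lists and then uses the weights when scoring candidate cognate
    pairs.

    import math
    from collections import Counter

    # Toy counts of aligned sound-segment pairs gathered from word alignments.
    pair_counts = Counter({("p", "b"): 40, ("p", "p"): 120, ("t", "d"): 35,
                           ("t", "t"): 150, ("a", "a"): 300, ("a", "o"): 25})

    total = sum(pair_counts.values())
    left, right = Counter(), Counter()
    for (a, b), c in pair_counts.items():
        left[a] += c
        right[b] += c

    def pmi(a, b):
        """Pointwise mutual information of an aligned segment pair."""
        p_ab = pair_counts[(a, b)] / total
        p_a, p_b = left[a] / total, right[b] / total
        return math.log(p_ab / (p_a * p_b)) if p_ab > 0 else float("-inf")

    # Higher PMI means the segments are aligned more often than chance predicts,
    # so aligning them is rewarded when comparing two word forms.
    print(pmi("p", "b"), pmi("a", "o"))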

    An Analysis of Machine Learning Intelligence

    John P. Lalor, Hao Wu, Tsendsuren Munkhdalai, Hong Yu
    Comments: 10 pages, 2 figures, 3 tables
    Subjects: Computation and Language (cs.CL)

    Deep neural networks (DNNs) have set state-of-the-art results in many machine
    learning and NLP tasks. However, we do not have a strong understanding of what
    DNN models learn. In this paper, we examine learning in DNNs through analysis
    of their outputs. We compare DNN performance directly to a human population,
    and use characteristics of individual data points such as difficulty to see how
    well models perform on easy and hard examples. We investigate how training size
    and the incorporation of noise affect a DNN’s ability to generalize and learn.
    Our experiments show that unlike traditional machine learning models (e.g.,
    Naive Bayes, Decision Trees), DNNs exhibit human-like learning properties. As
    they are trained with more data, they are more able to distinguish between easy
    and difficult items, and performance on easy items improves at a higher rate
    than difficult items. We find that different DNN models exhibit different
    strengths in learning and are robust to noise in training data.

    Training Language Models Using Target-Propagation

    Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    While Truncated Back-Propagation through Time (BPTT) is the most popular
    approach to training Recurrent Neural Networks (RNNs), it suffers from being
    inherently sequential (making parallelization difficult) and from truncating
    gradient flow between distant time-steps. We investigate whether Target
    Propagation (TPROP) style approaches can address these shortcomings.
    Unfortunately, extensive experiments suggest that TPROP generally underperforms
    BPTT, and we end with an analysis of this phenomenon, and suggestions for
    future work.


    Distributed, Parallel, and Cluster Computing

    Integration of QoS aspects in the Cloud Computing Research and Selection System

    Manar Abourezq, Abdellah Idrissi
    Journal-ref: International Journal of Advanced Computer Science and Applications
    (IJACSA), Vol. 6, No. 6, 2015
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Cloud Computing is a business model revolution more than a technological one.
    It capitalizes on various technologies that have proved themselves and reshapes
    the use of computers by replacing their local use with a centralized one, where
    shared resources are stored and managed by a third party in a way that is
    transparent to end-users. With this new use came new needs, and one of them is the need to
    search through Cloud services and select the ones that meet certain
    requirements. To address this need, we have developed, in a previous work, the
    Cloud Service Research and Selection System (CSRSS) which aims to allow Cloud
    users to search through Cloud services in the database and find the ones that
    match their requirements. It is based on the Skyline and ELECTRE IS. In this
    paper, we improve the system by introducing 7 new dimensions related to QoS
    constraints. Our work’s main contribution is conceiving an Agent that uses both
    the Skyline and an outranking method, called ELECTREIsSkyline, to determine
    which Cloud services better meet the users’ requirements while respecting QoS
    properties. We implemented and tested this method for a total of 10 dimensions
    and 50 000 cloud services. The first results are very promising and show
    the effectiveness of our approach.
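
    To make the Skyline step concrete, here is a minimal sketch of Pareto-dominance
    filtering over a handful of made-up services with lower-is-better QoS dimensions;
    the ELECTRE IS outranking stage that the abstract combines with it is not
    reproduced here.

    from typing import List, Tuple

    # Each service is a vector of QoS dimensions where lower is better
    # (e.g. price, response time, failure rate).  Illustrative data only.
    Service = Tuple[str, List[float]]
    services: List[Service] = [
        ("s1", [0.10, 120, 0.02]),
        ("s2", [0.08, 150, 0.02]),
        ("s3", [0.15, 100, 0.01]),
        ("s4", [0.20, 160, 0.05]),   # dominated by s1 on every dimension
    ]

    def dominates(a: List[float], b: List[float]) -> bool:
        """a dominates b if it is no worse everywhere and strictly better somewhere."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def skyline(items: List[Service]) -> List[Service]:
        """Keep only the services not dominated by any other service."""
        return [(n, q) for n, q in items
                if not any(dominates(q2, q) for n2, q2 in items if n2 != n)]

    print([name for name, _ in skyline(services)])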

    Ignore or Comply? On Breaking Symmetry in Consensus

    Petra Berenbrink, Andrea Clementi, Robert Elsässer, Peter Kling, Frederik Mallmann-Trenn, Emanuele Natale
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We study consensus processes on the complete graph of \(n\) nodes. Initially,
    each node supports one of up to \(n\) opinions. Nodes randomly and in parallel
    sample the opinions of constant many nodes. Based on these samples, they use an
    update rule to change their own opinion.

    The goal is to reach consensus, a configuration where all nodes support the
    same opinion. We compare two well-known update rules: 2-Choices and 3-Majority.
    In the former, each node samples two nodes and adopts their opinion if they
    agree. In the latter, each node samples three nodes: If an opinion is supported
    by at least two samples the node adopts it, otherwise it randomly adopts one of
    the sampled opinions. Known results for these update rules focus on initial
    configurations with a limited number of colors (say \(n^{1/3}\)), or typically
    assume a bias, where one opinion has a much larger support than any other. For
    such biased configurations, the time to reach consensus is roughly the same for
    2-Choices and 3-Majority.

    Interestingly, we prove that this is no longer true for configurations with a
    large number of initial colors. In particular, we show that 3-Majority reaches
    consensus with high probability in \(O(n^{3/4}\log^{7/8} n)\) rounds, while
    2-Choices can need \(\Omega(n/\log n)\) rounds. We thus get the first
    unconditional sublinear bound for 3-Majority and the first result separating
    the consensus time of these processes. Along the way, we develop a framework
    that allows a fine-grained comparison between consensus processes from a
    specific class. We believe that this framework might help to classify the
    performance of more consensus processes.
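
    A direct way to build intuition for the two update rules is to simulate them; the
    sketch below runs both dynamics on a small complete graph starting from
    all-distinct opinions. Sampling with replacement (self-samples allowed) and the
    chosen n are implementation conveniences, not the paper's exact model.

    import numpy as np

    rng = np.random.default_rng(0)

    def consensus_rounds(n, rule, max_rounds=50000):
        """Number of parallel rounds until all n nodes hold the same opinion."""
        opinions = np.arange(n)                  # every node starts with its own opinion
        for t in range(1, max_rounds + 1):
            new = opinions.copy()
            for v in range(n):
                if rule == "2-choices":
                    a, b = opinions[rng.integers(n, size=2)]
                    if a == b:                   # adopt only if both samples agree
                        new[v] = a
                else:                            # 3-majority
                    s = opinions[rng.integers(n, size=3)]
                    vals, counts = np.unique(s, return_counts=True)
                    if counts.max() >= 2:        # an opinion supported by >= 2 samples
                        new[v] = vals[counts.argmax()]
                    else:                        # all distinct: adopt a random sample
                        new[v] = s[rng.integers(3)]
            opinions = new
            if np.all(opinions == opinions[0]):
                return t
        return max_rounds

    n = 64
    print("3-majority:", consensus_rounds(n, "3-majority"))
    print("2-choices :", consensus_rounds(n, "2-choices"))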

    Proust: A Design Space for Highly-Concurrent Transactional Data Structures

    Thomas D. Dickerson, Paul Gazzillo, Eric Koskinen, Maurice Herlihy
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

    Most STM systems are poorly equipped to support libraries of concurrent data
    structures. One reason is that they typically detect conflicts by tracking
    transactions’ read sets and write sets, an approach that often leads to false
    conflicts. A second is that existing data structures and libraries often need
    to be rewritten from scratch to support transactional conflict detection and
    rollback. This paper introduces Proust, a framework for the design and
    implementation of transactional data structures. Proust is designed to maximize
    re-use of existing, well-engineered data structures by providing transactional
    “wrappers” to make existing thread-safe concurrent data structures
    transactional. Proustian objects are also integrated with an underlying STM
    system, allowing them to take
    advantage of well-engineered STM conflict detection mechanisms. Proust
    generalizes and unifies prior approaches such as boosting and predication.

    Coded TeraSort

    Songze Li, Sucha Supittayapornpong, Mohammad Ali Maddah-Ali, A. Salman Avestimehr
    Comments: to appear in proceedings of 2017 International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    We focus on sorting, which is the building block of many machine learning
    algorithms, and propose a novel distributed sorting algorithm, named Coded
    TeraSort, which substantially improves the execution time of the TeraSort
    benchmark in Hadoop MapReduce. The key idea of Coded TeraSort is to impose
    structured redundancy in data, in order to enable in-network coding
    opportunities that overcome the data shuffling bottleneck of TeraSort. We
    empirically evaluate the performance of the CodedTeraSort algorithm on Amazon EC2
    clusters, and demonstrate that it achieves 1.97x – 3.39x speedup, compared with
    TeraSort, for typical settings of interest.

    An Efficient Parallel Data Clustering Algorithm Using Isoperimetric Number of Trees

    Ramin Javadi, Saleh Ashkboos
    Comments: 16 pages, 6 figures
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We propose a parallel graph-based data clustering algorithm using a CUDA GPU,
    based on exact clustering of the minimum spanning tree in terms of a minimum
    isoperimetric criterion. We also provide a comparative performance analysis of
    our algorithm with other related ones which demonstrates the general
    superiority of this parallel algorithm over other competing algorithms in terms
    of accuracy and speed.


    Learning

    Discovering objects and their relations from entangled scene representations

    David Raposo, Adam Santoro, David Barrett, Razvan Pascanu, Timothy Lillicrap, Peter Battaglia
    Comments: ICLR Workshop 2017
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    Our world can be succinctly and compactly described as structured scenes of
    objects and relations. A typical room, for example, contains salient objects
    such as tables, chairs and books, and these objects typically relate to each
    other by their underlying causes and semantics. This gives rise to correlated
    features, such as position, function and shape. Humans exploit knowledge of
    objects and their relations for learning a wide spectrum of tasks, and more
    generally when learning the structure underlying observed data. In this work,
    we introduce relation networks (RNs) – a general purpose neural network
    architecture for object-relation reasoning. We show that RNs are capable of
    learning object relations from scene description data. Furthermore, we show
    that RNs can act as a bottleneck that induces the factorization of objects from
    entangled scene description inputs, and from distributed deep representations
    of scene images provided by a variational autoencoder. The model can also be
    used in conjunction with differentiable memory mechanisms for implicit relation
    discovery in one-shot learning tasks. Our results suggest that relation
    networks are a potentially powerful architecture for solving a variety of
    problems that require object relation reasoning.

    Reflexive Regular Equivalence for Bipartite Data

    Aaron Gerow, Mingyang Zhou, Stan Matwin, Feng Shi
    Comments: A condensed version of this paper will appear in Proceedings of the 30th Canadian Conference on Artificial Intelligence, Edmonton, Alberta, Canada
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    Bipartite data is common in data engineering and brings unique challenges,
    particularly when it comes to clustering tasks that impose strong structural
    assumptions. This work presents an unsupervised method for assessing similarity
    in bipartite data. Similar to some co-clustering methods, the method is based
    on regular equivalence in graphs. The algorithm uses spectral properties of a
    bipartite adjacency matrix to estimate similarity in both dimensions. The
    method is reflexive in that similarity in one dimension is used to inform
    similarity in the other. Reflexive regular equivalence can also use the
    structure of transitivities — in a network sense — the contribution of which
    is controlled by the algorithm’s only free parameter, \(\alpha\). The method is
    completely unsupervised and can be used to validate assumptions of
    co-similarity, which are required but often untested, in co-clustering
    analyses. Three variants of the method with different normalizations are tested
    on synthetic data. The method is found to be robust to noise and well-suited to
    asymmetric co-similar structure, making it particularly informative for cluster
    analysis and recommendation in bipartite data of unknown structure. In
    experiments, the convergence and speed of the algorithm are found to be stable
    for different levels of noise. Real-world data from a network of malaria genes
    are analyzed, where the similarity produced by the reflexive method is shown to
    outperform other measures at correctly classifying genes.

    Learning to Use Learners' Advice

    Adish Singla, Hamed Hassani, Andreas Krause
    Subjects: Learning (cs.LG)

    In this paper, we study a variant of the framework of online learning using
    expert advice with limited/bandit feedback—we consider each expert a learning
    entity and thereby capture scenarios that are more realistic and practical for
    real-world applications. In our setting, the feedback at any time \(t\) is
    limited in the sense that it is only available to the expert \(i^t\) that has
    been selected by the central algorithm (forecaster), i.e., only the expert
    \(i^t\) receives feedback from the environment and gets to learn at time \(t\).
    We consider a generic black-box approach whereby the forecaster doesn’t control
    or know the learning dynamics of the experts apart from knowing the following
    no-regret learning property: the average regret of any expert \(j\) vanishes at
    a rate of at least \(O(t_j^{\eta-1})\) with \(t_j\) learning steps, where
    \(\eta \in [0, 1]\) is a parameter. We prove the following hardness result:
    without any coordination between the forecaster and the experts, it is
    impossible to design a forecaster achieving no-regret guarantees in the worst
    case. In order to circumvent this hardness result, we consider a practical
    assumption allowing the forecaster to “guide” the learning process of the
    experts by filtering/blocking some of the feedback observed by them from the
    environment, i.e., not allowing the selected expert \(i^t\) to learn at time
    \(t\) for some time steps. Then, we design a novel no-regret learning algorithm
    for this problem setting by carefully guiding the feedback observed by the
    experts. We prove that this algorithm achieves a worst-case expected cumulative
    regret of \(O(T^{\frac{1}{2-\eta}})\) after \(T\) time steps and matches the
    regret bound of \(\Theta(T^{\frac{1}{2}})\) for the special case of multi-armed
    bandits.

    Precise Recovery of Latent Vectors from Generative Adversarial Networks

    Zachary C. Lipton, Subarna Tripathi
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Generative adversarial networks (GANs) transform latent vectors into visually
    plausible images. It is generally thought that the original GAN formulation
    gives no out-of-the-box method to reverse the mapping, projecting images back
    into latent space. We introduce a simple, gradient-based technique called
    stochastic clipping. In experiments, for images generated by the GAN, we
    exactly recover their latent vector pre-images 100% of the time. Additional
    experiments demonstrate that this method is robust to noise. Finally, we show
    that even for unseen images, our method appears to recover unique encodings.

    Efficient Computation of Moments in Sum-Product Networks

    Han Zhao, Geoff Gordon
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)

    Bayesian online learning algorithms for Sum-Product Networks (SPNs) need to
    compute moments of model parameters under the one-step update posterior
    distribution. The best existing method for computing such moments scales
    quadratically in the size of the SPN, although it scales linearly for trees. We
    propose a linear-time algorithm that works even when the SPN is a directed
    acyclic graph (DAG). We achieve this goal by reducing the moment computation
    problem into a joint inference problem in SPNs and by taking advantage of a
    special structure of the one-step update posterior distribution: it is a
    multilinear polynomial with exponentially many monomials, and we can evaluate
    moments by differentiating. The latter is known as the \emph{differential
    trick}. We apply the proposed algorithm to develop a linear-time assumed
    density filter (ADF) for SPN parameter learning. As an additional contribution,
    we conduct extensive experiments comparing seven different online learning
    algorithms for SPNs on 20 benchmark datasets. The new linear-time ADF method
    consistently achieves low runtime due to the efficient linear-time algorithm
    for moment computation; however, we discover that two other methods (CCCP and
    SMA) typically perform better statistically, while a third (BMM) is comparable
    to ADF. Interestingly, CCCP can be viewed as implicitly using the same
    differentiation trick that we make explicit here. The fact that two of the top
    four fastest methods use this trick suggests that the same trick might find
    other uses for SPN learning in the future.

    Unbiased Online Recurrent Optimization

    Corentin Tallec, Yann Ollivier
    Comments: 11 pages, 5 figures
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    The novel Unbiased Online Recurrent Optimization (UORO) algorithm allows for
    online learning of general recurrent computational graphs such as recurrent
    network models. It works in a streaming fashion and avoids backtracking through
    past activations and inputs. UORO is a modification of NoBackTrack that
    bypasses the need for model sparsity and makes implementation easy in current
    deep learning frameworks, even for complex models. Computationally, UORO is as
    costly as Truncated Backpropagation Through Time (TBPTT). Contrary to TBPTT,
    UORO is guaranteed to provide unbiased gradient estimates, and does not favor
    short-term dependencies. The downside is added noise, requiring smaller
    learning rates.

    On synthetic tasks, UORO is found to overcome several deficiencies of TBPTT.
    For instance, when a parameter has a positive short-term but negative long-term
    influence, TBPTT may require truncation lengths substantially larger than the
    intrinsic temporal range of the interactions, while UORO performs well thanks
    to the unbiasedness of its gradients.

    Generalizing Jensen and Bregman divergences with comparative convexity and the statistical Bhattacharyya distances with comparable means

    Frank Nielsen, Richard Nock
    Comments: 19 pages
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    Comparative convexity is a generalization of convexity relying on abstract
    notions of means. We define the Jensen divergence and the Jensen diversity from
    the viewpoint of comparative convexity, and show how to obtain the generalized
    Bregman divergences as limit cases of skewed Jensen divergences. In particular,
    we report explicit formulas for these generalized Bregman divergences when
    considering quasi-arithmetic means. Finally, we introduce a generalization of
    the Bhattacharyya statistical distances based on comparative means using
    relative convexity.
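
    For reference, in the classical arithmetic-mean case that the paper generalizes,
    the skewed Jensen divergence and its Bregman limit can be written as follows
    (conventions for the skew parameter vary between papers):

    \[
      J_F^{(\alpha)}(p:q) \;=\; \alpha F(p) + (1-\alpha)\,F(q) - F\bigl(\alpha p + (1-\alpha) q\bigr),
      \qquad \alpha \in (0,1),
    \]
    \[
      B_F(p:q) \;=\; F(p) - F(q) - \langle p - q,\, \nabla F(q) \rangle
      \;=\; \lim_{\alpha \to 0^{+}} \tfrac{1}{\alpha}\, J_F^{(\alpha)}(p:q).
    \]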

    Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging

    Shusen Wang, Alex Gittens, Michael W. Mahoney
    Subjects: Machine Learning (stat.ML); Learning (cs.LG); Numerical Analysis (cs.NA)

    We address the statistical and optimization impacts of using classical or
    Hessian sketch to approximately solve the Matrix Ridge Regression (MRR)
    problem. Prior research has considered the effects of classical sketch on least
    squares regression (LSR), a strictly simpler problem. We establish that
    classical sketch has a similar effect upon the optimization properties of MRR
    as it does on those of LSR — namely, it recovers nearly optimal solutions. In
    contrast, Hessian sketch does not have this guarantee; instead, the
    approximation error is governed by a subtle interplay between the “mass” in the
    responses and the optimal objective value.

    For both types of approximations, the regularization in the sketched MRR
    problem gives it significantly different statistical properties from the
    sketched LSR problem. In particular, there is a bias-variance trade-off in
    sketched MRR that is not present in sketched LSR. We provide upper and lower
    bounds on the biases and variances of sketched MRR; these establish that the
    variance is significantly increased when classical sketch is used, while the
    bias is significantly increased when using Hessian sketches. Empirically,
    sketched MRR solutions can have risks that are higher by an order-of-magnitude
    than those of the optimal MRR solutions.

    We establish theoretically and empirically that model averaging greatly
    decreases this gap. Thus, in the distributed setting, sketching combined with
    model averaging is a powerful technique for quickly obtaining near-optimal
    solutions to the MRR problem, greatly mitigating the statistical losses incurred
    by sketching.
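
    The sketch below contrasts the two approximations on a toy ridge problem and
    shows the model-averaging step; the Gaussian sketching matrices, the sizes and
    the regularization constant are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy over-determined ridge problem: min_x ||A x - b||^2 + lam * ||x||^2.
    n, d, lam = 2000, 50, 1.0
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

    def ridge(A, b, lam):
        return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

    def classical_sketch(A, b, lam, s):
        """Sketch the whole problem: min ||S A x - S b||^2 + lam ||x||^2."""
        S = rng.standard_normal((s, A.shape[0])) / np.sqrt(s)
        return ridge(S @ A, S @ b, lam)

    def hessian_sketch(A, b, lam, s):
        """Sketch only the quadratic term: min ||S A x||^2 - 2 b^T A x + lam ||x||^2."""
        S = rng.standard_normal((s, A.shape[0])) / np.sqrt(s)
        SA = S @ A
        return np.linalg.solve(SA.T @ SA + lam * np.eye(A.shape[1]), A.T @ b)

    x_opt = ridge(A, b, lam)
    for name, solver in [("classical", classical_sketch), ("hessian", hessian_sketch)]:
        single = solver(A, b, lam, s=200)
        averaged = np.mean([solver(A, b, lam, s=200) for _ in range(20)], axis=0)
        print(name, "single:", np.linalg.norm(single - x_opt),
              "averaged:", np.linalg.norm(averaged - x_opt))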

    Dynamic Partition Models

    Marc Goessling, Yali Amit
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    We present a new approach for learning compact and intuitive distributed
    representations with binary encoding. Rather than summing up expert votes as in
    products of experts, we employ for each variable the opinion of the most
    reliable expert. Data points are hence explained through a partitioning of the
    variables into expert supports. The partitions are dynamically adapted based on
    which experts are active. During the learning phase we adopt a smoothed version
    of this model that uses separate mixtures for each data dimension. In our
    experiments we achieve accurate reconstructions of high-dimensional data points
    with at most a dozen experts.

    Training Language Models Using Target-Propagation

    Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache
    Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

    While Truncated Back-Propagation through Time (BPTT) is the most popular
    approach to training Recurrent Neural Networks (RNNs), it suffers from being
    inherently sequential (making parallelization difficult) and from truncating
    gradient flow between distant time-steps. We investigate whether Target
    Propagation (TPROP) style approaches can address these shortcomings.
    Unfortunately, extensive experiments suggest that TPROP generally underperforms
    BPTT, and we end with an analysis of this phenomenon, and suggestions for
    future work.


    Information Theory

    Binary, Shortened Projective Reed Muller Codes for Coded Private Information Retrieval

    Myna Vajha, Vinayak Ramkumar, P. Vijay Kumar
    Comments: submitted to ISIT 2017
    Subjects: Information Theory (cs.IT)

    The notion of a Private Information Retrieval (PIR) code was recently
    introduced by Fazeli, Vardy and Yaakobi who showed that this class of codes
    permit PIR at reduced levels of storage overhead in comparison with
    replicated-server PIR. In the present paper, the construction of an \((n,k)\)
    \(\tau\)-server binary, linear PIR code having parameters
    \(n = \sum_{i=0}^{\ell} \binom{m}{i}\), \(k = \binom{m}{\ell}\) and \(\tau = 2^{\ell}\) is
    presented. These codes are obtained through homogeneous-polynomial evaluation
    and correspond to the binary, Projective Reed Muller (PRM) code. The
    construction can be extended to yield PIR codes for any \(\tau\) of the form
    \(2^{\ell}\), \(2^{\ell}-1\) and any value of \(k\), through a combination of
    single-symbol puncturing and shortening of the PRM code. Each of these code
    constructions has smaller storage overhead in comparison with other PIR
    codes appearing in the literature.

    For the particular case of \(\tau=3,4\), we show that the codes constructed
    here are optimal, systematic PIR codes by providing an improved lower bound on
    the block length \(n(k,\tau)\) of a systematic PIR code. It follows from a
    result by Vardy and Yaakobi that these codes also yield optimal, systematic
    primitive multi-set \((n, k, \tau)_B\) batch codes for \(\tau=3,4\). The PIR code
    constructions presented here also yield upper bounds on the generalized Hamming
    weights of binary PRM codes.

    Cache-Aided Full-Duplex Small Cells

    Marco Maso, Italo Atzeni, Imène Ghamnia, Ejder Baştuğ, Mérouane Debbah
    Comments: Submitted to CCDWN Workshop, International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt) 2017
    Subjects: Information Theory (cs.IT)

    Caching popular contents at the edge of the network can positively impact the
    performance and future sustainability of wireless networks in several ways,
    e.g., end-to-end access delay reduction and peak rate increase. In this paper,
    we aim at showing that non-negligible performance enhancements can be observed
    in terms of network interference footprint as well. To this end, we consider a
    full-duplex small-cell network consisting of non-cooperative cache-aided base
    stations, which communicate simultaneously with both downlink users and
    wireless backhaul nodes. We propose a novel caching model seeking to mimic a
    geographical policy based on local file popularity and calculate the
    corresponding cache hit probability. Subsequently we study the performance of
    the considered network in terms of throughput gain with respect to its
    cache-free half-duplex counterpart. Numerical results corroborate our
    theoretical findings and highlight remarkable performance gains when moving
    from cache-free to cache-aided full-duplex small-cell networks.

    A Universal Sampling Framework for Solving Physics-driven Inverse Source Problems

    John Murray-Bruce, Pier Luigi Dragotti
    Subjects: Information Theory (cs.IT); Analysis of PDEs (math.AP)

    Partial differential equations are central to describing many physical
    phenomena. Moreover in many applications these phenomena are observed through a
    sensor network, with the aim of inferring its underlying properties. Here we
    present a new framework for analysing fields governed by linear partial
    differential equations. The framework leverages certain results in
    sampling and approximation theory to provide a unifying approach for solving a
    class of inverse source problems. Specifically, we show that the unknown field
    sources can be recovered from a sequence of so-called generalised measurements
    using multidimensional frequency estimation techniques. Moreover, we show that
    this sequence of generalised measurements for our physics-driven fields can be
    computed by taking linear weighted sums of the sensor measurements, whereby the
    exact weights (of the sums) coincide with those that can reproduce
    multidimensional exponentials from linearly combined translates of a particular
    prototype function. The prototype function and the desired weights are shown to
    depend on the Green’s function of the underlying field. Based on this new
    framework we develop practical, noise robust, sensor network strategies for
    solving the inverse source problem, and then present numerical simulation
    results to verify their performance.

    Cross-Mode Interference Characterization in Cellular Networks with Voronoi Guard Regions

    Stelios Stefanatos, Antonis G. Gotsis, Angeliki Alexiou
    Comments: 12 pages (two-column format); Submitted
    Subjects: Information Theory (cs.IT)

    Advances in cellular networks such as device-to-device communications and
    full-duplex radios, as well as the inherent elimination of intra-cell
    interference achieved by network-controlled multiple access schemes, motivate
    the investigation of the cross-mode interference under a guard region
    corresponding to the Voronoi cell of an access point. By modeling the positions
    of interfering access points (APs) and user equipments (UEs) as Poisson
    distributed, analytical expressions for the statistics of the interference
    generated by either APs or UEs are obtained based on appropriately defined
    density functions. The considered system model and analysis are general enough
    to capture many operational scenarios of practical interest, including
    conventional downlink/uplink transmissions with nearest AP association, as well
    as transmissions where not both communicating nodes lie within the cell.
    The analysis provides insights on the level of protection offered by a Voronoi
    guard region and its dependence on the type of interference and receiver position.
    Various examples demonstrate the validity/accuracy of the analysis in obtaining
    the system coverage probability for operational scenarios of practical
    interest.

    Sensor management with time, energy and communication constraints

    Cristian Rusu, John Thompson, Neil M. Robertson
    Subjects: Information Theory (cs.IT)

    In this paper we present new algorithms and analysis for the linear inverse
    sensor placement and scheduling problems over multiple time instances with
    power and communications constraints. The proposed algorithms, which deal
    directly with minimizing the mean squared and worst case errors (MSE and WCE),
    are based on the convex relaxation approach to address the binary optimization
    scheduling problems that are formulated in sensor network scenarios. We propose
    to balance the energy and communications demands of operating a network of
    sensors over time while we still guarantee a minimum level of estimation
    accuracy. We measure this accuracy by the MSE and WCE, for which we provide
    tight average-case and lower-bound analyses. We show experimentally how the
    proposed algorithms perform against state-of-the-art methods previously
    described in the literature.

    Compressed sensing in Hilbert spaces

    Yann Traonmilin (PANAMA), Gilles Puy, Rémi Gribonval (PANAMA), Mike Davies
    Subjects: Information Theory (cs.IT)

    In many linear inverse problems, we want to estimate an unknown vector
    belonging to a high-dimensional (or infinite-dimensional) space from few linear
    measurements. To overcome the ill-posed nature of such problems, we use a
    low-dimension assumption on the unknown vector: it belongs to a low-dimensional
    model set. The question of whether it is possible to recover such an unknown
    vector from few measurements then arises. If the answer is yes, it is also
    important to be able to describe a way to perform such a recovery. We describe
    a general framework where appropriately chosen random measurements guarantee
    that recovery is possible. We further describe a way to study the performance
    of recovery methods that consist in the minimization of a regularization
    function under a data-fit constraint.

    Polar codes with a stepped boundary

    Ilya Dumer
    Comments: This article has been submitted to ISIT 2017
    Subjects: Information Theory (cs.IT)

    We consider explicit polar constructions of blocklength \(n \rightarrow \infty\)
    for the two extreme cases of code rates \(R \rightarrow 1\) and
    \(R \rightarrow 0\). For code rates \(R \rightarrow 1\), we design codes with
    complexity order of \(n \log n\) in code construction, encoding, and decoding.
    These codes achieve vanishing output bit error rates on binary symmetric
    channels with any transition error probability \(p \rightarrow 0\) and perform
    this task with a substantially smaller redundancy \((1-R)n\) than do other known
    high-rate codes, such as BCH or Reed-Muller (RM) codes. We then extend our
    design to low-rate codes that achieve vanishing output error rates with the same
    complexity order of \(n \log n\) and an asymptotically optimal code rate
    \(R \rightarrow 0\) for the case of \(p \rightarrow 1/2\).

    Generalizing Jensen and Bregman divergences with comparative convexity and the statistical Bhattacharyya distances with comparable means

    Frank Nielsen, Richard Nock
    Comments: 19 pages
    Subjects: Information Theory (cs.IT); Learning (cs.LG)

    Comparative convexity is a generalization of convexity relying on abstract
    notions of means. We define the Jensen divergence and the Jensen diversity from
    the viewpoint of comparative convexity, and show how to obtain the generalized
    Bregman divergences as limit cases of skewed Jensen divergences. In particular,
    we report explicit formulas for these generalized Bregman divergences when
    considering quasi-arithmetic means. Finally, we introduce a generalization of
    the Bhattacharyya statistical distances based on comparative means using
    relative convexity.

    Improved Converses and Gap-Results for Coded Caching

    Chien-Yi Wang, Shirin Saeedi Bidokhti, Michele Wigger
    Subjects: Information Theory (cs.IT)

    Improved lower bounds on the average and the worst-case rate-memory tradeoffs
    for the Maddah-Ali and Niesen coded caching scenario are presented. For any number
    of users and files and for arbitrary cache sizes, the multiplicative gap
    between the exact rate-memory tradeoff and the new lower bound is less than
    2.315 in the worst-case scenario and less than 2.507 in the average-case
    scenario.

    ns-3 Implementation of the 3GPP MIMO Channel Model for Frequency Spectrum above 6 GHz

    Menglei Zhang, Michele Polese, Marco Mezzavilla, Sundeep Rangan, Michele Zorzi
    Subjects: Information Theory (cs.IT)

    Communications at mmWave frequencies will be a key enabler of the next
    generation of cellular networks, due to the multi-Gbps rate that can be
    achieved. However, there are still several problems that must be solved before
    this technology can be widely adopted, primarily associated with the interplay
    between the variability of mmWave links and the complexity of mobile networks.
    An end-to-end network simulator represents a great tool to assess the
    performance of any proposed solution to meet the stringent 5G requirements.
    Given the criticality of channel propagation characteristics at higher
    frequencies, we present our implementation of the 3GPP channel model for the
    6-100 GHz band for the ns-3 end-to-end 5G mmWave module, and detail its
    associated MIMO beamforming architecture.

    An Equivalence Between Secure Network and Index Coding

    Lawrence Ong, Badri N. Vellambi, Jörg Kliewer, Phee Lep Yeoh
    Journal-ref: Proceedings of the 2016 IEEE Globecom Workshop on Network Coding
    and Applications (NetCod), Washington, USA, Dec. 4-8, 2016
    Subjects: Information Theory (cs.IT)

    We extend the equivalence between network coding and index coding by Effros,
    El Rouayheb, and Langberg to the secure communication setting in the presence
    of an eavesdropper. Specifically, we show that the most general versions of
    secure network-coding setup by Chan and Grant and the secure index-coding setup
    by Dau, Skachek, and Chee, which also include the randomised encoding setting,
    are equivalent.

    Achievable information rates estimates in optically-amplified transmission systems using nonlinearity compensation and probabilistic shaping

    Daniel Semrau, Tianhua Xu, Nikita A. Shevchenko, Milen Paskov, Alex Alvarado, Robert I. Killey, Polina Bayvel
    Journal-ref: Optics Letters, Vol. 42, No. 1, 121-124 (2017)
    Subjects: Information Theory (cs.IT)

    Achievable information rates (AIRs) of wideband optical communication systems
    using ~40 nm (~5 THz) EDFA and ~100 nm (~12.5 THz) distributed Raman
    amplification are estimated based on a first-order perturbation analysis. The
    AIRs of each individual channel have been evaluated for DP-64QAM, DP-256QAM,
    and DP-1024QAM modulation formats. The impact of full-field nonlinear
    compensation (FF-NLC) and probabilistically shaped constellations using a
    Maxwell-Boltzmann distribution were studied and compared to electronic
    dispersion compensation. It is found that a probabilistically shaped DP-1024QAM
    constellation combined with FF-NLC yields AIRs of ~75 Tbit/s for the EDFA
    scheme and ~223 Tbit/s for the Raman amplification scheme over 2000 km standard
    single mode fibre transmission.

    Gaussian and Sparse Processes Are Limits of Generalized Poisson Processes

    Julien Fageot, Virginie Uhlmann, Michael Unser
    Comments: 16 pages, 11 figures
    Subjects: Probability (math.PR); Information Theory (cs.IT)

    The theory of sparse stochastic processes offers a broad class of statistical
    models to study signals. In this framework, signals are represented as
    realizations of random processes that are solution of linear stochastic
    differential equations driven by white Lévy noises. Among these processes,
    generalized Poisson processes based on compound-Poisson noises admit an
    interpretation as random L-splines with random knots and weights. We
    demonstrate that every generalized Lévy process, from Gaussian to sparse, can
    be understood as the limit in law of a sequence of generalized Poisson
    processes. This enables a new conceptual understanding of sparse processes and
    suggests simple algorithms for the numerical generation of such objects.

    Performance Analysis of Dense Small Cell Networks with Generalized Fading

    Bin Yang, Ming Ding, Guoqiang Mao, Xiaohu Ge
    Comments: 3 figures. Accepted by ICC 2017
    Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

    In this paper, we propose a unified framework to analyze the performance of
    dense small cell networks (SCNs) in terms of the coverage probability and the
    area spectral efficiency (ASE). In our analysis, we consider a practical path
    loss model that accounts for both non-line-of-sight (NLOS) and line-of-sight
    (LOS) transmissions. Furthermore, we adopt a generalized fading model, in which
    Rayleigh fading, Rician fading and Nakagami-m fading can be treated in a
    unified framework. The analytical results of the coverage probability and the
    ASE are derived, using a generalized stochastic geometry analysis. Different
    from existing work that does not differentiate NLOS and LOS transmissions, our
    results show that NLOS and LOS transmissions have a significant impact on the
    coverage probability and the ASE performance, particularly when the SCNs grow
    dense. Furthermore, our results establish for the first time that the
    performance of the SCNs can be divided into four regimes, according to the
    intensity (aka density) of BSs, where in each regime the performance is
    dominated by different factors.

    Coded TeraSort

    Songze Li, Sucha Supittayapornpong, Mohammad Ali Maddah-Ali, A. Salman Avestimehr
    Comments: to appear in proceedings of 2017 International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

    We focus on sorting, which is the building block of many machine learning
    algorithms, and propose a novel distributed sorting algorithm, named Coded
    TeraSort, which substantially improves the execution time of the TeraSort
    benchmark in Hadoop MapReduce. The key idea of Coded TeraSort is to impose
    structured redundancy in data, in order to enable in-network coding
    opportunities that overcome the data shuffling bottleneck of TeraSort. We
    empirically evaluate the performance of the CodedTeraSort algorithm on Amazon EC2
    clusters, and demonstrate that it achieves 1.97x – 3.39x speedup, compared with
    TeraSort, for typical settings of interest.



