    arXiv Paper Daily: Tue, 31 Jan 2017

    Posted by 我爱机器学习 (52ml.net) on 2017-01-31 00:00:00

    Neural and Evolutionary Computing

    PathNet: Evolution Channels Gradient Descent in Super Neural Networks

    Chrisantha Fernando, Dylan Banarse, Charles Blundell, Yori Zwols, David Ha, Andrei A. Rusu, Alexander Pritzel, Daan Wierstra
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    For artificial general intelligence (AGI) it would be efficient if multiple
    users trained the same giant neural network, permitting parameter reuse,
    without catastrophic forgetting. PathNet is a first step in this direction. It
    is a neural network algorithm that uses agents embedded in the neural network
    whose task is to discover which parts of the network to re-use for new tasks.
    Agents are pathways (views) through the network which determine the subset of
    parameters that are used and updated by the forwards and backwards passes of
    the backpropagation algorithm. During learning, a tournament selection genetic
    algorithm is used to select pathways through the neural network for replication
    and mutation. Pathway fitness is the performance of that pathway measured
    according to a cost function. We demonstrate successful transfer learning:
    fixing the parameters along a path learned on task A and re-evolving a new
    population of paths for task B allows task B to be learned faster than it
    could be learned from scratch or after fine-tuning. Paths evolved on task B
    re-use parts of the optimal path evolved on task A. Positive transfer was
    demonstrated for binary MNIST, CIFAR, and SVHN supervised learning
    classification tasks, and a set of Atari and Labyrinth reinforcement learning
    tasks, suggesting PathNets have general applicability for neural network
    training. Finally, PathNet also significantly improves the robustness to
    hyperparameter choices of a parallel asynchronous reinforcement learning
    algorithm (A3C).
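
    The tournament selection step described above can be sketched as follows.
    This is a minimal illustration, not the authors' implementation: the pathway
    encoding (one list of active module indices per layer), the `mutate`
    operator, the mutation rate, and the caller-supplied `fitness_fn` are all
    assumptions made for the example. In PathNet itself, fitness would come from
    training and evaluating the network along the pathway.

```python
import random


def mutate(path, num_modules, rate, rng):
    """Resample each module index independently with probability `rate`.

    Hypothetical mutation operator chosen for illustration only.
    """
    return [
        [rng.randrange(num_modules) if rng.random() < rate else m for m in layer]
        for layer in path
    ]


def binary_tournament(population, fitness_fn, num_modules, rate=0.1, rng=None):
    """Run one tournament: the loser is overwritten by a mutated copy of the winner.

    Returns the (winner_index, loser_index) pair that was compared.
    """
    rng = rng or random.Random()
    i, j = rng.sample(range(len(population)), 2)
    if fitness_fn(population[i]) < fitness_fn(population[j]):
        i, j = j, i  # make i the winner
    population[j] = mutate(population[i], num_modules, rate, rng)
    return i, j
```

    Because the winner is never modified, the best fitness in the population
    cannot decrease from one tournament to the next when `fitness_fn` is
    deterministic.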

    Memory Augmented Neural Networks with Wormhole Connections

    Caglar Gulcehre, Sarath Chandar, Yoshua Bengio
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Recent empirical results on long-term dependency tasks have shown that neural
    networks augmented with an external memory can learn the long-term dependency
    tasks more easily and achieve better generalization than vanilla recurrent
    neural networks (RNN). We suggest that memory augmented neural networks can
    reduce the effects of vanishing gradients by creating shortcut (or wormhole)
    connections. Based on this observation, we propose a novel memory augmented
    neural network model called TARDIS (Temporal Automatic Relation Discovery in
    Sequences). The controller of TARDIS can store a selective set of embeddings of
    its own previous hidden states into an external memory and revisit them as and
    when needed. For TARDIS, memory acts as a storage for wormhole connections to
    the past to propagate the gradients more effectively and it helps to learn the
    temporal dependencies. The memory structure of TARDIS has similarities to both
    Neural Turing Machines (NTM) and Dynamic Neural Turing Machines (D-NTM), but
    both read and write operations of TARDIS are simpler and more efficient. We use
    discrete addressing for read/write operations, which helps to substantially
    reduce the vanishing gradient problem with very long sequences. Read and write
    operations in TARDIS are tied with a heuristic once the memory becomes full,
    and this makes the learning problem simpler when compared to NTM- or D-NTM-type
    architectures. We provide a detailed analysis of gradient propagation in
    general for MANNs. We evaluate our models on different long-term dependency
    tasks and report competitive results in all of them.

    Source localization in an ocean waveguide using supervised machine learning

    Haiqiang Niu, Peter Gerstoft, Emma Reeves
    Comments: Submitted to The Journal of the Acoustical Society of America
    Subjects: Atmospheric and Oceanic Physics (physics.ao-ph); Neural and Evolutionary Computing (cs.NE); Geophysics (physics.geo-ph)

    Source localization is solved as a classification problem by training a
    feed-forward neural network (FNN) on ocean acoustic data. The pressure received
    by a vertical linear array is preprocessed by constructing a normalized sample
    covariance matrix (SCM), which is used as input for the FNN. Each neuron of the
    output layer represents a discrete source range. FNN is a data-driven method
    that learns features directly from observed acoustic data, unlike model-based
    localization methods such as matched-field processing that require accurate
    sound propagation modeling. The FNN achieves good performance (mean
    absolute percentage error below 10%) for predicting source ranges for vertical
    array data from the Noise09 experiment. The effects of varying the parameters
    of the method, such as number of hidden neurons and layers, number of output
    neurons and number of snapshots in each input sample are discussed.
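
    The preprocessing step can be illustrated with a small sketch of a
    normalized sample covariance matrix. It assumes that each snapshot is a list
    of complex pressures (one per array element), that every snapshot is scaled
    to unit norm, and that the outer products are then averaged; the exact
    normalization and the way the matrix is fed to the FNN are not given in the
    abstract, so treat these details as assumptions.

```python
def normalized_scm(snapshots):
    """Average the outer products p p^H of unit-norm complex snapshots.

    `snapshots` is a list of equal-length lists of complex numbers.
    Assumes no snapshot is all zeros.
    """
    n_sensors = len(snapshots[0])
    scm = [[0j] * n_sensors for _ in range(n_sensors)]
    for p in snapshots:
        norm = sum(abs(x) ** 2 for x in p) ** 0.5
        unit = [x / norm for x in p]
        for a in range(n_sensors):
            for b in range(n_sensors):
                scm[a][b] += unit[a] * unit[b].conjugate()
    return [[v / len(snapshots) for v in row] for row in scm]
```

    With this normalization the result is Hermitian and has unit trace, so the
    input scale does not depend on the source level.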

    Detection, Segmentation and Recognition of Face and its Features Using Neural Network

    Smriti Tikoo, Nitin Malik
    Comments: Google Scholar Indexed Journal, 5 pages, 10 figures, Journal of Biosensors and Bioelectronics, vol. 7, no. 2, June-Sept 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Face detection and recognition have long been popular with researchers, and
    diverse approaches have been applied to date. The rapid spread of biometric
    analysis systems (full-body scanners, iris detection and recognition
    systems, and fingerprint recognition systems) and of surveillance systems
    deployed for safety and security purposes has contributed to this
    interest. Advances have been made using frontal and lateral views of the
    face, facial expressions such as anger, happiness, and gloominess, and still
    images and video for detection and recognition. This has led to newer
    methods for face detection and recognition that aim to be accurate,
    economically feasible, and extremely secure. Techniques such as Principal
    Component Analysis (PCA), Independent Component Analysis (ICA), and Linear
    Discriminant Analysis (LDA) have been the predominant ones. Because these
    earlier approaches needed improvement, neural-network-based recognition was
    a boon to the industry: it enhanced not only recognition but also the
    efficiency of the process. Backpropagation was chosen as the learning method
    for its efficiency in recognizing nonlinear faces, with an acceptance ratio
    of more than 90% and an execution time of only a few seconds.


    Computer Vision and Pattern Recognition

    Document Decomposition of Bangla Printed Text

    Md. Fahad Hasan, Tasmin Afroz, Sabir Ismail, Md. Saiful Islam
    Comments: 6 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)

    Today all kinds of information are being digitized, including huge archives
    of documents of various kinds. Optical Character Recognition (OCR) is the
    method by which newspapers and other paper documents are converted into
    digital resources, but it works on text only. As a result, processing a
    document that contains non-textual zones produces garbage text as output.
    To digitize documents properly, they should therefore be preprocessed
    carefully, and the most important preprocessing step is segmenting the
    document into regions according to their category. The OCR processes
    available for the Bangla language have no algorithm that can fully
    categorize a newspaper or book page. We therefore worked to decompose a
    document into its parts, such as headlines, sub-headlines, columns, and
    images. If the input is skewed or rotated, it is also deskewed and
    de-rotated. To decompose a Bangla document, we find the edges of the input
    image and then determine the horizontal and vertical area in which each
    pixel lies. The input image is cut according to these areas. For each
    sub-image we compute the height-width ratio and line height, and the
    sub-images are categorized according to these values. To deskew the image,
    we find the skew angle and deskew the image by this angle. To de-rotate the
    image, we use the line height, the matra line, and the pixel ratio of the
    matra line.

    Self-Adaptation of Activity Recognition Systems to New Sensors

    David Bannach, Martin Jänicke, Vitor F. Rey, Sven Tomforde, Bernhard Sick, Paul Lukowicz
    Comments: 26 pages, very descriptive figures, comprehensive evaluation on real-life datasets
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Traditional activity recognition systems work on the basis of training,
    taking a fixed set of sensors into account. In this article, we focus on the
    question of how pattern recognition can leverage new information sources
    with no or minimal user input. We present an approach for opportunistic
    activity recognition, where ubiquitous sensors lead to dynamically changing
    input spaces. Our method is a variation of well-established principles of
    machine learning, relying on unsupervised clustering to discover structure in
    data and inferring cluster labels from a small number of labeled data points
    in a semi-supervised manner. After elaborating on the challenges, we present
    in detail evaluations of over 3000 sensor combinations from three multi-user
    experiments, which show the potential benefit of our approach.
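
    One simple way to infer cluster labels from a few labeled points is a
    majority vote inside each cluster. The sketch below assumes that cluster
    assignments already exist and that unlabeled points carry `None`; the voting
    rule and the function names are illustrative assumptions, since the abstract
    does not specify how labels are inferred.

```python
from collections import Counter


def label_clusters(cluster_ids, labels):
    """Map each cluster to the most frequent label among its labeled members."""
    votes = {}
    for cluster, label in zip(cluster_ids, labels):
        if label is not None:
            votes.setdefault(cluster, Counter())[label] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


def propagate_labels(cluster_ids, labels):
    """Give every point its cluster's label; clusters with no labeled member stay None."""
    cluster_label = label_clusters(cluster_ids, labels)
    return [cluster_label.get(cluster) for cluster in cluster_ids]
```

    For example, `propagate_labels([0, 0, 0, 1, 1, 2], ["walk", None, "walk",
    None, "sit", None])` labels the first cluster "walk" and the second "sit",
    and leaves the third cluster unlabeled.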

    A Survey on Structure from Motion

    Onur Ozyesil, Vladislav Voroninski, Ronen Basri, Amit Singer
    Comments: 40 pages, 16 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    The structure from motion (SfM) problem in computer vision is the problem of
    recovering the 3D structure of a stationary scene from a set of projective
    measurements, represented as a collection of 2D images, via estimation of
    motion of the cameras corresponding to these images. In essence, SfM involves
    the three main stages of (1) extraction of features in images (e.g., points of
    interest, lines, etc.) and matching of these features between images, (2)
    camera motion estimation (e.g., using relative pairwise camera poses estimated
    from the extracted features), (3) recovery of the 3D structure using the
    estimated motion and features (e.g., by minimizing the so-called reprojection
    error). This survey mainly focuses on the relatively recent developments in the
    literature pertaining to stages (2) and (3). More specifically, after touching
    upon the early factorization-based techniques for motion and structure
    estimation, we provide a detailed account of some of the recent camera location
    estimation methods in the literature, which precedes the discussion of notable
    techniques for 3D structure recovery. We also cover the basics of the
    simultaneous localization and mapping (SLAM) problem, which can be considered
    to be a specific case of the SfM problem. Additionally, a review of the
    fundamentals of feature extraction and matching (i.e., stage (1) above),
    various recent methods for handling ambiguities in 3D scenes, SfM techniques
    involving relatively uncommon camera models and image features, and popular
    sources of data and SfM software is included in our survey.
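
    The reprojection error mentioned in stage (3) can be made concrete with a
    short sketch. It assumes a simple pinhole camera with a rotation matrix `R`,
    translation `t`, a single focal length `f`, and principal point `c`; these
    names and the absence of lens distortion are assumptions for illustration,
    not the survey's notation.

```python
def project(point, R, t, f, c):
    """Project a 3D world point to pixel coordinates with a pinhole camera."""
    x, y, z = (sum(R[i][k] * point[k] for k in range(3)) + t[i] for i in range(3))
    return (f * x / z + c[0], f * y / z + c[1])


def reprojection_error(points, observations, R, t, f, c):
    """Sum of squared pixel distances between observed and projected points."""
    total = 0.0
    for point, (u, v) in zip(points, observations):
        pu, pv = project(point, R, t, f, c)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total
```

    Structure recovery then amounts to choosing the 3D points (and, in bundle
    adjustment, the camera parameters) that minimize this quantity summed over
    all cameras.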

    CNN as Guided Multi-layer RECOS Transform

    C.-C. Jay Kuo
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    There is a resurging interest in developing a neural-network-based solution
    to the supervised machine learning problem. The convolutional neural network
    (CNN), which is also known as the feedforward neural network and the
    multi-layer perceptron (MLP), will be studied in this note. To begin with, we
    introduce a RECOS transform as a basic building block of CNNs. The “RECOS” is
    an acronym for “REctified-COrrelations on a Sphere”. It consists of two main
    concepts: 1) data clustering on a sphere and 2) rectification. Afterwards, we
    interpret a CNN as a network that implements the guided multi-layer RECOS
    transform with three highlights. First, we compare the traditional single-layer
    and modern multi-layer signal analysis approaches, point out key ingredients
    that enable the multi-layer approach, and provide a full explanation of the
    operating principle of CNNs. Second, we discuss how guidance is provided by
    labels through backpropagation in the training. Third, we show that a trained
    network can be greatly simplified in the testing stage demanding only one-bit
    representation for both filter weights and inputs.
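
    The two ingredients named above, correlation on a sphere and rectification,
    can be sketched in a few lines. The sketch assumes that both the input and
    each filter (called an anchor here, a name chosen for the example) are
    scaled to unit length, so their inner product is a cosine similarity, and
    that rectification clips negative correlations to zero.

```python
import math


def unit(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def recos(x, anchors):
    """Rectified correlations between a unit-length input and unit-length anchors."""
    xu = unit(x)
    return [max(0.0, sum(a * b for a, b in zip(unit(w), xu))) for w in anchors]
```

    For the input `[2, 0]` and anchors `[1, 0]`, `[-1, 0]`, and `[0, 3]`, only
    the first anchor responds: the opposite-pointing anchor is rectified to zero
    and the orthogonal one has zero correlation.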

    Scalable Nearest Neighbor Search based on kNN Graph

    Wan-Lei Zhao, Jie Yang, Cheng-Hao Deng
    Comments: 6 pages, 2 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Databases (cs.DB)

    Nearest neighbor search is a challenging problem that has been studied
    for several decades. Recently, it has become increasingly pressing as big
    data problems arise in various fields. In this paper, a scalable solution
    based on a hill-climbing strategy supported by a k-nearest neighbor (kNN)
    graph is presented. Two major issues are considered. First, an efficient kNN
    graph construction method based on a two-means tree is presented. Second,
    for the nearest neighbor search, an enhanced hill-climbing procedure is
    proposed, which yields a considerable performance boost over the original
    procedure. Furthermore, with the support of inverted indexing derived from
    residue vector quantization, our method achieves close to 100% recall with
    high speed efficiency on two state-of-the-art evaluation benchmarks. In
    addition, a comparative study of both the compressional and traditional
    nearest neighbor search methods is presented. We show that our method
    achieves the best trade-off between search quality, efficiency, and memory
    complexity.
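
    A basic hill-climbing search over a kNN graph can be sketched as follows.
    This is the plain greedy procedure, not the enhanced variant proposed in the
    paper, and the graph is built by brute force rather than with a two-means
    tree; both simplifications, and all function names, are assumptions made to
    keep the example self-contained.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn_graph(data, k):
    """For each point, list the indices of its k nearest other points."""
    graph = []
    for i, p in enumerate(data):
        others = sorted((j for j in range(len(data)) if j != i),
                        key=lambda j: sq_dist(p, data[j]))
        graph.append(others[:k])
    return graph


def hill_climb(query, data, graph, start):
    """Move to the closest graph neighbor until no neighbor is closer to the query."""
    current = start
    best = sq_dist(query, data[current])
    while True:
        candidate = min(graph[current],
                        key=lambda n: sq_dist(query, data[n]), default=None)
        if candidate is None or sq_dist(query, data[candidate]) >= best:
            return current, best
        current, best = candidate, sq_dist(query, data[candidate])
```

    Greedy descent of this kind can stop in a local minimum, which is one reason
    practical systems restart from several seeds or keep a candidate pool.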

    Re-ranking Person Re-identification with k-reciprocal Encoding

    Zhun Zhong, Liang Zheng, Donglin Cao, Shaozi Li
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    When considering person re-identification (re-ID) as a retrieval process,
    re-ranking is a critical step to improve its accuracy. Yet in the re-ID
    community, limited effort has been devoted to re-ranking, especially to
    fully automatic, unsupervised solutions. In this paper, we propose a
    k-reciprocal encoding method to re-rank the re-ID results. Our hypothesis is
    that if a gallery image is similar to the probe in the k-reciprocal nearest
    neighbors, it is more likely to be a true match. Specifically, given an image,
    a k-reciprocal feature is calculated by encoding its k-reciprocal nearest
    neighbors into a single vector, which is used for re-ranking under the Jaccard
    distance. The final distance is computed as the combination of the original
    distance and the Jaccard distance. Our re-ranking method does not require any
    human interaction or any labeled data, so it is applicable to large-scale
    datasets. Experiments on the large-scale Market-1501, CUHK03, MARS, and PRW
    datasets confirm the effectiveness of our method.
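
    The idea of k-reciprocal neighbors and the combined distance can be
    sketched with plain sets. The sketch simplifies the method: it compares raw
    neighbor sets instead of an encoded feature vector, includes each image in
    its own set, and mixes the two distances with a weight `lam`; these choices
    and the function names are assumptions for illustration. `dist` is a full
    pairwise distance matrix over probe and gallery images.

```python
def k_nearest(dist, i, k):
    """Indices of the k nearest other items to item i."""
    order = sorted((j for j in range(len(dist)) if j != i), key=lambda j: dist[i][j])
    return set(order[:k])


def k_reciprocal(dist, i, k):
    """Neighbors j of i such that i is also among the k nearest neighbors of j."""
    return {j for j in k_nearest(dist, i, k) if i in k_nearest(dist, j, k)}


def jaccard_distance(a, b):
    """One minus the ratio of shared to total members of two sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 1.0


def rerank_distance(dist, probe, gallery, k, lam):
    """Weighted combination of the original distance and the Jaccard distance."""
    r_probe = k_reciprocal(dist, probe, k) | {probe}
    r_gallery = k_reciprocal(dist, gallery, k) | {gallery}
    return lam * dist[probe][gallery] + (1 - lam) * jaccard_distance(r_probe, r_gallery)
```

    Two images that share most of their reciprocal neighbors get a Jaccard
    distance near zero, which pulls likely true matches up the ranking even when
    their original distance is moderate.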

    Faceness-Net: Face Detection through Deep Facial Part Responses

    Shuo Yang, Ping Luo, Chen Change Loy, Xiaoou Tang
    Comments: An extended version of our ICCV 2015 paper
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a deep convolutional neural network (CNN) for face detection
    that leverages facial-attribute-based supervision. We observe that part
    detectors emerge within a CNN trained to classify attributes from uncropped
    face images, without any explicit part supervision. This observation
    motivates a new method for finding faces by scoring facial part responses
    according to their spatial structure and arrangement. The scoring mechanism
    is data-driven, and
    carefully formulated considering challenging cases where faces are only
    partially visible. This consideration allows our network to detect faces under
    severe occlusion and unconstrained pose variations. Our method achieves
    promising performance on popular benchmarks including FDDB, PASCAL Faces, AFW,
    and WIDER FACE.

    The HASYv2 dataset

    Martin Thoma
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper describes the HASYv2 dataset. HASY is a publicly available, free
    of charge dataset of single symbols similar to MNIST. It contains 168233
    instances of 369 classes. HASY contains two challenges: A classification
    challenge with 10 pre-defined folds for 10-fold cross-validation and a
    verification challenge.

    MSCM-LiFe: Multi-scale cross modal linear feature for horizon detection in maritime images

    D. K. Prasad, D. Rajan, C. K. Prasath, L. Rachmawati, E. Rajabaly, C. Quek
    Comments: 5 pages, 4 figures, IEEE TENCON 2016
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    This paper proposes a new method for horizon detection called the multi-scale
    cross modal linear feature. This method integrates three different concepts
    related to the presence of the horizon in maritime images to increase the
    accuracy of horizon detection. Specifically, it uses the persistence of the
    horizon under multi-scale median filtering and its detection as a linear
    feature detected in common by two different methods, namely the Hough
    transform of the edge map and the intensity gradient. We demonstrate the
    performance of the method on 13 videos comprising more than 3000 frames and
    show that the proposed method detects the horizon with small error in most
    cases, outperforming three state-of-the-art methods.

    VINet: Visual-Inertial Odometry as a Sequence-to-Sequence Learning Problem

    Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, Niki Trigoni
    Comments: AAAI-17
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we present an on-manifold sequence-to-sequence learning
    approach to motion estimation using visual and inertial sensors. It is, to the
    best of our knowledge, the first end-to-end trainable method for
    visual-inertial odometry which performs fusion of the data at an intermediate
    feature-representation level. Our method has numerous advantages over
    traditional approaches. Specifically, it eliminates the need for tedious manual
    synchronization of the camera and IMU as well as the need for
    manual calibration between the IMU and camera. A further advantage is that our
    model naturally and elegantly incorporates domain specific information which
    significantly mitigates drift. We show that our approach is competitive with
    state-of-the-art traditional methods when accurate calibration data is
    available and can be trained to outperform them in the presence of calibration
    and synchronization errors.

    Feature base fusion for splicing forgery detection based on neuro fuzzy

    Habib Ghaffari Hadigheh, Ghazali bin sulong
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Most research on image forensics has focused on detecting artifacts
    introduced by a single processing tool. This has led to the development of
    many specialized algorithms that look for one or more particular footprints
    under specific settings. Naturally, the performance of such algorithms is
    not perfect, and their output may be noisy, inaccurate, and only partially
    correct. Furthermore, a forged image in practical scenarios is often the
    result of using several tools available in image-processing software.
    Reliable tamper detection therefore requires more powerful tools that can
    deal with various tampering scenarios. Fusion of forgery detection tools
    based on a Fuzzy Inference System has been used before to address this
    problem. However, adjusting the membership functions and defining proper
    fuzzy rules to attain better results are time-consuming processes, which is
    the main disadvantage of fuzzy inference systems. In this paper, a
    Neuro-Fuzzy inference system for fusion of forgery detection tools is
    developed. The neural network characteristic of these systems provides an
    appropriate tool for automatically adjusting the membership functions.
    Moreover, the initial fuzzy inference system is generated using fuzzy
    clustering techniques. The proposed framework is implemented and validated
    on a benchmark image splicing data set, in which three forgery detection
    tools are fused using an adaptive Neuro-Fuzzy inference system. The results
    suggest that Neuro-Fuzzy inference systems could be a better approach for
    fusion of forgery detection tools.

    Supervised Multilayer Sparse Coding Networks for Image Classification

    Xiaoxia Sun, Nasser M. Nasrabadi, Trac D. Tran
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper, we propose a novel multilayer sparse coding network capable of
    efficiently adapting its own regularization parameters to a given dataset. The
    network is trained end-to-end with a supervised task-driven learning algorithm
    via error backpropagation. During training, the network learns both the
    dictionaries and the regularization parameters of each sparse coding layer so
    that the reconstructive dictionaries are smoothly transformed into increasingly
    discriminative representations. We also incorporate a new weighted sparse
    coding scheme into our sparse recovery procedure, offering the system more
    flexibility to adjust sparsity levels. Furthermore, we have devised a sparse
    coding layer utilizing a ‘skinny’ dictionary. Integral to computational
    efficiency, these skinny dictionaries compress the high dimensional sparse
    codes into lower dimensional structures. The adaptivity and discriminability of
    our 13-layer sparse coding network are demonstrated on four benchmark datasets,
    namely Cifar-10, Cifar-100, SVHN and MNIST, most of which are considered
    difficult for sparse coding models. Experimental results show that our
    architecture overwhelmingly outperforms traditional one-layer sparse coding
    architectures while using far fewer parameters. Moreover, our multilayer
    architecture fuses the benefits of depth with sparse coding’s characteristic
    ability to operate on smaller datasets. In such data-constrained scenarios, we
    demonstrate our technique can overcome the limitations of deep neural networks
    by exceeding the state of the art in accuracy.

    Pooling Facial Segments to Face: The Shallow and Deep Ends

    Upal Mahbub, Sayantan Sarkar, Rama Chellappa
    Comments: 8 pages, 7 figures, 3 tables, accepted for publication in FG2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Generic face detection algorithms do not perform very well in the mobile
    domain due to significant presence of occluded and partially visible faces. One
    promising technique to handle the challenge of partial faces is to design face
    detectors based on facial segments. In this paper, two such face detectors,
    namely SegFace and DeepSegFace, are proposed that detect the presence of a
    face given arbitrary combinations of certain face segments. Both methods use
    proposals from facial segments as input that are found using weak boosted
    classifiers. SegFace is a shallow and fast algorithm using traditional
    features, tailored for situations where real-time constraints must be
    satisfied. On the other hand, DeepSegFace is a more powerful algorithm based on
    a deep convolutional neural network (DCNN) architecture. DeepSegFace offers
    certain advantages over other DCNN-based face detectors as it requires
    a relatively small amount of data to train by utilizing a novel data
    augmentation scheme and is very robust to occlusion by design. Extensive
    experiments show the superiority of the proposed methods, especially
    DeepSegFace, over other state-of-the-art face detectors in terms of
    precision-recall and ROC curve on two mobile face datasets.

    Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations

    İlke Çuğu, Eren Şener, Çağrı Erciyes, Burak Balcı, Emre Akın, Itır Önal, Ahmet Oğuz Akyüz
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a novel tree classification system called Treelogy that fuses
    deep representations with hand-crafted features obtained from leaf images to
    perform leaf-based plant classification. Key to this system are segmentation of
    the leaf from an untextured background, using convolutional neural networks
    (CNNs) for learning deep representations, extracting hand-crafted features with
    a number of image processing techniques, training a linear SVM with feature
    vectors, merging SVM and CNN results, and identifying the species from a
    dataset of 57 trees. Our classification results show that fusion of deep
    representations with hand-crafted features leads to the highest accuracy. The
    proposed algorithm is embedded in a smart-phone application, which is publicly
    available. Furthermore, our novel dataset comprising 5408 leaf images is also
    made public for use by other researchers.

    Face Detection using Deep Learning: An Improved Faster RCNN Approach

    Xudong Sun, Pengcheng Wu, Steven C.H. Hoi
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this report, we present a new face detection scheme using deep learning
    and achieve state-of-the-art detection performance on the well-known FDDB
    face detection benchmark evaluation. In particular, we improve the
    state-of-the-art faster RCNN framework by combining a number of strategies,
    including feature concatenation, hard negative mining, multi-scale training,
    model pretraining, and proper calibration of key parameters. As a consequence,
    the proposed scheme obtained the state-of-the-art face detection performance,
    making it the best model in terms of ROC curves among all the published methods
    on the FDDB benchmark.

    Pruned non-local means

    Sanjay Ghosh, Amit K. Mandal, Kunal N. Chaudhury
    Comments: Accepted in IET Image Processing, 16 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In Non-Local Means (NLM), each pixel is denoised by performing a weighted
    averaging of its neighboring pixels, where the weights are computed using image
    patches. We demonstrate that the denoising performance of NLM can be improved
    by pruning the neighboring pixels, namely, by rejecting neighboring pixels
    whose weights are below a certain threshold λ. While pruning can
    potentially reduce pixel averaging in uniform-intensity regions, we demonstrate
    that there is generally an overall improvement in the denoising performance. In
    particular, the improvement comes from pixels situated close to edges and
    corners. The success of the proposed method strongly depends on the choice of
    the global threshold λ, which in turn depends on the noise level and
    the image characteristics. We show how Stein’s unbiased estimator of the
    mean-squared error can be used to optimally tune λ, at a marginal
    computational overhead. We present some representative denoising results to
    demonstrate the superior performance of the proposed method over NLM and its
    variants.
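
    The pruning rule can be illustrated on a one-dimensional signal. The sketch
    assumes Gaussian patch weights with smoothing parameter `h`, border handling
    by index clamping, and a fixed threshold `lam` between 0 and 1 supplied by
    the caller; the SURE-based tuning of the threshold described above is not
    reproduced, and the function name and parameters are chosen for the example.

```python
import math


def pruned_nlm_1d(signal, patch_radius, search_radius, h, lam):
    """Non-local means on a 1D signal, ignoring neighbors with weight below `lam`."""
    n = len(signal)

    def patch(i):
        # Clamp indices so patches near the border stay the same length.
        return [signal[min(max(i + o, 0), n - 1)]
                for o in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        num = den = 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            d2 = sum((a - b) ** 2 for a, b in zip(p_i, patch(j)))
            w = math.exp(-d2 / (h * h))
            if w >= lam:  # pruning: drop weakly similar neighbors entirely
                num += w * signal[j]
                den += w
        out.append(num / den)  # den >= 1 because j == i always has weight 1
    return out
```

    On a clean step such as `[0, 0, 0, 0, 10, 10, 10, 10]`, pruning removes all
    contributions from across the edge, so the step is returned unchanged rather
    than slightly blurred.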

    Exploiting saliency for object segmentation from image level labels

    Seong Joon Oh, Rodrigo Benenson, Anna Khoreva, Zeynep Akata, Mario Fritz, Bernt Schiele
    Comments: Submitted to CVPR 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    There have been remarkable improvements in the semantic labelling task in
    recent years. However, state-of-the-art methods rely on large-scale
    pixel-level annotations. This paper studies the problem of training a
    pixel-wise semantic labeller network from image-level annotations of the
    present object classes. Recently, it has been shown that high quality seeds
    indicating discriminative object regions can be obtained from image-level
    labels. Without additional information, obtaining the full extent of the object
    is an inherently ill-posed problem due to co-occurrences. We propose using a
    saliency model as additional information and thereby exploit prior knowledge on
    the object extent and image statistics. We show how to combine both information
    sources in order to recover 80% of the fully supervised performance – which is
    the new state of the art in weakly supervised training for pixel-wise semantic
    labelling.

    Detection of Face using Viola Jones and Recognition using Back Propagation Neural Network

    Smriti Tikoo, Nitin Malik
    Comments: ISSN 2320-088X, 8 pages, 5 figures, 1 table
    Journal-ref: Int J. Computer Science and Mobile Computing, vol. 5, issue 5, pp. 288-295 (May 2016)
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Detection and recognition of facial images is an intricate problem which
    has garnered much attention in recent years due to its ever-increasing
    applications in numerous fields, and finding a robust solution remains a
    challenge. Its scope extends to security, commercial and law enforcement
    applications. More than a decade of research on this subject has brought
    about remarkable developments in areas such as human-computer interaction,
    biometric analysis, content-based coding of images and videos, and
    surveillance. The task is trivial for the brain but cumbersome to imitate
    artificially. The commonalities among faces do pose a problem on various
    grounds, but features such as skin color and gender differentiate one person
    from another. In this paper, facial detection has been carried out using the
    Viola-Jones algorithm and face recognition has been done using a Back
    Propagation Neural Network (BPNN).

    An Efficient Algebraic Solution to the Perspective-Three-Point Problem

    Tong Ke, Stergios Roumeliotis
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this work, we present an algebraic solution to the classical
    perspective-3-point (P3P) problem for determining the position and attitude of
    a camera from observations of three known reference points. In contrast to
    previous approaches, we first directly determine the camera’s attitude by
    employing the corresponding geometric constraints to formulate a system of
    trigonometric equations. This is then efficiently solved, following an
    algebraic approach, to determine the unknown rotation matrix and subsequently
    the camera’s position. As compared to recent alternatives, our method avoids
    computing unnecessary (and potentially numerically unstable) intermediate
    results, and thus achieves higher numerical accuracy and robustness at a lower
    computational cost. These benefits are validated through extensive Monte-Carlo
    simulations for both nominal and close-to-singular geometric configurations.

    Camera-Trap Images Segmentation using Multi-Layer Robust Principal Component Analysis

    Jhony-Heriberto Giraldo-Zuluaga, Alexander Gomez, Augusto Salazar, Angélica Diaz-Pulido
    Comments: Submitted to ICIP 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Camera trapping is a technique to study wildlife using automatically triggered
    cameras. However, camera trapping collects a lot of false positives (images
    without animals), which must be segmented before the classification step. This
    paper presents a Multi-Layer Robust Principal Component Analysis (RPCA) for
    camera-trap images segmentation. Our Multi-Layer RPCA uses histogram
    equalization and Gaussian filter as pre-processing, texture and color
    descriptors as features, and morphological filters with active contour as
    post-processing. The experiments focus on computing the sparse and low-rank
    matrices with different amounts of camera-trap images. We tested the
    Multi-Layer RPCA in our camera-trap database. To the best of our knowledge,
    this paper is the first work proposing Multi-Layer RPCA and using it for
    camera-trap image segmentation.

    Peduncle Detection of Sweet Pepper for Autonomous Crop Harvesting – Combined Colour and 3D Information

    Inkyu Sa, Chris Lehnert, Andrew English, Chris McCool, Feras Dayoub, Ben Upcroft, Tristan Perez
    Comments: 8 pages, 14 figures, Robotics and Automation Letters
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    This paper presents a 3D visual detection method for the challenging task of
    detecting peduncles of sweet peppers (Capsicum annuum) in the field. Cutting
    the peduncle cleanly is one of the most difficult stages of the harvesting
    process, where the peduncle is the part of the crop that attaches it to the
    main stem of the plant. Accurate peduncle detection in 3D space is therefore a
    vital step in reliable autonomous harvesting of sweet peppers, as this can lead
    to precise cutting while avoiding damage to the surrounding plant. This paper
    makes use of both colour and geometry information acquired from an RGB-D sensor
    and utilises a supervised-learning approach for the peduncle detection task.
    The performance of the proposed method is demonstrated and evaluated using
    qualitative and quantitative results (the Area-Under-the-Curve (AUC) of the
    detection precision-recall curve). We are able to achieve an AUC of 0.71 for
    peduncle detection on field-grown sweet peppers. We release a set of manually
    annotated 3D sweet pepper and peduncle images to assist the research community
    in performing further research on this topic.

    SafeDrive: A Robust Lane Tracking System for Autonomous and Assisted Driving Under Limited Visibility

    Junaed Sattar, Jiawei Mo
    Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

    We present an approach towards robust lane tracking for assisted and
    autonomous driving, particularly under poor visibility. Autonomous detection of
    lane markers improves road safety, and purely visual tracking is desirable for
    widespread vehicle compatibility and reducing sensor intrusion, cost, and
    energy consumption. However, visual approaches are often ineffective because of
    a number of factors, including but not limited to occlusion, poor weather
    conditions, and paint wear-off. Our method, named SafeDrive, attempts to
    improve visual lane detection approaches in drastically degraded visual
    conditions without relying on additional active sensors. In scenarios where
    visual lane detection algorithms are unable to detect lane markers, the
    proposed approach uses location information of the vehicle to locate and access
    alternate imagery of the road and attempts detection on this secondary image.
    Subsequently, by using a combination of feature-based and pixel-based
    alignment, an estimated location of the lane marker is found in the current
    scene. We demonstrate the effectiveness of our system on actual driving data
    from locations in the United States with Google Street View as the source of
    alternate imagery.

    Transformation-Based Models of Video Sequences

    Joost van Amersfoort, Anitha Kannan, Marc'Aurelio Ranzato, Arthur Szlam, Du Tran, Soumith Chintala
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    In this work we propose a simple unsupervised approach for next frame
    prediction in video. Instead of directly predicting the pixels in a frame given
    past frames, we predict the transformations needed for generating the next
    frame in a sequence, given the transformations of the past frames. This leads
    to sharper results, while using a smaller prediction model.

    In order to enable a fair comparison between different video frame prediction
    models, we also propose a new evaluation protocol. We use generated frames as
    input to a classifier trained with ground truth sequences. This criterion
    guarantees that models scoring high are those producing sequences which
    preserve discriminative features, as opposed to merely penalizing any
    deviation, plausible or not, from the ground truth. Our proposed approach
    compares favourably against more sophisticated ones on the UCF-101 data set,
    while also being more efficient in terms of the number of parameters and
    computational cost.

    When Slepian Meets Fiedler: Putting a Focus on the Graph Spectrum

    Dimitri Van De Ville, Robin Demesmaeker, Maria Giulia Preti
    Comments: 4 pages, 4 figures, submitted to IEEE Signal Processing Letters
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    Network models play an important role in studying complex systems in many
    scientific disciplines. Graph signal processing is receiving growing interest
    as a way to design novel tools that combine the analysis of topology and
    signals. The graph Fourier transform, defined by the eigendecomposition of the
    graph Laplacian, allows extending conventional signal-processing operations
    to graphs. One main feature is that global organization emerges from local
    interactions; for example, the Fiedler vector, the eigenvector with the
    smallest non-zero eigenvalue, is key for Laplacian embedding and graph
    clustering. Here, we introduce the
    design of Slepian graph signals, by maximizing energy concentration in a
    predefined subgraph for a given spectral bandlimit. We also establish a link
    with classical Laplacian embedding and graph clustering, for which the graph
    Slepian design can serve as a generalization.

    Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation

    Nasrin Mostafazadeh, Chris Brockett, Bill Dolan, Michel Galley, Jianfeng Gao, Georgios P. Spithourakis, Lucy Vanderwende
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    The popularity of image sharing on social media reflects the important role
    visual context plays in everyday conversation. In this paper, we present a
    novel task, Image-Grounded Conversations (IGC), in which natural-sounding
    conversations are generated about shared photographic images. We investigate
    this task using training data derived from image-grounded conversations on
    social media and introduce a new dataset of crowd-sourced conversations for
    benchmarking progress. Experiments using deep neural network models trained on
    social media data show that the combination of visual and textual context can
    enhance the quality of generated conversational turns. In human evaluation, a
    gap between human performance and that of both neural and retrieval
    architectures suggests that IGC presents an interesting challenge for vision
    and language research.

    Sampling Without Time: Recovering Echoes of Light via Temporal Phase Retrieval

    Ayush Bhandari, Aurelien Bourquard, Ramesh Raskar
    Comments: 12 pages, 4 figures, to appear at the 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
    Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV)

    This paper considers the problem of sampling and reconstruction of a
    continuous-time sparse signal without assuming the knowledge of the sampling
    instants or the sampling rate. This topic has its roots in the problem of
    recovering multiple echoes of light from its low-pass filtered and
    auto-correlated, time-domain measurements. Our work is closely related to the
    topic of sparse phase retrieval and in this context, we discuss the advantage
    of phase-free measurements. While this problem is ill-posed, cues based on
    physical constraints allow for its appropriate regularization. We validate our
    theory with experiments based on customized, optical time-of-flight imaging
    sensors. What singles out our approach is that our sensing method allows for
    temporal phase retrieval as opposed to the usual case of spatial phase
    retrieval. Preliminary experiments and results demonstrate a compelling
    capability of our phase-retrieval based imaging device.


    Artificial Intelligence

    Diversification Methods for Zero-One Optimization

    Fred Glover
    Comments: 28 pages, 7 illustrations, 4 pseudocodes
    Subjects: Artificial Intelligence (cs.AI)

    We introduce new diversification methods for zero-one optimization that
    significantly extend strategies previously introduced in the setting of
    metaheuristic search. Our methods incorporate easily implemented strategies for
    partitioning assignments of values to variables, accompanied by processes
    called augmentation and shifting which create greater flexibility and
    generality. We then show how the resulting collection of diversified solutions
    can be further diversified by means of permutation mappings, which equally can
    be used to generate diversified collections of permutations for applications
    such as scheduling and routing. These methods can be applied to non-binary
    vectors by the use of binarization procedures and by Diversification-Based
    Learning (DBL) procedures which also provide connections to applications in
    clustering and machine learning. Detailed pseudocode and numerical
    illustrations are provided to show the operation of our methods and the
    collections of solutions they create.

    Redefinition of the concept of fuzzy set based on vague partition from the perspective of axiomatization

    Xiaodong Pan, Yang Xu
    Comments: 25 pages
    Subjects: Artificial Intelligence (cs.AI)

    Based on an in-depth analysis of the essence and features of vague
    phenomena, this paper focuses on establishing the axiomatic foundation of
    membership degree theory for vague phenomena and presents an axiomatic system
    to govern membership degrees and their interconnections. On this basis, the
    concept of vague partition is introduced; further, the concept of fuzzy set
    introduced by Zadeh in 1965 is redefined based on vague partition from the
    perspective of axiomatization. The thesis defended in this paper is that the
    relationship among vague attribute values should be the starting point to
    recognize and model vague phenomena from a quantitative view.

    Credal Networks under Epistemic Irrelevance

    Jasper De Bock
    Subjects: Artificial Intelligence (cs.AI); Probability (math.PR)

    A credal network under epistemic irrelevance is a generalised type of
    Bayesian network that relaxes its two main building blocks. On the one hand,
    the local probabilities are allowed to be partially specified. On the other
    hand, the assessments of independence do not have to hold exactly.
    Conceptually, these two features turn credal networks under epistemic
    irrelevance into a powerful alternative to Bayesian networks, offering a more
    flexible approach to graph-based multivariate uncertainty modelling. However,
    in practice, they have long been perceived as very hard to work with, both
    theoretically and computationally.

    The aim of this paper is to demonstrate that this perception is no longer
    justified. We provide a general introduction to credal networks under epistemic
    irrelevance, give an overview of the state of the art, and present several new
    theoretical results. Most importantly, we explain how these results can be
    combined to allow for the design of recursive inference methods. We provide
    numerous concrete examples of how this can be achieved, and use these to
    demonstrate that computing with credal networks under epistemic irrelevance is
    most definitely feasible, and in some cases even highly efficient. We also
    discuss several philosophical aspects, including the lack of symmetry, how to
    deal with probability zero, the interpretation of lower expectations, the
    axiomatic status of graphoid properties, and the difference between updating
    and conditioning.

    Survey on Models and Techniques for Root-Cause Analysis

    Marc Solé, Victor Muntés-Mulero, Annie Ibrahim Rana, Giovani Estrada
    Comments: 18 pages, 222 references
    Subjects: Artificial Intelligence (cs.AI)

    Automation and computer intelligence to support complex human decisions are
    becoming essential to manage large and distributed systems in the Cloud and IoT
    era. Understanding the root cause of an observed symptom in a complex system
    has been a major problem for decades. As industry dives into the IoT world and
    the amount of data generated per year grows at an amazing speed, an important
    question is how to find appropriate mechanisms to determine root causes that
    can handle huge amounts of data or may provide valuable feedback in real-time.
    While many survey papers aim at summarizing the landscape of techniques for
    modelling system behavior and inferring the root cause of a problem based on
    the resulting models, none of them focuses on analyzing how the different
    techniques in the literature fit growing requirements in terms of performance
    and scalability. In this survey, we provide a review of root-cause analysis,
    focusing on these particular aspects. We also provide guidance to choose the
    best root-cause analysis strategy depending on the requirements of a particular
    system and application.

    Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices

    Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama
    Comments: 13 pages, 13 figures, version accepted to IEEE/ACM TASLP
    Subjects: Artificial Intelligence (cs.AI); Sound (cs.SD)

    In a recent conference paper, we have reported a rhythm transcription method
    based on a merged-output hidden Markov model (HMM) that explicitly describes
    the multiple-voice structure of polyphonic music. This model solves a major
    problem of conventional methods that could not properly describe the nature of
    multiple voices as in polyrhythmic scores or in the phenomenon of loose
    synchrony between voices. In this paper we present a complete description of
    the proposed model and develop an inference technique, which is valid for any
    merged-output HMMs for which output probabilities depend on past events. We
    also examine the influence of the architecture and parameters of the method in
    terms of accuracies of rhythm transcription and voice separation and perform
    comparative evaluations with six other algorithms. Using MIDI recordings of
    classical piano pieces, we found that the proposed model outperformed other
    methods by more than 12 points in the accuracy for polyrhythmic performances
    and performed almost as well as the best one for non-polyrhythmic performances.
    This reveals the state-of-the-art methods of rhythm transcription for the first
    time in the literature. Publicly available source code is also provided for
    future comparisons.

    Explanation Generation as Model Reconciliation in Multi-Model Planning

    Tathagata Chakraborti, Sarath Sreedharan, Yu Zhang, Subbarao Kambhampati
    Subjects: Artificial Intelligence (cs.AI)

    The ability to explain the rationale behind a planner’s deliberative process
    is crucial to the realization of effective human-planner interaction. However,
    in the context of human-in-the-loop planning, a significant challenge towards
    providing meaningful explanations arises due to the fact that the actor
    (planner) and the observer (human) are likely to have different models of the
    world, leading to a difference in the expected plan for the same perceived
    planning problem. In this paper, for the first time, we formalize this notion
    of Multi-Model Planning (MMP) and describe how a planner can provide
    explanations of its plans in the context of such model differences.
    Specifically, we will pose the multi-model explanation generation problem as a
    model reconciliation problem and show how meaningful explanations may be
    effected by making corrections to the human model. We will also demonstrate the
    efficacy of our approach in randomly generated problems from benchmark planning
    domains, and motivate exciting avenues of future research in the MMP paradigm.

    Practical Reasoning with Norms for Autonomous Software Agents (Full Edition)

    Zohreh Shams, Marina De Vos, Julian Padget, Wamberto W. Vasconcelos
    Subjects: Artificial Intelligence (cs.AI)

    Autonomous software agents operating in dynamic environments need to
    constantly reason about actions in pursuit of their goals, while taking into
    consideration norms which might be imposed on those actions. Normative
    practical reasoning supports agents making decisions about what is best for
    them to (not) do in a given situation. What makes practical reasoning
    challenging is the interplay between goals that agents are pursuing and the
    norms that the agents are trying to uphold. We offer a formalisation to allow
    agents to plan for multiple goals and norms in the presence of durative actions
    that can be executed concurrently. We compare plans based on decision-theoretic
    notions (i.e. utility) such that the utility gain of goals and utility loss of
    norm violations are the basis for this comparison. The set of optimal plans
    consists of plans that maximise the overall utility, each of which can be
    chosen by the agent to execute. We provide an implementation of our proposal in
    Answer Set Programming, thus allowing us to state the original problem in terms
    of a logic program that can be queried for solutions with specific properties.
    The implementation is proven to be sound and complete.

    Multiclass MinMax Rank Aggregation

    Pan Li, Olgica Milenkovic
    Subjects: Artificial Intelligence (cs.AI)

    We introduce a new family of minmax rank aggregation problems under two
    distance measures, the Kendall τ and the Spearman footrule. As the
    problems are NP-hard, we proceed to describe a number of constant-approximation
    algorithms for solving them. We conclude with illustrative applications of the
    aggregation methods on the Mallows model and genomic data.
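
    For readers unfamiliar with the two distance measures named in this
    abstract, the sketch below computes both for rankings given as tuples of
    the same items. The function names are illustrative and not taken from the
    paper. The exhaustive search at the end is only a toy illustration of a
    minmax objective, read here as "minimize the largest distance to any input
    ranking"; it is an assumption about the objective, it ignores the
    multiclass aspect, and it is not one of the paper's approximation
    algorithms.

```python
from itertools import combinations, permutations


def kendall_tau_distance(a, b):
    """Number of item pairs that rankings a and b order differently."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def spearman_footrule(a, b):
    """Sum over items of the absolute difference in rank position."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def brute_force_minmax(rankings, dist):
    """Illustration only (factorial time): return the permutation whose
    largest distance to any input ranking is smallest."""
    items = rankings[0]
    return min(
        permutations(items),
        key=lambda c: max(dist(c, r) for r in rankings),
    )


# Hypothetical input: three voters ranking the items 1, 2, 3.
votes = [(1, 2, 3), (3, 2, 1), (1, 3, 2)]
center = brute_force_minmax(votes, kendall_tau_distance)
radius = max(kendall_tau_distance(center, r) for r in votes)
```

    Because the first two votes are exact reversals of each other, no ranking
    can be closer than Kendall distance 2 to both, which is the radius the
    search finds.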

    A Study of FOSS'2013 Survey Data Using Clustering Techniques

    Mani A, Rebeka Mukherjee
    Comments: IEEE Women in Engineering Conference Paper: WIECON-ECE’2017 (Scheduled to appear in IEEE Xplore )
    Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Software Engineering (cs.SE); Machine Learning (stat.ML)

    FOSS is an acronym for Free and Open Source Software. The FOSS 2013 survey
    primarily targets FOSS contributors, and the relevant anonymized dataset is
    publicly available under a CC BY-SA license. In this study, the dataset is
    analyzed from a
    critical perspective using statistical and clustering techniques (especially
    multiple correspondence analysis) with a strong focus on women contributors
    towards discovering hidden trends and facts. Important inferences are drawn
    about development practices and other facets of the free software and OSS
    worlds.

    Pure Rough Mereology and Counting

    A. Mani
    Comments: IEEE Women in Engineering Conference, WIECON-ECE’2017 (Accepted for IEEEXplore)
    Subjects: Artificial Intelligence (cs.AI); Information Theory (cs.IT); Logic in Computer Science (cs.LO); Logic (math.LO)

    The study of mereology (parts and wholes) in the context of formal approaches
    to vagueness can be approached in a number of ways. In the context of rough
    sets, mereological concepts with a set-theoretic or valuation based ontology
    acquire complex and diverse behavior. In this research a general rough set
    framework called granular operator spaces is extended and the nature of
    parthood in it is explored from a minimally intrusive point of view. This is
    used to develop counting strategies that help in classifying the framework. The
    developed methodologies would be useful for drawing involved conclusions about
    the nature of data (and validity of assumptions about it) from antichains
    derived from context. The problem addressed also concerns whether counting
    procedures help in confirming that the approximations involved in formation of
    data are indeed rough approximations.

    Incremental Maintenance Of Association Rules Under Support Threshold Change

    Mohamed Anis Bach Tobji, Mohamed Salah Gouider
    Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)

    Maintenance of association rules is an interesting problem. Several
    incremental maintenance algorithms have been proposed since the work of
    (Cheung et al., 1996). The majority of these algorithms maintain rule bases
    assuming that the support threshold doesn’t change. In this paper, we present
    an incremental maintenance algorithm under support threshold change. This
    solution allows the user to maintain their rule base under any support
    threshold.

    Comparative Study Of Data Mining Query Languages

    Mohamed Anis Bach Tobji
    Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)

    Since the formulation of the Inductive Database (IDB) problem, several Data
    Mining (DM) languages have been proposed, confirming that the KDD process
    could be supported via inductive query (IQ) answering. This paper reviews the
    existing DM languages. We present the important primitives of a DM language
    and classify the languages according to which primitives they satisfy. In
    addition, we present the languages’ syntaxes and apply each one to a sample
    database to test a set of KDD operations. This study allows us to highlight
    the languages’ capabilities and limits, which is very useful for future work
    and perspectives.

    Methodologies for realizing natural-language-facilitated human-robot cooperation: A review

    Rui Liu, Xiaoli Zhang
    Comments: 30 pages, 15 figures, article submitted to Knowledge-based Systems, 2017 Jan
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)

    Natural Language (NL) can be used for transferring knowledge from a human to a robot.
    Recently, research on using NL to support human-robot cooperation (HRC) has
    received increasing attention in several domains such as robotic daily
    assistance, robotic health caregiving, intelligent manufacturing, autonomous
    navigation and robot social accompany. However, a high-level review that can
    reveal the realization process and the latest methodologies of using NL to
    facilitate HRC is missing. In this review, a comprehensive summary about the
    methodology development of natural-language-facilitated human-robot cooperation
    (NLC) has been made. We first analyzed driving forces for NLC developments.
    Then, with a temporal realization order, we reviewed three main steps of NLC:
    human NL understanding, knowledge representation, and knowledge-world mapping.
    Last, based on our paper review and perspectives, potential research trends in
    NLC were discussed.

    Decision structure of risky choice

    Lamb Wubin, Naixin Ren
    Comments: 13 pages
    Subjects: Economics (q-fin.EC); Artificial Intelligence (cs.AI)

    There is a well-known controversy between economists and psychologists about
    decision making under risk. We discuss building a unified theory of risky
    choice that would explain both compensatory and non-compensatory theories.
    Decision strategy is not fixed, but depends on the things at hand in real
    life and on the experimental materials in the laboratory. We
    believe that human has a decision structure, which has constant and variable,
    interval, concepts of probability and value. Namely, according to cognition
    ability, we argue that people could not build a continuous and accurate
    subjective probability world, but several intervals of probability perception.
    More precisely, decision making is an order reduction process, which is
    simplifying the decision structure. However, we are not really sure which
    reduction path will occur during decision making process. It is why preference
    reversal always happens when making decisions. The most efficient way to reduce
    the order of decision structure is mathematical expectation. We also argue that
    the deliberation time has at least four parts, which consist of
    substitution time, τ''(G) dτ time, τ'(G) dτ time and
    calculation time. Decision structure can simply explain the phenomenon of
    paradoxes and anomalies. JEL Codes: C10, D03, D81.

    Feature base fusion for splicing forgery detection based on neuro fuzzy

    Habib Ghaffari Hadigheh, Ghazali bin sulong
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Most research on image forensics has focused on detecting artifacts
    introduced by a single processing tool. This has led to the development of
    many specialized algorithms looking for one or more particular footprints
    under specific settings. Naturally, the performance of such algorithms is not
    perfect, and accordingly the provided output might be noisy, inaccurate and
    only partially correct. Furthermore, a forged image in practical scenarios is
    often the result of using several tools made available by image-processing
    software systems. Therefore, reliable tamper detection requires developing
    more powerful tools to deal with various tampering scenarios. Fusion of
    forgery detection tools based on a Fuzzy Inference System has been used before
    to address this problem. Adjusting the membership functions and defining
    proper fuzzy rules to attain better results are time-consuming processes,
    which can be counted as the main disadvantage of fuzzy inference systems. In
    this paper, a Neuro-Fuzzy inference system for fusion of forgery detection
    tools is developed. The neural network characteristic of these systems
    provides an appropriate tool for automatically adjusting the membership
    functions. Moreover, the initial fuzzy inference system is generated based on
    fuzzy clustering techniques. The proposed framework is implemented and
    validated on a benchmark image splicing data set in which three forgery
    detection tools are fused based on adaptive Neuro-Fuzzy inference system. The
    outcome of the proposed method reveals that applying Neuro Fuzzy inference
    systems could be a better approach for fusion of forgery detection tools.

    Systems of natural-language-facilitated human-robot cooperation: A review

    Rui Liu, Xiaoli Zhang
    Comments: 21 pages, 10 figures, article submitted to Knowledge-based Systems, 2017 Jan
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)

    Natural-language-facilitated human-robot cooperation (NLC), in which natural
    language (NL) is used to share knowledge between a human and a robot for
    conducting intuitive human-robot cooperation (HRC), has developed continuously
    over the past decade. Currently, NLC is used in several robotic domains such as
    manufacturing, daily assistance and health caregiving. It is necessary to
    summarize current NLC-based robotic systems and discuss future development
    trends, providing helpful information for future NLC research. In this review,
    we first analyzed the driving forces behind NLC research. According to the
    robot's cognition level during the cooperation, the NLC implementations were
    then categorized into four types (NL-based control, NL-based robot training,
    NL-based task execution, NL-based social companion) for comparison and
    discussion. Last, based on our perspective and a comprehensive paper review,
    future research trends were discussed.

    Entropic Causality and Greedy Minimum Entropy Coupling

    Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath, Babak Hassibi
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We study the problem of identifying the causal relationship between two
    discrete random variables from observational data. We recently proposed a novel
    framework called entropic causality that works in a very general functional
    model but makes the assumption that the unobserved exogenous variable has small
    entropy in the true causal direction.

    This framework requires the solution of a minimum entropy coupling problem:
    Given marginal distributions of m discrete random variables, each on n states,
    find the joint distribution with minimum entropy, that respects the given
    marginals. This corresponds to minimizing a concave function of n^m variables
    over a convex polytope defined by nm linear constraints, called a
    transportation polytope. Unfortunately, it was recently shown that this minimum
    entropy coupling problem is NP-hard, even for 2 variables with n states. Even
    representing points (joint distributions) over this space can require
    exponential complexity (in n, m) if done naively.

    In our recent work we introduced an efficient greedy algorithm to find an
    approximate solution for this problem. In this paper we analyze this algorithm
    and establish two results: that our algorithm always finds a local minimum and
    also is within an additive approximation error from the unknown global optimum.
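
    The greedy idea mentioned in this abstract can be illustrated with a small
    sketch. This is an assumption-laden illustration, not the authors' code: the
    function names `greedy_min_entropy_coupling` and `entropy` and the `tol`
    parameter are hypothetical. At each step it places as much mass as possible
    on the joint state formed by the currently largest entry of every marginal.

```python
import math


def greedy_min_entropy_coupling(marginals, tol=1e-12):
    """Greedily build a joint distribution that respects the given marginals.

    marginals: list of m lists, each a probability vector.
    Returns a dict mapping a joint state (tuple of indices) to its probability.
    Illustrative sketch only; names and tolerance are assumptions.
    """
    # Work on copies so the caller's lists are not modified.
    remaining = [list(p) for p in marginals]
    joint = {}
    while True:
        # Index of the largest remaining mass in each marginal.
        idx = tuple(max(range(len(p)), key=p.__getitem__) for p in remaining)
        # The largest mass that can be placed on that joint state.
        r = min(p[i] for p, i in zip(remaining, idx))
        if r <= tol:
            break
        joint[idx] = joint.get(idx, 0.0) + r
        # Remove the assigned mass from every marginal.
        for p, i in zip(remaining, idx):
            p[i] -= r
    return joint


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability states."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


if __name__ == "__main__":
    coupling = greedy_min_entropy_coupling([[0.6, 0.4], [0.7, 0.3]])
    print(coupling, entropy(coupling.values()))
```

    Each iteration zeroes at least one marginal entry, so the loop stops after
    a number of steps bounded by the total number of marginal entries.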

    Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation

    Nasrin Mostafazadeh, Chris Brockett, Bill Dolan, Michel Galley, Jianfeng Gao, Georgios P. Spithourakis, Lucy Vanderwende
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    The popularity of image sharing on social media reflects the important role
    visual context plays in everyday conversation. In this paper, we present a
    novel task, Image-Grounded Conversations (IGC), in which natural-sounding
    conversations are generated about shared photographic images. We investigate
    this task using training data derived from image-grounded conversations on
    social media and introduce a new dataset of crowd-sourced conversations for
    benchmarking progress. Experiments using deep neural network models trained on
    social media data show that the combination of visual and textual context can
    enhance the quality of generated conversational turns. In human evaluation, a
    gap between human performance and that of both neural and retrieval
    architectures suggests that IGC presents an interesting challenge for vision
    and language research.


    Information Retrieval

    Click Through Rate Prediction for Contextual Advertisment Using Linear Regression

    Muhammad Junaid Effendi, Syed Abbas Ali
    Comments: 8 pages, 13 Figures, 11 Tables
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG)

    This research presents an innovative and unique way of solving the
    advertisement prediction problem, which has been considered a learning problem
    over the past several years. Online advertising is a multi-billion-dollar
    industry and is growing every year at a rapid pace. The goal of this research
    is to enhance the click through rate (CTR) of contextual advertisements using
    Linear Regression. To address this problem, a new technique is proposed in this
    paper to predict the CTR which will increase the overall revenue of the system
    by serving the advertisements more suitable to the viewers with the help of
    feature extraction and displaying the advertisements based on context of the
    publishers. The important steps include the data collection, feature
    extraction, CTR prediction and advertisement serving. The statistical results
    obtained from the dynamically used technique show an efficient outcome by
    fitting the data close to perfection for the LR technique using optimized
    feature selection.

    Feature Studies to Inform the Classification of Depressive Symptoms from Twitter Data for Population Health

    Danielle Mowery, Craig Bryan, Mike Conway
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)

    The utility of Twitter data as a medium to support population-level mental
    health monitoring is not well understood. In an effort to better understand the
    predictive power of supervised machine learning classifiers and the influence
    of feature sets for efficiently classifying depression-related tweets on a
    large-scale, we conducted two feature study experiments. In the first
    experiment, we assessed the contribution of feature groups such as lexical
    information (e.g., unigrams) and emotions (e.g., strongly negative) using a
    feature ablation study. In the second experiment, we determined the percentile
    of top ranked features that produced the optimal classification performance by
    applying a three-step feature elimination approach. In the first experiment, we
    observed that lexical features are critical for identifying depressive
    symptoms, specifically for depressed mood (-35 points) and for disturbed sleep
    (-43 points). In the second experiment, we observed that the optimal F1-score
    performance of top ranked features in percentiles variably ranged across
    classes e.g., fatigue or loss of energy (5th percentile, 288 features) to
    depressed mood (55th percentile, 3,168 features) suggesting there is no
    consistent count of features for predicting depressive-related tweets. We
    conclude that simple lexical features and reduced feature sets can produce
    comparable results to larger feature sets.

    Binary adaptive embeddings from order statistics of random projections

    Diego Valsesia, Enrico Magli
    Subjects: Learning (cs.LG); Information Retrieval (cs.IR)

    We use some of the largest order statistics of the random projections of a
    reference signal to construct a binary embedding that is adapted to signals
    correlated with such signal. The embedding is characterized from the analytical
    standpoint and shown to provide improved performance on tasks such as
    classification in a reduced-dimensionality space.

    Who With Whom And How?: Extracting Large Social Networks Using Search Engines

    Stefan Siersdorfer, Philipp Kemkes, Hanno Ackermann, Sergej Zerr
    Journal-ref: CIKM 2015 Proceedings of the 24th ACM International on Conference
    on Information and Knowledge Management Pages 1491-1500
    Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR)

    Social network analysis is leveraged in a variety of applications such as
    identifying influential entities, detecting communities with special interests,
    and determining the flow of information and innovations. However, existing
    approaches for extracting social networks from unstructured Web content do not
    scale well and are only feasible for small graphs. In this paper, we introduce
    novel methodologies for query-based search engine mining, enabling efficient
    extraction of social networks from large amounts of Web data. To this end, we
    use patterns in phrase queries for retrieving entity connections, and employ a
    bootstrapping approach for iteratively expanding the pattern set. Our
    experimental evaluation in different domains demonstrates that our algorithms
    provide high quality results and allow for scalable and efficient construction
    of social graphs.

    How to Search the Internet Archive Without Indexing It

    Nattiya Kanhabua, Philipp Kemkes, Wolfgang Nejdl, Tu Ngoc Nguyen, Felipe Reis, Nam Khanh Tran
    Journal-ref: 20th International Conference on Theory and Practice of Digital
    Libraries, TPDL 2016, Proceedings, pp 147-160
    Subjects: Digital Libraries (cs.DL); Information Retrieval (cs.IR)

    Significant parts of cultural heritage have been produced on the web over the
    last decades. While easy accessibility to the current web is a good baseline,
    optimal access to the past web faces several challenges. These include dealing
    with large-scale web archive collections and a lack of usage logs that contain
    implicit human feedback most relevant for today’s web search. In this paper, we
    propose an entity-oriented search system to support retrieval and analytics on
    the Internet Archive. We use Bing to retrieve a ranked list of results from the
    current web. In addition, we link retrieved results to the WayBack Machine;
    thus allowing keyword search on the Internet Archive without processing and
    indexing its raw archived content. Our search system complements existing web
    archive search tools through a user-friendly interface, which comes close to
    the functionalities of modern web search engines (e.g., keyword search, query
    auto-completion and related query suggestion), and provides a great benefit of
    taking user feedback on the current web into account also for web archive
    search. Through extensive experiments, we conduct quantitative and qualitative
    analyses in order to provide insights that enable further research on and
    practical applications of web archives.


    Computation and Language

    Bangla Word Clustering Based on Tri-gram, 4-gram and 5-gram Language Model

    Dipaloke Saha, Md Saddam Hossain, MD. Saiful Islam, Sabir Ismail
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL)

    In this paper, we describe a research method that generates Bangla word
    clusters on the basis of semantic and contextual similarity. Word clustering
    is important for part-of-speech (POS) tagging, word sense disambiguation, text
    classification, recommender systems, spell checkers, grammar checkers,
    knowledge discovery and many other Natural Language Processing (NLP)
    applications. Word clustering methods have already been implemented
    efficiently for English and some other languages. Due to a lack of resources,
    however, word clustering in Bangla has not yet been implemented efficiently;
    its implementation is still at an early stage. Some research on word
    clustering in English, based on the five words preceding and following a key
    word, has reported efficient results. We are now trying to implement tri-gram,
    4-gram and 5-gram models of word clustering for Bangla to observe which one
    performs best. We have started our research with quite a large corpus of
    approximately 1 lakh Bangla words. We use a machine learning technique in
    this research. We will generate word clusters and analyze them by testing
    several threshold values.

    A Comparative Study on Different Types of Approaches to Bengali document Categorization

    Md. Saiful Islam, Fazla Elahi Md Jubayer, Syed Ikhtiar Ahmed
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Document categorization is a technique where the category of a document is
    determined. In this paper, three well-known supervised learning techniques,
    Support Vector Machine (SVM), Naïve Bayes (NB) and Stochastic Gradient
    Descent (SGD), are compared for Bengali document categorization. Besides the
    classifier, classification also depends on how features are selected from the
    dataset. To analyze the performance of these classifiers in predicting a
    document against twelve categories, several feature selection techniques are
    also applied in this article, namely Chi square distribution and normalized
    TFIDF (term frequency-inverse document frequency) with word analyzer. We thus
    attempt to explore the efficiency of those three classification algorithms by
    using two different feature selection techniques in this article.
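
    As an illustration of chi-square feature scoring in general (not the
    authors' implementation), the standard statistic for one term and one
    category can be computed from a 2x2 document-count table. The function name
    `chi_square` and its argument names are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square score of a term for a category from document counts.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that do not contain the term
    d: documents outside the category that do not contain the term
    Illustrative sketch only; argument names are assumptions.
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no signal.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


if __name__ == "__main__":
    # A term that appears only in the category scores high.
    print(chi_square(10, 0, 0, 10))
    # A term spread evenly across categories scores zero.
    print(chi_square(5, 5, 5, 5))
```

    Terms would then be ranked by this score per category and the top-ranked
    ones kept as features; the cut-off is a tuning choice not stated in the
    abstract.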

    Structural Analysis of Hindi Phonetics and A Method for Extraction of Phonetically Rich Sentences from a Very Large Hindi Text Corpus

    Shrikant Malviya, Rohit Mishra, Uma Shanker Tiwary
    Comments: 19th Coordination and Standardization of Speech Databases and Assessment Technique (O-COCOSDA) at Bali, Indonesia
    Subjects: Computation and Language (cs.CL)

    Automatic speech recognition (ASR) and Text to speech (TTS) are two prominent
    areas of research in human computer interaction nowadays. A set of
    phonetically rich sentences is important for developing these two interactive
    modules of HCI. Essentially, the set of phonetically rich sentences has to
    cover all possible phone units distributed uniformly. Selecting such a set
    from a big corpus while maintaining phonetic characteristic based similarity
    is still a challenging problem. The major objective of this paper is to devise
    a criterion for selecting a set of sentences that encompasses all phonetic
    aspects of a corpus with as small a size as possible. First, this paper
    presents a statistical analysis of Hindi phonetics by observing the structural
    characteristics. Further, a two-stage algorithm is proposed to extract
    phonetically rich sentences with a high variety of triphones from the EMILLE
    Hindi corpus. The algorithm uses a distance measuring criterion to select a
    sentence in order to improve the triphone distribution. Moreover, a special
    preprocessing method is proposed to score each triphone in terms of inverse
    probability in order to speed up the algorithm. The results show that the
    approach efficiently builds a uniformly distributed phonetically rich corpus
    with an optimum number of sentences.

    Graph-Based Semi-Supervised Conditional Random Fields For Spoken Language Understanding Using Unaligned Data

    Mohammad Aliannejadi, Masoud Kiaeeha, Shahram Khadivi, Saeed Shiry Ghidary
    Comments: Workshop of The Australasian Language Technology Association
    Subjects: Computation and Language (cs.CL)

    We experiment with graph-based Semi-Supervised Learning (SSL) of Conditional
    Random Fields (CRF) for the application of Spoken Language Understanding (SLU)
    on unaligned data. The aligned labels for examples are obtained using IBM
    Model. We adapt a baseline semi-supervised CRF by defining a new feature set
    and altering the label propagation algorithm. Our results demonstrate that our
    proposed approach significantly improves the performance of the supervised
    model by utilizing the knowledge gained from the graph.

    Extracting Bilingual Persian Italian Lexicon from Comparable Corpora Using Different Types of Seed Dictionaries

    Ebrahim Ansari, M.H. Sadreddini, Lucio Grandinetti, Mehdi Sheikhalishahi
    Comments: 30 pages, accepted to be published in “Applications of Comparable Corpora”, Berlin: Language Science Press
    Subjects: Computation and Language (cs.CL)

    Bilingual dictionaries are very important in various fields of natural
    language processing. In recent years, methods for extracting new bilingual
    lexicons from non-parallel (comparable) corpora have been proposed. Almost all
    use a small existing dictionary or other resource to make an initial list
    called the “seed dictionary”. In this paper we discuss the use of different
    types of dictionaries as the initial starting list for creating a bilingual
    Persian-Italian lexicon from a comparable corpus.

    Our experiments apply state-of-the-art techniques to three different seed
    dictionaries: an existing dictionary, a dictionary created with pivot-based
    schema, and a dictionary extracted from a small Persian-Italian parallel text.
    The interesting challenge of our approach is to find a way to combine different
    dictionaries together in order to produce a better and more accurate lexicon.
    In order to combine seed dictionaries, we propose two different combination
    models and examine the effect of our novel combination models on various
    comparable corpora that have differing degrees of comparability. We conclude
    with a proposal for a new weighting system to improve the extracted lexicon.
    The experimental results produced by our implementation show the efficiency of
    our proposed models.

    Using English as Pivot to Extract Persian-Italian Parallel Sentences from Non-Parallel Corpora

    Ebrahim Ansari, M.H. Sadreddini, Mostafa Sheikhalishahi, Richard Wallace, Fatemeh Alimardani
    Comments: 30 pages, Accepted to be published in “Applications of Comparable Corpora”, Berlin: Language Science Press
    Subjects: Computation and Language (cs.CL)

    The effectiveness of a statistical machine translation system (SMT) is very
    dependent upon the amount of parallel corpus used in the training phase. For
    low-resource language pairs there are not enough parallel corpora to build an
    accurate SMT. In this paper, a novel approach is presented to extract bilingual
    Persian-Italian parallel sentences from a non-parallel (comparable) corpus. In
    this study, English is used as the pivot language to compute the matching
    scores between source and target sentences and candidate selection phase.
    Additionally, a new monolingual sentence similarity metric, Normalized Google
    Distance (NGD) is proposed to improve the matching process. Moreover, some
    extensions of the baseline system are applied to improve the quality of
    extracted sentences measured with BLEU. Experimental results show that using
    the new pivot-based extraction can significantly increase the quality of the
    bilingual corpus and consequently improve the performance of the
    Persian-Italian SMT system.
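
    For readers unfamiliar with Normalized Google Distance, the standard
    term-level definition is NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y))
    / (log N - min(log f(x), log f(y))), where f counts the documents containing
    the terms and N is the collection size. The sketch below shows only this
    definition; how the paper extends it to sentence similarity is not stated in
    the abstract, and the function and parameter names are hypothetical.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Term-level Normalized Google Distance from document counts.

    fx, fy: number of documents containing term x, term y
    fxy:    number of documents containing both terms
    n:      total number of documents (assumed larger than fx and fy)
    Illustrative sketch only; names are assumptions.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never occur, or never co-occur, are treated as maximally distant.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


if __name__ == "__main__":
    # Terms that always co-occur have distance 0.
    print(normalized_google_distance(100, 100, 100, 10_000))
    # Rarer co-occurrence gives a larger distance.
    print(normalized_google_distance(100, 100, 10, 10_000))
```

    Smaller values indicate terms that tend to appear together.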

    Drug-Drug Interaction Extraction from Biomedical Text Using Long Short Term Memory Network

    Sunil Kumar Sahu, Ashish Anand
    Comments: 10 pages, 3 figures
    Subjects: Computation and Language (cs.CL)

    A drug can affect the activity of other drugs, when administered together, in
    both synergistic and antagonistic ways. While synergistic effects lead to
    improved therapeutic outcomes, antagonistic consequences can be
    life-threatening, leading to increased healthcare cost, or may even cause
    death. Thus, identification of unknown drug-drug interactions (DDI) is an
    important concern for efficient and effective healthcare. Although there exist
    multiple resources for DDI, they are often unable to keep pace with the rich
    amount of information available in fast-growing biomedical texts, including
    the literature. Most existing methods model DDI extraction from text as a
    classification problem and mainly rely on handcrafted features. Some of these
    features further depend on domain-specific tools. Recently, neural network
    models using latent features have been shown to perform similarly to or better
    than other existing models using handcrafted features. In this paper, we
    present three models, namely B-LSTM, AB-LSTM and Joint AB-LSTM, based on long
    short-term memory (LSTM) networks. All three models utilize word and position
    embeddings as latent features and thus do not rely on feature engineering.
    Further, the use of bidirectional long short-term memory (Bi-LSTM) networks
    allows the extraction of optimal features from the whole sentence. The two
    models AB-LSTM and Joint AB-LSTM also use attentive pooling on the output of
    the Bi-LSTM layer to assign weights to features. Our experimental results on
    the SemEval-2013 DDI extraction dataset show that the Joint AB-LSTM model
    outperforms all the existing methods, including those relying on handcrafted
    features. The other two proposed models also perform competitively with
    state-of-the-art methods.

    Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation

    Nasrin Mostafazadeh, Chris Brockett, Bill Dolan, Michel Galley, Jianfeng Gao, Georgios P. Spithourakis, Lucy Vanderwende
    Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    The popularity of image sharing on social media reflects the important role
    visual context plays in everyday conversation. In this paper, we present a
    novel task, Image-Grounded Conversations (IGC), in which natural-sounding
    conversations are generated about shared photographic images. We investigate
    this task using training data derived from image-grounded conversations on
    social media and introduce a new dataset of crowd-sourced conversations for
    benchmarking progress. Experiments using deep neural network models trained on
    social media data show that the combination of visual and textual context can
    enhance the quality of generated conversational turns. In human evaluation, a
    gap between human performance and that of both neural and retrieval
    architectures suggests that IGC presents an interesting challenge for vision
    and language research.

    Adversarial Evaluation of Dialogue Models

    Anjuli Kannan, Oriol Vinyals
    Subjects: Computation and Language (cs.CL)

    The recent application of RNN encoder-decoder models has resulted in
    substantial progress in fully data-driven dialogue systems, but evaluation
    remains a challenge. An adversarial loss could be a way to directly evaluate
    the extent to which generated dialogue responses sound like they came from a
    human. This could reduce the need for human evaluation, while more directly
    evaluating on a generative task. In this work, we investigate this idea by
    training an RNN to discriminate a dialogue model’s samples from human-generated
    samples. Although we find some evidence this setup could be viable, we also
    note that many issues remain in its practical application. We discuss both
    aspects and conclude that future work is warranted.

    Methodologies for realizing natural-language-facilitated human-robot cooperation: A review

    Rui Liu, Xiaoli Zhang
    Comments: 30 pages, 15 figures, article submitted to Knowledge-based Systems, 2017 Jan
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)

    Natural Language (NL) for transferring knowledge from a human to a robot.
    Recently, research on using NL to support human-robot cooperation (HRC) has
    received increasing attention in several domains such as robotic daily
    assistance, robotic health caregiving, intelligent manufacturing, autonomous
    navigation and robot social accompany. However, a high-level review that can
    reveal the realization process and the latest methodologies of using NL to
    facilitate HRC is missing. In this review, a comprehensive summary about the
    methodology development of natural-language-facilitated human-robot cooperation
    (NLC) has been made. We first analyzed driving forces for NLC developments.
    Then, with a temporal realization order, we reviewed three main steps of NLC:
    human NL understanding, knowledge representation, and knowledge-world mapping.
    Last, based on our paper review and perspectives, potential research trends in
    NLC were discussed.

    Document Decomposition of Bangla Printed Text

    Md. Fahad Hasan, Tasmin Afroz, Sabir Ismail, Md. Saiful Islam
    Comments: 6 pages
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)

    Today all kinds of information are being digitized, and along with this
    digitization, huge archives of various kinds of documents are being digitized
    too. Optical Character Recognition is the method through which newspapers and
    other paper documents are converted into digital resources. However, this
    method works on text only. As a result, if we try to process any document
    which contains non-textual zones, we get garbage text as output. That is why,
    in order to digitize documents properly, they should be preprocessed
    carefully. During preprocessing, properly segmenting the document into
    different regions according to category is most important. However, the
    Optical Character Recognition processes available for the Bangla language have
    no algorithm that can fully categorize a newspaper or book page. So we worked
    to decompose a document into its several parts such as headlines, sub
    headlines, columns, images etc. If the input is skewed and rotated, it is also
    deskewed and de-rotated. To decompose a Bangla document, we find the edges of
    the input image. Then we find the horizontal and vertical area in which every
    pixel lies. The input image is then cut according to these areas. Next, we
    take each sub-image and find its height-width ratio and line height, and the
    sub-images are categorized according to these values. To deskew the image, we
    find the skew angle and deskew the image according to this angle. To de-rotate
    the image, we use the line height, matra line, and pixel ratio of the matra
    line.

    Systems of natural-language-facilitated human-robot cooperation: A review

    Rui Liu, Xiaoli Zhang
    Comments: 21 pages, 10 figures, article submitted to Knowledge-based Systems, 2017 Jan
    Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)

    Natural-language-facilitated human-robot cooperation (NLC), in which natural
    language (NL) is used to share knowledge between a human and a robot for
    conducting intuitive human-robot cooperation (HRC), is continuously developing
    in the recent decade. Currently, NLC is used in several robotic domains such as
    manufacturing, daily assistance and health caregiving. It is necessary to
    summarize current NLC-based robotic systems and discuss the future developing
    trends, providing helpful information for future NLC research. In this review,
    we first analyzed the driving forces behind NLC research. According to the
    robot's cognition level during the cooperation, the NLC implementations were
    then categorized into four types {NL-based control, NL-based robot training,
    NL-based task execution, NL-based social companion} for comparison and
    discussion. Last, based on our perspective and comprehensive paper review, the
    future research trends were discussed.

    Feature Studies to Inform the Classification of Depressive Symptoms from Twitter Data for Population Health

    Danielle Mowery, Craig Bryan, Mike Conway
    Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)

    The utility of Twitter data as a medium to support population-level mental
    health monitoring is not well understood. In an effort to better understand the
    predictive power of supervised machine learning classifiers and the influence
    of feature sets for efficiently classifying depression-related tweets on a
    large-scale, we conducted two feature study experiments. In the first
    experiment, we assessed the contribution of feature groups such as lexical
    information (e.g., unigrams) and emotions (e.g., strongly negative) using a
    feature ablation study. In the second experiment, we determined the percentile
    of top ranked features that produced the optimal classification performance by
    applying a three-step feature elimination approach. In the first experiment, we
    observed that lexical features are critical for identifying depressive
    symptoms, specifically for depressed mood (-35 points) and for disturbed sleep
    (-43 points). In the second experiment, we observed that the optimal F1-score
    performance of top ranked features in percentiles variably ranged across
    classes e.g., fatigue or loss of energy (5th percentile, 288 features) to
    depressed mood (55th percentile, 3,168 features) suggesting there is no
    consistent count of features for predicting depressive-related tweets. We
    conclude that simple lexical features and reduced feature sets can produce
    comparable results to larger feature sets.

    A Comprehensive Survey on Bengali Phoneme Recognition

    Sadia Tasnim Swarna, Shamim Ehsan, Md. Saiful Islam, Marium E Jannat
    Comments: 6 pages
    Subjects: Sound (cs.SD); Computation and Language (cs.CL)

    Various hidden Markov model based phoneme recognition methods for the Bengali
    language are reviewed. Automatic phoneme recognition for the Bengali language
    using multilayer neural networks is reviewed. The usefulness of multilayer
    neural networks over single layer neural networks is discussed. Bangla
    phonetic feature table construction and enhancement for Bengali speech
    recognition are also discussed. A comparison among these methods is discussed.


    Distributed, Parallel, and Cluster Computing

    Fog-Assisted wIoT: A Smart Fog Gateway for End-to-End Analytics in Wearable Internet of Things

    Nicholas Constant, Debanjan Borthakur, Mohammadreza Abtahi, Harishchandra Dubey, Kunal Mankodiya
    Comments: 5 pages, 4 figures, The 23rd IEEE Symposium on High Performance Computer Architecture HPCA 2017, (Feb. 4, 2017 – Feb. 8, 2017), Austin, Texas, USA
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computers and Society (cs.CY); Networking and Internet Architecture (cs.NI)

    Today, wearable internet-of-things (wIoT) devices continuously flood the
    cloud data centers at an enormous rate. This increases the demand to deploy an
    edge infrastructure for computing, intelligence, and storage close to the
    users. The emerging paradigm of fog computing could play an important role in
    making wIoT more efficient and affordable. Fog computing is known as the cloud on
    the ground. This paper presents an end-to-end architecture that performs data
    conditioning and intelligent filtering for generating smart analytics from
    wearable data. In wIoT, wearable sensor devices serve on one end while the
    cloud backend offers services on the other end. We developed a prototype of
    smart fog gateway (a middle layer) using Intel Edison and Raspberry Pi. We
    discussed the role of the smart fog gateway in orchestrating the process of
    data conditioning, intelligent filtering, smart analytics, and selective
    transfer to the cloud for long-term storage and temporal variability
    monitoring. We benchmarked the performance of developed prototypes on
    real-world data from smart e-textile gloves. Results demonstrated the usability
    and potential of proposed architecture for converting the real-world data into
    useful analytics while making use of knowledge-based models. In this way, the
    smart fog gateway enhances the end-to-end interaction between wearables (sensor
    devices) and the cloud.

    Autotuning GPU Kernels via Static and Predictive Analysis

    Robert V. Lim, Boyana Norris, Allen D. Malony
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)

    Optimizing the performance of GPU kernels is challenging for both human
    programmers and code generators. For example, CUDA programmers must set thread
    and block parameters for a kernel, but might not have the intuition to make a
    good choice. Similarly, compilers can generate working code, but may miss
    tuning opportunities by not targeting GPU models or performing code
    transformations. Although empirical autotuning addresses some of these
    challenges, it requires extensive experimentation and search for optimal code
    variants. This research presents an approach for tuning CUDA kernels based on
    static analysis that considers fine-grained code structure and the specific GPU
    architecture features. Notably, our approach does not require any program runs
    in order to discover near-optimal parameter settings. We demonstrate the
    applicability of our approach in enabling code autotuners such as Orio to
    produce competitive code variants comparable with empirical-based methods,
    without the high cost of experiments.

    RIoTBench: A Real-time IoT Benchmark for Distributed Stream Processing Platforms

    Anshu Shukla, Shilpa Chaturvedi, Yogesh Simmhan
    Comments: 33 pages. arXiv admin note: substantial text overlap with arXiv:1606.07621
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    The Internet of Things (IoT) is an emerging technology paradigm where
    millions of sensors and actuators help monitor and manage physical,
    environmental and human systems in real-time. The inherent closed-loop
    responsiveness and decision making of IoT applications make them ideal
    candidates for using low latency and scalable stream processing platforms.
    Distributed Stream Processing Systems (DSPS) hosted on Cloud data-centers are
    becoming the vital engine for real-time data processing and analytics in any
    IoT software architecture. But the efficacy and performance of contemporary
    DSPS have not been rigorously studied for IoT applications and data streams.
    Here, we develop RIoTBench, a Realtime IoT Benchmark suite, along with
    performance metrics, to evaluate DSPS for streaming IoT applications. The
    benchmark includes 27 common IoT tasks classified across various functional
    categories and implemented as reusable micro-benchmarks. Further, we propose
    four IoT application benchmarks composed from these tasks, and that leverage
    various dataflow semantics of DSPS. The applications are based on common IoT
    patterns for data pre-processing, statistical summarization and predictive
    analytics. These are coupled with four stream workloads sourced from real IoT
    observations on smart cities and fitness, with peak stream rates that range
    from 500 to 10000 messages/sec and diverse frequency distributions. We validate
    the RIoTBench suite for the popular Apache Storm DSPS on the Microsoft Azure
    public Cloud, and present empirical observations. This suite can be used by
    DSPS researchers for performance analysis and resource scheduling, and by IoT
    practitioners to evaluate DSPS platforms.

    IFCIoT: Integrated Fog Cloud IoT Architectural Paradigm for Future Internet of Things

    Arslan Munir, Prasanna Kansakar, Samee U. Khan
    Comments: 9 pages, 3 figures, accepted for publication in IEEE Consumer Electronics Magazine, July 2017 issue
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    We propose a novel integrated fog cloud IoT (IFCIoT) architectural paradigm
    that promises increased performance, energy efficiency, reduced latency,
    quicker response time, scalability, and better localized accuracy for future
    IoT applications. The fog nodes (e.g., edge servers, smart routers, base
    stations) receive computation offloading requests and sensed data from various
    IoT devices. To enhance performance, energy efficiency, and real-time
    responsiveness of applications, we propose a reconfigurable and layered fog
    node (edge server) architecture that analyzes the applications’ characteristics
    and reconfigures the architectural resources to better meet the peak workload
    demands. The layers of the proposed fog node architecture include application
    layer, analytics layer, virtualization layer, reconfiguration layer, and
    hardware layer. The layered architecture facilitates abstraction and
    implementation for fog computing paradigm that is distributed in nature and
    where multiple vendors (e.g., applications, services, data and content
    providers) are involved. We also elaborate the potential applications of IFCIoT
    architecture, such as smart cities, intelligent transportation systems,
    localized weather maps and environmental monitoring, and real-time agricultural
    data analytics and control.

    Accelerated Computing in Magnetic Resonance Imaging – Real-Time Imaging Using Non-Linear Inverse Reconstruction

    Sebastian Schaetz, Dirk Voit, Jens Frahm, Martin Uecker
    Comments: 22 pages, 8 figures, 6 tables
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Medical Physics (physics.med-ph)

    Purpose: To develop generic optimization strategies for image reconstruction
    using graphical processing units (GPUs) in magnetic resonance imaging (MRI) and
    to report, as an example, on our experience with a highly accelerated
    implementation of the non-linear inversion algorithm (NLINV) for dynamic MRI
    with high frame rates. Methods: The NLINV algorithm is optimized and ported to
    run on a multi-GPU single-node server. The algorithm is mapped to multiple
    GPUs by decomposing the data domain along the channel dimension. Furthermore,
    the algorithm is decomposed along the temporal domain by relaxing a temporal
    regularization constraint, allowing the algorithm to work on multiple frames in
    parallel. Finally, an autotuning method is presented that is capable of
    combining different decomposition variants to achieve optimal algorithm
    performance in different imaging scenarios. Results: The algorithm is
    successfully ported to a multi-GPU system and allows online image
    reconstruction with high frame rates. Real-time reconstruction with low latency
    and frame rates up to 30 frames per second is demonstrated. Conclusion: Novel
    parallel decomposition methods are presented which are applicable to many
    iterative algorithms for dynamic MRI. Using these methods to parallelize the
    NLINV algorithm on multiple GPUs it is possible to achieve online image
    reconstruction with high frame rates.

    pMR: A high-performance communication library

    Peter Georg, Daniel Richtmann, Tilo Wettig
    Comments: 7 pages, 2 figures, Proceedings of Lattice 2016
    Subjects: High Energy Physics – Lattice (hep-lat); Distributed, Parallel, and Cluster Computing (cs.DC); Computational Physics (physics.comp-ph)

    On many parallel machines, the time LQCD applications spend in communication
    is a significant contribution to the total wall-clock time, especially in the
    strong-scaling limit. We present a novel high-performance communication library
    that can be used as a de facto drop-in replacement for MPI in existing
    software. Its lightweight nature that avoids some of the unnecessary overhead
    introduced by MPI allows us to improve the communication performance of
    applications without any algorithmic or complicated implementation changes. As
    a first real-world benchmark, we make use of the pMR library in the coarse-grid
    solve of the Regensburg implementation of the DD-αAMG algorithm. On
    realistic lattices, we see an improvement by a factor of 2 in pure communication
    time and total execution time savings of up to 20%.


    Learning

    Memory Augmented Neural Networks with Wormhole Connections

    Caglar Gulcehre, Sarath Chandar, Yoshua Bengio
    Subjects: Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

    Recent empirical results on long-term dependency tasks have shown that neural
    networks augmented with an external memory can learn the long-term dependency
    tasks more easily and achieve better generalization than vanilla recurrent
    neural networks (RNN). We suggest that memory augmented neural networks can
    reduce the effects of vanishing gradients by creating shortcut (or wormhole)
    connections. Based on this observation, we propose a novel memory augmented
    neural network model called TARDIS (Temporal Automatic Relation Discovery in
    Sequences). The controller of TARDIS can store a selective set of embeddings of
    its own previous hidden states into an external memory and revisit them as and
    when needed. For TARDIS, memory acts as a storage for wormhole connections to
    the past to propagate the gradients more effectively and it helps to learn the
    temporal dependencies. The memory structure of TARDIS has similarities to both
    Neural Turing Machines (NTM) and Dynamic Neural Turing Machines (D-NTM), but
    both read and write operations of TARDIS are simpler and more efficient. We use
    discrete addressing for read/write operations which helps to substantially
    reduce the vanishing gradient problem with very long sequences. Read and write
    operations in TARDIS are tied with a heuristic once the memory becomes full,
    and this makes the learning problem simpler when compared to NTM or D-NTM type
    of architectures. We provide a detailed analysis on the gradient propagation in
    general for MANNs. We evaluate our models on different long-term dependency
    tasks and report competitive results in all of them.

    Predicting Auction Price of Vehicle License Plate with Deep Recurrent Neural Network

    Vinci Chow
    Subjects: Learning (cs.LG); Economics (q-fin.EC); Machine Learning (stat.ML)

    In Chinese societies where superstition is of paramount importance, vehicle
    license plates with desirable numbers can fetch very high prices in
    auctions. Unlike auctions of other valuable items, however, license plates do
    not get an estimated price before auction. In this paper, I construct a deep
    recurrent neural network to predict the prices of vehicle license plates in
    Hong Kong based on the characters on a plate. Trained with 13 years of
    historical auction prices, the deep RNN outperforms previous models by
    a significant margin.

    A Unifying Framework for Guiding Point Processes with Stochastic Intensity Functions

    Yichen Wang, Grady Williams, Evangelos Theodorou, Le Song
    Subjects: Learning (cs.LG); Social and Information Networks (cs.SI); Systems and Control (cs.SY); Optimization and Control (math.OC)

    Temporal point processes are powerful tools to model event occurrences and
    have a plethora of applications in social sciences. While the majority of prior
    works focus on the modeling and learning of these processes, we consider the
    problem of how to design the optimal control policy for general point process
    with stochastic intensities, such that the stochastic system driven by the
    process is steered to a target state. In particular, we exploit the novel
    insight from the information theoretic formulations of stochastic optimal
    control. We further propose a novel convex optimization framework and a highly
    efficient online algorithm to update the policy adaptively to the current
    system state. Experiments on synthetic and real-world data show that our
    algorithm can steer the user activities much more accurately than
    state-of-the-art methods.

    Binary adaptive embeddings from order statistics of random projections

    Diego Valsesia, Enrico Magli
    Subjects: Learning (cs.LG); Information Retrieval (cs.IR)

    We use some of the largest order statistics of the random projections of a
    reference signal to construct a binary embedding that is adapted to signals
    correlated with such signal. The embedding is characterized from the analytical
    standpoint and shown to provide improved performance on tasks such as
    classification in a reduced-dimensionality space.

    Model-based Classification and Novelty Detection For Point Pattern Data

    Ba-Ngu Vo, Quang N. Tran, Dinh Phung, Ba-Tuong Vo
    Comments: Preprint: 23rd Int. Conf. Pattern Recognition (ICPR). Cancun, Mexico, December 2016
    Subjects: Learning (cs.LG)

    Point patterns are sets or multi-sets of unordered elements that can be found
    in numerous data sources. However, in data analysis tasks such as
    classification and novelty detection, appropriate statistical models for point
    pattern data have not received much attention. This paper proposes the
    modelling of point pattern data via random finite sets (RFS). In particular, we
    propose appropriate likelihood functions, and a maximum likelihood estimator
    for learning a tractable family of RFS models. In novelty detection, we propose
    novel ranking functions based on RFS models, which substantially improve
    performance.

    Transformation-Based Models of Video Sequences

    Joost van Amersfoort, Anitha Kannan, Marc'Aurelio Ranzato, Arthur Szlam, Du Tran, Soumith Chintala
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    In this work we propose a simple unsupervised approach for next frame
    prediction in video. Instead of directly predicting the pixels in a frame given
    past frames, we predict the transformations needed for generating the next
    frame in a sequence, given the transformations of the past frames. This leads
    to sharper results, while using a smaller prediction model.

    In order to enable a fair comparison between different video frame prediction
    models, we also propose a new evaluation protocol. We use generated frames as
    input to a classifier trained with ground truth sequences. This criterion
    guarantees that models scoring high are those producing sequences which
    preserve discriminative features, as opposed to merely penalizing any
    deviation, plausible or not, from the ground truth. Our proposed approach
    compares favourably against more sophisticated ones on the UCF-101 data set,
    while also being more efficient in terms of the number of parameters and
    computational cost.

    When Slepian Meets Fiedler: Putting a Focus on the Graph Spectrum

    Dimitri Van De Ville, Robin Demesmaeker, Maria Giulia Preti
    Comments: 4 pages, 4 figures, submitted to IEEE Signal Processing Letters
    Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

    Network models play an important role in studying complex systems in many
    scientific disciplines. Graph signal processing is receiving growing interest
    as a way to design novel tools that combine the analysis of topology and
    signals. The graph Fourier transform, defined as the eigendecomposition of
    the graph Laplacian, allows extending conventional signal-processing
    operations to graphs. One main feature is to let global organization emerge
    from local interactions; e.g., the Fiedler vector has the smallest non-zero
    eigenvalue and
    is key for Laplacian embedding and graph clustering. Here, we introduce the
    design of Slepian graph signals, by maximizing energy concentration in a
    predefined subgraph for a given spectral bandlimit. We also establish a link
    with classical Laplacian embedding and graph clustering, for which the graph
    Slepian design can serve as a generalization.

    Click Through Rate Prediction for Contextual Advertisment Using Linear Regression

    Muhammad Junaid Effendi, Syed Abbas Ali
    Comments: 8 pages, 13 Figures, 11 Tables
    Subjects: Information Retrieval (cs.IR); Learning (cs.LG)

    This research presents an innovative and unique way of solving the
    advertisement prediction problem, which has been treated as a learning problem
    over the past several years. Online advertising is a multi-billion-dollar
    industry and is growing every year at a rapid pace. The goal of this research
    is to enhance the click through rate of contextual advertisements using Linear
    Regression. To address this problem, a new technique is proposed in this
    paper to predict the CTR which will increase the overall revenue of the system
    by serving the advertisements more suitable to the viewers with the help of
    feature extraction and displaying the advertisements based on context of the
    publishers. The important steps include the data collection, feature
    extraction, CTR prediction and advertisement serving. The statistical results
    obtained from the dynamically used technique show an efficient outcome by
    fitting the data close to perfection for the LR technique using optimized
    feature selection.

    PathNet: Evolution Channels Gradient Descent in Super Neural Networks

    Chrisantha Fernando, Dylan Banarse, Charles Blundell, Yori Zwols, David Ha, Andrei A. Rusu, Alexander Pritzel, Daan Wierstra
    Subjects: Neural and Evolutionary Computing (cs.NE); Learning (cs.LG)

    For artificial general intelligence (AGI) it would be efficient if multiple
    users trained the same giant neural network, permitting parameter reuse,
    without catastrophic forgetting. PathNet is a first step in this direction. It
    is a neural network algorithm that uses agents embedded in the neural network
    whose task is to discover which parts of the network to re-use for new tasks.
    Agents are pathways (views) through the network which determine the subset of
    parameters that are used and updated by the forwards and backwards passes of
    the backpropagation algorithm. During learning, a tournament selection genetic
    algorithm is used to select pathways through the neural network for replication
    and mutation. Pathway fitness is the performance of that pathway measured
    according to a cost function. We demonstrate successful transfer learning;
    fixing the parameters along a path learned on task A and re-evolving a new
    population of paths for task B, allows task B to be learned faster than it
    could be learned from scratch or after fine-tuning. Paths evolved on task B
    re-use parts of the optimal path evolved on task A. Positive transfer was
    demonstrated for binary MNIST, CIFAR, and SVHN supervised learning
    classification tasks, and a set of Atari and Labyrinth reinforcement learning
    tasks, suggesting PathNets have general applicability for neural network
    training. Finally, PathNet also significantly improves the robustness to
    hyperparameter choices of a parallel asynchronous reinforcement learning
    algorithm (A3C).
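
    To make the tournament selection step described above more concrete, here
    is a toy sketch. It assumes a binary tournament in which the loser is
    overwritten by a mutated copy of the winner, and it replaces "train the
    pathway and measure its cost" with a made-up `toy_fitness`. The function
    names, the mutation rule and every parameter value are illustrative
    assumptions, not the authors' implementation.

```python
import random


def random_pathway(rng, layers, modules_per_layer, active_per_layer):
    """A pathway selects a few module indices in each layer of the network."""
    return [rng.sample(range(modules_per_layer), active_per_layer)
            for _ in range(layers)]


def mutate(rng, pathway, modules_per_layer, rate):
    """Independently re-draw each module index with probability `rate`."""
    return [[rng.randrange(modules_per_layer) if rng.random() < rate else m
             for m in layer]
            for layer in pathway]


def toy_fitness(pathway):
    """Hypothetical stand-in for task performance: number of modules equal to 0."""
    return sum(m == 0 for layer in pathway for m in layer)


def tournament_step(rng, population, fitness, modules_per_layer, rate):
    """Binary tournament: the loser is replaced by a mutated copy of the winner."""
    i, j = rng.sample(range(len(population)), 2)
    if fitness(population[i]) < fitness(population[j]):
        i, j = j, i  # make i the winner
    population[j] = mutate(rng, population[i], modules_per_layer, rate)


rng = random.Random(1)
population = [random_pathway(rng, 3, 10, 3) for _ in range(8)]
best_before = max(toy_fitness(p) for p in population)
for _ in range(500):
    tournament_step(rng, population, toy_fitness, 10, 0.1)
best_after = max(toy_fitness(p) for p in population)
```

    Because the winner of each tournament is left untouched, the best fitness
    in the population never decreases in this sketch.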

    Does Weather Matter? Causal Analysis of TV Logs

    Shi Zong, Branislav Kveton, Shlomo Berkovsky, Azin Ashkan, Nikos Vlassis, Zheng Wen
    Subjects: Computers and Society (cs.CY); Learning (cs.LG)

    Weather affects our mood and behaviors, and many aspects of our life. When it
    is sunny, most people become happier; but when it rains, some people get
    depressed. Despite this evidence and the abundance of data, weather has mostly
    been overlooked in the machine learning and data science research. This work
    presents a causal analysis of how weather affects TV watching patterns. We show
    that some weather attributes, such as pressure and precipitation, cause major
    changes in TV watching patterns. To the best of our knowledge, this is the
    first large-scale causal study of the impact of weather on TV watching
    patterns.

    A Comparative Study on Different Types of Approaches to Bengali document Categorization

    Md. Saiful Islam, Fazla Elahi Md Jubayer, Syed Ikhtiar Ahmed
    Comments: 6 pages
    Subjects: Computation and Language (cs.CL); Learning (cs.LG)

    Document categorization is a technique where the category of a document is
    determined. In this paper, three well-known supervised learning techniques,
    Support Vector Machine (SVM), Naïve Bayes (NB) and Stochastic Gradient
    Descent (SGD), are compared for Bengali document categorization. Besides the
    classifier, classification also depends on how features are selected from the
    dataset. To analyze the performance of those classifiers in assigning a
    document to one of twelve categories, several feature selection techniques are
    also applied in this article, namely Chi-square distribution and normalized
    TFIDF (term frequency-inverse document frequency) with a word analyzer. So, we
    attempt to explore the efficiency of those three classification algorithms by
    using two different feature selection techniques in this article.

    Self-Adaptation of Activity Recognition Systems to New Sensors

    David Bannach, Martin Jänicke, Vitor F. Rey, Sven Tomforde, Bernhard Sick, Paul Lukowicz
    Comments: 26 pages, very descriptive figures, comprehensive evaluation on real-life datasets
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

    Traditional activity recognition systems work on the basis of training,
    taking a fixed set of sensors into account. In this article, we focus on the
    question of how pattern recognition can leverage new information sources
    without any, or with minimal, user input. Thus, we present an approach for opportunistic
    activity recognition, where ubiquitous sensors lead to dynamically changing
    input spaces. Our method is a variation of well-established principles of
    machine learning, relying on unsupervised clustering to discover structure in
    data and inferring cluster labels from a small number of labeled data points in a
    semi-supervised manner. Elaborating the challenges, evaluations of over 3000
    sensor combinations from three multi-user experiments are presented in detail
    and show the potential benefit of our approach.

    Predicting SMT Solver Performance for Software Verification

    Andrew Healy (Maynooth University), Rosemary Monahan (Maynooth University), James F. Power (Maynooth University)
    Comments: In Proceedings F-IDE 2016, arXiv:1701.07925
    Journal-ref: EPTCS 240, 2017, pp. 20-37
    Subjects: Software Engineering (cs.SE); Learning (cs.LG); Logic in Computer Science (cs.LO)

    The Why3 IDE and verification system facilitates the use of a wide range of
    Satisfiability Modulo Theories (SMT) solvers through a driver-based
    architecture. We present Where4: a portfolio-based approach to discharge Why3
    proof obligations. We use data analysis and machine learning techniques on
    static metrics derived from program source code. Our approach benefits software
    engineers by providing a single utility to delegate proof obligations to the
    solvers most likely to return a useful result. It does this in a time-efficient
    way using existing Why3 and solver installations – without requiring low-level
    knowledge about SMT solver operation from the user.

    One Size Fits All : Effectiveness of Local Search on Structured Data

    Vincent Cohen-Addad, Chris Schwiegelshohn
    Subjects: Data Structures and Algorithms (cs.DS); Computational Geometry (cs.CG); Learning (cs.LG)

    In this paper, we analyze the performance of a simple and standard Local
    Search algorithm for clustering on well behaved data. Since the seminal paper
    by Ostrovsky, Rabani, Schulman and Swamy [FOCS 2006], much progress has been
    made to characterize real-world instances. We distinguish the three main
    definitions: Distribution Stability (Awasthi, Blum, Sheffet, FOCS 2010),
    Spectral Separability (Kumar, Kannan, FOCS 2010), and Perturbation Resilience
    (Bilu, Linial, ICS 2010). We show that Local Search performs well on the
    instances with the aforementioned stability properties. Specifically, for the
    \(k\)-means and \(k\)-median objectives, we show that Local Search exactly
    recovers the optimal clustering if the dataset is
    \((3+\varepsilon)\)-perturbation
    resilient, and is a PTAS for distribution stability and spectral separability.
    This implies the first PTAS for instances satisfying the spectral separability
    condition. For the distribution stability condition we also go beyond previous
    work by showing that the clustering output by the algorithm and the optimal
    clustering are very similar. This is a significant step toward understanding
    the success of Local Search heuristics in clustering applications and supports
    the legitimacy of the stability conditions: They characterize some of the
    structure of real-world instances that make Local Search a popular heuristic.
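
    As a rough illustration of the kind of Local Search heuristic analyzed
    above, here is a minimal single-swap sketch for the k-median objective. The
    abstract does not state the exact variant studied, so the swap size, the
    restriction of centers to input points, the initial solution, and the names
    `cost` and `local_search_k_median` are all assumptions made for this example.

```python
import math


def cost(points, centers):
    """k-median cost: sum of distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def local_search_k_median(points, k):
    """Single-swap local search; centers are restricted to input points (assumption)."""
    centers = list(points[:k])  # arbitrary initial solution
    best = cost(points, centers)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in centers:
                    continue
                # Try replacing the i-th center with the candidate point.
                trial = centers[:i] + [candidate] + centers[i + 1:]
                trial_cost = cost(points, trial)
                if trial_cost < best - 1e-12:  # accept only strict improvements
                    centers, best = trial, trial_cost
                    improved = True
    return centers, best


# Two well-separated groups of points on a line.
points = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
centers, total = local_search_k_median(points, 2)
```

    On this toy instance the search ends with one center in the middle of each
    group. Each accepted swap strictly lowers the cost, so the loop terminates.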

    Feature base fusion for splicing forgery detection based on neuro fuzzy

    Habib Ghaffari Hadigheh, Ghazali bin sulong
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

    Most research on image forensics has mainly focused on detection
    of artifacts introduced by a single processing tool. This has led to the
    development of many specialized algorithms looking for one or more particular
    footprints under specific settings. Naturally, the performance of such
    algorithms is not perfect, and accordingly the provided output might be noisy,
    inaccurate and only partially correct. Furthermore, a forged image in practical
    scenarios is often the result of utilizing several tools made available by
    image-processing software systems. Therefore, reliable tamper detection
    requires developing more powerful tools to deal with various tampering
    scenarios. Fusion of forgery detection tools based on Fuzzy Inference System
    has been used before for addressing this problem. Adjusting the membership
    functions and defining proper fuzzy rules to attain better results are
    time-consuming processes. This can be regarded as the main disadvantage of fuzzy
    inference systems. In this paper, a Neuro-Fuzzy inference system for fusion of
    forgery detection tools is developed. The neural network characteristic of
    these systems provides appropriate tool for automatically adjusting the
    membership functions. Moreover, initial fuzzy inference system is generated
    based on fuzzy clustering techniques. The proposed framework is implemented and
    validated on a benchmark image splicing data set in which three forgery
    detection tools are fused based on adaptive Neuro-Fuzzy inference system. The
    outcome of the proposed method reveals that applying Neuro Fuzzy inference
    systems could be a better approach for fusion of forgery detection tools.

    Deep Recurrent Neural Network for Protein Function Prediction from Sequence

    Xueliang Liu
    Subjects: Quantitative Methods (q-bio.QM); Learning (cs.LG); Biomolecules (q-bio.BM); Machine Learning (stat.ML)

    As high-throughput biological sequencing becomes faster and cheaper, the need
    to extract useful information from sequencing becomes ever more paramount,
    often limited by low-throughput experimental characterizations. For proteins,
    accurate prediction of their functions directly from their primary amino-acid
    sequences has been a long standing challenge. Here, machine learning using
    artificial recurrent neural networks (RNN) was applied towards classification
    of protein function directly from primary sequence without sequence alignment,
    heuristic scoring or feature engineering. The RNN models containing
    long-short-term-memory (LSTM) units trained on public, annotated datasets from
    UniProt achieved high performance for in-class prediction of four important
    protein functions tested, particularly compared to other machine learning
    algorithms using sequence-derived protein features. RNN models were used also
    for out-of-class predictions of phylogenetically distinct protein families with
    similar functions, including proteins of the CRISPR-associated nuclease,
    ferritin-like iron storage and cytochrome P450 families. Applying the trained
    RNN models on the partially unannotated UniRef100 database predicted not only
    candidates validated by existing annotations but also currently unannotated
    sequences. Some RNN predictions for the ferritin-like iron sequestering
    function were experimentally validated, even though their sequences differ
    significantly from known, characterized proteins and from each other and cannot
    be easily predicted using popular bioinformatics methods. As sequencing and
    experimental characterization data increases rapidly, the machine-learning
    approach based on RNN could be useful for discovery and prediction of
    homologues for a wide range of protein functions.


    Information Theory

    On the Lattice of Cyclic Linear Codes Over Finite Chain Rings

    Alexandre Fotue Tabue, Christophe Mouaha
    Subjects: Information Theory (cs.IT)

    Let \(\texttt{R}\) be a commutative finite chain ring of invariants \((q,s)\). In
    this paper, the trace representation of any free cyclic \(\texttt{R}\)-linear
    code of length \(\ell\) is presented, via the \(q\)-cyclotomic cosets modulo
    \(\ell\), when \(\texttt{gcd}(\ell, q) = 1\). The lattice
    \(\left(\texttt{Cy}(\texttt{R},\ell), +, \cap\right)\) of cyclic
    \(\texttt{R}\)-linear codes of length \(\ell\) is investigated. A lower bound on
    the Hamming distance of cyclic \(\texttt{R}\)-linear codes of length \(\ell\) is
    established. When \(q\) is even, a family of MDS and self-orthogonal
    \(\texttt{R}\)-linear cyclic codes is constructed.
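
    The q-cyclotomic cosets modulo ell mentioned above follow the standard
    definition: the coset of a is {a, a*q, a*q^2, ...} reduced modulo ell, which
    is well defined when gcd(ell, q) = 1. A minimal sketch for computing them
    (the function name `cyclotomic_cosets` is chosen here for illustration):

```python
from math import gcd


def cyclotomic_cosets(q, ell):
    """Return the q-cyclotomic cosets modulo ell, assuming gcd(ell, q) == 1."""
    if gcd(ell, q) != 1:
        raise ValueError("q and ell must be coprime")
    seen = set()
    cosets = []
    for start in range(ell):
        if start in seen:
            continue
        coset = []
        x = start
        # Multiply by q repeatedly until the orbit returns to its start.
        while x not in seen:
            seen.add(x)
            coset.append(x)
            x = (x * q) % ell
        cosets.append(coset)
    return cosets


cosets_2_7 = cyclotomic_cosets(2, 7)
cosets_3_8 = cyclotomic_cosets(3, 8)
```

    For q = 2 and ell = 7 this yields the three cosets {0}, {1, 2, 4} and
    {3, 6, 5}.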

    Contraction of Cyclic Codes Over Finite Chain Rings

    Alexandre Fotue Tabue, Christophe Mouaha
    Subjects: Information Theory (cs.IT)

    Let \(\texttt{R}\) be a commutative finite chain ring of invariants \((q,s)\) and
    \(\Gamma(\texttt{R})\) the Teichmüller set of \(\texttt{R}\). In this paper,
    the trace representation of cyclic \(\texttt{R}\)-linear codes of length \(\ell\) is
    presented, when \(\texttt{gcd}(\ell, q) = 1\). We will show that the contractions
    of some cyclic \(\texttt{R}\)-linear codes of length \(u\ell\) are
    \(\gamma\)-constacyclic \(\texttt{R}\)-linear codes of length \(\ell\), where
    \(\gamma\in\Gamma(\texttt{R})\) and the multiplicative order of \(\gamma\) is \(u\).

    On the Computation of the Shannon Capacity of a Discrete Channel with Noise

    Simon Cowell
    Comments: 15 pages
    Subjects: Information Theory (cs.IT)

    Muroga [M52] showed how to express the Shannon channel capacity of a discrete
    channel with noise [S49] as an explicit function of the transition
    probabilities. His method accommodates channels with any finite number of input
    symbols, any finite number of output symbols and any transition probability
    matrix. Silverman [S55] carried out Muroga’s method in the special case of a
    binary channel (and went on to analyse “cascades” of several such binary
    channels).

    This article is a note on the resulting formula for the capacity C(a, c) of a
    single binary channel. We aim to clarify some of the arguments and correct a
    small error. In service of this aim, we first formulate several of Shannon’s
    definitions and proofs in terms of discrete measure-theoretic probability
    theory. We provide an alternate proof to Silverman’s, of the feasibility of the
    optimal input distribution for a binary channel. For convenience, we also
    express C(a, c) in a single expression explicitly dependent on a and c only,
    which Silverman stopped short of doing.
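
    For readers who want to check a capacity value numerically, the sketch below
    maximizes the mutual information of a binary channel over the input
    distribution. It does not use the closed-form expression discussed in the
    article: it swaps in a plain ternary search, which works because mutual
    information is concave in the input distribution. The parameter names `p01`
    (probability that input 0 is received as 1) and `p10` (probability that
    input 1 is received as 0) are assumptions of this example and are not meant
    to match the article's a and c.

```python
import math


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def mutual_information(pi, p01, p10):
    """I(X;Y) in bits for P(X=1) = pi and crossover probabilities p01, p10."""
    py1 = (1 - pi) * p01 + pi * (1 - p10)  # P(Y=1)
    return h2(py1) - (1 - pi) * h2(p01) - pi * h2(p10)


def capacity(p01, p10, iters=200):
    """Maximize I(X;Y) over pi by ternary search on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if mutual_information(m1, p01, p10) < mutual_information(m2, p01, p10):
            lo = m1
        else:
            hi = m2
    pi = (lo + hi) / 2
    return mutual_information(pi, p01, p10), pi


c_bsc, pi_bsc = capacity(0.1, 0.1)  # binary symmetric channel
c_z, _ = capacity(0.0, 0.5)         # Z-channel
```

    For the symmetric case the result can be compared with the textbook value
    1 - h2(p), attained at the uniform input distribution.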

    Variable-Length Resolvability for General Sources and Channels

    Hideki Yagi, Te Sun Han
    Comments: Submitted to IEEE Trans. on Inf. Theory, Jan. 2017
    Subjects: Information Theory (cs.IT)

    We introduce the problem of variable-length source resolvability, where a
    given target probability distribution is approximated by encoding a
    variable-length uniform random number, and the asymptotically minimum average
    length rate of the uniform random numbers, called the (variable-length)
    resolvability, is investigated. We first analyze the variable-length
    resolvability with the variational distance as an approximation measure. Next,
    we investigate the case under the divergence as an approximation measure. When
    the asymptotically exact approximation is required, it is shown that the
    resolvability under the two kinds of approximation measures coincides. We then
    extend the analysis to the case of channel resolvability, where the target
    distribution is the output distribution via a general channel due to the fixed
    general source as an input. The obtained characterization of the channel
    resolvability is fully general in the sense that when the channel is just the
    identity mapping, the characterization reduces to the general formula for the
    source resolvability. We also analyze the second-order variable-length
    resolvability.

    Signal Recovery from Unlabeled Samples

    Saeid Haghighatshoar, Giuseppe Caire
    Comments: 8 pages, 4 figures. A short version of the paper was submitted to ISIT 2017, Aachen, Germany
    Subjects: Information Theory (cs.IT); Machine Learning (stat.ML)

    In this paper, we study the recovery of a signal from a collection of
    unlabeled and possibly noisy measurements via a measurement matrix with random
    i.i.d. Gaussian components. We call the measurements unlabeled since their
    order is missing, namely, it is not known a priori which elements of the
    resulting measurements correspond to which row of the measurement matrix. We
    focus on the special case of ordered measurements, where only a subset of the
    measurements is kept and the order of the taken measurements is preserved. We
    identify a natural duality between this problem and the traditional Compressed
    Sensing, where we show that the unknown support (location of nonzero elements)
    of a sparse signal in Compressed Sensing corresponds in a natural way to the
    unknown location of the measurements kept in unlabeled sensing. While in
    Compressed Sensing it is possible to recover a sparse signal from an
    under-determined set of linear equations (fewer equations than the dimension of
    the signal), successful recovery in unlabeled sensing requires taking more
    samples than the dimension of the signal. We develop a low-complexity
    alternating minimization algorithm to recover the initial signal from the set
    of its unlabeled samples. We also study the behavior of the proposed algorithm
    for different signal dimensions and number of measurements both theoretically
    and empirically via numerical simulations. The results are reminiscent of the
    phase transition that occurs in Compressed Sensing.

    Ultra Reliable Communication via Optimum Power Allocation for Type-I ARQ in Finite Block-Length

    Endrit Dosti, Uditha Lakmal Wijewardhana, Hirley Alves, Matti Latva-aho
    Comments: Accepted IEEE ICC 2017, May 21-25, Paris, France
    Subjects: Information Theory (cs.IT)

    We analyze the performance of the type-I automatic repeat request (ARQ)
    protocol with ultra-reliability constraints. First, we show that achieving a
    very low packet outage probability by using an open loop setup is a difficult
    task. Thus, we introduce the ARQ protocol as a solution for achieving the
    required low outage probabilities for ultra reliable communication. For this
    protocol, we present an optimal power allocation scheme that would allow us to
    reach any outage probability target in the finite block-length regime. We
    formulate the power allocation problem as minimization of the average
    transmitted power under a given outage probability and maximum transmit power
    constraint. By utilizing the Karush-Kuhn-Tucker (KKT) conditions, we solve the
    optimal power allocation problem and provide a closed form solution. Next, we
    analyze the effect of implementing the ARQ protocol on the throughput. We show
    that by using the proposed power allocation scheme we can minimize the loss of
    throughput that is caused by retransmissions. Furthermore, we analyze the
    effect of the feedback delay length in our scheme.

    Low Dimensional Atomic Norm Representations in Line Spectral Estimation

    Maxime Ferreira Da Costa, Wei Dai
    Subjects: Information Theory (cs.IT)

    The line spectral estimation problem consists in recovering the frequencies
    of a complex valued time signal that is assumed to be sparse in the spectral
    domain from its discrete observations. Unlike the gridding required by the
    classical compressed sensing framework, line spectral estimation reconstructs
    signals whose spectral supports lie continuously in the Fourier domain. Although
    recent advances have shown that atomic norm relaxation produces highly robust
    estimates in this context, the computational cost of this approach remains
    the major obstacle to its application in practical systems.

    In this work, we aim to address the complexity issue by studying the atomic
    norm minimization problem from a low-dimensional projection of the signal
    samples. We derive conditions on the sub-sampling matrix under which the
    partial atomic norm can be expressed by a low-dimensional semidefinite program.
    Moreover, we illustrate the tightness of this relaxation by showing that it is
    possible to recover the original signal in poly-logarithmic time for two
    specific sub-sampling patterns.

    Optimal Transport to the Entropy-Power Inequality and a Reverse Inequality

    Olivier Rioul
    Subjects: Information Theory (cs.IT)

    We present a simple proof of the entropy-power inequality using an optimal
    transportation argument which takes the form of a simple change of variables.
    The same argument yields a reverse inequality involving a conditional
    differential entropy which has its own interest. For each inequality, the
    equality case is easily captured by this method and the proof is formally
    identical in one and several dimensions.
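    For reference, the inequality in question can be written as follows. This is
    the standard textbook statement, not a formula quoted from the paper; the
    symbols \(N\), \(h\), and \(n\) are the usual ones and are assumptions with
    respect to this abstract.

```latex
% X, Y: independent random vectors in R^n with densities
% h(.): differential entropy in nats;  N(.): entropy power
N(X) = \frac{1}{2\pi e}\, e^{2h(X)/n},
\qquad
N(X+Y) \;\ge\; N(X) + N(Y)
```

    In this standard form, equality holds when \(X\) and \(Y\) are Gaussian with
    proportional covariance matrices.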

    Non-Orthogonal Multiple Access Schemes in Wireless Powered Communication Networks

    Mohamed A. Abd-Elmagid, Alessandro Biason, Tamer ElBatt, Karim G. Seddik, Michele Zorzi
    Comments: Accepted for publication in IEEE International Conference on Communications (ICC), Paris, France, May 2017
    Subjects: Information Theory (cs.IT)

    We characterize time and power allocations to optimize the sum-throughput of
    a Wireless Powered Communication Network (WPCN) with Non-Orthogonal Multiple
    Access (NOMA). In our setup, an Energy Rich (ER) source broadcasts wireless
    energy to several devices, which use it to simultaneously transmit data to an
    Access Point (AP) on the uplink. Unlike most prior works, in this
    paper we consider a generic scenario in which the ER and AP do not coincide,
    i.e., they are two separate entities. We study two NOMA decoding schemes, namely Low
    Complexity Decoding (LCD) and Successive Interference Cancellation Decoding
    (SICD). For each scheme, we formulate a sum-throughput optimization problem
    over a finite horizon. Despite the complexity of the LCD optimization problem,
    attributed to its non-convexity, we recast it into a series of geometric
    programs. On the other hand, we establish the convexity of the SICD
    optimization problem and propose an algorithm to find its optimal solution. Our
    numerical results demonstrate the importance of using successive interference
    cancellation in WPCNs with NOMA, and show how the energy should be distributed
    as a function of the system parameters.

    Fast and Lightweight Rate Control for Onboard Predictive Coding of Hyperspectral Images

    Diego Valsesia, Enrico Magli
    Subjects: Information Theory (cs.IT)

    Predictive coding is attractive for compression of hyperspectral images
    onboard spacecraft in light of the excellent rate-distortion performance
    and low complexity of recent schemes. In this letter we propose a rate control
    algorithm and integrate it in a lossy extension to the CCSDS-123 lossless
    compression recommendation. The proposed rate control algorithm overhauls our previous
    scheme by being orders of magnitude faster and simpler to implement, while
    still providing the same accuracy in terms of output rate and comparable or
    better image quality.

    On Zero Error Capacity of Nearest Neighbor Error Channels with Multilevel Alphabet

    Takafumi Nakano, Tadashi Wadayama
    Subjects: Information Theory (cs.IT)

    This paper studies the zero error capacity of the Nearest Neighbor Error
    (NNE) channels with a multilevel alphabet. In the NNE channels, a transmitted
    symbol is a \(d\)-tuple of elements in \(\{0,1,2,\dots,n-1\}\). It is assumed
    that only one element error to a nearest neighbor element in a transmitted
    symbol can occur. The NNE channels can be considered as a special type of
    limited magnitude error channels, and they are closely related to error models for
    flash memories. In this paper, we derive a lower bound of the zero error
    capacity of the NNE channels based on a result of the perfect Lee codes. An
    upper bound of the zero error capacity of the NNE channels is also derived from
    a feasible solution of a linear programming problem defined based on the
    confusion graphs of the NNE channels. As a result, a concise formula of the
    zero error capacity is obtained using the lower and upper bounds.

    Communication Cost of Transforming a Nearest Plane Partition to the Voronoi Partition

    V. A. Vaishampayan, M. F. Bollauf
    Comments: 5 pages, 5 figures
    Subjects: Information Theory (cs.IT)

    We consider the problem of distributed computation of the nearest lattice
    point for a two dimensional lattice. An interactive model of communication is
    considered. We address the problem of reconfiguring a specific rectangular
    partition, a nearest plane, or Babai, partition, into the Voronoi partition.
    Expressions are derived for the error probability as a function of the total
    number of communicated bits. With an infinite number of allowed communication
    rounds, the average cost of achieving zero error probability is shown to be
    finite. For the interactive model, with a single round of communication,
    expressions are obtained for the error probability as a function of the bits
    exchanged. We observe that the error exponent depends on the lattice.

    On the Communication Cost of Determining an Approximate Nearest Lattice Point

    M. F. Bollauf, V. A. Vaishampayan, S. I. R. Costa
    Comments: 5 pages, 6 figures
    Subjects: Information Theory (cs.IT)

    We consider the closest lattice point problem in a distributed network
    setting and study the communication cost and the error probability for
    computing an approximate nearest lattice point, using the nearest-plane
    algorithm, due to Babai. Two distinct communication models, centralized and
    interactive, are considered. The importance of proper basis selection is
    addressed. Assuming a reduced basis for a two-dimensional lattice, we determine
    the approximation error of the nearest plane algorithm. The communication cost
    for determining the Babai point, or equivalently, for constructing the
    rectangular nearest-plane partition, is calculated in the interactive setting.
    For the centralized model, an algorithm is presented for reducing the
    communication cost of the nearest plane algorithm in an arbitrary number of
    dimensions.
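    The nearest-plane algorithm referred to above can be sketched in a few lines.
    This is a minimal, centralized version of Babai's procedure for illustration
    only; it does not model the communication protocols studied in the paper, and
    the function names are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return the Gram-Schmidt orthogonalization (without normalization)."""
    ortho = []
    for b in basis:
        v = list(b)
        for o in ortho:
            mu = dot(b, o) / dot(o, o)
            v = [vi - mu * oi for vi, oi in zip(v, o)]
        ortho.append(v)
    return ortho


def nearest_plane(basis, target):
    """Babai's nearest-plane algorithm.

    Returns the integer coefficients and the corresponding lattice point,
    which approximates the lattice point closest to `target`.
    """
    ortho = gram_schmidt(basis)
    residual = list(target)
    coeffs = [0] * len(basis)
    # Process basis vectors from last to first, rounding onto the nearest
    # translate of the hyperplane spanned by the preceding vectors.
    for i in reversed(range(len(basis))):
        c = round(dot(residual, ortho[i]) / dot(ortho[i], ortho[i]))
        coeffs[i] = c
        residual = [r - c * b for r, b in zip(residual, basis[i])]
    point = [sum(c * b[k] for c, b in zip(coeffs, basis))
             for k in range(len(target))]
    return coeffs, point
```

    For example, nearest_plane([(2, 0), (1, 2)], (3.1, 1.9)) rounds the target to
    a nearby point of the lattice generated by (2, 0) and (1, 2). The quality of
    the approximation depends on the basis, which is why the abstract stresses
    basis selection.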

    Steady-state performance analysis of the recursive maximum correntropy algorithm and its application in adaptive beamforming with alpha-stable noise

    Lu Lu, Haiquan Zhao
    Subjects: Information Theory (cs.IT)

    As a well-established adaptation criterion, the maximum correntropy criterion
    (MCC) has received increased attention due to its robustness against outliers.
    In this paper, a new complex recursive maximum correntropy (CRMC) algorithm
    that requires no a priori information on the noise characteristics is proposed under
    the MCC. We first study the steady-state excess mean-square-error (EMSE)
    behavior of the CRMC algorithm by using the energy conservation relation and some
    reasonable approximations. Then, the proposed algorithm is applied to the
    adaptive beamforming problem, where the desired signal is contaminated by
    impulsive noise. The results obtained from a simulation study establish the
    effectiveness of this new beamformer.

    Integer-Forcing Message Recovering in Interference Channels

    Seyed Mohammad Azimi-Abarghouyi, Mohsen Hejazi, Behrooz Makki, Masoumeh Nasiri-Kenari, Tommy Svensson
    Comments: Submitted for possible journal publication
    Subjects: Information Theory (cs.IT)

    In this paper, we propose a scheme referred to as integer-forcing message
    recovering (IFMR) to enable receivers to recover their desirable messages in
    interference channels. Compared to the state-of-the-art integer-forcing linear
    receiver (IFLR), our proposed IFMR approach needs to decode considerably fewer
    messages. In our method, each receiver recovers independent linear
    integer combinations of the desirable messages each from two independent
    equations. We propose an efficient algorithm to sequentially find the equations
    and integer combinations with maximum rates. We evaluate the performance of our
    scheme and compare the results with the minimum mean-square error (MMSE) and
    zero-forcing (ZF), as well as the IFLR schemes. The results indicate that our
    IFMR scheme outperforms the MMSE and ZF schemes, in terms of achievable rate,
    considerably. Also, compared to IFLR, the IFMR scheme achieves slightly lower
    rates in moderate signal-to-noise ratios, with significantly lower
    implementation complexity.

    Channel Resolvability Theorems for General Sources and Channels

    Hideki Yagi
    Comments: Extended version for the paper submitted to 2017 IEEE International Symposium on Information Theory (ISIT2017)
    Subjects: Information Theory (cs.IT)

    In the problem of channel resolvability, where a given output probability
    distribution via a channel is approximated by transforming the uniform random
    numbers, characterizing the asymptotically minimum rate of the size of the
    random numbers, called the channel resolvability, has been open. This paper
    derives formulas for the channel resolvability for a given general source and
    channel pair. We also investigate the channel resolvability in an optimistic
    sense. It is demonstrated that the derived general formulas recapture a
    single-letter formula for the stationary memoryless source and channel. When
    the channel is the identity mapping, the established formulas reduce to an
    alternative form of the spectral sup-entropy rates, which play a key role in
    information spectrum methods. The analysis is also extended to the second-order
    channel resolvability.

    Scheduling Status Updates to Minimize Age of Information with an Energy Harvesting Sensor

    Baran Tan Bacinoglu, Elif Uysal-Biyikoglu
    Comments: A version of this paper has been submitted to ISIT 2017
    Subjects: Information Theory (cs.IT)

    Age of Information is a measure of the freshness of status updates in
    monitoring applications and update-based systems. We study a real-time remote
    sensing scenario with a sensor which is restricted by time-varying energy
    constraints and battery limitations. The sensor sends updates over a packet
    erasure channel with no feedback. The problem of finding an age-optimal
    threshold policy, with the transmission threshold being a function of the
    energy state and the estimated current age, is formulated. The average age is
    analyzed for the unit battery scenario under a memoryless energy arrival
    process. Somewhat surprisingly, for any finite arrival rate of energy, there is
    a positive age threshold for transmission, which corresponds to transmitting
    at a rate lower than that dictated by the rate of energy arrivals. A lower
    bound on the average age is obtained for general battery size.

    Construction of Fixed Rate Non-Binary WOM Codes based on Integer Programming

    Yoju Fujino, Tadashi Wadayama
    Subjects: Information Theory (cs.IT)

    In this paper, we propose a construction of non-binary WOM
    (Write-Once-Memory) codes for WOM storages such as flash memories. The WOM
    codes discussed in this paper are fixed rate WOM codes where messages in a
    fixed alphabet of size \(M\) can be sequentially written in the WOM storage at
    least \(t^*\)-times. In this paper, a WOM storage is modeled by a state
    transition graph. The proposed construction has the following two features.
    First, it includes a systematic method to determine the encoding regions in the
    state transition graph. Second, the proposed construction includes a labeling
    method for states by using integer programming. Several novel WOM codes for \(q\)
    level flash memories with 2 cells are constructed by the proposed construction.
    They achieve the worst numbers of writes \(t^*\) that meet the known upper bound
    in many cases. In addition, we constructed fixed rate non-binary WOM codes with
    the capability to reduce ICI (inter cell interference) of flash cells. One of
    the advantages of the proposed construction is its flexibility. It can be
    applied to various storage devices, to various dimensions (i.e., number of
    cells), and to various kinds of additional constraints.

    On Cooperation and Interference in the Weak Interference Regime (Full Version with Detailed Proofs)

    Daniel Zahavi, Ron Dabora
    Subjects: Information Theory (cs.IT)

    Handling interference is one of the main challenges in the design of wireless
    networks. In this paper we study the application of cooperation for
    interference management in the weak interference (WI) regime, focusing on the
    Z-interference channel with a causal relay (Z-ICR), when the channel
    coefficients are subject to ergodic phase fading, all transmission powers are
    finite, and the relay is full-duplex. In order to provide a comprehensive
    understanding of the benefits of cooperation in the WI regime, we characterize,
    for the first time, two major performance measures for the ergodic phase fading
    Z-ICR in the WI regime: The sum-rate capacity and the maximal generalized
    degrees-of-freedom (GDoF). In the capacity analysis, we obtain conditions on
    the channel coefficients, subject to which the sum-rate capacity of the ergodic
    phase fading Z-ICR is achieved by treating interference as noise at each
    receiver, and explicitly state the corresponding sum-rate capacity. In the GDoF
    analysis, we derive conditions on the exponents of the magnitudes of the
    channel coefficients, under which treating interference as noise achieves the
    maximal GDoF, which is explicitly characterized as well. It is shown that under
    certain conditions on the channel coefficients, relaying strictly
    increases both the sum-rate capacity and the maximal GDoF of the ergodic phase
    fading Z-interference channel in the WI regime. Our results demonstrate
    for the first time the gains from relaying in the presence of interference,
    when interference is weak and the relay power is finite, both in
    increasing the sum-rate capacity and in increasing the maximal GDoF, compared
    to the channel without a relay.

    Multilevel Code Construction for Compound Fading Channels

    Antonio Campello, Ling Liu, Cong Ling
    Comments: 5 pages, 3 figures
    Subjects: Information Theory (cs.IT)

    We consider explicit constructions of multi-level lattice codes that
    universally approach the capacity of the compound block-fading channel.
    Specifically, building on algebraic partitions of lattices, we show how to
    construct codes with negligible probability of error for any channel
    realization and normalized log-density approaching the Poltyrev limit. Capacity
    analyses and numerical results on the achievable rates for each partition level
    are provided. The proposed codes have several desirable properties such as
    constructiveness and good decoding complexity, as compared to random one-level
    codes. Numerical results for finite-dimensional multi-level lattices based on
    polar codes are exhibited.

    On the Fronthaul Statistical Multiplexing Gain

    Liumeng Wang, Sheng Zhou
    Comments: to appear in IEEE Communications Letters
    Subjects: Information Theory (cs.IT)

    Breaking the fronthaul capacity limitations is vital to make cloud radio
    access network (C-RAN) scalable and practical. One promising way is aggregating
    several remote radio units (RRUs) as a cluster to share a fronthaul link, so as
    to enjoy the statistical multiplexing gain brought by the spatial randomness of
    the traffic. In this letter, a tractable model is proposed to analyze the
    fronthaul statistical multiplexing gain. We first derive the user blocking
    probability caused by the limited fronthaul capacity, including its upper and
    lower bounds. We then obtain the limits of fronthaul statistical multiplexing
    gain when the cluster size approaches infinity. Analytical results reveal that
    the user blocking probability decreases exponentially with the average
    fronthaul capacity per RRU, and the exponent is proportional to the cluster
    size. Numerical results further show considerable fronthaul statistical
    multiplexing gain even at a small to medium cluster size.

    Entropic Causality and Greedy Minimum Entropy Coupling

    Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath, Babak Hassibi
    Comments: Submitted to ISIT 2017
    Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

    We study the problem of identifying the causal relationship between two
    discrete random variables from observational data. We recently proposed a novel
    framework called entropic causality that works in a very general functional
    model but makes the assumption that the unobserved exogenous variable has small
    entropy in the true causal direction.

    This framework requires the solution of a minimum entropy coupling problem:
    Given marginal distributions of m discrete random variables, each on n states,
    find the joint distribution with minimum entropy, that respects the given
    marginals. This corresponds to minimizing a concave function of nm variables
    over a convex polytope defined by nm linear constraints, called a
    transportation polytope. Unfortunately, it was recently shown that this minimum
    entropy coupling problem is NP-hard, even for 2 variables with n states. Even
    representing points (joint distributions) over this space can require
    exponential complexity (in n, m) if done naively.

    In our recent work we introduced an efficient greedy algorithm to find an
    approximate solution for this problem. In this paper we analyze this algorithm
    and establish two results: that our algorithm always finds a local minimum and
    also is within an additive approximation error from the unknown global optimum.
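    The greedy idea described above can be illustrated with a short sketch:
    repeatedly take the largest remaining mass in each marginal, assign the
    smallest of those masses to the corresponding joint state, and subtract it
    from every marginal. This is a simplified reading of the greedy procedure
    rather than the authors' reference implementation; the function names and
    the tolerance parameter are assumptions.

```python
import math


def greedy_coupling(marginals, tol=1e-12):
    """Greedy approximation to a minimum entropy coupling.

    `marginals` is a list of m probability vectors. Returns a dict that maps
    index tuples (one index per marginal) to joint probability mass.
    """
    rem = [list(p) for p in marginals]  # remaining mass in each marginal
    joint = {}
    while True:
        # Index of the largest remaining entry in every marginal.
        idx = tuple(max(range(len(p)), key=p.__getitem__) for p in rem)
        mass = min(p[i] for p, i in zip(rem, idx))
        if mass <= tol:  # at least one marginal is exhausted
            break
        joint[idx] = joint.get(idx, 0.0) + mass
        for p, i in zip(rem, idx):
            p[i] -= mass
    return joint


def entropy_bits(joint):
    """Shannon entropy (in bits) of a joint distribution given as a dict."""
    return -sum(v * math.log2(v) for v in joint.values() if v > 0)
```

    Each iteration drives at least one remaining entry to zero, so the loop
    stops after at most as many steps as there are marginal entries in total,
    and the returned joint distribution has correspondingly small support.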

    Sampling Without Time: Recovering Echoes of Light via Temporal Phase Retrieval

    Ayush Bhandari, Aurelien Bourquard, Ramesh Raskar
    Comments: 12 pages, 4 figures, to appear at the 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
    Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV)

    This paper considers the problem of sampling and reconstruction of a
    continuous-time sparse signal without assuming the knowledge of the sampling
    instants or the sampling rate. This topic has its roots in the problem of
    recovering multiple echoes of light from its low-pass filtered and
    auto-correlated, time-domain measurements. Our work is closely related to the
    topic of sparse phase retrieval and in this context, we discuss the advantage
    of phase-free measurements. While this problem is ill-posed, cues based on
    physical constraints allow for its appropriate regularization. We validate our
    theory with experiments based on customized, optical time-of-flight imaging
    sensors. What singles out our approach is that our sensing method allows for
    temporal phase retrieval as opposed to the usual case of spatial phase
    retrieval. Preliminary experiments and results demonstrate a compelling
    capability of our phase-retrieval based imaging device.

    Rotated Eigenstructure Analysis for Source Localization without Energy-decay Models

    Junting Chen, Urbashi Mitra
    Subjects: Information Theory (cs.IT)

    Herein, the problem of simultaneous localization of two sources given a
    modest number of samples is examined. In particular, the strategy does not
    require knowledge of the target signatures of the sources a priori, nor does it
    exploit classical methods based on a particular decay rate of the energy
    emitted from the sources as a function of range. General structural properties
    of the signatures such as unimodality are exploited. The algorithm localizes
    targets based on the rotated eigenstructure of a reconstructed observation
    matrix. In particular, the optimal rotation can be found by maximizing the
    ratio of the dominant singular value of the observation matrix over the nuclear
    norm of the optimally rotated observation matrix. It is shown that this ratio
    has a unique local maximum leading to computationally efficient search
    algorithms. Moreover, analytical results are developed to show that the squared
    localization error decreases at a rate faster than the baseline scheme.

    An asymptotic equipartition property for measures on model spaces

    Tim Austin
    Comments: 30 pages
    Subjects: Dynamical Systems (math.DS); Information Theory (cs.IT); Probability (math.PR)

    Let \(G\) be a sofic group, and let \(\Sigma = (\sigma_n)_{n \geq 1}\) be a sofic
    approximation to it. For a probability-preserving \(G\)-system, a variant of the
    sofic entropy relative to \(\Sigma\) has recently been defined in terms of
    sequences of measures on its model spaces that ‘converge’ to the system in a
    certain sense. Here we prove that, in order to study this notion, one may
    restrict attention to those sequences that have the asymptotic equipartition
    property. This may be seen as a relative in the sofic setting of the
    Shannon–McMillan theorem.

    We also give some first applications of this result, including a new formula
    for the sofic entropy of a \((G \times H)\)-system obtained by co-induction from a
    \(G\)-system, where \(H\) is any other infinite sofic group.

    Pure Rough Mereology and Counting

    A. Mani
    Comments: IEEE Women in Engineering Conference, WIECON-ECE’2017 (Accepted for IEEEXplore)
    Subjects: Artificial Intelligence (cs.AI); Information Theory (cs.IT); Logic in Computer Science (cs.LO); Logic (math.LO)

    The study of mereology (parts and wholes) in the context of formal approaches
    to vagueness can be approached in a number of ways. In the context of rough
    sets, mereological concepts with a set-theoretic or valuation based ontology
    acquire complex and diverse behavior. In this research a general rough set
    framework called granular operator spaces is extended and the nature of
    parthood in it is explored from a minimally intrusive point of view. This is
    used to develop counting strategies that help in classifying the framework. The
    developed methodologies would be useful for drawing involved conclusions about
    the nature of data (and validity of assumptions about it) from antichains
    derived from context. The problem addressed is also about whether counting
    procedures help in confirming that the approximations involved in formation of
    data are indeed rough approximations.

    Low Rank Magnetic Resonance Fingerprinting

    Gal Mazor, Lior Weizman, Assaf Tal, Yonina C. Eldar
    Comments: 11 pages, 11 figures
    Subjects: Medical Physics (physics.med-ph); Information Theory (cs.IT)

    Magnetic Resonance Fingerprinting (MRF) is a relatively new approach that
    provides quantitative MRI measures using randomized acquisition. Extraction of
    physical quantitative tissue parameters is performed off-line, based on
    acquisition with varying parameters and a dictionary generated according to the
    Bloch equations. MRF uses hundreds of radio frequency (RF) excitation pulses
    for acquisition, and therefore a high under-sampling ratio in the sampling domain
    (k-space) is required for reasonable scanning time. This under-sampling causes
    spatial artifacts that hamper the ability to accurately estimate the tissue’s
    quantitative values. In this work, we introduce a new approach for quantitative
    MRI using MRF, called magnetic resonance Fingerprinting with LOw Rank (FLOR).
    We exploit the low rank property of the concatenated temporal imaging
    contrasts, on top of the fact that the MRF signal is sparsely represented in
    the generated dictionary domain. We present an iterative scheme that consists
    of a gradient step followed by a low rank projection using the singular value
    decomposition. Experiments on real MRI data, acquired using a spirally-sampled
    MRF FISP sequence, demonstrate improved resolution compared to other
    compressed-sensing based methods for MRF at 5% sampling ratio.



