
    arXiv Paper Daily: Fri, 5 May 2017

    我爱机器学习 (52ml.net), published 2017-05-05 00:00:00

    Neural and Evolutionary Computing

    Evolutionary learning of fire fighting strategies

    Martin Kretschmer, Elmar Langetepe
    Subjects: Neural and Evolutionary Computing (cs.NE)

    The dynamic problem of enclosing an expanding fire can be modelled by a
    discrete variant in a grid graph. While the fire expands to all neighbouring
    cells in any time step, the fire fighter is allowed to block (c) cells on
    average outside the fire in the same time interval. It was shown that the
    success of the fire fighter is guaranteed for (c>1.5), but no strategy can
    enclose the fire for (c \leq 1.5). For achieving such a critical threshold, the
    correctness (sometimes even optimality) of strategies and lower bounds have
    been shown by integer programming or by direct but often very sophisticated
    arguments. We investigate whether it is possible to find or to approach such
    a threshold and/or optimal strategies by means of evolutionary algorithms,
    i.e., we simply try to learn successful strategies for different constants
    (c) and examine the outcome. The general idea is that this approach might
    give some insight into the power of evolutionary strategies for similar
    geometrically motivated threshold questions. We investigate the variant of
    protecting a highway, whose threshold is still unknown, and find interesting
    strategic paradigms.

    Keywords: Dynamic environments, fire fighting, evolutionary strategies,
    threshold approximation
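
    As a rough illustration of the search loop the abstract alludes to, the
    following Python sketch runs a (1+1) evolution strategy over a real-valued
    strategy encoding. The fitness function is a hypothetical placeholder
    standing in for a fire-spread simulation on a grid; the encoding, mutation
    scale and budget are assumptions rather than the authors' setup.

    # Minimal (1+1) evolution strategy loop; the fitness function below is a
    # hypothetical stand-in for a fire-spread simulation that would score how
    # well a strategy (blocking c cells per step) contains the fire.
    import random

    def evaluate(strategy):
        # Placeholder fitness: in the paper's setting this would simulate the
        # fire on a grid and return, e.g., the number of burnt cells
        # (lower is better).
        return sum((gene - 0.5) ** 2 for gene in strategy)

    def mutate(strategy, sigma=0.1):
        return [gene + random.gauss(0.0, sigma) for gene in strategy]

    def one_plus_one_es(dim=20, generations=1000, seed=0):
        random.seed(seed)
        parent = [random.random() for _ in range(dim)]
        parent_fitness = evaluate(parent)
        for _ in range(generations):
            child = mutate(parent)
            child_fitness = evaluate(child)
            if child_fitness <= parent_fitness:  # keep the better strategy
                parent, parent_fitness = child, child_fitness
        return parent, parent_fitness

    if __name__ == "__main__":
        best, fitness = one_plus_one_es()
        print("best fitness found:", fitness)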

    Pixel Normalization from Numeric Data as Input to Neural Networks

    Parth Sane, Ravindra Agrawal
    Comments: IEEE WiSPNET 2017 conference in Chennai
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Text-to-image transformation for input to neural networks requires
    intermediate steps. This paper presents a new approach to pixel
    normalization that converts textual data into images suitable as input for
    neural networks. The method can be further improved by a Graphics Processing
    Unit (GPU) implementation to provide a significant speedup in computation time.
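
    The abstract does not spell out the normalization itself, so the following
    Python sketch shows one plausible interpretation: per-row min-max scaling of
    numeric feature vectors to 8-bit pixel intensities, zero-padded and reshaped
    into square grayscale images. The image side length and padding scheme are
    illustrative assumptions.

    # One plausible pixel normalization of numeric rows into 8-bit grayscale
    # "images"; the exact mapping used in the paper is not described in the
    # abstract, so this is only an illustrative assumption.
    import numpy as np

    def rows_to_images(X, side=8):
        X = np.asarray(X, dtype=np.float64)
        lo = X.min(axis=1, keepdims=True)
        hi = X.max(axis=1, keepdims=True)
        scaled = (X - lo) / np.maximum(hi - lo, 1e-12)    # per-row min-max to [0, 1]
        pixels = np.round(scaled * 255).astype(np.uint8)  # quantize to [0, 255]
        n, d = pixels.shape
        pad = (-d) % (side * side)                        # zero-pad up to a square image
        pixels = np.pad(pixels, ((0, 0), (0, pad)))
        return pixels.reshape(n, side, side)

    if __name__ == "__main__":
        images = rows_to_images(np.random.rand(4, 60), side=8)
        print(images.shape, images.dtype)  # (4, 8, 8) uint8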


    Computer Vision and Pattern Recognition

    Recurrent Soft Attention Model for Common Object Recognition

    Liliang Ren, Tong Xiao, Xiaogang Wang
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose the Recurrent Soft Attention Model, which integrates visual
    attention from the original image into an LSTM memory cell through a
    down-sample network. The model recurrently transmits visual attention to the
    memory cells for glimpse mask generation, which is a more natural way to
    integrate and exploit attention in general object detection and recognition
    problems. We test our model under the metric of top-1 accuracy on the
    CIFAR-10 dataset. The experiment shows that our down-sample network and
    feedback mechanism play an effective role within the whole network structure.

    Auto-painter: Cartoon Image Generation from Sketch by Using Conditional Generative Adversarial Networks

    Yifan Liu, Zengchang Qin, Zhenbo Luo, Hua Wang
    Comments: 12 pages, 7 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Recently, realistic image generation using deep neural networks has become a
    hot topic in machine learning and computer vision. Images can be generated at
    the pixel level by learning from a large collection of images. Learning to
    generate colorful cartoon images from black-and-white sketches is not only an
    interesting research problem, but also a potential application in digital
    entertainment. In this paper, we investigate the sketch-to-image synthesis
    problem by using conditional generative adversarial networks (cGAN). We propose
    the auto-painter model, which can automatically generate compatible colors for
    a sketch. The new model is not only capable of painting hand-drawn sketches
    with proper colors, but also allows users to indicate preferred colors.
    Experimental results on two sketch datasets show that the auto-painter performs
    better than existing image-to-image methods.

    Edge-based Component-Trees for Multi-Channel Image Segmentation

    Tobias Böttger, Dominik Gutermuth
    Comments: 11 pages, 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We introduce the concept of edge-based component-trees for images with an
    arbitrary number of channels. The approach is a natural extension of the
    classical component-tree devoted to gray-scale images. The similar structure
    enables the translation of many gray-level image processing techniques based on
    the component-tree to hyperspectral and color images. As an example
    application, we present an image segmentation approach that extracts Maximally
    Stable Homogeneous Regions (MSHR). The approach is very similar to MSER but
    can be applied to images with an arbitrary number of channels. As opposed to
    MSER, our approach implicitly segments regions that are both lighter and darker
    than their background for gray-scale images and can be used in OCR applications
    where MSER will fail. We introduce a local flooding-based immersion for the
    edge-based component-tree construction which is linear in the number of pixels.
    In the experiments, we show that the runtime scales favorably with an
    increasing number of channels and may improve algorithms which build on MSER.

    Action Tubelet Detector for Spatio-Temporal Action Localization

    Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid
    Comments: 9 pages, 8 figures
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Current state-of-the-art approaches for spatio-temporal action detection rely
    on detections at the frame level that are then linked or tracked across time.
    In this paper, we leverage the temporal continuity of videos instead of
    operating at the frame level. We propose the ACtion Tubelet detector
    (ACT-detector) that takes as input a sequence of frames and outputs tubelets,
    i.e., sequences of bounding boxes with associated scores. In the same way that
    state-of-the-art object detectors rely on anchor boxes, our ACT-detector is
    based on anchor cuboids. We build upon the state-of-the-art SSD framework
    (Single Shot MultiBox Detector). Convolutional features are extracted for each
    frame, while scores and regressions are based on the temporal stacking of these
    features, thus exploiting information from a sequence. Our experimental results
    show that leveraging sequences of frames significantly improves detection
    performance over using individual frames. The gain of our tubelet detector can
    be explained by both more relevant scores and more precise localization. Our
    ACT-detector outperforms state-of-the-art methods for frame-mAP and
    video-mAP on the J-HMDB and UCF-101 datasets, in particular at high overlap
    thresholds.

    A Deep Learning Perspective on the Origin of Facial Expressions

    Ran Breuer, Ron Kimmel
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Facial expressions play a significant role in human communication and
    behavior. Psychologists have long studied the relationship between facial
    expressions and emotions. Paul Ekman et al. devised the Facial Action Coding
    System (FACS) to taxonomize human facial expressions and model their behavior.
    The ability to recognize facial expressions automatically enables novel
    applications in fields like human-computer interaction, social gaming, and
    psychological research. There has been tremendously active research in this
    field, with several recent papers utilizing convolutional neural networks (CNN)
    for feature extraction and inference. In this paper, we employ CNN
    understanding methods to study the relation between the features these
    computational networks are using, the FACS and Action Units (AU). We verify our
    findings on the Extended Cohn-Kanade (CK+), NovaEmotions and FER2013 datasets.
    We apply these models to various tasks and tests using transfer learning,
    including cross-dataset validation and cross-task performance. Finally, we
    exploit the nature of the FER-based CNN models for the detection of
    micro-expressions and achieve state-of-the-art accuracy using a simple
    long short-term memory (LSTM) recurrent neural network (RNN).

    Pixel Normalization from Numeric Data as Input to Neural Networks

    Parth Sane, Ravindra Agrawal
    Comments: IEEE WiSPNET 2017 conference in Chennai
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

    Text-to-image transformation for input to neural networks requires
    intermediate steps. This paper presents a new approach to pixel
    normalization that converts textual data into images suitable as input for
    neural networks. The method can be further improved by a Graphics Processing
    Unit (GPU) implementation to provide a significant speedup in computation time.

    From Zero-shot Learning to Conventional Supervised Classification: Unseen Visual Data Synthesis

    Yang Long, Li Liu, Ling Shao, Fumin Shen, Guiguang Ding, Jungong Han
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Robust object recognition systems usually rely on powerful feature extraction
    mechanisms from a large number of real images. However, in many realistic
    applications, collecting sufficient images for ever-growing new classes is
    unattainable. In this paper, we propose a new Zero-shot learning (ZSL)
    framework that can synthesise visual features for unseen classes without
    acquiring real images. Using the proposed Unseen Visual Data Synthesis (UVDS)
    algorithm, semantic attributes are effectively utilised as an intermediate clue
    to synthesise unseen visual features at the training stage. Hereafter, ZSL
    recognition is converted into the conventional supervised problem, i.e. the
    synthesised visual features can be straightforwardly fed to typical classifiers
    such as SVM. On four benchmark datasets, we demonstrate the benefit of using
    synthesised unseen data. Extensive experimental results suggest that our
    proposed approach significantly improves the state-of-the-art results.

    Am I Done? Predicting Action Progress in Videos

    Federico Becattini, Tiberio Uricchio, Lamberto Ballan, Lorenzo Seidenari, Alberto Del Bimbo
    Comments: Submitted to BMVC 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    In this paper we introduce the problem of predicting action progress in
    untrimmed videos. We argue that this is an extremely important task because, on
    the one hand, it can be valuable for a wide range of applications and, on the
    other hand, it facilitates better action detection results. To solve this
    problem we introduce a novel approach, named ProgressNet, capable of predicting
    when an action takes place in a video, where it is located within the frames,
    and how far it has progressed during its execution. Motivated by the recent
    success obtained from the interaction of Convolutional and Recurrent Neural
    Networks, our model is based on a combination of the well known Faster R-CNN
    framework, to make framewise predictions, and LSTM networks, to estimate action
    progress through time. After introducing two evaluation protocols for the task
    at hand, we demonstrate the capability of our model to effectively predict
    action progress on a subset of 11 classes from UCF-101, all of which exhibit
    strong temporal structure. Moreover, we show that this leads to
    state-of-the-art spatio-temporal localization results.

    Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Video

    Hou-Ning Hu, Yen-Chen Lin, Ming-Yu Liu, Hsien-Tzu Cheng, Yung-Ju Chang, Min Sun
    Comments: 13 pages, 8 figures, To appear in CVPR 2017 as an Oral paper. The first two authors contributed equally to this work. this https URL
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Multimedia (cs.MM)

    Watching a 360° sports video requires a viewer to continuously select a
    viewing angle, either through a sequence of mouse clicks or head movements. To
    relieve the viewer from this “360 piloting” task, we propose “deep 360 pilot”
    — a deep learning-based agent for piloting through 360° sports videos
    automatically. At each frame, the agent observes a panoramic image and has the
    knowledge of previously selected viewing angles. The task of the agent is to
    shift the current viewing angle (i.e. action) to the next preferred one (i.e.,
    goal). We propose to directly learn an online policy of the agent from data. We
    use the policy gradient technique to jointly train our pipeline: by minimizing
    (1) a regression loss measuring the distance between the selected and ground
    truth viewing angles, (2) a smoothness loss encouraging smooth transition in
    viewing angle, and (3) maximizing an expected reward of focusing on a
    foreground object. To evaluate our method, we build a new 360-Sports video
    dataset consisting of five sports domains. We train domain-specific agents and
    achieve the best performance on viewing angle selection accuracy and transition
    smoothness compared to [51] and other baselines.

    Attributes2Classname: A discriminative model for attribute-based unsupervised zero-shot learning

    Berkan Demirel, Ramazan Gokberk Cinbis, Nazli Ikizler Cinbis
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    We propose a novel approach for unsupervised zero-shot learning (ZSL) of
    classes based on their names. Most existing unsupervised ZSL methods aim to
    learn a model for directly comparing image features and class names. However,
    this proves to be a difficult task due to the dominance of non-visual semantics
    in the underlying vector-space embeddings of class names. To address this
    issue, we discriminatively learn a word representation such that the
    similarities between class names and combinations of attribute names fall in
    line with the visual similarity. Contrary to traditional zero-shot learning approaches that are
    built upon attribute presence, our approach avoids the laborious
    attribute-class relation annotations for unseen classes. In addition, our
    proposed approach renders text-only training possible, hence, the training can
    be augmented without the need to collect additional image data. The
    experimental results show that our method yields state-of-the-art results for
    unsupervised ZSL on three benchmark datasets.

    Generative Convolutional Networks for Latent Fingerprint Reconstruction

    Jan Svoboda, Federico Monti, Michael M. Bronstein
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Performance of fingerprint recognition depends heavily on the extraction of
    minutiae points. Enhancement of the fingerprint ridge pattern is thus an
    essential pre-processing step that noticeably reduces false positive and
    negative detection rates. A particularly challenging setting is when the
    fingerprint images are corrupted or partially missing. In this work, we apply
    generative convolutional networks to denoise visible minutiae and predict the
    missing parts of the ridge pattern. The proposed enhancement approach is tested
    as a pre-processing step in combination with several standard feature
    extraction methods such as MINDTCT, followed by biometric comparison using MCC
    and BOZORTH3. We evaluate our method on several publicly available latent
    fingerprint datasets captured using different sensors.

    VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera

    Dushyant Mehta, Srinath Sridhar, Oleksandr Sotnychenko, Helge Rhodin, Mohammad Shafiei, Hans-Peter Seidel, Weipeng Xu, Dan Casas, Christian Theobalt
    Comments: Accepted to SIGGRAPH 2017
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

    We present the first real-time method to capture the full global 3D skeletal
    pose of a human in a stable, temporally consistent manner using a single RGB
    camera. Our method combines a new convolutional neural network (CNN) based pose
    regressor with kinematic skeleton fitting. Our novel fully-convolutional pose
    formulation regresses 2D and 3D joint positions jointly in real time and does
    not require tightly cropped input frames. A real-time kinematic skeleton
    fitting method uses the CNN output to yield temporally stable 3D global pose
    reconstructions on the basis of a coherent kinematic skeleton. This makes our
    approach the first monocular RGB method usable in real-time applications such
    as 3D character control—thus far, the only monocular methods for such
    applications employed specialized RGB-D cameras. Our method’s accuracy is
    quantitatively on par with the best offline 3D monocular RGB pose estimation
    methods. Our results are qualitatively comparable to, and sometimes better
    than, results from monocular RGB-D approaches, such as the Kinect. However, we
    show that our approach is more broadly applicable than RGB-D solutions, i.e. it
    works for outdoor scenes, community videos, and low quality commodity RGB
    cameras.

    Toward Open Set Face Recognition

    Manuel Günther, Steve Cruz, Ethan M. Rudd, Terrance E. Boult
    Subjects: Computer Vision and Pattern Recognition (cs.CV)

    Much research has been conducted on both face identification and face
    verification problems, with greater focus on the latter. Research on face
    identification has mostly focused on using closed-set protocols, which assume
    that all probe images used in evaluation contain identities of subjects that
    are enrolled in the gallery. Real systems, however, where only a fraction of
    probe sample identities are enrolled in the gallery, cannot make this
    closed-set assumption. Instead, they must assume an open set of probe samples
    and be able to reject/ignore those that correspond to unknown identities. In
    this paper, we address the widespread misconception that thresholding
    verification-like scores is sufficient to solve the open-set face
    identification problem, by formulating an open-set face identification protocol
    and evaluating different strategies for assessing similarity. Our open-set
    identification protocol is based on the canonical labeled faces in the wild
    (LFW) dataset. We compare three algorithms for assessing similarity in a deep
    feature space under an open-set protocol: thresholded verification-like scores,
    linear discriminant analysis (LDA) scores, and extreme value machine (EVM)
    output probabilities. Our findings suggest that thresholding similarity
    measures that are open-set by design outperforms verification-like score level
    thresholding.

    Fast k-means based on KNN Graph

    Cheng-Hao Deng, Wan-Lei Zhao
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    In the era of big data, k-means clustering has been widely adopted as a basic
    processing tool in various contexts. However, its computational cost can be
    prohibitively high when the data size and the cluster number are large. It is
    well known that the processing bottleneck of k-means lies in the operation of
    seeking the closest centroid in each iteration. In this paper, a novel solution
    to the scalability issue of k-means is presented. In the proposal, k-means is
    supported by an approximate k-nearest neighbors graph. In each k-means
    iteration, a data sample is only compared to the clusters in which its nearest
    neighbors reside. Since the number of nearest neighbors considered is much
    smaller than k, the processing cost of this step becomes minor and independent
    of k. The processing bottleneck is therefore overcome. Most interestingly, the
    k-nearest neighbor graph is constructed by iteratively calling the fast
    (k)-means itself. Compared with existing fast k-means variants, the proposed
    algorithm achieves speed-ups of hundreds to thousands of times while
    maintaining high clustering quality. When tested on 10 million 512-dimensional
    data points, it takes only 5.2 hours to produce 1 million clusters; to perform
    clustering at the same scale, traditional k-means would take 3 years.
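
    A minimal Python sketch of the assignment idea described above: each point
    is compared only to the centroids of the clusters in which its nearest
    neighbours currently reside. For clarity the graph here is an exact
    brute-force KNN graph; the paper instead builds an approximate graph by
    calling the fast k-means recursively, which is not reproduced.

    # Sketch of a k-means assignment step restricted to a KNN graph: each point
    # is compared only to the centroids of the clusters in which its nearest
    # neighbours currently reside, instead of all k centroids.
    import numpy as np

    def knn_graph(X, n_neighbors=10):
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        np.fill_diagonal(d2, np.inf)
        return np.argsort(d2, axis=1)[:, :n_neighbors]

    def graph_kmeans(X, k=20, n_neighbors=10, iters=10, seed=0):
        rng = np.random.default_rng(seed)
        n = len(X)
        labels = rng.integers(0, k, size=n)
        neighbors = knn_graph(X, n_neighbors)
        for _ in range(iters):
            centroids = np.stack([X[labels == c].mean(0) if np.any(labels == c)
                                  else X[rng.integers(n)] for c in range(k)])
            for i in range(n):
                # candidate clusters = clusters of my neighbours, plus my own
                candidates = np.unique(np.append(labels[neighbors[i]], labels[i]))
                d = ((X[i] - centroids[candidates]) ** 2).sum(-1)
                labels[i] = candidates[np.argmin(d)]
        return labels, centroids

    if __name__ == "__main__":
        X = np.random.default_rng(1).normal(size=(500, 16))
        labels, _ = graph_kmeans(X)
        print("clusters used:", len(np.unique(labels)))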


    Artificial Intelligence

    Semi-supervised model-based clustering with controlled clusters leakage

    Marek Śmieja, Łukasz Struski, Jacek Tabor
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we focus on finding clusters in partially categorized data
    sets. We propose a semi-supervised version of the Gaussian mixture model,
    called C3L, which retrieves natural subgroups of given categories. In contrast
    to other semi-supervised models, C3L is parametrized by a user-defined leakage
    level, which controls the maximal inconsistency between the initial
    categorization and the resulting clustering. Our method can be implemented as a
    module in practical expert systems to detect clusters that combine expert
    knowledge with the true distribution of the data. Moreover, it can be used to
    improve the results of less flexible clustering techniques, such as projection
    pursuit clustering. The paper presents an extensive theoretical analysis of the
    model and a fast algorithm for its efficient optimization. Experimental results
    show that C3L finds a high-quality clustering model, which can be applied to
    discover meaningful groups in partially classified data.

    A Reasoning System for a First-Order Logic of Limited Belief

    Christoph Schwering
    Comments: 22 pages, 0 figures, Twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17)
    Subjects: Artificial Intelligence (cs.AI)

    Logics of limited belief aim at enabling computationally feasible reasoning
    in highly expressive representation languages. These languages are often
    dialects of first-order logic with a weaker form of logical entailment that
    keeps reasoning decidable or even tractable. While a number of such logics have
    been proposed in the past, they have tended to remain objects of theoretical
    analysis only, and their practical relevance is very limited. In this paper, we
    aim to go
    beyond the theory. Building on earlier work by Liu, Lakemeyer, and Levesque, we
    develop a logic of limited belief that is highly expressive while remaining
    decidable in the first-order and tractable in the propositional case and
    exhibits some characteristics that make it attractive for an implementation. We
    introduce a reasoning system that employs this logic as representation language
    and present experimental results that showcase the benefit of limited belief.

    Tramp Ship Scheduling Problem with Berth Allocation Considerations and Time-dependent Constraints

    Francisco López-Ramos, Armando Guarnaschelli, José-Fernando Camacho-Vallejo, Laura Hervert-Escobar, Rosa G. González-Ramírez
    Comments: 16 pages, 3 figures, 5 tables, proceedings paper of Mexican International Conference on Artificial Intelligence (MICAI) 2016
    Subjects: Artificial Intelligence (cs.AI)

    This work presents a model for the Tramp Ship Scheduling problem including
    berth allocation considerations, motivated by a real case of a shipping
    company. The aim is to determine the travel schedule for each vessel
    considering multiple docking and multiple time windows at the berths. This work
    is innovative due to the consideration of both spatial and temporal attributes
    during the scheduling process. The resulting model is formulated as a
    mixed-integer linear programming problem, and a heuristic method to deal with
    multiple vessel schedules is also presented. Numerical experimentation is
    performed to highlight the benefits of the proposed approach and the
    applicability of the heuristic. Conclusions and recommendations for further
    research are provided.

    Semi-supervised cross-entropy clustering with information bottleneck constraint

    Marek Śmieja, Bernhard C. Geiger
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose a semi-supervised clustering method, CEC-IB, that
    models data with a set of Gaussian distributions and that retrieves clusters
    based on a partial labeling provided by the user (partition-level side
    information). By combining the ideas from cross-entropy clustering (CEC) with
    those from the information bottleneck method (IB), our method trades between
    three conflicting goals: the accuracy with which the data set is modeled, the
    simplicity of the model, and the consistency of the clustering with side
    information. Experiments demonstrate that CEC-IB has a performance comparable
    to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but
    is faster, more robust to noisy labels, automatically determines the optimal
    number of clusters, and performs well when not all classes are present in the
    side information. Moreover, in contrast to other semi-supervised models, it can
    be successfully applied in discovering natural subgroups if the partition-level
    side information is derived from the top levels of a hierarchical clustering.

    Fast k-means based on KNN Graph

    Cheng-Hao Deng, Wan-Lei Zhao
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    In the era of big data, k-means clustering has been widely adopted as a basic
    processing tool in various contexts. However, its computational cost can be
    prohibitively high when the data size and the cluster number are large. It is
    well known that the processing bottleneck of k-means lies in the operation of
    seeking the closest centroid in each iteration. In this paper, a novel solution
    to the scalability issue of k-means is presented. In the proposal, k-means is
    supported by an approximate k-nearest neighbors graph. In each k-means
    iteration, a data sample is only compared to the clusters in which its nearest
    neighbors reside. Since the number of nearest neighbors considered is much
    smaller than k, the processing cost of this step becomes minor and independent
    of k. The processing bottleneck is therefore overcome. Most interestingly, the
    k-nearest neighbor graph is constructed by iteratively calling the fast
    (k)-means itself. Compared with existing fast k-means variants, the proposed
    algorithm achieves speed-ups of hundreds to thousands of times while
    maintaining high clustering quality. When tested on 10 million 512-dimensional
    data points, it takes only 5.2 hours to produce 1 million clusters; to perform
    clustering at the same scale, traditional k-means would take 3 years.

    Of the People: Voting Is More Effective with Representative Candidates

    Yu Cheng, Shaddin Dughmi, David Kempe
    Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI)

    In light of the classic impossibility results of Arrow and of Gibbard and
    Satterthwaite regarding voting with ordinal rules, there has been recent
    interest in characterizing how well common voting rules approximate the social
    optimum. In order to quantify the quality of approximation, it is natural to
    consider the candidates and voters as embedded within a common metric space,
    and to ask how much further the chosen candidate is from the population as
    compared to the socially optimal one. We use this metric preference model to
    explore a fundamental and timely question: does the social welfare of a
    population improve when candidates are representative of the population? If so,
    then by how much, and how does the answer depend on the complexity of the
    metric space?

    We restrict attention to the most fundamental and common social choice
    setting: a population of voters, two independently drawn candidates, and a
    majority rule election. When candidates are not representative of the
    population, it is known that the candidate selected by the majority rule can be
    thrice as far from the population as the socially optimal one. We examine how
    this ratio improves when candidates are drawn independently from the population
    of voters. Our results are two-fold: When the metric is a line, the ratio
    improves from 3 to (4-2\sqrt{2}), roughly 1.1716; this bound is tight. When the
    metric is arbitrary, we show a lower bound of 1.5 and a constant upper bound
    strictly better than 2 on the approximation ratio of the majority rule.

    The positive result depends in part on the assumption that candidates are
    independent and identically distributed. However, we show that independence
    alone is not enough to achieve the upper bound: even when candidates are drawn
    independently, if the population of candidates can be different from the
    voters, then an upper bound of 2 on the approximation is tight.
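
    The following Python sketch is a Monte Carlo illustration of the setup on a
    line metric: voters and two candidates are drawn i.i.d. from the same
    distribution, the majority winner is the candidate closer to more voters,
    and distortion is the winner's total distance to the voters divided by the
    better candidate's. The uniform voter distribution and sample sizes are
    arbitrary choices, and the (4-2\sqrt{2}) bound is a worst case over
    distributions, so this simulation only illustrates the model, it does not
    verify the bound.

    # Monte Carlo illustration of the metric-distortion setup on a line:
    # voters and two candidates are drawn i.i.d. from the same distribution,
    # the majority winner is the candidate closer to more voters, and the
    # distortion is its total distance to voters divided by the better
    # candidate's. The uniform distribution used here is only an example; the
    # 4 - 2*sqrt(2) bound is a worst case over voter distributions.
    import numpy as np

    def election_distortion(n_voters=1001, rng=None):
        rng = rng or np.random.default_rng()
        voters = rng.uniform(0.0, 1.0, size=n_voters)
        a, b = rng.uniform(0.0, 1.0, size=2)  # candidates from the voter distribution
        votes_a = np.sum(np.abs(voters - a) < np.abs(voters - b))
        winner = a if votes_a > n_voters / 2 else b
        social_cost = lambda c: np.abs(voters - c).sum()
        return social_cost(winner) / min(social_cost(a), social_cost(b))

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        ratios = [election_distortion(rng=rng) for _ in range(2000)]
        print("mean distortion:", np.mean(ratios))
        print("max distortion :", np.max(ratios))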

    Gait Pattern Recognition Using Accelerometers

    Vahid Alizadeh
    Comments: 6 pages, project report
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

    Motion ability is one of the most important human properties, and gait is a
    basis of human transitional movement. Gait, as a biometric for recognizing
    human identities, can be captured non-intrusively using wearable or portable
    smart devices. In this study, gait patterns are collected using a wireless
    platform of two sensors located at the chest and right ankle of the subjects.
    The raw data then undergo preprocessing and are segmented into 5-second
    windows. Time- and frequency-domain features are extracted, and the performance
    is evaluated with 5 different classifiers. The Decision Tree (with all
    features) and K-Nearest Neighbors (with 10 selected features) classifiers
    reached accuracies of 99.4% and 100%, respectively.
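
    A small Python sketch of the kind of pipeline described: 5-second
    windowing, simple time- and frequency-domain features, and a
    nearest-neighbour classifier. The sampling rate, feature set and the
    synthetic two-subject accelerometer signals are assumptions; the study's
    actual sensors, features and classifiers are only summarized in the
    abstract.

    # Sketch of the described pipeline on synthetic accelerometer signals:
    # split each recording into 5-second windows, extract simple time- and
    # frequency-domain features per window, and classify with a
    # 1-nearest-neighbour rule. Sampling rate, features and data are assumed.
    import numpy as np

    FS = 50  # assumed sampling rate (Hz)

    def windows(signal, seconds=5, fs=FS):
        step = seconds * fs
        return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]

    def features(window):
        spectrum = np.abs(np.fft.rfft(window))
        return np.array([window.mean(), window.std(),
                         window.max() - window.min(),
                         spectrum[1:].argmax() + 1,  # dominant non-DC frequency bin
                         spectrum.sum()])

    def one_nn(train_X, train_y, test_X):
        d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(-1)
        return train_y[d2.argmin(axis=1)]

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        t = np.arange(60 * FS) / FS
        # two synthetic "subjects" with different stride frequencies plus noise
        subjects = [np.sin(2 * np.pi * f * t) + 0.3 * rng.normal(size=t.size)
                    for f in (1.6, 2.1)]
        X = np.array([features(w) for s in subjects for w in windows(s)])
        y = np.array([i for i, s in enumerate(subjects) for _ in windows(s)])
        order = rng.permutation(len(X))
        train, test = order[: len(X) // 2], order[len(X) // 2:]
        accuracy = (one_nn(X[train], y[train], X[test]) == y[test]).mean()
        print("1-NN accuracy on synthetic gait windows:", accuracy)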


    Computation and Language

    A Finite State and Rule-based Akshara to Prosodeme (A2P) Converter in Hindi

    Somnath Roy
    Comments: If you need software (A2P Converter), you have to write for the same at “somnathroy86@gmail.com” or “somnat75_llh@jnu.ac.in”
    Subjects: Computation and Language (cs.CL)

    This article describes a software module called Akshara to Prosodeme (A2P)
    converter in Hindi. It converts an input grapheme into a prosodeme (a sequence
    of phonemes with the specification of syllable boundaries and prosodic labels).
    The software is based on two proposed finite state machines: one for
    syllabification and another for syllable labeling. In addition to that,
    it also uses a set of nonlinear phonological rules proposed for foot formation
    in Hindi, which encompass solutions to schwa-deletion in simple, compound,
    derived and inflected words. The nonlinear phonological rules are based on
    metrical phonology with the provision of recursive foot structure. A software
    module is implemented in Python. Testing of the software for syllabification,
    syllable labeling, schwa deletion and prosodic labeling yields an accuracy of
    more than 99% on a lexicon of 28664 words.

    Probabilistic Typology: Deep Generative Models of Vowel Inventories

    Ryan Cotterell, Jason Eisner
    Comments: ACL 2017
    Subjects: Computation and Language (cs.CL)

    Linguistic typology studies the range of structures present in human
    language. The main goal of the field is to discover which sets of possible
    phenomena are universal, and which are merely frequent. For example, all
    languages have vowels, while most—but not all—languages have an /u/ sound.
    In this paper we present the first probabilistic treatment of a basic question
    in phonological typology: What makes a natural vowel inventory? We introduce a
    series of deep stochastic point processes, and contrast them with previous
    computational, simulation-based approaches. We provide a comprehensive suite of
    experiments on over 200 distinct languages.


    Distributed, Parallel, and Cluster Computing

    Execution Templates: Caching Control Plane Decisions for Strong Scaling of Data Analytics

    Omid Mashayekhi, Hang Qu, Chinmayee Shah, Philip Levis
    Comments: To appear at USENIX ATC 2017
    Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

    Control planes of cloud frameworks trade off between scheduling granularity
    and performance. Centralized systems schedule at task granularity, but only
    schedule a few thousand tasks per second. Distributed systems schedule hundreds
    of thousands of tasks per second but changing the schedule is costly.

    We present execution templates, a control plane abstraction that can schedule
    hundreds of thousands of tasks per second while supporting fine-grained,
    per-task scheduling decisions. Execution templates leverage a program’s
    repetitive control flow to cache blocks of frequently-executed tasks. Executing
    a task in a template requires sending a single message. Large-scale scheduling
    changes install new templates, while small changes apply edits to existing
    templates.

    Evaluations of execution templates in Nimbus, a data analytics framework,
    find that they provide the fine-grained scheduling flexibility of centralized
    control planes while matching the strong scaling of distributed ones. Execution
    templates support complex, real-world applications, such as a fluid simulation
    with a triply nested loop and data dependent branches.


    Learning

    Fast k-means based on KNN Graph

    Cheng-Hao Deng, Wan-Lei Zhao
    Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

    In the era of big data, k-means clustering has been widely adopted as a basic
    processing tool in various contexts. However, its computational cost can be
    prohibitively high when the data size and the cluster number are large. It is
    well known that the processing bottleneck of k-means lies in the operation of
    seeking the closest centroid in each iteration. In this paper, a novel solution
    to the scalability issue of k-means is presented. In the proposal, k-means is
    supported by an approximate k-nearest neighbors graph. In each k-means
    iteration, a data sample is only compared to the clusters in which its nearest
    neighbors reside. Since the number of nearest neighbors considered is much
    smaller than k, the processing cost of this step becomes minor and independent
    of k. The processing bottleneck is therefore overcome. Most interestingly, the
    k-nearest neighbor graph is constructed by iteratively calling the fast
    (k)-means itself. Compared with existing fast k-means variants, the proposed
    algorithm achieves speed-ups of hundreds to thousands of times while
    maintaining high clustering quality. When tested on 10 million 512-dimensional
    data points, it takes only 5.2 hours to produce 1 million clusters; to perform
    clustering at the same scale, traditional k-means would take 3 years.

    Optimal Approximation with Sparsely Connected Deep Neural Networks

    Helmut Bölcskei, Philipp Grohs, Gitta Kutyniok, Philipp Petersen
    Subjects: Learning (cs.LG); Functional Analysis (math.FA)

    We derive fundamental lower bounds on the connectivity and the memory
    requirements of deep neural networks guaranteeing uniform approximation rates
    for arbitrary function classes in (L^2(\mathbb{R}^d)). In other words, we
    establish a connection between the complexity of a function class and the
    complexity of deep neural networks approximating functions from this class to
    within a prescribed accuracy.

    Additionally, we prove that our lower bounds are achievable for a broad
    family of function classes. Specifically, all function classes that are
    optimally approximated by a general class of representation systems—so-called
    affine systems—can be approximated by deep neural networks with
    minimal connectivity and memory requirements. Affine systems encompass a wealth
    of representation systems from applied harmonic analysis such as wavelets,
    ridgelets, curvelets, shearlets, (\alpha)-shearlets, and more generally
    (\alpha)-molecules. This result elucidates a remarkable universality property
    of neural networks and shows that they achieve the optimum approximation
    properties of all affine systems combined. As a specific example, we consider
    the class of (1/\alpha)-cartoon-like functions, which is approximated optimally
    by (\alpha)-shearlets.

    We also explain how our results can be extended to the case of functions on
    low-dimensional immersed manifolds.

    Finally, we present numerical experiments demonstrating that the standard
    stochastic gradient descent algorithm generates deep neural networks providing
    close-to-optimal approximation rates at minimal connectivity. Moreover, these
    results show that stochastic gradient descent actually learns approximations
    that are sparse in the representation systems optimally sparsifying the
    function class the network is trained on.

    Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

    Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Stephen W. Keckler
    Subjects: Learning (cs.LG); Hardware Architecture (cs.AR)

    Popular deep learning frameworks require users to fine-tune their memory
    usage so that the training data of a deep neural network (DNN) fits within the
    GPU physical memory. Prior work tries to address this restriction by
    virtualizing the memory usage of DNNs, enabling both CPU and GPU memory to be
    utilized for memory allocations. Despite its merits, virtualizing memory can
    incur significant performance overheads when the time needed to copy data back
    and forth from CPU memory is higher than the latency to perform the
    computations required for DNN forward and backward propagation. We introduce a
    high-performance virtualization strategy based on a “compressing DMA engine”
    (cDMA) that drastically reduces the size of the data structures that are
    targeted for CPU-side allocations. The cDMA engine offers an average 2.6x
    (maximum 13.8x) compression ratio by exploiting the sparsity inherent in
    offloaded data, improving the performance of virtualized DNNs by an average 32%
    (maximum 61%).
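
    The compression idea can be illustrated with a short Python sketch of
    zero-value compression: store a one-bit occupancy mask plus the nonzero
    activation values, which is profitable because post-ReLU activations are
    largely zero. The actual cDMA hardware format and its DMA integration are
    not described here; this only conveys why sparsity yields the quoted
    compression ratios.

    # Sketch of zero-value compression for offloaded activation tensors: store
    # a one-bit occupancy mask plus the nonzero values only. Post-ReLU
    # activations are mostly zero, which is the sparsity the cDMA engine
    # exploits; the real hardware compression format is not reproduced here.
    import numpy as np

    def compress(activations):
        flat = activations.ravel()
        mask = flat != 0.0
        return mask, flat[mask]

    def decompress(mask, values, shape):
        flat = np.zeros(mask.size, dtype=values.dtype)
        flat[mask] = values
        return flat.reshape(shape)

    def compression_ratio(activations, bytes_per_value=4):
        mask, values = compress(activations)
        original = activations.size * bytes_per_value
        compressed = mask.size / 8 + values.size * bytes_per_value  # mask packed to bits
        return original / compressed

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        acts = np.maximum(rng.normal(size=(64, 512)), 0).astype(np.float32)  # ReLU-like
        mask, vals = compress(acts)
        assert np.allclose(decompress(mask, vals, acts.shape), acts)
        print("compression ratio: %.2fx" % compression_ratio(acts))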

    Learning with Confident Examples: Rank Pruning for Robust Classification with Noisy Labels

    Curtis G. Northcutt, Tailin Wu, Isaac L. Chuang
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Noisy PN learning is the problem of binary classification when training
    examples may be mislabeled (flipped) uniformly with noise rate (\rho_1) for
    positive examples and (\rho_0) for negative examples. We propose Rank Pruning (RP)
    to solve noisy PN learning and the open problem of estimating the noise rates.
    Unlike prior solutions, RP is efficient and general, requiring O(T) for any
    unrestricted choice of probabilistic classifier with T fitting time. We prove
    RP achieves consistent noise estimation and equivalent empirical risk as
    learning with uncorrupted labels in ideal conditions, and derive closed-form
    solutions when conditions are non-ideal. RP achieves state-of-the-art noise
    rate estimation and F1, error, and AUC-PR on the MNIST and CIFAR datasets,
    regardless of noise rates. To highlight, RP with a CNN classifier can predict
    if a MNIST digit is a “1” or “not 1” with only 0.25% error, and 0.46% error
    across all digits, even when 50% of positive examples are mislabeled and 50% of
    observed positive labels are mislabeled negative examples.
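
    The following Python sketch conveys the flavour of pruning by rank, not the
    paper's exact algorithm: fit a probabilistic classifier on the noisy
    labels, rank examples by how strongly the predicted probability contradicts
    the given label, drop the most contradicted fraction, and refit. The
    pruning fraction and the use of scikit-learn's LogisticRegression are
    illustrative assumptions; the paper's consistent noise-rate estimators and
    reweighting are not reproduced.

    # Simplified sketch in the spirit of Rank Pruning: fit a probabilistic
    # classifier on the noisy labels, rank examples by how strongly the
    # predicted probability contradicts the given label, prune the most
    # contradicted fraction, and refit on the remainder.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def rank_prune_fit(X, noisy_y, prune_frac=0.2):
        base = LogisticRegression(max_iter=1000).fit(X, noisy_y)
        p1 = base.predict_proba(X)[:, 1]                    # P(y = 1 | x)
        disagreement = np.where(noisy_y == 1, 1 - p1, p1)   # label contradiction score
        keep = disagreement <= np.quantile(disagreement, 1 - prune_frac)
        return LogisticRegression(max_iter=1000).fit(X[keep], noisy_y[keep])

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 5))
        y = (X[:, 0] + 0.3 * rng.normal(size=2000) > 0).astype(int)
        noisy = y.copy()
        flip = rng.random(2000) < 0.3      # flip 30% of the labels uniformly
        noisy[flip] = 1 - noisy[flip]
        model = rank_prune_fit(X, noisy)
        print("clean accuracy after pruning:", (model.predict(X) == y).mean())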

    Semi-supervised model-based clustering with controlled clusters leakage

    Marek Śmieja, Łukasz Struski, Jacek Tabor
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we focus on finding clusters in partially categorized data
    sets. We propose a semi-supervised version of the Gaussian mixture model,
    called C3L, which retrieves natural subgroups of given categories. In contrast
    to other semi-supervised models, C3L is parametrized by a user-defined leakage
    level, which controls the maximal inconsistency between the initial
    categorization and the resulting clustering. Our method can be implemented as a
    module in practical expert systems to detect clusters that combine expert
    knowledge with the true distribution of the data. Moreover, it can be used to
    improve the results of less flexible clustering techniques, such as projection
    pursuit clustering. The paper presents an extensive theoretical analysis of the
    model and a fast algorithm for its efficient optimization. Experimental results
    show that C3L finds a high-quality clustering model, which can be applied to
    discover meaningful groups in partially classified data.

    Near-optimal linear decision trees for k-SUM and related problems

    Daniel M. Kane, Shachar Lovett, Shay Moran
    Comments: 18 pages, 1 figure
    Subjects: Computational Geometry (cs.CG); Computational Complexity (cs.CC); Discrete Mathematics (cs.DM); Learning (cs.LG); Combinatorics (math.CO)

    We construct near optimal linear decision trees for a variety of decision
    problems in combinatorics and discrete geometry. For example, for any constant
    (k), we construct linear decision trees that solve the (k)-SUM problem on (n)
    elements using (O(n \log^2 n)) linear queries. Moreover, the queries we use are
    comparison queries, which compare the sums of two (k)-subsets; when viewed as
    linear queries, comparison queries are (2k)-sparse and have only (\{-1,0,1\})
    coefficients. We give similar constructions for sorting sumsets (A+B) and for
    solving the SUBSET-SUM problem, both with optimal number of queries, up to
    poly-logarithmic terms.

    Our constructions are based on the notion of “inference dimension”, recently
    introduced by the authors in the context of active classification with
    comparison queries. This can be viewed as another contribution to the fruitful
    link between machine learning and discrete geometry, which goes back to the
    discovery of the VC dimension.

    Semi-Supervised AUC Optimization based on Positive-Unlabeled Learning

    Tomoya Sakai, Gang Niu, Masashi Sugiyama
    Subjects: Machine Learning (stat.ML); Learning (cs.LG)

    Maximizing the area under the receiver operating characteristic curve (AUC)
    is a standard approach to imbalanced classification. So far, various supervised
    AUC optimization methods have been developed and they are also extended to
    semi-supervised scenarios to cope with small sample problems. However, existing
    semi-supervised AUC optimization methods rely on strong distributional
    assumptions, which are rarely satisfied in real-world problems. In this paper,
    we propose a novel semi-supervised AUC optimization method that does not
    require such restrictive assumptions. We first develop an AUC optimization
    method based only on positive and unlabeled data (PU-AUC) and then extend it to
    semi-supervised learning by combining it with a supervised AUC optimization
    method. We theoretically prove that, without the restrictive distributional
    assumptions, unlabeled data contribute to improving the generalization
    performance in PU and semi-supervised AUC optimization methods. Finally, we
    demonstrate the practical usefulness of the proposed methods through
    experiments.

    Generative Convolutional Networks for Latent Fingerprint Reconstruction

    Jan Svoboda, Federico Monti, Michael M. Bronstein
    Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

    Performance of fingerprint recognition depends heavily on the extraction of
    minutiae points. Enhancement of the fingerprint ridge pattern is thus an
    essential pre-processing step that noticeably reduces false positive and
    negative detection rates. A particularly challenging setting is when the
    fingerprint images are corrupted or partially missing. In this work, we apply
    generative convolutional networks to denoise visible minutiae and predict the
    missing parts of the ridge pattern. The proposed enhancement approach is tested
    as a pre-processing step in combination with several standard feature
    extraction methods such as MINDTCT, followed by biometric comparison using MCC
    and BOZORTH3. We evaluate our method on several publicly available latent
    fingerprint datasets captured using different sensors.

    Semi-supervised cross-entropy clustering with information bottleneck constraint

    Marek Śmieja, Bernhard C. Geiger
    Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

    In this paper, we propose a semi-supervised clustering method, CEC-IB, that
    models data with a set of Gaussian distributions and that retrieves clusters
    based on a partial labeling provided by the user (partition-level side
    information). By combining the ideas from cross-entropy clustering (CEC) with
    those from the information bottleneck method (IB), our method trades between
    three conflicting goals: the accuracy with which the data set is modeled, the
    simplicity of the model, and the consistency of the clustering with side
    information. Experiments demonstrate that CEC-IB has a performance comparable
    to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but
    is faster, more robust to noisy labels, automatically determines the optimal
    number of clusters, and performs well when not all classes are present in the
    side information. Moreover, in contrast to other semi-supervised models, it can
    be successfully applied in discovering natural subgroups if the partition-level
    side information is derived from the top levels of a hierarchical clustering.


    Information Theory

    Blind Detection with Polar Codes

    Carlo Condo, Seyyed Ali Hashemi, Warren J. Gross
    Subjects: Information Theory (cs.IT)

    In blind detection, a set of candidates has to be decoded within a strict
    time constraint, to identify which transmissions are directed at the user
    equipment. Blind detection is an operation required by the 3GPP
    LTE/LTE-Advanced standard, and it will be required in the 5th generation
    wireless communication standard (5G) as well. We propose a blind detection
    scheme based on polar codes, where the radio network temporary identifier
    (RNTI) is transmitted instead of some of the frozen bits. A low-complexity
    decoding stage decodes all candidates, selecting a subset that is decoded by a
    high-performance algorithm. Simulation results show good missed detection and
    false alarm rates that meet the system specifications. We also propose an
    early stopping criterion for the second decoding stage that can reduce the
    number of operations performed, improving both average latency and energy
    consumption. The detection speed is analyzed and different system parameter
    combinations are shown to meet the stringent timing requirements, leading to
    various implementation trade-offs.

    PER Approximation for Cross-Layer Optimization under Reliability and Energy Constraints

    Aamir Mahmood, Mikael Gidlund, M M Aftab Hossain
    Subjects: Information Theory (cs.IT)

    The vision of connecting billions of battery operated devices to be used for
    diverse emerging applications calls for a wireless communication system that
    can support stringent reliability and latency requirements. Both reliability
    and energy efficiency are critical for many of these applications that involve
    communication with short packets which undermine the coding gain achievable
    from large packets. In this paper, we first revisit the packet error rate (PER)
    performance of uncoded schemes in block fading channels and derive a simple and
    accurate PER expression. Specifically, we show that the waterfall threshold in
    the PER upper bound in Nakagami-(m) block fading channels is tightly
    approximated by the (m)-th moment of an asymptotic distribution of PER in AWGN
    channel. This PER expression gives an explicit connection between the
    parameters of both the physical and link layers and the PER. We utilize this
    connection for cross-layer design and optimization of communication links. To
    this end, we optimize the signal-to-noise ratio (SNR) and modulation order at
    the physical layer, and the packet length and number of retransmissions at the
    link layer, with respect to distance under the prescribed delay and reliability
    constraints.

    On the Design of Matched Filters for Molecule Counting Receivers

    Vahid Jamali, Arman Ahmadzadeh, Robert Schober
    Comments: To appear in IEEE Communications Letter
    Subjects: Information Theory (cs.IT)

    In this paper, we design matched filters for diffusive molecular
    communication systems taking into account the following impairments:
    signal-dependent diffusion noise, inter-symbol interference (ISI), and external
    interfering molecules. The receiver counts the number of observed molecules
    several times within one symbol interval and employs linear filtering to detect
    the transmitted data. We derive the optimal matched filter by maximizing the
    expected signal-to-interference-plus-noise ratio of the decision variable.
    Moreover, we show that for the special case of an ISI-free channel, the matched
    filter reduces to a simple sum detector and a correlator for the channel
    impulse response for the diffusion noise-limited and (external)
    interference-limited regimes, respectively. Our simulation results reveal that
    the proposed matched filter considerably outperforms the benchmark schemes
    available in the literature, especially when ISI is severe.
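
    As a generic illustration (not the paper's derivation), the following
    Python sketch computes the SINR-maximizing linear filter, proportional to
    (C^{-1} s), for a known signal template (s) and a noise-plus-interference
    covariance (C) over the within-symbol molecule counts, and compares it with
    a simple sum detector; with white noise it reduces to a correlator with the
    template, mirroring the special cases mentioned above. The template and
    covariance used here are toy assumptions, and the paper's signal-dependent
    diffusion noise model is not reproduced.

    # Generic SINR-maximizing linear filter sketch: with a known signal
    # template s and a noise-plus-interference covariance C for the vector of
    # within-symbol molecule counts, the filter w = C^{-1} s (up to scaling)
    # maximizes the output SINR.
    import numpy as np

    def sinr_max_filter(template, cov):
        w = np.linalg.solve(cov, template)   # w proportional to C^{-1} s
        return w / np.linalg.norm(w)

    def output_sinr(w, template, cov):
        return (w @ template) ** 2 / (w @ cov @ w)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        s = np.exp(-np.arange(10) / 3.0)                  # assumed channel impulse response samples
        C = np.diag(0.5 + s) + 0.05 * np.ones((10, 10))   # toy noise + interference covariance
        w = sinr_max_filter(s, C)
        sum_detector = np.ones(10) / np.sqrt(10)
        print("matched filter SINR :", output_sinr(w, s, C))
        print("sum detector SINR   :", output_sinr(sum_detector, s, C))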

    Wireless Channel Modeling Perspectives for Ultra-Reliable Communications

    Patrick C. F. Eggers, Petar Popovski
    Comments: Submitted to IEEE Transactions on Wireless Communications
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    Ultra-Reliable Communication (URC) is one of the distinctive features of the
    upcoming 5G wireless communication. The level of reliability, going down to
    packet error rates (PER) of (10^{-9}), should be sufficiently convincing in
    order to remove cables in an industrial setting or provide remote control of
    robots with mission-critical function. In this paper we present elements of
    physical and statistical modeling of the wireless channel that are relevant for
    characterization of the lower tail of the channel Cumulative Distribution
    Function (CDF). There are channel models, such as Two-Wave with Diffuse Power
    (TWDP) or Suzuki, where finding the full CDF is not tractable. We show that,
    for a wide range of channel models, the outage probability at URC levels can be
    approximated by a simple expression, whose exponent depends on the actual
    channel model. Furthermore, it is seen that the two-wave model leads to
    pessimistic predictions of the fading in the region of ultra-reliable
    communications, while the CDFs of models that contain diffuse components have
    slopes that correspond to the slope of a Rayleigh fading. We provide analysis
    of the receive antenna diversity schemes for URC-relevant statistics and obtain
    a new expression for Maximum Ratio Combining (MRC) in Weibull channels.
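
    A small numerical illustration of a lower-tail approximation, for the
    Rayleigh case only: the power gain (|h|^2) is exponentially distributed, so
    the outage probability (P(|h|^2 < x) = 1 - e^{-x} \approx x) for small (x).
    The Python sketch below compares Monte Carlo estimates with the exact CDF
    and the linear approximation; the abstract's general result, whose exponent
    depends on the channel model, is not reproduced.

    # Lower tail of the Rayleigh fading CDF in the ultra-reliable regime:
    # P(|h|^2 < x) = 1 - exp(-x) is approximately x for small x, i.e. the
    # outage probability has a simple linear tail approximation.
    import numpy as np

    def rayleigh_outage(threshold, n_samples=10**7, rng=None):
        rng = rng or np.random.default_rng(0)
        gain = rng.exponential(scale=1.0, size=n_samples)  # |h|^2 for Rayleigh fading
        return np.mean(gain < threshold)

    if __name__ == "__main__":
        for x in (1e-1, 1e-2, 1e-3, 1e-4):
            print("x=%.0e  simulated=%.2e  exact=%.2e  linear approx=%.2e"
                  % (x, rayleigh_outage(x), 1 - np.exp(-x), x))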

    3GPP-inspired HetNet Model using Poisson Cluster Process: Sum-product Functionals and Downlink Coverage

    Chiranjib Saha, Mehrnaz Afshang, Harpreet S. Dhillon
    Comments: Submitted to IEEE Transactions on Communications. A part of this paper appeared in 2017 ITA Workshop. It is available at arXiv:1702.05706
    Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

    The growing complexity of heterogeneous cellular networks (HetNets) has
    necessitated a variety of user and base station (BS) configurations to be
    considered for realistic performance evaluation and system design. This is
    directly reflected in the HetNet simulation models proposed by standardization
    bodies, such as the third generation partnership project (3GPP). Complementary
    to these simulation models, the stochastic geometry-based approach, modeling the
    locations of the users and the K tiers of BSs as independent and homogeneous
    Poisson point processes (PPPs), has gained prominence in the past few years.
    Despite its success in revealing useful insights, this PPP-based K-tier HetNet
    model is not rich enough to capture spatial coupling between user and BS
    locations that exists in real-world HetNet deployments and is included in 3GPP
    simulation models. In this paper, we demonstrate that modeling a fraction of
    users and arbitrary number of BS tiers alternatively with a Poisson cluster
    process (PCP) captures the aforementioned coupling, thus bridging the gap
    between the 3GPP simulation models and the PPP-based analytic model for
    HetNets. We further show that the downlink coverage probability of a typical
    user under maximum signal-to-interference-ratio association can be expressed in
    terms of the sum-product functionals over PPP, PCP, and its associated
    offspring point process, which are all characterized as a part of our analysis.
    We also show that the proposed model converges to the PPP-based HetNet model as
    the cluster size of the PCPs tends to infinity. Finally, we specialize our
    analysis based on general PCPs for Thomas and Matern cluster processes. Special
    instances of the proposed model closely resemble the different configurations
    for BS and user locations considered in 3GPP simulations.

    State-Dependent Gaussian Multiple Access Channels: New Outer Bounds and Capacity Results

    Wei Yang, Yingbin Liang, Shlomo Shamai (Shitz), H. Vincent Poor
    Comments: The material of this paper will be presented in part at the 2017 International Symposium on Information Theory (ISIT)
    Subjects: Information Theory (cs.IT)

    This paper studies a two-user state-dependent Gaussian multiple-access
    channel (MAC) with state noncausally known at one encoder. Two scenarios are
    considered: i) each user wishes to communicate an independent message to the
    common receiver, and ii) the two encoders send a common message to the receiver
    and the non-cognitive encoder (i.e., the encoder that does not know the state)
    sends an independent individual message (this model is also known as the MAC
    with degraded message sets). For both scenarios, new outer bounds on the
    capacity region are derived, which improve uniformly over the best known outer
    bounds. In the first scenario, the two corner points of the capacity region as
    well as the sum rate capacity are established, and it is shown that a
    single-letter solution is adequate to achieve both the corner points and the
    sum rate capacity. Furthermore, the full capacity region is characterized in
    situations in which the sum rate capacity is equal to the capacity of the
    helper problem. The proof exploits the optimal-transportation idea of
    Polyanskiy and Wu (which was used previously to establish an outer bound on the
    capacity region of the interference channel) and the worst-case Gaussian noise
    result for the case in which the input and the noise are dependent.

    Capacity of Burst Noise-Erasure Channels With and Without Feedback and Input Cost

    Lin Song, Fady Alajaji, Tamás Linder
    Comments: Parts of this work will be presented at the 2017 IEEE International Symposium on Information Theory
    Subjects: Information Theory (cs.IT)

    A class of burst noise-erasure channels which incorporate both errors and
    erasures during transmission is studied. The channel, whose output is
    explicitly expressed in terms of its input and a stationary ergodic
    noise-erasure process, is shown to have a so-called “quasi-symmetry” property
    under certain invertibility conditions. As a result, it is proved that a
    uniformly distributed input process maximizes the channel’s block mutual
    information, resulting in a closed-form formula for its non-feedback capacity
    in terms of the noise-erasure entropy rate and the entropy rate of an auxiliary
    erasure process. The feedback channel capacity is also characterized, showing
    that feedback does not increase capacity and generalizing prior related
    results. The capacity-cost function of the channel with and without feedback is
    also investigated. A sequence of finite-letter upper bounds for the
    capacity-cost function without feedback is derived. Finite-letter lower bounds
    for the capacity-cost function with feedback are obtained using a specific
    encoding rule. Based on these bounds, it is demonstrated both numerically and
    analytically that feedback can increase the capacity-cost function for a class
    of channels with Markov noise-erasure processes.

    Fourth-order Tensors with Multidimensional Discrete Transforms

    Xiao-Yang Liu, Xiaodong Wang
    Subjects: Numerical Analysis (cs.NA); Information Theory (cs.IT)

    The big data era is sweeping across areas including data analysis,
    machine/deep learning, signal processing, statistics, scientific computing,
    and cloud computing. The multidimensional nature and huge volume of big data
    place urgent demands on the development of multilinear modeling tools and
    efficient algorithms. In this paper, we build a novel multilinear tensor space
    that supports useful algorithms such as SVD and QR, even though generalizing
    the matrix space to fourth-order tensors was believed to be challenging. Specifically,
    given any multidimensional discrete transform, we show that fourth-order
    tensors are bilinear operators on a space of matrices. First, we take a
    transform-based approach to construct a new tensor space by defining a new
    multiplication operation and tensor products, and accordingly the analogous
    concepts: identity, inverse, transpose, linear combinations, and orthogonality.
    Secondly, we define the (\mathcal{L})-SVD for fourth-order tensors and present
    an efficient algorithm, where the tensor case requires a stronger condition for
    unique decomposition than the matrix case. Thirdly, we define the tensor
    (\mathcal{L})-QR decomposition and propose a Householder QR algorithm to avoid
    the catastrophic cancellation problem associated with the conventional
    Gram-Schmidt process. Finally, we validate our schemes on video compression and
    one-shot face recognition. For video compression, compared with the existing
    tSVD, the proposed (\mathcal{L})-SVD achieves (3\sim 10) dB gains in RSE, while
    the running time is reduced by about (50\%) and (87.5\%), respectively. For
    one-shot face recognition, the recognition rate is increased by about
    (10\% \sim 20\%).
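
    As a rough sketch in the spirit of transform-domain tensor SVDs (not the
    paper's exact (\mathcal{L})-SVD), the Python code below applies a 2-D DFT
    along the last two modes of a fourth-order tensor, takes an ordinary matrix
    SVD of every transform-domain slice, optionally truncates, and inverts the
    transform. The choice of the DFT as the multidimensional transform and the
    truncation rank are assumptions; the paper's uniqueness condition and
    Householder (\mathcal{L})-QR are not reproduced.

    # Rough transform-domain SVD sketch for a fourth-order tensor: 2-D DFT
    # along the last two modes (the assumed multidimensional transform), matrix
    # SVD of each transform-domain slice, optional truncation, inverse DFT.
    import numpy as np

    def l_svd_reconstruct(tensor, rank=None):
        # tensor has shape (n1, n2, n3, n4); transform along the last two modes
        hat = np.fft.fft2(tensor, axes=(2, 3))
        out = np.empty_like(hat)
        n3, n4 = tensor.shape[2], tensor.shape[3]
        for i in range(n3):
            for j in range(n4):
                u, s, vt = np.linalg.svd(hat[:, :, i, j], full_matrices=False)
                if rank is not None:              # optional low-rank truncation
                    u, s, vt = u[:, :rank], s[:rank], vt[:rank]
                out[:, :, i, j] = (u * s) @ vt
        return np.real(np.fft.ifft2(out, axes=(2, 3)))

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        T = rng.normal(size=(6, 5, 4, 4))
        full = l_svd_reconstruct(T)           # no truncation: exact reconstruction
        low = l_svd_reconstruct(T, rank=2)    # truncated reconstruction
        print("exact recon error :", np.linalg.norm(full - T))
        print("rank-2 recon error:", np.linalg.norm(low - T))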



