
    [Digest] Machine Learning Video Library

    Posted by I Love Machine Learning (52ml.net) on 2016-10-07 14:32:42

    Source: Caltech
    Original: Machine Learning Video Library

    A summary of the related machine learning knowledge points:

    [Two knowledge-map images, map1 and map2, in the original post]

    Caltech's machine learning videos accompany the open course Learning from Data. Details below:

    Outline

    This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML has become one of the hottest fields of study today, taken up by undergraduate and graduate students from 15 different majors at Caltech. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects. The lectures below follow each other in a story-like fashion:

    • What is learning?
    • Can a machine learn?
    • How to do it?
    • How to do it well?
    • Take-home lessons.

    The 18 lectures are about 60 minutes each plus Q&A. The content of each lecture is color coded:

    • theory (mathematical)
    • technique (practical)
    • analysis (conceptual)
    • Lecture 1: The Learning Problem
    • Lecture 2: Is Learning Feasible?
    • Lecture 3: The Linear Model I
    • Lecture 4: Error and Noise
    • Lecture 5: Training versus Testing
    • Lecture 6: Theory of Generalization
    • Lecture 7: The VC Dimension
    • Lecture 8: Bias-Variance Tradeoff
    • Lecture 9: The Linear Model II
    • Lecture 10: Neural Networks
    • Lecture 11: Overfitting
    • Lecture 12: Regularization
    • Lecture 13: Validation
    • Lecture 14: Support Vector Machines
    • Lecture 15: Kernel Methods
    • Lecture 16: Radial Basis Functions
    • Lecture 17: Three Learning Principles
    • Lecture 18: Epilogue

    The complete video playlist: https://www.youtube.com/watch?v=mbyG85GZ0PI

    The videos organized by knowledge point:

    • Aggregation
      • Overview of ensemble learning (boosting, blending, before and after the fact)
    • Bayesian Learning
      • Validity of the Bayesian approach (prior, posterior, unknown versus probabilistic)
    • Bias-Variance Tradeoff
      • Basic derivation (overfit and underfit, approximation-generalization tradeoff)
      • Example (sinusoidal target function)
      • Noisy case (Bias-variance-noise decomposition)
    • Bin Model
      • Hoeffding Inequality (law of large numbers, sample, PAC; the bound is written out after this list)
      • Relation to learning (from bin to hypothesis, training data)
      • Multiple bins (finite hypothesis set, learning: search for green sample)
      • Union Bound (uniform inequality, M factor)
    • Data Snooping
      • Definition and analysis (data contamination, model selection)
    • Error Measures
      • User-specified error function (pointwise error, CIA, supermarket)
    • Gradient Descent
      • Basic method (Batch GD) (first-order optimization)
      • Discussion (initialization, termination, local minima, second-order methods)
      • Stochastic Gradient Descent (the algorithm, SGD in action)
      • Initialization – Neural Networks (random weights, perfect symmetry)
    • Learning Curves
      • Definition and illustration (complex models versus simple models)
      • Linear Regression example (learning curves for noisy linear target)
    • Learning Diagram
      • Components of learning (target function, hypothesis set, learning algorithm)
      • Input probability distribution (unknown distribution, bin, Hoeffding)
      • Error measure (role in learning algorithm)
      • Noisy targets (target distribution)
      • Where the VC analysis fits (affected blocks in learning diagram)
    • Learning Paradigms
      • Types of learning (supervised, reinforcement, unsupervised, clustering)
      • Other paradigms (review, active learning, online learning)
    • Linear Classification
      • The Perceptron (linearly separable data, PLA; see the PLA sketch after this list)
      • Pocket algorithm (non-separable data, comparison with PLA)
    • Linear Regression
      • The algorithm (real-valued function, mean-squared error, pseudo-inverse; see the sketch after this list)
      • Generalization behavior (learning curves for linear regression)
    • Logistic Regression
      • The model (soft threshold, sigmoid, probability estimation)
      • Cross entropy error (maximum likelihood)
      • The algorithm (gradient descent; see the sketch after this list)
    • Netflix Competition
      • Movie rating (singular value decomposition, essence of machine learning)
      • Applying SGD (stochastic gradient descent, SVD factors)
    • Neural Networks
      • Biological inspiration (limits of inspiration)
      • Multilayer perceptrons (the model and its power and limitations)
      • Neural Network model (feedforward layers, soft threshold)
      • Backpropagation algorithm (SGD, delta rule)
      • Hidden layers (interpretation)
      • Regularization (weight decay, weight elimination, early stopping)
    • Nonlinear Transformation
      • Basic method (linearity in the parameters, Z space)
      • Illustration (non-separable data, quadratic transform)
      • Generalization behavior (VC dimension of a nonlinear transform)
    • Occam’s Razor
      • Definition and analysis (definition of complexity, why simpler is better)
    • Overfitting
      • The phenomenon (fitting the noise)
      • A detailed experiment (Legendre polynomials, types of noise)
      • Deterministic noise (target complexity, stochastic noise)
    • Radial Basis Functions
      • Basic RBF model (exact interpolation, nearest neighbor)
      • K Centers (Lloyd’s algorithm, unsupervised learning, pseudo-inverse)
      • RBF network (neural networks, local versus global, EM algorithm)
      • Relation to other techniques (SVM kernel, regularization)
    • Regularization
      • Introduction (putting the brakes, function approximation)
      • Formal derivation (Legendre polynomials, soft-order constraint, augmented error)
      • Weight decay (Tikhonov, smoothness, neural networks)
      • Augmented error (proxy for out-of-sample error, choosing a regularizer)
      • Regularization parameter (deterministic noise, stochastic noise)
    • Sampling Bias
      • Definition and analysis (Truman versus Dewey, matching the distributions)
    • Support Vector Machines
      • SVM basic model (hard margin, constrained optimization)
      • The solution (KKT conditions, Lagrange, dual problem, quadratic programming)
      • Soft margin (non-separable data, slack variables)
      • Nonlinear transform (Z space, support vector pre-images)
      • Kernel methods (generalized inner product, Mercer’s condition, RBF kernel)
    • Validation
      • Introduction (validation versus regularization, optimistic bias)
      • Model selection (data contamination, validation set versus test set)
      • Cross Validation (leave-one-out, 10-fold cross validation; see the sketch after this list)
    • VC Dimension
      • Growth function (dichotomies, Hoeffding Inequality)
      • Examples (growth function for simple hypothesis sets)
      • Break points (polynomial growth functions)
      • Bounding the growth function (mathematical induction, polynomial bound)
      • Definition of VC Dimension (shattering, distribution-free, Vapnik-Chervonenkis)
      • VC Dimension of Perceptrons (number of parameters, lower and upper bounds)
      • Interpreting the VC Dimension (degrees of freedom, number of examples)
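
    For the Bin Model entries above: the Hoeffding Inequality bounds how far the sample frequency \nu can drift from the bin probability \mu over N independent draws,

    \[ \Pr\big[\, |\nu - \mu| > \epsilon \,\big] \;\le\; 2e^{-2\epsilon^{2}N} \]

    and the union bound stretches this to a finite hypothesis set of size M (the "M factor" in the list), applied to the final hypothesis g:

    \[ \Pr\big[\, |E_{\mathrm{in}}(g) - E_{\mathrm{out}}(g)| > \epsilon \,\big] \;\le\; 2Me^{-2\epsilon^{2}N} \]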
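
    For the Linear Classification entry, a minimal NumPy sketch of the Perceptron Learning Algorithm (PLA) on linearly separable data. This is an illustration, not code from the course; the zero initialization and the random choice among misclassified points are assumptions.

        import numpy as np

        def pla(X, y, max_iters=1000):
            # Perceptron Learning Algorithm: X is (N, d), y holds labels in {-1, +1}.
            Xb = np.hstack([np.ones((len(y), 1)), X])   # constant-1 column acts as the bias
            w = np.zeros(Xb.shape[1])
            for _ in range(max_iters):
                misclassified = np.where(np.sign(Xb @ w) != y)[0]
                if misclassified.size == 0:             # no mistakes left: converged
                    break
                i = np.random.choice(misclassified)     # pick any misclassified point
                w = w + y[i] * Xb[i]                    # the PLA update rule
            return w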
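
    The Linear Regression entry's pseudo-inverse algorithm computes the mean-squared-error minimizer in one step; a sketch under the same conventions (the added bias column is an assumption):

        def linear_regression(X, y):
            # w = pseudo-inverse(Xb) @ y minimizes mean-squared error in closed form
            Xb = np.hstack([np.ones((len(y), 1)), X])
            return np.linalg.pinv(Xb) @ y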
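
    For the Logistic Regression entry, a sketch of batch gradient descent on the cross-entropy error E(w) = (1/N) sum_n ln(1 + exp(-y_n w.x_n)) with labels in {-1, +1}; the learning rate and iteration count are arbitrary assumptions:

        def logistic_regression(X, y, lr=0.1, n_iters=1000):
            sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))
            Xb = np.hstack([np.ones((len(y), 1)), X])
            w = np.zeros(Xb.shape[1])
            for _ in range(n_iters):
                # gradient of cross entropy: -(1/N) sum_n y_n x_n sigmoid(-y_n w.x_n)
                grad = -(Xb * (y * sigmoid(-y * (Xb @ w)))[:, None]).mean(axis=0)
                w -= lr * grad                          # batch gradient descent step
            return w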
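
    And for the Validation entry, a leave-one-out cross validation sketch: train N times on N-1 points, validate on the single held-out point, and average the squared errors. The fit/predict interface is an assumption, chosen so it composes with the linear_regression sketch above.

        def loo_cv_error(X, y, fit, predict):
            # leave-one-out estimate of the out-of-sample squared error
            N = len(y)
            total = 0.0
            for n in range(N):
                mask = np.arange(N) != n
                w = fit(X[mask], y[mask])                  # train on the other N-1 points
                total += (predict(w, X[n]) - y[n]) ** 2    # validate on the held-out point
            return total / N

        # usage with the linear model above:
        # e_cv = loo_cv_error(X, y, linear_regression, lambda w, x: w[0] + w[1:] @ x)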

