
    Stat212b: Topics Course on Deep Learning

    Posted by 我爱机器学习 (52ml.net) on 2016-10-10 14:49:23

    Source: https://joanbruna.github.io
    Original: Stat212b: Topics Course on Deep Learning

    A deep learning course from the Berkeley Statistics Department, with a mathematical and statistical slant.
    Instructor: Joan Bruna, UC Berkeley, Statistics Department
    Term: Spring 2016

    Detailed Syllabus and Lectures

    • Lec1: Intro and Logistics
    • Lec2: Representations for Recognition: stability, variability. Kernel approaches / Feature extraction.
      • Elements of Statistical Learning, chapt. 12, Hastie, Tibshirani, Friedman.
    • Lec3: Groups, Invariants and Filters.
      • Learning Stable Group Invariant Representations with Convolutional Networks
      • Understanding Deep Convolutional Networks, S. Mallat.
      • A Wavelet Tour of Signal Processing, chapt. 2-5 and 7, S. Mallat.
    • Lec4: Scattering Convolutional Networks.
      • Invariant Scattering Convolutional Networks

      Further reading:

      • Group Invariant Scattering, S. Mallat
      • Scattering Representations for Recognition
    • Lec5: Further Scattering: Properties and Extensions.
      • Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination, Sifre & Mallat.
    • Lec6: Convolutional Neural Networks: Geometry and first Properties.
      • Deep Learning, Y. LeCun, Y. Bengio & G. Hinton.
      • Understanding Deep Convolutional Networks, S. Mallat.
    • Lec7: Properties of learnt CNN representations: Covariance and Invariance, redundancy, invertibility.
      • Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?, R. Giryes, G. Sapiro, A. Bronstein.
      • Intriguing Properties of Neural Networks, C. Szegedy et al.
      • Geodesics of Learnt Representations, O. Henaff & E. Simoncelli.
      • Inverting Visual Representations with Convolutional Networks, A. Dosovitskiy, T. Brox.
      • Visualizing and Understanding Convolutional Networks, M. Zeiler, R. Fergus.
    • Lec8: Connections with other models (Dict. Learning, Random Forests)
      • Proximal Splitting Methods in Signal Processing, Combettes & Pesquet.
      • A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, Beck & Teboulle.
      • Learning Fast Approximations of Sparse Coding, K. Gregor & Y. LeCun.
      • Task-Driven Dictionary Learning, J. Mairal, F. Bach, J. Ponce.
      • Exploiting Generative Models in Discriminative Classifiers, T. Jaakkola & D. Haussler.
      • Improving the Fisher Kernel for Large-Scale Image Classification, F. Perronnin et al.
      • NetVLAD, R. Arandjelovic et al.
    • Lec9: Other high level tasks: localization, regression, embedding, inverse problems.
      • Object Detection with Discriminatively Trained Deformable Parts Models, Felzenszwalb, Girshick, McAllester and Ramanan, PAMI’10.
      • Deformable Parts Models are Convolutional Neural Networks, Girshick, Iandola, Darrell and Malik, CVPR’15.
      • Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, Girshick, Donahue, Darrell and Malik, PAMI’14.
      • Graphical Models, Message-Passing Algorithms and Convex Optimization, M. Wainwright.
      • Conditional Random Fields as Recurrent Neural Networks, Zheng et al., ICCV’15.
      • Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation, Tompson, Jain, LeCun and Bregler, NIPS’14.
    • Lec10: Extensions to non-Euclidean domains. Representations of stationary processes. Properties.
      • Dimensionality Reduction by Learning an Invariant Mapping, Hadsell, Chopra, LeCun, ’06.
      • Deep Metric Learning via Lifted Structured Feature Embedding, Oh Song, Xiang, Jegelka, Savarese, ’15.
      • Spectral Networks and Locally Connected Networks on Graphs, Bruna, Szlam, Zaremba, LeCun, ’14.
      • Spatial Transformer Networks, Jaderberg, Simonyan, Zisserman, Kavukcuoglu, ’15.
      • Intermittent Process Analysis with Scattering Moments, Bruna, Mallat, Bacry, Muzy, ’14.
    • Lec11: Guest Lecture (W. Zaremba, OpenAI): Discrete Neural Turing Machines.
    • Lec12: Representations of Stationary Processes (contd). Sequential Data: Recurrent Neural Networks.
      • Intermittent Process Analysis with Scattering Moments, J. Bruna, Mallat, Bacry and Muzy, Annals of Statistics, ’13.
      • A Mathematical Motivation for Complex-Valued Convolutional Networks, Tygert et al., Neural Computation ’16.
      • Texture Synthesis Using Convolutional Neural Networks, Gatys, Ecker, Bethge, NIPS’15.
      • A Neural Algorithm of Artistic Style, Gatys, Ecker, Bethge, ’15.
      • Time Series Analysis and Its Applications, Shumway, Stoffer, Chapter 6.
      • Deep Learning, Goodfellow, Bengio, Courville, ’16. Chapter 10.
    • Lec13: Recurrent Neural Networks (contd). Long Short Term Memory. Applications.
      • Deep Learning, Goodfellow, Bengio, Courville, ’16. Chapter 10.
      • Generating Sequences with Recurrent Neural Networks, A. Graves.
      • The Unreasonable Effectiveness of Recurrent Neural Networks, A. Karpathy.
      • The Unreasonable Effectiveness of Character-Level Language Models, Y. Goldberg.
    • Lec14: Unsupervised Learning: Curse of dimensionality, Density estimation. Graphical Models, Latent Variable models.
      • Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks, K. Cho, A. Courville, Y. Bengio.
      • Graphical Models, Exponential Families and Variational Inference, M. Wainwright, M. Jordan.
    • Lec15: Autoencoders. Variational Inference. Variational Autoencoders.
      • Graphical Models, Exponential Families and Variational Inference, chapter 3, M. Wainwright, M. Jordan.
      • Variational Inference with Stochastic Search, J. Paisley, D. Blei, M. Jordan.
      • Stochastic Variational Inference, M. Hoffman, D. Blei, Wang, Paisley.
      • Auto-Encoding Variational Bayes, Kingma & Welling.
      • Stochastic Backpropagation and Variational Inference in Deep Latent Gaussian Models, D. Rezende, S. Mohamed, D. Wierstra.
    • Lec16: Variational Autoencoders (contd). Normalizing Flows. Generative Adversarial Networks.
      • Semi-Supervised Learning with Deep Generative Models, Kingma, Rezende, Mohamed, Welling.
      • Importance Weighted Autoencoders, Burda, Grosse, Salakhutdinov.
      • Variational Inference with Normalizing Flows, Rezende, Mohamed.
      • Unsupervised Learning Using Nonequilibrium Thermodynamics, Sohl-Dickstein et al.
      • Generative Adversarial Networks, Goodfellow et al.
    • Lec17: Generative Adversarial Networks (contd).
      • Generative Adversarial Networks, Goodfellow et al.
      • Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks, Denton, Chintala, Szlam, Fergus.
      • Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford, Metz, Chintala.
    • Lec18: Maximum Entropy Distributions. Self-supervised models (analogies, video prediction, text, word2vec).
      • Graphical Models, Exponential Families and Variational Inference, chapter 3, M. Wainwright, M. Jordan.
      • An Introduction to MCMC for Machine Learning, Andrieu, de Freitas, Doucet, Jordan.
      • Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images, Geman & Geman.
      • Distributed Representations of Words and Phrases and Their Compositionality, Mikolov et al.
      • word2vec Explained: Deriving Mikolov et al.’s Negative-Sampling Embedding Method, Goldberg & Levy.
    • Lec19: Self-supervised models (contd). Non-convex Optimization. Stochastic Optimization.
      • Pixel Recurrent Neural Networks, A. van den Oord, N. Kalchbrenner, K. Kavukcuoglu.
      • The Tradeoffs of Large Scale Learning, Bottou, Bousquet.
      • Introduction to Statistical Learning Theory, Bousquet, Boucheron, Lugosi.
    • Lec20: Guest Lecture (S. Chintala, Facebook AI Research), “The Adversarial Network Nonsense”.
    • Lec21: Accelerated Gradient Descent, Regularization, Dropout.
      • Convex Optimization: Algorithms and Complexity, S. Bubeck.
      • Optimization, Simons Big Data Boot Camp, B. Recht.
      • The Zen of Gradient Descent, M. Hardt.
      • Train Faster, Generalize Better: Stability of Stochastic Gradient Descent, M. Hardt, B. Recht, Y. Singer.
      • Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava, Hinton et al.
    • Lec22: Dropout (contd). Batch Normalization, Tensor Decompositions.
      • Dropout Training as Adaptive Regularization, Wager, Wang, Liang.
      • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe, Szegedy.
      • Global Optimality in Tensor Factorization, Deep Learning and Beyond, Haeffele, Vidal.
    • Lec23: Guest Lecture (Yann Dauphin, Facebook AI Research), “Optimizing Deep Nets”.
    • Lec24: Tensor Decompositions (contd), Spin Glasses.
      • On the Expressive Power of Deep Learning: A Tensor Analysis, Cohen, Sharir, Shashua.
      • Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks Using Tensor Methods, Janzamin, Sedghi, Anandkumar.
      • The Loss Surfaces of Multilayer Networks, Choromanska, Henaff, Mathieu, Ben Arous, LeCun.

