CSE 515T: Bayesian Methods in Machine Learning – Spring 2015

Instructor: Professor Roman Garnett
TA: Wenlin Chen
Time/Location: Monday/Wednesday 4–5:30pm, Cupples II 230
Office hours (Garnett): Thursdays 3–5pm, Jolley Hall 504
Office hours (Chen): Tuesdays 10:30am–12pm, Bryan Hall 422
syllabus
Piazza message board
course questionnaire


Description

This course will cover modern machine learning techniques from a Bayesian probabilistic perspective. Bayesian probability allows us to model and reason about all types of uncertainty. The result is a powerful, consistent framework for approaching many problems that arise in machine learning, including parameter estimation, model comparison, and decision making. We will begin with a high-level introduction to Bayesian inference, then proceed to cover more-advanced topics.

Midterm

Midterm (solutions), due either 6 March 2015 or 7 March 2015; see front page of document for details.

Please post questions to Piazza as a private message to the instructors!

Assignments

Please post questions to Piazza!

Assignment 1, due 28 January 2015. (solutions)
Assignment 2, due 16 February 2015. (solutions)
Assignment 3, due 16 March 2015.

Project

You can find more information on the project here, including some ideas and datasets.

Lectures

Lecture 1: Introduction to the Bayesian Method

Monday, 12 January 2015
lecture notes

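To make Bayes' theorem concrete, here is a minimal Python sketch of the classic diagnostic-testing calculation (the prevalence and test accuracies below are made-up numbers, not taken from the lecture):

    # Bayes' rule: P(sick | positive) = P(positive | sick) P(sick) / P(positive)
    prior = 0.01                             # assumed prevalence
    p_pos_sick, p_pos_healthy = 0.95, 0.05   # assumed test accuracies

    evidence = p_pos_sick * prior + p_pos_healthy * (1 - prior)
    posterior = p_pos_sick * prior / evidence
    print("P(sick | positive test) = %.3f" % posterior)   # ~0.161

Despite the accurate test, the posterior probability of being sick is only about 16%, because the prior prevalence is so low.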

Lecture 2: Bayesian Inference I (coin flipping)

Wednesday, 14 January 2015
lecture notes

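For concreteness, a minimal NumPy/SciPy sketch of the conjugate coin-flipping update (the Beta prior hyperparameters and data below are invented for illustration):

    import numpy as np
    from scipy import stats

    # Beta-binomial model: prior theta ~ Beta(alpha, beta);
    # likelihood: k heads observed in n independent flips.
    alpha, beta = 2.0, 2.0          # assumed prior hyperparameters
    n, k = 10, 7                    # toy data: 7 heads in 10 flips

    # Conjugacy gives the posterior in closed form: Beta(alpha + k, beta + n - k).
    posterior = stats.beta(alpha + k, beta + n - k)
    print("posterior mean:", posterior.mean())    # (alpha + k) / (alpha + beta + n)
    print("95% credible interval:", posterior.interval(0.95))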

Lecture 3: Bayesian Inference II (decision theory)

Wednesday, 21 January 2015
lecture notes

Additional Resources:

  • Book: Bishop PRML: Section 1.5 (Decision theory)
  • Book: Berger Chapter 1 (Basic concepts), Section 4.4 (Bayesian decision theory)
  • Book: Robert Section 4.2 (Bayesian decision theory)
  • Videos: YouTube user mathematicalmonk has a great series of machine-learning lectures available. Chapter 11 concerns decision theory.
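
As a small illustration of the basic recipe, choosing the action that minimizes posterior expected loss, here is a sketch with invented numbers (two states, two actions):

    import numpy as np

    # Bayesian decision theory: pick the action minimizing E[loss | data].
    posterior = np.array([0.3, 0.7])        # p(state | data), made-up values
    loss = np.array([[0.0, 10.0],           # loss[action, state]
                     [1.0,  0.0]])

    expected_loss = loss @ posterior        # posterior expected loss per action
    print(expected_loss, "-> choose action", np.argmin(expected_loss))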

Lecture 4: The Gaussian Distribution

Monday, 26 January 2015
Wednesday, 28 January 2015
lecture notes

Additional Resources:

  • Book: Bishop PRML: Section 2.3 (The Gaussian Distribution). This is a truly excellent and in-depth discussion!
  • Book: Barber BRML: Section 8.4 (Multivariate Gaussian).
  • Book/reference: Rasmussen and Williams GPML: Section A.2 (Gaussian Identities), available here. This is a good cheat sheet!
  • Notes: Chuong B. Do put together some notes on the multivariate Gaussian for the Stanford machine learning class here. These go into a bit more depth than my notes, if you want more detail.
  • Website: The Wikipedia articles on the normal distribution and the multivariate normal distribution are quite complete.
  • Video: YouTube user mathematicalmonk has a lecture on the multivariate normal available as well.
  • Video: Alexander Ihler also has a lecture on the multivariate normal, including information on how to sample from the distribution.
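
As a companion to the sampling discussion in the Ihler lecture above, here is a sketch of drawing from N(mu, Sigma) via a Cholesky factorization (the mean and covariance below are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    mu = np.array([1.0, -1.0])
    Sigma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])

    L = np.linalg.cholesky(Sigma)        # Sigma = L L^T
    z = rng.standard_normal((3, 2))      # three iid standard normal vectors
    samples = mu + z @ L.T               # each row is distributed N(mu, Sigma)
    print(samples)

The trick is that if z ~ N(0, I), then mu + L z has covariance L L^T = Sigma.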

Lecture 5: Bayesian Linear Regression

Monday, 2 February 2015
lecture notes

Additional Resources:

  • Book: Bishop PRML: Section 3.3 (Bayesian Linear Regression).
  • Book: Barber BRML: Section 18.1 (Regression with Additive Gaussian Noise).
  • Book: Rasmussen and Williams GPML: Section 2.1 (Weight-space View), available here.
  • Video: YouTube user mathematicalmonk has an entire section devoted to Bayesian linear regression. See ML 10.1–7 here.
  • Videos: Nando de Freitas has a series of lectures on Bayesian linear regression. Part one is here, and part two is here.
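
A compact sketch of the posterior computation (the standard result in Bishop PRML Section 3.3; the prior and noise precisions below are assumptions):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((20, 3))                # toy design matrix
    w_true = np.array([0.5, -1.0, 2.0])
    y = X @ w_true + 0.1 * rng.standard_normal(20)

    alpha, beta = 1.0, 100.0                        # prior / noise precisions
    # Posterior over weights is N(m, S) with
    #   S^{-1} = alpha I + beta X^T X,   m = beta S X^T y.
    S_inv = alpha * np.eye(3) + beta * (X.T @ X)
    m = beta * np.linalg.solve(S_inv, X.T @ y)
    print("posterior mean of w:", m)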

Lecture 6: Bayesian Model Selection

Wednesday, 4 February 2015
Monday, 9 February 2015
lecture notes

Additional Resources:

  • Book: Bishop PRML: Section 3.4 (Bayesian Model Comparison).
  • Book: Barber BRML: Chapter 12 (Bayesian Model Selection).
  • Book: MacKay ITILA: Chapter 28 (Model Comparison and Occam's Razor).
  • Video: YouTube user mathematicalmonk has a lecture about Bayesian model selection (some nearby videos are related as well).
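
To see the mechanics, here is a sketch comparing two models of a coin via their marginal likelihoods (the priors are invented; the binomial coefficient is common to both models and cancels in the Bayes factor, so it is omitted):

    import numpy as np
    from scipy.special import betaln

    def log_evidence(k, n, alpha, beta):
        # log p(data | model) for k heads in n flips under a Beta(alpha, beta) prior
        return betaln(alpha + k, beta + n - k) - betaln(alpha, beta)

    k, n = 7, 10
    m1 = log_evidence(k, n, 1.0, 1.0)     # vague prior: any bias plausible
    m2 = log_evidence(k, n, 50.0, 50.0)   # strong prior: coin is nearly fair
    print("log Bayes factor (M1 vs M2):", m1 - m2)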

Lecture 7: Bayesian Logistic Regression / The Laplace Approximation

Wednesday, 11 February 2015
lecture notes

Additional Resources:

  • Book: Bishop PRML: Chapter 4 (Linear Models for Classification).
  • Book: Barber BRML: Section 18.2 (Classification).
  • Book: Rasmussen and Williams GPML: Sections 3.1 and 3.2 (Classification Problems and Linear Models for Classification), available here.
  • Video: YouTube user mathematicalmonk has a lecture about Bayesian logistic regression.
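
A sketch of the Laplace approximation for one-dimensional Bayesian logistic regression: find the posterior mode, then approximate the posterior by a Gaussian whose variance is the inverse Hessian there (the data and the N(0, 1) prior are invented):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(2)
    x = rng.standard_normal(50)
    y = (rng.random(50) < 1 / (1 + np.exp(-2.0 * x))).astype(float)

    def neg_log_posterior(w):
        p = np.clip(1 / (1 + np.exp(-w * x)), 1e-12, 1 - 1e-12)
        return 0.5 * w**2 - np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    w_map = minimize(lambda w: neg_log_posterior(w[0]), [0.0]).x[0]
    p = 1 / (1 + np.exp(-w_map * x))
    hessian = 1.0 + np.sum(x**2 * p * (1 - p))   # prior + likelihood curvature
    print("approximate posterior: N(%.3f, %.3f)" % (w_map, 1 / hessian))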

Lecture 8: The Kernel Trick

Monday, 16 February 2015
lecture notes

Additional Resources:

  • Book: Rasmussen and Williams GPML: Chapter 2 through Section 2.1 (Weight-space View), available here.
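
The punchline of the weight-space view is that predictions depend on the inputs only through inner products, which can then be replaced by a kernel. A sketch with a polynomial kernel standing in for explicit features (the kernel choice and noise level are arbitrary):

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.standard_normal((15, 2))
    y = X[:, 0]**2 + X[:, 1] + 0.1 * rng.standard_normal(15)

    k = lambda A, B: (1 + A @ B.T)**2       # k(a, b) = (1 + a.b)^2
    sn2 = 0.01                              # assumed noise variance
    Xs = rng.standard_normal((3, 2))        # test points
    # Predictive mean written entirely in terms of kernel evaluations:
    pred = k(Xs, X) @ np.linalg.solve(k(X, X) + sn2 * np.eye(15), y)
    print(pred)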

Lecture 9: Gaussian Process Regression

Wednesday, 18 February 2015
lecture slides

Additional Resources/Notes:

  • Book: Rasmussen and Williams GPML: Sections 2.2–2.5, available here.
  • Book: Barber BRML: Chapter 19 (Gaussian processes).
  • Video: Nando de Freitas has a lecture here.
  • Video: Philipp Hennig has a series of lectures from the 2013 Machine Learning Summer School; part one is here. The slides, which have some cool animations, are available here.
  • Video: Carl Rasmussen has a two-part introduction to Gaussian processes here.
  • Video: David MacKay gave an introduction to Gaussian processes here.
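
The standard predictive equations from GPML Section 2.2 fit in a few lines; here is a sketch with a squared-exponential kernel (the hyperparameters and toy data are assumptions):

    import numpy as np

    def k(a, b, ell=1.0, sf=1.0):
        return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

    x = np.array([-2.0, -0.5, 1.0, 2.5]); y = np.sin(x)   # toy observations
    xs = np.linspace(-3, 3, 7)                            # test inputs
    sn2 = 1e-4                                            # noise variance

    K = k(x, x) + sn2 * np.eye(len(x))
    Ks = k(xs, x)
    mean = Ks @ np.linalg.solve(K, y)                  # posterior mean
    cov = k(xs, xs) - Ks @ np.linalg.solve(K, Ks.T)    # posterior covariance
    print(mean)
    print(np.diag(cov))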

Lecture 10: Kernels

Monday, 2 March 2015


Lecture 11: Bayesian Quadrature

Wednesday, 4 March 2015
lecture notes

Additional Resources/Notes:

  • Slides: David Duvenaud has a set of slides introducing Bayesian quadrature, available here.
  • Paper: Carl Rasmussen and Zoubin Ghahramani discuss Bayesian quadrature under the name "Bayesian Monte Carlo" in this paper. Many references therein are also interesting, especially the provocatively titled Monte Carlo is fundamentally unsound by Anthony O'Hagan.
  • Paper: Tom Minka wrote a report on "Deriving quadrature rules from Gaussian processes," available here.
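
A sketch of the "Bayesian Monte Carlo" estimator from the Rasmussen and Ghahramani paper: place a GP on the integrand and integrate the posterior mean in closed form (the kernel hyperparameters, nodes, and integrand here are invented):

    import numpy as np

    # Estimate Z = E[f(x)] for x ~ N(0, s2) with a squared-exponential GP on f.
    ell, sf2, s2 = 0.7, 1.0, 1.0
    x = np.linspace(-2, 2, 9)                 # nodes where f is evaluated
    f = np.cos(x)                             # toy integrand

    K = sf2 * np.exp(-0.5 * (x[:, None] - x[None, :])**2 / ell**2)
    # z_i = E[k(x, x_i)] has a closed form for this kernel and measure:
    z = sf2 * ell / np.sqrt(ell**2 + s2) * np.exp(-0.5 * x**2 / (ell**2 + s2))
    estimate = z @ np.linalg.solve(K + 1e-10 * np.eye(len(x)), f)
    print("BQ estimate:", estimate)           # exact answer is exp(-1/2) ~ 0.607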

Lecture 12: Bayesian Optimization

Monday, 16 March 2015
lecture notes

Additional Resources/Notes:

  • Tutorial: Eric Brochu, Vlad M. Cora, and Nando de Freitas have a tutorial on Bayesian optimization, available here.
  • Paper: Michael Osborne, Stephen J. Roberts, and I discuss the expected improvement approach to Bayesian optimization (with some tweaks/extensions) in this paper.
  • Paper: Niranjan Srinivas, Andreas Krause, Sham Kakade, and Matthias Seeger discuss the GP-UCB algorithm (including theoretical results!) in this landmark paper.
  • Paper: Jasper Snoek, Hugo Larochelle, and Ryan P. Adams discuss the AutoML application of Bayesian optimization here.
  • Slides: Ryan P. Adams has a set of tutorial slides covering many topics available here.
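
For reference, the expected-improvement acquisition function discussed in the papers above has a closed form under a Gaussian predictive distribution; a sketch for minimization (the posterior means and standard deviations are made up):

    import numpy as np
    from scipy.stats import norm

    def expected_improvement(mu, sigma, best):
        # EI(x) = E[max(best - f(x), 0)] with f(x) ~ N(mu, sigma^2)
        gamma = (best - mu) / sigma
        return sigma * (gamma * norm.cdf(gamma) + norm.pdf(gamma))

    mu = np.array([0.2, -0.1, 0.4])        # hypothetical GP posterior means
    sigma = np.array([0.5, 0.1, 0.9])      # hypothetical posterior sds
    print(expected_improvement(mu, sigma, best=0.0))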

Lecture 13: GP Classification / Assumed Density Filtering / Expectation Propagation

Monday, 23 March 2015
lecture notes

Additional Resources/Notes:

  • Book: Rasmussen and Williams GPML: Chapter 3 (Classification), especially Section 3.6 (Expectation Propagation), available here.
  • Book: Barber BRML: Section 28.8 (Expectation Propagation).
  • Book: Bishop PRML: Section 10.7 (Expectation Propagation).
  • Paper: Expectation propagation as a way of life.
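
The core of assumed density filtering is a moment-matching step; for the probit likelihood the required moments are available in closed form (see GPML Section 3.9). A sketch on a single latent variable with invented observations:

    import numpy as np
    from scipy.stats import norm

    def adf_update(mu, var, y):
        # Moment-match a Gaussian to N(f; mu, var) * Phi(y * f).
        z = y * mu / np.sqrt(1 + var)
        ratio = norm.pdf(z) / norm.cdf(z)
        new_mu = mu + y * var * ratio / np.sqrt(1 + var)
        new_var = var - var**2 * ratio * (z + ratio) / (1 + var)
        return new_mu, new_var

    mu, var = 0.0, 1.0                       # running Gaussian approximation
    for y in [+1.0, +1.0, -1.0]:             # sequence of binary observations
        mu, var = adf_update(mu, var, y)
        print("after y = %+g: N(%.3f, %.3f)" % (y, mu, var))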

Lecture 14: Monte Carlo, Sampling, Rejection Sampling

Wednesday, 25 March 2015
lecture slides (from Iain Murray's introduction at the 2009 Machine Learning Summer School)

Additional Resources/Notes:

  • Video: You can watch Iain Murray present the slides himself here.
  • Book: Barber BRML: Section 27.1 (Sampling: Introduction).
  • Book: Bishop PRML: Section 11.1 (Basic Sampling Algorithms).
  • Videos: YouTube user mathematicalmonk has a chapter devoted to sampling methods (#17), beginning here.
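
A sketch of rejection sampling for a Beta(2, 5) target using a uniform proposal (the envelope constant comes from the density's known mode at x = 1/5, inflated slightly):

    import numpy as np

    rng = np.random.default_rng(3)

    def target(x):                     # unnormalized Beta(2, 5) density
        return x * (1 - x)**4

    M = target(0.2) * 1.05             # envelope: bounds target over [0, 1]
    samples = []
    while len(samples) < 1000:
        x = rng.random()               # proposal q(x) = Uniform(0, 1)
        if rng.random() * M < target(x):
            samples.append(x)          # accept with probability target(x) / M
    print("sample mean:", np.mean(samples), "(true mean 2/7 ~ 0.286)")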

Lecture 15: Importance Sampling, MCMC

Monday, 30 March 2015
lecture slides (from Iain Murray's introduction at the 2009 Machine Learning Summer School)

Additional Resources/Notes:

  • Video: You can watch Iain Murray present the slides himself here.
  • Book: Barber BRML: Sections 27.4 (Markov Chain Monte Carlo (MCMC)), 27.3 (Gibbs Sampling), and 27.6 (Importance Sampling).
  • Book: Bishop PRML: Sections 11.2 (Markov Chain Monte Carlo) and 11.3 (Gibbs Sampling).
  • Videos: YouTube user mathematicalmonk has a chapter devoted to sampling methods (#17), beginning here.
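
A minimal random-walk Metropolis sketch targeting a standard normal (the proposal width is an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(4)
    log_p = lambda x: -0.5 * x**2           # unnormalized log target

    x, chain = 0.0, []
    for _ in range(5000):
        prop = x + 0.8 * rng.standard_normal()
        # Accept with probability min(1, p(prop) / p(x)); otherwise keep x.
        if np.log(rng.random()) < log_p(prop) - log_p(x):
            x = prop
        chain.append(x)
    print("mean %.3f, var %.3f" % (np.mean(chain), np.var(chain)))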

Lecture 16: The Kalman Filter

Monday, 6 April 2015
lecture notes

Additional Resources/Notes:

  • Book: Bishop PRML: Section 13.3 (Linear Dynamical Systems).
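
A sketch of the filter for a one-dimensional random walk observed in noise (the process and observation noise variances are assumptions):

    import numpy as np

    rng = np.random.default_rng(5)
    q, r = 0.1, 0.5                           # process / observation noise variances
    xs = np.cumsum(np.sqrt(q) * rng.standard_normal(50))   # latent states
    ys = xs + np.sqrt(r) * rng.standard_normal(50)         # noisy observations

    m, v = 0.0, 1.0                           # Gaussian prior on the initial state
    for y in ys:
        m_pred, v_pred = m, v + q             # predict (identity dynamics)
        gain = v_pred / (v_pred + r)          # Kalman gain
        m = m_pred + gain * (y - m_pred)      # update with the new observation
        v = (1 - gain) * v_pred
    print("final filtered estimate: N(%.3f, %.3f); truth %.3f" % (m, v, xs[-1]))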

Resources

Books

There is no required book for this course. That said, there are a wide variety of machine-learning books available, some of which are available for free online. The following books all have a Bayesian slant to them:

  • Pattern Recognition and Machine Learning (PRML) by Christopher M. Bishop. Covers many machine-learning topics thoroughly. Definite Bayesian focus. Can also be very mathematical and take some effort to read.
  • Bayesian Reasoning and Machine Learning (BRML) by David Barber. Geared (as much as a machine-learning book can be!) towards computer scientists. Lots of material on graphical models. Freely available online.
  • Gaussian Processes for Machine Learning (GPML) by Carl Rasmussen and Christopher Williams. Excellent reference for Gaussian processes. Freely available online.
  • Information Theory, Inference, and Learning Algorithms by David J. C. MacKay. Very strong focus on information theory. If you have a background in physics or are interested in information theory, this is the book for you. Freely available online.

For a more-frequentist perspective, check out the excellent The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Freely available online.

Other

The Matrix Cookbook by Kaare B. Petersen and Michael S. Pedersen can be incredibly useful for helping with tricky linear algebra problems!