Instructor: Professor Roman Garnett

TA: Shali Jiang

Time/Location: Monday/Wednesday 4–5:30pm, Busch 100

Office hours (Garnett): Wednesday 5:30–6:30pm, Jolley Hall 504

Office hours (Jiang): TBA

syllabus

Piazza message board

This course will cover modern machine learning techniques from a Bayesian probabilistic perspective. Bayesian probability allows us to model and reason about all types of uncertainty. The result is a powerful, consistent framework for approaching many problems that arise in machine learning, including parameter estimation, model comparison, and decision making. We will begin with a high-level introduction to Bayesian inference, then proceed to cover more-advanced topics.

Please post questions (as a private message!) to Piazza!

Please post questions to Piazza!

Assignment 1, due **7 February 2018.**

Assignment 2, due **26 February 2018.**

Assignment 3, due **19 March 2018.**

You can find more information on the project here, including some ideas and datasets.

lecture notes

Additional Resources:

- Book: Bishop PRML: Section 1.2 (Probability theory)
- Book: Barber BRML: Chapter 1 (Probabilistic reasoning)
- Video: Bayesian Methods for Hackers (Cam Davidson-Pilon). Great high-level overview from an atypical perspective!
- Video: Introduction to Machine Learning (Nando de Freitas)
- Video: Bayesian Inference I (Zoubin Ghahramani) (the first 30 minutes or so)
- Video: Machine Learning Coursera course (Andrew Ng) The first week gives a good general overview of machine learning and the third week provides a linear-algebra refresher.

lecture notes

Additional Resources:

- Book: Bishop PRML: Section 2.1 (Binary variables)
- Website: Wikipedia has an article on checking whether a coin is fair.
- Website: Marcus Brinkmann (lambdafu) has put together a Python notebook on Bayesian coin flipping.

(lecture notes coming soon)

Additional Resources:

- Article: "The Fallacy of Placing Confidence in Confidence Intervals" available here or here

lecture notes

Additional Resources:

- Book: Bishop PRML: Section 1.5 (Decision theory)
- Book: Berger Chapter 1 (Basic concepts), Section 4.4 (Bayesian decision theory)
- Book: Robert Section 4.2 (Bayesian decision theory)
- Videos: YouTube user mathematicalmonk has a great series of machine-learning lectures available. Chapter 11 concerns decision theory.

lecture notes

Additional Resources:

- Book: Bishop PRML: Section 2.3 (The Gaussian Distribution). This is a truly excellent and in-depth discussion!
- Book: Barber BRML: Section 8.4 (Multivariate Gaussian).
- Book/reference: Rasmussen and Williams GPML: Section A.2 (Gaussian Identities), available here. This is a good cheat sheet!
- Notes: Chuong B. Do put together some notes on the multivariate Gaussian for the Stanford machine learning class here. These go a bit more in depth than my notes, if you want to see more details.
- Website: The Wikipedia articles on the normal distribution and the multivariate normal distribution are quite complete.
- Video: YouTube user mathematicalmonk has a lecture on the multivariate normal available as well.
- Video: Alexander Ihler also has a lecture on the multivariate normal, including information on how to sample from the distribution.

lecture notes

Additional Resources:

- Book: Bishop PRML: Section 3.3 (Bayesian Linear Regression).
- Book: Barber BRML: Section 18.1 (Regression with Additive Gaussian Noise).
- Book: Rasmussen and Williams GPML: Section 2.1 (Weight-space View), available here.
- Video: YouTube user mathematicalmonk has an entire section devoted to Bayesian linear regression. See ML 10.1–7 here.
- Videos: Nando de Freitas has a series of lectures on Bayesian linear regression. Part one is here, and part two is here.

lecture notes

Additional Resources:

- Book: Bishop PRML: Section 3.4 (Bayesian Model Comparison).
- Book: Barber BRML: Chapter 12 (Bayesian Model Selection).
- Book: MacKay ITILA: Chapter 28 (Occam's Razor and Model Comparison).
- Video: YouTube user mathematicalmonk has a lecture about Bayesian model selection (some nearby videos are related as well).

lecture notes

Additional Resources:

- Book: Bishop PRML: Chapter 4 (Linear Models for Classification).
- Book: Barber BRML: Section 18.2 (Classification).
- Book: Rasmussen and Williams GPML: Sections 3.1 and 3.2 (Classification Problems and Linear Models for Classification), available here.
- Video: YouTube user mathematicalmonk has a lecture about Bayesian logistic regression.

lecture notes

Additional Resources:

- Book: Rasmussen and Williams GPML: Chapter 2, up through Section 2.1 (Weight-space View), available here.

lecture slides

Additional Resources/Notes:

- Book: Rasmussen and Williams GPML: Sections 2.2–2.5, available here.
- Book: Barber BRML: Chapter 19 (Gaussian processes).
- Video: Nando de Freitas has a lecture here.
- Video: Philipp Hennig has a series of lectures from the 2013 Machine Learning Summer School; part one is here. The slides, which have some cool animations, are available here.
- Video: Carl Rasmussen has a two-part introduction to Gaussian processes here.
- Video: David MacKay gave an introduction to Gaussian processes here.

Resources/Notes:

- Book: Bishop PRML: Chapter 6 (Kernel Methods).
- Book: Barber BRML: Section 19.3 (Covariance Functions).
- Book: Rasmussen and Williams GPML: Chapter 4 (Covariance Functions), available here.
- Website: David Duvenaud has made a "kernel cookbook," available here. This webpage also became a chapter of his thesis.
- Website: Metacademy has pages on the kernel trick and constructing kernels.

lecture notes

Additional Resources/Notes:

- Tutorial: Eric Brochu, Vlad M. Cora, and Nando de Freitas have a tutorial on Bayesian optimization, available here.
- Paper: Michael Osborne, Stephen J. Roberts, and I discuss the expected improvement approach to Bayesian optimization (with some tweaks/extensions) in this paper.
- Paper: Niranjan Srinivas, Andreas Krause, Sham Kakade, and Matthias Seeger discuss the GP-UCB algorithm (including theoretical results!) in this landmark paper.
- Paper: Jasper Snoek, Hugo Larochelle, and Ryan P. Adams discuss the AutoML application of Bayesian optimization here.
- Slides: Ryan P. Adams has a set of tutorial slides covering many topics available here.

lecture notes

Additional Resources/Notes:

- Slides: David Duvenaud has a set of slides introducing Bayesian quadrature, available here.
- Paper: Carl Rasmussen and Zoubin Ghahramani discuss Bayesian quadrature under the name "Bayesian Monte Carlo" in this paper. Many references therein are also interesting, especially the provocatively titled Monte Carlo is fundamentally unsound by Anthony O'Hagan.
- Paper: Tom Minka wrote a report on "Deriving quadrature rules from Gaussian processes," available here.

lecture notes

Additional Resources/Notes:

- Book: Rasmussen and Williams GPML: Chapter 3 (Classification), especially Section 3.6 (Expectation Propagation), available here.
- Book: Barber BRML: Section 28.8 (Expectation Propagation).
- Book: Bishop PRML: Section 10.7 (Expectation Propagation).
- Paper: Expectation propagation as a way of life.

lecture slides (from Iain Murray's introduction at the 2009 Machine Learning Summer School)

Additional Resources/Notes:

- Video: You can watch Iain Murray present the slides himself here.
- Book: Barber BRML: Sections 27.4 (Markov Chain Monte Carlo (MCMC)), 27.3 (Gibbs Sampling), and 27.6 (Importance Sampling).
- Book: Bishop PRML: Sections 11.2 (Markov Chain Monte Carlo) and 11.3 (Gibbs Sampling).
- Videos: YouTube user mathematicalmonk has a chapter devoted to sampling methods (#17), beginning here.

Additional Resources/Notes:

- Book: Berger, Statistical Decision Theory and Bayesian Analysis: Section 7.4 (Bayesian Sequential Analysis).
- Book: DeGroot, Optimal Statistical Decisions: Part 4 (Sequential Decisions).
- Paper: Garnett, et al., Bayesian Optimal Active Search and Surveying (ICML 2012).

lecture notes

Additional Resources/Notes:

- Book: Bishop PRML: Section 13.3 (Linear Dynamical Systems).

There is no required book for this course. That said, there are a wide variety of machine-learning books available, some of which are available for free online. The following books all have a Bayesian slant to them:

- *Pattern Recognition and Machine Learning* (PRML) by Christopher Bishop. Covers many machine-learning topics thoroughly. Definite Bayesian focus. Can also be very mathematical and take some effort to read.
- *Bayesian Reasoning and Machine Learning* (BRML) by David Barber. Geared (as much as a machine-learning book can be!) towards computer scientists. Lots of material on graphical models. Freely available online.
- *Gaussian Processes for Machine Learning* (GPML) by Carl Rasmussen and Christopher Williams. Excellent reference for Gaussian processes. Freely available online.
- *Information Theory, Inference, and Learning Algorithms* (ITILA) by David MacKay. Very strong focus on information theory. If you have a background in physics or are interested in information theory, this is the book for you. Freely available online.

- I will post the source for lecture notes, demo code, etc. on this GitHub page. Even the source for the syllabus and this website are there.
- I have created a Piazza message board for this class. Please post any questions about the homework, etc. to the message board! Chances are that someone else has the same question and we can all benefit from a public discussion. If you have a question just for me and/or me and the TA, please also post this to Piazza rather than emailing us directly; you should be able to mark your message appropriately to keep it private.
- Metacademy's roadmap to Bayesian machine learning. This is a great resource for finding additional materials related to essentially every subject we will cover in this course.
- There are several relevant courses available on Coursera. Coursera gives you access to video lecture series, often from world experts, all available for free! In particular, the following three courses are all presented by leaders in the field:
  - Andrew Ng's Machine Learning course (Stanford University)
  - Pedro Domingos's Machine Learning course (University of Washington)
  - Daphne Koller's Probabilistic Graphical Models course (Stanford University)

The Matrix Cookbook by Kaare B. Petersen and Michael S. Pedersen can be incredibly useful for helping with tricky linear algebra problems!