Washington University in St. Louis
Department of Computer Science and Engineering

CSE 417A: Introduction to Machine Learning

Fall 2015



This course is a broad introduction to machine learning, covering both supervised and unsupervised learning. Topics include generative and discriminative techniques for classification (likely including regression, Naive Bayes, decision trees, neural networks, nearest-neighbor methods, support vector machines, boosting, and random forests), clustering, and dimensionality reduction. Some of these topics overlap with the 500-level courses on Artificial Intelligence and Machine Learning, but this class covers the material at a more elementary level.


Instructor: Sanmay Das
Office: Jolley 510
Office hours: Thursdays 11:30-12:30, and by appointment.

TAs: There are several TAs for the class. Hao Yan (haoyan at wustl) will be the head TA and will conduct various recitation sessions as needed. Alicia Sun (sun.yi at wustl), Tong Mu (mutong at wustl), and Elizabeth Halper (elizabeth.halper at wustl) will also serve as TAs. The TAs will hold regular office hours starting in the second week of class, grade homeworks, and answer questions on Piazza.

TA office hours will be held in Urbauer 114, the ACM Lounge. The complete office hour schedule is as follows (Sanmay's office hours will be in Jolley 510):
Mondays 1-3 (Lizzie), 3-5 (Hao)
Tuesdays 4-6 (Alicia)
Wednesdays 3-5 (Tong)
Thursdays 11:30-12:30 (Sanmay)


Detailed policies are in the official syllabus. A few points to highlight: please read and understand the collaboration policy and the late day policy. There will be two in-class exams, each covering approximately half the course material, and no separate final exam.


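There are two textbooks for this class, cited in the schedule as AML and HTF: Learning From Data by Abu-Mostafa, Magdon-Ismail, and Lin (AML), and The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman (HTF).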


Prerequisites: CSE 241 and ESE 326 (or Math 320) or equivalents; linear algebra and multivariable calculus. If you do not have a basic background in CS through data structures and algorithms, or if you are not comfortable with calculus and probability, you may have a hard time in this class.

Date | Topics | Readings | Extras
Aug 25 | Introduction. Course policies. Course overview. | Slides; AML 1.1, 1.2 |
Aug 27 | The perceptron learning algorithm. Is learning feasible? | AML Section 1.1.2, Problem 1.3, Section 1.3.1 |
Sep 1 | Generalizing outside the training set. Error and noise. | AML 1.3, 1.4 | HW1 out (due Sep 8)
Sep 3 | Infinite hypothesis spaces. VC dimension. | AML 2.1.1-2.1.3 |
Sep 8 | The VC generalization bound. | AML 2.1.4, 2.2 | HW2 out (due Sep 17)
Sep 10 | The bias-variance tradeoff. | AML 2.3.1 |
Sep 15 | Bias-variance tradeoff, continued. Learning linear models with noisy data. | AML 2.3.2, 3.1 |
Sep 17 | Linear regression. | AML 3.2 |
Sep 22 | Logistic regression and gradient descent. | AML 3.3 |
Sep 24 | No class, Sanmay will be at a conference. | | HW3 out (due Oct 6)
Sep 29 | Nonlinear transformations. Overfitting. | AML 3.4, 4.1 |
Oct 1 | Overfitting. Intro to regularization. | AML 4.1, 4.2.1 | Malik Magdon-Ismail's slides on overfitting
Oct 6 | Regularization, contd. Validation. | AML 4.2, 4.3 | HW4 out (due Oct 13)
Oct 8 | Cross-validation. Occam's razor and sample selection bias. | AML 4.3, 5.1, 5.2 | Malik Magdon-Ismail's slides on validation
Oct 13 | Data snooping. Midterm review. | AML 5.3 |
Oct 15 | In-class exam #1 | |
Oct 20 | Exam discussion. Intro to decision trees. | HTF 9.2 |
Oct 22 | Decision trees, contd. | HTF 9.2 | HW5 out (due Oct 29)
Oct 27 | Pruning. Bagging. | HTF 9.2, 8.7 |
Oct 29 | Random forests. Intro to boosting. | HTF 8.7, 15.1-15.3, 10.1 |
Nov 3 | Guest lecture by Prof. Brendan Juba on connections between machine learning and cryptography. | | HW6 out (due Nov 17)
Nov 5 | AdaBoost. | HTF 10.1. This short proof of the training error theorem. |
Nov 10 | Gradient boosting. Intro to neural networks. | HTF 10.2-10.5, 10.9, 10.10, 11.1, 11.3 |
Nov 12 | Class canceled. | |
Nov 17 | Learning neural networks. | HTF 11.4, 11.5 |
Nov 19 | Support vector machines. | HTF 12.1-12.3.1 | HW7 out (due Dec 1)
Nov 24 | Nonparametric methods, nearest neighbors, and k-d trees. | HTF 13.3 (except 13.3.3), Wikipedia article on k-d trees |
Dec 1 | Brief overview of unsupervised learning (k-means, Expectation Maximization, hierarchical agglomerative clustering). | HTF 14.3.4-14.3.7, 8.5.1, 14.3.12 |
Dec 3 | In-class exam #2 | |