CSE 7111: Seminar on Machine Learning

Wednesdays, 4:00-6:00 p.m.
Lopata 507
Contacts: Bill Smart, Rob Glaubius

Course Description

This seminar will focus on recent advances in reinforcement learning. Students will read, present, discuss, and implement work from the research literature, with emphasis on research that applies insights from topology to reinforcement learning problems. Prerequisites: CSE 517A or permission of instructor.

Schedule

January 25    Bill Smart        Reinforcement Learning: A Survey
February 1    Rob Glaubius      Q-Learning (to be handed out)
February 8    Tim Gatzke        Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
February 15   Andrew Levine     Reinforcement Learning with Soft State Aggregation
February 22   Motoi Namihira    Online Learning with Random Representations
March 1       Adam Covington    Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems
March 8
March 15      Spring Break
March 22      Moshe Looks       Kernel-Based Reinforcement Learning
March 29      Robert Glaubius   Reinforcement Learning with Function Approximation Converges to a Region
April 5       Tim Gatzke        Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
April 12      Fritz Heckel      Toward a Topological Theory of Relational Reinforcement Learning for Navigation Tasks
April 19      Motoi Namihira    Value Function Approximation using Diffusion Wavelets and Laplacian Eigenfunctions
April 26      Monika Ray        Hierarchical Control of MDPs
May 3         Adam Covington    Machine Learning for Fast Quadrupedal Locomotion
              Andrew Scully     Temporal Difference Learning and TD-Gammon

Papers

C. Watkins and P. Dayan. "Q-Learning". Machine Learning 8:279--292, 1992.

T. Jaakkola, M. I. Jordan, and S. P. Singh. "Convergence of Stochastic Iterative Dynamic Programming Algorithms". Advances in Neural Information Processing Systems 6:703--710, 1993.

R. S. Sutton and S. D. Whitehead. "Online Learning with Random Representations". Machine Learning: Proceedings of the Tenth International Conference, 314--321, 1993.

S. P. Singh, T. Jaakkola, and M. I. Jordan. "Reinforcement Learning with Soft State Aggregation". Advances in Neural Information Processing Systems 7:361--368, 1995.

G. Tesauro. "Temporal Difference Learning and TD-Gammon". Communications of the ACM 38:58--68, 1995.

R. S. Sutton. "Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding". Advances in Neural Information Processing Systems 8:1038--1044, 1996.

L. P. Kaelbling, M. L. Littman, and A. W. Moore. "Reinforcement Learning: A Survey". Journal of Artificial Intelligence Research 4:237--285, 1996.

A. McGovern, D. Precup, B. Ravindran, S. Singh, and R. S. Sutton. "Hierarchical Control of MDPs". Proceedings of the Yale Workshop on Adaptive and Learning Systems, 1998.

M. Kearns and S. Singh. "Near-Optimal Reinforcement Learning in Polynomial Time". Proceedings of the 15th International Conference on Machine Learning (ICML), 1998.

M. Kearns and S. Singh. "Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms". Advances in Neural Information Processing Systems, 1999.

R. Munos and A. Moore. "Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems". International Joint Conference on Artificial Intelligence, 1999.

G. Gordon. "Reinforcement Learning with Function Approximation Converges to a Region". Advances in Neural Information Processing Systems, 2000.

R. Munos and A. Moore. "Variable Resolution Discretization in Optimal Control". Machine Learning 49:1--24, 2002.

D. Ormoneit and S. Sen. "Kernel-Based Reinforcement Learning". Machine Learning 49:161--178, 2002.

M. G. Lagoudakis and R. Parr. "Least-squares Policy Iteration". Journal of Machine Learning Research 4:1107--1149, 2003.

N. Kohl and P. Stone. "Machine Learning for Fast Quadrupedal Locomotion". Proceedings of the National Conference on Artificial Intelligence (AAAI), 2004.

T. Lane and A. Wilson. "Toward a Topological Theory of Relational Reinforcement Learning for Navigation Tasks". Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference, 2005.

S. Mahadevan. "Samuel meets Amarel: Automating Value Function Approximation using Global State Space Analysis". Proceedings of the National Conference on Artificial Intelligence (AAAI), 2005.

S. Mahadevan and M. Maggioni. "Value Function Approximation using Diffusion Wavelets and Laplacian Eigenfunctions". Advances in Neural Information Processing Systems, 2005.

Additional Material