Wednesdays, 4:00-6:00 p.m.
Lopata 507
Contacts: Bill Smart, Rob Glaubius
This seminar focuses on recent advances in reinforcement learning. Students will read, present, discuss, and implement work from the research literature, with emphasis on research that applies insights from topology to reinforcement learning problems. Prerequisites: CSE 517A or permission of instructor.
C. Watkins and P. Dayan. "Q-Learning". Machine Learning 8:279--292, 1992.
T. Jaakkola, M. I. Jordan, and S. P. Singh. "Convergence of Stochastic Iterative Dynamic Programming Algorithms". Advances in Neural Information Processing Systems 6:703--710, 1993.
R. S. Sutton and S. D. Whitehead. "Online Learning with Random Representations". Machine Learning: Proceedings of the Tenth International Conference, 314--321, 1993.
S. P. Singh, T. Jaakkola, and M. I. Jordan. "Reinforcement Learning with Soft State Aggregation". Advances in Neural Information Processing Systems 7:361--368, 1995.
G. Tesauro. "Temporal Difference Learning and TD-Gammon". Communications of the ACM, 38:58--68, 1995.
R. S. Sutton. "Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding". Advances in Neural Information Processing Systems 8:1038--1044, 1996.
L. P. Kaelbling, M. L. Littman, and A. W. Moore. "Reinforcement Learning: A Survey". Journal of Artificial Intelligence Research 4:237--285, 1996.
A. McGovern, D. Precup, B. Ravindran, S. Singh, and R. S. Sutton. "Hierarchical Control of MDPs". Proceedings of the Yale Workshop on Adaptive and Learning Systems, 1998.
M. Kearns and S. Singh. "Near-Optimal Reinforcement Learning in Polynomial Time". Proceedings of the 15th International Conference on Machine Learning (ICML), 1998.
M. Kearns and S. Singh. "Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms". Advances in Neural Information Processing Systems, 1999.
R. Munos and A. Moore. "Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems". International Joint Conference on Artificial Intelligence, 1999.
G. Gordon. "Reinforcement Learning with Function Approximation Converges to a Region". Advances in Neural Information Processing Systems, 2000.
R. Munos and A. Moore. "Variable Resolution Discretization in Optimal Control". Machine Learning 49:1--24, 2002.
D. Ormoneit and S. Sen. "Kernel-Based Reinforcement Learning". Machine Learning 49:161--178, 2002.
M. G. Lagoudakis and R. Parr. "Least-squares Policy Iteration". Journal of Machine Learning Research 4:1107--1149, 2003.
N. Kohl and P. Stone. "Machine Learning for Fast Quadrupedal Locomotion". Proceedings of the National Conference on Artificial Intelligence (AAAI), 2004.
T. Lane and A. Wilson. "Toward a Topological Theory of Relational Reinforcement Learning for Navigation Tasks". Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference, 2005.
S. Mahadevan. "Samuel meets Amarel: Automating Value Function Approximation using Global State Space Analysis". Proceedings of the National Conference on Artificial Intelligence (AAAI), 2005.
S. Mahadevan and M. Maggioni. "Value Function Approximation using Diffusion Wavelets and Laplacian Eigenfunctions". Advances in Neural Information Processing Systems, 2005.
R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
This is the Sutton and Barto book mentioned during the organizational meeting. It provides textbook coverage of RL from the ground up, and is a fairly easy read.