[UNDER CONSTRUCTION]

Reinforcement learning algorithms often try to approximate a solution over the entire state space. We try to study cases where the optimal policy can be learned only

iLQG, DDP, iLDP policy gradient (Munos)

Page last modified on September 11, 2007, at 10:37 PM.