[UNDER CONSTRUCTION]
Reinforcement learning algorithms often try to approximate a solution over the entire state space. We try to study cases where the optimal policy can be learned only
iLQG, DDP, iLDP policy gradient (Munos)
[UNDER CONSTRUCTION]
Reinforcement learning algorithms often try to approximate a solution over the entire state space. We try to study cases where the optimal policy can be learned only
iLQG, DDP, iLDP policy gradient (Munos)