Project overview
The OpenAI Gym CarRacing environment is a game where keyboard input is used to drive a car around a racetrack.
The environment provides a reward at each timestep based on how much new track is covered, with penalties for going off-track.
The goal is to maximize the total reward by completing laps as quickly as possible while staying on the track.
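To make the scoring concrete, here is a minimal sketch of how an episode score adds up. It assumes the standard Gym CarRacing reward scheme (+1000/N for each newly visited track tile, where N is the number of tiles on the track, and -0.1 per frame), which this overview does not spell out. The function name `episode_score` and the 300-tile track are hypothetical, chosen only for illustration.

```python
def episode_score(tiles_visited, total_tiles, frames):
    """Score under the assumed scheme: +1000/N per new tile, -0.1 per frame."""
    tile_reward = 1000.0 / total_tiles
    return tiles_visited * tile_reward - 0.1 * frames


# A full lap in 732 frames on a hypothetical 300-tile track.
full_lap = episode_score(tiles_visited=300, total_tiles=300, frames=732)

# The same number of frames, but only 80% of the tiles covered.
partial_lap = episode_score(tiles_visited=240, total_tiles=300, frames=732)

print(round(full_lap, 1), round(partial_lap, 1))
```

Under this assumption, every frame costs a little, so a faster lap scores higher and the practical ceiling sits just below 1000.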
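I trained two agents for this environment: one with reinforcement learning (RL) and one with imitation learning.
RL is much easier to implement here, but real-world tasks rarely come with a ready-made reward function, so
most of my effort has gone into improving the imitation learning network using
DAgger (Dataset Aggregation).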
The RL agent achieves an average score of about 710 out of a maximum possible 1000 (state-of-the-art scores are around 800-900). The imitation learning agent achieves a score of about 560, but it keeps getting better as I improve the demonstration data. Improving an imitation model with DAgger takes time because every iteration requires collecting new data with the expert in the loop.
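The DAgger loop can be sketched on a toy problem to show why the expert has to stay in the loop. This is not the project's training code: the 1-D "lateral offset" environment, the linear policy, and the names `expert_action`, `fit_linear_policy`, `rollout`, and `dagger` are all assumptions made for illustration.

```python
import random


def expert_action(offset):
    """Hypothetical expert: steer back toward the track centre (offset 0)."""
    return max(-1.0, min(1.0, -offset))


def fit_linear_policy(dataset):
    """Least-squares fit of action = w * offset on (offset, action) pairs."""
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den else 0.0


def rollout(policy, steps, rng):
    """Drive the toy 1-D environment with `policy`; return the visited states."""
    offset = rng.uniform(-2.0, 2.0)
    states = []
    for _ in range(steps):
        states.append(offset)
        offset += 0.5 * policy(offset) + rng.gauss(0.0, 0.1)
    return states


def dagger(iterations=5, steps=50, seed=0):
    rng = random.Random(seed)
    # Iteration 0: the expert drives and labels its own states.
    dataset = [(s, expert_action(s)) for s in rollout(expert_action, steps, rng)]
    w = fit_linear_policy(dataset)
    for _ in range(iterations):
        # The learner drives, so the states come from its own mistakes...
        visited = rollout(lambda s: w * s, steps, rng)
        # ...but the expert supplies the label for every visited state.
        dataset += [(s, expert_action(s)) for s in visited]
        # Retrain on the aggregated dataset.
        w = fit_linear_policy(dataset)
    return w, len(dataset)


weight, dataset_size = dagger()
print(weight, dataset_size)
```

The relabelling step is the expensive part: in CarRacing the expert is a human, so every iteration means another session of labelling the states the learner drove into.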