Summary
- The objective was to train an RL agent to play the World’s Hardest Game, which essentially comes down to reaching a goal point among arbitrarily moving obstacles.
- Created a simple MLP that maps the n-dimensional state space to an m-dimensional action space, whose output determines the action the RL agent should take.
- Created a custom reward function that penalizes the agent for being idle or getting stuck on an obstacle and rewards it for reaching the goal.
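
The two components above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the layer sizes, the ReLU hidden layer, the greedy action choice, the reward constants, and every name (`init_mlp`, `forward`, `select_action`, `compute_reward`, `blocked_by_obstacle`) are assumptions made for the example.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create (weights, biases) pairs for layer sizes such as [n, hidden, m]."""
    rng = random.Random(seed)
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def forward(layers, state):
    """Map an n-dimensional state to m action scores (ReLU on hidden layers)."""
    x = list(state)
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:  # no activation on the output layer
            x = [max(0.0, v) for v in x]
    return x


def select_action(scores):
    """Greedy choice (an assumption): index of the highest-scoring action."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical reward constants; the values used in the project are not stated.
GOAL_REWARD = 10.0
OBSTACLE_PENALTY = -5.0
IDLE_PENALTY = -0.1


def compute_reward(moved, blocked_by_obstacle, reached_goal):
    """Reward reaching the goal; penalize idling and obstacle trouble."""
    if reached_goal:
        return GOAL_REWARD
    reward = 0.0
    if blocked_by_obstacle:
        reward += OBSTACLE_PENALTY
    if not moved:
        reward += IDLE_PENALTY
    return reward
```

For example, `select_action(forward(init_mlp([4, 8, 5]), [0.1, 0.2, 0.3, 0.4]))` returns an action index between 0 and 4 for a 4-dimensional state. Keeping the idle penalty much smaller than the obstacle penalty is one plausible design choice: it nudges the agent to keep moving without making movement near obstacles more costly than standing still.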