Q-learning demo implemented in JavaScript and three.js. R2D2 has no knowledge of the game dynamics: it can only see the 3 blocks around it and is only notified when it is over a reward block (green) or a punishment block (black). Over time, R2D2 learns that green blocks are good, how to move out of the way of black blocks, and how to optimize for green blocks. Learning takes time, but eventually it becomes clear that R2D2 has mastered the game.
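
For readers new to the technique, here is a minimal sketch of a tabular Q-learning agent in plain JavaScript. It is not taken from the demo's source. The action set, the string state key (one label per visible block), the reward values, and the `alpha`, `gamma`, and `epsilon` settings are all assumptions chosen for illustration, and the names `createAgent`, `chooseAction`, and `update` are hypothetical.

```javascript
// Illustrative sketch only: names, actions, and hyperparameters are assumptions,
// not the demo's actual implementation.
const ACTIONS = ["left", "stay", "right"];

function createAgent({ alpha = 0.1, gamma = 0.9, epsilon = 0.1, random = Math.random } = {}) {
  // Q-table: state key -> array holding one estimated value per action.
  const q = new Map();

  // Return the action values for a state, creating a zeroed row on first visit.
  function values(state) {
    if (!q.has(state)) q.set(state, new Array(ACTIONS.length).fill(0));
    return q.get(state);
  }

  // Index of the highest-valued action (ties go to the lowest index).
  function greedy(state) {
    const v = values(state);
    let best = 0;
    for (let i = 1; i < v.length; i++) {
      if (v[i] > v[best]) best = i;
    }
    return best;
  }

  // Epsilon-greedy: explore with probability epsilon, otherwise exploit.
  function chooseAction(state) {
    if (random() < epsilon) return Math.floor(random() * ACTIONS.length);
    return greedy(state);
  }

  // Q-learning update:
  // Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
  function update(state, action, reward, nextState) {
    const v = values(state);
    const maxNext = Math.max(...values(nextState));
    v[action] += alpha * (reward + gamma * maxNext - v[action]);
  }

  return { chooseAction, greedy, update, values };
}

// Hypothetical usage: the state key encodes the 3 visible blocks.
const agent = createAgent();
const state = "empty|green|empty";
const action = agent.chooseAction(state);
agent.update(state, action, 1, "empty|empty|black"); // assumed reward of +1 for green
```

Because the agent only ever receives a state key and a reward, it needs no model of the game dynamics, which matches the setup described above. A punishment block would simply be passed in as a negative reward under the same assumptions.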