RL Algorithm Comparison

As part of an internship in the AI department of China's second-largest cellphone company, I was tasked with implementing and comparing the performance of several reinforcement learning algorithms (including DQN, PPO, and A2C) on a contextual bandit problem. The specific game chosen was the WeChat game 跳一跳 ("Jump Jump").

Check it out at: https://github.com/MyJumperBroke23/WechatJump_RL
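
For readers unfamiliar with the contextual bandit framing mentioned above, here is a minimal toy sketch. It is not the code from the repository. It assumes each jump is an independent one-step decision: the context is a discretized distance to the next platform, the action is a discretized press duration, and the reward is 1 for a successful landing. The names `N_BINS`, `N_ACTIONS`, `sample_context`, `reward`, `train`, and `greedy_policy` are hypothetical, and the reward rule is a stand-in for the real game.

```python
import random

N_BINS = 10      # assumed number of distance buckets (the context)
N_ACTIONS = 10   # assumed number of press-duration buckets (the action)


def sample_context(rng):
    """Hypothetical context: bucket index of the distance to the next platform."""
    return rng.randrange(N_BINS)


def reward(context, action):
    """Toy reward: 1.0 if the press-duration bucket matches the distance bucket."""
    return 1.0 if action == context else 0.0


def train(steps=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy contextual bandit with a tabular running-mean estimate."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_BINS)]
    counts = [[0] * N_ACTIONS for _ in range(N_BINS)]
    for _ in range(steps):
        c = sample_context(rng)
        if rng.random() < epsilon:
            a = rng.randrange(N_ACTIONS)  # explore
        else:
            a = max(range(N_ACTIONS), key=lambda i: q[c][i])  # exploit
        r = reward(c, a)
        # One-step update: there is no next state, so no bootstrapping.
        counts[c][a] += 1
        q[c][a] += (r - q[c][a]) / counts[c][a]
    return q


def greedy_policy(q):
    """Best-known action for each context bucket."""
    return [max(range(N_ACTIONS), key=lambda a: q[c][a]) for c in range(N_BINS)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Because every episode is a single step, algorithms such as DQN, PPO, and A2C reduce to learning a mapping from context to action value or action probability in this setting; the tabular estimate here is only the simplest instance of that idea.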