Balancing a cart pole with policy gradients algorithm

In this post we are going to analyze a type of reinforcement learning algorithm called policy gradients. In the field of reinforcement learning, we have an agent making observations and taking actions within an environment in order to receive some rewards and its main objective is to learn a policy such that its actions will maximize its expected long-term rewards. In this case, our agent … Continue reading Balancing a cart pole with policy gradients algorithm