Challenging the memory of RL agents

Reinforcement learning agents are usually trained to maximize their rewards by taking actions in an environment following a Markov Decision Process (MDP). A Markov Decision Process is simply a model that defines the state of an environment by its current state, actions, and rewards, including also its possible future states. The key point is that agents know information from the present and can approximately predict … Continue reading Challenging the memory of RL agents

Exploring Transformer Model for Reinforcement Learning

MLP is widely used in RL to implement a learnable agent in a certain environment trained according to a specific algorithm. Recent works in NLP have already proved that Transformer can replace and outperform MLP in most tasks leading to expanding its utilization in areas outside of NLP such as Computer Vision. However, in RL the Transformer architecture is still not widely adopted, and agents … Continue reading Exploring Transformer Model for Reinforcement Learning

Learning to imitate: using GAIL to imitate PPO

Usually, in reinforcement learning, the agent is provided with a reward according to the action it executes to interact with the environment and its goal is to optimize its total cumulative reward over multiple steps. Actions are selected according to some observations the agent has to learn to interpret. In this post, we are going to explore a new field called imitation learning: the agent … Continue reading Learning to imitate: using GAIL to imitate PPO

SeqGAN: text generation with generative models

In this post, we propose to review the recent history of research in the Natural Language Generation (NLG) tasks of the Natural Language Processing domain. Realistic human-like language generation has been a challenge for researchers that has recently come into greater focus with the release of large neural models for NLP like the GPT and BERT models. In this post, we propose to focus on … Continue reading SeqGAN: text generation with generative models