Visualizing Loss Landscape of GAIL

This post visualizes the loss landscapes of imitation policies (IL policies) trained with GAIL, and of their discriminators, in three common environments: CartPole, LunarLander, and Walker2d from MuJoCo. The expert policy for CartPole and LunarLander is a simple Double DQN, while the expert for Walker2d, which supports continuous actions, is a DDPG policy. The imitation policies are the same policies employed by their …

SeqGAN: text generation with generative models

This post reviews the recent history of research on Natural Language Generation (NLG) tasks within the Natural Language Processing (NLP) domain. Realistic, human-like language generation has long been a challenge for researchers, and it has recently come into greater focus with the release of large neural models for NLP such as GPT and BERT. In this post we propose to focus on …