Adversarial policies: attacking TicTacToe multi-agent environment

In a previous post we discussed how an attacker can fool image classification models by injecting adversarial noise directly into the input images. Similarly, in this post we are going to see how it is possible to attack deep reinforcement learning agents in multi-agent environments (where two or more agents interact within the same environment) such that one or more agents are …

Teaching AI to play Snake with Reinforcement Learning

Gaming and artificial intelligence are two of the most fascinating fields of computer science. Gaming traces its origins back to the 1970s, when consoles such as the Atari 2600, along with graphics on computer screens and home computer games, were introduced to the general public, giving birth to different kinds of arcade games like Pong and Pac-Man. In …

Introduction to Deep Reinforcement Learning

Deep Reinforcement Learning combines two well-known machine learning approaches: Deep Learning and Reinforcement Learning. Its main goal is to create a single agent able to handle any human-level task while achieving super-human results on it. A famous AI implementing this technique is AlphaGo, which in March 2016 defeated, for the first time in history, a 9-dan …

Balancing a cart pole with policy gradients algorithm

In this post we are going to analyze a type of reinforcement learning algorithm called policy gradients. In reinforcement learning, an agent makes observations and takes actions within an environment in order to receive rewards; its main objective is to learn a policy such that its actions maximize its expected long-term reward. In this case, our agent …