Source: Viaframe/Getty Images

I’m not a medical doctor, but I’m fascinated by biology, and especially by how animals work. The only real examples of intelligence we know of so far are animals. That’s where almost all our inspiration in AI comes from.

Animals, for the most part, have a sensory and/or nervous system, and sometimes a central system to analyze information deeply. They are agents evolving in an environment, whose only “goal” is to survive. This goal is implicit, since survival behaviour is embedded in the world: you can’t escape it (otherwise, you’re dead). …

Black holes are extremely dense objects. Their density is about 10¹⁰ times the density of our sun. Such a density means that close to their core, within a given radius, nothing can escape, not even light.

Many scientists imagined such ‘holes’ from which nothing can escape. Einstein was the first to find a good scientific theory to model them, a century ago. However, we have only recently been able to observe them directly (the famous photo of a black hole, April 2019).

In this article, we’ll discuss how a black hole can still emit…

Policy gradient algorithms are a big family of reinforcement learning algorithms, including REINFORCE, A2C/A3C, PPO and others. Q-learning is another family, with many significant improvements over the past few years: target networks, double DQN, experience replay/sampling …

I’ve always wondered whether it was possible to take the best of the two and create a better learning algorithm. This dream came true when I discovered Mean Actor-Critic [1].

Quick background

In reinforcement learning, an agent receives rewards, and the goal is to build an agent that maximizes its overall reward over an episode. This quantity is:
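As a minimal sketch, assuming the overall reward is the usual (optionally discounted) sum of per-step rewards, it could be computed as follows. The function name `episode_return` and the discount parameter `gamma` are illustrative assumptions, not taken from the article.

```python
def episode_return(rewards, gamma=1.0):
    """Sum of the rewards of one episode, discounted by gamma per step.

    Assumption: with gamma=1.0 this is the plain sum of rewards; with
    gamma < 1.0 later rewards count less than earlier ones.
    """
    total = 0.0
    # Accumulate backwards so each earlier step applies one more factor of gamma.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [1.0, 0.0, 2.0]                         # hypothetical rewards of one episode
undiscounted = episode_return(rewards)            # 1 + 0 + 2
discounted = episode_return(rewards, gamma=0.5)   # 1 + 0.5 * 0 + 0.25 * 2
```

An agent that maximizes this value prefers episodes with more reward overall, and with `gamma` below 1 it also prefers receiving that reward sooner.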

How to speed up your learning algorithms

TensorFlow is a tremendous tool for experimenting with deep learning algorithms. But to exploit the power of deep learning, you need to combine it with computing power and good engineering. You will eventually need to use multiple GPUs, and maybe even multiple processes, to reach your goals. I recommend reading the official TensorFlow tutorial about GPUs first.

One process, many GPUs

This is the most common case, since the majority of the deep learning community is doing supervised learning, with a big dataset (images, text, sound …) and many parameters. …

Learning a complete probability distribution through adversarial training

I am a machine learning student, and like all of us I heard about people working on GANs, GANs, GANs [1] everywhere. They could create images that look pretty much like real images. I didn’t really pay attention because I was (and still am) focused on reinforcement learning, not computer vision. I was critical of this “GAN rush”, until one day I dived in to see the point.

Imagine you have a finite set X of points in a space E, which are sampled from a probability distribution π on E. X is a subset of a larger set, A, which…

Grégoire Delétang

Research Engineer at DeepMind
