The Parrot has Landed


My 459-day journey from blog to book and back again.

David Foster
7 min read · Jul 20, 2019

--

It’s 20:17 UTC on July 20th 2019.

50 years ago to the minute, the Eagle, the lunar module carrying Neil Armstrong and Buzz Aldrin, landed on the Moon. A spectacular feat of engineering, courage and sheer determination.

Fast forward 50 years and the processing power of the Apollo Guidance Computer (AGC) that took those men to the surface of the Moon is now in your pocket, multiple times over. In fact, an iPhone 6 could be used to guide 120 million Apollo 11 spacecraft to the Moon, all at the same time.

This factoid doesn’t really do justice to the brilliance of the AGC. Given Moore’s law, you could pick anything computational and say that 50 years later, there will exist a machine that can run it 2²⁵ times faster.

It was people like Margaret Hamilton, who led the team that wrote the software for the AGC, who chose not to see the hardware limitations of the day as a barrier, but instead as a challenge. She used the resources available to her at the time to achieve the unthinkable.

Margaret Hamilton with the software for the AGC (source: Science History Images)

Which brings me to…

The Generative Deep Learning Book

459 days ago, I received a message from O’Reilly Media asking if I’d be interested in writing a book. It seemed like a good idea at the time so I said yes and decided to write an up-to-date guide to Generative Modelling — in particular, a practical manual on how to build state-of-the-art deep learning models that can paint, write, compose and play.

Crucially, I wanted this book to give the reader an in-depth understanding of generative deep learning and to show how to build models that are capable of amazing things, without requiring heaps of expensive and time-consuming computational resources.

I’m pleased to say that the resulting book is now available in print through Amazon and also in electronic format through the O’Reilly website.

It’s my firm belief that the secret to mastering anything technical is to first tackle small problems, but in such detail that you understand the rationale behind every single line of code.

If you start with huge datasets and models that take a day to run instead of an hour, then you don’t learn anything more — you just learn 24 times more slowly.

If the lunar landing has taught us anything, it’s that truly amazing things can be achieved with very little computational power, and my aim is that, after reading this book, you feel the same way about generative modelling.

What’s with the parrot?

The great thing about writing for O’Reilly is that they draw you an animal to feature on the front cover of your book — I got given a painted parakeet, who I’ve affectionately named Neil Wingstrong.

Neil Wingstrong the Parakeet

So now that the parrot has landed, what can you expect from the book?

What’s the book about?

This book is a hands-on guide to generative modelling.

It takes you through the very rudiments of how to build basic generative models, then builds up to more complex models step-by-step — all the time with practical examples, architecture diagrams and code.

This book is for anyone who wants to understand the current hype around generative modelling at a deeper level. There’s no prior knowledge of deep learning required and all code examples are in Python.

What’s covered?

I’ve tried to cover all of the key generative modelling developments from the last 5 years. In particular, all of those shown on the timeline below.

The book is divided into two parts and the chapter outline is given below:

Part 1: Introduction to Generative Deep Learning

The first four chapters of the book aim to introduce the core techniques that you’ll need to start building generative deep learning models.

1. Generative Modelling

We take a broad look at the field of generative modelling and consider the type of problem that we are trying to solve from a probabilistic perspective. We then explore our first example of a basic probabilistic generative model and analyse why deep learning techniques may need to be deployed as the complexity of the generative task grows.
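To give a flavour of what such a basic probabilistic generative model looks like, here is a minimal sketch in plain Python: fit the empirical (maximum-likelihood) categorical distribution to some observed data, then sample new data points from it. The dataset of parakeet colours is entirely made up for illustration.

```python
import random
from collections import Counter

# A toy "dataset": observed colours of painted parakeets (invented data).
observations = ["green", "green", "blue", "green", "red", "blue", "green"]

# Fit the simplest possible generative model: the empirical
# (maximum-likelihood) categorical distribution over colours.
counts = Counter(observations)
total = sum(counts.values())
p = {colour: n / total for colour, n in counts.items()}

# Generate: sample new colours from the fitted distribution.
random.seed(0)
colours, weights = zip(*p.items())
samples = random.choices(colours, weights=weights, k=5)
print(p["green"])  # 4/7 of the observations were green
print(samples)
```

Even this tiny model captures the two ingredients every generative model shares: an estimate of the data distribution, and a way to sample from it. The book's point is that as the data gets more complex (faces, sentences, music), that distribution can no longer be written down explicitly, which is where deep learning comes in.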

2. Deep Learning

This chapter is a guide to the deep learning tools and techniques that you need to start building more complex generative models. We will introduce Keras, a framework for building neural networks that can be used to construct and train some of the most cutting-edge deep neural network architectures published in the literature.
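Keras hides the arithmetic, but it helps to know what a single fully connected (Dense) layer actually computes: activation(W @ x + b). A minimal sketch in NumPy, with layer sizes and random weights chosen purely for illustration:

```python
import numpy as np

def dense(x, W, b, activation=np.tanh):
    """Forward pass of a fully connected layer: activation(W @ x + b).
    This is the computation a Keras Dense layer performs."""
    return activation(W @ x + b)

rng = np.random.default_rng(42)
x = rng.standard_normal(4)        # a 4-dimensional input
W1 = rng.standard_normal((3, 4))  # weights: 4 inputs -> 3 hidden units
b1 = np.zeros(3)
W2 = rng.standard_normal((1, 3))  # weights: 3 hidden units -> 1 output
b2 = np.zeros(1)

# A two-layer network is just composed dense layers.
h = dense(x, W1, b1)
y = dense(h, W2, b2, activation=lambda z: 1 / (1 + np.exp(-z)))  # sigmoid output
print(y.shape)  # (1,)
```

In Keras the same network is a few lines of `Sequential` model code, with the weights learned from data rather than sampled at random; the chapter walks through exactly that.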

3. Variational Autoencoders

In this chapter we take a look at our first generative deep learning model, the variational autoencoder. This powerful technique will allow us to generate realistic faces from scratch and alter existing images — for example, by adding a smile or changing the colour of someone’s hair.
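The “add a smile” trick relies on vector arithmetic in the VAE’s latent space: take the mean latent code of smiling faces, subtract the mean code of neutral faces to get a “smile vector”, then add a multiple of it to any face’s code before decoding. A conceptual sketch with invented 2-D latent codes (a real VAE’s encoder would produce these, typically in far more dimensions):

```python
import numpy as np

# Hypothetical latent codes (in a real VAE these come from the encoder).
z_smiling = np.array([[2.0, 1.0], [1.8, 1.2], [2.2, 0.8]])  # smiling faces
z_neutral = np.array([[0.5, 1.1], [0.3, 0.9], [0.4, 1.0]])  # neutral faces

# The attribute direction is the difference of the group means.
smile_vector = z_smiling.mean(axis=0) - z_neutral.mean(axis=0)

# To add a smile to a new face, shift its latent code along that
# direction (alpha controls how strong the smile is), then decode z_new.
z_face = np.array([0.4, 1.0])
alpha = 1.0
z_new = z_face + alpha * smile_vector
print(smile_vector)  # the direction in latent space that encodes "smile"
print(z_new)         # the edited code, ready to be decoded into an image
```

The same arithmetic works for any attribute the training data is labelled with, such as hair colour, which is how the image-editing examples in the chapter are built.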

4. Generative Adversarial Networks (GANs)

This chapter explores one of the most successful generative modelling techniques of recent years, the generative adversarial network. This elegant framework for structuring a generative modelling problem is the underlying engine behind most state-of-the-art generative models. We shall see the ways that it has been fine-tuned and adapted to continually push the boundaries of what generative modelling is able to achieve.
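At the heart of that framework is a two-player game: the discriminator learns to score real images near 1 and generated images near 0, while the generator learns to fool it. A numeric sketch of the standard GAN losses (the probabilities below are illustrative numbers, not the output of a trained model):

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy the discriminator minimises: it wants
    d_real -> 1 (real images judged real) and d_fake -> 0."""
    return -(math.log(d_real) + math.log(1 - d_fake))

def generator_loss(d_fake):
    """The non-saturating generator loss: the generator wants the
    discriminator to score its fakes close to 1."""
    return -math.log(d_fake)

# Early in training the discriminator easily spots the fakes...
print(discriminator_loss(d_real=0.9, d_fake=0.1))  # low: discriminator winning
print(generator_loss(d_fake=0.1))                  # high: generator losing
# ...while at the ideal equilibrium it cannot tell real from fake.
print(discriminator_loss(d_real=0.5, d_fake=0.5))  # = 2 * log 2
</n```

Training alternates gradient steps on these two losses, and most of the GAN variants covered in the chapter are refinements of exactly this tug-of-war.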

Part 2: Teaching Machines to Paint, Write, Compose and Play

Part 2 presents a set of case studies showing how generative modelling techniques can be applied to particular tasks.

5. Paint

In this chapter, we examine two techniques related to machine painting. First we look at CycleGAN, which as the name suggests is an adaptation of the GAN architecture that allows the model to learn how to convert a photograph into a painting in a particular style (or vice versa). We also explore the neural style transfer technique contained within many photo editing apps that allows you to transfer the style of a painting onto a photograph, to give the impression that it is a painting by the same artist.

6. Write

In this chapter, we turn our attention to machine writing, a task that presents different challenges from those of image generation. This chapter introduces the recurrent neural network (RNN) architecture that allows us to tackle problems involving sequential data. We shall also see how the encoder–decoder architecture works and build a question-answer generator.
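What makes an RNN suited to sequences is a hidden state that is threaded through time: at each step the new state mixes the current input with the previous state, h_t = tanh(Wx·x_t + Wh·h_{t-1} + b). A minimal sketch of a vanilla RNN cell in NumPy, with toy dimensions and random weights chosen only for illustration:

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One step of a vanilla RNN cell: the new hidden state mixes
    the current input with the previous hidden state."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(1)
input_dim, hidden_dim = 3, 5
Wx = rng.standard_normal((hidden_dim, input_dim)) * 0.1
Wh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
b = np.zeros(hidden_dim)

# Process a sequence of 4 time steps, threading the hidden state through.
sequence = rng.standard_normal((4, input_dim))
h = np.zeros(hidden_dim)
for x_t in sequence:
    h = rnn_step(x_t, h, Wx, Wh, b)
print(h.shape)  # (5,) -- a summary of everything the cell has seen
```

The final hidden state is a learned summary of the whole sequence, which is exactly what the encoder half of an encoder–decoder model hands to the decoder.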

7. Compose

This chapter looks at music generation, which, while also a sequential generation problem, presents additional challenges such as modelling musical pitch and rhythm. We’ll see that many of the techniques that worked for text generation can still be applied in this domain, but we’ll also explore a deep learning architecture known as MuseGAN that applies ideas from Chapter 4 (on GANs) to musical data.

8. Play

This chapter shows how generative models can be used within other machine learning domains, such as reinforcement learning. We present one of the most exciting papers published in recent years, ‘World Models’, in which the authors show how a generative model can be used as the environment in which the agent trains. This essentially allows the agent to ‘dream’ of possible future scenarios and imagine what might happen if it were to take certain actions, entirely within its own conceptual model of the environment.

9. The Future of Generative Modelling

This chapter summarises the current landscape of generative modelling and looks back on the techniques presented in this book. We also look to the future and explore how the most cutting-edge techniques available today, such as GPT-2 and BigGAN, might change the way in which we view creativity. Will we ever be able to create an artificial entity that produces content creatively indistinguishable from the works of the human pioneers of art, literature, and music?

10. Conclusion

Concluding thoughts on why generative deep learning may prove to be the most important and influential field of machine learning in the next 5–10 years.

Summary

In a world where fact and fiction are not so easily separated, it is vital that there are engineers who understand the workings of generative models in detail and are not put off by technological limitations.

Hopefully this book goes a small way towards shedding light on the current state of the art and makes for an enjoyable read at the same time.

If you do choose to buy the book, please feel free to leave me a review as well — all feedback is welcomed!

The Parrot has Landed

This is the blog of Applied Data Science, a consultancy that develops innovative data science solutions for businesses. To learn more, feel free to get in touch through our website.

… and if you like this article, feel free to leave a few hearty claps :)


David Foster

Author of the Generative Deep Learning book :: Founding Partner of Applied Data Science Partners