Using AI for Good and Not for Evil — Generating Images with Just Noise
Using GANs for data augmentation to prevent power outages and fires sparked by falling trees and storms.
How I got here
I’m a software engineer and co-founder of a consultancy. When I heard about Omdena, I recalled that I already had some experience in image processing, using tools like OpenCV and TensorFlow to implement simple convolutional neural networks, though by no means do I consider myself any kind of expert in the field. A single tweet from a friend introduced me to Omdena; that was all. Once I got in touch with them to learn more about the community, I got really excited about the diversity of the people in the challenge, and I started thinking about how much I could learn from them.
My first contact with GANs
The problem for the challenge I enrolled in was to find a way to create an AI model that detects trees in satellite images more accurately. Essentially: how do we teach a machine to distinguish trees from everything else that is green in a picture, such as grass, farmland, and bushes? This was one piece of a larger problem, preventing forest fires.
In the beginning, we divided the work into subtasks and agreed on a strategy to approach the problem. Somehow I ended up on the GAN team, even though at the time I didn’t know exactly what GANs do. I started reading documents shared by team members and did some research of my own, and soon I understood what GANs are and how we could actually use them. It turned out to be something I had already heard about, but in different applications, such as deep fakes or generating an image of an older you (unethical uses, or trivial ones at best). But how could we use this technology in the challenge? This is the interesting part.

Another team had started a separate subtask: taking the available dataset of images and labelling the regions of interest, the trees and the things that were not trees. This is a crucial step, because these labels are the information used to train the model to detect those features, but it is also one of the most time-consuming processes. So the idea was to use a GAN to generate synthetic, pre-labelled images for data augmentation, so that the other models could be trained without the effort of labelling everything manually.
That is to say, instead of labelling a real image, we create a fake one that looks real, starting from a template where we already know where the trees, and everything that is not a tree, will be.
In the beginning, I was a little skeptical. How could this even be possible, and where would we start? So let’s start from the beginning: what is a GAN anyway? GAN stands for Generative Adversarial Network, which essentially applies game theory by putting a pair of neural networks in competition with each other while they are trained at the same time. One network tries to generate an image and the other tries to detect whether it is real or fake. It is actually something very simple, but pretty effective too. This is clearer with an image:
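That adversarial loop is easier to see in code than in prose. Below is a toy one-dimensional sketch of my own (not the code we used in the challenge): the “generator” is just a linear map of noise, the “discriminator” a logistic classifier, and each gradient step nudges one against the other exactly as the picture describes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # clip to avoid overflow warnings for large |x|
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60.0, 60.0)))

# Real data comes from N(4, 1.25); the generator must learn to imitate it.
# Generator:     g(z) = w_g * z + b_g
# Discriminator: d(x) = sigmoid(w_d * x + b_d), the probability x is real.
w_g, b_g = 1.0, 0.0
w_d, b_d = 0.1, 0.0
lr = 0.01

for step in range(3000):
    z = rng.normal(size=32)            # noise fed to the generator
    real = rng.normal(4.0, 1.25, 32)   # samples from the true distribution
    fake = w_g * z + b_g

    # Discriminator ascent step: push d(real) -> 1 and d(fake) -> 0.
    dr, df = sigmoid(w_d * real + b_d), sigmoid(w_d * fake + b_d)
    w_d += lr * (np.mean((1 - dr) * real) - np.mean(df * fake))
    b_d += lr * (np.mean(1 - dr) - np.mean(df))

    # Generator ascent step: push d(fake) -> 1, i.e. fool the discriminator.
    df = sigmoid(w_d * fake + b_d)
    w_g += lr * np.mean((1 - df) * w_d * z)   # chain rule through d(g(z))
    b_g += lr * np.mean((1 - df) * w_d)

# After training, b_g has drifted toward the real mean (4.0).
print(w_g, b_g)
```

The same tug-of-war happens in an image GAN; the only difference is that the two players are deep convolutional networks instead of two-parameter functions.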
But again, how can we use this to accomplish our goal? It turns out that there is a kind of GAN called pix2pix, which takes a pre-defined sketch of the target as its input: give it a doodle, and it builds a picture from it, like a landscape or anything you want. An example of this is the application Nvidia built to generate artificial landscapes.
OK, so maybe this could work. At that moment the labelling team had already labelled some images, so if we used those labels to build doodles, we could then use them to train a GAN to generate the images. It actually works!
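Building a doodle from a label mask amounts to painting each class a fixed color. Here is a minimal sketch of that step; the palette and the two-class setup (tree / not-tree) are illustrative choices of mine, not necessarily the exact encoding we used.

```python
import numpy as np

# Hypothetical palette: class id -> RGB color in the doodle.
PALETTE = {0: (0, 0, 0),      # not a tree
           1: (0, 255, 0)}    # tree

def mask_to_doodle(mask):
    """Turn a 2-D array of class ids into an RGB doodle for pix2pix."""
    h, w = mask.shape
    doodle = np.zeros((h, w, 3), dtype=np.uint8)
    for cls, color in PALETTE.items():
        doodle[mask == cls] = color
    return doodle

# A tiny 4x4 label mask with a 2x2 "tree" blob in the corner.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[0:2, 0:2] = 1
doodle = mask_to_doodle(mask)
print(doodle.shape)  # (4, 4, 3)
```

The (doodle, real image) pairs produced this way are exactly the input/target pairs pix2pix trains on.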
So now we just needed a way to generate random doodles to feed the pix2pix GAN. Here another GAN came to the rescue, a DCGAN in this case. The idea was to generate a random doodle from random noise, getting something like this:
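One practical detail worth noting: a DCGAN generator emits continuous pixel values, so its raw output is not yet a valid discrete label map. A common post-processing trick (an assumption on my part about how this is handled, not a documented step of our pipeline) is to snap every pixel to the nearest color in the doodle palette:

```python
import numpy as np

# Illustrative palette of doodle colors: not-tree, tree.
palette = np.array([(0, 0, 0), (0, 255, 0)], dtype=np.float64)

def snap_to_palette(img):
    """Replace every pixel with its nearest palette color (Euclidean distance)."""
    flat = img.reshape(-1, 3).astype(np.float64)
    # distances from each pixel to each palette entry: shape (n_pixels, n_colors)
    d = np.linalg.norm(flat[:, None, :] - palette[None, :, :], axis=2)
    nearest = np.argmin(d, axis=1)
    return palette[nearest].reshape(img.shape).astype(np.uint8)

# Fake "generator output": noisy values near the two palette colors.
raw = np.array([[[10, 240, 5], [250, 250, 250]],
                [[3, 2, 1], [30, 200, 40]]], dtype=np.float64)
clean = snap_to_palette(raw)
```

After this step the doodle contains only legal class colors, so the label-extraction stage downstream can mask them exactly.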
And finally, putting all the pieces together with the help of some Python and OpenCV code, we ended up with a script that generates a 100% random image from pure noise, along with its corresponding labels. At the moment we can generate thousands of synthetic images, each with its labels stored in a JSON file in COCO format. We use the doodle to extract the labels by masking its colors, and then build the synthetic image from the doodle.
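The label-extraction half of that script can be sketched in a few lines: mask the class color in the doodle, take the bounding box of the masked pixels, and wrap it in COCO’s annotation structure. The color and single-annotation layout below are simplifications of mine; a real doodle would yield one annotation per connected blob.

```python
import json
import numpy as np

TREE_COLOR = (0, 255, 0)  # illustrative doodle color for the "tree" class

def doodle_to_coco(doodle, image_id=1):
    """Mask the tree color in a doodle and emit a COCO-style annotation dict."""
    mask = np.all(doodle == TREE_COLOR, axis=-1)
    ys, xs = np.nonzero(mask)
    x0, y0 = int(xs.min()), int(ys.min())
    w, h = int(xs.max() - x0 + 1), int(ys.max() - y0 + 1)
    return {
        "images": [{"id": image_id,
                    "width": doodle.shape[1],
                    "height": doodle.shape[0]}],
        "annotations": [{"id": 1, "image_id": image_id, "category_id": 1,
                         "bbox": [x0, y0, w, h],   # COCO uses [x, y, width, height]
                         "area": int(mask.sum()),
                         "iscrowd": 0}],
        "categories": [{"id": 1, "name": "tree"}],
    }

doodle = np.zeros((8, 8, 3), dtype=np.uint8)
doodle[2:5, 3:6] = TREE_COLOR             # a 3x3 "tree" blob
coco = doodle_to_coco(doodle)
print(json.dumps(coco["annotations"][0]["bbox"]))  # [3, 2, 3, 3]
```

Because the doodle is generated, the labels come for free: no human ever has to trace a tree by hand.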
For now, the results look promising, but they are only preliminary and can be improved. For example, our labels only distinguished trees from non-trees; the model could be made more specific and accurate by adding further classes, such as roads, fields, buildings, lakes, and rivers, so that it learns to generate those as well.
Finally, it is very satisfying to learn how to use these tools and techniques, and to apply them to part of a problem: maybe just a small part individually, but a larger part together, as a community of people who simply want to help and make a difference.
During this process, everyone also gets to learn from each other and to interact as co-workers and co-learners.