How 3 engineers built a record-breaking supernova identification system with deep learning
Pop into Dessa’s offices and you’ll soon find traces of the company’s fascination with outer space. A Lego replica of Saturn V, the rocket that made it to the moon, sits on our reception area coffee table. Venture a bit further, and you’ll discover that each of the meeting rooms is named after a spaceport from around the world. For a few employees on Dessa’s software and machine learning engineering teams, this company-wide enthusiasm for space has evolved into a full-blown passion project.
Combining their interest in space with an exploration of deep learning’s potential applications for astronomy, 3 of Dessa’s engineers have collaborated on a project called space2vec since summer 2017. Recently, the team built a deep learning system that identifies supernovas from telescope images with record-breaking speed: cutting the time it would take astronomers to identify supernovas almost in half! Here’s the team’s personal account of how they did it.
How we started
“Clustering the goddamn universe.”
It was summer 2017 when Cole, one of our Machine Learning Engineers, first started flying these words around the Dessa office. The team was discussing dream projects to work on within the field of deep learning, and a quick survey of the room confirmed our initial suspicions… to work on something space-related would be pretty freaking awesome!
Luckily, we found ourselves in the right place at the right time. The company had recently started a program for employees to dedicate a portion of working hours to personal projects that would help us advance our machine learning knowledge.
We knew this program would help advance our product, too. At Dessa, two-thirds of the space2vec team (Jinnah and Pippin) spend the majority of their work time building features for Foundations, a platform for engineering enterprise-grade AI solutions. In order to build tools that ML engineers actually want to use, it’s essential for us to understand how they work and what problems they face from end to end. While the frequent exchange of feedback between our machine learning and software engineering teams is extremely useful, one of the best ways to build empathy into the product is to tackle a machine learning project hands-on. Working on space2vec offered us an amazing way to get there, while also giving us the opportunity to grow our knowledge of space.
Before we started writing code, we wanted to figure out if we could find an answer to the following question:
Can we find a real, annoying problem in astronomy where we can help by applying advanced machine learning techniques like deep learning?
To find out, we interviewed six different astronomers. As we began our research, we were very aware of our lack of astronomy knowledge, but to our delight the astronomers we spoke with were all wonderfully welcoming.
Here are some of the questions we asked:
1. What tools are modern astronomers using when it comes to software?
We were amazed to discover that the majority of the astronomy community had recently switched to Python and embraced open source methodologies. This was exciting to us, since a lot of the tools in machine learning are also run on Python.
2. Data is one of the biggest pieces in machine learning problems, so what is the quality of data like in astronomy?
Astronomy is amassing more data than ever before. Some of the bigger telescopes, like the one used in the LSST project, produce 15–30 terabytes of data a night. This is nuts for any industry, and it immediately stuck out to us!
3. Are astronomers using machine learning? Are astronomers using modern techniques like neural nets?
After talking to astronomers and scrolling through papers online, we soon found out the answer was ‘not really.’ Over on arXiv, we were only able to find a few papers that touched on both machine learning and astronomy, and the majority of these dealt with simpler techniques that have already been commonplace in other fields for many years.
Finding a specific problem to solve
Despite our research’s fledgling results, we ended up finding one paper that we thought had a lot of potential: ATIDES, or, Automated Transient Identification in the Dark Energy Survey. Before we launch into the details, though, it might be helpful to share a bit more background on what the survey is about.
The Dark Energy Survey
The Dark Energy Survey, or DES for short, is an international effort to better understand dark energy, one of astronomy’s most enigmatic findings to date. Dark energy was first revealed by observations of distant supernovas, including ones made with the Hubble telescope. Here’s what NASA has to say about it:
“It sounds rather strange that we have no firm idea about what makes up 74% of the universe. It’s as though we had explored all the land on the planet Earth and never in all our travels encountered an ocean. But now that we’ve caught sight of the waves, we want to know what this huge, strange, powerful entity really is.” — Hubble Site
One of the key ways astronomers further their understanding of dark energy is by studying supernovas, which are basically giant exploding stars. One kind of supernova called the Type Ia, for example, results in a release of energy that’s roughly 500x the amount of energy the sun emits. The wild release of energy that supernovas create helps astronomers better understand dark energy.
Astronomers have built telescopes and satellites in order to capture and further study these intense explosions. But how do they find the supernovas once they’ve gathered the images?
Even today, it’s not unusual for astronomers to sift through telescope images by hand, distinguishing by sight which contain supernovas, and which don’t. While this manual process usually only takes a half-second or so per image, there is still a huge amount of efficiency to be gained from automation. Manually identifying supernovas takes up time that could be much better spent by astronomers on higher-impact projects.
On top of that, more and more data is being collected every day from telescopes like those used in the DES. At some point, the sheer volume of data could very well outpace the limited resources astronomers have available for important yet repetitive tasks like these.
Automating Supernova Identification with Autoscan
Here’s how ATIDES, the paper we found on arXiv, tackled the issue of automating the process. The paper outlined an algorithm called Autoscan, which uses a random forest model (a type of classical machine learning) to identify supernovas from the Dark Energy Survey. Using their model, the researchers were able to reduce the number of images astronomers had to look at by a factor of 13.4. That reduction matters even more when you realize that DES produced about 400 million images in its first three years alone.
To illustrate, let’s see what would happen if astronomers had 1 million images to classify. Done manually, at 0.5 seconds per image, it would take astronomers 138.9 hours. By using Autoscan, astronomers could reduce the image pool that needs classification down to 74,556 images, translating to just 10.4 hours of astronomer classification time. This is a huge amount of time saved: that 128.5 hour difference turns out to be just over 16 working days!
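If you want to check the arithmetic yourself, here is a minimal sketch. The 0.5 seconds per image and the 74,556 remaining images come from the figures above; the 8-hour working day is our assumption, and the function name is just for illustration.

```python
SECONDS_PER_IMAGE = 0.5   # manual review speed quoted above
HOURS_PER_WORKDAY = 8     # assumption: one working day is 8 hours


def review_hours(n_images, seconds_per_image=SECONDS_PER_IMAGE):
    """Hours an astronomer needs to look at n_images by eye."""
    return n_images * seconds_per_image / 3600


manual_hours = review_hours(1_000_000)   # every image reviewed by hand
autoscan_hours = review_hours(74_556)    # images left after Autoscan's filtering
saved_hours = manual_hours - autoscan_hours
saved_workdays = saved_hours / HOURS_PER_WORKDAY

print(f"manual: {manual_hours:.1f} h, with Autoscan: {autoscan_hours:.1f} h")
print(f"saved: {saved_hours:.1f} h, or about {saved_workdays:.1f} working days")
```

Swapping in a different image count or review speed is a quick way to see how the savings scale.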
Seeing these results sparked our thinking. If classical models like random forests could make a dent, wouldn’t deep learning’s results be even more impressive?
Modelling: from random forest to CNN
One of the limitations of using a random forest model like Autoscan is that it requires feature engineering. Essentially, domain experts must spend time abstracting images into rows of tabular data (made up of features) that represent the images numerically. This takes considerable time, and finding the perfect combination of features is messy. Feature engineering also requires a lot of engagement from experts that know what parts of the dataset will help solve the problem at hand. With Autoscan, for example, producing a dataset with 898,963 rows required collaboration between more than 50 different researchers. That’s a lot of time and effort!
For our own model, we decided to test out CNNs (convolutional neural networks), AKA the class of deep learning networks that played a huge part in making AI what it is today. Since Alex Krizhevsky’s creation of the CNN AlexNet in 2012, the architecture has fuelled a wave of groundbreaking results with deep learning around the world. One of the things that makes CNNs so great is that they skip over the feature engineering process entirely.
Instead, raw image data can be fed into a CNN directly, and the algorithm learns the features that identify an image without any human intervention. In the case of space2vec, our CNN could learn how to recognize supernovas by being fed telescope image data alone. Doing away with feature engineering would dramatically reduce the time astronomers need to spend when classifying images.
We found our dataset in the ATIDES paper, which featured 2 datasets, both collected from the first season of the Dark Energy Survey:
- Image data “stamps”, which contain 898,963 images from the telescope. This is the data we used to train our CNN model.
- Feature engineered data, a CSV file that contains 898,963 rows, 1 for each image subsection of the sky, and 38 features. This was the data used for the Autoscan random forest model.
For our CNN model, we were able to use the first set of image stamps that were collected from raw telescope data. The classification goal of the model was to determine whether each image contained a supernova, or not.
Our first few CNN models were just single-layer models, built to prove that we could run the data through the model correctly. Getting models going quickly was a priority, so we decided to make use of Keras.
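To give a feel for what a single-layer smoke-test model can look like in Keras, here is a rough sketch. It is not our actual code: the input shape of (51, 51, 3), the 16 filters, the optimizer and the function name are all assumptions picked for illustration.

```python
def build_single_layer_cnn(input_shape=(51, 51, 3)):
    """Build a binary classifier with one convolutional layer (illustrative only)."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=input_shape),  # assumed stamp size and channel count
        keras.layers.Conv2D(filters=16, kernel_size=3, activation="relu"),
        keras.layers.Flatten(),
        # One sigmoid unit: the predicted probability that the stamp shows a supernova.
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

A model this small is not expected to score well. Its job is only to confirm that images and labels flow through training end to end before more layers are added.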
Examining our results
[Warning: the next section gets pretty technical. Non-technical readers may wish to skip ahead to the ‘Results: space2vec vs. Autoscan’ section.]
Our model results were very interesting to look into for our untrained eyes. There were many times where we followed the stages of surprise, confusion, bargaining, and acceptance…the latter usually coming only after consulting the savviest of Dessa’s Machine Learning Engineers. The numbers here are pretty fun to pull apart, so let’s take a look!
We used the Python library Pandas to easily generate a histogram of our model’s predictions. We can see very clearly that our model is able to separate the classes (supernova, not supernova) really nicely! Seeing this chart for the first time kicked off a few days of making sure we knew what we were talking about as it seemed too good for some random engineers to have made.
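With Pandas, a histogram like this is typically produced with a call such as Series.hist. As a dependency-free stand-in for the same idea, the sketch below bins predicted scores into ten equal-width buckets and prints a text histogram. The scores are invented for illustration and are not our model’s output.

```python
def score_histogram(scores, n_bins=10):
    """Count how many scores in [0, 1] fall into each equal-width bin."""
    counts = [0] * n_bins
    for score in scores:
        # A score of exactly 1.0 belongs in the last bin, not a new one.
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical predicted probabilities, invented for illustration.
scores = [0.01, 0.02, 0.05, 0.5, 0.93, 0.97, 0.99, 1.0]
counts = score_histogram(scores)

for i, count in enumerate(counts):
    low, high = i / 10, (i + 1) / 10
    print(f"{low:.1f}-{high:.1f} | {'#' * count}")
```

A well-separated model piles most of its scores into the first and last bins, which is the shape that made us double-check our work.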
Echoing Autoscan’s methodology, we wanted to make sure that we captured 99% of all supernovas passed through the model. To achieve this, we had to set our threshold at 0.06. A threshold is needed because a model’s predictions are not nice round numbers (e.g. 0.01244 or 0.68933), even though we need them to be. It basically forces the model to make up its mind on what it is guessing at: we push everything below 0.06 to 0 and everything above 0.06 to 1.
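One way to pick such a threshold is sketched below: sort the scores of the known supernovas and take the highest cutoff that still keeps the target share of them. This is an illustration under our own assumptions rather than the exact procedure we ran; the function names and the toy scores and labels are made up, and a score exactly equal to the threshold is treated as a 1.

```python
import math


def threshold_for_recall(scores, labels, target_recall=0.99):
    """Highest threshold that keeps at least target_recall of the positives."""
    positive_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("need at least one positive example")
    # Number of positives that must score at or above the threshold.
    # Rounding first guards against floating-point noise before the ceiling.
    n_keep = math.ceil(round(target_recall * len(positive_scores), 9))
    return positive_scores[n_keep - 1]


def binarize(scores, threshold):
    """Map each score to 1 (supernova) if it is >= threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


# Toy data, invented for illustration (1 = supernova, 0 = not).
scores = [0.01244, 0.03, 0.06, 0.2, 0.68933, 0.9, 0.95, 0.99]
labels = [0, 0, 1, 0, 1, 1, 1, 1]

threshold = threshold_for_recall(scores, labels)
predictions = binarize(scores, threshold)
print(threshold, predictions)
```

A low threshold like this trades extra false positives for very few missed supernovas, which is the right trade when a human reviews every flagged image anyway.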
Setting this specific threshold on our test set (10% of our dataset, or 89,441 examples) gave us 44,954 images that we correctly classified as supernova and 6,307 images that we said were supernova but were not: a false discovery rate of only 0.123! This also means that, at a threshold of 0.06, our model produces about 0.14 false positives for every true positive. The full confusion matrix is below.
So what do these results actually mean? After feeding the images into our model, it predicts that a bunch of them are supernova: specifically, 51,261 images. Based on our false discovery rate, we know that about 12% of those 51,261 images are not actually supernova. This means that 88% of the astronomers’ time will be spent looking at actual supernova. In this case, looking at all 89,441 images would take 12.42 hours. With our model’s help, time spent gets cut nearly in half to 7.12 hours, with only about 0.88 hours (not even a full hour) being wasted time!
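These figures can be reproduced from the counts quoted above. The sketch below is a back-of-the-envelope check: it uses the exact count of 6,307 false positives rather than a rounded 12%, reuses the 0.5 seconds per image from earlier, and its names are chosen for illustration.

```python
SECONDS_PER_IMAGE = 0.5  # manual review speed quoted earlier in the article


def hours_to_review(n_images, seconds_per_image=SECONDS_PER_IMAGE):
    """Hours an astronomer needs to look at n_images by eye."""
    return n_images * seconds_per_image / 3600


test_set_size = 89_441
true_positives = 44_954    # flagged as supernova and really are
false_positives = 6_307    # flagged as supernova but are not

flagged = true_positives + false_positives
false_discovery_rate = false_positives / flagged

hours_all = hours_to_review(test_set_size)        # no model: review everything
hours_flagged = hours_to_review(flagged)          # review only flagged images
hours_wasted = hours_to_review(false_positives)   # time spent on non-supernovas
fraction_saved = 1 - hours_flagged / hours_all

print(f"flagged images: {flagged}, FDR: {false_discovery_rate:.3f}")
print(f"review everything: {hours_all:.2f} h, flagged only: {hours_flagged:.2f} h")
print(f"wasted: {hours_wasted:.2f} h, time saved: {fraction_saved:.0%}")
```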
Results: space2vec vs. Autoscan
After a few months of model development, we ended up finding a model architecture that set a new world standard (!) for supernova detection. The final CNN model used 9 layers, with ReLU activations in between. In total we ran almost 500 experiments, with our top CNN model beating the current state-of-the-art random forest techniques by over 10%.
What does this mean in terms of time saved? We previously said a random forest model would save 16 working days of astronomer time per 1M images. In comparison, our space2vec model results in 20 working days saved, not counting the considerable legwork saved by omitting the need for feature engineering.
One of our favourite realizations throughout this project was that we’re at a really interesting point in time. With the new wave of models that don’t require feature engineering, people like us (who don’t have PhDs in Astronomy) can nonetheless contribute something capable of producing real impact for science and other fields.
Although this is our first significant ML project within astronomy, we’re optimistic that there are many other areas within the field that we can explore further with deep learning techniques.
We have also used this experience to help influence our day-to-day work on Foundations. We now appreciate why model reproducibility (data provenance), project tracking, model version control, and compute scheduling are necessary tools to MLEs.
Build your own supernova classifier
We’ve open sourced our code, including everything you need to get started classifying supernovas. It’s available on GitHub, and as a Google Colaboratory notebook so you can get started right away. The model doesn’t use the entire dataset, but if you want you can download the entire dataset here and run it on your own GPU machine.
Download here: GitHub
Thanks to Dustin Lang, Dan Foreman-Mackey, Ross Fadely & Nathalie Ouellette for guiding our astronomy work.
Thanks to Shahzad Raza, Hashiam Kadhim, Alex Krizhevsky & Ragavan Thurairatnam for guiding our ML work.
Thanks to Ashwin Jiwane, Brendan Ross, Brian Chen & Cole Jackes for helping us through the late night space sessions.
References / Further Reading
- Goldstein, et al.: “Automated Transient Identification in the Dark Energy Survey” AJ (accepted). [Autoscan code here]
- Hubble Site (NASA): What is Dark Energy?
- Olah, Chris: CNN Models
- Schatz, Dennis. “Why Should We Care About Exploding Stars?” [link]
Written by Dessa engineers Cole Clifford, Pippin Lee and Jinnah Al-Clarke!