GSoC 2021 with ML4Sci: Domain Adaptation for Decoding Dark Matter

Marcos Tidball
Aug 23, 2021


Hey! In this post I’m going to talk a little bit about my experience with Google Summer of Code (GSoC) 2021 and about the project that I developed under the Machine Learning for Science (ML4Sci) umbrella organization. You can find out more about the ML4Sci organization and its other GSoC projects on ML4Sci’s website.

In my project I implemented different Unsupervised Domain Adaptation algorithms for the DeepLense pipeline.

Credits: Google Summer of Code

DeepLense and gravitational lenses

DeepLense is a deep learning pipeline that combines state-of-the-art deep learning models with strong lensing simulations. In this context, my project aims to continue the work published by my mentors in “Decoding Dark Matter Substructure without Supervision”, where the team implemented unsupervised machine learning techniques to infer the presence of dark matter substructure in strong gravitational lenses. Pranath Reddy provides a great explanation of this project in his blog post.

A gravitational lens occurs when a galaxy (or other massive object) lies between us, the observers, and a source object. The mass of the intermediate object bends the light from the source, creating an effect very similar to the lenses used in our day-to-day life, where the source object now appears to be in another location (image).

Credits: Michael Sachs, CC BY-SA 3.0, via Wikimedia Commons

But how does this relate to dark matter?

Dark matter is a form of matter that has mass but doesn’t interact electromagnetically. That means it doesn’t interact with light, which makes it very difficult to identify and study. Luckily, a promising way to probe the nature of dark matter is to study dark matter halos, and strong gravitational lenses have shown encouraging results in detecting the existence of dark matter substructure!

Unfortunately, there isn’t a lot of strong gravitational lens data available, which means that, if we want to train a machine learning model to identify the different kinds of dark matter substructure, we need to use simulations. The problem, though, is that a model trained on simulated data does not generalize well to real data and performs poorly on it.

This project aims to fix this problem by using Unsupervised Domain Adaptation (UDA) techniques to adapt a model trained on simulated data to real data!

A bit about me

I’m Marcos Tidball (nice to meet you!), a junior Physics student at the Universidade Federal do Rio Grande do Sul (UFRGS) in Brazil (I’m not good at soccer though). Prior to GSoC I was a research intern developing a method that uses convolutional neural networks to identify low surface brightness galaxies. And pretty soon I’m going to start an internship at BTG Pactual, the largest investment bank in Latin America.

I was interested in this project as soon as I read about it! Since I was already working on machine learning models for astronomical data (and suffering from the loss of accuracy when applying a model trained on simulations to real-world data), it seemed like the perfect fit for me!

The code

All my code is available at the ML4Sci DeepLense repository. While you’re there you can also check out the code and projects of the other students that contributed to DeepLense!

The data

The dataset I used throughout my project consists of simulated strong gravitational lens images generated with PyAutoLens. The parameters of these simulations can be found in “Decoding Dark Matter Substructure without Supervision”.

In this dataset there are three classes:

  • No substructure: gravitational lenses simulated without dark matter.
  • Spherical substructure: gravitational lenses simulated with subhalos of cold dark matter.
  • Vortex substructure: gravitational lenses simulated with vortices of superfluid dark matter.

As a proof of concept, we don’t yet adapt from simulated to real data. Instead, we use “Model A” simulations as the source domain (what we train on) and “Model B” simulations as the target domain (what we want to adapt to). Model B’s simulations are more complex and more representative of real-world data, while Model A’s are simpler. In practical terms, Model B is simulated with a variable redshift and signal-to-noise ratio, while Model A has these parameters fixed.

The dataset has 30,000 grayscale images of size 150×150 for each domain. All images are stored as NumPy arrays. More information about the dataset is available in the repository.
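As a rough illustration of what working with data of this shape looks like, the sketch below builds a stand-in array with the image size described above and prepares it for a convolutional network. The array is random noise, not real lens images, and the per-image min-max scaling is only an assumption about preprocessing; the actual pipeline may normalize differently.

```python
import numpy as np

# Stand-in for one domain's images: random noise with the post's image size.
# A small N keeps the example light; the real dataset has 30,000 per domain.
rng = np.random.default_rng(seed=0)
images = rng.random((8, 150, 150), dtype=np.float32)

# CNNs usually expect an explicit channel axis: (N, 1, H, W) for grayscale.
images = images[:, np.newaxis, :, :]

# Per-image min-max scaling to [0, 1] (an assumed preprocessing step).
lo = images.min(axis=(1, 2, 3), keepdims=True)
hi = images.max(axis=(1, 2, 3), keepdims=True)
images = (images - lo) / (hi - lo)

print(images.shape)
```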

Here are some of the images that come from the source dataset:

Some of our source data!

Unsupervised Domain Adaptation

Unsupervised domain adaptation is a problem in which one attempts to transfer knowledge gained from a labeled source dataset to a distinct unlabeled target dataset, within the constraint that the objective (e.g. digit classification) must remain the same. One of the most common baseline datasets for this kind of technique is the VisDA2017 dataset:

Credits: Geoffrey French, Michal Mackiewicz, Mark Fisher

I studied and implemented several UDA models, choosing which ones to implement by comparing their results on baseline datasets, with the intention of implementing some of the best-performing algorithms. In this section, I’ll talk about the four models I implemented: ADDA, Self-Ensemble, CGDM and AdaMatch.


To use these algorithms I’m going to use the package I created for this project: deeplense_domain_adaptation. The first step is to install it:

pip install --upgrade deeplense_domain_adaptation

After that, we must also define the path for our data. In our dataset, the image data is separated from the label data. As such, we’ll define the path to both our source and target datasets as:

# source domain: model_f
# target domain: model_j
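A minimal sketch of such path definitions, using the variable names that appear in the loading code later in this post. The `data/` directory layout, the file names and the `domain_paths` helper are all hypothetical placeholders; point them at wherever your copy of the dataset lives.

```python
import os

# Hypothetical directory layout; adjust to your local copy of the dataset.
DATA_DIR = "data"

def domain_paths(model_name, split):
    """Return (images_path, labels_path) for one domain and split."""
    prefix = os.path.join(DATA_DIR, model_name, split)
    return prefix + "_data.npy", prefix + "_labels.npy"

# source domain: model_f
model_f_train_data_path, model_f_train_labels_path = domain_paths("model_f", "train")
model_f_test_data_path, model_f_test_labels_path = domain_paths("model_f", "test")

# target domain: model_j
model_j_train_data_path, model_j_train_labels_path = domain_paths("model_j", "train")
model_j_test_data_path, model_j_test_labels_path = domain_paths("model_j", "test")
```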

Now we can start looking at the methods and also how to use the deeplense_domain_adaptation package in order to train these algorithms!


ADDA (from “Adversarial Discriminative Domain Adaptation” by Eric Tzeng, Judy Hoffman, Kate Saenko, Trevor Darrell) is an adversarial domain adaptation method, where the goal is to minimize the domain discrepancy distance through an adversarial objective with respect to a discriminator.

Credits: Eric Tzeng, Judy Hoffman, Kate Saenko, Trevor Darrell

We want the discriminator to be unable to distinguish between the source and the target distributions!

ADDA first learns a discriminative representation using the labels in the source domain, and then a separate encoding that maps the target data to the same space. Our goal is to fool the domain discriminator so that it is unable to distinguish the source from the target.
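To make the adversarial objective more concrete, here is a small sketch of the two losses involved, written in plain Python so it runs without any dependencies. It assumes the GAN-style loss with inverted labels described in the ADDA paper; the logits are made-up numbers and none of the names below come from deeplense_domain_adaptation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(logits, label):
    """Mean binary cross-entropy of raw logits against one constant label."""
    eps = 1e-12
    total = 0.0
    for z in logits:
        p = sigmoid(z)
        total += -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))
    return total / len(logits)

# Toy discriminator logits (positive means "looks like source").
d_on_source = [2.0, 1.5, 3.0]    # D(source_encoder(x_s))
d_on_target = [-1.0, 0.5, -2.0]  # D(target_encoder(x_t))

# Discriminator step: label source features 1 and target features 0.
loss_discriminator = bce(d_on_source, 1) + bce(d_on_target, 0)

# Target-encoder step: inverted labels, i.e. try to make target look like source.
loss_target_encoder = bce(d_on_target, 1)

print(loss_discriminator, loss_target_encoder)
```

The two steps pull in opposite directions: the discriminator is rewarded for telling the domains apart, while the target encoder is rewarded when its features get classified as source.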

To train ADDA, we must initialize it with an encoder and classifier already trained on the source domain. While ADDA is the only algorithm here that requires this kind of transfer learning, the technique also benefits the other algorithms.

Before training, we must first load the data:

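# NOTE: the module paths on the next two lines are assumed; see the repository tutorial for the exact imports
from deeplense_domain_adaptation.data import augmentations
from deeplense_domain_adaptation.data.dataset import get_dataloader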
# get ADDA transforms
train_transform_source, train_transform_target, test_transform = augmentations.adda_augmentations()
# load data
bs = 100
source_dataloader = get_dataloader(model_f_train_data_path, model_f_train_labels_path, train_transform_source, bs)
source_dataloader_test = get_dataloader(model_f_test_data_path, model_f_test_labels_path, test_transform, bs)
target_dataloader = get_dataloader(model_j_train_data_path, model_j_train_labels_path, train_transform_target, bs)
target_dataloader_test = get_dataloader(model_j_test_data_path, model_j_test_labels_path, test_transform, bs)

Then we can instantiate the network architectures:

from deeplense_domain_adaptation.networks import resnet
from deeplense_domain_adaptation.networks import discriminator
# the source_encoder and classifier should be pre-trained on source
source_encoder = resnet.Encoder('18')
target_encoder = resnet.Encoder('18')
classifier = resnet.Classifier()
discriminator = discriminator.Discriminator()

And finally train it!

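# NOTE: the module path on the next line is assumed; see the repository tutorial for the exact import
from deeplense_domain_adaptation.data import hyperparams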
from deeplense_domain_adaptation.algorithms import adda
# get hyperparameters
hparams = hyperparams.adda_hyperparams()
# instantiate ADDA
adda = adda.Adda(source_encoder, target_encoder, classifier, discriminator)
# train ADDA
epochs = 100
save_path = "./"
encoder, classifier = adda.train(source_dataloader, target_dataloader, target_dataloader_test, epochs, hparams, save_path)

Then we’re able to plot the training metrics and also evaluate ADDA on the test dataset:

# plot training metrics
# evaluate on test set
## returns accuracy on the test set
print(f"accuracy on test set = {adda.evaluate(target_dataloader_test)}")
## returns a confusion matrix plot and a ROC curve plot (that also shows the AUROC)

The confusion matrix in this case is:

Confusion matrix for ADDA

And the ROC curve plot is:

ROC curve plot for ADDA
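For readers who want to see what these two diagnostics measure, here is a dependency-free sketch of a confusion matrix and a one-vs-rest AUROC. The labels and scores are made up for illustration, and the helper functions are hypothetical; they are not part of deeplense_domain_adaptation, which produces its own plots.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def auroc(scores, is_positive):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    # Ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels for the three classes: 0 = no substructure,
# 1 = spherical substructure, 2 = vortex substructure.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, 3)
accuracy = sum(cm[k][k] for k in range(3)) / len(y_true)

# One-vs-rest AUROC for class 2, from made-up scores for that class.
scores_class_2 = [0.1, 0.2, 0.1, 0.3, 0.9, 0.25]
auc_class_2 = auroc(scores_class_2, [t == 2 for t in y_true])

print(cm, accuracy, auc_class_2)
```

The pairwise AUROC above is quadratic in the number of samples, which is fine for a toy example; for a real test set a library routine would be the better choice.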

This pipeline is used for all algorithms. If you’re interested in checking out more about the usage of deeplense_domain_adaptation check out the tutorial on the repository!


Self-Ensemble (from “Self-ensembling for visual domain adaptation” by Geoffrey French, Michal Mackiewicz, Mark Fisher) is based on the mean teacher model used in semi-supervised learning. The mean teacher model has two networks: a student (trained with gradient descent) and a teacher (whose weights are an exponential moving average of the student’s weights).

You can see the training pipeline of this model in the following image:

Credits: Geoffrey French, Michal Mackiewicz, Mark Fisher
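The teacher update is simple enough to sketch in a few lines. In the toy example below a dict of scalar weights stands in for real parameter tensors, and the decay values (`alpha`) are illustrative assumptions rather than the settings used in this project.

```python
def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight toward the matching student weight in place."""
    for name, student_value in student.items():
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * student_value

# Toy "networks": a dict of scalar weights stands in for real parameter tensors.
student = {"w": 0.0, "b": 0.0}
teacher = dict(student)  # the teacher starts as a copy of the student

for step in range(3):
    # Pretend a gradient-descent step changed the student's weights.
    student["w"] += 1.0
    student["b"] -= 0.5
    ema_update(teacher, student, alpha=0.9)

print(student, teacher)
```

Because the teacher only ever receives a smoothed version of the student’s weights, it lags behind the student and changes more slowly, which is what makes its predictions a stable target.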


CGDM (from “Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation” by Zhekai Du, Jingjing Li, Hongzu Su, Lei Zhu, Ke Lu) is a bi-classifier adversarial learning method.

CGDM minimizes the discrepancy of gradients generated by source and target samples. To compute the gradients of the target samples, it uses a clustering-based strategy to obtain more reliable pseudo-labels. It then uses self-supervised learning on the pseudo-labels in order to optimize the model with data from the source and the target domain.
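As a sketch of what “discrepancy of gradients” can mean in practice, the snippet below measures it as one minus the cosine similarity between two flattened gradient vectors, which is my reading of the CGDM paper. The gradient values are made up, and the function name is hypothetical.

```python
import math

def gradient_discrepancy(grad_source, grad_target):
    """1 - cosine similarity between two flattened gradient vectors."""
    dot = sum(gs * gt for gs, gt in zip(grad_source, grad_target))
    norm_s = math.sqrt(sum(g * g for g in grad_source))
    norm_t = math.sqrt(sum(g * g for g in grad_target))
    return 1.0 - dot / (norm_s * norm_t)

g_source = [0.2, -0.1, 0.4]
aligned = gradient_discrepancy(g_source, [0.4, -0.2, 0.8])    # same direction
orthogonal = gradient_discrepancy([1.0, 0.0], [0.0, 1.0])     # unrelated
opposed = gradient_discrepancy(g_source, [-0.2, 0.1, -0.4])   # opposite direction

print(aligned, orthogonal, opposed)
```

The value is near 0 when source and target samples push the weights in the same direction and grows toward 2 as they disagree, so minimizing it encourages updates that help both domains at once.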

Given its bi-classifier nature, after training CGDM we’ll obtain an encoder and two classifiers, that are used together when evaluating a new data point.
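One simple way to use two classifiers together is to average their softmax outputs and take the most likely class. This is an assumption about how the combination could be done, not necessarily what the package does internally, and the logits below are made up.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(logits_1, logits_2):
    """Average the two classifiers' probabilities and pick the top class."""
    p1, p2 = softmax(logits_1), softmax(logits_2)
    avg = [(a + b) / 2.0 for a, b in zip(p1, p2)]
    return avg.index(max(avg)), avg

CLASSES = ["no substructure", "spherical substructure", "vortex substructure"]

# Both classifiers see the same encoder features for one image.
pred, probs = ensemble_predict([2.0, 0.5, 0.1], [1.0, 1.2, 0.2])
print(CLASSES[pred], probs)
```

In this toy case the second classifier alone would slightly prefer “spherical substructure”, but the confident first classifier tips the averaged prediction to “no substructure”.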


AdaMatch (from “AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation” by David Berthelot, Rebecca Roelofs, Kihyuk Sohn, Nicholas Carlini, Alex Kurakin) is a novel method that unifies UDA with semi-supervised learning and semi-supervised domain adaptation.

This method augments each image twice, once with a weak augmentation and once with a strong augmentation. From those images we extract logits, which are randomly interpolated. It then performs distribution alignment to find target pseudo-labels.
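The two less familiar steps here, random logit interpolation and distribution alignment, can be sketched in plain Python as follows. This follows my reading of the AdaMatch paper: a fresh interpolation weight is drawn per logit, and target predictions are rescaled by the ratio of the mean source prediction to the mean target prediction. All numbers and function names are illustrative.

```python
import random

def mean_per_class(rows):
    """Column-wise mean of a list of probability vectors."""
    n = len(rows)
    return [sum(row[k] for row in rows) / n for k in range(len(rows[0]))]

def interpolate_logits(logits_a, logits_b, rng):
    """Random logit interpolation: a fresh lambda in [0, 1) per logit."""
    out = []
    for za, zb in zip(logits_a, logits_b):
        lam = rng.random()
        out.append(lam * za + (1.0 - lam) * zb)
    return out

def align(target_probs, source_probs):
    """Rescale target predictions by the source/target mean class ratio."""
    src_mean = mean_per_class(source_probs)
    tgt_mean = mean_per_class(target_probs)
    aligned = []
    for row in target_probs:
        scaled = [p * s / t for p, s, t in zip(row, src_mean, tgt_mean)]
        total = sum(scaled)
        aligned.append([v / total for v in scaled])  # renormalize to sum to 1
    return aligned

def argmax(row):
    return row.index(max(row))

# Two sets of logits for the same image, mixed element by element.
rng = random.Random(0)
mixed = interpolate_logits([1.0, -2.0, 0.5], [3.0, 0.0, 1.5], rng)

# Weakly augmented predictions: balanced on source, biased to class 0 on target.
source_probs = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
target_probs = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.5, 0.2, 0.3]]

aligned = align(target_probs, source_probs)
pseudo_before = [argmax(row) for row in target_probs]
pseudo_after = [argmax(row) for row in aligned]
print(pseudo_before, pseudo_after)
```

Before alignment every target image would be pseudo-labeled as class 0; after alignment the pseudo-labels spread across the three classes, matching the balanced class distribution seen on the source.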

Current results

To save computational resources, I trained my models with early stopping using a patience of 15 epochs, and for inference I used the checkpoint with the best accuracy on a validation set. All models used a ResNet18 backbone and were trained for up to 100 epochs on Kaggle’s GPUs. The current best results for each method are:

Meanwhile, if we try to use a model that was trained only on the source dataset to infer on the target dataset, we get:

We can conclude that UDA actually helps a bunch!
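For reference, the early-stopping scheme mentioned at the start of this section can be sketched as below. A list of validation accuracies stands in for a real training loop, the function name is hypothetical, and the example uses a patience of 3 instead of 15 only to keep the list short.

```python
def train_with_early_stopping(val_accuracies, patience=15):
    """Return (best_epoch, best_accuracy, stopped_epoch).

    `val_accuracies` stands in for a real training loop: item i is the
    validation accuracy measured after epoch i.
    """
    best_acc = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0
    stopped_epoch = len(val_accuracies) - 1

    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            # New best: this is where a real loop would save a checkpoint.
            best_acc, best_epoch = acc, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                stopped_epoch = epoch
                break
    return best_epoch, best_acc, stopped_epoch

history = [0.50, 0.62, 0.70, 0.69, 0.71, 0.70, 0.68, 0.70, 0.69]
best_epoch, best_acc, stopped_epoch = train_with_early_stopping(history, patience=3)
print(best_epoch, best_acc, stopped_epoch)
```

Training stops once the validation accuracy has failed to improve for `patience` consecutive epochs, and the checkpoint from `best_epoch` is the one used for inference.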

Future work and final thoughts

Though current results show a clear increase in the accuracy of our models over not applying any kind of domain adaptation, there’s still a lot of room for improvement. As seen in “DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy Identification Across Domains”, even though a model + UDA algorithm might perform well when adapting from an easier simulation to a harder one, it’s hard to get a large boost in performance when adapting to real data.

The first step on my list of priorities is to test equivariant neural networks, since this architecture achieves promising results in classification tasks related to gravitational lenses. Trying out other convolutional neural network architectures, such as ResNet50 and EfficientNet, could also prove very beneficial.

Another very important step is to run more thorough hyperparameter searches. Though time-consuming, finding good hyperparameters is extremely important for achieving good results. Since two of the UDA algorithms we use (Self-Ensemble and AdaMatch) depend heavily on the augmentations used, further exploration of augmentations is also very important.

Of course, keeping an eye out for novel UDA algorithms is also always good practice, especially since this field is so hot right now!

All in all, I have to say that this project was the best opportunity that I’ve gotten so far in my professional career. Being able to work with my mentors while exploring the field of UDA was amazing! And I honestly cannot even begin to fathom the number of opportunities that will open up thanks to this project :)

I’d like to thank Pranath Reddy, Michael Toomey, Sergei Gleyzer, Anna Parul and Sourav Raha for helping me out and mentoring me during the program!

And finally, thank you Google and the people organizing Google Summer of Code for this amazing opportunity!