Declarative Configs for Neuron Reconstructions @ PyCon US

Tom Herold
Published in WEBKNOSSOS
Jun 17, 2021

At scalable minds, we are very technology-focused and work on many challenging engineering problems. Much of our daily work involves using Python for machine learning and computer vision, e.g. tissue segmentation from electron microscopy (EM) images. We are happy to share some insights into our routine with the community, most recently in a talk at PyCon US 2021 by my colleague Jonathan Striebel.

Motivation

Our biggest projects are centered around neuron reconstruction in large-scale 3D EM datasets (500 GB to 1 PB) in a research field called connectomics. The sheer size of these image stacks makes manual annotation impractical, so we make heavy use of custom convolutional neural networks to segment these images. (View a full reconstruction online in webKnossos.)

An example of a typical neuron reconstruction project. We segment gray-scale electron microscopy images (left) using machine learning to produce a detailed reconstruction of individual cells at the nanometer scale. Raw data: Max Planck Institute for Brain Research. Segmentation: scalable minds.

Processing these massive datasets runs on large scientific compute clusters and typically takes days or weeks to finish. At the same time, we run many experiments, tweak parameters, and fine-tune the thresholds of our pipeline. We therefore rely on a robust task executor and on declarative configuration files that codify our parameters. We need to be able to reliably re-run experiments, see the differences between runs, and make sure that older results and artifacts can still be used in combination with newer changes to our codebase.

Using Declarative Configs for Maintainable Reproducible Code

In his talk “Using Declarative Configs for Maintainable Reproducible Code” at PyCon US 2021, Jonathan explains how we build a declarative config system with schema validation, version migration, and reproducibility in mind. The full talk was recorded and is available on YouTube. Enjoy!
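To give a rough feel for the three ingredients mentioned above, here is a minimal Python sketch using only the standard library. It is not the system presented in the talk: the schema, the field names (`dataset_path`, `membrane_threshold`, `chunk_size`), the version numbers, and the helpers `load_config` and `config_fingerprint` are all hypothetical and chosen purely for illustration.

```python
from dataclasses import asdict, dataclass
import hashlib
import json

CURRENT_VERSION = 2

# Hypothetical schema: expected field names and their types.
SCHEMA = {
    "version": int,
    "dataset_path": str,
    "membrane_threshold": float,
    "chunk_size": int,
}


@dataclass(frozen=True)
class SegmentationConfig:
    """Immutable, validated parameters for one (illustrative) pipeline run."""
    version: int
    dataset_path: str
    membrane_threshold: float
    chunk_size: int


def _migrate_v1_to_v2(raw: dict) -> dict:
    # Assumed history: v1 named the field "threshold" and had no chunk_size.
    migrated = dict(raw)
    migrated["membrane_threshold"] = migrated.pop("threshold")
    migrated.setdefault("chunk_size", 256)
    migrated["version"] = 2
    return migrated


# Maps a config version to the function that upgrades it by one version.
MIGRATIONS = {1: _migrate_v1_to_v2}


def load_config(raw: dict) -> SegmentationConfig:
    """Upgrade an older config step by step, then validate it against SCHEMA."""
    raw = dict(raw)
    while raw.get("version") != CURRENT_VERSION:
        version = raw.get("version")
        if version not in MIGRATIONS:
            raise ValueError(f"no migration path from version {version!r}")
        raw = MIGRATIONS[version](raw)

    unknown = set(raw) - set(SCHEMA)
    missing = set(SCHEMA) - set(raw)
    if unknown or missing:
        raise ValueError(f"unknown keys: {sorted(unknown)}, missing keys: {sorted(missing)}")
    for key, expected_type in SCHEMA.items():
        if not isinstance(raw[key], expected_type):
            raise ValueError(f"{key!r} must be of type {expected_type.__name__}")
    return SegmentationConfig(**raw)


def config_fingerprint(config: SegmentationConfig) -> str:
    """Stable hash of the config, e.g. for tagging the artifacts of a run."""
    canonical = json.dumps(asdict(config), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


old_config = {"version": 1, "dataset_path": "/data/em_stack", "threshold": 0.5}
config = load_config(old_config)
print(config)
print(config_fingerprint(config))
```

In this sketch, an old version-1 file is upgraded before validation, so results produced with older configs stay usable, and the fingerprint gives each run a stable identifier that can be compared across experiments.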

If you have any questions about the system or want to know more about neuron reconstruction on your datasets, feel free to contact us at hello@scalableminds.com.
