Developing a system to automate PCR testing during the pandemic.

Cool things my team and I learned while helping to build a system that improves qPCR test analysis for SARS-CoV-2.

About who’s writing.

Hello, my name is Santiago Goncalves and I’m a 26-year-old Software Engineer from Montevideo, a small city in an even smaller country, Uruguay. I’ve been in the software development business for almost 10 years now, and I’ve spent the last three and a half years building up my Statistics & Data Science skills as well.

This is a series of 2 posts about the project: this one tells the story of our journey, and the next one is a hands-on guide to applying Statistics and Artificial Intelligence to qPCR (quantitative polymerase chain reaction) analysis & predictions. Hope you enjoy them!

How this project started.

First, let’s rewind a little bit and talk about how this project became possible in the first place. One of my boss’s closest friends happens to be Gonzalo Moratorio, Associate Researcher & Virologist at Institut Pasteur Montevideo, and “one of the top 10 people who helped shape science” according to Nature.

Gonzalo Moratorio
Yes, this cool guy with vanished hair.

Why? He and his team developed the test locally, which gave Uruguay more independence and allowed it to increase the number of tests carried out during the worst moments of the pandemic in 2020.

In one of their casual conversations, the possibility of somehow applying technology to speed up the sample processing of COVID-19 tests popped up. And that’s where my colleagues and I come in. My boss called us, and we started working on how to bring all those ideas to reality.

Institut Pasteur in Montevideo, Uruguay.

Arriving at the lab for the first time.

In this section I’ll briefly explain some important things we learned about the whole operation at Institut Pasteur’s lab, where we were kindly received by Gonzalo and his whole team. During the tour we witnessed the entire process and discussed all the improvements we could make.

So what is qPCR?

qPCR stands for quantitative polymerase chain reaction and is a technology used for measuring DNA.

It is a technique widely used in Molecular Biology to amplify a segment (fragment) of DNA (deoxyribonucleic acid), the well-known pair of chains that form the typical double helix.

Each chain is a sequence of nucleotides, the pieces that link together one after another to build it.

How is the amplification performed by PCR?

Thanks to an enzyme called DNA polymerase. It copies the “letters” of the DNA chain with the help of a signal: a short sequence of nucleotides (a short phrase of letters) that binds to the DNA and tells the polymerase where to start.

If we know the sequence, we design two signals or “primers” at the beginning and at the end of the “target” sequence. This is how we teach the polymerase what to copy using the “bricks or nucleotides” that we also have to provide.
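To make the idea of primers marking the start and end of the target a bit more concrete, here is a toy sketch in Python. It is only an illustration under strong assumptions: the template and primer strings are made up, real primers are usually around 20 nucleotides long, and the function names (`reverse_complement`, `find_amplicon`) are hypothetical helpers, not part of our system. The reverse primer binds to the opposite strand, which is why the code searches for its reverse complement.

```python
# Toy "in silico PCR": locate the fragment delimited by two primers.
# All sequences below are invented for illustration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read in the 5'->3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def find_amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the target fragment bounded by both primers, or None if not found."""
    start = template.find(forward_primer)
    if start == -1:
        return None  # the forward primer does not bind anywhere
    # The reverse primer pairs with the other strand, so on this strand
    # we look for its reverse complement, downstream of the forward primer.
    end_site = reverse_complement(reverse_primer)
    end = template.find(end_site, start + len(forward_primer))
    if end == -1:
        return None  # the reverse primer does not bind downstream
    return template[start:end + len(end_site)]


template = "TTGACCATGGCATTACGGATCCAAGT"
amplicon = find_amplicon(template, forward_primer="ACCATG", reverse_primer="GATCCG")
print(amplicon)
```

If either primer has no binding site, the function returns `None`, which mirrors the fact that without both signals the polymerase has nothing to copy.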

Real-time PCR is carried out in a thermal cycler that illuminates each sample with a beam of light of at least one specified wavelength to detect the fluorescence emitted by the excited fluorophore. The thermal cycler is also able to rapidly heat and chill samples, thereby taking advantage of the physicochemical properties of the nucleic acids and DNA polymerase.

The PCR process generally consists of a series of temperature changes that are repeated 25–50 times.

Here we can see 2 target sequences amplified over 40 cycles for an individual test.

Usually, lab scientists look at plots like the one above for each sample and check whether the sequences are valid.
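To get a feel for why repeating those temperature cycles is so powerful, here is a small sketch of idealized amplification. It assumes the textbook model in which each cycle multiplies the number of copies by (1 + efficiency), with an efficiency of 1.0 meaning perfect doubling. The function name `copies_after` and its parameters are my own, and real reactions eventually plateau as reagents run out, which this model ignores.

```python
def copies_after(cycles: int, initial_copies: float = 1.0, efficiency: float = 1.0) -> float:
    """Idealized number of DNA copies after a given number of PCR cycles.

    Assumes every cycle multiplies the copies by (1 + efficiency);
    efficiency = 1.0 is perfect doubling. No plateau is modeled.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (10, 25, 40):
    print(f"{n} cycles: about {copies_after(n):.3g} copies from a single starting molecule")
```

Even with a lower efficiency the growth stays exponential, which is why a few dozen cycles are enough to make a tiny amount of viral material detectable.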

What is group/pooled testing?

The idea of pooled testing is that it allows testing small groups called pools using only one test. This means you can test more people faster, using fewer tests and for less money.

Instead of using one sample per test, samples from multiple individuals are mixed together and tested as one. If test results are negative, everyone in the pool is clear. If positive, each member of the pool is then tested individually.

Fortunately, researchers have already shown that pooled testing is about as accurate as individual testing in pools as large as eight people.

Another concern to be aware of is infection prevalence. Pooled testing only pays off when the prevalence of the virus in the population is low: a high share of positives adds noise, makes results harder to check, and means most samples end up being tested individually anyway.
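The effect of prevalence can be sketched with a bit of arithmetic. The snippet below estimates the expected number of tests per person for the two-stage scheme described above (one test per pool, then individual retests for positive pools). It is a simplified model, not the lab's actual procedure: it assumes a perfectly accurate test and independent infections, and the function name `expected_tests_per_person` is hypothetical.

```python
def expected_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per person for two-stage pooled testing.

    Assumptions (simplifications): the test is perfectly accurate and
    each person is infected independently with probability `prevalence`.
    """
    if pool_size < 2:
        raise ValueError("a pool needs at least 2 samples")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    # Probability that nobody in the pool is infected.
    p_pool_negative = (1.0 - prevalence) ** pool_size
    # One pooled test shared by the group, plus one individual retest
    # per person whenever the pool comes back positive.
    return 1.0 / pool_size + (1.0 - p_pool_negative)


for prevalence in (0.01, 0.05, 0.20, 0.50):
    cost = expected_tests_per_person(prevalence, pool_size=8)
    verdict = "saves tests" if cost < 1.0 else "worse than individual testing"
    print(f"prevalence {prevalence:.0%}: {cost:.2f} tests per person ({verdict})")
```

Any value below 1.0 means pooling uses fewer tests than testing everyone individually. As prevalence rises, almost every pool turns positive and the pooled test becomes pure overhead.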

Integra CCS team at the lab.

Improving things.

So, the purpose of this whole system was to help them improve their entire testing pipeline, free of charge, as a way of recognizing all their effort in helping the country.

To help them organize and automate the whole process, from the moment samples arrive at the lab until their results are reported, we developed several features that now enable them to:

  • Manage all their health providers and the format in which they receive sample data.
  • Manage samples visually, loading any type of spreadsheet or via API.
  • Manage testing groups (a.k.a. ‘pools’ or ‘pooled’ tests).
  • Manage PCR plate setups visually or with an API.
  • View test results pre-validated using Artificial Intelligence.
  • Analyse stats of all processed tests and their final result.
  • Report results to government health authorities and individuals.

We built this project with a low-code/no-code tool that we developed during my early years of work. It is used to build UIs by dragging and dropping elements and to write their logic in any flavor of JavaScript.

To wrap up this story, I just want to thank all the people who were part of this process in one way or another. Also, thank you for reading, and I hope you’ve enjoyed it. If you have any questions or comments, please feel free to get in touch :)

The second article about this project is already available to read here.
