Undiagnosed Hackathon 2023: Moving the Borders of Rare Disease Genetic Diagnostics

German Demidov
Jun 20, 2023


As I promised, the thread about the Undiagnosed Hackathon — I wanted to write it down before I start to forget the details.

Disclaimer

I won’t repeat anything that is already described on the website, but I will try to give an introduction: why it is called a hackathon and why this event is so unique.

Before I even start, I would like to express my gratitude to 1) everyone who organized the Undiagnosed Hackathon, and 2) the Solve-RD consortium, which chose me and sent me to this hackathon. Without this help, this wonderful experience would not have happened to me. The organization of such an event was, without a doubt, a titanic effort by many people. I appreciate it in full, but will not cover the preparation in detail since I was not involved.

Introduction

So, what is it all about? It is about rare disease genetic medicine, in particular diagnostics. Despite the recent progress in this field, around 60% of patients with suspected genetic causes still do not receive a decisive diagnosis. It is true that the diagnosis is just the start and that there is no cure for most such diseases, but knowing the diagnosis brings huge relief to the patient and their family. It makes prognosis and prediction possible, and it allows patients to join forces in patient groups and organize their care efficiently.

An infographic of how complex the rare disease odyssey can be for a particular patient (from here)

So, you may ask: it sounds like a conference for medical doctors who meet, discuss undiagnosed patients and try to find a new diagnosis, so what does it have to do with a “hackathon”? A hackathon is about programming, hacking and solving computational problems. Is it a simple misuse of a buzzword?

Not at all. Modern genetic diagnostics is crucially dependent on specialists from various fields: medicine (medical genetics as well as more “traditional” fields such as neurology), molecular biology, IT and … bioinformatics. I am not going to explain what bioinformatics is; read Wikipedia for that.

I think the closest analogy here would be the different troop types in modern warfare. Please forgive me for this comparison; I just could not come up with a better peaceful analogy. Artillery works in close interaction with air forces, infantry, tanks and others, and if even a single link in this chain is not coordinated with the others, the whole operation will fail. For decades, genetic medicine was studied from the perspective of only one field of science, but in recent years the necessity of “joint warfare against genetic diseases” has become clear. We simply get nowhere without very close interaction between different research fields. This is still not clear to many “old school” professors. Sometimes they just do not support their group’s interdisciplinary activities; sometimes they oppose them. I am absolutely sure that the labs which manage to learn this “combined warfare” will benefit in the long term over the more “traditional” labs.

For example, many bioinformatics labs concentrate on their results being “statistically significant” and “outperforming their competitors”, without thinking about the clinical significance or usability of their software.

Image taken from here

Many clinicians blindly accept software suggested to them by someone (their friends? their department?). In the past, when helping someone with their software, I would sometimes open their clinical decision support system and find that critical features were missing. Imagine you come to inspect your friend’s car and find out that the gear stick is missing. “But where is it?” you ask. “I don’t know,” your friend answers, “our car provider just gave us the car as it is and we drive it.” As a confirmation of my words, I remember a prominent medical doctor who came to our group after the hackathon and said, “I understood the necessity to learn this bioinformatics, I was trying to ignore it for so long, but it is not possible anymore”.

So finding a diagnosis for a patient happens not only like in a movie about Dr. House, but also in front of a computer, by methodically looking for genetic variants which may explain the patient’s problems. It can’t be done in a reasonable time without the help of computational methods. It can fairly be compared with “hacking” a problem at a hackathon, since programming tools are used and are sometimes written just to solve this particular task for this particular patient.

One of the members of our team, from the molecular biology side, asked: “but how do you decide which variants are responsible for the disease?” There is no single answer to this. At the Undiagnosed Hackathon I met specialists who start from the phenotype and others who start from the variants; overall, there were tens of completely different approaches. Diagnostics is still an art. At least until GPT-based solutions outsmart us.

In brief, finding a genetic diagnosis can be compared with a police investigation. You identify several suspects and then you look for clues. Each clue has its own weight: some are convincing, some only slightly shift the balance between the suspect’s guilt and innocence. A medical geneticist has to choose the suspects, combine the clues and cross the threshold of “the suspect is guilty beyond any reasonable doubt”.

Combinations of evidence of different strengths, which make up a diagnosis (from here, posted for illustrative purposes only)
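To make the “weighing clues” idea concrete, here is a toy sketch in Python. It is only an illustration of adding up evidence of different strengths: the point values, the thresholds and the example clues are my assumptions for this sketch, not an official classification scheme and not what any team used at the hackathon.

```python
from dataclasses import dataclass

# Illustrative weights only (assumption): stronger clues move the balance more.
STRENGTH_POINTS = {"supporting": 1, "moderate": 2, "strong": 4, "very_strong": 8}


@dataclass(frozen=True)
class Clue:
    description: str
    strength: str          # one of the keys of STRENGTH_POINTS
    points_to_guilt: bool  # True: supports the variant being causal; False: argues against


def score(clues):
    """Sum the clues: evidence for guilt adds points, evidence against subtracts them."""
    total = 0
    for clue in clues:
        points = STRENGTH_POINTS[clue.strength]
        total += points if clue.points_to_guilt else -points
    return total


def verdict(total, guilty_threshold=10, likely_threshold=6):
    """Map a score to a verdict; the thresholds are hypothetical."""
    if total >= guilty_threshold:
        return "guilty beyond reasonable doubt"
    if total >= likely_threshold:
        return "likely guilty"
    if total <= -likely_threshold:
        return "likely innocent"
    return "uncertain"


# Hypothetical clues for one suspect variant.
clues = [
    Clue("variant arose de novo in the patient", "strong", True),
    Clue("absent from population databases", "moderate", True),
    Clue("in silico tools predict a damaging effect", "supporting", True),
    Clue("phenotype only partially matches the gene", "supporting", False),
]

total = score(clues)
print(total, verdict(total))
```

In this made-up example no single clue is decisive, and the one clue pointing the other way pulls the total back, which is exactly the balancing act described above.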

The reasons why a patient can remain undiagnosed vary. The variant may lie in a non-coding part of the genome, or in a coding part but never have been described before, so that it is unclear whether it has any consequence. The altered gene may not have been described yet. The variant may be epigenetic. Technological limitations may not allow us to see the variant as a whole if it occurs in a complex region of the human genome. Several genes may play a role, or the expressivity of the disease may be variable. There are thousands of other reasons too.

This iconic duo could be a bioinformatician and a medical geneticist nowadays: one provides the support and the other solves the mysteries

Undiagnosed Hackathon

A small part of this huge and modern institute in Stockholm

Finishing the introduction and coming back to the hackathon: almost 100 professionals from more than 20 countries came to Karolinska University Hospital in Stockholm for a weekend to solve the cases of 12 families with undiagnosed genetic conditions.

What worked especially well was the “invitation only” model, which let the organizers bring in participants from all over the world, from Sri Lanka and Australia to Ghana and Japan. As for the patients, several of them were from Stockholm and even came to the event in person, which was very motivating and really erased the border between all the people involved. Most of the patients, however, were from other places: countries in Africa, Thailand, China, Pakistan. It was not possible to ask them to come in person, which is understandable given their underlying medical conditions, but their clinicians were invited and gladly came, so they were able to clarify all the small details about their patients as soon as questions arose.

Our wonderful group of computational scientists, medical doctors and patients, before the two days of hard work

The participants’ seniority varied, from PhD students to heads of labs and internationally recognised professors. But when each of us came into the study rooms where our teams were working on diagnostics, we were all equal. I guess for many highly ranked scientists and clinicians it was an interesting experience, since they normally manage teams of people, but here they had to “return to the bench” and do diagnostics themselves. I guess it was fun for them.

We were divided into teams more or less uniformly, so each team had approximately equal numbers of specialists from each discipline. The teams were called Smileys, Dinosaurs, Animals, Space and Unicorns. I joined the Smileys. Interestingly, many Solve-RD people were assigned to Smileys too. It was nice to feel the support of long-term collaborators.

The Coat of Arms of the Undiagnosed Hackathon. Find the teams there! (special thanks to Yui for this logo)

There was friendly competition between the teams, for sure. However, given that each team got only a few patients, the initial success was highly random. Later, the unsolved patients were shuffled again between teams, but these were already extremely difficult cases, so any “success” at the later stage was tens of times more difficult than the initial one.

What would I change for the future?

Disclaimer: in no sense am I criticizing the organization team or the bioinformaticians at Karolinska. The amount of work done in preparation was titanic. What I say are just suggestions on what could be done better in the next editions of the hackathon.

These rooms became our workplace for 2 days

I beg your pardon for using military analogies again, but, speaking as a medical bioinformatician, I would say that our role is similar to artillery. We shape the battlefield before the infantry and tanks move into action. If we work well, the complexity of the problem for the genetic interpreters is greatly reduced, but if we don’t do our work at all, the diagnosis may become impossible to find. Here I wish we bioinformaticians had had a bit more time to shape the battlefield. The Karolinska bioinformatics team did a very good job; however, I believe we could do even better. I have to admit that I understood which tools I wanted to run only after I arrived at the hackathon.

Since two days were not enough to run my bioinformatic pipelines and get meaningful results to the interpreters in a reasonable time, I participated in the event as an interpreter and, a bit, as an IT person. I have experience in interpreting unusual structural variants (SVs), which could be useful, so I inspected the candidate genes (provided to me by my teammates) for anything visible. Finding such an SV can solve a case “out of nothing”, which has happened multiple times in the past. Unfortunately, it did not work this time. Just by chance, these 12 families did not have a causal SV that I was able to find.

An important consideration is that one of the most powerful annotations is allele frequency (second only to de novo status). For SNVs we can use external databases such as gnomAD, but for SVs and CNVs the frequency is highly dependent on the tool used. So it is not enough to just run tools on the 30+ available datasets; ideally their results should also be annotated with population-specific allele frequencies generated with the same tool. In-house databases could help in this case.
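As a rough illustration of what such an in-house annotation could look like, here is a minimal Python sketch. It assumes that every sample in the cohort was called with the same SV tool, matches calls by 50% reciprocal overlap (a common convention, but my choice here), and reports the fraction of samples carrying a matching call, which is a carrier frequency used as a simple stand-in for allele frequency. The `SV` class, the function names and the coordinates are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # exclusive
    sv_type: str  # e.g. "DEL" or "DUP"


def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions; 0.0 if chromosome or type differ."""
    if a.chrom != b.chrom or a.sv_type != b.sv_type:
        return 0.0
    len_a, len_b = a.end - a.start, b.end - b.start
    if len_a <= 0 or len_b <= 0:
        return 0.0  # zero-length calls (e.g. insertions) need different matching
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / len_a, overlap / len_b)


def inhouse_carrier_frequency(query, cohort_calls, min_overlap=0.5):
    """Fraction of cohort samples with at least one call matching the query.

    cohort_calls maps a sample id to the list of SV calls made for that sample
    by the same tool that produced the query call (assumption of this sketch).
    """
    if not cohort_calls:
        return 0.0
    carriers = sum(
        any(reciprocal_overlap(query, sv) >= min_overlap for sv in calls)
        for calls in cohort_calls.values()
    )
    return carriers / len(cohort_calls)


# Hypothetical in-house cohort of four samples.
cohort = {
    "S1": [SV("chr1", 1000, 2000, "DEL")],
    "S2": [SV("chr1", 1100, 2100, "DEL")],
    "S3": [SV("chr1", 5000, 6000, "DEL")],
    "S4": [],
}
query = SV("chr1", 1050, 2050, "DEL")
print(inhouse_carrier_frequency(query, cohort))
```

A candidate SV that shows up in half of the in-house samples, as in this toy cohort, is far more likely to be a common polymorphism or a tool artifact than the cause of a rare disease.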

A very important difference between the reanalysis projects I had participated in before and this one was breadth versus depth of investigation. Previously my reanalysis included 1–2 omics datasets per patient, but for thousands of patients, with anywhere from several to tens of HPO terms describing the disease, so we concentrated on the so-called “low-hanging fruit”.

This time we had short-read WGS and RNA, long-read WGS and RNA, methylation from both nanopore and arrays, and extensive diagnostic information with videos, MRI and whatnot, but only for 12 patients. So, “breadth” versus “depth”. I have to admit, I was not ready to adapt my approach from the start. I did adapt, but only towards the end of the hackathon. And this is not looking for “low-hanging fruit”: it is looking for the highest-hanging fruit reachable at the current level of technology.

As a computational person, I could not avoid this comparison (image from here)

How many cases were we able to solve? More than I expected (and I expected one out of twelve). Is it the final number? No. There will be follow-up calls, investigations and even in-person meetings to continue working on these cases.

No caption needed

What about the next hackathons? They will happen, but I am not sure if I should go. It is such a unique and outstanding experience, available to only about a hundred professionals worldwide per year, so I think it would be better for me to step aside and let my colleagues experience it. However, if the stars align, I would love to come once again, to feel the excitement and the hard work done together with the very best specialists in medical genetics.

Epilogue

On the way to our first dinner

As for the non-scientific part, we had amazing dinners in the most beautiful places in and around Stockholm, and the Swedish food was delicious. I was trying to keep to my diet, but the Swedish desserts served during coffee breaks (prinsesstårta and kanelbulle, if I am not mistaken) made me forget my promises to myself. We did not have free time during the hackathon, but we socialized during these dinners, and each exchange of experience was valuable for me.

Princess cake will always stay in my heart and probably on my belly too…

You can read interviews with other participants (in Swedish) here or watch the video from the Swedish broadcaster here.

Would I recommend going? Yes, yes, sure. If you see your future in genetic medicine, don’t even think of refusing if you get an invitation. If I had the money, I would also sponsor this event; it will have a huge impact on the future of medicine.
