Introducing Q&Aid — Winner of PyTorch Summer Hackathon

Bogdan Cebere
Dec 13, 2020


Authors: Tudor Cebere, Andrei Manolache, Horia Ion, Bogdan Cebere.

Q&Aid is a healthcare assistant concept that aims to democratize access to high-quality diagnoses. It has the potential to comfort patients, unburden doctors, and build trust through a lifelike doctor-patient relationship.

Q&Aid won 1st prize at the Global PyTorch Summer Hackathon 2020, Web/Mobile section. The code is open-source and available here.


The year 2020 brought changes worldwide and imposed new standards for protecting those around us. In this time of need, hospitals have been among the most strained and overloaded institutions: they are the first line of defense against the pandemic and a desperately needed place for those of us with critical medical needs. With hospitals so full that people in need cannot always see a doctor, we thought a smart healthcare assistant might help.

What it does

Q&Aid proposes a way to address the overload on healthcare institutions by:

  • Providing users with answers to questions about clinical data,
  • Providing the hospital with a transcript of what the patient needs, which reduces waiting time and eases the load on hospital triage.

Q&Aid is a conversational agent that relies on a series of machine learning models to filter, label, and answer medical questions about an image the user provides, as described below. The transcript can then be forwarded to the closest hospitals, and one of them will contact the patient to make an appointment.

Each nearby hospital has its own models trained on private data: a fine-tuned visual question answering (VQA) model, plus other models that depend on the data available (e.g., brain anomaly segmentation). The solution aggregates all the tasks these hospitals can perform into a single chat app, offering the user results and features from all nearby hospitals. When the chat ends, the transcript is forwarded to each hospital, where a doctor is in charge of the final decision.
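To make the aggregation idea concrete, here is a minimal sketch of how the tasks offered by several hospitals could be merged into one capability list for the chat app. The hospital names, the task names, and the `aggregate_capabilities` helper are illustrative assumptions, not code from the Q&Aid repository.

```python
from collections import defaultdict

# Hypothetical registry: hospital name -> tasks its local models support.
HOSPITAL_TASKS = {
    "hospital_a": {"vqa", "segmentation"},
    "hospital_b": {"vqa"},
}


def aggregate_capabilities(hospital_tasks):
    """Map each task to the sorted list of hospitals that can perform it."""
    providers = defaultdict(list)
    for hospital, tasks in hospital_tasks.items():
        for task in tasks:
            providers[task].append(hospital)
    # Sort tasks and hospital names so the result is deterministic.
    return {task: sorted(names) for task, names in sorted(providers.items())}


capabilities = aggregate_capabilities(HOSPITAL_TASKS)
print(capabilities)
# {'segmentation': ['hospital_a'], 'vqa': ['hospital_a', 'hospital_b']}
```

With a mapping like this, the chat app could show every available task once and still know which hospitals should receive a given query.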

High-level overview of Q&Aid’s analysis stages.

Q&Aid can simplify the hospital backend logic by standardizing it as a Health Intel Provider (HIP). A HIP is a collection of models trained on local data that receives text and visual input, then filters and labels the data, feeds it to the right models, and finally generates output for the aggregator. Each hospital is identified as a HIP holding custom models and labels based on its own knowledge.
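The HIP flow described above (filter, label, route to the right model, return output for the aggregator) can be sketched roughly as follows. The class name, field names, and stub models are assumptions made for illustration; the real models would be PyTorch networks rather than lambdas.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class HealthIntelProvider:
    """Hypothetical HIP: filter, label, then route to a per-label VQA model."""

    name: str
    is_medical: Callable[[bytes], bool]   # medical / non-medical filter
    label_image: Callable[[bytes], str]   # e.g. "brain" or "chest"
    vqa_models: Dict[str, Callable[[bytes, str], str]] = field(default_factory=dict)

    def answer(self, image: bytes, question: str) -> Optional[dict]:
        # Step 1: reject non-medical input before any other model runs.
        if not self.is_medical(image):
            return None
        # Step 2: label the image to decide which model should handle it.
        label = self.label_image(image)
        model = self.vqa_models.get(label)
        if model is None:
            return None  # this HIP has no model for that label
        # Step 3: produce the output that is handed to the aggregator.
        return {"source": self.name, "label": label, "answer": model(image, question)}


# Stub models stand in for trained networks in this sketch.
hip = HealthIntelProvider(
    name="hospital_a",
    is_medical=lambda image: len(image) > 0,
    label_image=lambda image: "brain",
    vqa_models={"brain": lambda image, question: "stub answer"},
)
result = hip.answer(b"fake-image-bytes", "Is there an abnormality?")
```

Keeping the filter, labeler, and per-label models behind one interface is what would let each hospital plug in its own models without changing the aggregator.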

How we built it

There are three sections of the project that are worth mentioning:


Without any mobile development experience, we were looking for a solution to build an authenticated application that works on Android and iOS.

We chose React Native and AWS Amplify as solutions because of the plethora of tutorials and examples.

We used this tutorial for the development setup, and we followed the AWS Amplify tutorials here for creating an authenticated application.

The chat component is based on the awesome GiftedChat.


Built using FastAPI, the component creates a bridge between the mobile application and the Q-Aid-Models.

The API contains the following paths:

  • /sources: Returns a list of registered hospitals (or HIPs).
  • /capabilities: Returns a list of available tasks to perform.
  • /vqa: Handles a VQA query.
  • /segmentation: Handles a segmentation query.
  • /prefilter: Checks if the input is a valid medical image from a supported category and runs a predefined list of questions against the VQA logic. Even at this prefilter stage, we return several insights about the input.
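As a rough illustration of the prefilter step, the sketch below shows the logic in plain Python, independent of FastAPI: validate the image, check its category, then run a fixed list of questions through the VQA model. The function signature, the response fields, and the question list are assumptions; the set of supported labels follows the article's statement that VQA currently supports only brain and chest.

```python
# Categories the VQA model supports, per the article.
SUPPORTED_LABELS = {"brain", "chest"}

# Placeholder questions; the project's actual predefined list is not given here.
DEFAULT_QUESTIONS = ("What is shown in the image?", "Is there an abnormality?")


def prefilter(image, is_medical, label_image, vqa, questions=DEFAULT_QUESTIONS):
    """Validate an image, then collect answers to the predefined questions."""
    if not is_medical(image):
        return {"valid": False, "reason": "not a medical image"}
    label = label_image(image)
    if label not in SUPPORTED_LABELS:
        return {"valid": False, "reason": f"unsupported category: {label}"}
    # Each predefined question is answered by the VQA model for this image.
    insights = {question: vqa(image, question) for question in questions}
    return {"valid": True, "label": label, "insights": insights}


# Usage with stub models in place of the trained networks.
response = prefilter(
    b"fake-image-bytes",
    is_medical=lambda image: True,
    label_image=lambda image: "chest",
    vqa=lambda image, question: "stub answer",
)
```

In a FastAPI service, a function like this could sit behind the /prefilter route, with the three callables bound to the loaded models.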

We built and pushed a Docker image using this Dockerfile, to ease the AWS deployment.

The final step, the AWS deployment, was inspired by this tutorial, and resulted in a set of scripts that deployed the full cloud infrastructure.


Visual Question Answering is a challenging task for modern machine learning. It requires an AI system that can understand both images and language, so that it can answer text-based questions given a visual context (an image, CT scan, MRI scan, etc.).


Our VQA engine is based on MedVQA, a state-of-the-art model trained on medical images and questions, using Meta-Learning and a Convolutional Autoencoder for representation extraction, as presented here.

Medical segmentation is the task of highlighting a region or set of regions with a specific property. While this task is mostly solved in the general-purpose setting, it remains hard in the medical domain for three reasons: the inherent difficulty of the problem, the higher error rate of humans when highlighting abnormalities in the brain, and the lack of data.

Our model uses a U-Net architecture, an encoder-decoder network with skip connections, based on downsampling and upsampling, that performs well at localizing different features, as presented in the PyTorch Hub, thanks to the work of Mateusz Buda.

Medical labeling is the task of determining what kind of image the user is feeding into the app. So far, the possible labels are brain, chest, breast, eyes, heart, elbow, forearm, hand, humerus, shoulder, and wrist. Currently, our VQA model supports only brain and chest, but we are working on adding support for more labels.

Our model uses a DenseNet121 architecture from the torchvision module, an architecture that has proven suitable for medical imagery in projects like MONAI, which uses it extensively.

Medical filtering is the task of sorting images into two sets, medical and non-medical, since we want to filter out all non-medical data before it is fed into the other machine learning models.

Our model uses a DenseNet121 architecture from the torchvision module.


The datasets used in this project are the augmented version of:

What’s next

Q&Aid has several tracks for its future:

  • Integrating more medical machine learning models.
  • Integrating OpenMined’s technologies for privacy.
  • Recruiting medical experts, doctors, and patients.


  1. Binh D. Nguyen, Thanh-Toan Do, Binh X. Nguyen, Tuong Do, Erman Tjiputra, Quang D. Tran. Overcoming Data Limitation in Medical Visual Question Answering (MICCAI, 2019).


We thank Cătălina Albișteanu for providing valuable feedback and suggestions.

And congratulations to the PyTorch team for organizing a fantastic competition.