My (pregnant, epidemiologist) wife and daughter in the Wynwood Art District, Miami

LINK and DETECT: how ‘big lab data’ is helping to determine HIV Vertical Transmission routes of exposed infants

Lucien De Voux
Published in Palindrome Data
3 min read · Dec 3, 2019


Lab data is going to be key in the next phase of HIV epidemic management. As more countries stabilise the number of patients on treatment, and mature the support services around them, the wealth of epidemiological value in their national lab data set increases.

One such example is detecting HIV transmission routes of exposed babies in South Africa.

Palindrome Data partnered with the National Institute for Communicable Diseases (NICD), a division of the National Health Laboratory Service (NHLS), and Boston University (BU) to leverage the vast historical lab data set. The aim was to understand which data methods are helpful on such a large set and how to identify and extract the most meaningful patterns.

This work focuses on vertical transmission and the next generation of HIV-exposed infants. The World Health Organization (WHO) and the South African government have guidelines around exclusive breastfeeding in virally suppressed mothers to reduce HIV transmission risk for exposed babies.

Mother-to-child transmission routes

However, with existing query-based diagnostic reporting at scale, we can only tell that a child has become positive, not HOW and WHEN the infection occurred. This is quite critical as it means that we don’t actually know how many transmissions are occurring as a consequence of the breastfeeding guideline.

The NICD’s Prof Gayle Sherman laid out an approach to the big data problem, comprising three key components:

1. LINK: BU’s advanced probabilistic matching algorithm allows for typos, mistakes and other inconsistencies while retaining 96% of identity matches. It is used to link together HIV tests done at birth, 6–10 weeks and 6 months, which was previously not feasible at scale.

2. DETECT: Research by Dr Ahmad Haeri Mazanderani, Prof Sherman and the NICD team showed how a baby’s sequence of tests and test results (or lack thereof) can be used to DETECT how the baby likely became positive.

3. PRIORITISE: Palindrome then encoded these rules to run on a national-scale surveillance set and retrospectively calculate the historical levels of transmission via the different routes.
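To give a feel for what the LINK step involves, here is a minimal sketch of fuzzy record matching. It is not BU’s algorithm: it swaps in a simple weighted similarity score built on Python’s standard `difflib`, and the field names, weights, threshold and sample records are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted score across identifying fields (weights are illustrative)."""
    name = similarity(rec_a["name"], rec_b["name"])
    dob = 1.0 if rec_a["dob"] == rec_b["dob"] else 0.0
    facility = 1.0 if rec_a["facility"] == rec_b["facility"] else 0.0
    return 0.6 * name + 0.3 * dob + 0.1 * facility


def link_records(records: list, threshold: float = 0.85) -> list:
    """Greedily group records that score above the threshold
    against the first record of an existing group."""
    groups = []
    for rec in records:
        for group in groups:
            if match_score(group[0], rec) >= threshold:
                group.append(rec)
                break
        else:
            # No existing group matched, so this record starts a new one.
            groups.append([rec])
    return groups


# Hypothetical test records: the second has a typo in the surname.
records = [
    {"name": "Thandi Mokoena", "dob": "2019-01-05", "facility": "Clinic A", "test": "birth"},
    {"name": "Thandi Mokeona", "dob": "2019-01-05", "facility": "Clinic A", "test": "6-10 weeks"},
    {"name": "Sipho Dlamini", "dob": "2019-02-11", "facility": "Clinic B", "test": "birth"},
]

groups = link_records(records)
for group in groups:
    print([(r["name"], r["test"]) for r in group])
```

The two "Thandi" records end up in the same group despite the misspelled surname, which is the property that makes it possible to follow one infant across several test events. A production-grade probabilistic matcher would estimate field weights from the data rather than hard-coding them.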

Furthermore, we could see how some areas, facility types or possibly types of care-seeking behaviour differed. A major limitation of the national set is that it’s currently extremely hard to link the mother and child pair. If that were possible, it would allow us to understand the mother’s care experience, adherence and most recent viral load (VL), and how these affect the child’s risk for each of the different transmission routes.
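As a rough illustration of how rules over a test sequence could be encoded and then broken down by facility type, here is a small sketch. The rules below are a simplified, hypothetical stand-in and not the NICD’s actual rule set; the route labels, the `facility_type` field and the sample infants are likewise assumptions for the sake of the example.

```python
from collections import Counter


def classify_route(birth, week_6_10, month_6):
    """Classify a likely transmission route from three test results.

    Each argument is "pos", "neg" or None (test not done).
    Illustrative rules only, NOT the NICD's published rules.
    """
    if birth == "pos":
        return "in utero"
    if week_6_10 == "pos":
        if birth == "neg":
            return "intrapartum or early postnatal"
        return "undetermined (no birth test)"
    if month_6 == "pos":
        if week_6_10 == "neg":
            return "postnatal"
        return "undetermined (missing earlier tests)"
    return "no positive result"


# Hypothetical linked infants, each with (birth, 6-10 weeks, 6 months) results.
infants = [
    {"facility_type": "hospital", "tests": ("pos", None, None)},
    {"facility_type": "clinic", "tests": ("neg", "pos", None)},
    {"facility_type": "clinic", "tests": ("neg", "neg", "pos")},
    {"facility_type": "clinic", "tests": (None, None, "pos")},
]

# Tally classified routes per facility type.
counts = Counter(
    (infant["facility_type"], classify_route(*infant["tests"]))
    for infant in infants
)

for (facility_type, route), n in sorted(counts.items()):
    print(facility_type, "|", route, "|", n)
```

Note how the missing tests matter as much as the results: an infant whose first recorded test is a positive at 6 months cannot be assigned a route, which is why linking the full sequence of tests comes first.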

The partnership with Palindrome aims to investigate and model the risk factors by transmission route that will allow us to identify high-risk NEGATIVE babies before they become positive.
