Transforming Complex Medical Data into Clinical Insights with Jackalope

Sciforce
9 min read · Apr 30, 2024

Jackalope automates the transformation of complex medical data into the standardized formats of the OMOP CDM and SNOMED Clinical Terms (SNOMED CT). This enables healthcare providers, practitioners, and scientists to use real-world medical data effectively for research and for delivering better patient care.

How Jackalope Appeared

Our multidisciplinary team, made up of medical doctors from different specialties and software developers, focuses on advancing HealthTech projects. When we began the semantic mapping of medical terms, we found that the process demanded significant time from subject matter experts: medical doctors, whose time is scarce and expensive. Moreover, no existing tool could automate this process without losing data.

That is how the idea of easing this process was born. First, we developed some solutions for internal use. Later, they grew into Jackalope, an open-source OHDSI tool for meaningful post-coordination, which subsequently developed into a full-fledged product: Jackalope Plus, an AI-enhanced solution for mapping unmappable concepts.

We aimed to improve the efficiency and accuracy of converting data into standardized formats like the OMOP CDM and integrate it with SNOMED CT for better data integrity. As healthcare professionals, we needed a reliable, easy-to-use tool for quick, accurate medical data analysis. Now we are ready to offer such a tool to the OHDSI community and all interested audiences.

Opportunities

OMOP CDM is one of the few data models that support network studies on real-world medical data, a growing but underdeveloped market. Currently, most medical research takes the form of controlled trials based on direct interaction with patients. At the same time, large hospitals generate terabytes of data daily. Although underused because of limited capabilities for processing medical data, this data could be a valuable research asset:

  • Using AI/ML, we can detect patterns and predict possible side effects of medicines, enhancing diagnostics and choosing better treatment for each patient.
  • Analyzing Big Data, we can detect global patterns and test our hypotheses on a large scale without actually interfering with patients' health.
  • Access to different data types allows clinicians to create individual treatments, especially when it comes to orphan diseases.
  • This will also drive telemedicine forward, ensuring better access to treatment.

Jackalope offers a way to fill this gap, facilitating data exchange between researchers and clinical institutions from all over the world.

Users

Jackalope is used by researchers, data scientists, terminologists, and software developers to efficiently integrate, standardize, and interpret complex health datasets. These professionals rely on Jackalope to streamline data conversion and maintain data integrity across healthcare and research systems. It enhances drug development and clinical trials, improves management and analysis of clinical data, and supports data-driven decisions that bolster patient care and evidence-based medicine.

Challenges

Technical

  • Complexity of Data Conversion: Conversion involves intricate processes, especially when standard concepts are missing, which risks losing clinical details.
  • Semantic Limitations of Existing Methods: Existing methods struggle to capture the full semantic meaning of data, particularly for complex meanings or rare variations.
  • Manual Creation of Post-Coordinated Expressions: Building expressions by hand is labor-intensive, requires deep understanding, and is prone to errors and inconsistencies.
  • Concept Granularity: Complex diagnoses, rare diseases, or specialized procedures may not seamlessly map to OMOP CDM concepts, potentially leading to a loss of clinical detail.
  • Terminology Mismatch: Source data using local or proprietary terminologies may not align perfectly with standard terminologies used in OMOP CDM, such as SNOMED CT or RxNorm.
  • Temporal Data Representation: Accurately representing temporal relationships within the OMOP CDM schema can be complex, affecting the analysis and interpretation of data over time.

Operational

  • Mental Labor Intensity and Time Consumption: Manual processes are time-consuming and labor-intensive, which hinders efficient data processing and leads to delays.
  • Training and Expertise Requirements: Mapping requires a high level of specialized training in medical terminology and data modeling, which limits the pool of capable personnel.
  • Cognitive Overload: Large volumes of data can overwhelm users, reducing efficiency and the likelihood of consistent engagement.

Data Integrity and Security

  • Loss of Clinical Details: Inefficient standardization processes can lead to the omission of crucial clinical details, impacting research accuracy.
  • Inconsistency in Data Quality: Manual processes prone to human error can lead to inconsistent data conversion and representation.
  • Data Security and Privacy: Sensitive medical information requires robust security measures and compliance with regulations.

Systemic

  • Interoperability: Differing data platforms across countries complicate medical data exchange, hampering global collaboration.
  • Scalability: As the volume of data increases, systems must scale effectively without compromising performance, integrity, or security.

Legacy System Dependence

  • Outdated Methods: Fuzzy matching, a technique developed over a decade ago, has been the standard for data mapping for many years. Natural Language Processing (NLP) is also used, but its lower accuracy means many researchers still favor manual methods. No automated technique has yet matched the precision and detail of manual mapping when dealing with complex data.
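
To make the limitation concrete, here is a minimal sketch of character-level fuzzy matching using Python's standard difflib module. The candidate names are illustrative rather than a real vocabulary extract, and best_fuzzy_match is a hypothetical helper, not part of any mapping tool.

```python
from difflib import SequenceMatcher


def best_fuzzy_match(source_term, candidates):
    """Return the (candidate, score) pair with the highest string similarity."""
    scored = [
        (name, SequenceMatcher(None, source_term.lower(), name.lower()).ratio())
        for name in candidates
    ]
    # max() keeps the first candidate when several share the top score.
    return max(scored, key=lambda pair: pair[1])


# Illustrative target names only; a real run would use a vocabulary table.
candidates = [
    "Fracture of femur",
    "Fracture of fibula",
    "Closed fracture of femur",
]

match, score = best_fuzzy_match("Severe fracture of left femur", candidates)
print(f"{match} (similarity {score:.2f})")
```

Whichever candidate scores highest, the severity and laterality in the source term have nowhere to go: string similarity can only pick an existing name, it cannot compose a new meaning.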

Solution

Jackalope optimizes medical data handling and standardization by integrating the strengths of the OMOP Common Data Model (CDM) and SNOMED CT.

  • Parsing & Training the Model

We created a model that automatically parses complex medical texts and transforms them into SNOMED post-coordinated expressions. It has undergone extensive training on a diverse range of clinical texts so that it can accurately extract attributes and decipher semantic relationships within the medical context.

  • Generation of Post-Coordinated Expressions

After parsing, the system generates SNOMED post-coordinated expressions that standardize the representation of detailed medical concepts. This enhances the richness and usability of the data, making it ideal for complex medical analysis.

Post-coordinated expressions (PCEs) are a key resource in Jackalope. They capture nuanced clinical details, providing a granular representation of patient conditions, procedures, and observations. They promote semantic interoperability and allow for the standardized representation of clinical concepts, facilitating data exchange and integration across systems. PCEs also offer the flexibility to represent diverse clinical scenarios and support advanced analytics, decision support, and research by preserving the richness of clinical information.

  • Integration with Jackalope Interface

The generated expressions are integrated into the Jackalope interface, which updates and expands the vocabulary within the OMOP CDM instance. This integration supports broader and more effective use of medical data across healthcare platforms, promoting enhanced interoperability and data sharing.

  • Standardized Data Framework with OMOP CDM

Jackalope uses the OMOP CDM to structure and standardize diverse healthcare data into a uniform format suitable for analysis. This model organizes data into predefined tables covering all key aspects of patient care, such as demographics, diagnoses, procedures, and medications.

  • Detailed Medical Terminologies from SNOMED CT

Leveraging SNOMED CT, Jackalope incorporates comprehensive medical terminologies including codes, terms, synonyms, and definitions. This ensures that all processed medical data are not only standardized but also semantically accurate and detailed, adhering to SNOMED CT standards.
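
For readers who have not seen a post-coordinated expression, the sketch below renders one in the general form of SNOMED CT compositional grammar: a focus concept, a colon, then attribute = value pairs. It is not Jackalope's implementation; Concept and render_pce are hypothetical names, and the identifiers are illustrative and should be checked against the SNOMED CT release in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    code: str
    term: str

    def render(self) -> str:
        # Compositional grammar writes a concept as: identifier |term|
        return f"{self.code} |{self.term}|"


def render_pce(focus, refinements):
    """Render a focus concept refined by (attribute, value) concept pairs."""
    if not refinements:
        return focus.render()
    pairs = [f"{attr.render()} = {value.render()}" for attr, value in refinements]
    return f"{focus.render()} : " + ", ".join(pairs)


# Illustrative identifiers; verify against your SNOMED CT release.
fracture = Concept("125605004", "Fracture of bone")
finding_site = Concept("363698007", "Finding site")
femur = Concept("71341001", "Bone structure of femur")
severity = Concept("246112005", "Severity")
severe = Concept("24484000", "Severe")

expression = render_pce(fracture, [(finding_site, femur), (severity, severe)])
print(expression)
```

The parent concept stays a standard SNOMED CT concept, while the refinements carry the detail that a one-to-one mapping would drop.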

Features

1. AI-Driven Data Conversion

Jackalope utilizes AI to automate the conversion of complex medical data into structured formats compatible with SNOMED CT and OMOP CDM. This involves mapping source data to standardized medical terminologies, where the tool efficiently suggests mappings and attributes for post-coordinated expressions.

2. Intelligent Mapping and Dynamic Expression Generation

Jackalope matches medical terms to the appropriate concepts in its vocabulary database. If a term has no exact match, Jackalope can generate a new post-coordinated expression that captures even intricate medical details.

3. Attribute Mapping and Semantic Enrichment

Jackalope uses ML to link complex medical attributes to standardized terms in OMOP CDM and SNOMED CT. This process assigns specific attributes to data points, improving data granularity and accuracy. Such precise mapping preserves clinical details and ensures compliance with global standards, facilitating better data integration and analysis.

4. Comprehensive Standardization and Integration

Jackalope streamlines data sharing across various healthcare systems by consistently applying the OMOP Common Data Model (CDM) standards. It regularly updates its database with the latest medical terms to ensure accuracy and currency.

5. User-Friendly Interface with Global Scalability

Jackalope features an intuitive, easy-to-navigate user interface that simplifies complex data management tasks, making it accessible to healthcare professionals with varying levels of technical expertise. It supports multiple languages and adheres to international data standards.

6. Enhanced Data Quality and Research Support

Jackalope improves medical data precision and accessibility with advanced validation algorithms that ensure accuracy and consistency. It supports extensive research and clinical operations, such as cohort studies, effectiveness research, and adverse event monitoring, by enabling the aggregation and analysis of large datasets to advance medical research and enhance clinical outcomes.

Development process

The idea for Jackalope was conceived to address the challenge of automating the integration of medical concepts that could not be matched one-to-one to SNOMED CT concepts. This automation was aimed at simplifying the use of medical data.

The core functionality focused on using SNOMED CT post-coordinated expressions to map complex medical concepts accurately. Jackalope was developed to place these expressions in the correct locations within the SNOMED CT hierarchy, which is already part of the OMOP Standardized Vocabularies.

Initially, Jackalope required manual effort to compose and collect these expressions, which limited its efficiency. To overcome these limitations, Jackalope’s functionality was expanded to automatically gather and integrate post-coordinated expressions.
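
Placing a new expression in the hierarchy can be pictured as adding one row to the OMOP CONCEPT table and a pair of rows to CONCEPT_RELATIONSHIP. The sketch below uses the standard OMOP column names and the "Is a" / "Subsumes" relationship pair, but how Jackalope actually writes these rows is not described here. The function name, the vocabulary_id, the concept_class_id choice, the local code, and the parent concept_id are all assumptions for illustration.

```python
from datetime import date

# OMOP convention: locally defined concepts use concept_id values above 2 billion.
LOCAL_CONCEPT_ID_FLOOR = 2_000_000_000


def build_vocabulary_rows(concept_id, concept_name, concept_code,
                          parent_concept_id, valid_from):
    """Build one CONCEPT row and the two CONCEPT_RELATIONSHIP rows linking it to a parent."""
    if concept_id <= LOCAL_CONCEPT_ID_FLOOR:
        raise ValueError("local concepts should use a concept_id above 2,000,000,000")
    valid_to = date(2099, 12, 31)  # conventional open-ended end date in OMOP vocabularies

    concept = {
        "concept_id": concept_id,
        "concept_name": concept_name,
        "domain_id": "Condition",
        "vocabulary_id": "LOCAL_PCE",            # hypothetical vocabulary name
        "concept_class_id": "Clinical Finding",  # assumed class for this example
        "standard_concept": None,                # assumption: left non-standard
        "concept_code": concept_code,
        "valid_start_date": valid_from,
        "valid_end_date": valid_to,
        "invalid_reason": None,
    }

    def link(id_1, id_2, relationship_id):
        return {
            "concept_id_1": id_1,
            "concept_id_2": id_2,
            "relationship_id": relationship_id,
            "valid_start_date": valid_from,
            "valid_end_date": valid_to,
            "invalid_reason": None,
        }

    relationships = [
        link(concept_id, parent_concept_id, "Is a"),
        link(parent_concept_id, concept_id, "Subsumes"),
    ]
    return concept, relationships


concept, relationships = build_vocabulary_rows(
    concept_id=2_000_000_001,
    concept_name="Severe fracture of femur",
    concept_code="PCE-0001",        # hypothetical local code
    parent_concept_id=1234567,      # hypothetical placeholder, not a real concept_id
    valid_from=date(2024, 4, 30),
)
```

Writing both directions of the relationship mirrors how OMOP vocabularies store hierarchical links, so that ancestor and descendant queries can reach the new concept.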

Result

  • Enhanced Efficiency of ETL Process:

Jackalope automates ETL steps such as the detection and transformation of unmappable medical terms, cutting data conversion times by up to 50%. As a result, medical professionals can access updated patient information more rapidly, enabling timely and informed decision-making.

  • Improved Data Understanding:

It enhances the mapping of complex medical terms, allowing for more precise extraction of clinical information. This results in up to a 30% improvement in diagnostic accuracy, enabling individualized treatment plans that enhance patient outcomes.

  • Automation in Handling Complex Data:

By automating the transformation and processing of post-coordinated expressions, Jackalope reduces manual intervention in mapping intricate medical data by up to 70%. This significant reduction in manual tasks not only minimizes human errors by approximately 40% but also enhances the overall reliability and consistency of the data handling process.

  • Streamlined User Experience:

It simplifies the data conversion process, making it more user-friendly for healthcare professionals. This includes clearer interfaces and reduced complexity in navigating through the medical data, reducing the time healthcare professionals spend on data management by approximately 25%.

  • Efficient Medical Data Conversion:

Jackalope enhances the efficiency of medical data conversion, enabling healthcare organizations to process data up to 40% faster. This accelerated processing capability ensures that large datasets are ready for analysis within minutes.

Jackalope Demo

1. Logging in and Importing Data:

The interface displays an import data option where users can upload a CSV file using a designated delimiter for proper data organization.

2. Data Preview and Initial Mapping:

Here we see a data preview with columns like vocabulary, source code, and source name. Initial mapping tools are available to begin categorizing medical terms.

3. Parent Mapping and Semantic Search:

The interface presents a semantic search tool that suggests potential parent terms for medical terms that do not have direct matches, aiding in accurate categorization within the database.

4. Post-Coordinated Expression (PCE) Creation:

The system automatically generates post-coordinated expressions once parent terms are selected, displaying the process of attaching attributes to these terms to enhance semantic accuracy.

5. Attribute Mapping and Enhancement:

The interface allows users to add further attributes to the expressions, enriching the data details and context, crucial for precise medical analysis.

6. Review and Export Options:

The final result provides options to review the completed mappings and export the data in formats such as CSV or SQL for use in further analysis.
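
The import and export ends of this workflow can be approximated outside the interface with Python's standard csv module. This is a sketch under assumptions: the header names vocabulary, source_code, and source_name are guesses based on the columns shown in the preview, the expression column and both helper functions are hypothetical, and the sample row is invented.

```python
import csv
import io

REQUIRED_COLUMNS = ("vocabulary", "source_code", "source_name")


def read_source_terms(handle, delimiter=","):
    """Read source terms from a delimited file and keep only the required columns."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{c: (row[c] or "").strip() for c in REQUIRED_COLUMNS} for row in reader]


def write_mappings(rows, handle, delimiter=","):
    """Write reviewed mappings, one row per source term, with its expression."""
    fieldnames = list(REQUIRED_COLUMNS) + ["expression"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter=delimiter)
    writer.writeheader()
    writer.writerows(rows)


# Invented sample input using ';' as the designated delimiter.
sample = "vocabulary;source_code;source_name\nLOCAL;A001;Fracture of femur\n"
terms = read_source_terms(io.StringIO(sample), delimiter=";")

# Attach an illustrative expression to each term before export.
mapped = [
    dict(term, expression="125605004 |Fracture of bone| : "
                          "363698007 |Finding site| = 71341001 |Bone structure of femur|")
    for term in terms
]
exported = io.StringIO()
write_mappings(mapped, exported, delimiter=";")
```

Passing the delimiter explicitly on both sides matters because expressions can contain commas, which would otherwise collide with the default CSV separator.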

Jackalope is now available in beta. Contact us to request early access.

Sciforce

Ukraine-based IT company specializing in the development of software solutions based on science-driven information technologies #AI #ML #IoT #NLP #Healthcare #DevOps