Evaluating External Events as Precursors of Soldier Musculoskeletal Injuries

Published in d*classified · 4 min read · Aug 28, 2023

Chua Kah Sheng, Senior Data Scientist, Enterprise Digital Services, and his team used Natural Language Processing (NLP) to mine information from the unstructured clinical notes recorded by Medical Officers. The team also linked medical data with demographic data to identify the units with the most MSK-related visits for targeted interventions. This work contributes to the formulation of injury management policies and the allocation of medical resources to units that require more medical attention.

BACKGROUND CHALLENGE

Musculoskeletal (MSK) injuries are non-combat injuries that affect soldier health and the readiness of our armed forces. They reduce readiness by limiting deployment and causing missed training cycles, and in some cases they develop into chronic pain, long-term disability, or secondary health problems. For this reason, DSTA and the Singapore Armed Forces (SAF) closely monitor MSK injuries in the SAF.

Photo by Filip Andrejevic on Unsplash

APPROACH

Modelling predictors of injuries is a complex endeavour that should consider both intrinsic factors (e.g. pre-existing conditions, past injuries) and extrinsic factors (e.g. events and training programmes). For this study, we looked more closely at the key events that preceded a reported injury. Examples of key events in the SAF include route marches and Individual Physical Proficiency Tests (IPPT).

This information is typically captured within the unstructured clinical notes recorded by the Medical Officers. Each clinical note typically ranges from 50 to 100 words. Users need to extract the information highlighted in yellow in the example clinical notes. However, manually screening thousands of clinical notes to extract these insights was too tedious.

Example of Clinical Notes

Identify Leading Events of Injuries using Natural Language Processing (NLP)

To tackle this, the data science team worked closely with the HQ for Medical Corps (HQMC) to apply NLP, specifically deep learning techniques, to mine thousands of unstructured clinical notes for useful information.

First, the unstructured data was pre-processed to remove irrelevant information and prepare it for analysis. A text classification model was then built to pick out the sentences with a high confidence score of containing events leading to the injury (e.g., "complained of ankle pain during 16km route march"). Next, a Named Entity Recognition (NER) model was applied to these sentences to extract the specific keywords or phrases describing the event leading to the injury (e.g., "16km route march"). Finally, using text clustering with word embedding techniques, we grouped the extracted phrases into event categories (e.g., "low rope" and "high wall" under the "Standard Obstacle Course") for easier interpretation. With this approach, we were able to identify the groupings of events that lead to injuries with 82% accuracy.

Information Retrieval Pipeline
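
To make the flow of the three stages concrete, here is a minimal sketch in Python. It is not the team's implementation: simple keyword rules stand in for the deep learning classifier and the NER model, and a hand-written lookup table stands in for embedding-based clustering. The cue words, the event lexicon, the function names and the sample notes are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical cue words suggesting a sentence describes an event leading to injury.
EVENT_CUES = ("during", "after", "while")

# Hypothetical lexicon mapping an extracted phrase to its event grouping.
# In the article this grouping comes from clustering word embeddings.
EVENT_GROUPS = {
    "route march": "Route March",
    "ippt": "IPPT",
    "low rope": "Standard Obstacle Course",
    "high wall": "Standard Obstacle Course",
}

# Invented sample notes (not real clinical data).
SAMPLE_NOTES = [
    "Complained of ankle pain during 16km route march. No swelling noted.",
    "Knee pain after IPPT. Advised rest.",
    "Fell from low rope while attempting obstacle course; landed on wrist.",
    "Follow-up review. Pain improving.",
]


def split_sentences(note):
    """Pre-processing: split a note into sentences on '.', ';' or newlines."""
    return [s.strip() for s in re.split(r"[.;\n]+", note) if s.strip()]


def is_event_sentence(sentence):
    """Stage 1 stand-in: keep sentences that contain an event cue word."""
    lowered = sentence.lower()
    return any(re.search(rf"\b{cue}\b", lowered) for cue in EVENT_CUES)


def extract_event_phrases(sentence):
    """Stage 2 stand-in: return known event phrases found in the sentence."""
    lowered = sentence.lower()
    return [p for p in EVENT_GROUPS if re.search(rf"\b{re.escape(p)}\b", lowered)]


def group_events(notes):
    """Stage 3 stand-in: count extracted phrases per event grouping."""
    counts = defaultdict(int)
    for note in notes:
        for sentence in split_sentences(note):
            if not is_event_sentence(sentence):
                continue
            for phrase in extract_event_phrases(sentence):
                counts[EVENT_GROUPS[phrase]] += 1
    return dict(counts)


if __name__ == "__main__":
    print(group_events(SAMPLE_NOTES))
```

The point of the sketch is the ordering: filtering sentences first keeps the phrase extractor focused on text that is likely to mention an event, and grouping afterwards turns many surface forms into a small set of categories that can be counted.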

Identify High-Risk Units for Targeted Interventions

The extracted insights were linked with other data sources such as demographics and medical visit records. This gave MINDEF/SAF a situation overview across units and allowed it to identify units with a high risk of MSK injuries for targeted interventions. By presenting the insights in an analytics dashboard, we were able to visualise and better understand the events leading to MSK injuries and channel medical resources to high-risk units.

De-identified top units with MSK-related injuries and health conditions
Keywords extracted from core areas
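
The linking step described above can be sketched as a simple join between visit records and demographic records, followed by a count per unit. This is only an illustration under assumed field names (`soldier_id`, `unit`, `is_msk`); the actual schemas and the way the dashboard ranks units are not described in this article.

```python
from collections import Counter


def top_units_by_msk_visits(visits, demographics, top_n=3):
    """Join visits to units via soldier_id and rank units by MSK visit count."""
    # Build a lookup from soldier to unit (hypothetical field names).
    unit_of = {row["soldier_id"]: row["unit"] for row in demographics}

    counts = Counter()
    for visit in visits:
        if not visit["is_msk"]:
            continue  # only MSK-related visits are counted
        unit = unit_of.get(visit["soldier_id"])
        if unit is not None:  # skip visits with no matching demographic record
            counts[unit] += 1
    return counts.most_common(top_n)


# Invented, de-identified sample records for illustration only.
DEMOGRAPHICS = [
    {"soldier_id": "S1", "unit": "Unit A"},
    {"soldier_id": "S2", "unit": "Unit A"},
    {"soldier_id": "S3", "unit": "Unit B"},
    {"soldier_id": "S4", "unit": "Unit C"},
]
VISITS = [
    {"soldier_id": "S1", "is_msk": True},
    {"soldier_id": "S2", "is_msk": True},
    {"soldier_id": "S3", "is_msk": True},
    {"soldier_id": "S3", "is_msk": False},
    {"soldier_id": "S4", "is_msk": False},
    {"soldier_id": "S9", "is_msk": True},  # no demographic record
]

if __name__ == "__main__":
    print(top_units_by_msk_visits(VISITS, DEMOGRAPHICS))
```

The sketch ranks by raw visit counts. A real analysis might instead compare visit rates per unit strength, so that larger units are not flagged simply for having more soldiers.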

CONCLUSION

Applying Natural Language Processing techniques to unstructured clinical notes has raised the analytic capabilities of the SAF by extracting insights that were tedious to obtain manually. These insights help MINDEF/SAF to better understand the leading causes of MSK injuries and empower units to create their monthly MSK injury reports independently. Previously, units depended on data sets being provided manually to generate these reports; with the visualisation dashboard, they can now access the data sets on demand. This enables early intervention and helps commanders place greater safety emphasis on specific activities, improving overall soldier health and reducing the impact of MSK conditions on readiness.
