What is Natural Language Processing (NLP) and How Can it Be Used in Healthcare?

Salina Mendoza
Machine Learning For Everyday People
4 min read · Jan 27, 2020


Natural language processing narrows the gap in capabilities between humans and computers.

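NLP is short for Natural Language Processing. Natural language refers to the language people speak and write; processing refers to how a computer reads, interprets, and produces that language. (The same abbreviation is also used for Neuro-Linguistic Programming, an unrelated approach from psychology.)

In short, NLP is how computers work with human language.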

There are two areas of NLP:

  1. Natural Language Understanding (NLU) — applies machine learning (ML) toward breaking language down into concepts & relationships.
  2. Natural Language Generation (NLG) — builds natural linguistic phrases that represent a series of starting concepts.

Both are needed for a complete NLP system, and both depend on datasets that fit the problem being solved.

In order to start training a model, we first must categorize the data and turn words into numbers (vectors). That means going from a set of categorical features in raw (unlabeled) text, such as words, letters, part-of-speech (POS) tags, and word order, to a series of vectors.

A breakdown of how text is labeled, tokenized, and embedded to train a neural network.
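The step from words to vectors can be sketched in a few lines of Python. This is only a minimal illustration, assuming a simple bag-of-words count as a stand-in for the learned embeddings a neural network would use; the function names (`tokenize`, `build_vocabulary`, `vectorize`) and the two sample notes are made up for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocabulary(documents):
    """Give every distinct token an integer index, in alphabetical order."""
    tokens = {tok for doc in documents for tok in tokenize(doc)}
    return {tok: index for index, tok in enumerate(sorted(tokens))}

def vectorize(text, vocabulary):
    """Return a count vector with one slot per vocabulary word."""
    counts = Counter(tokenize(text))
    # Dicts keep insertion order, so slots follow the vocabulary indices.
    return [counts.get(tok, 0) for tok in vocabulary]

# Hypothetical sample notes, invented for illustration.
notes = [
    "Patient reports chest pain.",
    "Patient denies chest pain and reports mild cough.",
]
vocab = build_vocabulary(notes)
vectors = [vectorize(note, vocab) for note in notes]

for note, vector in zip(notes, vectors):
    print(vector, "<-", note)
```

Each note becomes a list of numbers of the same length, which is the shape a model expects as input. Real systems usually replace the raw counts with dense embeddings, but the idea of mapping tokens to positions in a vector is the same.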

There are two major use cases for Natural Language Processing:

  1. Understanding human speech and extracting data and meaning from the conversation
  2. Abstracting relevant data values from unstructured data in documents and databases

Differences between structured and unstructured data:

Structured data is highly organized and formatted so it is easily searchable in relational databases. Unstructured data does not have a pre-defined format or organization which makes it difficult to collect, process, or analyze.
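To make the difference concrete, here is a small Python sketch. It assumes a hypothetical patient record and a hypothetical free-text note that both contain the same blood pressure reading; the field names and the `extract_blood_pressure` helper are invented for illustration and do not come from any EHR product.

```python
import re

# Structured: every value sits in a named field, so lookup is direct.
structured_record = {"patient_id": "A-001", "systolic_bp": 142, "diastolic_bp": 90}

# Unstructured: the same reading is buried inside a free-text note.
clinical_note = "Pt seen today for follow-up. BP 142/90, feels well, no new complaints."

def extract_blood_pressure(note):
    """Look for a 'BP <systolic>/<diastolic>' pattern; return None if absent."""
    match = re.search(
        r"\bBP\s*:?\s*(\d{2,3})\s*/\s*(\d{2,3})\b", note, flags=re.IGNORECASE
    )
    if match is None:
        return None
    return {"systolic_bp": int(match.group(1)), "diastolic_bp": int(match.group(2))}

# One line for structured data...
print(structured_record["systolic_bp"])

# ...versus pattern matching (and the risk of missing variants) for text.
extracted = extract_blood_pressure(clinical_note)
print(extracted)
```

A single pattern like this only catches one way of writing the reading. Notes that say "blood pressure was one forty-two over ninety" would be missed, which is the kind of gap NLP models are meant to close.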

Electronic Health Records (EHR) are digital versions of patient charts.

Source: https://www.healthcatalyst.com/

Potential ways to use NLP in Healthcare:

  • Enhance accuracy of Electronic Health Records (EHR) by transforming text into standardized data
  • Draw credible insights from large datasets, something that wasn’t possible before EHRs
  • Improve clinical documentation with speech-to-text dictation that captures notes at the point of care
  • Make computer-assisted coding (CAC) more efficient by matching procedures with captured codes to maximize claims
  • Improve customer experiences with EHRs by automating analysis and customer service related activities such as assisting in orders or acting as a medical scribe
  • Educate patients by connecting NLP with EHRs to match up clinical terms from their documents with everyday language to share with patients via chatbot — making them more aware of their illnesses
  • Boost phenotyping capabilities by equipping clinicians with tools to extract and analyze unstructured data. This allows them to group and categorize patients based on observable physical or biochemical expressions. (Currently, clinicians use structured data to create phenotypes)
  • Create a real-time system that automates the process of reporting the adenoma detection rate (ADR) by analyzing large datasets of patient charts, reading through pathology reports, and calculating ADR continuously
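As a rough sketch of the last idea, the Python below computes an adenoma detection rate from a handful of pathology report snippets. It assumes ADR means the share of procedures in which at least one adenoma was found, that each report corresponds to one procedure, and that simple keyword matching with a short list of negation phrases is good enough. All three are simplifications for illustration, and the sample reports are invented.

```python
import re

# Hypothetical negation phrases; real pathology language is far more varied.
NEGATED = re.compile(
    r"\b(no|negative for|without)\s+(evidence of\s+)?adenomas?\b", re.IGNORECASE
)
ADENOMA = re.compile(r"\badenomas?\b", re.IGNORECASE)

def mentions_adenoma(report):
    """True if the report mentions an adenoma outside a negated phrase."""
    without_negations = NEGATED.sub(" ", report)
    return ADENOMA.search(without_negations) is not None

def adenoma_detection_rate(reports):
    """Fraction of reports (one per procedure) with at least one adenoma."""
    if not reports:
        return 0.0
    detected = sum(1 for report in reports if mentions_adenoma(report))
    return detected / len(reports)

# Invented sample reports, one per procedure.
reports = [
    "Tubular adenoma, 4 mm, ascending colon.",
    "Hyperplastic polyp. No evidence of adenoma.",
    "Normal colonic mucosa.",
    "Two tubular adenomas removed from sigmoid colon.",
]
rate = adenoma_detection_rate(reports)
print(f"ADR: {rate:.0%}")
```

Rerunning the calculation whenever a new report arrives is what would make the figure continuous. A production system would need a trained model or a clinical NLP toolkit rather than keyword rules, since phrasing in pathology reports varies widely.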

Specific tasks NLP systems can perform include:

  • Summarize lengthy blocks of text, such as a clinical note or article, by identifying key concepts or phrases in the text
  • Map data elements of unstructured text into structured fields in EHRs to improve data integrity
  • Convert machine-learnable formats into natural language for reporting & education purposes
  • Use optical character recognition (a subset of computer vision) to turn images such as PDFs and document scans into text files that can be analyzed and parsed
  • Conduct speech recognition so that spoken notes can be turned into text
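The summarization task in the list above can be sketched with a very simple extractive approach: score each sentence by how often its words appear in the whole text, then keep the top-scoring sentences. This is a toy illustration under stated assumptions, not how any particular product works. The stopword list, the function names, and the sample note are all made up for the example.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was",
             "for", "with", "on", "at", "has", "no", "she", "he"}

def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def content_words(text):
    """Lowercased alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def summarize(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the full text."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))
    scored = [(sum(freq[w] for w in content_words(s)), i)
              for i, s in enumerate(sentences)]
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    keep = sorted(index for _, index in top)  # restore original order
    return " ".join(sentences[i] for i in keep)

# Invented sample note.
note = ("Patient presents with persistent cough. Cough has lasted three weeks. "
        "Chest x-ray was clear. Patient advised to return if cough worsens.")
print(summarize(note, max_sentences=1))
```

Because scores are plain sums, longer sentences have an advantage. Dividing by sentence length, or using a trained summarization model, would be natural next steps.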

Here are some current examples of using NLP in Healthcare:

IQVIA: uses unstructured and alternative data sources, such as social media and supporting medical documents, to generate analytics on regulations and compliance. This is advertised as helping companies keep track of ongoing changes in industry compliance.

Amazon: uses NLP for cohort analysis, the process of matching patients to clinical trials for a new drug. They sift through patient data to find out who would be the best participants.

Nuance: uses NLP to empower clinicians to securely document a patient’s story naturally, on the spot, in an EHR (electronic health record).

Takeaways

  1. NLP is short for Natural Language Processing.
  2. NLP narrows the gap in capabilities between humans and computers by letting computers work with human language.
  3. There are two major use cases for NLP: understanding human speech and extracting meaning from it, and abstracting relevant values from unstructured documents and databases.
  4. NLP deals with two kinds of data: structured and unstructured.
  5. Unstructured data can hold the most insights, but it is harder to analyze because it lacks a pre-defined format.
  6. Large datasets are needed to provide greater accuracy.
  7. In order to start training a model, we first must categorize the data and turn words into numbers (vectors).
  8. NLP can improve clinical documentation by letting clinicians capture a patient’s story naturally.
  9. NLP can summarize lengthy blocks of text by identifying key concepts and phrases.
  10. NLP can improve the patient experience by automating analysis and customer service activities such as assisting with orders or acting as a medical scribe.

Enjoy this? Let’s connect on Twitter or DM me on Instagram.

twitter.com/inababi

instagram.com/common2themind
