Computer Vision Models and Natural Language Processing in Early Diagnostic Imaging

Aneesh Bhardwaj
Published in Neurotech@Davis
7 min read · Mar 12, 2024

Emerging automation technologies are reshaping daily life, improving many people's quality of life in the most efficient way possible. This article dives into how techniques such as Natural Language Processing (NLP) and Computer Vision (CV) models can be used to accurately diagnose patients with neurological disorders, as well as other health issues, for long-term benefit.


Source: Eastgate Software, 2024:online

An Introduction to Computer Vision And Natural Language Processing

When you think of computer vision, what comes to mind? Maybe you pictured a computer parsing through pieces of data and providing feedback to the user. Or maybe you assumed the literal meaning: a computer with eyes that can see clearly. Which one is it? Well, if you said the first one, you're correct!

A Computer Vision (CV) model takes in visual data, such as images or camera feeds, analyzes it, and outputs information about what it recognizes. In the medical field, the data fed to these models can be brain images, X-rays, or any photo of a physical condition. Models built by machine learning engineers and Artificial Intelligence (AI) experts can locate objects within an image and accurately tell the user what type of data they are analyzing. In the long run, computer vision models can potentially improve medical treatment and patients' quality of life by making diagnoses more efficient.

Now, what is Natural Language Processing? In simple terms, Natural Language Processing (NLP) is a branch of AI that enables computers to interpret and generate human language, in both text and speech form, and it is often used in conjunction with data mining. In this context, data mining lets an NLP system extract specific groups of text from a user's speech or documents, which can then be used for classification and categorization in databases. A common example of natural language processing is in Gmail, where it acts as a virtual assistant by categorizing incoming emails into three groups: primary, social, and promotional.
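To make the Gmail example concrete, here is a toy sketch of email categorization. This is not Gmail's actual algorithm (which uses learned models); the keyword lists and messages below are made up purely for illustration.

```python
import re

# Toy illustration of email categorization (NOT Gmail's real system):
# score each message against hand-picked keyword sets and fall back to
# "primary" when no social/promotional cues are found.
CATEGORY_KEYWORDS = {
    "social": {"friend", "follow", "liked", "tagged", "invitation"},
    "promotional": {"sale", "discount", "offer", "unsubscribe", "deal"},
}

def categorize(email_text: str) -> str:
    # Tokenize to lowercase words, then count keyword overlaps per category.
    words = set(re.findall(r"[a-z']+", email_text.lower()))
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "primary"

print(categorize("Huge sale: 50% discount on all items, unsubscribe any time"))
print(categorize("Your friend tagged you in a photo"))
print(categorize("Meeting notes from today's project sync"))
```

A production system would replace the keyword sets with a trained text classifier, but the input/output shape (raw text in, category label out) is the same.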

Furthermore, one of NLP's most practical uses is in the healthcare system. NLP can be used for speech recognition, for drafting medical reports, and for building chatbots that answer people's questions about medical concerns, among other things. Because NLP models interpret human language, they can also be applied in therapy.

NLP techniques can also identify cognitive impairments by analyzing text or speech data. By detecting linguistic indicators of disorders like Alzheimer's in patients' speech and writing, NLP algorithms can enable early intervention and treatment. As of right now, applications of CV models and NLP within the healthcare system strive to make a difference in our society as efficiently as possible: assisting with surgical operations, analyzing patterns in genetic information to help doctors diagnose patients, and even maintaining patient records.
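As a rough sketch of how speech analysis like this can start, the snippet below computes a few simple linguistic features (vocabulary richness, sentence length, hesitation markers) that research on cognitive decline commonly examines. The feature set and the sample transcript are illustrative only, not clinically validated.

```python
import re

# Illustrative linguistic features from a speech transcript. Real systems
# would feed features like these (and many more) into a trained classifier.
FILLERS = {"um", "uh", "er", "hmm"}  # hesitation markers (example set)

def linguistic_features(transcript: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    words = re.findall(r"[a-zA-Z']+", transcript.lower())
    return {
        # Type-token ratio: unique words / total words (vocabulary richness).
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
        "avg_sentence_len": len(words) / len(sentences) if sentences else 0.0,
        # Fraction of words that are hesitation markers.
        "filler_rate": sum(w in FILLERS for w in words) / len(words) if words else 0.0,
    }

feats = linguistic_features("Um, I went to the, um, the place. The place with food.")
print(feats)
```

Repetition, frequent fillers, and a shrinking vocabulary are the kinds of signals such features try to surface for clinicians.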

Source: Mihajlovic, 2019:Online

Deep Dive into CV and NLPs

Computer Vision and Natural Language Processing tasks fall into three categories, called the 3 R's:

1) The first R is recognition.

The first step for AI, or any piece of technology, is understanding the data it is working with, then recognizing and translating that data; only after that can it move on to real-world application. In the context of AI in healthcare, radiologists use CT scans or X-rays to see what goes on inside a person's body: an image is uploaded into the system, and digital labels are assigned to the individual objects inside the image (Malik). One prime example is robotic manipulation of objects. Both NLP and CV models are trained on the data they work with and store what they learn in memory.

2) The second category is reconstruction.

Once the image is recognized and the system knows what to do with it, the system turns it into something that can be viewed from multiple perspectives. The digitized model can later be refined through further processing, and accumulated depth and sensory information can be used to manipulate the data (Malik). This way, the original image is reconstructed to be clearer, more dynamic, and better suited to the purpose at hand. Once an image has been recognized and reconstructed, NLP can make it accessible by converting the analysis into a text-to-speech format, so the user can ask questions and get all the necessary information.

3) The last category is reorganization.

Finally, reorganization collects all of the data and groups it into meaningful segments, which together form the overall structure of the scene. By combining the translation of human speech with the manipulation of image data, NLP and CV models can perform tasks that alter the shape and color of images, as well as the way the user sees them (Malik).
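The 3 R's can be sketched, very loosely, on a tiny binary image without any CV library: recognition marks which pixels are foreground, reorganization groups those pixels into connected regions, and reconstruction summarizes each region as a structured description (here, a bounding box). The image and the specific algorithm (flood-fill connected components) are illustrative stand-ins for what real models do at far larger scale.

```python
# Loose "3 R's" analogy on a 0/1 image: recognize foreground pixels,
# reorganize them into 4-connected regions, reconstruct each region as
# a bounding box (min_row, min_col, max_row, max_col).
def label_regions(image):
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    regions = {}
    next_label = 1
    for y in range(h):
        for x in range(w):
            if image[y][x] == 1 and labels[y][x] == 0:
                # Flood fill: gather every foreground pixel connected to (y, x).
                stack, pixels = [(y, x)], []
                labels[y][x] = next_label
                while stack:
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and image[ny][nx] == 1 and labels[ny][nx] == 0:
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                regions[next_label] = (min(ys), min(xs), max(ys), max(xs))
                next_label += 1
    return regions

img = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
regions = label_regions(img)
print(regions)  # two separate regions, each with its bounding box
```

Real medical-imaging pipelines do the same conceptual steps with learned segmentation models rather than pixel thresholds.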

Benefits of CV Models and NLPs in Early Diagnostic Imaging

CV, NLP, and other forms of AI have stacked up numerous advantages for healthcare clinics worldwide. Analyzing medical scans with CV models benefits places with limited infrastructure and training resources, reinforcing the effectiveness of treatment for their patients. Right now, CV models enable remote diagnostics: they need only medical images and cloud data to automate a diagnosis, without any additional advanced equipment on site.

One prime example in use is a case study from Penn State University called Project Echo, in which a deep-learning CV model developed in Uganda analyzed the echocardiogram of a patient with rheumatic heart disease (Intelliverse.ai). At the end of this study, the model achieved about 90% accuracy against a reference heart-image database, correctly diagnosing the disease.

Another vital example is research by Padmini Pillai, a professor at MIT, who used a CV model on MRI scans in an Indian hospital to detect brain tumors with about 95% accuracy (Intelliverse.ai). These examples demonstrate a wide variety of AI implementations that improve health outcomes through medical imaging, leading to more efficient diagnoses. This technology can assist with diagnosis at an early age, and its accuracy is particularly valuable for detecting neurological disorders in infants.

Source: Sage Journals, Tanguay and others, 2022:Online

In terms of NLP in the context of image processing, there are an immense number of applications. One primary application is image search and retrieval. By understanding the textual descriptions of images, NLP techniques can improve the accuracy of locating images and matching them to a query. Search and retrieval first goes through an indexing process, which converts text into a format that is efficient to search (Yusuf). Next comes query processing: a user submits a message, and the NLP system interprets its meaning and extracts the relevant information. Last is feedback and tuning: the NLP system lets the user give feedback on the results, and over time the system improves from this feedback.

Not only do NLP systems have a large role in imaging itself, they are also proficient as rule-based systems, specifically for identifying Silent Brain Infarctions (SBIs) in children and elderly patients. A rule-based system is an AI system that runs only if a given set of rules is satisfied: the model acts on the direct commands the user inputs and provides an output only when those conditions are met. Now back to SBIs. SBIs are much more common than overt strokes and can be detected through an MRI in almost 20% of elderly patients (Fu). Despite how common SBIs are, a lack of knowledge about the condition means treatment and therapy often go untouched.
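A minimal sketch of a rule-based system looks like this: each rule pairs a set of conditions with a conclusion, and the system produces an output only when some rule's conditions are fully satisfied. The rules and fact names below are invented for illustration and are not real clinical criteria from the SBI study.

```python
# Minimal rule-based system sketch. Each rule is (conditions, conclusion);
# a conclusion fires ONLY if every condition matches the input facts.
# These example rules are illustrative, not actual diagnostic criteria.
RULES = [
    ({"mri_lesion": True, "stroke_symptoms": False},
     "possible silent brain infarction"),
    ({"mri_lesion": True, "stroke_symptoms": True},
     "possible overt stroke"),
]

def apply_rules(facts: dict):
    for conditions, conclusion in RULES:
        if all(facts.get(key) == value for key, value in conditions.items()):
            return conclusion
    return None  # no rule satisfied -> no output, exactly as the text describes

print(apply_rules({"mri_lesion": True, "stroke_symptoms": False}))
print(apply_rules({"mri_lesion": False, "stroke_symptoms": False}))
```

In practice, an NLP pipeline would first extract facts like these from free-text radiology reports before the rules are applied.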

A study on the "Effectiveness of Stroke Prevention" in the Silent Stroke Project finds that NLP algorithms can extract information about different brain regions from neuroimaging reports, contextualize it to identify syndromes, and provide more accurate information on SBIs for these patients (Fu). NLP doesn't directly analyze neural data the way neuroscientists analyze brain activity with fMRI and EEG. Instead, NLP focuses on modeling patterns in human language, behavior, and thought processes, helping systems understand how individuals perceive the world and supporting personal development and behavioral change.

Conclusion

The Future of Artificial Intelligence has taken the medical industry by storm. The expansion of Computer Vision and Natural Language Processing techniques has created a world of possibilities for diagnosing patients. The technology’s efficiency and accuracy during medical procedures benefit people with all types of healthcare issues. From robotics integration and augmented reality to accumulating knowledge and providing treatments for patients, this technology drastically improves the patient experience. With more automated diagnostics saving lives at an earlier stage and with more research on CV/NLP systems, we can construct a society where AI can do what humans can’t do today.

Works Cited

Chaudhry, Farhan, et al. "Machine Learning Applications in the Neuro ICU: A Solution to Big Data Mayhem?" Frontiers in Neurology, 9 Sept. 2020, www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2020.554633/full.

Fu, Sunyang, et al. "Natural Language Processing for the Identification of Silent Brain Infarcts from Neuroimaging Reports." JMIR Medical Informatics, vol. 7, no. 2, 3 May 2019, p. e12109, https://doi.org/10.2196/12109. Accessed 5 Sept. 2023.

Intelliverse.ai. "How Computer Vision Is Bringing Expert-Level Diagnostics to Underserved Communities in Africa." LinkedIn, 6 Nov. 2023, www.linkedin.com/pulse/how-computer-vision-bringing-expert-level-diagnostics-leikf/.

Magubane, Nathi, et al. "Challenges and Advances in Brain-Computer Interfaces." Penn Today, penntoday.upenn.edu/news/challenges-and-advances-brain-computer-interfaces.

Malik, Jitendra, et al. "The Three R's of Computer Vision: Recognition, Reconstruction and Reorganization." Pattern Recognition Letters, North-Holland, 8 Feb. 2016, www.sciencedirect.com/science/article/pii/S0167865516000313.

"Promoting the Responsible Advancement of Neurotechnology." IBM Policy, 2 May 2022, www.ibm.com/policy/promoting-the-responsible-advancement-of-neurotechnology/.

Yusuf, Jose. "Image Captioning: Bridging Computer Vision and Natural Language Processing." Comet, 22 Sept. 2023, www.comet.com/site/blog/image-captioning-bridging-computer-vision-and-natural-language-processing/#:~:text=Natural%20Language%20Processing%20for%20Text,comprehend%20and%20generate%20coherent%20sentences.
