Can Artificial Intelligence Detect Pancreatic Cancer?

Arjun Virk
14 min read · Oct 25, 2023

--

An overview on how Artificial Intelligence is being used to accurately detect pancreatic cancer, improving survival rates and optimizing treatments for oncologists around the world

PET/MRI, CT Scan of Pancreatic tumour

It was back in early 2019 when Alan, a man in the UK, began to feel an infrequent stabbing pain under his ribs. He was living a perfectly normal life with his loving family in England as a healthy, recently retired man. But after a while, the pain became more uncomfortable and frequent. His family knew something was up and brought him to the doctor. Alan was quite old-fashioned and stoic; he didn't see the doctor unless it was necessary, so his willingness to go meant the pain was becoming serious.

The doctors ordered an ultrasound scan and gave him some antacids for his indigestion, and that was it. The ultrasound revealed a few gallstones, which the doctors thought were the root cause of his discomfort. He was given some medicine and sent on his way.

It turned out the medicine wasn't the key to easing his pain, and he continued to lose weight. The doctor ordered more scans, and after a while his family knew his condition was only going to get worse. The scans soon revealed kidney cancer and a shadow on his pancreas. His family was told the kidney tumour was very stable and that the pancreatic mass was a candidate for the Whipple operation.

Alan went through the complex operation and began his rounds of heavy-duty chemotherapy. After his initial treatment, he seemed to be doing well for a while, and his energy returned. However, one of his routine scans revealed that the cancer in his pancreas had returned and spread to the stomach lining. He was treated with more chemotherapy. As time passed, he grew weaker, and further tests showed the spread had continued, now to his leg muscle. His health slowly deteriorated, and on October 8th, 2022, Alan passed away just after he turned 66 years old.

This melancholy story is not unique to Alan; it's the story of the hundreds of thousands who have fought against pancreatic cancer. The disease is common and very deadly. The combined five-year survival rate for pancreatic cancer is very low at just 5 to 10 percent, making it the third leading cause of cancer-related deaths in the US alone 😬. In fact, it is expected to become the second leading cause of cancer-related deaths in the US by 2030. But why is that 🤔?

To be fair, it’s been 124 years since the first use of radiation therapy to cure cancer, 67 years since the first successful chemotherapy treatment of a metastatic tumour, and 52 years since the National Cancer Act was established. Despite these remarkable advances in cancer treatment, pancreatic cancer is still very deadly ☠️.

The Problem: Detecting Pancreatic Cancer is “Almost Impossible”

Detecting pancreatic cancer in its early stages is extremely challenging; UCLA Health deems it "almost impossible." That's because the pancreas is a tricky organ. For some context, the pancreas produces enzymes that aid in digestion, as well as the hormones insulin and glucagon (which help regulate blood sugar). Positioned deep within the abdomen, it sits just below the liver and behind the stomach. The rounded head of the pancreas nestles into the upper curve of the small intestine, and its tail narrows as it extends toward the spleen. Because the pancreas is surrounded by all of these internal organs, pancreatic tumours are extremely hard to see or feel.

Image of pancreas in the human body

What Makes It Hard to Detect?

  • Silent in early stages 🤫: Pancreatic cancer causes NO symptoms in its early stages. It's not until the cancer has reached a more advanced stage and metastasized that the patient begins to feel symptoms. Treatment options at advanced stages are limited; often, surgery is not an option.
  • Similar symptoms to other diseases 🪞: The symptoms that pancreatic cancer induces are very similar to those of other diseases. This can make it difficult to diagnose.
  • Most cancers are already metastatic 😈: The majority of pancreatic cancers are aggressive and found to already be metastatic when diagnosed. At this stage, treatment becomes very difficult for the patient and usually, there isn’t too much that can be done.
  • Limited information from imaging tests 🩺: Some imaging techniques, like an MRI, may indicate a mass in the pancreas, but without AI tools, a biopsy is the most effective way to diagnose this type of cancer. However, due to the location of the pancreas, oncologists have a very difficult time getting a sample of the tumour. In addition, radiologists may miss the cancer because of the way the images are presented or because the tumour is too small to see.

The Solution: Using AI Detection Tools 🤖

Many tools have come out to improve the diagnosis techniques for pancreatic cancer, but the most effective so far have been AI medical imaging models. At times, these models have been more effective than actual doctors. Here are some people and groups building AI models that are used for pancreatic cancer diagnosis:

  1. MD Anderson Cancer Center: They are working on early-detection AI applications to help diagnose patients with pancreatic cancer. These AI models work by analyzing the pancreas when imaged by a CT scan or MRI. They are also working on a Convolutional Neural Network (CNN) that identifies which cases are most likely to develop into malignant cancer.
  2. Taiwan AI Labs: Researchers in Taiwan have been developing a computer-aided detection (CAD) tool that uses deep learning to detect pancreatic cancer in its early stages. The tool was able to identify the pancreas automatically, achieving 90% sensitivity and 96% accuracy.
  3. Mayo Clinic: Researchers at the Mayo Clinic Comprehensive Cancer Center used the world's most extensive imaging dataset to create an AI model that can autonomously detect pancreatic cancer on standard CT scans while surgical treatment can still promise a cure.
  4. High School Students: At Regeneron ISEF 2023, two high school students from Louisville, Kentucky presented their custom, automated system, which could detect gastrointestinal (including pancreatic) cancers before any serious symptoms arise. It combined robotics and machine learning to analyze blood samples, distinguishing healthy patients from those with pancreatic, colorectal, or hepatic cancers, in about 3 hours and for roughly $300.
  5. BiliScreen by UW: A research team at the University of Washington created a simple, non-invasive tool to detect pancreatic cancer in adults. Their product is an app that uses a smartphone selfie of a person's eyes, taken through paper glasses or a 3D-printed box. Powered by computer vision, the app extracts colour information from the sclera of the person's eyes, and machine learning algorithms correlate it with bilirubin levels. High bilirubin levels indicate the patient may be at risk for pancreatic cancer.

Who’s Tool is the Most Effective? 🦾

I believe that Mayo Clinic's detection model does the best job in the early detection of pancreatic cancer. They've trained their model on the most extensive dataset of pancreatic cancer images, it has achieved high accuracy, and it's very practical since it reads standard CT scans. In my opinion, this AI model has the most potential to make an impact in the field of early pancreatic cancer detection.

Diagram of how the Mayo Clinic Model works

Now, although this model is very effective, there's still room for improvement. If it could be trained on other imaging modalities, like MRIs and ultrasounds, it could be used in many more medical settings.

Breakdown of How AI Detection Models Work 👨🏽‍🏫

Since it would be difficult to break down each model listed above, I’ll be dissecting a general model that would be used. Here is the general process of creating these models.

  1. Data Collection: Putting Together Your Dataset ⛏

Models used for early detection of cancer need to first be trained on an extensive dataset of medical images with annotations from clinicians. These images can include CT scans, MRIs, ultrasound scans, or any other screening methods used to diagnose pancreatic cancer.

Large Image Dataset

These images should cover different stages of pancreatic cancer and include both cancerous and non-cancerous cases, all annotated by radiologists and oncologists. Collecting the data this way ensures the dataset is balanced, which matters because a model trained mostly on one class will learn to favour that class.

2. Data Preprocessing: Enhancing Our Dataset 📈

The medical images we are using in our dataset need to be preprocessed. The purpose of doing this is to raise the image’s quality so we can better analyze the results. Here is a list of the preprocessing steps we will have to take:

  • Image normalization: Adjusting pixel values to a standardized scale (important since our dataset is made up of different methods of medical screening)
  • Image resizing: Adjusting the size of the images to have the same dimensions
  • Anatomical Alignment: Adjusting the orientation and position of the images.
Preprocessed Image Dataset of a CT Scan (Pancreas)
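
As a rough illustration of the first two preprocessing steps, here is a minimal sketch in plain NumPy. The intensity range, image sizes, and nearest-neighbour resizing are simplifying assumptions on my part; real pipelines typically use dedicated medical-imaging libraries.

```python
import numpy as np

def normalize(img):
    """Scale pixel intensities to the standardized [0, 1] range."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

def resize_nearest(img, size):
    """Resize a 2-D image to (size, size) with nearest-neighbour sampling."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

# A fake 512x512 "scan" with an arbitrary intensity range, standing in for real data.
scan = np.random.default_rng(0).integers(-1000, 3000, size=(512, 512))
out = resize_nearest(normalize(scan), 224)
print(out.shape)  # (224, 224), all values now between 0 and 1
```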

3. Feature Extraction:

Various Computer Vision techniques are used to extract and pinpoint relevant features from the preprocessed image dataset. These features are usually distinctive patterns or structures that are common to pancreatic cancer. Here are the patterns used in this model:

  • Edge Detection: Identifies boundaries and edges of structures in an image. This will come in handy for highlighting irregularities in the organ or the presence of a tumour(s).
Edge Detection with a Set of Coins
  • Texture Analysis: This involves examining patterns and textures within an image. Certain textures might reveal irregularities, such as a tumour or changes in tissue density. Now you are probably wondering: “How can an algorithm determine texture based on an image?” Well, texture analysis can be based on the statistical measurement of pixel intensities and relationships, such as co-occurrence matrices and gray-level run-length matrices.
Overview of how Texture Analysis works for Images
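
To make edge detection concrete, here is a minimal NumPy sketch using the classic 3x3 Sobel kernels on a synthetic image. The bright square standing in for a "mass" is, of course, a toy assumption.

```python
import numpy as np

def sobel_edges(img):
    """Approximate gradient magnitude with 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            # Combine horizontal and vertical gradient responses.
            out[i, j] = np.hypot(np.sum(patch * kx), np.sum(patch * ky))
    return out

# Synthetic image: dark background with a bright square "mass" in the middle.
img = np.zeros((32, 32))
img[10:22, 10:22] = 1.0
edges = sobel_edges(img)
# The strongest responses line up with the square's boundary, not its interior.
print(edges.max() > 0, edges[15, 15] == 0)  # True True
```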

4. Using Convolutional Neural Networks (CNNs) 👁:

What is a CNN?

Our brains continuously process vast amounts of information as we perceive our surroundings through our eyes. They are composed of neurons, and each neuron specializes in a particular domain, forming connections with other neurons to detect and interpret the visual field. Similar to the neurons in the brain, each neuron within a Convolutional Neural Network (CNN) exclusively processes data within its receptive field. The arrangement of layers in a CNN is structured in a way that they initially detect simpler patterns such as edges and curves and progressively move on to recognizing more intricate patterns like faces and objects. CNNs are what allow computers to SEE and process visual information.

Anatomy of a CNN

There are several layers, each with a specific purpose, that work together to process and extract certain features from input images. These layers, when properly configured, allow the network to recognize patterns and make predictions.

Architecture of a CNN (Source)

Input Layer

The first layer of a CNN is the input layer. This layer receives the medical images as input. Images are represented as a grid of pixels, with each pixel containing information about the image’s intensity or colour. These pixel values are passed to the subsequent layers for further processing.

Convolution Layer

The convolutional layer is the backbone of CNNs. It’s responsible for carrying out most of the computational heavy lifting. This will help spot subtle and critical features in medical images.

This layer performs a dot product operation (a way of multiplying two equal-length vectors together) between two matrices: one is a set of learnable parameters, often referred to as a kernel, and the other is a restricted region of the input, referred to as the receptive field. A kernel is a specialized filter that slides over the input image, which in our case is a medical image. It acts like a magnifying glass, scrutinizing the image for specific patterns that might indicate the presence of cancer.

Convolution Operation

Each kernel is tailored to search for distinct features, such as irregular shapes, variations in tissue density, or other abnormalities associated with pancreatic cancer. By sliding the kernel across the entire image, the layer constructs a two-dimensional representation known as an activation map. This map highlights the areas where the kernel's features of interest are detected. The step size by which the kernel slides is called the stride.

Activation Map for Animals — similar to what our map for our dataset will look like
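
The sliding-kernel idea above can be sketched in a few lines of NumPy; the image and kernel here are toy stand-ins, not real scan data.

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image`, recording a dot product at each step.

    Produces an activation map of size (W - F) // stride + 1 per side.
    """
    W, F = image.shape[0], kernel.shape[0]
    out = (W - F) // stride + 1
    amap = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            # The receptive field: the patch of the image under the kernel.
            field = image[i * stride:i * stride + F, j * stride:j * stride + F]
            amap[i, j] = np.sum(field * kernel)
    return amap

image = np.random.default_rng(1).random((8, 8))
kernel = np.ones((3, 3)) / 9.0   # a simple averaging filter as a stand-in
amap = convolve2d(image, kernel, stride=1)
print(amap.shape)  # (6, 6): (8 - 3) // 1 + 1 = 6
```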

The size of the activation map is determined by parameters like the input image dimensions, the number of kernels, the size of the kernels, and the stride, which represents how much the kernel shifts during each step. Ultimately, the activation map reveals potential cancerous regions within the image. Here’s a mathematical approach to determining the size of the activation map:

We will need to consider these things:

  • How big the original picture is (let's say it's W x W).
  • How many different kernels we're using (that's the number of kernels).
  • The size of these kernels (F).
  • How big a step each kernel takes (the stride, S).

We can use this formula to find out how big the map will be (assuming no padding):

Size of the activation map = (W − F) / S + 1

Formula for Convolution
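
The standard output-size formula, (W − F) / S + 1 with no padding, translates directly into a small helper:

```python
def activation_map_size(W, F, stride):
    """Spatial size of the activation map for a W x W input and an F x F kernel."""
    assert (W - F) % stride == 0, "kernel must tile the image evenly"
    return (W - F) // stride + 1

# A 224 x 224 scan convolved with a 5 x 5 kernel at stride 1:
print(activation_map_size(224, 5, 1))  # 220
```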

This layer essentially acts as a virtual pathology expert, scanning medical images pixel by pixel, revealing patterns that might otherwise go unnoticed by the human eye. It plays a vital role in identifying the subtle and intricate structures and irregularities that are indicative of pancreatic cancer, providing an essential early diagnostic tool.

Pooling Layer

Its primary purpose is to reduce the spatial dimensions of the feature maps generated by the convolutional layers. This spatial reduction holds several significant advantages for pancreatic cancer detection models, helping us find signs of cancer more effectively.

What Does the Pooling Layer Do?

  1. Summarizes Information: The pooling layer collects key details from nearby parts of an image, like zooming in on a part of a picture.
  2. Saves Time and Resources: It also helps us work faster and save computer power. This is crucial when looking at lots of medical pictures of the pancreas.
  3. Picks the Most Important Stuff: There are different ways to do this, like finding the biggest or strongest signal in a neighborhood of data.

When we use the pooling layer, it makes the image a bit smaller. This keeps our computer calculations simpler. It’s like zooming out of a picture of a city, but still seeing the important parts.

Pooling layers help the computer to recognize problems in different places of the image, even if the issue isn’t in the exact same spot every time. This is super useful for finding signs of pancreatic cancer accurately.
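
A minimal max-pooling sketch in NumPy shows how a 2x2 window keeps only the strongest signal in each neighbourhood:

```python
import numpy as np

def max_pool(fmap, size=2):
    """Downsample a feature map by taking the max in each size x size window."""
    h, w = fmap.shape
    h2, w2 = h // size, w // size
    # Reshape so each pooling window becomes its own pair of axes, then reduce.
    return fmap[:h2 * size, :w2 * size].reshape(h2, size, w2, size).max(axis=(1, 3))

fmap = np.array([[1, 2, 5, 6],
                 [3, 4, 7, 8],
                 [9, 2, 1, 0],
                 [5, 6, 3, 2]], dtype=float)
print(max_pool(fmap))
# [[4. 8.]
#  [9. 3.]]
```

The 4x4 map shrinks to 2x2, yet the strongest response in each region survives, which is exactly why pooling tolerates small shifts in where a feature appears.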

Fully Connected Layer

Neurons in this layer are fully connected to all neurons in the preceding and succeeding layers, as in a regular fully connected neural network (FCNN). Their output can therefore be computed as usual: a matrix multiplication followed by a bias offset.

The FC layer helps to map the representation between the input and the output.
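
That computation really is just a matrix multiply plus a bias. A tiny NumPy sketch, with made-up sizes for the flattened feature vector and the two output classes:

```python
import numpy as np

rng = np.random.default_rng(42)
features = rng.random(16)     # flattened features from the previous layer
W = rng.random((2, 16))       # one row of learnable weights per output class
b = rng.random(2)             # learnable bias term

logits = W @ features + b     # matrix multiplication followed by a bias offset
print(logits.shape)           # (2,): one score per class (cancerous / non-cancerous)
```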

Non-Linearity Layer

Imagine trying to fit a straight line to a curvy rollercoaster graph. It just won’t work because real-life things are often not simple and straight. Convolutional layers in neural networks are like those straight lines. They work for some parts of an image but not for the whole picture.

To deal with this, we have “non-linearity layers.” These layers come right after the convolutional layers and make the neural network more flexible. They add curves and twists to the data, which helps the network understand the complexities of real-world images.

There are different types of curves or functions that these layers use:

  • Sigmoid: The sigmoid non-linearity has the mathematical form σ(κ) = 1/(1 + e^(−κ)). It's like taking a number and squeezing it between 0 and 1. But there's a problem: when the output is very close to 0 or 1, the gradient becomes tiny, so the network barely learns in those regions.
  • Tanh: This one squeezes a number between -1 and 1 ([-1, 1]). It's like a speedometer that can go forward and backward. Its output is zero-centred.
  • ReLU: Now, this one is like a light switch. If the number is positive, it's "on," and if it's negative, it's "off." It's simple, but it works really well. It computes the function ƒ(κ) = max(0, κ); in other words, the activation is simply thresholded at zero.
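
All three of these functions are one-liners in NumPy:

```python
import numpy as np

def sigmoid(k):
    return 1.0 / (1.0 + np.exp(-k))   # squashes any number into (0, 1)

def tanh(k):
    return np.tanh(k)                 # squashes into (-1, 1), zero-centred

def relu(k):
    return np.maximum(0.0, k)         # thresholds at zero

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # approximately [0.119, 0.5, 0.881]
print(tanh(x))     # approximately [-0.964, 0.0, 0.964]
print(relu(x))     # [0.0, 0.0, 2.0]
```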

5. Training the Model 🏋🏽:

The CNN is trained on the labeled dataset from Step 1, where images are annotated as either having pancreatic cancer or not. Our model learns to differentiate between cancerous and non-cancerous images by adjusting its internal parameters during training. Loss functions, such as cross-entropy, measure the difference between the predicted and actual labels. Training runs for a number of epochs, where an epoch is one complete pass through the entire dataset; how quickly accuracy climbs depends on the model, the data, and the training setup.
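
To make the loss concrete, here is a minimal binary cross-entropy sketch in NumPy; the labels and predicted probabilities are made up for illustration.

```python
import numpy as np

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between true labels and predicted probabilities."""
    p = np.clip(p_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# 1 = cancerous scan, 0 = non-cancerous (toy labels and predictions)
y = np.array([1, 0, 1, 0])
confident = np.array([0.9, 0.1, 0.8, 0.2])   # mostly-right, confident predictions
unsure    = np.array([0.6, 0.4, 0.5, 0.5])   # hedging near 50/50
print(cross_entropy(y, confident) < cross_entropy(y, unsure))  # True
```

Better predictions yield a lower loss, and during training the model's parameters are nudged in whatever direction reduces that number.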

Detection of a Tumour Attached to the Pancreas by an AI Model

There Are Limitations to These Tools However 🚧

With any introduction of a novel solution, especially in the medical field, there will be many limitations. Here are some challenges that companies and organizations are facing when trying to mainstream their models:

Data Quality and Quantity: A big hurdle faced is the availability of high-quality, labeled medical data for training these AI models. Gathering and curating large, diverse, and representative datasets can be very difficult. In addition, it can be challenging to obtain longitudinal data that covers a patient’s medical history, which is crucial for understanding disease progression. AI models are only as good as the data they are trained on.

Interoperability: Healthcare data is often stored in various formats and systems that may not be compatible. Integrating data from different sources and healthcare institutions can be complex and may require significant efforts to ensure data consistency and accuracy.

Implementation challenges: Incorporating AI into clinical practice may be challenging due to the lack of standardization in cancer-related health data. Companies must undertake robust development and testing before its adoption into healthcare systems. In addition, some companies that want to expand to countries that are less economically developed are facing many struggles. There is often a lack of imaging technology and computers to implement their models. Their data is often based on patients from places in Europe and North America too. The biology of humans varies between continents and their dataset wouldn’t be diverse enough to make accurate predictions.

Interested in learning more 🤔? You can learn more about this by watching this video: https://www.cbsnews.com/video/ai-could-help-predict-pancreatic-cancer-study-finds/.

Concluding Thoughts 💭:

In conclusion, the integration of AI tools for the early detection of pancreatic cancer has been a giant leap for the industry and for patient care as a whole. A combination of AI and medical professionals will only improve the medical industry and give doctors more time to do what they love: interacting with patients. But a detection tool like this is only the beginning of what's to come. I think the future holds promise, with ongoing research and development.

To end this article, I would like to leave you with a question 🧐: now that you know how a pancreatic cancer detection model works, would you trust it to make your diagnosis? How much trust do you have in AI? There is still room for error, despite how advanced the technology has become. I would love to hear what you think in the comments.


Arjun Virk

A high schooler on a mission to change the world with AI