Intel Edge AI Technology in the Realm of Biopharma and Drug Development

Published in OpenVINO™ toolkit · 11 min read · Mar 8, 2024

Author: Nooshin Nabizadeh, December 2023

Creating AI Pipeline for Biopharma: Insights and Challenges

In the ever-evolving landscape of biopharmaceutical technology and drug development, a recent effort in the field of Cell Analytics for Monoclonal Antibody Production has shed light on the crucial role of Edge AI Technology in navigating the complex challenges of building, scaling, and productizing such solutions.

Intel has been involved in this process with a variety of partners. One of Intel’s contributions to the cell image project centers around processing brightfield images using an AI pipeline containing multiple deep learning models. The purpose of the pipeline is to identify cells and other biological components and to provide feedback on dynamic biological characteristics such as cell morphology, viability, and phenotypic changes among others.

Working on cell-AI projects usually brings a unique set of challenges to the forefront.

First, it is an interdisciplinary field, and the knowledge gap between data scientists and biopharma experts calls for frequent, clear back-and-forth communication for planning and validity checks. When attempting to implement AI solutions in the laboratory, data scientists and bench scientists often struggle to fully grasp the nature and needs of each other’s roles. This lack of mutual understanding can also hinder the usability and scalability of an AI solution that needs to be integrated into diverse lab environments.

The second challenge is instrument variability. Different plate reader¹ microscopes have different hardware, optics, and apertures, so the images they produce are not consistent. This adds an extra layer of work to assess and address these inconsistencies along the way (such as regular, tracked calibration and adjustment). Additionally, vendor-to-vendor equipment differences, culture temperature, medium conditions, and genetic modifications can all affect the variability of the data and the inherent transferability of the deep learning pipeline. This drives the need to monitor the performance of the DL models through the edge and cloud MLOps components.

The third challenge is obtaining peer-reviewed labels: the process is based on supervised machine learning, and obtaining clean, accurate labels is very costly and time-consuming.

The last challenge is model deployment. In most cases, cloud deployment is not an option due to data size and data privacy. Images produced by plate reader microscopes are huge, and transferring the data to the cloud and sending the results back would create high latency because a large amount of data must be streamed (about 30 Gb per hour). More importantly, laboratories are usually not willing to share their data. Due to these two constraints, cloud deployment is usually not an option, and the pipeline must be deployed at the edge.

Considering the above-mentioned challenges, let’s talk about a common use case in cell therapy and the early-stage drug development field and a proposed AI pipeline.

CHO Cell Segmentation Use Case

CHO stands for Chinese Hamster Ovary cells. These are the preferred cells for protein synthesis due to their ability to make very large, complex protein molecules, including monoclonal antibodies (mAbs), fusion proteins, hormones, coagulation factors, etc. In CHO cells, the protein they produce is the “product,” in contrast to stem cells or CAR-T cells, where the cells themselves are the “product.”

As part of commercial protein production, the health, viability, and production capability of the cells need to be measured. In labs, this is done indirectly using proxies such as glucose concentration, temperature, dissolved O2, and pH. To measure the cells directly, a culture and staining workflow is needed.

Here is the workflow:

1. Culture cells
2. Fix cells — wash in expensive reagents to remove the culture medium.
3. Permeabilization — wash in more expensive chemicals to permeabilize the cell membrane (to stain for intracellular proteins).
4. Blocking — incubate cells in another expensive reagent to prevent non-specific antibody binding.
5. Primary Antibody Incubation — incubate with an antibody that binds specifically to the protein being produced.
6. Washing — remove unbound primary antibodies using more expensive chemicals.
7. Nuclear staining — use a nuclear stain such as DAPI to visualize cell nuclei, then wash with the same chemicals from the washing step.
8. Mounting — prepare the sample to be read in the microscope (plate reader¹).
9. Imaging — image the stained cells, count them, and determine their state in the protein production cycle and relative cell health (eventually the cells stop producing and the batch needs to be flushed). The outputs are the cell count, viability numbers, etc., not the image itself.

Using an AI pipeline that includes multiple deep learning models along with data pre- and post-processing, we can go from Step 1 directly to Step 9, removing most of the labor and latency in getting actionable results out of a staining workflow and bypassing the need for expensive specialty chemicals. Intel has put together a reference implementation for deploying this pipeline and running inference on these images at the edge as part of the Cell Image project (https://www.cellimage.ie/). The OpenVINO Toolkit, OpenVINO Model Server, and AI Connect for Scientific Data are used in this design. Let’s briefly talk about each of these software packages.

Intel SW Packages Applied in Cell Therapy Solutions

Intel® Distribution of OpenVINO™ toolkit optimizes, tunes, and runs comprehensive deep learning inferencing on general-purpose Intel architecture. It provides high compute performance and rich deployment options, from edge to cloud, and in recent years has grown to quite a broad ecosystem with multiple tools, repositories, and components.

At the core of the OpenVINO toolkit is the OpenVINO Runtime, which loads and runs the models. The runtime employs plugins that are responsible for efficiently executing the low-level operations of a deep learning model on Intel hardware. There are different plugins for different hardware, such as CPU plugins, GPU plugins, and heterogeneous plugins.

The CPU plugin achieves a high performance of neural networks on the CPU, using the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN).

The GPU plugin uses the Intel® Compute Library for Deep Neural Networks (clDNN) to infer deep neural networks on GPUs.

The heterogeneous plugin enables computing the inference of one network on several devices. The purposes of executing networks in heterogeneous mode are to:

· Utilize the power of accelerators to process the heaviest parts of the network and to execute unsupported layers on fallback devices like the CPU.
· Utilize all available hardware more efficiently during one inference.
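To make the plugin mechanism concrete, here is a minimal sketch (not taken from the project) of compiling an already converted model for different devices; the model path is illustrative, and the GPU and heterogeneous targets assume an Intel GPU with the matching drivers is present:

```python
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # OpenVINO IR produced by the conversion step

# CPU plugin: run the whole network on the CPU.
compiled_cpu = core.compile_model(model, "CPU")

# GPU plugin: run the network on an Intel GPU (requires GPU drivers).
# compiled_gpu = core.compile_model(model, "GPU")

# Heterogeneous plugin: put supported layers on the GPU and fall back to the CPU
# for any operations the GPU plugin cannot execute.
# compiled_hetero = core.compile_model(model, "HETERO:GPU,CPU")

print("Available devices:", core.available_devices)
```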

Another part of the OpenVINO toolkit is the Model Optimizer, which optimizes and converts models from popular deep learning frameworks such as TensorFlow, PyTorch, and ONNX into the OpenVINO intermediate representation (IR) format. The models are optimized with techniques such as quantization, freezing, fusion, and more. Models can be deployed across a mix of Intel® hardware and environments, on-premises and on-device, in the browser, or in the cloud.
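As a minimal sketch, this conversion can also be driven from Python with the convert_model API, which performs the same role; the file names below are illustrative:

```python
import openvino as ov

# Convert a trained model (ONNX here; TensorFlow and PyTorch are also supported)
# into an in-memory OpenVINO model, then save it as IR (.xml + .bin).
ov_model = ov.convert_model("unet_cells.onnx")
ov.save_model(ov_model, "unet_cells.xml", compress_to_fp16=True)
```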

Besides inference, OpenVINO provides the Neural Network Compression Framework (NNCF), a tool for applying compression algorithms such as quantization to models during or after training.
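A minimal sketch of NNCF’s post-training quantization flow is shown below; the model path, input shape, and calibration data are placeholders (random arrays standing in for real preprocessed images), not the project’s actual setup:

```python
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("unet_cells.xml")

# Placeholder calibration data; in practice this would be a representative set
# of real preprocessed images.
calibration_items = [np.random.rand(1, 1, 256, 256).astype(np.float32) for _ in range(100)]
calibration_dataset = nncf.Dataset(calibration_items)

# 8-bit post-training quantization of the OpenVINO model.
quantized_model = nncf.quantize(model, calibration_dataset)
ov.save_model(quantized_model, "unet_cells_int8.xml")
```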

Figure 1: OpenVINO™ overview. For detailed documentation about OpenVINO™ see: openvino.ai

OpenVINO™ Model Server (OVMS)

When it comes to deployment, you can use the OpenVINO Runtime directly, or you can use the OpenVINO Model Server (OVMS for short).

OVMS is a scalable, high-performance tool for serving AI models and pipelines. It also enables centralized AI model management, which helps maintain consistent AI models across numerous devices, clouds, or compute nodes. In simple terms, it is a microservice that loads your models, manages them, and exposes their capabilities through a network API, so other components in your system can work with them and make use of those models. OVMS exposes two kinds of API — TensorFlow Serving-compatible and KServe-compatible — and both provide inference, model status, and model metadata services via gRPC or RESTful API².
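As a minimal sketch, assuming an OVMS instance serving a model named unet_cells on gRPC port 9000 (the address, model name, and input name are all illustrative), the gRPC API can be called from Python with the ovmsclient package:

```python
import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# Model metadata: expected input/output names, shapes, and precisions.
print(client.get_model_metadata(model_name="unet_cells"))

# Inference over gRPC; the dict key must match the model's input name.
image = np.random.rand(1, 1, 256, 256).astype(np.float32)
outputs = client.predict(inputs={"image": image}, model_name="unet_cells")
```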

Why would you use OVMS instead of the OpenVINO Runtime? In some situations there is no other option. OpenVINO is a C++ project with an official Python binding, but what if your software stack is written in another language? Then you would need to implement your own interface, which is challenging. In that case, since OVMS already has the capabilities you need, it can simplify the work of bringing OpenVINO into your system. In addition, when your solution already follows the microservice paradigm, using OVMS is an obvious choice. Maybe you don’t want to expand your application by embedding OpenVINO into the business logic of other components, or you don’t want to complicate your build system. Or maybe some of your applications run on less powerful devices, such as mobile phones, and you don’t want to burden them with a heavy inferencing load; you need to delegate the computation to a more powerful machine. Because OVMS exposes a network API, you can have your components running on multiple devices and simply send your data in the request format to OVMS, which returns the model output as the response.

Because of the network API and the fact that OVMS is a microservice, it is a good fit for scaling up your solution. For example, if you have a multi-node Kubernetes cluster, you can run multiple replicas and put a load balancer in front of them, achieving high availability and throughput beyond the capability of a single node. Aggregating such a system is easily achievable by employing OVMS.

Besides these, for security and privacy purposes, OVMS enables you to host your model server on a machine that you trust, while the other applications that access it from inside or outside cannot see your model. You only expose its interface to other applications; they cannot access or see the model itself.

Figure 2. OpenVINO Model Server

Let’s look at the OVMS structure (Figure 2). At the top, we have the network interface, with gRPC and RESTful endpoints. Both endpoints support the TF Serving API and the KServe API, and with those you can call inference. You can also request the model metadata, which tells you what kind of input the model expects and what kind of output you can expect from it.
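The same calls can also be made over the KServe-compatible REST API. The sketch below assumes OVMS was started with a REST port (8000 here); the model and tensor names are illustrative:

```python
import numpy as np
import requests

base = "http://localhost:8000/v2/models/unet_cells"

# Model metadata via the KServe V2 REST API.
print(requests.get(base).json())

# Inference request in the KServe V2 JSON format.
image = np.random.rand(1, 1, 256, 256).astype(np.float32)
payload = {
    "inputs": [{
        "name": "image",
        "shape": list(image.shape),
        "datatype": "FP32",
        "data": image.flatten().tolist(),
    }]
}
print(requests.post(f"{base}/infer", json=payload).json())
```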

Another feature is that you can serve multiple models at the same time. You specify them in a configuration file, and OVMS takes care of model management: the Model Server monitors the model file locations and also supports model versioning. It is also worth mentioning that the model location does not need to be a local file system; OVMS supports remote storage file systems such as Google Cloud, AWS S3, and Azure.
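A minimal sketch of such a configuration file, generated here from Python for convenience (the model names and paths are illustrative):

```python
import json

# Two illustrative models: one served from a local path, one from S3-compatible
# remote storage.
config = {
    "model_config_list": [
        {"config": {"name": "unet_cells", "base_path": "/models/unet_cells"}},
        {"config": {"name": "cell_viability", "base_path": "s3://my-bucket/cell_viability"}},
    ]
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=4)
```

Pointing OVMS at this file (for example, with the --config_path option of the openvino/model_server container) serves both models at once and picks up new model versions as they appear under each base path.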

AI Connect for Scientific Data (AiCSD)

AI Connect for Scientific Data (AiCSD) is an open-source software sample that connects data from scientific instruments to AI pipelines and runs workloads at the edge.

It also manages pipelines for image processing and automated image comparisons. AiCSD is a containerized, microservices-based solution built on open-source EdgeX services, connected by a secure Redis Message Broker and various communication APIs, which makes it adaptable to different use cases and settings. Figure 3 shows the services created for this reference implementation.

The architectural components of AiCSD include:

· Microservices: Provided by Intel, the microservices include a user interface and applications for managing files and jobs.
· EdgeX Application Services: AiCSD uses the APIs from the EdgeX Applications Services to communicate and transfer information.
· EdgeX Services: The services include the database, message broker, and security services.
· Pipeline Execution: AiCSD furnishes an example pipeline for pipeline management.
· File System: AiCSD stores and manages input and output files.
· Third-party Input Devices: The devices supply the images that will be processed. Examples include an optical microscope or conveyor belt camera.

The reference architecture lets images be processed using assigned jobs. A job tracks the movement of the file, its status, and any results or outputs from the pipeline. To process a job, tasks are used to match the information about a job to the appropriate pipeline to run.

The process works as follows:

1. The Input Device/Imager writes the file to the OEM file system in a directory that is watched by the File Watcher. When the File Watcher detects the file, it sends the job (JSON struct of particular fields) to the Data Organizer via HTTP Request.
2. The Data Organizer sends the job to the Job Repository to create a new job in the Redis Database. The job information is then sent to the Task Launcher to determine if there is a task that matches the job. If there is, the job proceeds to the File Sender (OEM).
3. The File Sender (OEM) is responsible for sending both the job and the file to the File Receiver (Gateway). Once the File Receiver (Gateway) has written the file to the Gateway file system, the job is then sent on to the Task Launcher.
4. The Task Launcher verifies that there is a matching task for the job before sending it to the appropriate pipeline using the EdgeX Message Bus (via Redis). The ML pipeline subscribes to the appropriate topic and processes the file in its pipeline. The output file (if there is one) is written to the file system and the job is sent back to the Task Launcher.
5. The Task Launcher then decides if there is an output file or if there are just results. In the case of only results and no output file, the Task Launcher marks the job as complete. If there is an output file, the Task Launcher sends the job onward to the File Sender (Gateway).
6. The File Sender (Gateway) publishes the job information to the EdgeX Message Bus via Redis for the File Receiver (OEM) to subscribe and pull. The File Receiver (OEM) sends an HTTP request to the File Sender (Gateway) for the output file(s). The file(s) are sent as part of the response and the File Receiver (OEM) writes the output file(s) to the file system.

Figure 3: Architecture and High-level Dataflow

AI Pipeline for CHO Cell Segmentation Use Case

As scientific devices (plate readers¹) generate data files in their local file systems, the files need to be transferred to another device for analysis, where AI software and hardware resources are available. This divergence of data and model locations requires a flexible, microservice-based solution. We use the AiCSD edge microservice infrastructure to move the data to the edge companion compute device, because AiCSD utilizes EdgeX Foundry microservices to facilitate the automatic detection, management, and processing of scientific instrument data. This microservice flexibility is important in addressing the heterogeneous system integration and asymmetric data interfacing inherent in this project.

The AI pipeline includes an image preprocessing step, inference with multiple deep learning models optimized by the OpenVINO toolkit, and an image postprocessing step, all containerized using another open-source tool, BentoML. Using the OpenVINO Toolkit decreases inference latency and speeds up the process. The final results (cell segmentation and viability detection) are produced in the edge companion compute device’s file system. The pipeline then notifies the EdgeX Message Bus (part of the AiCSD microservice infrastructure) to copy the results to the original scientific device’s local file system and save them if needed.
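As an illustration only (not the project’s actual code), a stripped-down preprocess–infer–postprocess loop could look like the sketch below. It assumes a single UNet-style segmentation model already converted to OpenVINO IR, single-channel brightfield input, and a simple connected-component count as the postprocessing step; the paths, input size, and 0.5 threshold are all illustrative, and the BentoML containerization is omitted.

```python
import cv2
import numpy as np
import openvino as ov

core = ov.Core()
compiled = core.compile_model(core.read_model("unet_cells.xml"), "CPU")

def preprocess(path: str) -> np.ndarray:
    """Load a brightfield tile and shape it into the NCHW tensor the model expects."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (256, 256)).astype(np.float32) / 255.0
    return img[np.newaxis, np.newaxis, :, :]

def postprocess(prob_map: np.ndarray) -> int:
    """Binarize the segmentation output and count connected components (nuclei)."""
    mask = (prob_map[0, 0] > 0.5).astype(np.uint8)
    num_labels, _ = cv2.connectedComponents(mask)
    return num_labels - 1  # subtract the background label

prob_map = compiled(preprocess("brightfield_tile.png"))[compiled.output(0)]
print("Nuclei count:", postprocess(prob_map))
```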

Figure 4 shows an example of using DL models to process cell images. In this example, UNet is used to mask and count MSC nuclei.

Figure 4. MSC Nuclei counting using UNet deep learning model.

References:

1. A plate reader is a laboratory instrument used to obtain images from samples in microtiter plates. The reader shines a specific calibrated frequency of light (UV, visible, fluorescence, etc.) through the samples in the wells of the plate. Plate reader microscopy data sets have inherent variability which drives the requirement of regular tracked calibration and adjustment.
2. https://docs.openvino.ai/archive/2023.2/ovms_what_is_openvino_model_server.html

Notices & Disclaimers

Intel technologies may require enabled hardware, software, or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
