Image Processing with Custom Python and NiFi 2.0

Tim Spann
6 min read · Mar 13, 2024

Apache NiFi, Image Processing, BLIP, HuggingFace, Transformers, Python, Image Captioning

Overview of Apache NiFi Data Flow

Example Flow for Processing with all the image processors
Detailed flow for sending output to Discord and Slack

Step 1: CaptionImage

Image (JPG, PNG, GIF) is input.

Output is the original image plus a caption attribute.

Step 2: FacialEmotionsImageDetection

Image (JPG, PNG, GIF) is input.

Output is the original image plus numbered label# and score# attributes, for example:

label1: neutral
label2: angry
label3: happy
label4: sad
label5: fear
score1: 0.19129344820976257
score2: 0.18247057497501373
score3: 0.16521458327770233
score4: 0.16067346930503845
score5: 0.1489986777305603

Step 3: RESNetImageClassification

Image (JPG, PNG, GIF) is input.

Output is the original image plus a classificationlabel attribute.

Step 4: NSFWImageDetection

Image (JPG, PNG, GIF) is input.

Output is the original image plus normal and nsfw attributes holding the scores.

Step 5: Route on NSFW Status of Image

Step 6: Create Discord and Slack Message (UpdateAttribute)

Step 7: Build new JSON File (this is required for Discord)

Step 8a: Send to Discord

Step 8b: Send to Slack
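Steps 6 through 8 come down to turning FlowFile attributes into a message body. The flow does this with UpdateAttribute and NiFi's own processors, but the idea can be sketched in plain Python. This is a minimal sketch that assumes the messages are posted to Discord and Slack incoming webhooks, which accept JSON with a `content` field and a `text` field respectively. The function names and the choice of attributes are hypothetical.

```python
import json

DISCORD_CONTENT_LIMIT = 2000  # Discord rejects message content longer than this


def build_message(attrs):
    """Render a few FlowFile attributes as a plain-text report."""
    lines = [
        "Image Analysis ==== NiFi",
        f"File Name: {attrs.get('filename', '')}",
        f"Caption: {attrs.get('caption', '')}",
        f"Classification RES-NET: {attrs.get('classificationlabel', '')}",
    ]
    return "\n".join(lines)


def discord_payload(attrs):
    """JSON body for a Discord webhook; the text goes in 'content'."""
    return json.dumps({"content": build_message(attrs)[:DISCORD_CONTENT_LIMIT]})


def slack_payload(attrs):
    """JSON body for a Slack incoming webhook; the text goes in 'text'."""
    return json.dumps({"text": build_message(attrs)})
```

The truncation only guards against an oversized message; in the NiFi flow the same limit would have to be respected when the JSON file is built.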

CaptionImage

The first processor I have added for image processing and analytics is CaptionImage, which uses HuggingFace Transformers and the Salesforce BLIP model to generate a caption.

Here is the source code for the new CaptionImage processor.

https://github.com/tspannhw/FLaNK-python-processors/blob/main/CaptionImage.py
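The linked source is the reference. As a rough sketch of the captioning step on its own, the snippet below wraps the Transformers `image-to-text` pipeline. The checkpoint name `Salesforce/blip-image-captioning-base` and both function names are assumptions for illustration and may differ from what the processor actually uses. The pipeline task name also depends on the installed Transformers version.

```python
import io


def build_blip_captioner(model_id="Salesforce/blip-image-captioning-base"):
    """Load a BLIP captioning pipeline (model_id is an assumed checkpoint)."""
    # Imported lazily so this module loads even where the ML stack is missing.
    from PIL import Image
    from transformers import pipeline

    pipe = pipeline("image-to-text", model=model_id)

    def run(image_bytes):
        # Decode the raw JPG/PNG/GIF bytes and normalize to RGB for the model.
        image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
        return pipe(image)  # a list of dicts with a "generated_text" key

    return run


def caption_attributes(image_bytes, captioner):
    """Return the attribute to add; the image bytes themselves are not changed."""
    results = captioner(image_bytes)
    text = results[0]["generated_text"].strip() if results else ""
    return {"caption": text}
```

Passing the captioner in as an argument keeps the model load (slow, done once) separate from the per-image call.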

See this article for additional information and a use case:

Example Output

caption

someone holding a cell phone with a cat in the background

The best part of this processor is that the image is not lost or changed; we just add a caption attribute.
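That attribute-only behavior maps directly onto NiFi 2.0's Python `FlowFileTransform` API: when the result carries attributes but no `contents`, the FlowFile keeps its original bytes. The sketch below is a hedged illustration, not the real CaptionImage code. The processor name `AddImageAttributes` and the placeholder `analyze` function are hypothetical, and the `except ImportError` stand-ins exist only so the example can be run outside NiFi.

```python
try:
    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult
except ImportError:
    # Stand-ins (not part of NiFi) so the sketch can be exercised outside NiFi.
    class FlowFileTransform:
        pass

    class FlowFileTransformResult:
        def __init__(self, relationship, attributes=None, contents=None):
            self.relationship = relationship
            self.attributes = attributes
            self.contents = contents


class AddImageAttributes(FlowFileTransform):  # hypothetical processor name
    class Java:
        implements = ["org.apache.nifi.python.processor.FlowFileTransform"]

    class ProcessorDetails:
        version = "0.0.1-SNAPSHOT"
        description = "Adds analysis attributes to an image without touching its content."

    def __init__(self, **kwargs):
        # Placeholder analysis; a real processor would run a model here.
        # Attribute values are strings, as NiFi expects.
        self.analyze = lambda data: {"imagebytes": str(len(data))}

    def transform(self, context, flowfile):
        image_bytes = flowfile.getContentsAsBytes()
        try:
            attributes = self.analyze(image_bytes)
        except Exception:
            return FlowFileTransformResult(relationship="failure")
        # No `contents` argument: the FlowFile keeps its original bytes.
        return FlowFileTransformResult(relationship="success", attributes=attributes)
```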

Sorry BLIP, it’s actually a radiation detector.

The second image was captioned more accurately.

caption

there is a man standing on a stage with a microphone

Metadata Sent to Discord

For all of my new Python processors I put together a quick realistic workflow and recorded it. Let’s take a look at all of this in action.

FacialEmotionsImageDetector

The second processor is the Facial Emotions Image Detector.

This processor detects facial emotions in the image and returns them as attributes.
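To illustrate how a classifier's output could become the numbered label#/score# attributes, here is a minimal sketch. It assumes the model returns a list of `{"label": ..., "score": ...}` dicts, which is the shape Transformers image-classification pipelines produce. The function name and the top-five cutoff are assumptions based on the example output, not taken from the processor's source.

```python
def emotion_attributes(results, top_n=5):
    """Flatten classifier results into label1..labelN / score1..scoreN attributes."""
    # Highest score first, so label1/score1 is always the strongest emotion.
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)[:top_n]
    attrs = {}
    for i, r in enumerate(ranked, start=1):
        attrs[f"label{i}"] = r["label"]
        attrs[f"score{i}"] = str(r["score"])  # NiFi attribute values are strings
    return attrs
```

Because the scores are stored as strings, anything downstream that does arithmetic on them has to convert them back to numbers first.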

RESNetImageClassification

The third processor performs ResNet-50 image classification.

Output from this processor is the attribute classificationlabel.

NSFWImageDetection

The fourth processor detects NSFW images.
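The normal and nsfw attributes are what the routing step acts on. As a hedged sketch of that logic in plain Python (in the flow itself the decision is made by a NiFi routing processor), the snippet below maps classifier output to the two attributes and applies a cutoff. The function names and the 0.5 threshold are assumptions, not values from the flow.

```python
def nsfw_attributes(results):
    """Map [{'label': 'normal'|'nsfw', 'score': float}, ...] to two string attributes."""
    scores = {r["label"].lower(): r["score"] for r in results}
    return {
        "normal": str(scores.get("normal", 0.0)),
        "nsfw": str(scores.get("nsfw", 0.0)),
    }


def is_safe(attrs, threshold=0.5):
    """True when the nsfw score is below the (assumed) threshold."""
    return float(attrs["nsfw"]) < threshold
```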

Output to Slack

Image Analysis ==== NiFi 
On Date: ${date}
File Name: ${filename}
uuid : ${uuid}
Caption: ${caption}
Message Channel: ${messagechannel}
User uploaded: ${messagerealname} ${messageusername}
MSG Timestamp: ${messagetimestamp}
Time Zone: ${messageusertz}
mime-type: ${mime-type}
Title: ${title}
Classification RES-NET: ${classificationlabel}
Label: ${label1} Score: ${score1:trim():toDecimal():multiply(100)}
Label 2: ${label2} Score: ${score2:toDecimal():multiply(100)}
Label 3: ${label3} Score: ${score3:toDecimal():multiply(100)}
Label 4: ${label4} Score: ${score4:toDecimal():multiply(100)}
Label 5: ${label5} Score: ${score5:toDecimal():multiply(100)}
Normal: ${normal:toDecimal():multiply(100)}
NSFW: ${nsfw:toDecimal():multiply(100)}
=====
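The score expressions in this template chain Expression Language functions: the attribute is a string, `toDecimal()` turns it into a number, and `multiply(100)` scales it to a percentage. A rough Python equivalent, for illustration only, is shown below. The function names are hypothetical, and the rounding to two places is a choice made in this sketch; the template above prints the unrounded product.

```python
def as_percent(score_attr):
    """Mirror trim -> toDecimal -> multiply(100) on a string attribute."""
    return float(score_attr.strip()) * 100


def label_line(label, score_attr):
    """Format one label/score line, rounded for readability."""
    return f"Label: {label} Score: {as_percent(score_attr):.2f}"
```

With the earlier example values, `label_line("neutral", "0.19129344820976257")` gives `Label: neutral Score: 19.13`.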

Slack Input of Images To Analyze

Output from NiFi to Slack

#reports Channel

As you can see, we send all the fields filled in from attribute values, plus the JSON FlowFile as an attachment.

Output from NiFi to Discord

Tim Spann

Principal Developer Advocate, Zilliz. Milvus, Attu, Towhee, GenAI, Big Data, IoT, Deep Learning, Streaming, Machine Learning. https://www.datainmotion.dev/