The Latest in Real-Tim(e) Analytics: Generative AI, LLM and Beyond
Thursday October 26th, we had a meetup with two Tims speaking on real-time analytics, LLMs, and more. It was a great event at Cloudera’s NYC office in Manhattan.
All the materials, slides, and code are linked below.
Quick Statistics
- 30 attendees in-person
- 2 speakers named Tim
- 1 LLM Result indicating Timistan exists
- 8 ASF Projects mentioned (Apache NiFi, Apache Pinot, Apache Kafka, Apache Flink, Apache Iceberg, Apache Spark, Apache Tika, Apache OpenNLP)
- 9 Whole NYC Pizzas eaten
- 20+ sodas drunk
- 2 demos on large language models
- 2 demos running Apache NiFi and WatsonX.AI
- 1 source code https://github.com/tspannhw/FLaNK-watsonx.ai/tree/main
- 2 slide sets (https://www.slideshare.net/bunkertor/26oct2023adding-generative-ai-to-realtime-streaming-pipelines-nyc-meetup, )
- 100 times NYC Evolve Mentioned
- 1 broken video recording https://www.youtube.com/watch?v=GzZPWbYqcIY
- Lots of Pinot, no wine
Tim Spann spoke first on utilizing Apache NiFi, Apache Kafka and Apache Flink to build a live data pipeline connected to various Large Language Models (LLM) for Generative AI.
As is tradition for meetups, we had a few hiccups. The first was that the pizza arrived 40 minutes late, which pushed us to run a little behind. The second was worse: the live feed had no audio, so I had to redub my piece. Hopefully Tim Veil can redo his Pinot talk later.
My first no-code real-time demo receives events from Slack, parses messages posted in the “General” channel of my nifi-se Slack group that start with “Q:”, and then routes, filters, enriches, and prompt-builds those questions before feeding them to various REST endpoints that serve Generative AI models. For this run I utilized WatsonX.AI’s IBM Cloud Generative AI REST endpoint and the meta-llama/llama-2-70b-chat model.
One of the routing options within Cloudera DataFlow (powered by Apache NiFi) is a SQL query, powered by Apache Calcite, that determines whether the message is really a question. As you can see, it is really easy.
SELECT * FROM FLOWFILE
WHERE upper(inputs) LIKE 'QUESTION:%'
OR upper(inputs) LIKE 'Q:%'
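For readers without a NiFi flow handy, the same routing rule can be sketched in plain Python. This is a hypothetical helper mirroring the Calcite predicates above, not part of the actual flow:

```python
def looks_like_question(inputs: str) -> bool:
    """Mirror the Calcite routing query: keep messages whose text
    starts with 'QUESTION:' or 'Q:' (case-insensitive), matching the
    upper(inputs) LIKE 'QUESTION:%' / 'Q:%' predicates."""
    upper = inputs.upper()
    return upper.startswith("QUESTION:") or upper.startswith("Q:")

print(looks_like_question("Q: What is Apache NiFi?"))    # True
print(looks_like_question("Just chatting about pizza"))  # False
```

Messages that pass this check continue down the enrichment and prompt-building path; everything else is filtered out.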
In other examples I call HuggingFace, Cloudera Machine Learning, Amazon Bedrock, and more. For the meetup I used WatsonX.AI, as I am working with IBM on a few streaming AI examples that utilize Cloudera DataFlow and Streaming.
To call WatsonX.AI, you must first get a token utilizing your unique secure key.
Get identity Token
https://iam.cloud.ibm.com/identity/token
With an HTTP Payload:
grant_type=urn%3Aibm%3Aparams%3Aoauth%3Agrant-type%3Aapikey&apikey=YOURIBMKEY
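In Python, the token exchange above can be sketched with the standard library. This is a minimal sketch, assuming the standard IBM Cloud IAM response shape; `YOURIBMKEY` is your own API key:

```python
import json
import urllib.parse
import urllib.request

IAM_URL = "https://iam.cloud.ibm.com/identity/token"

def build_token_body(api_key: str) -> bytes:
    """URL-encode the grant_type and apikey form fields, producing
    the same payload shown above."""
    return urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode()

def fetch_token(api_key: str) -> str:
    """POST the form to IBM IAM and return the access_token field."""
    req = urllib.request.Request(
        IAM_URL,
        data=build_token_body(api_key),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["access_token"]
```

In the NiFi flow this same exchange is an InvokeHTTP processor; the returned token is then parsed and passed along as an attribute.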
Once you have a token and parse it, you can call the endpoint and run your prompt against the model.
Call the model, after the identity grant, using the identity token
https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text?version=2023-05-29
Build the prompt utilizing a cleaned-up version of the question:
{
  "model_id": "meta-llama/llama-2-70b-chat",
  "input": "${inputs:urlEncode()}",
  "parameters": {
    "decoding_method": "greedy",
    "max_new_tokens": 200,
    "min_new_tokens": 50,
    "stop_sequences": [],
    "repetition_penalty": 1
  },
  "project_id": "SomeProjectId,CreateThisInthePromptToolInIBMCloud"
}
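Putting it together, the generation call can be sketched in Python. The endpoint URL, model id, and parameters come straight from above; `build_generation_body` is a hypothetical helper standing in for NiFi's `${inputs:urlEncode()}` expression, and the response shape is not inspected here:

```python
import json
import urllib.parse
import urllib.request

GEN_URL = ("https://us-south.ml.cloud.ibm.com"
           "/ml/v1-beta/generation/text?version=2023-05-29")

def build_generation_body(question: str, project_id: str) -> dict:
    """Assemble the same JSON body shown above; the question is
    URL-encoded to mirror ${inputs:urlEncode()}."""
    return {
        "model_id": "meta-llama/llama-2-70b-chat",
        "input": urllib.parse.quote(question),
        "parameters": {
            "decoding_method": "greedy",
            "max_new_tokens": 200,
            "min_new_tokens": 50,
            "stop_sequences": [],
            "repetition_penalty": 1,
        },
        "project_id": project_id,
    }

def generate_text(question: str, project_id: str, token: str) -> dict:
    """POST the prompt with the bearer token from the IAM exchange."""
    body = json.dumps(build_generation_body(question, project_id)).encode()
    req = urllib.request.Request(
        GEN_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

In the flow itself this is another InvokeHTTP call, with the generated text landing back in the FlowFile for downstream Kafka and Flink processing.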
Example Slack Questions and Responses
Join me next week in Manhattan.
Kafka Topic Event Data with Generated Text
Flink SQL Against Kafka Topic Messages From Enriched LLM Results in NiFi
Second Demo
Source Code
References
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=models-
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=models-prompt-tips
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=tips-sample-prompts
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=models-parameters
- https://www.meetup.com/futureofdata-newyork/events/295516928/
2023 — Tim Spann