The Latest in Real-Tim(e) Analytics: Generative AI, LLM and Beyond
Thursday October 26th, we had a meetup with two Tims speaking on real-time analytics, LLMs, and more. It was a great event at Cloudera’s NYC office in Manhattan.
All the materials, slides, and code are linked below.
Quick Statistics
- 30 attendees in-person
- 2 speakers named Tim
- 1 LLM Result indicating Timistan exists
- 8 ASF Projects mentioned (Apache NiFi, Apache Pinot, Apache Kafka, Apache Flink, Apache Iceberg, Apache Spark, Apache Tika, Apache OpenNLP)
- 9 Whole NYC Pizzas eaten
- 20+ sodas drunk
- 2 demos on large language models
- 2 demos running Apache NiFi and WatsonX.AI
- 1 source code https://github.com/tspannhw/FLaNK-watsonx.ai/tree/main
- 2 slide sets (https://www.slideshare.net/bunkertor/26oct2023adding-generative-ai-to-realtime-streaming-pipelines-nyc-meetup, )
- 100 times NYC Evolve Mentioned
- 1 broken video recording https://www.youtube.com/watch?v=GzZPWbYqcIY
- Lots of Pinot, no wine
Tim Spann spoke first on utilizing Apache NiFi, Apache Kafka and Apache Flink to build a live data pipeline connected to various Large Language Models (LLM) for Generative AI.
As is tradition for meetups, we had a few hiccups. The first was that the pizza arrived 40 minutes late, which pushed us to run a little behind. The second was worse: the live feed had no audio, so I had to redub my piece. Hopefully Tim Veil can redo his Pinot talk later.
My first no-code real-time demo receives events from Slack, parses messages posted in the “General” channel of my nifi-se Slack group that start with “Q:”, and then routes, filters, enriches, and prompt-builds those questions before feeding them to various REST endpoints that serve Generative AI models. For this run I utilized WatsonX.AI’s IBM Cloud Generative AI REST endpoint and the meta-llama/llama-2-70b-chat model.
One of the routing options within Cloudera DataFlow (powered by Apache NiFi) is a SQL query, powered by Apache Calcite, that determines whether the message is really a question. As you can see, it is really easy.
SELECT * FROM FLOWFILE
WHERE upper(inputs) LIKE 'QUESTION:%'
OR upper(inputs) LIKE 'Q:%'
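For readers without a NiFi flow handy, the same routing rule can be sketched in plain Python. This is a hypothetical helper mirroring the Calcite predicates above, not part of the actual flow:

```python
def looks_like_question(inputs: str) -> bool:
    """Mirror the Calcite routing query: keep messages whose text
    starts with 'QUESTION:' or 'Q:' (case-insensitive), matching the
    upper(inputs) LIKE 'QUESTION:%' / 'Q:%' predicates."""
    upper = inputs.upper()
    return upper.startswith("QUESTION:") or upper.startswith("Q:")

print(looks_like_question("Q: What is Apache NiFi?"))    # True
print(looks_like_question("Just chatting about pizza"))  # False
```

Messages that pass this check continue down the enrichment and prompt-building path; everything else is filtered out.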
In other examples I call HuggingFace, Cloudera Machine Learning, Amazon Bedrock, and more. For the meetup I used WatsonX.AI, as I am working with IBM on a few streaming AI examples that utilize Cloudera DataFlow and Streaming.
To call WatsonX.AI, you must first get a token utilizing your unique secure key.
Get identity Token
https://iam.cloud.ibm.com/identity/token
With an HTTP Payload:
grant_type=urn%3Aibm%3Aparams%3Aoauth%3Agrant-type%3Aapikey&apikey=YOURIBMKEY
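In Python, the token exchange above can be sketched with the standard library. This is a minimal sketch, assuming the standard IBM Cloud IAM response shape; `YOURIBMKEY` is your own API key:

```python
import json
import urllib.parse
import urllib.request

IAM_URL = "https://iam.cloud.ibm.com/identity/token"

def build_token_body(api_key: str) -> bytes:
    """URL-encode the grant_type and apikey form fields, producing
    the same payload shown above."""
    return urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode()

def fetch_token(api_key: str) -> str:
    """POST the form to IBM IAM and return the access_token field."""
    req = urllib.request.Request(
        IAM_URL,
        data=build_token_body(api_key),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["access_token"]
```

In the NiFi flow this same exchange is an InvokeHTTP processor; the returned token is then parsed and passed along as an attribute.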
Once you have a token and parse it, you can call the endpoint and run your prompt against the model.
Call the model, after the identity grant, using the identity token
https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text?version=2023-05-29
Build the prompt utilizing a cleaned-up version of the question:
{
  "model_id": "meta-llama/llama-2-70b-chat",
  "input": "${inputs:urlEncode()}",
  "parameters": {
    "decoding_method": "greedy",
    "max_new_tokens": 200,
    "min_new_tokens": 50,
    "stop_sequences": [],
    "repetition_penalty": 1
  },
  "project_id": "SomeProjectId,CreateThisInthePromptToolInIBMCloud"
}
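Putting it together, the generation call can be sketched in Python. The endpoint URL, model id, and parameters come straight from above; `build_generation_body` is a hypothetical helper standing in for NiFi's `${inputs:urlEncode()}` expression, and the response shape is not inspected here:

```python
import json
import urllib.parse
import urllib.request

GEN_URL = ("https://us-south.ml.cloud.ibm.com"
           "/ml/v1-beta/generation/text?version=2023-05-29")

def build_generation_body(question: str, project_id: str) -> dict:
    """Assemble the same JSON body shown above; the question is
    URL-encoded to mirror ${inputs:urlEncode()}."""
    return {
        "model_id": "meta-llama/llama-2-70b-chat",
        "input": urllib.parse.quote(question),
        "parameters": {
            "decoding_method": "greedy",
            "max_new_tokens": 200,
            "min_new_tokens": 50,
            "stop_sequences": [],
            "repetition_penalty": 1,
        },
        "project_id": project_id,
    }

def generate_text(question: str, project_id: str, token: str) -> dict:
    """POST the prompt with the bearer token from the IAM exchange."""
    body = json.dumps(build_generation_body(question, project_id)).encode()
    req = urllib.request.Request(
        GEN_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

In the flow itself this is another InvokeHTTP call, with the generated text landing back in the FlowFile for downstream Kafka and Flink processing.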
Example Slack Questions and Responses
Join me next week in Manhattan.
Kafka Topic Event Data with Generated Text
Flink SQL Against Kafka Topic Messages From Enriched LLM Results in NiFi
Second Demo
Source Code
References
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=models-
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=models-prompt-tips
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=tips-sample-prompts
- https://www.ibm.com/docs/en/watsonx-as-a-service?topic=models-parameters
- https://www.meetup.com/futureofdata-newyork/events/295516928/
2023 — Tim Spann