FLaNK AI — 01 April 2024

Tim Spann
Apr 1, 2024

01-April-2024

FLaNK / KNIFe AI Weekly

https://knifeai.blogspot.com/

Tim Spann @PaaSDev

https://pebble.is/PaaSDev

https://vimeo.com/flankstack

https://www.youtube.com/@FLaNK-Stack

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

COOL CHARITY by KIDS!

https://www.unveilx.org/

CODE + COMMUNITY

Please join my meetup groups: NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

This is Issue #131

https://github.com/tspannhw/FLiPStackWeekly

https://www.cloudera.com/solutions/dim-developer.html

New Releases

Apache Hive 4.0.0 https://hub.docker.com/r/apache/hive

Articles

Meetup Report

Real-Time Irish Transit Analytics

Adding Generative AI Results to SQL Streams

Image Processing with Custom Python and Apache NiFi 2.0

Cloudera + GenAI + NVIDIA NIM Microservices https://menews247.com/cloudera-to-enhance-genai-with-nvidia-nim-microservices/

https://blog.cloudera.com/data-architecture-and-strategy-in-the-ai-era/

https://blog.cloudera.com/clouderas-rhel-volution-powering-the-cloud-with-red-hat/

https://developer.nvidia.com/blog/translate-your-enterprise-data-into-actionable-insights-with-nvidia-nemo-retriever/

https://drive.google.com/file/d/11lCJAB272ruBa7AAVwYxaN2E2xooWizG/view

https://jack-vanlightly.com/blog/2024/3/26/the-sisyphean-struggle-and-the-new-era-of-data-infrastructure

https://pypi.org/project/streaming-jupyter-integrations/

https://thenewstack.io/how-nvidia-gpu-acceleration-supercharged-milvus-vector-database/

NiFi 2.0 Python https://medium.com/@sudeep.singh99/a-beginners-guide-to-nifi-2-0-custom-python-processor-ac6d8c7bda7b

Make sure you are on the right macOS version for new Java https://blogs.oracle.com/java/post/java-on-macos-14-4

https://www.datanami.com/2024/03/22/zilliz-unveils-game-changing-features-for-vector-search

https://towardsdatascience.com/automated-detection-of-data-quality-issues-54a3cb283a91

https://mlops.community/7-methods-to-secure-llm-apps-from-prompt-injections-and-jailbreaks/?

https://www.startdataengineering.com/post/change-data-capture-using-debezium-kafka-and-pg/

https://medium.com/@hubert.dulay/stream-processing-vs-real-time-olap-vs-streaming-database-339c75ca6772

https://www.cloudera.com/about/news-and-blogs/press-releases/2024-03-28-global-survey-reveals-90-of-it-leaders-believe-that-unifying-the-data-lifecycle-on-a-single-platform-is-critical-for-analytics-and-ai.html

https://nvidianews.nvidia.com/news/nvidia-blackwell-platform-arrives-to-power-a-new-era-of-computing

https://netflixtechblog.com/bending-pause-times-to-your-will-with-generational-zgc-256629c9386b

https://www.uber.com/en-GB/blog/balancing-hdfs-datanodes-in-the-uber-datalake/

https://techcrunch.com/2024/03/31/why-aws-google-and-oracle-are-backing-the-valkey-redis-fork/

Videos

Meetup Talk NYC

Irish Rail Preview

TCF Pro 2024

Streaming Traffic Cameras

NiFi 101

March 11, 2024 Princeton 23 Orchard Event

March 15, 2024 Trenton TCF

March 28, 2024 Meetup

Events

April 2, 2024: XtremeJ 2024. Virtual. https://xtremej.dev/2023/schedule/

April 8–11, 2024: NLIT Summit. Seattle. https://www.fbcinc.com/e/nlit/default.aspx

April 11, 2024: Conf42 LLM. Virtual.

April 12, 2024: AI Max Conference. 23 Orchard Princeton https://www.startupgrind.com/events/details/startup-grind-princeton-presents-startup-grind-hosts-ai-max-summit/

April 2024: AI Meetup NJ

EMEA | APAC: April 24, 2024, 9:30 AM CEST | 1:00 PM IST

AMER: April 25, 2024, 9:00 AM PDT | 12:00 PM EDT

Register now: http://spr.ly/6047Z3AjN

May 8–9, 2024: Data Summit 2024. Boston, MA. https://www.dbta.com/DataSummit/2024/default.aspx https://www.dbta.com/DataSummit/2024/Timothy-Spann.aspx

May 21, 2024: Gen AI and Beyond with NiFi 2.0. Virtual.

June 12, 2024: Budapest Data + ML Forum. Virtual.

https://budapestdata.hu/2024/en/

Cloudera Events https://www.cloudera.com/about/events.html

https://www.cloudera.com/events/cloudera-now-cdp.html?internal_keyplay=ALL&internal_campaign=FY25-Q1-AMER-WS-Cloudera-Now-Events-Page-P06&cid=701Hr000000tW6qIAE&internal_link=p06

More Events: https://www.linkedin.com/pulse/schedule-2024-tim-spann--y4coe

Code

Models

Tools

New

Vector DB built on ClickHouse https://github.com/myscale/myscaledb

Cool Tool — LLM Synthetic Data Generators

https://github.com/geraldyong/OpenAI_Synthetic/tree/main

https://github.com/quentinlintz/synthetic-data-generator

https://medium.com/@n-demia/how-to-prepare-test-data-via-openai-api-in-postman-7e378dde1f53

https://github.com/datadreamer-dev/DataDreamer

https://huggingface.co/collections/rbiswasfc/synthetic-data-generation-65ee68e821ddaff47073ed02

Flink Connectors (scroll down)

https://flink.apache.org/downloads/

Avro

Can’t handle integers longer than 19 decimal digits (its long type is a 64-bit signed integer).
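A quick way to see where that limit comes from, assuming the Avro note refers to the long primitive type (a 64-bit signed integer in the Avro specification). This is only a sketch in plain Python with no Avro library involved; the helper names fits_avro_long and max_digits are made up for illustration.

```python
# Assumption: the limit in question is Avro's `long`, a 64-bit signed integer.
AVRO_LONG_MIN = -(2**63)
AVRO_LONG_MAX = 2**63 - 1


def fits_avro_long(value: int) -> bool:
    """Hypothetical helper: True if `value` fits in a 64-bit signed integer."""
    return AVRO_LONG_MIN <= value <= AVRO_LONG_MAX


def max_digits() -> int:
    """Hypothetical helper: decimal digit count of the largest 64-bit signed integer."""
    return len(str(AVRO_LONG_MAX))


if __name__ == "__main__":
    print(AVRO_LONG_MAX, "has", max_digits(), "digits")
    print("10**19 fits:", fits_avro_long(10**19))
```

Note that not every 19-digit number fits either; anything above 2**63 - 1 needs another representation, such as a string or Avro's decimal logical type.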

Throwback Article

Get your Flink and Kafka

Discount

Discount access to DataSummit 2024 https://secure.infotoday.com/RegForms/DataSummit/?Priority=24SPKR

© 2020–2024 Tim Spann

Tim Spann

Principal Developer Advocate, Zilliz. Milvus, Attu, Towhee, GenAI, Big Data, IoT, Deep Learning, Streaming, Machine Learning. https://www.datainmotion.dev/