Next Generation NiFi Sneak Peek

Tim Spann
Cloudera
Dec 6, 2023

Apache NiFi 2.0.0-M1 is available

You will need JDK 21 to run it! If you are not using SDKMAN! to manage your JDKs, now is the time.
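
Before starting NiFi it can be worth confirming which Java is actually on your PATH. Here is a minimal sketch, assuming the usual `java -version` output format; the function names `parse_java_major` and `installed_java_major` are my own and not part of NiFi or SDKMAN!.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier use the legacy "1.8.0_292" scheme, so the real
    # major version is the second number.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run the `java` found on PATH and return its major version, or None."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes to stderr on most JDKs, so check both streams.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)
```

Calling `installed_java_major()` and comparing the result against 21 gives a quick go/no-go check before you launch NiFi 2.0.0-M1.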

Let’s give it a quick trial run. What I notice first is that it’s fast, clean and stable.

I like to explore NiFi settings first.

Look at all those types of reporting tasks. Send your NiFi provenance, logs and errors somewhere expensive in the cloud, or keep them local and cheap.

Flow analysis rules are a cool way to help your new developers. Lots of work is going on there. Perhaps an LLM could pitch in someday?

There are a ton more controller services, covering just about every database and cloud service ever. If you have the time and money, explore what is available for all of the big three public clouds.

If the normal way of storing and accessing parameters (too good for JSON files?) doesn’t work for you, then look at all these amazing options.

You can get parameters from AWS Secrets Manager, Azure Key Vault Secrets, a JDBC database, environment variables like it’s 1990, a file, GCP Secret Manager or HashiCorp Vault! Don’t see your favorite? Build it. The API and everything else is open and extensible.
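
To make the environment variable option concrete, here is a rough Python sketch of the idea: pick out the variables that match a pattern and expose them as named parameters. This is not NiFi's provider code. The function name `parameters_from_environment` and the `NIFI_PARAM_` prefix are hypothetical, chosen only for illustration.

```python
import os
import re


def parameters_from_environment(environ=None, include_pattern=r"^NIFI_PARAM_"):
    """Collect environment variables whose names match include_pattern.

    The NIFI_PARAM_ default is a made-up convention for this sketch,
    not something NiFi defines.
    """
    environ = os.environ if environ is None else environ
    pattern = re.compile(include_pattern)
    return {name: value for name, value in environ.items() if pattern.search(name)}
```

Passing a plain dict as `environ` makes the function easy to try out without touching your real environment.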

Let’s build a new Process Group.

They can be stateless now! So many options, such streaming.

New Features of 2.0.0-M1

  • Initial version of native Python API for Processors
  • Stateless Execution mode for Process Groups
  • Flow Analysis Rules API
  • Kubernetes-based Leader Election and State Management extensions
  • Python-based Processors for interacting with ChatGPT and Vector Databases
  • ListenOTLP Processor for collecting OpenTelemetry
  • ListenSlack and ConsumeSlack Processors for handling messages from Slack
  • EncryptContentAge and DecryptContentAge Processors supporting age-encryption.org specification
  • Schema Registry Services for Amazon Glue and Apicurio
  • Parameter Provider for 1Password Vault
  • YamlTreeReader for YAML as Records
  • PackageFlowFile Processor for writing file streams and attributes as FlowFile Version 3
  • Migrated from H2 Database Engine to JetBrains Xodus for storing Flow Configuration History
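
To give a feel for the native Python API listed above, here is a small sketch of a processor that upper-cases FlowFile content. It follows the `FlowFileTransform` shape as I understand it from the NiFi Python developer guide, so treat the details as assumptions that may differ in this milestone release. The class name `UppercaseContent` and the `uppercased` attribute are made up, and the `except ImportError` branch defines stand-in classes only so the sketch can be tried outside NiFi.

```python
try:
    # Only importable inside a NiFi 2.x Python processor runtime.
    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult
except ImportError:
    # Hypothetical stand-ins so the sketch runs outside NiFi.
    class FlowFileTransform:
        pass

    class FlowFileTransformResult:
        def __init__(self, relationship, attributes=None, contents=None):
            self.relationship = relationship
            self.attributes = attributes
            self.contents = contents


class UppercaseContent(FlowFileTransform):
    class Java:
        # Tells NiFi which Java interface this Python class fulfills.
        implements = ['org.apache.nifi.python.processor.FlowFileTransform']

    class ProcessorDetails:
        version = '0.0.1-SNAPSHOT'
        description = 'Upper-cases the text content of each FlowFile (illustrative only).'

    def __init__(self, **kwargs):
        pass

    def transform(self, context, flowfile):
        # Read the incoming content, assuming it is UTF-8 text.
        text = flowfile.getContentsAsBytes().decode('utf-8')
        return FlowFileTransformResult(
            relationship='success',
            contents=text.upper(),
            attributes={'uppercased': 'true'},
        )
```

Inside NiFi the framework calls `transform` once per FlowFile. Outside NiFi you can call it yourself with any object that offers a `getContentsAsBytes()` method.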

New Feature

  • [NIFI-8294] — Processor and Service for Microsoft Azure Data Explorer Integration
  • [NIFI-8497] — Add a SlackRecordSink controller service
  • [NIFI-8650] — Flow Analysis
  • [NIFI-9206] — Create a processor that is capable of removing fields from records
  • [NIFI-9972] — Add Processor for Copying Azure Blobs
  • [NIFI-10222] — Add Apicurio Schema Registry Service
  • [NIFI-10784] — Add Processor for Querying Apache IoTDB
  • [NIFI-11149] — Add PutRedisHashRecord processor
  • [NIFI-11167] — Add Excel Record Reader
  • [NIFI-11197] — Add YAML Record Reader
  • [NIFI-11230] — Log Warnings on Startup for OS Best Practices
  • [NIFI-11366] — MiNiFi/C2 — Support access via LB / Proxy
  • [NIFI-11385] — Expose JMX metrics from NiFi JVM
  • [NIFI-11466] — Add a ModifyCompression processor
  • [NIFI-11514] — Deprecate config.yml based configuration in favor of flow.json
  • [NIFI-11549] — Add Azure Queue Storage Processors using Azure SDK 12
  • [NIFI-11556] — Allow Process Group to be run as a Stateless Flow
  • [NIFI-11585] — Add ADLSCredentialsControllerServiceLookup
  • [NIFI-11586] — Add AzureStorageCredentialsControllerServiceLookup_v12
  • [NIFI-11807] — Add an ExtractRecordSchema processor
  • [NIFI-11827] — Create Schema Registry Controller Service for AWS Glue
  • [NIFI-11830] — Allow JSLTTransformJSON to apply the transform to each object rather than the whole file
  • [NIFI-11889] — Add Record Handling to PutTCP
  • [NIFI-11938] — Create a Processor to consume slack message events
  • [NIFI-11985] — Implement a processor to consume documents from Elasticsearch indices
  • [NIFI-12023] — Add FastCSV parser to CSVReader
  • [NIFI-12024] — Add CSV Writer property to CSVRecordSetWriter along with a FastCSV implementation
  • [NIFI-12033] — Add Processors Supporting age-encryption.org
  • [NIFI-12038] — Create processor to package FlowFiles
  • [NIFI-12068] — Allow documenting specific use cases for processors and controller services in annotations
  • [NIFI-12115] — Add ListenOTLP Processor for Collecting OpenTelemetry
  • [NIFI-12130] — PutIceberg: Ability to configure snapshot properties via dynamic attributes
  • [NIFI-12139] — Allow for cleaner migration of extensions’ configuration
  • [NIFI-12140] — Possibility to configure ACLs for redis connections
  • [NIFI-12233] — Parameter Context Provider for 1Password
  • [NIFI-12240] — Add Python processors that are capable of interacting with vector stores

Resources

Rate My Setup

Tim Spann
Cloudera

Principal Developer Advocate, Zilliz. Milvus, Attu, Towhee, GenAI, Big Data, IoT, Deep Learning, Streaming, Machine Learning. https://www.datainmotion.dev/