Data professionals can use streams flows to quickly put together real-time streaming applications without having to write flow topology code. Users design flows in a web-based drag-and-drop UI and by specifying parameter values. Some of the data-processing business logic can also be implemented with specialized operators, with no coding. We are continuously adding more out-of-the-box coverage, but there will always be cases where custom coding is required.
This article clarifies when to use which type of custom code operator, and recommends best practices. For more details about streams flows, see the streams flows documentation.
So when can you rely on specialized streams flow operators like Filter, and when is it best to write your own code, and how?
Specialized operators or custom code?
When constructing Streams Flows, you can drag and drop various operators from the palette onto the canvas.
As a rule: If there is a specialized operator that fits the bill, use it!
Some specialized operators
- Sources (ingestion): IBM Event Streams, IBM Watson IoT, MQTT.
- Targets (sinks): IBM Cloudant, IBM Event Streams, IBM Cloud Object Storage, Redis.
- Processing and analytics: Filter, Aggregation (count, sum, average, min, max, standard deviation).
But what if there is no specialized operator for what you are building?
Cases for custom code
- Connect to sources and targets for which there is no specialized operator in the palette.
Examples: MongoDB, Cassandra.
- Specific business logic data transformations and calculations, beyond what is covered by Filter and the available Aggregation functions.
- Parse, convert, or format data according to a format that is not supported in specialized operators.
— Parse dates and times that arrive as strings in non-ISO-8601 format.
— Parse data that’s arriving in Avro format.
- Pre- and post-process data for model scoring.
- Advanced data extraction using regular expressions and other types of custom parsing.
- Easily generate sample data, such as for incremental development or demos.
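Two of the cases above, parsing non-ISO-8601 timestamps and extracting fields with regular expressions, can be sketched in plain Python. This is only an illustration: the timestamp format, the line layout, and the function names `parse_timestamp` and `extract_reading` are hypothetical and not part of any streams flows API.

```python
import re
from datetime import datetime, timezone

# Hypothetical input format, e.g. "03/15/2019 14:05:09" (assumed to be UTC).
TIMESTAMP_FORMAT = "%m/%d/%Y %H:%M:%S"

# Hypothetical raw line layout: "<device id> temp=<number>C"
READING_PATTERN = re.compile(r"^(?P<device>\S+)\s+temp=(?P<temp>-?\d+(?:\.\d+)?)C$")


def parse_timestamp(text):
    """Convert a non-ISO-8601 timestamp string to an ISO 8601 string."""
    parsed = datetime.strptime(text, TIMESTAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc).isoformat()


def extract_reading(line):
    """Return (device, temperature), or None if the line does not match."""
    match = READING_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.group("device"), float(match.group("temp"))
```

Helpers like these can be developed and tested outside the flow first, and then called from custom code once they behave as expected.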
Cloud functions or (Python) code operators?
These are the operators for inserting custom code: the Code operator and the Cloud Function operator.
So which one is best to use? It depends. This table points out the differences to help you pick the right approach for each use case.
Develop Code operators effectively
For enhanced productivity, it can help to combine built-in coding support with external tools.
The built-in coding support includes:
- Python syntax highlighting and validation, as you type.
- Logging user messages and raising user errors. At runtime, they all conveniently appear as notifications in the streams flow UI, on the Metrics page. You can also download the user log which contains these user messages and errors.
- As with all streams flow operators, at runtime, you can view sampled data and throughput rate at the inputs and outputs of code operators.
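As a rough illustration of logging user messages and raising user errors, the sketch below uses only Python's standard `logging` module and a built-in exception. The function name `check_reading`, the `temperature` attribute, and the threshold are hypothetical; how messages and errors are surfaced at runtime depends on the streams flows environment.

```python
import logging

# Module-level logger; the logger name used here is an assumption.
logger = logging.getLogger(__name__)


def check_reading(event):
    """Log a message for each event and raise an error on invalid input."""
    temperature = event.get("temperature")  # hypothetical attribute name
    if temperature is None:
        # Raising an exception signals a user error for this event.
        raise ValueError("event has no 'temperature' attribute: %r" % (event,))
    if temperature > 100:  # hypothetical threshold
        logger.warning("unusually high temperature: %s", temperature)
    else:
        logger.info("temperature ok: %s", temperature)
    return event
```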
Utilizing external tools
For support that complements built-in coding assistance:
1. Use an external tool for authoring your code, such as:
a. your favorite editor or IDE, with support for:
— auto-complete, refactor, auto-layout, …
— test/debug your business logic
— version control integration
b. or a notebook, where you can quickly run and test your code.
2. Use the developed code and:
— copy it into the code operator editor, arranging it appropriately inside the applicable callback functions (`init`, `process`, `produce`),
— or turn it into an external code package, to use from within your code operator.
Here is how to install Python packages and use them in Streams Flows code operators.
Important: In all these cases, keep in mind that in streams flows, this code runs in a Python 3.5 interpreter.
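To make the arrangement into callbacks more concrete, here is a minimal sketch of externally developed logic placed inside `init` and `process`. The signatures `init(state)` and `process(event, state)`, the use of a dictionary for `state` and `event`, and the convention that returning `None` emits no output are assumptions made for illustration; check the code operator template in the editor for the actual contract. The attribute names are hypothetical. The code avoids f-strings and other syntax newer than Python 3.5.

```python
def init(state):
    """Assumed to run once at flow start; prepares the running totals."""
    state["count"] = 0
    state["total"] = 0.0


def process(event, state):
    """Assumed to run once per event; enriches it with a running average."""
    value = event.get("temperature")  # hypothetical input attribute
    if value is None:
        # Assumption: returning None means no output for this event.
        return None
    state["count"] += 1
    state["total"] += value
    event["runningAverage"] = state["total"] / state["count"]
    # str.format instead of an f-string, which Python 3.5 does not support.
    event["label"] = "reading #{}".format(state["count"])
    return event
```

Because both functions take plain dictionaries, they can be exercised in an editor or notebook by calling `init({})` and then `process` with sample events before pasting them into the operator.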
When writing custom code, take into account limitations of streams flow development:
- The debug facilities and runtime customization are limited.
- There is no built-in version control.
This article can help data professionals quickly build real-time streaming applications by designing streams flows, with no or minimal custom coding.
We recommend the following regarding custom code:
- Where available, use specialized operators rather than writing custom code.
- There are typical use cases that require writing custom code.
- If you need to write custom code, use the comparison table to pick between a Code operator and a Cloud Function operator.
- Consider productivity-enhancing practices when authoring code in Code operators.
Let us know about functional gaps that you would like to be covered by specialized operators.