Review of the latest Elyra Pipeline Editor improvements
The Visual Pipeline Editor is probably the most frequently used Elyra feature. Based on feedback we’ve received from users via our community channels, we’ve improved the editor to make it more powerful and easier to use. This blog post highlights three major improvements you’ll find in version 3.12.
Pipeline node defaults
When a pipeline contains many nodes, configuring the properties of each node individually gets tedious. Say you’re creating a pipeline that runs several Jupyter notebooks using the built-in generic components. At a minimum, you have to assign a runtime image to each notebook:
- open the node properties for the first node
- configure the runtime image for this node
- open the node properties for the second node
- …
You can avoid these repetitive steps by configuring node defaults. You set them in the pipeline properties tab, which you open from the slide-out panel on the right-hand side.
In the node properties tab, applicable default values are displayed and can be overridden, if desired.
Node defaults are divided into three categories:
- Defaults that apply to all nodes, generic and custom
- Defaults that apply only to generic nodes
- Defaults that apply only to custom nodes
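To make the override behavior concrete, here is a minimal Python sketch of how a default and a node-level value could be combined. This is not Elyra’s implementation: the function `resolve_node_properties`, the dictionary layout, and the property keys and values are all hypothetical and only illustrate the idea that category-specific defaults refine the common ones and node-level values win.

```python
from typing import Any, Dict


def resolve_node_properties(
    node_kind: str,
    node_properties: Dict[str, Any],
    pipeline_defaults: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Return the effective properties for one node (illustrative only).

    Precedence, lowest to highest:
      1. defaults that apply to all nodes
      2. defaults that apply to the node's kind ("generic" or "custom")
      3. values set on the node itself
    """
    if node_kind not in ("generic", "custom"):
        raise ValueError(f"unknown node kind: {node_kind!r}")

    resolved: Dict[str, Any] = {}
    resolved.update(pipeline_defaults.get("all", {}))
    resolved.update(pipeline_defaults.get(node_kind, {}))
    resolved.update(node_properties)  # node-level values override defaults
    return resolved


if __name__ == "__main__":
    # Hypothetical defaults; the keys and values are made up for this example.
    defaults = {
        "all": {"pod_labels": {"team": "analytics"}},
        "generic": {"runtime_image": "example.org/notebook-runtime:1.0"},
        "custom": {},
    }
    # The first notebook node relies on the default runtime image ...
    print(resolve_node_properties("generic", {}, defaults))
    # ... the second one overrides it.
    print(resolve_node_properties(
        "generic", {"runtime_image": "example.org/gpu-runtime:2.0"}, defaults
    ))
```

With a scheme like this, the runtime image is set once for the pipeline, and only the nodes that need something different carry their own value.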
Starting with Elyra version 3.12, the release-specific pipeline documentation includes a list of supported default properties and links to the property description.
The same list is available in the pipeline documentation for the most current release.
The easiest way to access the release-specific documentation for your Elyra deployment is from the JupyterLab launcher.
Easier node property input
If you are upgrading to version 3.12 from an earlier release of Elyra, you’ll notice that you no longer have to specify properties in a proprietary text format.
Configure Kubernetes properties
Elyra supports two external pipeline runtimes: Kubeflow Pipelines and Apache Airflow. Both run on top of Kubernetes.
Apache Airflow can also be configured to run without Kubernetes, but Elyra does not support that type of configuration.
Each pipeline node is executed in a Kubernetes pod, which can be granted access to environment variables, volumes, and other resources. Pods can be associated with identifying metadata (labels) and non-identifying metadata (annotations) that can be used to organize and describe them.
Explaining these concepts, and others such as tolerations, is beyond the scope of this post. All of them can now be configured in the additional properties section of each node, or as pipeline node defaults. The updated documentation provides a brief description of each property and a link to the related Kubernetes topic.
If specified, these properties are listed when you describe the pod using
kubectl describe pod
or in the Kubeflow Pipelines Central Dashboard or Apache Airflow GUI.
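If you’re wondering where these properties end up, the sketch below builds a bare-bones pod manifest as a Python dictionary. The field names (`metadata.labels`, `metadata.annotations`, `spec.tolerations`, and the container’s `env` list) are the standard Kubernetes ones. Everything else is an assumption made for illustration: the helper `build_pod_manifest`, the pod name, the image, and all values are made up, and this is not how Elyra generates pods.

```python
import json
from typing import Any, Dict, List, Optional


def build_pod_manifest(
    name: str,
    image: str,
    labels: Optional[Dict[str, str]] = None,
    annotations: Optional[Dict[str, str]] = None,
    tolerations: Optional[List[Dict[str, str]]] = None,
    env: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble a minimal Kubernetes pod manifest (illustrative only)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # identifying metadata, usable in label selectors
            "labels": dict(labels or {}),
            # non-identifying metadata, free-form descriptions
            "annotations": dict(annotations or {}),
        },
        "spec": {
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    "env": [
                        {"name": key, "value": value}
                        for key, value in (env or {}).items()
                    ],
                }
            ],
            # allow scheduling onto nodes that carry matching taints
            "tolerations": list(tolerations or []),
        },
    }


if __name__ == "__main__":
    manifest = build_pod_manifest(
        name="notebook-node",                      # hypothetical
        image="example.org/notebook-runtime:1.0",  # hypothetical
        labels={"pipeline": "demo"},
        annotations={"description": "runs the data preparation notebook"},
        tolerations=[{
            "key": "gpu",
            "operator": "Equal",
            "value": "true",
            "effect": "NoSchedule",
        }],
        env={"LOG_LEVEL": "INFO"},
    )
    print(json.dumps(manifest, indent=2))
```

The labels, annotations, tolerations, and environment variables in this structure are the same pieces of information that the describe output reports for a pod.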
If you are running Jupyter notebooks or scripts using Elyra, the most commonly used properties are environment variables, secrets (a secure approach to using environment variables without exposing their values) and data volumes.
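From the notebook’s or script’s point of view, all three look quite ordinary: environment variables and secrets are both read from the process environment, and a data volume is just a directory. Here is a minimal sketch, assuming a secret exposed as `API_TOKEN` and a volume mounted at `/mnt/data`. Both names are hypothetical, as is the `load_settings` helper; use whatever you configured in the node properties.

```python
import os
from pathlib import Path
from typing import Any, Dict, Mapping


def load_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect the settings a notebook or script needs at run time.

    API_TOKEN and DATA_DIR are hypothetical variable names chosen for
    this example; they are not defined by Elyra.
    """
    # A value backed by a Kubernetes secret arrives like any other
    # environment variable, so the code never contains the value itself.
    api_token = environ.get("API_TOKEN")
    if not api_token:
        raise RuntimeError("API_TOKEN is not set; is the secret configured?")

    # A data volume shows up as a directory at its mount path.
    data_dir = Path(environ.get("DATA_DIR", "/mnt/data"))

    return {"api_token": api_token, "data_dir": data_dir}


if __name__ == "__main__":
    # Pass a plain dict instead of os.environ to try this out locally.
    settings = load_settings({"API_TOKEN": "not-a-real-token"})
    print(settings["data_dir"] / "results.csv")
```

Failing early when the variable is missing makes a misconfigured node easier to diagnose than an authentication error halfway through the run.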
We are in the process of updating the best practices documentation for generic components to reflect these latest enhancements.
Thanks for reading! Until next time