Python based visualizations within Kubeflow Pipelines

Kirin Patel
5 min read · Sep 4, 2019

Visualizing outputs within Kubeflow Pipelines previously required knowing, before a run, what a pipeline would output and how to visualize those outputs in a meaningful way. Only then could a visualization component be confidently included within a pipeline. This is limiting because a visualization must be specified before any outputs are obtained.

Visualizations within Kubeflow Pipelines are not malleable, extensible, or simple. They are not malleable or extensible because a visualization must be included in a pipeline as a component, so it has to be specified before a pipeline run and cannot be changed after the fact. They are not simple because wrapping a visualization in a component requires hundreds of lines of code, far more than other methods of visualizing data, such as a Jupyter Notebook.

This is why Python based visualizations are being introduced. Python based visualizations allow users of Kubeflow Pipelines to generate visualizations quickly while also decoupling the process of generating visualizations from the process of writing and running a pipeline, all while keeping the development of visualizations within the familiar framework of Kubeflow Pipelines.

TFDV (TensorFlow Data Validation) visualization that was generated with Python based visualizations.
Interactive ROC curve visualization that was generated with Python based visualizations.

What are Python based visualizations?

Similarly to how visualizations can be generated with components, Python based visualizations let you use popular and familiar Python libraries to generate visualizations within Kubeflow Pipelines. But, unlike a custom visualization created within a component, Python based visualizations are not part of a pipeline. Instead, they run as a separate service that is created in your existing Kubernetes cluster. This allows visualizations to be created alongside pipelines, whether those pipelines are currently running or were run at some point in the past, even before the introduction of Python based visualizations.

Python based visualizations provide two categories of visualizations. The first category is predefined visualizations. These are provided by default in Kubeflow Pipelines and give you and your customers a quick and easy way to generate powerful visualizations. The second category is custom visualizations. These let you and your customers supply the Python code that is used to generate a visualization, which allows for rapid development, experimentation, and customization when visualizing results.

Python based visualizations rely on three parts: the frontend, the API server, and the Python visualization service. Each part has a distinct responsibility. The frontend creates the visualization request and displays the result of that request. The API server translates the request provided by the frontend into a request that the Python visualization service understands, returns the result of the translated request to the frontend, and gracefully handles both incorrectly formatted requests from the frontend and any errors encountered with the Python visualization service. Finally, the Python visualization service generates a visualization from the request it receives.
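
The division of responsibilities described above can be sketched in a few lines of Python. This is only an illustration of the request flow, not the actual Kubeflow Pipelines implementation: the `VisualizationRequest` fields, the `api_server` and `visualization_service` functions, the `gs://bucket/roc.csv` path, and the `title` argument are all hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class VisualizationRequest:
    """Hypothetical shape of a frontend request; the field names are assumptions."""
    type: str       # visualization type, e.g. "ROC_CURVE" or "CUSTOM"
    source: str     # location of the data to visualize
    arguments: str  # JSON-encoded object of optional arguments


def visualization_service(payload: dict) -> str:
    """Stand-in for the Python visualization service: returns HTML for a request."""
    return "<p>{type} visualization of {source}</p>".format(**payload)


def api_server(request: VisualizationRequest) -> dict:
    """Stand-in for the API server: validate, translate, forward, report errors."""
    if not request.type:
        return {"error": "a visualization type is required"}
    try:
        arguments = json.loads(request.arguments or "{}")
    except json.JSONDecodeError as exc:
        # Incorrectly formatted request from the frontend.
        return {"error": f"arguments are not valid JSON: {exc.msg}"}
    if not isinstance(arguments, dict):
        return {"error": "arguments must be a JSON object"}
    # Translate the frontend request into the form the service expects.
    payload = {"type": request.type, "source": request.source, "arguments": arguments}
    try:
        return {"html": visualization_service(payload)}
    except Exception as exc:
        # Surface service failures to the frontend instead of crashing.
        return {"error": f"visualization service failed: {exc}"}


# Frontend side: send a request, then display either the HTML or the error.
ok = api_server(VisualizationRequest("ROC_CURVE", "gs://bucket/roc.csv", '{"title": "demo"}'))
bad = api_server(VisualizationRequest("ROC_CURVE", "gs://bucket/roc.csv", "{not json"))
print(ok)
print(bad)
```

In this sketch the frontend never talks to the visualization service directly, so malformed input is rejected at the API server before any visualization code runs.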

New visualization creator component within the artifacts tab that allows users to generate visualizations using Python based visualizations.

The new user interface that is used to specify and create visualizations.

How to use Python based visualizations

For the most up to date documentation on how to use Python based visualizations, view the preview documentation, which will be released with Kubeflow v0.7.0, or view the README.md file within the Kubeflow Pipelines repository. For details about the documentation release status, you can follow this pull request.

Using predefined visualizations

Visualization creator component when a predefined visualization is selected, showing the ability to provide a source and optional arguments.
  1. Open the details of a run.
  2. Select a component.
    The component that is selected does not matter. But, if you want to visualize the output of a specific component, it is easier to do that within that component.
  3. Select the Artifacts tab.
  4. At the top of the tab you should see a card labelled Visualization Creator.
  5. Within the card, provide a visualization type, a source, and any necessary arguments.
    Any required or optional arguments will be shown as placeholders.
  6. Click Generate Visualization.
  7. View the generated visualization by scrolling down to the bottom of the panel on the right side of the page.

Using custom visualizations

Visualization creator component when a custom visualization is selected, showing the ability to provide Python code, a source, and optional arguments.

Where predefined visualizations offer simple and straightforward access to popular and powerful visualizations, custom visualizations allow for complete control over how a visualization should be generated.

Start by enabling custom visualizations within Kubeflow Pipelines.

If you have not yet deployed Kubeflow Pipelines to your cluster, you can edit the frontend deployment YAML file to include the following YAML that specifies that custom visualizations are allowed via environment variables:

- env:
  - name: ALLOW_CUSTOM_VISUALIZATIONS
    value: "true"

If you already have Kubeflow Pipelines deployed within a cluster, you can edit the frontend deployment YAML to specify that custom visualizations are allowed in the same way described above. Details about updating deployments can be found in the Kubernetes documentation about updating a deployment.

  1. Open the details of a run.
  2. Select a component.
    The component that is selected does not matter. But, if you want to visualize the output of a specific component, it is easier to do that within that component.
  3. Select the Artifacts tab.
  4. At the top of the tab you should see a card labelled Visualization Creator.
  5. Within the card, select the CUSTOM visualization type, then provide a source and any necessary arguments (the source and argument variables are optional for custom visualizations).
  6. Provide the custom visualization code.
  7. Click Generate Visualization.
  8. View the generated visualization by scrolling down to the bottom of the panel on the right side of the page.
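
As a rough idea of what could be pasted in as custom visualization code, here is a minimal sketch that turns CSV data into an HTML table using only the standard library. It assumes the source and arguments from the card are exposed to the code as variables named `source` and `variables`; those names, the `max_rows` argument, and the `render_table` helper are assumptions made for this example, so check the README.md for the actual contract. To keep the sketch runnable on its own, `source` holds inline CSV text instead of a real file path.

```python
import csv
import html
import io

# Stand-ins so the sketch runs on its own. In the Visualization Creator these
# values would come from the source and arguments fields (assumed names).
source = "label,score\ncat,0.91\ndog,0.34\n"  # inline CSV instead of a file path
variables = {"max_rows": 1}                    # hypothetical argument


def render_table(csv_text: str, max_rows: int) -> str:
    """Render the header and the first max_rows data rows of a CSV string as HTML."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:1 + max_rows]
    head_html = "".join(f"<th>{html.escape(cell)}</th>" for cell in header)
    body_html = "".join(
        "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        for row in body
    )
    return f"<table><tr>{head_html}</tr>{body_html}</table>"


table_html = render_table(source, variables.get("max_rows", 10))
print(table_html)
```

Because the code is supplied at visualization time and not baked into a component, the row limit or the rendering can be changed and the visualization regenerated without rerunning the pipeline.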
