Kubeflow’s first doc sprint — it’s a wrap

Published in kubeflow, Jul 19, 2019

Sarah Maddox, Kubeflow technical writer

The Kubeflow community held a doc sprint on July 10–12. For three days, in time zones around the world, we worked together to write documentation and build samples that help people understand and use Kubeflow.

Kubeflow is an open source system for building and deploying machine-learning workflows on Kubernetes. The concepts, goals, and technology are complex. Some of the doc sprinters were already familiar with Kubeflow; others had to learn enough to document it within the three days of the sprint. No small task!

UX research shows that docs are super important to people’s experience of Kubeflow. In fact, one of our mini talks (see below) discussed this very point. The goals of the doc sprint were to create tutorials, fix bugs, and give the community an opportunity to learn about Kubeflow and, in many cases, to make their first open source contributions.

The participants

Some of the doc sprinters joined us on site in Kirkland, WA, in the United States. Others worked entirely online. We spanned the EMEA, APAC, and US time zones. Participants were from JPMorgan Chase, Microsoft, Canonical, Agile Stacks, Arrikto, Google, and more. Quite a few were technical writers, students, and engineers looking for their first experience with Kubeflow. Some wanted to learn about contributing to open source.

We shared status updates via video conference calls, asked questions in the Kubeflow Slack workspace, and communicated via comments on pull requests and issues.

What we accomplished

The focus areas of the doc sprint were to create getting-started guides for specific platforms, build end-to-end tutorials for top use cases, write how-to and troubleshooting guides for those tricky issues that require the attention of an expert, and fix bugs. We did pretty well in achieving our goals!

Doc sprint stats to date:

28 pull requests merged
29 issues closed
7 pull requests under review

The Kubeflow Doc Sprint Kanban board tracked our status during the sprint.

Image: the Kubeflow Doc Sprint Kanban board, a GitHub project board showing issues in backlog, in progress, etc.

Here’s a summary of the docs and samples that we built during the sprint:

A redesigned website home page, featuring a new card-based design, smooth glide on scroll, and richer, more up-to-date information.
The new metadata component introduced in Kubeflow v0.6.
A Jupyter Notebook that builds and runs a simple ML workflow using Kubeflow Pipelines.
A tutorial for Kubeflow Pipelines on Agile Stacks.
How to install Kubeflow on prem in a multi-node Kubernetes cluster.
A guide to authentication with Kubeflow on Google Cloud Platform.
A reference table of Dockerfiles for all of Kubeflow’s images.
A guide to connecting with LDAP in on-prem Kubeflow with Arrikto.
How to visualize Markdown output from a Kubeflow Pipelines component.
A refactoring of the examples and components sections, to improve readability and findability.
A refactored getting-started page. We want to continue working in this area of the docs.
The updated end-to-end tutorial for Kubeflow Pipelines on GCP.
The benefits of using Jupyter Notebooks within a Kubeflow deployment.
An experiment using a video in a getting-started guide. We plan to trial more videos on selected Kubeflow pages.
Multiple PRs updating the docs for the upcoming move from ksonnet to kustomize for application configuration management.
Updated guide to hyperparameter tuning for Katib v1alpha2.
Setup docs for Kubeflow on Microsoft Azure, created in preparation for the sprint.
Restyled reference tables, such as in the TFJob reference, also completed in preparation for the sprint.
A Jupyter Notebook that runs a Pix2Pix model using TensorFlow 2.0 in Kubeflow Pipelines (work in progress).
How to migrate a Python script to a Kubeflow pipeline (work in progress).
Getting started with Kubeflow on Docker Desktop (work in progress).
Example of Kubeflow Pipelines on Microsoft Azure (work in progress).
Example of a Named Entity Recognition model using Kubeflow, Keras, and GCP (work in progress).
Multi-user isolation (work in progress).
All the doc-sprint PRs completed and in progress.

Learning on the sprint

A doc sprint wouldn’t be a doc sprint without hints on technical writing and UX! One of our goals was to give the sprinters the opportunity to learn some techniques and concepts that are outside their usual daily grind.

So we tried something new: mini talks during the sprint, just 10 minutes per talk.

Perhaps most importantly, we got to know each other. The Slack workspace was full of buzz and cheerful emojis. When the going got tough, people jumped in to help.

Preparing for a doc sprint

A lot of preparation goes into a successful doc sprint. We wanted to make sure that each sprinter had a good experience of the sprint itself and came away with some accomplishments under their belt.

Define the goals of the doc sprint. Two months before the sprint date, we began working on a wish list of tutorials and guides that we wanted to build. We started the wish list in a spreadsheet, then moved it to our issue tracker to form the Kubeflow Doc Sprint Kanban board. Everyone in the Kubeflow community had the opportunity to add to and refine the wish list. We discussed it in person, in the community mailing group, and in community meetings.

As a result, when the first day of the sprint dawned, we had a sizable backlog of tasks for people to work on. Some of the backlog issues were large and meaty (for example, develop a pipelines tutorial for a specific machine learning model using TensorFlow 2.0), some were doc refactoring (for example, make the components section easier to navigate), and some were bug fixes.

Provide guides for the sprinters, showing them how to take part in the sprint and how to update the docs. Our doc sprint included people in locations around the world and in varying time zones. Sprinters needed to get up and running even if no one else was around to help them. We provided a comprehensive sprinter’s guide, instructions on how to update the docs, an agenda, and more on the doc sprint wiki.

Comment from Fabien Da Silva on the Kubeflow Slack workspace

The doc sprint is a wrap, but work goes on!

Participants said that they enjoyed learning Kubeflow and experiencing a doc sprint for the first time.

Some of the participants are interested in continuing to contribute to the project. Yes please! We welcome contributions. Building docs and examples is a great way to get started with a product.

Thank you to everyone who took part in the Kubeflow Doc Sprint and helped make it a success.