Building a New Platform for Snowflake Documentation: Part I

Several months ago we launched our new documentation experience. It’s been a huge hit with our customers, and even with us (since we also use our documentation!). Launching the new experience took about four months, and we want to share the technical journey we took to create our new platform.

Snowflake has always been a big believer in documentation. Since the earliest days of Snowflake, we have had a team of dedicated writers authoring content continually.

As a developer, I consider documentation one of the primary factors when adopting new libraries, vendors or technologies in general. If a technology doesn’t have good documentation, the odds of successfully implementing it decrease considerably.

Good documentation is crucial to adopting any technology. Great documentation is possible with a robust and modern platform to support it.

Assessing the Prior System

While Snowflake had invested in creating content, it hadn’t invested as much in the technology delivering that content to customers.

Our prior documentation site was based entirely on Sphinx assets uploaded to S3, with CloudFront acting as our CDN. This certainly was a simple and performant solution, but the entire experience was dictated by (and therefore limited by) the Sphinx platform. Customizations outside Sphinx’s capabilities (or outside those of any statically generated site) were not possible.

Deployments to this system were uncomfortably manual, with a lot of opportunity for error. Snowflake engineering makes extensive use of tools like Jenkins, containers and automation; however, the documentation platform stood outside the scope of our engineering organization, and therefore did not receive those process improvements as Snowflake grew and evolved as a company.

The design of the prior system was also locked in the original design language of Snowflake’s earliest days. This made the prior documentation site seem “behind” the other Snowflake efforts, which created something of a paradox. Documentation must convey how to use the newest features. If the design subtly suggests to a user that the documentation is “behind”, it can raise concerns that the content itself is not up to date.

The navigation was another significant area that required improvement. Due to how Sphinx worked, we had a single navigation tree containing every document. With a documentation set consisting of ~1800 pages, this was difficult to navigate. The navigation mixed explanatory content, reference content and release notes together. It was a lot to consider at once, so we knew we had to simplify this area.

With outdated and manual technology, an old and misaligned site design, and a complex site navigation, the existing site needed serious updates. But what updates should we make?

Exploring What Documentation Could Be

To understand where we could go with our documentation, we started by asking ourselves what documentation systems we enjoyed. As developers using documentation on a near continual basis, we had a fairly strong idea which experiences had stood out to us in the past.

We combined this with a general search for articles on good documentation systems, in addition to a focused search on Hacker News for discussions related to documentation.

Combining the usual suspects from these three sources gave us a short list (shout outs to Stripe, Twilio, GitHub, Heap, Postman, Shopify, HashiCorp and Square), and with that we delved into their documentation experiences. We noted common patterns (such as as-you-type search, copy buttons for code blocks, light and dark themes), their overall navigation and content strategy, page structure, presentation, and other features. We devised a shortlist of existing gaps (patterns we commonly found in these systems which ours lacked) and desirable features which were appropriate for our specific needs.

Beyond the external evaluation, we also looked internally for guidance. We discussed at length with the Knowledge Management team, the primary stakeholders of the system, what they liked and didn’t like about the existing system, and what sore spots existed. The writers revealed a lot of manual processes that we, as engineers, relished the idea of automating. They also had a long list of features they had wanted to implement for years, but were unable to without engineering support. We also reached out to other teams at Snowflake, including our Sales Engineers, who frequently talk to our customers, and our Developer Relations team, to learn what they felt our documentation could be doing for Snowflake.

Defining our Technical Approach

Before starting the technical updates to the documentation platform, we needed to define what we were doing, and more importantly, what we were not doing. Whatever we did had to take the strengths of the prior system and either build on them or continue to improve them.

Our team was highly skeptical of complete system rewrites — we have rarely seen those go well. We also didn’t want to lose 8 years of documentation effort.

Snowflake’s documentation consists of ~1800 pages of reStructuredText with 20 authors constantly contributing to that content. Switching away from that authoring language, or requiring any kind of content migration or mass modification of those files, was simply out of the question. We also couldn’t impose work on those writers: they had to remain entirely focused on delivering their content as we built the new system. Additionally, as we built the new system, we were still publishing with the prior one. To avoid disrupting workflows, whatever we built had to be backwards compatible.

The technical spec also had one interesting twist: we did not want to have a negative impact on our search engine placement. This is a bit of a fluid space, but the easiest way to be confident your content is being indexed by search engines is to send all content down with the initial HTML load. This immediately precludes a lot of modern single page application design, which typically delivers a near-empty initial HTML page and populates content later with client-side requests. While some search engines can handle this, it’s a risk. If we could avoid that for peace of mind, all the better. Snowflake’s documentation search engine optimization was actually in a pretty good place, so we didn’t want to negatively impact that.
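
To make the difference concrete, here is a minimal sketch (not part of our actual tooling) of how one might check whether a page's content is present in the initial HTML. It uses only Python's standard library; the class name, the helper function and both sample pages are hypothetical and exist purely for illustration.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts characters of text that appear outside <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.char_count = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text a crawler could read without running JavaScript.
        if self._skip_depth == 0:
            self.char_count += len(data.strip())


def visible_text_length(html: str) -> int:
    """Return how many characters of readable text the raw HTML contains."""
    parser = VisibleTextCounter()
    parser.feed(html)
    parser.close()
    return parser.char_count


# Hypothetical page with all content in the initial HTML.
SERVER_RENDERED = (
    "<html><body><h1>Loading data</h1>"
    "<p>Use the COPY INTO command to load data from staged files.</p>"
    "</body></html>"
)

# Hypothetical single page application shell: content arrives later via JavaScript.
SPA_SHELL = (
    '<html><body><div id="root"></div>'
    '<script>window.render("#root");</script></body></html>'
)

if __name__ == "__main__":
    print("server-rendered:", visible_text_length(SERVER_RENDERED))
    print("SPA shell:", visible_text_length(SPA_SHELL))
```

The application shell contains no readable text until its script runs, which is exactly the situation we wanted to avoid relying on a crawler to handle.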

In summary:

Do

  • Ensure backwards compatibility with existing system during development
  • Update the deployment and technology stack to modern Snowflake engineering standards
  • Update the design language
  • Provide for greater navigation flexibility

Do Not

  • Interrupt the content writers
  • Require mass content migration or updates
  • Require backwards incompatible changes
  • Negatively impact our search engine placement

In our next post, we’ll cover what we did and how we turned these goals into action and into our new platform!
