
Writing Technical Design Docs, Revisited

Engineering Insights

Talin
4 min read · Dec 3, 2019


In the past ten months I’ve received a lot of interest in my essay, Writing Technical Design Docs. In this article I’d like to present an example of a more refined design doc template. This is the template we use in the Creative Tools division at Amazon Web Services. It was created by my co-worker Graeme McHale, with input and suggestions from me and others on the Creative Tools team. It is used here with his permission.

Technical Design Document Template

1. Description of the Problem

Give a brief (one paragraph) summary of the problem you are solving. Avoid technical detail. Instead, describe the problem as it pertains to your customers. After reading this document, any team member should be able to understand the problem, how you intend to solve it, and who the stakeholders are.

2. Solution Requirements

Briefly describe what is required of a solution that addresses the problem. Try to steer clear of the how (implementation detail) and concentrate on what is required of any solution in order to address the problem outlined above.

3. Glossary

Link to the service-wide glossary, and define any new terms used in this document.

4. Out of Scope (Non-goals)

Explicitly call out what is not in scope. (Sometimes articulating what you are not going to do is an easier way to define scope than talking about what you are going to do.)

5. Assumptions

What are you assuming will be true or in place to make your solution successful?

6. Solution

Your solution goes here.

Start by including a high-level diagram and decompose from there. Where possible, diagram how your solution interacts with other subsystems and services (including sequence diagrams for complex interactions).

Aim to answer:

  • What are you going to deliver?
  • What are your upstream and downstream dependencies?
  • How does it fit into the broader service?
  • How will it scale?
  • What are the limits of your solution?
  • How will you ensure fault tolerance and quick recovery after failure?
  • How might your solution evolve to meet future requirements?

7. Security Considerations

What are the security implications of your solution? If your solution is large enough in scope to warrant its own threat model, please add it here; otherwise, please describe how your solution impacts existing threat models.

8. Cost Analysis

Provide a high-level analysis of the costs that will be incurred in running your chosen solution on a day-to-day basis.
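
As a rough illustration of the kind of arithmetic this section might contain, the sketch below adds up a daily running cost from a few line items. Everything in it is a made-up placeholder: the `CostItem` type, the line-item names, the unit prices, and the usage figures are assumptions for illustration and do not reflect real pricing for any service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostItem:
    """One line of a cost estimate. All values used below are hypothetical."""
    name: str
    unit_price_usd: float   # price per unit (placeholder, not real pricing)
    units_per_day: float    # expected daily usage


def daily_cost(items: list[CostItem]) -> float:
    """Sum unit price times daily usage across all line items."""
    return sum(item.unit_price_usd * item.units_per_day for item in items)


# Hypothetical line items for an imaginary service.
ESTIMATE = [
    CostItem("compute-hours", unit_price_usd=0.10, units_per_day=48),
    CostItem("storage-gb", unit_price_usd=0.001, units_per_day=500),
    CostItem("requests-per-million", unit_price_usd=0.40, units_per_day=2.5),
]

if __name__ == "__main__":
    total = daily_cost(ESTIMATE)
    print(f"Estimated daily cost: ${total:.2f}")
    print(f"Estimated monthly cost (30 days): ${total * 30:.2f}")
```

Even a table this small makes it obvious which line item dominates the bill, which is usually the point of a high-level cost analysis.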

9. Cross-region Considerations

If applicable, how does your solution optimize for, or remain compatible with, cross-region requirements? This includes data transfer costs between regions, availability of the service in different data centers, latency issues, etc.

10. Operational Readiness Considerations

Discuss how your solution will support operational excellence, ensuring customer satisfaction with a frugal level of support.

Aim to answer:

  • How will your chosen solution be deployed?
  • What metrics and alarms will be key to monitoring the health of your solution?
  • How are your solution limits enforced?
  • Will there be any throttling or blacklisting mechanisms in place?
  • Will there be any data recovery mechanisms in place?
  • If this is a multi-tenant solution, how are you dealing with noisy neighbor issues?
  • How will your solution be debugged when problems occur?
  • How will your solution recover in case of a brown-out?
  • Are there any operational tools required for your solution?
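
For the throttling question in the list above, one common approach (not necessarily the one the template's authors had in mind) is a token bucket. The sketch below is a minimal single-process version. The `TokenBucket` class, its parameters, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time


class TokenBucket:
    """Minimal token-bucket throttle (illustrative, not thread-safe)."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity        # start with a full bucket
        self.clock = clock            # injectable clock, handy for tests
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In a multi-tenant setting, keeping one bucket per tenant is a simple way to stop a single noisy neighbor from consuming everyone's capacity. A real service would also need shared state across hosts, which this sketch leaves out.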

11. Risks and Open Issues

If there are any risks or unknowns, list them here. Are there any open questions which could impact your design for which you do not currently have answers? How are you going to get answers? Will any required team members be loaned to other teams during the time slated for implementation? Are all of the required dependencies available in all the regions where you need them? What are the one-way doors, and are we sure we want to go through them?

12. Solutions Considered and Discarded

What alternatives have you considered and discarded? Why don’t these work? Be brief. Linking to other documents for details is OK, but always provide a summary inline.

Only alternative solutions that an impartial observer would deem credible need be documented.

If an alternative solution is not appropriate now, but may be in the future, please discuss potential migration paths.

13. Work Required

Include a high-level breakdown of the work required to implement your proposed solution, including t-shirt size estimates (S, M, XL) where appropriate. Also, specifically call out whether this solution requires resources from other teams to be completed (away teams, dependencies, etc.).

14. High-level Test Plan

At a high level, describe how your chosen solution will be tested.

15. References

Links to any other documents that may be relevant, or sources you wish to cite.


Talin
Machine Words

I’m not a mad scientist. I’m a mad natural philosopher.