SQL Style Guide

Derek Chang
Stratus Data Blog
Published in
4 min readAug 29, 2019

--

“Programs are meant to be read by humans and only incidentally for computers to execute.”
— H. Abelson and G. Sussman (in “Structure and Interpretation of Computer Programs”)

When we write SQL queries, we are communicating through code and the way our code looks can enhance a reader’s understanding. Our queries are destined to be read and maintained by somebody other than ourselves. When code looks different and is hard to read, time and mental energy are wasted in trying to parse the code, but when code is clear and consistent, the mind can relax and purely focus on the content. Here is an example of a SQL snippet that is challenging to read. Can you easily understand the query?

Additionally, using a clear and consistent style helps make potential errors and bugs more obvious as unfamiliar patterns will stand out. In this post, we share our SQL styling which presents clear, consistent SQL queries. Let’s go through the query example above and see how we can make the code clearer and easy to read!

1. Line Spacing and Indentation

  • Major keywords (SELECT, FROM, JOIN, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT) should be left-aligned and have their own line.
  • Create new line after each keyword definition.
  • Create new line after a comma when separating multiple columns into logical groups.
  • Create newline after semicolons to separate queries for easier reading
  • All sub-statements should be indented compared with their parent statement. Example: sub-statements of the CASE statement should be indented relative to the CASE statement itself.
  • Separate code into related sections, which helps to ease the readability of large chunks of code.

2. Common Table Expressions over Sub-queries

  • Sub-queries should be written with common table expressions (CTE) (WITH statements), with exception of simple ‘one-liners’ like SELECT GENERATE_SERIES(10). CTE’s provide a clear way of defining and naming sub-queries. It is also very helpful when sequentially building up sections of a large query.

3. Spaces and Capitalize Keywords

  • Include spaces before and after symbols (=, +, -, *, /, <, >)
  • Insert spaces after commas (,)
  • Place spaces surrounding apostrophes (‘) where not within parentheses or with a trailing comma or semicolon.
  • Although syntax highlighting makes identifying keywords easy, we cannot always assume it is available, like in an email. Capitalizing all keywords provides extra clarity.

4. Naming Conventions

  • Don’t use CamelCase — it is difficult to scan quickly.
  • Ensure names are unique and do not exist as a keyword.
  • For computed data (SUM(), AVG(), or COUNT()) provide a descriptive alias.
  • Use underscores where you would naturally include a space in the name (first name becomes first_name).
  • When joining tables, the table alias should be the first letter of each word in the object’s name. If there is already a correlation with the same name then append a number.
  • Always include the AS keyword when creating an alias.
  • When choosing columns from a query that uses a JOIN, explicitly identify the column with the table alias. Example: prepend release_date, birthdate, title, and genre with the appropriate table alias.
  • Avoid abbreviations, but if you have to use them, make sure they are commonly understood.

5. Comments

  • Include comments in SQL code where necessary. Use the C style opening ‘/*’ and closing ‘*/’ for multi-line comments and ‘ — ‘ for inline comments
  • When commenting on a block of code or long lines of code, place the comment above and left-aligned with the code block.
  • When commenting on a single line of code, comment at the end of the line with ‘ — ‘

Much better, right?

A good, general rule-of-thumb to follow is to use consistent and descriptive names as well as make judicious use of white space, new lines, and indentation to make code easier to read.

--

--