Details missed by the “Investigate a relational database” course (Part 2)

I am an Udacity mentor, and I helped many students move forward with their Investigate a Relational Database project.

I wrote two articles relevant for this course already. The first one discussing the importance of file formats, and the second reviewing the most common mistake in students’ submissions.

This article will discuss the second most common reason submissions fail to meet the passing grade requirements.

Unproperly formatted code.

Why is formatting important?

Software is usually written once but read many more times. And in most real-life projects, the code will be read by many more developers than the person writing it.

Even closed source software is read than written by more people; in a team, communication is key, and the formatting is meant to facilitate that communication over the most-used channel, the source code.

Imagine everyone in a team wrote their code as they, with different capitalization, naming conventions, or indentation approaches. Reading the code will take more effort than necessary and will slow down the whole development effort.

For example, compare the following:

select Rental_month,
Rental_year,
storeid,
rental_count
FROM (
SELECT sr.store_id storeid,
DATE_PART('month',rental_date) AS Rental_month,
DATE_PART('year',rental_date) AS Rental_year,
COUNT(r.rental_id) AS rental_count
from store sr
JOIN staff sf
ON sr.store_id = sf.store_id
JOIN payment p
ON sf.staff_id = p.staff_id
JOIN rental r
ON r.rental_id = p.rental_id
GROUP BY 1,2,3) t1
ORDER BY 2;

Keywords are hard to see because of the variable names being having inconsistent casing. The same applies to the field and function names.

Indentation is also important. It should help us easily locate the different sections of the query, but we have to read and understand the query to identify its sections in this example.

In contrast, a properly formatted query is easier to read:

SELECT
rental_month,
rental_year,
storeid,
rental_count
FROM (
SELECT
sr.store_id storeid,
date_part('month', rental_date) AS rental_month,
date_part('year', rental_date) AS rental_year,
count(r.rental_id) AS rental_count
FROM
store sr
JOIN staff sf ON sr.store_id = sf.store_id
JOIN payment p ON sf.staff_id = p.staff_id
JOIN rental r ON r.rental_id = p.rental_id
GROUP BY
1,
2,
3) t1
ORDER BY
2;

The subquery is trivial to locate. The different sections of the outer query are easy to identify thanks to the keywords being all uppercase and being indented at the start of the row. In contrast, the subquery ones are indented further.

How should we format our SQL code?

Each programming language community has some base formatting guidelines that can be tweak by each team to meet their purposes better.

The formatting of SQL could vary, but some characteristics are almost universally accepted and practiced.

  • Keywords and function names should be written in uppercase.
  • Field names and aliases should be written in lower case.
  • Use indentation to differentiate Subqueries, CTEs, and in general, the sections of the query.

A widely referenced style guide list more important parts of the style; we should be mindful that this could vary between SQL communities or development teams.

But formatting code looks like a lot of work!

If manually formatting code looks to you like a waste of human effort, we agree. It is an important but repetitive task, the kind of tasks that humans are bad at.

This course doesn’t spend any time on how to edit code locally. Still, other courses do need local editors, one of the ones mentioned in other courses and most popular among developers, in general, is atom.

Fortunately, atom is an excellent choice to edit SQL; it has built-in syntax highlighting, and we can add plug-ins to automate tasks like code formatting.

Installing the formatting extension in atom

In this course, we work with PostgreSQL. Hence we should install a plugin to format SQL queries meant to run in PostgreSQL.

The plug-in uses a command-line tool called pg-formatter, but it will do all the necessary setup for us.

We can install it using the atom interface by typing pg-formatter in the search box of the install panel of the preferences. And choosing install.

Conclusion

We have explored why SQL code formatting is important, how to use the formatting to a tool instead of doing it ourselves, and how to use it to format our scripts.

Now, go back and learn!

If you enjoyed the article and found it useful, please consider buying me a coffee.

--

--

--

Reviewers address the most common problems faced by students.

Recommended from Medium

Write actionable backlog items

Should Perl die gracefully?

Os issues de Jorge Luis Borges

BIG Industries helps banks modernise in a customer-friendly way

How to Automate a YouTube Thumbnail With Python

Azure Functions with Docker

Semver is good, and the crisis in open source tools

How To Fix CSS Stylelint Errors Generated By SASS.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Oscar Mauricio Forero Carrillo

Oscar Mauricio Forero Carrillo

More from Medium

Django ORM(Object–relational mapping) ‘OR’ and ‘AND’ query

Steps to create a database in MySQL

MySQL vs PostgreSQL