A couple of quick SQL tips for beginners (Data Science, Business Intelligence and related)

Thomas Tegtmeier
MyTake
Published in
2 min readOct 11, 2019
Photo by Caspar Camille Rubin on Unsplash

Here are a couple of quick tips gathered from many websites, industry experience and from an experienced and skilled “SQL-GOD”- friend. I hope they’ll help you.

NOTE: Many of the following points should be treated as rules of thumb (each SQL Engine works differently and the best way relies on many variables like data, query, indexes, volume, etc…)

  1. Prefer joins over subqueries ( reduce subqueries all you can ). Try to avoid subqueries with external references like: A inner join ( select ….. ) as B on A.id = B.id
  2. Minimize functions/”operations” to fields like cast (“::varchar” , “cast( field as int)”) or replace/substring/convert/coalesce/etc…. If you really need to do any of them, call them at the top of the query ( the result after all the other filters were applied). Improving the database data types at the source would be better.
  3. “int” joins are faster than “varchar” ones (at least should be)
  4. Prefer always inner joins if allowed (left and right joins do the work for inner join plus extra work)
  5. Use an editor (VSCode , Sublimte, Atom,…) with refactor / renaming option (it’s like an improved find & replace command). The pointers to the tables/subqueries could be in a beginning the classic a, b, c,…, t1, t2, t3… , but when you are ready with your query they should be renamed to the original names of the tables or a meaningful and descriptive alias. That is very easy with the refactor option. That make your script more readable for anyone who’s going to check what you’ve done (including your future self).
  6. If it were difficult to make any improvements to the data model, or it cannot be done, then use new pre-calculation models/tables, that prepare the data every X hours or according to the need of the business. Then work cleanly on those tables without transforming data into queries. This is not free, it requires a lot of disk space and to implement the processing.
  7. Ask the database admin for indexes for the tables, that will help a lot if there were not already set up

If you like this article, please let me know that with a couple of claps, I will be very thankful.

--

--

Thomas Tegtmeier
MyTake
Writer for

Solve business problems with data. Keep it Simple