SQL part 2

Mando Iwanaga
3 min readOct 11, 2019

--

SQL basics

Continuation of SQL basics… If you haven’t seen my initial blog on SQL basics, check it out here.

SQL Aggregate Functions are a way of aggregating data that is used often. It will perform a calculation that will return a single value. It is often used in the SELECT statement.

COUNT |  Returns the number of rows 
SUM | Returns the sum of all the values in a numeric column
MIN | Returns the lowest value in a column
MAX | Returns the highest value in a column
AVG | Returns the average value of a numeric column
Example Syntax Use:SELECT COUNT(ColumnName), SUM(ColumnName), MIN(ColumnName), MAX(ColumnName), AVG(ColumnName)
FROM TableName;
  • Note on NULLS: COUNT( ) and AVG ( ) ignores NULLS, SUM ( ) treats NULLS as 0
  • Note on MIN and MAX: these can be used on date/time and non-numerical columns as well. It will return the earliest date/time or closest non-numerical value to the letter “a”, and vice-versa for MAX.
  • Note on returned column headers: your returned column name will be exactly as stated in the SELECT statement. For example, if we have a column called ‘UserName’ and we want to know how many total users there are. So we use the COUNT( ) function like so: SELECT COUNT(UserName) FROM TableName. This will return the total users with the column header as COUNT(UserName). Situations like this will be confusing to others reading your query, so we should rename our columns using aliases. It can be done using AS.
  • If the alias name contains spaces in it, it must be wrapped in double quotes
SELECT COUNT(UserName) AS total_users
FROM TableName;
ORSELECT COUNT(UserName) AS "Total Users"
FROM TableName;
  • Now when someone reads your query they can easily understand what each column represents.

GROUP BY allows you to group rows of the same values into summary rows. This is useful when you want to analyze data more. For example, if you have a table of store transactions, and you want to know the total transactions per month. We can use SELECT COUNT(*) FROM Transactions GROUP BY month; and viola! Example syntax:

SELECT ColumnName1, COUNT(*)
FROM TableName
GROUP BY ColumnName1;

HAVING has similar functions as the WHERE clause; it is used to filter on a specified condition. The difference is that WHERE is used to filter rows while HAVING is used to filter groups of rows. Example syntax:

SELECT ColumnName1, COUNT(*) AS total
FROM TableName
GROUP BY ColumnName1
HAVING total > 10;

DISTINCT will allow you to view unique values (no duplicates).

  • You can also use it with Aggregate Functions; sample syntax:
SELECT DISTINCT ColumnName
FROM TableName;
SELECT COUNT(DISTINCT ColumnName)
FROM TableName;

Note: The order of how you write queries matter. Correct syntax order:

SELECT DISTINCT ColumnName1, aggregate_function(ColumnName2)
FROM
WHERE
GROUP BY
HAVING
ORDER BY;

The order in which the query is actually executed however is different. To learn more about this you can check out this great article.

CASE is used to filter rows into categories based on specified condition/conditions, creating a new column of data of categories as a result. I like to think of it as (if ~ then) statements.

  • CASE statements consist of WHEN, THEN, and END/END AS
  • You can use ELSE to specify the remaining values
  • For example, we have a table of store items and their prices, and we want to categorize the items as cheap, normal, expensive:
SELECT Item, 
CASE
WHEN Price < 10 THEN "cheap"
WHEN Price < 50 THEN "normal"
ELSE "expensive"
END AS price_category
FROM StoreItems;
  • You can use AND/OR inside of CASE statements for more filtering options. Let’s say our StoreItems table also has a Status column that states whether the item is ‘new’ or ‘used’, and we only want the new items categorized.
SELECT Item,
CASE
WHEN Price < 10 AND Status = "new" THEN "cheap"
WHEN Price < 50 AND Status = "new" THEN "normal"
WHEN Price >= 50 AND Status = "new" THEN "expensive"
ELSE "used"
END AS new_items_categories
FROM StoreItems;

Thanks for reading as I continue in my pursuit to master SQL!

Stay tuned for more! :)

For examples on a dataset check out this repo I made!

References:

--

--