BETWEEN A and SQL: How Different Operators and Set Functions Can Improve Your Data Analysis

Martyna Adam
Learning SQL
Published in
7 min readJul 29, 2022

Welcome to part four of the learning journey, going ‘From A to SQL’! Time flies when you are querying data. Over the last 3 chapters, we’ve covered the foundations of databases, and understood different DDL and DML statements to enable us to change the structure of a database, or alter the data held within it. You can access Part 3 here. Now it’s time for a new challenge!

Photo by Social Mode on Unsplash

If you are pursuing any career which involves data, I can almost guarantee that at some point you will have to perform some data analysis. So, what exactly is it? Put simply, data analysis is the practice of working with data to derive useful information and insights, to enable you, or your stakeholders, to make informed decisions. I have seen first-hand how much data acts as evidence to underpin crucial business decisions. If you think about a huge SQL database and the tables within it (I’m talking large number of tables and thousands of rows of data), simple SELECT queries outputting a list of rows may not be very useful to see what the data is telling us.

Sometimes we don’t just want to extract the data. A stakeholder may ask you various questions about the data, and organising, understanding, analysing, and interpreting the data will be key. We can do this in SQL by using set functions as well as GROUP BY, ORDER BY and HAVING statements.

What is a function?

A function is a method that we can use to make a data operation, such as adding all the values together or finding the average. Fortunately, SQL already has a set number of these functions defined!

Table showing SQL Set Function Syntax and Description. Table created by author.

Most of these seem easy but think about their use. If your boss asked you what the total sales amount was for the business so far, you could use SUM() in your SELECT query, and find the information for them within seconds!

What is the difference between ORDER BY and GROUP BY?

If we want to sort the results of our output, we can use ORDER BY with our SELECT statement. We can sort outputs by more than one column, and we can add ASC to sort the data in an ascending way, and DESC to sort in a descending way. If we are using ORDER BY with a WHERE clause, it will always go after the WHERE clause.

Q. Does the ORDER BY also sort the data in the database?

No, the ORDER BY statement only sorts the output of the query. The database remains unchanged.

On the other hand, GROUP BY is used to group together rows which have the same values, to summarise the data. To avoid errors (trust me I know…) make sure that the columns you specify in the GROUP BY appear in your SELECT list. Just to make things more complicated, you can’t use a WHERE with a GROUP BY. Instead you need to use HAVING as this is the keyword that applies to records which have been summarised.

Q. Will you be HAVING an example of this below?

Absolutely!

That makes sense, but what’s this ‘operator’ business?

An operator is a special word or character which we can call to query the database. There are different types of operators, but don’t worry you don’t necessarily need to remember the names of the different types! The important part is remember the key word, what it does, and how it can help you analyse your data.

  1. Comparison Operators

Remember maths at school when the teacher explained the ‘greater than’ and ‘less than’ symbols using crocodiles? Don’t worry I won’t start doing that now! The point is, those symbols can also be used in SQL as an operator. We can compare the value of different expressions and the crucial part is that the answer will be a True or False. Examples include <, >, < =, > =, =, != .

The one you may not be familiar with is != which means not equal to.

To combine multiple of these expressions, we need a condition keyword. These are a) AND meaning both conditions need to be met, b) OR meaning either condition needs to be met, c) IS NOT meaning not equal to, but it can only be used with NULL.

2. Logical Operators

These types of operators are very useful when you don’t have the exact search criteria that you want to query your data on. The main logical operators are:

  • BETWEEN/ NOT BETWEEN which selects values within a range. A good thing to watch out for is that the results will be inclusive of the value you have specified. So if you wrote BETWEEN 5 and 7, the results can show 5,6 and 7.
  • IN allows for a multiple selection of values when using WHERE
  • LIKE allows you to return a result which matches a specific pattern. These can be tricky to understand. To a beginner they can seem like someone who has pressed random buttons on a keyboard. I won’t overload you with information, so for now the w3schools website has a good overview of this type of operator and the examples of patterns that you could specify.

That seems like a whole dictionary of new words! Things will become clearer with an example.

Ready? Set? SELECT SUM().

Let’s continue with the Business database example we have been working on throughout the journey. To assist us with the data analysis, let’s add the following data into the Orders table:

Table showing data for a SQL database. Table created by author.
ALTER TABLE ORDERSADD SalesPerson VARCHAR(30);
ALTER TABLE ORDERSADD SalesAmount FLOAT(4,2);
UPDATE ORDERS as oSET o.SalesPerson = 'James'WHERE o.Order_Nr IN ('01', '04', '06');
UPDATE ORDERS as oSET o.SalesAmount = 46.50WHERE o.Order_Nr = '01';

To save space, I have only provided the syntax for the first Sales Person, and the first Sales Amount, but the same logic will apply for the rest of the data.

Now it’s time to do some data analysis! We will be using the updated Orders table. Here are the questions we need to answer , and the syntax to get the correct data.

Q1. How many records order took place in August where the shipping day was also Monday?

SELECT *FROM ORDERS as oWHERE o.Order_Month = 'August'AND o.Shipping_Day = 'Monday';

Q2. How many orders had their shipping day as Wednesday or Friday?

SELECT *FROM ORDERS as oWHERE o.Shipping_Day = 'Wednesday'OR o.Shipping_Day = 'Friday';

Q3. Show all the sales people and their sales amounts for orders placed in June, August and March.

SELECT o.SalesPerson, o.SalesAmountFROM ORDERS as oWHERE o.Order_Month IN('June', 'August', 'March');

Q4. What is the total sales amount?

SELECT SUM(o.SalesAmount)FROM ORDERS as o;

Q5. What is the count of unique first names among our sales people?

SELECT COUNT(DISTINCT o.SalesPerson)FROM ORDERS as o;

Q6. Select the average sales amount of the orders, where the sales person is James, Linda and Jennifer, sorted in descending order.

SELECT AVG(o.SalesAmount) AS 'Average Sales Amount', o.SalesPersonFROM Orders as oGROUP BY o.SalesPersonHAVING o.SalesPerson IN ('James', 'Linda', 'Jennifer')ORDER BY AVG(o.SalesAmount) DESC;

Notice we added here some syntax we haven’t discussed before. As we start to use set functions for data analysis, it’s good practice to start changing the column name of the output of the column so that we don’t get confused as to what the data shows. For this we use an alias, and we set it using AS. This doesn’t change the column name of the database table, just the output.

SQL output grid. Image created by author.

For better readability, we can use ROUND(column, number of decimal places) to round the average.

SELECT ROUND(AVG(o.SalesAmount), 2) AS 'Average Sales Amount', o.SalesPersonFROM Orders as oGROUP BY o.SalesPersonHAVING o.SalesPerson IN ('James', 'Linda', 'Jennifer')ORDER BY AVG(o.SalesAmount) DESC;
SQL output grid. Image created by author.

Q7. What is the total sales amount for each sales person, including the number of sales per person, their average, lowest and highest sale amounts, sorted by total sales in an ascending order?

SELECT o.SalesPerson,SUM(o.SalesAmount) AS 'Total Sales',COUNT(o.SalesAmount) AS 'Number of Sales',ROUND(AVG(o.SalesAmount),2) AS 'Average Sales Amount',MIN(o.SalesAmount) AS 'Lowest Sales Amount',MAX(o.SalesAmount) AS 'Highest Sales Amount'FROM Orders as oGROUP BY o.SalesPersonORDER BY SUM(o.SalesAmount) ASC;
SQL output grid. Image created by author.

Congratulations! You are one step closer to mastering SQL. Remember, it’s all about small, consistent steps and lots of practice. This article covered a lot of new syntax, so don’t worry if it all doesn’t make perfect sense just yet. The important thing is to be able to follow along and understand the logic.

Keep your eyes peeled for the next sequel of my learning journey :)

Go from SELECT * to interview-worthy project. Get our free 5-page guide.

--

--