Numeric Data Types and Summary Functions in SQL

Shawn
3 min readNov 9, 2022

--

Let’s first take a look at what kind of Numeric data type in SQL:

Division:

We can choose what kind of type we want to get as an output.

Function for Numeric data

min(): minimum value

max(): maximum value

avg(): average value

Variance:

  • Population variance : var_pop()
  • Sample variance: var_samp() / same as variance()

Standard deviation:

  • Population Standard Deviation: stddev_pop()
  • Sample Standard Deviation: stddev_samp() / stddev()

Round:

Truncate

This is very useful using with grouping:

Case study:

We got a data “stackoverflow” as below:

Summarize the distribution of the number of questions with the tag “dropbox” on Stack Overflow per day by binning the data.

We first create boundary: lower/upper

then count the question, combine the boundary with our data.

result

corr(): correlation between

Case study:

  • Compute the mean (avg()) and median assets of Fortune 500 companies by sector.
  • Use the percentile_disc() function to compute the median

https://gist.github.com/ShawnXXycarb/7bd47b432d36cca3075f0bbc27dae5de

We can create temp table to simplify a query

While you can join the results of multiple similar queries together with UNION, sometimes it's easier to break a query down into steps. You can do this by creating a temporary table and inserting rows into it.

Compute the correlations between each pair of profits, profits_change, and revenues_change from the Fortune 500 data.

If you enjoy this article please click the thump up (Keep pressing can send multiple thump up!) and follow me, I’ll keep updating content about data science! Hope you like it.

Reference:

Christina Maimone: EXPLO RATO RY DATA ANALYSIS IN SQL, datacamp

--

--

Shawn

Self taught — Data Analyst | Business Intelligence Specialist | Business Analyst | Data scientist