Basics MySQL knowledge required for Data Science.

Gaurav Rajpal
Analytics Vidhya
Published in
12 min readFeb 1, 2021
Source: Internet

In this blog we will be discuss about the fundamental course of SQL which is useful for all professional be it data analyst, business analyst, data scientist to extract, manipulate or draw insights from data stored in SQL databases. We will be looking at MySQL database starting from introduction, installation and practical sessions.

INTRODUCTION :

1. Why do we need databases ?

The rapid explosion of data creates a necessity to have a place to store and retrieve data efficiently. We are actually leaving in age of the data.

Various sources of data | Source: Analytics Vidhya

We require the databases mainly for 4 operation i.e. Store, Update, Delete and Search. We really need fast search and as we will see later in the blog which is one of the main reason in using specialization databases.

As a data scientist perspective we need data to know how to retrieve this data, manipulate it and save the processed data.

2. What is a database ?

It is an organized collection of data, generally stored and accessed electronically from a computer system.

3. What are the good properties of a database ?

Eg. Food Recipe Blog.

Properties of database | Source: Analytics Vidhya

4. What are different types of databases ?

Different types of databases | Source: Analytics Vidhya

5. How data is stored in database ?

Data Storage | Source: Analytics Vidhya

As we see in No-SQL type of databases which is schema less and everything is stored as JSON objects we can see we have key-value type of notification. We can nest things at many levels which is not possible in relational databases and hence No-SQL way of defining tables is much more simple and flexible and is much more popular recently.

6. Which are the list of companies using MySQL ?

80 % of the companies in worldwide uses MySQL. Below is the list of companies using MySQL.

Companies list using MySQL | Source: Analytics Vidhya

INSTALLATION MySQL / MariaDB

In this part we will learn how to install MySQL. Please refer the following link for installation of MySQL / MariaDB (MariaDB is fork version of MySQL acquired by Oracle).

We will be installing MySQL 5.7 for this blog (Windows)

Installation Links:

GETTING STARTED

In this part we will learn we will learn what are types of queries and get started with writing most of the used queries like SELECT & INSERT. We will also look at the data types of MySQL.

1. What is SQL ?

SQL stands for ‘Structured Query Language’ . It is a language to communicate with databases. It is standard which means that, it does not depends on any hardware. We can create, maintain and retrieve the information related to database and its data using this language.

2. History of SQL ?

History of SQL | Source: Analytics Vidhya.

3. What are the types of command in SQL ?

We have following types of SQL statements:

  • DDL (Data Definition Language)
  • DML (Data Manipulation Language)
  • DCL (Data Control Language)

4. Exploring Databases.

In this part we will see how we can create a database and connect to that database and use it for our business purpose.

Execution steps for creating database | Source: Analytics Vidhya

Before creating tables and exploring more let us look at the data types of the MySQL. Ref link is given below.

Data Types in MySQL | Source: Analytics Vidhya

5. Creating Tables.

Here we will see how we can create the structure of the table in MySQL.

Creation of Tables in MySQL | Source: Analytics Vidhya

6. Inserting Records in Tables.

Here we will see how we can insert the records in our newly created table User Details.

Inserting records in tables | Source: Analytics Vidhya.

7. NULL & NOT NULL.

Difference between Null & Not Null | Source: Analytics Vidhya

MODIFYING DATABASE STRUCTURE

In this part we will see how we can use DELETE and UPDATE command. Also we will see how we can view the structure of the table using DESCRIBE and lastly we will use ALTER command to alter the structure of the table.

1.UPDATE Command.

UPDATE command is used to update the data in the tables in a database.

Note:

  • Updates can happen only one column at a time.
  • Using where clause in an update helps us to update the rows which we are interested in.
  • If where clause is not mentioned it will set all rows of column name mentioned to new value.
Demonstration of UPDATE command | Source: Analytics Vidhya.

2.DELETE Command.

DELETE command is use to delete the records from the table.

Note:

  • All the matching rows corresponding to where clause will be deleted from the table.
  • If where clause is not used all rows will be deleted.
Demonstration of DELETE command | Source: Analytics Vidhya.

3. DESCRIBE Command.

DESCRIBE or SHOW CREATE TABLE are basically used to show the structure of the table. The difference between two is that DESCRIBE gives us the structure whereas another one give us the syntax of create table which can used further.

Demonstration of DESCRIBE command | Source: Analytics Vidhya

4. ALTER Command.

Using ALTER command we can add a column in our table, change it’s name along with change its data type as well.

Demonstration of ALTER command for adding column and changing its name| Source: Analytics Vidhya

We can also change the datatype of the column using ALTER command.

Demonstration of ALTER command for modifying the datatype | Source: Analytics Vidhya.

IMPORTING & EXPORTING DATA

In this part we will see how we can import and export the data in MySQL which can be in flat files, csv or excel format. We will also see how we can backup the database along with its objects and restore it on another server.

1. Importing data from CSV to MySQL.

Demonstration of loading csv file contents into MySQL tables | Source: Analytics Vidhya

2. Exporting data from MySQL to CSV.

Demonstration of exporting data from MySQL into CSV file | Source: Analytics Vidhya

3. Backing up databases in MySQL.

In this part we will learn how to backup the databases into SQL file.

Demonstration of backing up database in MySQL | Source: Analytics Vidhya

4. Restoring database in MySQL.

Here we will see how we can restore the database which is dropped accidently by the user.

For restoring the database first we need to create an empty database and then redirect the output SQL file generated in backup phase.

Steps for restoring a database in MySQL | Source: Analytics Vidhya

DATA ANALYSIS USING MySQL

This module plays an important role and we will look at the data analysis part using MySQL. As a data scientist we need to retrieve the insights from the data so this is what this module is all about.

List of modules covered under data analysis module | Source: Analytics Vidhya

1. Counting Rows & Items.

Demonstration of counting rows and items | Source : Analytics Vidhya

2. Aggregate Functions (SUM, AVG, STDDEV, MIN, MAX)

Demonstration of aggregate functions | Source: Analytics Vidhya

3. Slicing required data in MySQL.

In this part we will see IN, OR, NOT operators. Also we will see AND and BETWEEN operators of where clause.

IN OPERATOR:

Demonstration of IN Operator in My SQL | Source: Analytics Vidhya

NOT IN OPERATOR:

Demonstration of NOT IN Operator in MySQL | Source: Analytics Vidhya

BETWEEN OPERATOR:

Demonstration of BETWEEN in MySQL | Source: Analytics Vidhya

4. Limiting the data in MySQL.

In this part we will see how we can limit the number of rows to view from our tables.

We can limit the number of rows using LIMIT keyword.

Demonstration of number of rows in MySQL | Source: Analytics Vidhya

5. Sorting data in MySQL.

ASCENDING ORDER:

Demonstration of sorting salary in ASCENDING order and getting 2nd lowest salary | Source: Analytics Vidhya

DESCENDING ORDER:

Demonstration of sorting salary in DESCENDING order and getting 2nd largest salary | Source: Analytics Vidhya

6. Filtering patterns using LIKE and wildcard in MySQL.

Demonstration of filtering operation in MySQL | Source: Analytics Vidhya

7. Grouping, Rolling up and Filtering in group in MySQL.

Getting count of unique department | Source: Analytics Vidhya
Examples of filtering rolled up data in MySQL | Source: Analytics Vidhya

DESCRIPTIVE ANALAYSIS: FIFA 2019

The above link consists a project using MySQL queries. We have done descriptive analysis on FIFA 2019 players dataset.

WORKING ON JOINS IN MySQL.

In real scenario data is not stored in a single table. It is stored in multiple tables. In this module we will see how we can get the data from multiple tables using something known as joins and subqueries.

Types of Joins.

Types of Joins in MySQL | Source: Analytics Vidhya

1.Left Join.

Description about LEFT JOIN | Source: Analytics Vidhya
Practical examination LEFT JOIN | Source: Analytics Vidhya

2. Inner Join.

Description about INNER JOIN | Source: Analytics Vidhya

3. Cross Join / Cartesian Join.

We must note that this type of join is rarely used in practice.

Description about CROSS JOIN | Source: Analytics Vidhya

4. Right Join.

Description about RIGHT JOIN | Source: Analytics Vidhya

5. Self Join.

Self Join is a scenario where we join a table to itself. It is complex to debug and must be use only when we require pair wise list. Eg. Pairwise players belonging to same club in FIFA_2019 dataset or pairwise distribution of players whose age difference is less than 10.

Description about SELF JOIN | Source: Analytics Vidhya

INTRODUCTION TO INDEXING

Indexing helps us to make query and search fast in the database.

Description about INDEX |Source: Analytics Vidhya

1. How indexing works ?

Let us consider the example of Library and understand a scenario where indexing will help us for faster search.

Case 1: If there is a scene that a new book has comes in and along with that there are few other books as well which needs to updated or removed or have become outdated. So if a librarian keep a new book on some random self so to find that book next time it will take a lot of time.

Case 2: If a smart librarian has sorted all the books based on the subject followed by author in alphabetical order then keeping a new book which belongs to XYZ subject for XYZ author becomes easier for fetching.

So, considering the case 2 we can say that, indexing on subject followed by author will make our search faster and will also help us for insertion, updating selves followed by removing outdated books without any difficulty.

2. Relationship.

Types of Relationship | Source: Analytics Vidhya

3. Table Constraints.

Table constraints put constraints on the data which is inserted in the table. If we try to insert data violating that constraint MySQL won’t allow us to insert that data.

Description about Table Constraints | Source: Analytics Vidhya

MySQL BUILT IN FUNCTIONS.

In this section we will discuss about some of the built in functions used in MySQL which can help us in reporting.

1.String Function: CONCAT

This function is used to concatenate strings from the data. For eg. if we need to concatenate first and last name of the person from the data. We use the concat keyword in MySQL to achieve this.

Example of CONCAT operation | Source: Analytics Vidhya

2. String Function: CASE

This function is use to convert strings into lowercase or uppercase. We use ucase (UPPER) or lcase (LOWER) keyword in MySQL to achieve this.

Example of CASE operation | Source: Analytics Vidhya

3. String Function: Trimming Strings

This function is use to remove trailing and leading spaces in a string. We use ltrim, rtrim and trim functions of MySQL to achieve this.

Example of TRIMMING strings | Source: Analytics Vidhya

4. String Function: Extracting / Slicing Strings

This is use to extract some part of strings. We use left, right and substring keyword for extracting first, last and some characters of string.

Example of SLICING strings | Source: Analytics Vidhya.

5. Date-Time Functions: Current Date & Time.

We use now() function to get the current date along with timestamp. Along with that we can also use curdate(), curtime(), unix_timestamp() and observe the following result.

Example for getting current date and time using MySQL functions | Source: Analytics Vidhya

6. Date-Time Functions: Extracting Date & Time.

Examples for extracting date and time from a particular column in MySQL | Source: Analytics Vidhya

7. Date-Time Function: Extracting Month / Day / Year

Example for extracting Month / Day / Year from a particular column in MySQL | Source: Analytics Vidhya

8. Date-Time Function: Formatting Date-Time

Example of formatting date column with required date format | Source: Analytics Vidhya

9. Numeric Functions

There are various numeric functions which are used in MySQL. We will see following function:

  1. round() function which is used to round the number to nearest integer.
Example of ROUND function in MySQL| Source: Analytics Vidhya

2. floor() function which is use to round down the values.

Example of FLOOR function in MySQL | Source: Analytics Vidhya

3.ceil() function which is used to round up the values.

Example of CEIL function in MySQL | Source: Analytics Vidhya

4.truncate() function which help us to display number of the decimal points required.

Example of TRUNCATE function in MySQL | Source: Analytics Vidhya

5. abs() functions to remove the negative sign of the numbers.

Example of ABS function in MySQL | Source: Analytics Vidhya

MySQL WINDOWS FUNCTIONS

MySQL windows functions are used to perform data analysis calculation more easily. We will be using OVER() and PARTITIONED BY quite often if we want to use windows function.

OVER(): It determines the window (set of rows)

PARTITIONED BY : It splits the result into partitions where window function is applied.

To demonstrate windows functions we will be creating demo database called SALES. Consider the schema of the database / table below.

Schema of sales DB | Source: YouTube (SQL Tutorial — Window Functions by Bearded Dev)

1. To get Total amount / Sales.

To get the total amount per sales and then populate back to the main table we would have use CTE (Common Table Expression). The query along with the output is shown below.

Demonstration of CTE function in MySQL | Source: YouTube (SQL Tutorial — Window Functions by Bearded Dev)

Now to debug and grasp the above query is little bit difficult. It is also not easiest to read. So, here comes the windows function to rescue us from this problem where we will be using OVER clause which helps us to divide the data in windows and use PARTITION BY clause along with it to further break down the data.

Demonstration of Windows Function | Source: YouTube (SQL Tutorial — Window Functions by Bearded Dev)

2. To get Number of Line Amount / Sales along with Total Sales.

Demonstration of Windows Function | Source: YouTube (SQL Tutorial — Window Functions by Bearded Dev)

3. To get the Daily Sales.

Demonstration of Windows Function | Source: YouTube (SQL Tutorial — Window Functions by Bearded Dev)

Conclusion

So, in this blog we learnt basic MySQL knowledge required for any Data Scientist. We started with introduction and understanding the history. We defined what are DDL, DML and DCL commands. We studied about them in details and explored them with examples. We understood how analysis can be carried out by writing SQL queries and did descriptive analysis of FIFA_2019 players project. We also understood what are various types of joins and built in functions. Last but not the least we understood the windows functions which can be helpful in reporting purposes.

Hope you understood the basic intuition behind this blog.

Do connect with me on LinkedIn : https://www.linkedin.com/in/gaurav-rajpal/

If you liked it please give 50 applauses.

--

--

Analytics Vidhya
Analytics Vidhya
Gaurav Rajpal
Gaurav Rajpal