Connect Python with MySQL

Lina Haidar
4 min readAug 9, 2021

--

Photo by Hitesh Choudhary on Unsplash

This is the third out of four posts of the entire project Automating an end-to-end Data Pipeline on AWS Cloud.

In the first post Collect Data Using Web-scraping and APIs, we collected data and stored it in lists. In the second post Create a Database on MySQL Workbench, we set up a database on MySQL workbench with a structure that matches the data. In this post, we are going to fill this database with the data we collected using Connector/Python.

Normally, inserting values directly in the database using MySQL can be done by running the following query:

-- to manually fill in a table
insert into table_name ( parameter1 , parameter2 ) values ( value1 ,
value2 ) ;
-- Example : insert into cities ( city_id , city_name , country ) values
( 2950159 , ’ Berlin ’ , ’ Germany ’) ;

However, if the data is big, it would take months to fill the database this way.

Fortunately, there are alternative and more efficient ways to fill databases. One of them is by using Connector/Python, which can be summarized in two steps:

Step 1: connect python with MySQL server

To create a connection to the MySQL server, run the following lines of code:

# import the library
import mysql.connector
cnx = mysql.connector.connect(user='root',password='your_root_pwd', # your password of MySQL serverhost='127.0.0.1', # Server name or IP address on which MySQL is running. 127.0.0.1 for local hostdatabase='db_name' # the name of the database you want to connect to)# cnx is a MySQLConnection object# connect to databasecursor = cnx.cursor()

Once the connection to the MySQL server is established, we are ready to insert our values into the database.

Step 2: run SQL queries from Python to fill in the database

Basically, inserting values one by one is also possible and it is done using a similar way to what we did using a SQL query on MySQL workbench in the previous post. Below is an example of inserting only one row of values into a table:

query = """INSERT INTO table_name (parameter1 , parameter2 ) values( value1 , value2 );"""# after the query is defined, you are ready to execute itcursor.execute(query)
cnx.commit()
# close the connection after the insertion is completecursor.close()
cnx.close()

Notice that when inserting values, you need to specify the table name of the database: table_name. Also, the names of parameters must match with those of the table. And of course the datatype of the values must match with what you already defined when creating the table. For example, if one of the variables (or column name) of the table is set to be an integer:

city_id INT, 

then the inserted value must be an integer.

Now we are ready to do the real work which is inserting the values we collected in the first post into our database that we set up in the second post. We are going to take the above code and do a little modification such that instead of inserting one value, the entire list gets passed into the table. In the first post of this project, we stored the values of the flights in four lists: flight_num, departure_icao, arrival_icao, arrival_time. The following code inserts those lists in the database we created in the second post:

# inserting values of flights into MySQL database
# connect to database
# define connection details
cnx = mysql.connector.connect(
user='host',
password='your password',
host='127.0.0.1', # to connect to your local instance
database='gans' #type the name of the database you want to use here
)# connect to database
cursor = cnx.cursor()
# you should have flight_num, departure_icao, arrival_icao and arrival_time as listsquery = 'INSERT INTO flights (flight_num, departure_icao, arrival_icao, arrival_time) VALUES (%s, %s, %s, %s);'for i in range(len(flight_num)):
value = (flight_num[i], departure_icao[i],
arrival_icao[i], arrival_time[i] )
cursor.execute(query, value)
cnx.commit()
cursor.close()
cnx.close()

The additional step that we added here is creating a tuple of lists, and the for loop executes the same query for each of the values of the lists inside the tuple. If the codes runs without any error, you can check the flights table on my MySQL workbench by running:

SELECT * FROM flights ;

To fill in the other tables, the same approach can be followed. You need to only change the name of the table and its parameters.

If you followed the project from the start, by now, the tables on MySQL
should be filled.

It would seem that this is the end of the project. Well, in some cases, depending on the type of the project, filling the tables on our local host should be the last step. However, in our project we are collecting data of weather forecasts and flights of cities. The values change over time. This means that we have to run our scripts every day to first collect the daily weather forecasts and flights, and then to insert these values on our database.

Luckily, there is an alternative way. We can automate the data pipeline using AWS lambda functions. This means, we can set a trigger to let the lambda function run periodically according to our needs. This way, we do not have to worry about updating our databases frequently without our further intervention. More on this topic on the next post.

--

--