TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Dramatically improve your database insert speed with a simple upgrade

Mike Huls
6 min read · May 9, 2021

Your Python script after being upgraded with fast_executemany (Image by NASA on Unsplash)

Uploading data to your database is easy with Python. Load your data into a pandas DataFrame and call its to_sql() method. But have you ever noticed that the insert takes a long time when working with large tables? We don’t have time to sit around waiting for our queries to finish!
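As a point of reference, here is a minimal sketch of that default approach. It assumes an in-memory SQLite database as a stand-in for a real server, and the table name `measurements` and its columns are made up for illustration; they are not from the original article.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data: 1,000 rows with illustrative column names.
df = pd.DataFrame(
    {
        "id": range(1, 1001),
        "value": [i * 0.5 for i in range(1, 1001)],
    }
)

# Assumption: an in-memory SQLite database stands in for your real database.
conn = sqlite3.connect(":memory:")

# The default call: pandas creates the table and inserts all rows.
df.to_sql("measurements", conn, if_exists="replace", index=False)

# Verify how many rows ended up in the table.
row_count = conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
print(f"rows written: {row_count}")
```

With a small table like this the call returns almost instantly; the slowdown discussed here only becomes noticeable with much larger tables and a database reached over the network.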

Graph showing the extreme speed of fast_executemany (Image based on one by François Leblanc on leblancfg.com)

With a few tweaks you can make inserts A LOT faster. Compare the write time of the brown bar (the default to_sql() method) with that of the green bar (our goal). Also notice that the vertical axis is on a logarithmic scale! By the end of this article you’ll be able to perform lightning-fast database operations. Ready? Let’s go!

Goals and steps

We’ll work towards the superfast insertion method in two steps. The first part focuses on how to properly connect to our database; the second explores four ways to insert data, ordered from slowest to fastest.

1. Connecting to our database

Written by Mike Huls

I write about interesting programming-related things: techniques, system architecture, software design and how to apply them in the best way. — mikehuls.com
