Analyzing large files locally in seconds with DuckDB and DataGrip

Daniel Palma
3 min read · May 28, 2022

If you have ever received a huge CSV file that you had to analyze, or just wanted to peek into quickly to check its structure, your go-to tool is probably pandas and a small Python script.

But if you are like me and always have DataGrip (or any other JDBC-compatible SQL IDE) open, and you prefer SQL to Python for quick routine checks like this, this guide is for you!

If you haven’t heard of the two pieces of tech we’ll use, here’s a short description of each:

DuckDB is an in-process SQL OLAP database management system: all the benefits of a database, none of the hassle.

DataGrip is a database management environment for developers. It is designed to query, create, and manage databases. Databases can work locally, on a server, or in the cloud.

Set up the environment

  1. Download the DuckDB JDBC driver from Maven.
  2. In DataGrip, create a new Driver configuration using the downloaded JAR file.
  3. Create a new Data Source; the connection URL should be just jdbc:duckdb: (with nothing after the final colon, DuckDB opens an in-memory database), as shown in the screenshot below.
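Once the data source connects, a quick sanity check in the DataGrip query console could look something like the sketch below. It assumes a CSV file at the made-up path /path/to/large_file.csv (swap in your own) and relies on DuckDB's read_csv_auto function to detect the delimiter, header, and column types.

```sql
-- NOTE: '/path/to/large_file.csv' is a hypothetical path used for illustration.

-- Show the column names and types DuckDB infers for the file.
DESCRIBE SELECT * FROM read_csv_auto('/path/to/large_file.csv');

-- Peek at the first few rows.
SELECT *
FROM read_csv_auto('/path/to/large_file.csv')
LIMIT 10;

-- Count the rows in the file.
SELECT count(*) AS row_count
FROM read_csv_auto('/path/to/large_file.csv');
```

Because the connection is in-memory, nothing is persisted between sessions; each query reads straight from the file on disk.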
