From Kaggle to Snowflake

In a previous blog post I showed how to load .csv files into Snowflake. I downloaded those files manually from Kaggle. In this post I show how to use the Kaggle API to remove the manual download step.

Install Kaggle

I use Kaggle in an Anaconda environment, so I installed the Kaggle package in a separate environment.

Kaggle API key

First we have to create a Kaggle API key, which is required to connect to Kaggle. If you have a Kaggle account, you can create a new API token from your account settings (<user_name>/account).

Standard implementation

Creating the token generates a kaggle.json file. This file needs to be stored in a folder called .kaggle in your home directory. The kaggle.json file contains two fields, username and key.
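To make the structure concrete, here is a minimal sketch that reads kaggle.json and checks for its two fields, username and key. The helper name load_kaggle_json is my own and is not part of the Kaggle package.

```python
import json
from pathlib import Path


def load_kaggle_json(path=None):
    """Read kaggle.json and check that it holds the two expected fields.

    Hypothetical helper for illustration; not part of the Kaggle package.
    """
    # Default to the standard location: ~/.kaggle/kaggle.json
    path = Path(path) if path else Path.home() / ".kaggle" / "kaggle.json"
    with open(path) as f:
        token = json.load(f)

    # The file is a flat JSON object: {"username": "...", "key": "..."}
    missing = {"username", "key"} - set(token)
    if missing:
        raise KeyError(f"kaggle.json is missing: {sorted(missing)}")
    return token
```

Calling load_kaggle_json() without arguments reads the file from the default location in your home directory.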

In the Python script you can also authenticate directly through the OS environment variables KAGGLE_USERNAME and KAGGLE_KEY.
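A minimal sketch of environment-variable authentication could look like this. KAGGLE_USERNAME and KAGGLE_KEY are the variables the Kaggle client reads. The two function names are my own, and the sketch assumes the kaggle package is installed in the active environment.

```python
import os


def set_kaggle_env(username, key):
    """Expose the Kaggle credentials as environment variables."""
    os.environ["KAGGLE_USERNAME"] = username
    os.environ["KAGGLE_KEY"] = key


def get_authenticated_api():
    """Return an authenticated KaggleApi client.

    The import happens here on purpose: the kaggle package tries to
    authenticate when it is imported, so the environment variables
    (or kaggle.json) must already be in place.
    """
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()
    return api


# Usage (placeholder values, replace with your own credentials):
# set_kaggle_env("my_user", "my_api_key")
# api = get_authenticated_api()
```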

Customised example

For this example, I was curious whether I could add the kaggle.json content to the credentials file I used in my previous example.

Authentication in this customised example goes hand in hand with the authentication to Snowflake. The same credentials file is referenced for both Snowflake and Kaggle.
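As an illustration, the sketch below reads one JSON credentials file and passes the Kaggle part on as environment variables. The file layout (a "snowflake" section next to a "kaggle" section) and both function names are assumptions of mine; the layout of the actual credentials file may differ.

```python
import json
import os


def load_credentials(path):
    """Read the combined credentials file.

    Assumed (hypothetical) layout:
    {
      "snowflake": {"account": "...", "user": "...", "password": "..."},
      "kaggle":    {"username": "...", "key": "..."}
    }
    """
    with open(path) as f:
        return json.load(f)


def apply_kaggle_credentials(credentials):
    """Copy the Kaggle section into the variables the Kaggle client reads."""
    kaggle_section = credentials["kaggle"]
    os.environ["KAGGLE_USERNAME"] = kaggle_section["username"]
    os.environ["KAGGLE_KEY"] = kaggle_section["key"]


# Usage (hypothetical file name):
# credentials = load_credentials("credentials.json")
# apply_kaggle_credentials(credentials)
# snowflake_settings = credentials["snowflake"]
```

The Snowflake section stays available in the same dictionary, so one file serves both connections.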

Download from Kaggle

The next step is downloading files from Kaggle. For this we use the Kaggle API, specifically the dataset_download_files() method.
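A small wrapper around dataset_download_files() might look like the sketch below. It expects an already authenticated KaggleApi object. The wrapper name, the dataset slug, and the target folder are placeholders of mine.

```python
from pathlib import Path


def download_dataset(api, dataset, target_dir, unzip=False):
    """Download all files of a Kaggle dataset into target_dir.

    api     : an authenticated KaggleApi instance
    dataset : dataset slug in the form "<owner>/<dataset-name>"
    unzip   : let the Kaggle API extract the archive after download
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    api.dataset_download_files(dataset, path=str(target_dir), unzip=unzip)
    return target_dir


# Usage (placeholder slug and folder):
# download_dataset(api, "some-owner/some-dataset", "data/raw")
```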

Unzip records

Data from Kaggle is downloaded in .zip format. You can let the Kaggle API unzip the files by passing unzip=True to dataset_download_files().

An alternative is to unzip the files yourself in Python after the download.
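One way to do this, sketched here with Python's standard zipfile module, is shown below. The function name and the paths in the usage comment are placeholders of mine, and this is not necessarily the statement used in the original code.

```python
import zipfile
from pathlib import Path


def unzip_archive(zip_path, target_dir):
    """Extract every file in zip_path into target_dir and return their names."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Usage (placeholder paths):
# extracted = unzip_archive("data/raw/some-dataset.zip", "data/raw")
```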


The remainder is similar to the previous post, From .csv to Snowflake:

  • Reading .csv Data
  • Creating Snowflake objects
  • Loading Data into Snowflake

Find the code for this blog post on GitHub.

Thanks for reading and till next time.

Daan Bakboord — DaAnalytics

