Downloading a dataset with Kaggle API and simulating a fictional camera

Koray Çağlar
3 min read · Sep 18, 2022


Note: This tutorial is part of a larger tutorial. See the main article here:

In this tutorial I will explain how to download a dataset from Kaggle using the Kaggle API.

After that, for the main project, we will write a Python script that copies files from the dataset into a folder to simulate a camera.

Installing the Kaggle API:

We should install pip first:

$ sudo apt install python3-pip

Install the Kaggle API:

$ pip install kaggle

If you enter the “kaggle” command now, you will get a “kaggle: command not found” error. pip installed the executable into ~/.local/bin, which is not on your PATH, so create a symbolic link to it from /usr/bin by entering the following command:

$ sudo ln -s ~/.local/bin/kaggle /usr/bin/kaggle

You will also get a “Could not find kaggle.json” error when you enter a kaggle command. That is because you need an API token to use the Kaggle API. If you do not have a Kaggle account, create one. If you have an account, go to the account page.

Click your profile photo in the top-right. Then click Account.

In the API section, click “Create New API Token”. It should download your kaggle.json file.

Back in your VM, click the “Upload File” button in the top-right.

Upload the kaggle.json file you just downloaded. The VM may crash during the upload; if it does, just try again. After uploading, run “ls” to confirm that the file is on your system.

Next, move the JSON file into the .kaggle folder in your home directory. To do so:

$ mv kaggle.json ~/.kaggle/

You can enter Kaggle commands now.
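The move step can also be scripted. The following is a minimal sketch, not part of the original setup: the function name `install_kaggle_token` is hypothetical, and it assumes the token should live in `~/.kaggle/kaggle.json`. It also restricts the file permissions to the owner, since the Kaggle CLI prints a warning when the key is readable by other users.

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(token_path, config_dir=None):
    """Move kaggle.json into the Kaggle config folder and make it private."""
    token_path = Path(token_path)
    # Assumption: the default config folder is ~/.kaggle.
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    destination = config_dir / "kaggle.json"
    shutil.move(str(token_path), str(destination))
    # Owner read/write only, so other users on the machine cannot read the key.
    os.chmod(destination, 0o600)
    return destination
```

Calling `install_kaggle_token("kaggle.json")` from the folder that holds the uploaded file has the same effect as the mv command above.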

Download a dataset

Go to a dataset’s page. Click the three-dot button next to the Download button. Click “Copy API command”.

In my case the dataset is “Marble surface anomaly detection”. Link to the dataset:

Marble Surface Anomaly Detection — 2 | Kaggle

Enter the copied command to download the dataset zip file. My dataset’s command is as follows:

$ kaggle datasets download -d wardaddy24/marble-surface-anomaly-detection-2
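If you would rather start the download from a Python script, one option is to call the same CLI through `subprocess`. This is only a sketch under a few assumptions: the `kaggle` command is on your PATH, the API token is already in place, and the helper names `build_download_command` and `download_dataset` are made up for illustration. The `-p` option tells the CLI which folder to download into.

```python
import subprocess


def build_download_command(dataset_slug, target_dir="."):
    """Return the kaggle CLI call as an argument list (no shell quoting needed)."""
    return ["kaggle", "datasets", "download", "-d", dataset_slug, "-p", str(target_dir)]


def download_dataset(dataset_slug, target_dir="."):
    """Run the kaggle CLI and raise CalledProcessError if the download fails."""
    subprocess.run(build_download_command(dataset_slug, target_dir), check=True)
```

For the dataset used here, the call would be `download_dataset("wardaddy24/marble-surface-anomaly-detection-2")`.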

Run “ls” to see the downloaded zip file. We need to extract our data from the zip file. Install the unzip package by entering the following command:

$ sudo apt install unzip

The command to unzip the zip file in my case (change the name if yours is different):

$ unzip marble-surface-anomaly-detection-2.zip

Run “ls” to see the dataset folder. Remove the zip file to keep the workspace tidy:

$ rm marble-surface-anomaly-detection-2.zip
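The extract-and-delete steps can also be done with Python's standard `zipfile` module, which avoids installing the unzip package. A minimal sketch, where the function name `extract_and_remove` is hypothetical and the archive is assumed to sit in the current folder:

```python
import zipfile
from pathlib import Path


def extract_and_remove(zip_path, target_dir="."):
    """Extract every file from the archive, then delete the archive itself."""
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        names = archive.namelist()
    # Remove the zip only after extraction finished without an error.
    zip_path.unlink()
    return names
```

In my case the call would be `extract_and_remove("marble-surface-anomaly-detection-2.zip")`.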

Simulate a camera

As we do not have a real camera, we will make it look as if we do by writing a Python script. The script copies an image from the dataset to our newly created “imageserver” folder every 3 seconds, as if a real camera were taking a new photo and sending it to that folder.

Create a new folder named “imageserver”:

$ mkdir imageserver

Create a new Python script named “camera.py” (sudo is not needed for a file in your own home directory):

$ nano camera.py

“nano” is a text editor. Inside the editor, paste the code:
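What follows is a minimal sketch of such a script, not necessarily the exact code used in the main project. It assumes the unzipped images live somewhere under /home/username/dataset and that the target is /home/username/imageserver. The function name `simulate_camera`, the `max_images` parameter, and the counter prefix on the copied file names are illustrative choices.

```python
import shutil
import time
from pathlib import Path

# Assumed locations; replace "username" with your own user name.
SOURCE_DIR = Path("/home/username/dataset")
TARGET_DIR = Path("/home/username/imageserver")


def simulate_camera(source_dir, target_dir, interval_seconds=3.0, max_images=None):
    """Copy one .jpg at a time from source_dir to target_dir, pausing between copies."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Search subfolders too, because no particular dataset layout is assumed.
    images = sorted(source_dir.rglob("*.jpg"))
    if max_images is not None:
        images = images[:max_images]
    copied = 0
    for index, image in enumerate(images):
        # The counter prefix keeps same-named files from different subfolders apart.
        destination = target_dir / f"{index:05d}_{image.name}"
        shutil.copy(image, destination)
        copied += 1
        print(f"Captured {destination.name}")
        time.sleep(interval_seconds)
    return copied
```

To run it as camera.py, end the file with the line `simulate_camera(SOURCE_DIR, TARGET_DIR)`. The script then copies one image every 3 seconds until it runs out of images or you stop it.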

Replace “username” in the directory paths with your own username.

Press Ctrl + X to exit the editor, then press Y and Enter to save the file.

Run the script:

$ python3 camera.py

The script transfers an image file every 3 seconds. Wait a bit. Then press Ctrl + C to stop the script. Now go to the imageserver folder and list the contents. You should see some jpg files.
