Download Gitter Room to a JSON File

A step-by-step guide to downloading a Gitter room's messages for your data science project.

INDRAJITH EKANAYAKE
Geek Culture
3 min read · Apr 19, 2021


Gitter Home Page: https://gitter.im/

I recently came across the "freeCodeCamp Gitter Chat, 2015–2017" dataset on Kaggle, and it made me curious about how to download any given Gitter chat as JSON for a data science project.

According to its website, Gitter is an open-source instant messaging and chat room system for developers and users of GitLab and GitHub repositories. Developers have been discussing their problems and solutions in these chat rooms for years. For example, the dataset mentioned above contains over 5 million messages. So if you are a data science geek who is hungry for data, this method will let you download any given Gitter chat room as JSON in minutes.

After some research, the only way I found was to use the gitter-export-room npm package, which exports a JSON archive of a Gitter room's messages. Its documentation is not very straightforward, though, and it took me some time to understand the flow. My aim in this article is to make the process easy as pie.

gitter-export-room NPM package

Step 1: Install Node.js and npm (skip this step if you already have Node.js installed)

npm is a package manager for the JavaScript programming language. It ships with Node.js, which you can download from https://nodejs.org/en/download/. Once it is installed, you can verify the installation by running the npm -v command.

Verify the installation
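
If you would rather script this check (for example, as the first step of a data pipeline), the following is a minimal Python sketch. The helper name tool_version is hypothetical and not part of any package; it only looks the tool up on your PATH and runs it with the -v flag.

```python
import shutil
import subprocess


def tool_version(name):
    """Return the output of `<name> -v`, or None if the tool is not on PATH.

    `tool_version` is a hypothetical helper written for this article.
    """
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True, check=False)
    return result.stdout.strip()


if __name__ == "__main__":
    # Both Node.js and npm should be present before moving on to Step 2.
    for tool in ("node", "npm"):
        version = tool_version(tool)
        print(f"{tool}: {version if version else 'not found'}")
```

If either line prints "not found", install Node.js first and open a new terminal so the updated PATH is picked up.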

Step 2: Install the gitter-export-room package globally

Run npm install --global gitter-export-room to install the package globally.

Install the gitter-export-room package globally

Step 3: Get a personal access token for the Gitter API

Log in to Gitter Developer and copy your personal access token.

Copy the personal access token

Step 4: List your Gitter rooms

Head back to the terminal and run the gitter-export-room --token <your_token> list command to list your Gitter rooms.

Note: You need to join a Gitter room before you can download its chat as JSON.

List down Gitter rooms
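
Typing the token directly into the terminal leaves it in your shell history. One way around that is to keep it in an environment variable and call the tool from a small script. The sketch below assumes a variable named GITTER_TOKEN (a name I picked for illustration) and uses only the --token and list arguments shown above. The function names are hypothetical.

```python
import os
import shutil
import subprocess


def list_rooms_command(token):
    """Build the argument list for `gitter-export-room --token <token> list`."""
    return ["gitter-export-room", "--token", token, "list"]


def list_rooms(token):
    """Run the list command and return whatever text it prints."""
    cmd = list_rooms_command(token)
    # Resolve the full path so the npm-installed launcher is also found on Windows.
    cmd[0] = shutil.which(cmd[0]) or cmd[0]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # GITTER_TOKEN is an assumed variable name, not something the tool reads itself.
    token = os.environ.get("GITTER_TOKEN")
    if token:
        print(list_rooms(token))
    else:
        print("Set the GITTER_TOKEN environment variable first.")
```

The output is returned as plain text because I have not verified the exact format the list command prints, so the sketch does not try to parse it.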

Step 5: Download the data by room ID

Now that we have listed the room IDs, we can download a room's data as JSON by its ID. To do that, run the gitter-export-room --token <your_token> id <room_id> > file.json command, which creates file.json in your current directory.

Download the data by room ID
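
The same download can be driven from Python, which is handy if you want to export several rooms in a loop. This is a rough sketch rather than an official interface: export_command and export_room are hypothetical names, GITTER_TOKEN and GITTER_ROOM_ID are assumed environment variables, and the only tool arguments used are the --token and id ones from the command above.

```python
import os
import shutil
import subprocess


def export_command(token, room_id):
    """Build the argument list for `gitter-export-room --token <token> id <room_id>`."""
    return ["gitter-export-room", "--token", token, "id", room_id]


def export_room(token, room_id, out_path="file.json"):
    """Run the export and write its standard output to out_path.

    This mirrors the `> file.json` shell redirect. If the command fails,
    an empty or partial file may be left behind at out_path.
    """
    cmd = export_command(token, room_id)
    # Resolve the full path so the npm-installed launcher is also found on Windows.
    cmd[0] = shutil.which(cmd[0]) or cmd[0]
    with open(out_path, "w", encoding="utf-8") as out_file:
        subprocess.run(cmd, stdout=out_file, check=True)
    return out_path


if __name__ == "__main__":
    # Both variable names are assumptions made for this example.
    token = os.environ.get("GITTER_TOKEN")
    room_id = os.environ.get("GITTER_ROOM_ID")
    if token and room_id:
        print("Saved to", export_room(token, room_id))
    else:
        print("Set GITTER_TOKEN and GITTER_ROOM_ID first.")
```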
JSON data
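
Once file.json exists, a quick sanity check is to load it and count the messages per user. The sketch below assumes that the file holds a JSON array of message objects and that each message carries a fromUser object with a username field, as in Gitter's REST API. I have not confirmed that the export uses exactly this layout, so adjust the field names if your file looks different. The function names are hypothetical.

```python
import json
import os
from collections import Counter


def load_messages(path):
    """Load the exported file, assuming it holds a JSON array of messages."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of messages")
    return data


def top_senders(messages, limit=5):
    """Count messages per username (assumed field: fromUser.username)."""
    counts = Counter(
        (msg.get("fromUser") or {}).get("username", "unknown")
        for msg in messages
        if isinstance(msg, dict)
    )
    return counts.most_common(limit)


if __name__ == "__main__":
    if os.path.exists("file.json"):
        messages = load_messages("file.json")
        print(f"{len(messages)} messages")
        for username, count in top_senders(messages):
            print(f"{username}: {count}")
    else:
        print("file.json not found in the current directory.")
```

Messages without a fromUser entry are grouped under "unknown" instead of raising an error, which keeps the check usable even if some records are incomplete.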

One Final Request

My primary intention here is to use this public data to train machine learning models for my chatbot project, and I do not want to intrude on private data. My final request to the world is to use the data for good and to respect others' privacy.

All credit goes to Christopher Hiller, the creator of the gitter-export-room npm package.



Microsoft MVP | Lecturer | Researcher enjoys simplifying tech. Connect with me on https://www.linkedin.com/in/indrajithek/