Parsing RIPE Bulk Whois Data

Published in Parsing Bulk Whois Data

2 min read · Jun 18, 2019

At one time, I was working with large amounts of whois data from numerous sources such as ARIN, AFRINIC, and RIPE. Some of these sources suggested using Perl or Ruby to parse the data. But I am a Python guy, so I set out to parse it with my language of choice.

In the following, you will see how it is done with Python.

After downloading the latest RIPE DB file from ripe.net, we need to read the data in. I chose the pandas library for that.
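A minimal sketch of what that initial read could look like, not necessarily the exact code used here. It assumes the dump is a plain-text or gzip-compressed file with Latin-1 encoding, and it loads every line into a single-column DataFrame. The function name `read_whois_lines`, the column name `line`, and the file path in the comment are illustrative choices, not anything RIPE prescribes.

```python
import gzip

import pandas as pd


def read_whois_lines(path, encoding="latin-1"):
    """Read a bulk whois dump into a one-column DataFrame, one row per line.

    Assumption: the file is plain text or gzip-compressed text. The encoding
    default is a guess; undecodable bytes are replaced rather than raising.
    """
    # Pick the opener based on the file extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding=encoding, errors="replace") as handle:
        # Strip only the trailing newline so leading whitespace
        # (used by continuation lines) is preserved.
        lines = [line.rstrip("\n") for line in handle]
    return pd.DataFrame({"line": lines})


# Example call with a hypothetical local path:
# df = read_whois_lines("ripe.db.inetnum.gz")
# print(len(df))
```

Reading line by line keeps every row exactly as it appears in the dump, which makes it easier to see what needs cleaning. For a file of this size, memory use is substantial, so processing in chunks may be worth considering.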

After the initial read, the data contains 87,447,972 rows and looks like the following:

To put it nicely, the data is a mess. Several rows for the same CIDR block start with the same keyword, such as ‘descr:’ and ‘remarks:’, and some rows, like the first 5, carry no useful information. So some cleaning needs to be done.
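One way to approach that cleaning is sketched below in plain Python. It is an illustration under assumptions, not the exact method used here: it assumes header and comment rows start with `#` or `%`, that objects are separated by blank lines, that each attribute has the form `key: value`, and that a line starting with whitespace or `+` continues the previous attribute. The function name `parse_whois_objects` and the ` | ` separator for repeated keys are made up for this example.

```python
def parse_whois_objects(lines, skip_prefixes=("#", "%")):
    """Group raw whois lines into one dict per object.

    Repeated attributes (for example 'descr' or 'remarks') are joined
    with ' | ' so each object ends up with a single value per key.
    """
    records = []
    current = {}      # key -> list of values for the object being built
    last_key = None   # most recent key, needed for continuation lines

    def flush():
        # Close the current object, dropping empty values when joining.
        if current:
            records.append(
                {k: " | ".join(v for v in vals if v) for k, vals in current.items()}
            )

    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line ends the current object.
            flush()
            current, last_key = {}, None
            continue
        if line.startswith(skip_prefixes):
            # Header or comment row with no useful information.
            continue
        if line[0] in " \t+" and last_key is not None:
            # Continuation of the previous attribute's value.
            extra = line[1:].strip()
            current[last_key][-1] = (current[last_key][-1] + " " + extra).strip()
            continue
        key, sep, value = line.partition(":")
        if not sep:
            # Not a 'key: value' row; skip it.
            continue
        last_key = key.strip()
        current.setdefault(last_key, []).append(value.strip())

    flush()  # the file may not end with a blank line
    return records
```

The result is a list of dictionaries, one per object, which can be passed straight to `pd.DataFrame(records)` to get one row per block and one column per attribute.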