Open Data Project

Published in Design Computing, Apr 28, 2017

This week I started working on the Open Data Project. My dataset is a list of all registered dogs and cats in Geelong. I started out using the csv module alongside pandas, but then found out pandas already has a built-in read_csv function:

import pandas as pd
pets = pd.read_csv("registeredpets.csv")

And that’s it.

Lots of issues with the actual data:

Loads of inconsistencies. Ages range from 0 to 70+ (human or dog years?)
The same breed is often typed in differently across rows (e.g. one row has the dog breed “Labrador Cross Breed Dog”, which I’d consider the same as “Labrador”)
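A rough sketch of how both issues could be tackled with pandas. The sample rows are made up, the column names `breed` and `age` are assumptions about the file's headers, and both the list of suffixes to strip and the age cut-off of 30 are guesses that would need checking against the real data:

```python
import pandas as pd

# Made-up sample rows for illustration; the real column names may differ.
pets = pd.DataFrame({
    "breed": ["Labrador", "Labrador Cross Breed Dog", " labrador ", "Domestic Short Hair"],
    "age": [3, 71, 0, 12],
})

def normalise_breed(raw):
    """Trim, lower-case, and drop a trailing qualifier such as 'cross breed dog'."""
    cleaned = raw.strip().lower()
    # Hypothetical list of suffixes; longest first so only one is removed.
    for suffix in (" cross breed dog", " cross breed", " cross"):
        if cleaned.endswith(suffix):
            cleaned = cleaned[: -len(suffix)]
            break
    return cleaned.title()

# Keep the original column and add a cleaned one next to it.
pets["breed_clean"] = pets["breed"].map(normalise_breed)

# Flag ages that look implausible so they can be checked by hand.
MAX_PLAUSIBLE_AGE = 30  # assumed threshold, not taken from the dataset
pets["age_suspect"] = pets["age"] > MAX_PLAUSIBLE_AGE
```

Keeping the raw `breed` column means nothing is lost if treating a cross breed as its parent breed turns out to be the wrong call for a particular graph.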

What graphs I want to implement:

Dogs v Cats
Ages of both / compare
Registered/Not registered
Suburb/maybe a geographical map
(Un)common names
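Most of the graphs in this list come down to counting or averaging a column first. A minimal sketch of that step, again on made-up rows and with assumed column names (`type`, `name`, `age`):

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions about the real file.
pets = pd.DataFrame({
    "type": ["Dog", "Cat", "Dog", "Dog", "Cat"],
    "name": ["Bella", "Max", "Bella", "Rex", "Bella"],
    "age": [2, 5, 7, 4, 9],
})

# Dogs v Cats: number of rows per animal type.
type_counts = pets["type"].value_counts()

# Ages of both: mean age per animal type.
mean_age_by_type = pets.groupby("type")["age"].mean()

# Common names: the three most frequent names.
top_names = pets["name"].value_counts().head(3)
```

Each result is a pandas Series, so if matplotlib is installed something like `type_counts.plot.bar()` should be enough to turn it into a bar chart.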

Columns: Suburb, breed, type (dog/cat), colour, registered, age, animal name.

There are over 46,000 pets in the dataset.
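The columns and the row count can be checked straight after loading. A small sketch that uses an in-memory stand-in for the CSV so it runs on its own; the header names and the two rows are invented, not taken from the real file:

```python
import io
import pandas as pd

# Stand-in for registeredpets.csv: invented rows and guessed header names.
csv_text = """suburb,breed,type,colour,registered,age,name
Belmont,Labrador,Dog,Black,Yes,3,Bella
Corio,Domestic Short Hair,Cat,Tabby,No,5,Max
"""
pets = pd.read_csv(io.StringIO(csv_text))

column_names = list(pets.columns)        # which columns were read
row_count = len(pets)                    # number of pets (rows)
missing_per_column = pets.isna().sum()   # empty cells per column
```

With the real file, `pd.read_csv("registeredpets.csv")` would replace the `io.StringIO` line.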


Written by TA