Adventures with Python: Warsaw’s bike parkings
Python has always been a language I wanted to improve in, mostly because of its data tools. As an active bicycle commuter, I decided to grab data from a crowd-sourced aggregator of Warsaw’s bicycle parking locations. Once scraped, the dataset is straightforward: lat-lng coordinates, the number of racks at a given location, its address and some additional data.
At this point I should warn you, dear reader, that unlike my previous blog posts, this one covers very basic stuff and does not include the code I actually wrote.
Warsaw is divided into 18 districts (or suburbs).
Asking how many racks are in each of those districts already gave me an opportunity to experiment with Python’s ecosystem. Here are some useful tools I’ve discovered:
- Shapely for all geo operations. I’ve used it to determine if a point (rack location) is within a polygon (suburb boundaries)
- lxml — a great XML library. Used for mangling OpenStreetMap data. I used OSM for obtaining the district boundaries.
- pandas — the starting-point for data analysis in Python.
- IPython Notebook — Python in your web browser
- TileMill (unrelated to Python) — a desktop app (OSX/Win/Ubuntu) for styling and displaying map data. Supports input in GeoJSON, CSV and Shapefiles among others
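To make the counting step concrete, here is a minimal sketch of the idea, not the original code. It assumes each parking is a `(lng, lat, racks)` tuple and each district boundary is a list of `(lng, lat)` vertices; the toy districts "A" and "B", the sample points, and the helper names are all hypothetical. To keep it dependency-free it uses a hand-written ray-casting test in place of Shapely, whose `Point.within(polygon)` check would be the natural choice in practice.

```python
from collections import Counter


def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; coordinates are treated as planar,
    which is a reasonable approximation at city scale.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def racks_per_district(parkings, districts):
    """Sum racks per district; parkings outside every boundary are skipped."""
    counts = Counter()
    for lng, lat, racks in parkings:
        for name, boundary in districts.items():
            if point_in_polygon((lng, lat), boundary):
                counts[name] += racks
                break
    return counts


# Hypothetical toy data: two square "districts" and four parkings.
districts = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
parkings = [(1.0, 1.0, 5), (0.5, 1.5, 3), (3.0, 1.0, 4), (9.0, 9.0, 7)]

totals = racks_per_district(parkings, districts)
for name, racks in totals.most_common():
    print(name, racks)
```

With real data, the boundaries would come from the OSM district polygons and the parkings from the scraped dataset. Points that fall exactly on a shared boundary are an edge case this sketch does not treat specially.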
Here are the results:
Śródmieście (downtown) unsurprisingly has the most racks (764); Ursynów (490), Praga Południe (416) and Mokotów (402) come next. Sadly, the centrally located Żoliborz (65) and Wola (100) are the hardest places to park your bike safely.
What’s next? I’ve started reading Python for Data Analysis from O’Reilly. As there are a few other Warsaw datasets, I might just use them to practice what I learn.
The city’s cycling transport official recently announced (pl) which districts will be setting up new bike racks. Perhaps it’s a good idea to check the numbers and put some pressure on the less active districts (Żoliborz or Wola).
Originally published at maciejb.me on April 22, 2014.