Building A Bike Share Simulation Using Python

Marco Sanchez-Ayala
Published in The Startup
8 min read · May 17, 2020
Photo by Anthony Fomin on Unsplash

Some Background

In my last post, I introduced a Python project aimed at implementing object-oriented programming. The goal was to create a simple bike share simulation modeled after New York City’s Citibike, of which I am a big fan!

A bike share system can quite easily be modeled with OOP because there are just a few basic components that interact and need to be tracked:

  1. Bikes
  2. Docks
  3. Stations

A bike share system has a fixed number of stations. Each station has a fixed number of docks, each dock holds at most one bike, and users ride bikes between stations. Example 1 below shows possible bike movement across a simple implementation of my simulation’s bike share system.

Example 1

We have three stations, each with a single dock, and one bike total. Bike 0 can be checked out of Station 0 and moved to Station 1 or 2 (assuming that there are no joy rides resulting in a bike being checked out of and into the same station). The time at which the bike is checked out and checked in is recorded, and the trip duration is used to calculate the price of that ride. Once at Station 1 or 2, the bike can be checked out again and moved to one of the open stations.

Bikes can only be checked into stations with empty docks. Example 2 shows a case where Bike 0 can only move to Station 1 because Station 2 is occupied.

Example 2

In reality, the system will have more stations, docks, and bikes. The base implementation of my bike share simulation has by default 9 stations of equal size with a total of 135 docks and 80 bikes.

In my previous post, I mentioned that I’d share some code, but I think there is more value in explaining my process and organization, since documenting that process is my main goal here. After all, the code itself is available on GitHub and will make much more sense after the following explanation.

Planning

I’m a firm believer in having at least some outline of a project before jumping in, so the very first thing I did was plan out the major pieces of this simulation and how they would interact. It wouldn’t make much sense to jump into the logic of the simulation itself right away, so I started with the core components of a bike share system.

As described above, there are really only 3 pieces here: bikes, docks, and stations. Because I knew there would ultimately be several of each that all behave similarly to their counterparts, these three entities became excellent candidates for having their own classes. It’s worth noting, though, that I intentionally left out customers because they’re not necessary to keep track of ride lengths and prices.

Without any coding, I started out by writing down what each class would need in order for a system full of these objects to capture a complete picture of a bike share system. First I considered stations, then docks, and finally bikes.

Station

  • Unique identifier: to keep track of which stations are being checked in or out of.
  • Location: a coordinate pair to be able to calculate distances (and thus trip duration) between stations.
  • Size: the fixed number of docks this station contains.
  • Docks: one or more docks that can hold bikes.
  • Activity log: to keep track of which bikes come and go at what times, and how much was paid for rides.
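As a rough illustration of how a coordinate pair can lead to a trip duration, the sketch below converts the straight-line distance between two stations into whole minutes. The function name, the speed constant, and its value are assumptions made for this example and are not taken from the project.

```python
import math

# Hypothetical riding speed, in coordinate units per minute.
SPEED_UNITS_PER_MIN = 0.25


def trip_duration(origin, destination, speed=SPEED_UNITS_PER_MIN):
    """Estimate ride time in whole minutes between two (x, y) locations."""
    dx = destination[0] - origin[0]
    dy = destination[1] - origin[1]
    distance = math.hypot(dx, dy)  # straight-line (Euclidean) distance
    return math.ceil(distance / speed)  # round up to the next full minute
```

Rounding up keeps durations in whole minutes, which matches a simulation clock that ticks once per minute.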

Dock

  • Unique identifier: to keep track of which dock within the current station is being checked in or out of. Only needs to be unique within this station. In other words, each station starts with Dock 0 as in Examples 1 and 2 above.
  • Bike: a Bike object waiting to be checked out or None.
  • Check in/check out methods: a way to record the times and locations at which bikes enter/exit this dock.

Bike

  • Unique identifier: to keep track of which bike in the system is moving around.
  • Rate: the price that this bike costs to use for different amounts of time.
  • Condition: to keep track of the wear and tear. Originally, I thought this would be cool to implement, but I decided there would be too much to incorporate since the idea was that after a certain number of uses, the bike would need to be repaired. It would be an unnecessary amount of work given that my goal is really just to build a simulation using OOP, not to model a bike share system perfectly.
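The three outlines above could translate into classes along these lines. This is a minimal sketch, not the code from the repository: every attribute and method name is an assumption, the flat per-minute rate is a simplification of "different amounts of time", and the dropped Condition attribute is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Bike:
    bike_id: int
    rate_per_min: float = 0.25  # hypothetical flat rate

    def price(self, minutes: int) -> float:
        """Price of a ride lasting the given number of minutes."""
        return round(self.rate_per_min * minutes, 2)


@dataclass
class Dock:
    dock_id: int  # only unique within its station
    bike: Optional[Bike] = None

    def check_in(self, bike: Bike) -> None:
        if self.bike is not None:
            raise ValueError("dock is already occupied")
        self.bike = bike

    def check_out(self) -> Bike:
        if self.bike is None:
            raise ValueError("dock is empty")
        bike, self.bike = self.bike, None
        return bike


@dataclass
class Station:
    station_id: int
    location: Tuple[float, float]
    size: int
    docks: List[Dock] = field(default_factory=list)
    log: List[dict] = field(default_factory=list)  # activity log entries

    def __post_init__(self) -> None:
        # Every station numbers its docks from 0, as in Examples 1 and 2.
        if not self.docks:
            self.docks = [Dock(i) for i in range(self.size)]

    def open_dock(self) -> Optional[Dock]:
        """Return the first empty dock, or None if the station is full."""
        return next((d for d in self.docks if d.bike is None), None)
```

For example, `Station(0, (0, 0), size=2).open_dock().check_in(Bike(0))` parks Bike 0 in Dock 0 of Station 0.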

Lastly, I knew there would need to be some way to put these pieces together: a Simulation class.

Simulation

This class needs to instantiate a given number of Stations with enough Docks to fit a predetermined number of Bikes and have extra room for people to be able to check in bikes at other Stations. Simulation needs to then run a loop that simulates the passing of time with each iteration representing one minute. Every minute, this loop needs to

  1. Determine if any bikes will be checked out, determine where each checkout would happen, and execute the check out(s).
  2. Determine if any bikes need to be checked in, determine if the destination is available, and execute the check in(s) if possible. If there are no open docks at the destination, then the bike needs to be sent to another station to park.
  3. Keep track of any bikes “in transit”.

At the end of the simulation, I wanted this class to output some statistics like the total number of completed trips, the total revenue for the bike share company, the average trip duration, etc.
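The loop described above can be sketched in a heavily simplified form. The version below tracks only bike counts per station, uses a fixed 50% chance of one checkout per minute instead of the Poisson approach described later, and picks ride times at random. All names, defaults, and the redirect rule are assumptions for illustration.

```python
import random


def run_simulation(minutes, n_stations=3, docks_per_station=2, n_bikes=3,
                   rate_per_min=0.25, seed=None):
    """Run a toy minute-by-minute loop and return summary statistics."""
    rng = random.Random(seed)
    parked = [0] * n_stations  # number of bikes docked at each station
    for bike in range(n_bikes):
        parked[bike % n_stations] += 1
    in_transit = []  # rides currently between stations
    durations = []   # lengths of completed rides, in minutes

    for t in range(minutes):
        # 1. Possibly check a bike out of a station that has one.
        if rng.random() < 0.5:
            origins = [s for s in range(n_stations) if parked[s] > 0]
            if origins:
                origin = rng.choice(origins)
                dest = rng.choice([s for s in range(n_stations) if s != origin])
                parked[origin] -= 1
                in_transit.append(
                    {"start": t, "arrive": t + rng.randint(5, 20), "dest": dest}
                )

        # 2. Check in arriving bikes; 3. keep the rest in transit.
        still_riding = []
        for ride in in_transit:
            if ride["arrive"] > t:
                still_riding.append(ride)
            elif parked[ride["dest"]] < docks_per_station:
                parked[ride["dest"]] += 1
                durations.append(t - ride["start"])
            else:
                # Destination is full: send the bike on to the next station.
                ride["dest"] = (ride["dest"] + 1) % n_stations
                ride["arrive"] = t + 3
                still_riding.append(ride)
        in_transit = still_riding

    rides = len(durations)
    return {
        "rides": rides,
        "revenue": rate_per_min * sum(durations),
        "avg_duration": sum(durations) / rides if rides else 0.0,
        "parked": parked,
        "in_transit": len(in_transit),
    }
```

Calling `run_simulation(120, seed=1)` returns a dictionary of end-of-run statistics; bikes still in transit when the clock stops are not counted as completed rides.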

Implementation

To actually implement this, I first created a repository with the following empty files for my models.

Models

bike-share-sim
|-- sim
| |-- __init__.py
| |-- bike.py
| |-- dock.py
| |-- station.py
|-- test
| |-- __init__.py
| |-- test_bike.py
| |-- test_dock.py
| |-- test_station.py

I started by first implementing bike.py, as it’s the simplest of the three models and also is a building block needed for the other two. As I developed bike.py I also used the unittest package to develop test_bike.py to make sure every method in bike.py was working as expected. I repeated this process with Docks and Stations. Soon thereafter, I had all the pieces I needed to actually create a simulation.
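A test module in this style might look like the sketch below. The `Bike` class here is a stand-in defined inline so the example runs on its own; the real class lives in sim/bike.py, and its attribute and method names may differ from the ones assumed here.

```python
import unittest


class Bike:
    """Stand-in for the class in sim/bike.py; names are assumptions."""

    def __init__(self, bike_id, rate_per_min=0.25):
        self.bike_id = bike_id
        self.rate_per_min = rate_per_min

    def price(self, minutes):
        if minutes < 0:
            raise ValueError("ride length cannot be negative")
        return self.rate_per_min * minutes


class TestBike(unittest.TestCase):
    def setUp(self):
        # A fresh bike for every test keeps the tests independent.
        self.bike = Bike(bike_id=0)

    def test_has_id(self):
        self.assertEqual(self.bike.bike_id, 0)

    def test_price_scales_with_duration(self):
        self.assertAlmostEqual(self.bike.price(16), 4.0)

    def test_negative_duration_is_rejected(self):
        with self.assertRaises(ValueError):
            self.bike.price(-1)
```

Saved as test/test_bike.py, tests like these can be run from the project root with `python -m unittest`.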

Simulation

This was the most difficult part of the project because its implementation involved figuring out how to simulate people coming and checking out bikes at different times. There are many approaches, but after some research I decided to use a Poisson process because it would let me model the number of bike checkouts each minute given a predetermined average checkout rate that can be modified to suit the needs of the simulation.

Every minute, I calculate the probability of K checkouts occurring, P(K). I use lambda = 4 rides/min based on some research I did into DC’s Q1 bike share data (accessible from Mode Analytics’ platform). However, picking a K value every minute is where I had to get creative.
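For reference, the Poisson probability of exactly K events when the average rate is lambda is P(K) = lambda^K * e^(-lambda) / K!. A small sketch of that formula follows; the function and constant names are chosen for this example.

```python
import math

LAMBDA = 4  # average checkouts per minute, as stated above


def poisson_pmf(k, lam=LAMBDA):
    """Probability of exactly k events when the average rate is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)
```

With lambda = 4, `poisson_pmf(1)` is roughly 0.073 and `poisson_pmf(2)` is roughly 0.147.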

I decided that the upper half of a normal distribution with mu = 1 and sigma = 1 would be appropriate to model the number of people considering renting a bike at any given minute of the day for a system with 130 bikes. It seemed reasonable to me that, in any given minute, 1 or maybe 2 people might try to check out a bike (although 3 or even 4 is possible). I say half the distribution because I only keep the part where K >= 1, which avoids calculating the probability that zero or a negative number of people check out bikes.

Thus, every minute of the simulation I randomly pick a K from the distribution above and plug it into the Poisson probability function to calculate the probability (as a percentage) that K people will actually check out a bike that minute. Lastly, I pick a random number from 1–100 every minute. If random number <= P(K), then at minute t, K people attempt to check out bikes from the system.
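The per-minute decision just described could be sketched as follows. How a continuous draw becomes a whole number K is not spelled out in the text, so the rejection step and the rounding below are assumptions, as are all function names.

```python
import math
import random

LAMBDA = 4  # average checkouts per minute


def poisson_pmf(k, lam=LAMBDA):
    """Probability of exactly k events when the average rate is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def sample_k(rng):
    """Draw from N(mu=1, sigma=1), keep only draws >= 1, round to an int."""
    while True:
        draw = rng.gauss(1, 1)
        if draw >= 1:  # discard the lower half of the distribution
            return round(draw)


def checkouts_this_minute(rng):
    """Return K if K checkouts are attempted this minute, otherwise 0."""
    k = sample_k(rng)
    p_percent = 100 * poisson_pmf(k)  # P(K) expressed as a percentage
    roll = rng.randint(1, 100)        # random whole number from 1 to 100
    return k if roll <= p_percent else 0
```

Passing in a seeded generator, for example `checkouts_this_minute(random.Random(42))`, makes a run reproducible, which helps when testing the loop.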

I say “attempt” because there is the possibility that all bikes are currently in use, but that almost never happens.

The implementation of this class required several (repeated) constants to be defined, which I put into a separate module, consts.py, for easy access and modification.
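A constants module of this kind might look like the sketch below. The values are the defaults mentioned in this post; the constant names are hypothetical and may not match the ones in the repository.

```python
# consts.py (sketch): names are assumptions, values come from the post.

NUM_STATIONS = 9          # stations in the default system
DOCKS_PER_STATION = 15    # every station is the same size
NUM_BIKES = 80            # bikes spread evenly across stations

CHECKOUT_RATE = 4         # lambda for the Poisson probability, rides/min
K_MU = 1                  # mean of the normal distribution used to pick K
K_SIGMA = 1               # standard deviation of that distribution

DEFAULT_DURATION = 120    # length of the sample run, in minutes

TOTAL_DOCKS = NUM_STATIONS * DOCKS_PER_STATION
```

Keeping these in one place means a different system size or checkout rate only requires editing a single file.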

Putting Everything Together

The last thing I wanted to do was to make the whole simulation callable from the command line. Thus, in the root directory of this project, I created __main__.py that simply instantiates a Simulation object with a given duration in minutes to carry out the simulation.
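An entry point like this can be very short. In the sketch below, `Simulation` is a placeholder defined inline so the example runs by itself; in the project it would be imported from the sim package, and the `run` method name is an assumption.

```python
# __main__.py (sketch)


class Simulation:
    """Placeholder for the project's Simulation class."""

    def __init__(self, duration):
        self.duration = duration  # length of the run, in minutes

    def run(self):
        # The real class would loop over every minute here.
        return f"SIMULATION IS OVER AFTER {self.duration} MINUTES"


def main(duration=120):
    """Create a simulation with the given duration and run it."""
    sim = Simulation(duration)
    message = sim.run()
    print(message)
    return message


if __name__ == "__main__":
    main()
```

Because the file is named __main__.py, Python executes it when the containing directory or package is run directly.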

Sample Run

The simulation instantiates the following bike share system:

There are 9 stations at evenly spaced out locations, each with 15 docks. 80 bikes are spread evenly throughout the stations. The simulation runs for 120 minutes and generates a summarized activity log for every minute of the simulation. Here is an example.

Minute: 33

— — Customer tried to check in a bike

— — Bike checked into Station: 2 Dock: 14

— — — —

— — Customer tried to check in a bike

— — Bike checked into Station: 5 Dock: 1

— — — —

At the end of the simulation, we get a summary message such as this:

SIMULATION IS OVER AFTER 120 MINUTES

There were 19 rides

Total revenue was $67.5

The average price per ride was $3.55

The average ride length was 16 minutes

Next Steps

It would be nice to tinker with this more and accept the simulation duration and number of bikes as command line input. A visual component would also be nice; perhaps a simple GUI could be built to visualize the simulation. Lastly, I’d like to incorporate other real bike share system features such as different types of bikes (such as ElectricBike and thus perhaps ElectricDock), wear and tear, and different models of the same type of bike.
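One possible way to add the command line input mentioned above is the standard library's argparse module. This is only a sketch of the idea: the flag names and defaults are assumptions, and nothing like it exists in the project yet.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical command line options for the simulation."""
    parser = argparse.ArgumentParser(
        description="Run the bike share simulation."
    )
    parser.add_argument("--duration", type=int, default=120,
                        help="simulation length in minutes")
    parser.add_argument("--bikes", type=int, default=80,
                        help="number of bikes in the system")
    return parser.parse_args(argv)
```

Calling `parse_args(["--duration", "60"])` returns a namespace with `duration` set to 60 and `bikes` left at its default of 80.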
